Yes and no. Look at some of the loop optimisations possible on ARM compared to x86-64. I've seen x86-64 need 8 instructions for something ARM does in 1.
I remember PPC and its rlwinm and co. My ARM isn’t that good, though I can read it.
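For anyone who hasn't met it: rlwinm rotates a 32-bit register left and then ANDs the result with a mask of contiguous bits, all in one instruction. Here's a rough C sketch of the semantics, assuming IBM bit numbering (bit 0 is the most significant bit). The helper names are mine, not from any header.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n is taken modulo 32). */
static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* Build the mask selected by MB..ME in IBM bit numbering (bit 0 = MSB).
 * If mb > me, the mask wraps around the ends of the word. */
static uint32_t ppc_mask(unsigned mb, unsigned me) {
    uint32_t from_mb = 0xFFFFFFFFu >> mb;        /* bits mb..31 set */
    uint32_t to_me   = 0xFFFFFFFFu << (31 - me); /* bits 0..me set  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Illustrative model of: rlwinm rA, rS, SH, MB, ME */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me) {
    return rotl32(rs, sh) & ppc_mask(mb, me);
}
```

So `rlwinm(0x12345678, 8, 24, 31)` pulls the top byte down into the low 8 bits and gives 0x12, which is a shift plus a mask on most other ISAs.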
But some of those x86 instructions effectively take 0.5 cycles (two can issue per cycle), and some take 0 if they’re removed by fusion or register renaming. x86 has worse problems, like the loop instructions you can’t actually use in practice but that take up some of the shortest encodings.