I was working on my own little assembler the other day, trying to get proper instruction encoding working, when I ran across the OSDev Wiki's X86-64 Instruction Encoding page.
I read through the REX section a few times, but this bit did not jump out at me until the fourth time or so:
A REX prefix must be encoded when:
- using 64-bit operand size and the instruction does not default to 64-bit operand size; or
- using one of the extended registers (R8 to R15, XMM8 to XMM15, YMM8 to YMM15, CR8 to CR15 and DR8 to DR15); or
- using one of the uniform byte registers SPL, BPL, SIL or DIL.
A REX prefix must not be encoded when:
- using one of the high byte registers AH, CH, BH or DH.
In all other cases, the REX prefix is ignored. The use of multiple REX prefixes is undefined, although processors seem to use only the last REX prefix.
Well, hold the phone. That’s pretty cool.
This means that, in certain circumstances, you can effectively use REX bytes as no-operation padding in front of an instruction. For example, to encode
mov eax, 1
you can use plain old
0xB8, 0x01, 0x00, 0x00, 0x00
OR, you can use
0x48, 0x42, 0x47, 0x40, 0xB8, 0x01, 0x00, 0x00, 0x00
Since 0x40 is the last REX prefix, and since it leaves REX.W (and every other REX bit) clear, the instruction still defaults to 32-bit EAX, retaining the original intent.
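Here is a minimal Python sketch of that reasoning. It assumes, as the wiki observes, that only the last REX prefix takes effect. The name effective_rex is my own, purely for illustration, and the helper ignores legacy prefixes such as 0x66 entirely.

```python
from typing import Optional, Tuple

REX_W = 0x08  # bit 3 of a REX byte selects 64-bit operand size


def effective_rex(code: bytes) -> Tuple[Optional[int], int]:
    """Return (last leading REX byte or None, index of the first non-REX byte).

    Hypothetical helper: assumes the instruction starts with zero or more
    REX bytes (0x40-0x4F) and that only the last one takes effect.
    """
    rex = None
    index = 0
    while index < len(code) and (code[index] & 0xF0) == 0x40:
        rex = code[index]
        index += 1
    return rex, index


padded = bytes([0x48, 0x42, 0x47, 0x40, 0xB8, 0x01, 0x00, 0x00, 0x00])
rex, opcode_index = effective_rex(padded)
# Opcode 0xB8 (mov eax/rax, imm) uses a 64-bit operand only when REX.W is set.
operand_bits = 64 if rex is not None and (rex & REX_W) else 32
print(hex(rex), opcode_index, operand_bits)  # prints: 0x40 4 32
```

The three earlier REX bytes (0x48, 0x42, 0x47) are simply walked past; only the final 0x40 decides the operand size.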
To be sure of this, I tried
mov rax, 0xFFAABBCCDDEEFF22
The original bytes
0x48, 0xB8, 0x22, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0xFF
can be turned into
0x40, 0x4C, 0x42, 0x48, 0xB8, 0x22, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0xFF
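Both examples follow the same recipe, which can be sketched in Python roughly as below. pad_with_rex is a hypothetical helper of mine, not part of any real assembler. It assumes the instruction carries no legacy prefixes and leans on the observed (officially undefined) behaviour that the last REX prefix wins. x86 instructions are capped at 15 bytes, so the sketch refuses to pad past that. It also does not check for AH, CH, BH or DH operands, where adding any REX prefix would change the meaning.

```python
MAX_INSTRUCTION_LENGTH = 15  # architectural limit on x86 instruction length


def is_rex(byte: int) -> bool:
    """REX prefixes occupy 0x40-0x4F in 64-bit mode."""
    return (byte & 0xF0) == 0x40


def pad_with_rex(code: bytes, fillers: bytes) -> bytes:
    """Prepend filler REX bytes while keeping the final REX byte unchanged.

    Hypothetical sketch: `code` must start with its REX prefix or its opcode.
    """
    if not all(is_rex(b) for b in fillers):
        raise ValueError("fillers must all be REX bytes (0x40-0x4F)")
    if not fillers:
        return code
    if code and is_rex(code[0]):
        # The instruction's own REX stays last, so it still takes effect.
        padded = fillers + code
    else:
        # No REX originally: end the run with an empty REX (no bits set).
        padded = fillers + b"\x40" + code
    if len(padded) > MAX_INSTRUCTION_LENGTH:
        raise ValueError("padded instruction would exceed 15 bytes")
    return padded


mov_eax = bytes([0xB8, 0x01, 0x00, 0x00, 0x00])
mov_rax = bytes([0x48, 0xB8, 0x22, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0xFF])

print(pad_with_rex(mov_eax, bytes([0x48, 0x42, 0x47])).hex(" "))
print(pad_with_rex(mov_rax, bytes([0x40, 0x4C, 0x42])).hex(" "))
```

With those filler bytes, the two calls reproduce the padded byte sequences shown above.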
I have run into crashes in some circumstances with certain (unknown) combinations of REX bytes, but the examples here seem to work, and I can’t see any clear difference in the state of the program.