Multiple REX prefix trickery

I was working on my own little assembler the other day, trying to get instruction encoding working properly, when I ran across the OSDev Wiki’s X86-64 Instruction Encoding page.

I read through the REX part a few times, but this bit did not jump out at me until the fourth time or so:

A REX prefix must be encoded when:

  • using 64-bit operand size and the instruction does not default to 64-bit operand size; or
  • using one of the extended registers (R8 to R15, XMM8 to XMM15, YMM8 to YMM15, CR8 to CR15 and DR8 to DR15); or
  • using one of the uniform byte registers SPL, BPL, SIL or DIL.

A REX prefix must not be encoded when:

  • using one of the high byte registers AH, CH, BH or DH.

In all other cases, the REX prefix is ignored. The use of multiple REX prefixes is undefined, although processors seem to use only the last REX prefix.

Well, hold the phone. That’s pretty cool.
This means that you can effectively use REX bytes as no-operation bytes in certain circumstances.

To accomplish

mov eax, 1

you can use plain old

0xB8, 0x01, 0x00, 0x00, 0x00

OR, you can use

0x48, 0x42, 0x47, 0x40, 0xB8, 0x01, 0x00, 0x00, 0x00

Since 0x40 is the last REX prefix, and since it has REX.W (and every other REX bit) clear, the operand size stays at the 32-bit default and the instruction still targets EAX, retaining the original intent.
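
To see what each of those padding bytes would mean if it were the one that counted, here is a small Python sketch that splits a REX byte into its four flag bits (the layout is 0100WRXB). The function name decode_rex is my own invention for illustration, not something from the wiki page or any library.

```python
def decode_rex(byte):
    """Split a REX prefix byte (0x40-0x4F) into its W, R, X and B flag bits."""
    if byte & 0xF0 != 0x40:
        raise ValueError(f"{byte:#04x} is not a REX prefix")
    return {
        "W": (byte >> 3) & 1,  # 1 = 64-bit operand size
        "R": (byte >> 2) & 1,  # extends ModRM.reg
        "X": (byte >> 1) & 1,  # extends SIB.index
        "B": byte & 1,         # extends ModRM.rm, SIB.base or the opcode register
    }

# The padded "mov eax, 1" encoding from above: four REX bytes, then the opcode.
padded = [0x48, 0x42, 0x47, 0x40, 0xB8, 0x01, 0x00, 0x00, 0x00]
for b in padded[:4]:
    print(f"{b:#04x}", decode_rex(b))
```

Only the final 0x40 has all four bits clear, which is why the instruction still behaves like the plain 0xB8 form if the processor really does honor only the last prefix.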

To be sure of this, I tried

mov rax, 0xFFAABBCCDDEEFF22

The original bytes

0x48, 0xB8, 0x22, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0xFF

can be turned into

0x40, 0x4C, 0x42, 0x48, 0xB8, 0x22, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0xFF
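
As a rough model of the "only the last REX prefix counts" behavior, the Python sketch below collapses a leading run of REX bytes down to the final one. It is only a sketch under some assumptions: the helper names split_rex_run and collapse_rex are hypothetical, the input is a single 64-bit-mode instruction with no legacy prefixes (0x66, 0xF3 and so on) in front, and the last-prefix-wins rule is the observed behavior rather than anything the architecture defines.

```python
def split_rex_run(code):
    """Return (leading REX bytes, remaining bytes) for one encoded instruction."""
    i = 0
    while i < len(code) and code[i] & 0xF0 == 0x40:
        i += 1
    return code[:i], code[i:]

def collapse_rex(code):
    """Keep only the last leading REX prefix (assumed last-prefix-wins behavior)."""
    rex_run, rest = split_rex_run(code)
    # rex_run[-1:] is an empty list when there was no REX prefix at all.
    return rex_run[-1:] + rest

original = [0x48, 0xB8, 0x22, 0xFF, 0xEE, 0xDD, 0xCC, 0xBB, 0xAA, 0xFF]
padded = [0x40, 0x4C, 0x42, 0x48] + original[1:]

print(collapse_rex(padded) == original)
```

Under this model the padded mov rax sequence collapses back to exactly the original ten bytes, while the padded mov eax sequence collapses to 0x40 followed by the plain encoding, which differs only by an all-zero REX byte.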

I have run into crashes in some circumstances with certain (unknown) combinations of REX bytes, but the examples here seem to work, and I can’t see any clear difference in the state of the program.

Andrew Artz
