How do you understand the 'REX.W + B8+ rd io' form for x86-64 assembly?

I was originally trying to generate the bytes for an immediate move into a 64-bit register. The specific operation I wanted was

mov rdi, 0x1337

Using https://www.felixcloutier.com/x86/mov, the only non-sign-extended encoding I saw was

REX.W + B8+ rd io

This confused me, so I wrote a small assembly program to see what the assembler would generate:

          global    _start

          section   .text
_start:   
          mov       rdi, 0x1337 
          syscall                           
          mov       rax, 60                 
          xor       rdi, rdi                
          syscall                           

I had to turn off optimizations so that the assembler would emit a move into a 64-bit register. I assembled and linked with nasm -felf64 -O0 main.asm && ld main.o, which produced a.out. Looking at the output of objdump -M intel -d ./a.out, I see this line

48 bf 37 13 00 00 00    movabs rdi,0x1337  

That line looks nothing like

REX.W + B8+ rd io

to me. Additionally, after some research, I saw that the instruction is supposed to be 10 bytes long. How do you get that from REX.W + B8+ rd io?



Solution 1:[1]

B8+ rd means the operand (a register) is encoded in the low 3 bits of the opcode, not in a ModR/M byte.

From the Intel Software Developer's Manual,

+rb, +rw, +rd, +ro - Indicated the lower 3 bits of the opcode byte is used to encode the register operand without a modR/M byte. The instruction lists the corresponding hexadecimal value of the opcode byte with low 3 bits as 000b. In non-64-bit mode, a register code, from 0 through 7, is added to the hexadecimal value of the opcode byte. In 64-bit mode, indicates the four bit field of REX.b and opcode[2:0] field encodes the register operand of the instruction. "+ro" is applicable only in 64-bit mode.

It looks like Intel wanted to use +ro for 64-bit operands encoded in that way, but then didn't actually do that, not just in the mov entry but anywhere, as far as I could find. For example, 64-bit push and pop could have had + ro, but they also have + rd. And "Indicated" is likely a typo, since the rest of the text uses the present tense.

The (e/r)di register is number 7, and B8 + 7 = BF, explaining the opcode.
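The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the table REG64 and the helper split_register are hypothetical names, not part of any assembler, and the table assumes the standard x86-64 register numbering.

```python
# Hypothetical lookup table: standard x86-64 register numbers (0..15).
REG64 = {
    "rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
    "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7,
    "r8": 8, "r9": 9, "r10": 10, "r11": 11,
    "r12": 12, "r13": 13, "r14": 14, "r15": 15,
}

def split_register(name):
    """Return (rex_b, opcode_byte) for the 'B8+ rd' form.

    The low 3 bits of the register number are added to the base
    opcode 0xB8; bit 3 goes into the B bit of the REX prefix.
    """
    num = REG64[name]
    rex_b = num >> 3
    opcode = 0xB8 + (num & 0b111)
    return rex_b, opcode

rex_b, opcode = split_register("rdi")
print(rex_b, hex(opcode))  # 0 0xbf
```

For r9 (register number 9) the same helper gives REX.B = 1 and opcode 0xB9, which is how one opcode range covers all sixteen registers.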

io stands for a qword immediate (o for octo, as in 8 bytes, perhaps?).
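As a quick check of what an 8-byte immediate looks like in memory, the value from the question can be packed as a little-endian 64-bit integer. This is a sketch using Python's standard struct module; the variable name imm is just for illustration.

```python
import struct

# "<Q" means little-endian, unsigned 64-bit: the layout x86-64 uses
# for the immediate in this mov encoding.
imm = struct.pack("<Q", 0x1337)

print(len(imm))      # 8
print(imm.hex(" "))  # 37 13 00 00 00 00 00 00
```

The leading 37 13 00 00 00 matches the bytes that follow 48 bf in the disassembly line from the question.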

The REX prefix (0x40 for the base prefix, +8 to set the W bit, optionally +1 to set the B bit to access R8..R15) is 1 byte, the opcode is 1 byte, there is no ModR/M byte, and the immediate is 8 bytes, which adds up to 10 bytes.
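Putting the pieces together, here is a minimal sketch of an encoder for this one instruction form. The function name encode_mov_r64_imm64 is hypothetical, and it takes the register as a number (rdi = 7) rather than by name.

```python
import struct

def encode_mov_r64_imm64(reg_num, imm):
    """Encode 'mov r64, imm64' (REX.W + B8+ rd io) as raw bytes."""
    # REX prefix: 0x40 base, 0x08 sets W, bit 3 of the register sets B.
    rex = 0x40 | 0x08 | (reg_num >> 3)
    # Opcode: base 0xB8 plus the low 3 bits of the register number.
    opcode = 0xB8 + (reg_num & 0b111)
    # Immediate: 8 bytes, little-endian (masked so negative values wrap).
    imm_bytes = struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF)
    return bytes([rex, opcode]) + imm_bytes

code = encode_mov_r64_imm64(7, 0x1337)  # mov rdi, 0x1337
print(code.hex(" "))  # 48 bf 37 13 00 00 00 00 00 00
print(len(code))      # 10
```

The first seven bytes match the objdump line in the question. objdump usually prints at most seven bytes per row for x86, so the remaining three 00 bytes most likely appeared on the following row of the disassembly.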

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 harold