Ring0 - The Inner Circle: 2016

I've been planning to add 64bit support to my disassembler (github) for some time now, so I started reading about 64bit assembly in the Intel manuals to see how it differs from 32bit code. To see the differences in live code, I wrote a simple program, compiled it for both targets and looked through the disassembly. Here's what I found out so far. Refer to the source code and disassembly at the bottom while reading.

[Main differences]

1. Absolute code addresses are 64bits wide instead of 32bits (compare 00B217B0 with 00007FF780AE17D0 in the listings).
2. Instructions use the REX prefix byte in 64bit mode to specify 64bit operands. The REX byte is placed immediately before the first opcode byte and after all legacy prefixes.
3. 64bit mode has more registers available, so more arguments are passed in registers instead of on the stack. The call to printf takes 4 instructions (19 bytes) in 64bit versus 7 instructions (21 bytes) in 32bit, because there are no pushes and no stack cleanup after the call.
4. The REX prefix occupies the range 40h - 4Fh, so the single byte opcode forms of the INC/DEC instructions, which occupy the same range, are not available in 64bit mode. The ModR/M forms of these instructions (opcodes FEh/FFh) are still available.
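
Point 4 can be sketched in a few lines. This is only an illustration, assuming a decoder that already knows which mode it is in and is looking at the byte in the opcode position; the names `ByteMeaning` and `classify` are made up for this example and do not come from any real disassembler.

```cpp
#include <cstdint>

// Possible meanings of a byte seen where an opcode is expected.
enum class ByteMeaning { IncReg, DecReg, RexPrefix, Other };

// Hypothetical helper: the same byte range 40h-4Fh is decoded differently
// depending on whether the code runs in 64bit mode.
constexpr ByteMeaning classify(std::uint8_t b, bool is64BitMode) {
    return (b < 0x40 || b > 0x4F) ? ByteMeaning::Other
         : is64BitMode            ? ByteMeaning::RexPrefix
         : (b < 0x48)             ? ByteMeaning::IncReg    // 40h-47h: single byte INC
                                  : ByteMeaning::DecReg;   // 48h-4Fh: single byte DEC
}

// 48h starts most 64bit instructions in the listing (REX with W set),
// but in 32bit code the same byte is DEC EAX.
static_assert(classify(0x48, true)  == ByteMeaning::RexPrefix, "REX in 64bit mode");
static_assert(classify(0x48, false) == ByteMeaning::DecReg,    "DEC EAX in 32bit mode");
static_assert(classify(0x40, false) == ByteMeaning::IncReg,    "INC EAX in 32bit mode");
```

The checks run at compile time, so the snippet compiles only if the classification holds.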

[REX byte]
This is a one-byte prefix used in 64bit mode to specify 64bit operands and to address the extended registers (R8-R15). The format of the REX byte is -
Bit position - 7 6 5 4 3 2 1 0
Bit value - 0 1 0 0 W R X B

The high nibble is always 0100 which is 4h. So the REX prefix range is 40h – 4Fh.
W bit : 1 – 64bit operand, 0 – operand size determined by the CS.D bit (code segment descriptor, refer to the Intel system programming manual). In 64bit mode this default is 32bits.
R bit : This adds a 4th bit to the reg field in ModR/M byte. Enables addressing the new registers (R8-R15).
X bit : This adds a 4th bit to the Index field in SIB byte.
B bit : This adds a 4th bit to the r/m field in ModR/M or the Base field in SIB or Opcode reg field.
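
The bit layout above maps directly to a few masks. The following is a minimal sketch under the assumption that the byte has already been located after the legacy prefixes; `RexFields`, `isRex` and `decodeRex` are hypothetical names chosen for the example.

```cpp
#include <cstdint>

// The four flag bits carried in the low nibble of a REX prefix.
struct RexFields { bool w, r, x, b; };

// A byte is a REX prefix (in 64bit mode) when its high nibble is 0100.
constexpr bool isRex(std::uint8_t byte) { return (byte & 0xF0) == 0x40; }

// Split the low nibble into W (bit 3), R (bit 2), X (bit 1), B (bit 0).
constexpr RexFields decodeRex(std::uint8_t rex) {
    return RexFields{ (rex & 0x08) != 0, (rex & 0x04) != 0,
                      (rex & 0x02) != 0, (rex & 0x01) != 0 };
}

// 48h: only W is set. 4Ch: W and R are set.
static_assert(isRex(0x48) && decodeRex(0x48).w && !decodeRex(0x48).r, "48h = REX.W");
static_assert(decodeRex(0x4C).w && decodeRex(0x4C).r, "4Ch = REX.W + REX.R");
```

Note that `isRex` is only meaningful in 64bit mode; in 32bit code the same bytes are INC/DEC opcodes.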

[Dissecting two instructions]

00007FF780AE17F5 48 C7 45 28 00 00 00 00 mov qword ptr [f0], 0
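REX = 48 : only the W bit is set (0100 1000), indicating a 64bit operand (the qword ptr [f0] operand). The rest decodes as C7 = MOV r/m, imm32; ModRM = 45 and Disp = 28 give [RBP]+28h, the address of 'f0'; the 32bit immediate 00 00 00 00 is sign-extended to 64bits.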

00007FF780AE1876 4C 8B 45 68 mov r8, qword ptr [fn]
REX = 4C : both the W and R bits are set (0100 1100). W indicates a 64bit operand and the R bit adds a 4th bit to the reg field in the ModRM byte.

MOV = 8B : this is the register/memory to register MOV instruction; with REX.W set it moves 64bits.

ModRM = 45 : Mod = 01, Reg = 000, R/M = 101. Mod = 01 with R/M = 101 means [RBP]+disp8, which is the effective address of 'fn'. The Reg field 000 prefixed with REX.R = 1 gives the 4bit register number 1000, which identifies R8.

Field        - Mod  R  Reg    R/M
Bit position - 7 6  -  5 4 3  2 1 0
Bit value    - 0 1  1  0 0 0  1 0 1   (the R bit from REX adds a 4th bit to the reg field)

Disp = 68 : This is the 8bit displacement added to RBP to get the effective address of 'fn'.
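
The dissection of `4C 8B 45 68` can be reproduced with a small decoder. This is a simplified sketch, not a full implementation: it assumes there is no SIB byte (when one is present, REX.B extends the SIB base field instead of r/m), and the names `ModRm` and `decodeModRm` are invented for the example.

```cpp
#include <cstdint>

// ModR/M fields after the REX extension bits have been applied.
struct ModRm { unsigned mod, reg, rm; };

// Split ModR/M into mod (bits 7-6), reg (bits 5-3) and r/m (bits 2-0),
// then use REX.R and REX.B as the 4th (high) bit of reg and r/m.
// Assumption: no SIB byte follows.
constexpr ModRm decodeModRm(std::uint8_t rex, std::uint8_t modrm) {
    return ModRm{
        static_cast<unsigned>(modrm >> 6),
        static_cast<unsigned>(((modrm >> 3) & 7) | ((rex & 0x04) << 1)),  // REX.R -> bit 3
        static_cast<unsigned>((modrm & 7) | ((rex & 0x01) << 3))          // REX.B -> bit 3
    };
}

// 4C 8B 45 68 : mod = 01 (disp8 follows), reg = 1000b = R8, r/m = 101b = RBP
static_assert(decodeModRm(0x4C, 0x45).mod == 1, "disp8 addressing");
static_assert(decodeModRm(0x4C, 0x45).reg == 8, "R8");
static_assert(decodeModRm(0x4C, 0x45).rm  == 5, "RBP");
```

With REX = 48 instead of 4C the same ModRM byte yields reg = 0, which is RAX, matching the `48 8B 45 28 mov rax,qword ptr [f0]` line in the 64bit listing.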

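#include <windows.h> // UINT
#include <stdio.h>   // printf

// Computes Fibonacci(n) iteratively and prints it.
int main()
{
    UINT n = 5;
    size_t f0 = size_t(0); // Fibonacci(0)
    size_t f1 = size_t(1); // Fibonacci(1)
    size_t fn = f0;        // result, stays 0 when n == 0
    if (n == 1)
    {
        fn = f0 + f1;
    }
    else if (n > 1)
    {
        for (UINT itr = 2; itr <= n; ++itr)
        {
            fn = f0 + f1;
            f0 = f1;
            f1 = fn;
        }
    }

    // %Iu is the MSVC-specific format for size_t
    printf("Fibonacci(%u) = %Iu\n", n, fn);
    return 0;
}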

Compiled as 32bit (snippets)

int main()
{
00B217B0 55                      push ebp
00B217B1 8B EC                   mov ebp,esp
00B217B3 81 EC FC 00 00 00       sub esp,0FCh

UINT n = 5;
00B217CE C7 45 F8 05 00 00 00    mov dword ptr [n],5

size_t f0 = size_t(0);
00B217D5 C7 45 EC 00 00 00 00    mov dword ptr [f0],0

size_t fn = f0;
00B217E3 8B 45 EC                mov eax,dword ptr [f0]
00B217E6 89 45 D4                mov dword ptr [fn],eax

fn = f0 + f1;
00B2181A 8B 45 EC                mov eax,dword ptr [f0]
00B2181D 03 45 E0                add eax,dword ptr [f1]
00B21820 89 45 D4                mov dword ptr [fn],eax

printf("Fibonacci(%u) = %Iu\n", n, fn);
00B21831 8B 45 D4                mov eax,dword ptr [fn]
00B21834 50                      push eax
00B21835 8B 4D F8                mov ecx,dword ptr [n]
00B21838 51                      push ecx
00B21839 68 30 6B B2 00          push offset string "Fibonacci(%u) = %Iu\n" (0B26B30h)
00B2183E E8 DD FA FF FF          call _printf (0B21320h)
00B21843 83 C4 0C                add esp,0Ch

Compiled as 64bit (snippets)

int main()
{
00007FF780AE17D0 40 55                      push rbp
00007FF780AE17D2 57                         push rdi
00007FF780AE17D3 48 81 EC 88 01 00 00       sub rsp,188h

UINT n = 5;
00007FF780AE17EE C7 45 04 05 00 00 00       mov dword ptr [n],5

size_t f0 = size_t(0);
00007FF780AE17F5 48 C7 45 28 00 00 00 00    mov qword ptr [f0],0

size_t fn = f0;
00007FF780AE1805 48 8B 45 28                mov rax,qword ptr [f0]
00007FF780AE1809 48 89 45 68                mov qword ptr [fn],rax

fn = f0 + f1;
00007FF780AE1852 48 8B 45 48                mov rax,qword ptr [f1]
00007FF780AE1856 48 8B 4D 28                mov rcx,qword ptr [f0]
00007FF780AE185A 48 03 C8                   add rcx,rax
00007FF780AE185D 48 8B C1                   mov rax,rcx
00007FF780AE1860 48 89 45 68                mov qword ptr [fn],rax

printf("Fibonacci(%u) = %Iu\n", n, fn);
00007FF780AE1876 4C 8B 45 68                mov r8,qword ptr [fn]
00007FF780AE187A 8B 55 04                   mov edx,dword ptr [n]
00007FF780AE187D 48 8D 0D A4 83 00 00       lea rcx,[string "Fibonacci(%u) = %Iu\n" (07FF780AE9C28h)]
00007FF780AE1884 E8 43 F9 FF FF             call printf (07FF780AE11CCh)
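
One detail worth checking in the listings: the call (E8) in both builds and the lea in the 64bit build carry a 32bit displacement relative to the address of the next instruction, not an absolute address, which is why the 64bit instruction bytes stay short even though the addresses are 64bits wide. The sketch below recomputes the targets the debugger prints in parentheses. It assumes the displacement has already been read from the little-endian bytes into a 32bit value; `signExtend32` and `relTarget` are names made up for this example.

```cpp
#include <cstdint>

// Sign-extend a 32bit displacement to 64bits without relying on
// implementation-defined conversions.
constexpr std::int64_t signExtend32(std::uint32_t v) {
    return (v & 0x80000000u) ? static_cast<std::int64_t>(v) - 0x100000000LL
                             : static_cast<std::int64_t>(v);
}

// Target = address of the next instruction + sign-extended displacement.
constexpr std::uint64_t relTarget(std::uint64_t instrAddr, unsigned instrLen,
                                  std::uint32_t disp32) {
    return instrAddr + instrLen + static_cast<std::uint64_t>(signExtend32(disp32));
}

// 64bit: E8 43 F9 FF FF at 00007FF780AE1884 -> call printf
static_assert(relTarget(0x00007FF780AE1884ULL, 5, 0xFFFFF943u) == 0x00007FF780AE11CCULL, "call");
// 64bit: 48 8D 0D A4 83 00 00 at 00007FF780AE187D -> address of the format string
static_assert(relTarget(0x00007FF780AE187DULL, 7, 0x000083A4u) == 0x00007FF780AE9C28ULL, "lea");
// 32bit: E8 DD FA FF FF at 00B2183E -> call _printf
static_assert(relTarget(0x00B2183EULL, 5, 0xFFFFFADDu) == 0x00B21320ULL, "call");
```

In the 32bit build the string address is instead pushed as an absolute 32bit immediate (68 30 6B B2 00), so only the call uses a relative displacement there.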

October 8, 2016

x86 vs. x64 assembly language