Ring0 - The Inner Circle: November 2012

We all have heard that using a switch() block instead of a long if-else ladder results in faster code. Let us see why it is so. The secret to switch()'s faster code is the use of jump tables. A jump table basically consists of rows of addresses that point to locations in the code section. This table is indexed using the value of the variable specified in switch()'s condition. Every address points to one of the 'case' blocks or the 'default' block. This table is accessed by the CPU's 'Scaled-Index+Displacement' addressing mode. Now let us see what this table looks like:

--[ Basic jump table layout

- Each value in the 'Address' column points to the beginning of a 'case' block.

- Each row is 4 bytes long: standard size of a pointer.

+-------+------------+

| Index | Address |

+-------+------------+

| 0 | codeloc0 | - 4 bytes

|--------------------|

| 1 | codeloc1 | - 4 bytes

|--------------------|

| 2 | codeloc2 | - 4 bytes

|--------------------|

| 3 | codeloc3 | - 4 bytes

|--------------------|

|...

Let us review what a typical switch() block is made of. Every switch statement requires a value to use as index into the jump table. This value can be either stored in a variable or be generated as the result of an expression. Whatever the case, switch uses this as a number to index into the jump table. This is followed by zero or more case blocks with optional break statements. There may optionally be a default block also. If the break statement is omitted at the end of any case block then control falls-through to the next case block until a break or some other control transfer statement is encountered.

--[ switch() block layout

switch(value)

{

case 0: [break;]

case 1: [break;]

case 2: [break;]

...

[default: ...]

}

--] switch() block layout

In order to understand what exactly is being done with the 'value' and how the jump table is used we must look at the assembly code generated by the compiler. The basic steps done is outlined in the following pseudo-code.

begin switch()

if value > highest_case_value then

if default was defined then

jump to default block

else

jump out of switch block

index := value * scale_factor + addr of jump_table

jump to code location stored at jump_table[index]

// case blocks follow sequentially

// followed by optional default block

end switch()

Let us get into a bit of assembly now. All doubts should be cleared after this.

--[ Disassembly snippets

; ** switch() begins

; switch(iIn)

0041155E mov eax,dword ptr [iIn]

00411561 mov dword ptr [ebp-0DCh],eax

00411567 cmp dword ptr [ebp-0DCh],7

0041156E ja $LN2+4 (4115B2h)

00411570 mov ecx,dword ptr [ebp-0DCh]

00411576 jmp dword ptr (411610h)[ecx*4]

iIn is the variable whose value is used to index into the jump table. The value is moved into eax first. It is then copied to a scratch-pad area and compared with the number 7, which is the highest value used in any of the case statements. If value of iIn is above 7 then control is transferred to the default block via the ja(jump if above) instruction. Notice that compiler has generated an unsigned compare instruction ja(jump if above) instead of jg(jump if greater) even though iIn is an signed variable. This is because the jump table index must always be >=0. If iIn was less than 7 then it is again moved to ecx from the scratch-pad area and the next jump instruction directly jumps to the appropriate case block(!). How did that happen you say? Well, scaled-Index+Displacement addressing mode and indirect jump happened! The ecx register is the index, 4 is the scaling factor because each row of the jump table is 4bytes long and 00411610h is the starting address of the jump table. The jump instruction uses the value at the calculated jump table index as an absolute offset. For example, if iIn was 3 then index = 00411610h + 3*4 = 00411610h + 0Ch = 0041161Ch. Value at 0041161Ch = 00411592. This is nothing but the address of the beginning of 'case 3' block. So the jump instruction transfers control to this block!

; case 0: vUnderConst(); break;

0041157D call vUnderConst (41107Dh)

00411582 jmp $LN2+1Bh (4115C9h)

; case 1: vUnderConst(); break;

00411584 call vUnderConst (41107Dh)

00411589 jmp $LN2+1Bh (4115C9h)

; case 2: vUnderConst(); break;

0041158B call vUnderConst (41107Dh)

00411590 jmp $LN2+1Bh (4115C9h)

; case 3: vUnderConst(); break;

00411592 call vUnderConst (41107Dh)

00411597 jmp $LN2+1Bh (4115C9h)

; case 4: vUnderConst(); break;

00411599 call vUnderConst (41107Dh)

0041159E jmp $LN2+1Bh (4115C9h)

; case 5: vUnderConst(); break;

004115A0 call vUnderConst (41107Dh)

004115A5 jmp $LN2+1Bh (4115C9h)

; case 6: vUnderConst(); break;

004115A7 call vUnderConst (41107Dh)

004115AC jmp $LN2+1Bh (4115C9h)

; case 7: return 0;

004115AE call vUnderConst (41107Dh)

004115B0 jmp $LN2+1Bh (4115C9h)

; default: printf("No such service\n"); break;

004115B2 mov esi,esp

004115B4 push offset string "No such service\n" (41575Ch)

004115B9 call dword ptr [__imp__printf (4182DCh)]

; ** switch() ends

--] Disassembly snippets

--[ Annotated jump table

- Index: Logical index of each row.

- CodeAddr: Location where the row value is stored.

- CodeLoc: Absolute code offsets that point to each of the case blocks.

+-------+------------+-----------+

| Index | CodeAddr | CodeLoc |

+-------+------------+-----------+

| 0 | 00411610 | 0041157d |

|--------------------------------|

| 1 | 00411614 | 00411584 |

|--------------------------------|

| 2 | 00411618 | 0041158b |

|--------------------------------|

| 3 | 0041161C | 00411592 |

|--------------------------------|

| 4 | 00411620 | 00411599 |

|--------------------------------|

| 5 | 00411624 | 004115a0 |

|--------------------------------|

| 6 | 00411628 | 004115a7 |

|--------------------------------|

| 7 | 0041162C | 004115ae |

|--------------------------------|

--] Annotated jump table

I hope you understood how the switch statement uses a jump table to optimize code when choices are to be made based on multiple condition values. Before we conclude there are some things related to the jump table that is to be mentioned. The above explanation assumed that the case values started from 0 and continued with consecutive numbers until 7. What happens if the case values do not begin from 0? What happens when the case values are not consecutive? At first, it looks like a lot of if-else conditions might have to be generated by the compiler. No, again the jump table comes to our rescue.

The jump table always has entries from index 0 to highest_case_value. It is always zero-based. So for every case value that is not specified in the source code, the jump table entry for that value will point to the switch's 'default' block, if present, or to the statement after the switch block. For example, consider the following source code and the jump table generated.

switch(val)

{

case 1: break;

case 6: break;

case 7: break;

case 9: break;

default: printf(“Invalid\n”); break;

}

after_switch: statement x

+-------+------------+-----------+

| Index | CodeAddr | CodeLoc |

+-------+------------+-----------+

| 0 | codeaddr0 | default |

|--------------------------------|

| 1 | 00411614 | 00411584 |

|--------------------------------|

| 2 | 00411618 | 0041158b |

|--------------------------------|

| 3 | 0041161C | 00411592 |

|--------------------------------|

| 4 | 00411620 | 00411599 |

|--------------------------------|

| 5 | 00411624 | 004115a0 |

|--------------------------------|

| 6 | 00411628 | 004115a7 |

|--------------------------------|

| 7 | 0041162C | 004115ae |

|--------------------------------|

Ring0 - The Inner Circle

Pages

November 18, 2012

Switch's jump table