None of this will be new to most of my readers… because I’m pretty sure I have 3 at most, and 2 of them should know this stuff already. But! This is for the sake of the third.
Say you’re given a disassembly. For example, here’s some that might be outputted by
objdump -d fun.o:
fun.o: file format elf32-i386
Disassembly of section .text:
00000000 <fun>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 53 push %ebx
4: 83 ec 08 sub $0x8,%esp
7: 83 7d 08 01 cmpl $0x1,0x8(%ebp)
b: 77 08 ja 15 <fun+0x15>
d: 8b 45 08 mov 0x8(%ebp),%eax
10: 89 45 f8 mov %eax,-0x8(%ebp)
13: eb 23 jmp 38 <fun+0x38>
15: 8b 45 08 mov 0x8(%ebp),%eax
18: 83 e8 02 sub $0x2,%eax
1b: 89 04 24 mov %eax,(%esp)
1e: e8 fc ff ff ff call 1f <fun+0x1f>
23: 89 c3 mov %eax,%ebx
25: 8b 45 08 mov 0x8(%ebp),%eax
28: 83 e8 01 sub $0x1,%eax
2b: 89 04 24 mov %eax,(%esp)
2e: e8 fc ff ff ff call 2f <fun+0x2f>
33: 01 c3 add %eax,%ebx
35: 89 5d f8 mov %ebx,-0x8(%ebp)
38: 8b 45 f8 mov -0x8(%ebp),%eax
3b: 83 c4 08 add $0x8,%esp
3e: 5b pop %ebx
3f: 5d pop %ebp
40: c3 ret
Let’s work step-by-step to figure out what this is.
Assumption:
cdecl calling convention.
Assumption:
fun is a function. Stuff other than functions can and do end up in the
.text section, but this seems to disassemble just fine, so it’s probably executable code. So let’s start with a skeleton:
… fun(…);Bytes 0-6 are the function prelude: saving the old
%ebp, setting
%ebp to the top of our frame, and growing our stack by
0x8.
Quick, what does the stack look like?
....
....^-- arguments are above the return address
....<-- return address
v-- top of our frame -- %ebp now points here
....<-- the eight bytes of our frame
....<-- that we just reserved...
^-- bottom of our frame -- %esp now points here
Bytes 7-c compare the literal value
1 to the 4-byte value at an address 8 bytes above
%ebp — in other words, the first argument — then if
(unsigned)arg1 > 1, we jump to byte 15 (
ja is the unsigned version of
jg, jump-if-greater.) Now we know that
fun takes at least one argument, the first of which is an unsigned 4-byte value, so:
… fun(unsigned arg1, …);Suppose we didn’t take the jump, meaning that
(unsigned)arg1 ≤ 1. Bytes d-14 copy
arg1 to
%eax, then
%eax to the bottom of our frame — that’s a local variable.
… fun(unsigned arg1, …) {
unsigned local1;
if (arg1 <= 1)
local1 = arg1;
…
}Alright, so suppose that bytes b-c
do jump us to 15. Bytes 15-24 copy
arg1 to
%eax, subtract 2, push it on the stack, recursively call
fun, then save the return value in
%ebx. Bytes 25-33 copy
arg1 to
%eax, subtract 1, push it on the stack, recursively call
fun, then add the return value to
%ebx. Bytes 35-37 then take
%ebx, the sum, and saves it to what we decided to call
local1. From this, we can gather that
fun takes a single argument, since that’s all we push onto the stack before calling it, and returns a 4-byte value, since we treat
%eax as one:
unsigned fun(unsigned arg1) {
unsigned local1;
if (arg1 <= 1)
local1 = arg1;
else
local1 = fun(arg1 - 2) + fun(arg1 - 1);
…
}Bytes 38-40 copy
local1 to
%eax — the return register — then go through the standard teardown of our frame and return. We’ve completed analyzing this function:
unsigned fun(unsigned arg1)
unsigned local1;
if (arg1 <= 1)
local1 = arg1;
else
local1 = fun(arg1 - 2) + fun(arg1 - 1);
return local1;
}That looks very familiar… almost like a naïve Fibonacci number generator! In fact, if we write this:
unsigned fib(unsigned n) {
if (n <= 1)
return n;
return fib(n - 2) + fib(n - 1);
}
and pass it through
gcc -S, it outputs this:
.file "fib.c"
.text
.globl fib
.type fib, @function
fib:
pushl %ebp
movl %esp, %ebp
pushl %ebx
subl $8, %esp
cmpl $1, 8(%ebp)
ja .L2
movl 8(%ebp), %eax
movl %eax, -8(%ebp)
jmp .L4
.L2:
movl 8(%ebp), %eax
subl $2, %eax
movl %eax, (%esp)
call fib
movl %eax, %ebx
movl 8(%ebp), %eax
subl $1, %eax
movl %eax, (%esp)
call fib
addl %eax, %ebx
movl %ebx, -8(%ebp)
.L4:
movl -8(%ebp), %eax
addl $8, %esp
popl %ebx
popl %ebp
ret
.size fib, .-fib
.ident "GCC: (GNU) 4.2.4 (Gentoo 4.2.4 p1.0)"
.section .note.GNU-stack,"",@progbits
So we pretty much nailed it on the head. (As to be expected from a simple example like this.)