A simple program to demonstrate this on stores two integers on the stack, the second computed from the first, before returning 0 from main:

    int main() {
        int a = 100;
        int b = 200 + a;
        return 0;
    }

Normally the optimizer would improve this code, but we can turn optimization off so that the generated instructions correspond more closely to our source. The key GCC flags here are:
- -S: Tell GCC not to assemble, and to output the assembly it has generated up to that point.
- -O: Set the optimization level (we use -O0 so the code is not modified beyond our recognition).

    gcc -O0 -S codecall.c -o codecall.s

The example may produce the following in GCC's native AT&T syntax:

    movl $100, 12(%esp)
    movl $200, 8(%esp)
    movl 8(%esp), %eax
    movl 12(%esp), %edx
    leal (%edx,%eax), %eax
    leave
    ret

Certain lines may be unclear, especially which instruction belongs to which line of the source.
We could add comments to our C code by hand to label each line in the output (e.g. asm("# this line is a+b:")), although this is messy and unrelated to C, and assembly comments should not be inside your code for this reason.
You can instead instruct GCC to generate a comment for each instruction based on the code you have given it, which removes the need for this.
The key flag is:
- -f: Pass a parameter with a long name, in this case verbose-asm.

    gcc -O0 -S -fverbose-asm codecall.c -o codecall.s

This may generate the following assembly from the same C source above (I have removed the unneeded portions of code, and the large compiler information comment that this flag adds):

    movl $100, 12(%esp)    #, a
    movl 12(%esp), %eax    # a, tmp61
    addl $200, %eax        #, tmp60
    movl %eax, 8(%esp)     # tmp60, b
    movl $0, %eax          #, D.1957
    leave
    ret

From this we can gather what the program does line by line. I would assume the following from the annotated output:
- store 100 (a) on the stack at an offset
- load that value from the stack into register eax
- eax += 200 (a + 200)
- store eax back on the stack at a new offset (b)
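
The steps above can be mirrored in plain C to make the data flow easier to follow. This is only an illustrative sketch: the array `stack`, the variable `eax` and the function name `traced_b` are names I made up to stand in for the stack frame and the register, and are not anything the compiler produces.

```c
/* Illustrative model only: `stack` stands in for the memory that %esp
   points at, and `eax` for the register. Slot i is byte offset 4*i. */
int traced_b(void)
{
    int stack[4] = {0};  /* offsets 0, 4, 8 and 12 from %esp */
    int eax;

    stack[3] = 100;      /* movl $100, 12(%esp)   a = 100     */
    eax = stack[3];      /* movl 12(%esp), %eax   load a      */
    eax += 200;          /* addl $200, %eax       a + 200     */
    stack[2] = eax;      /* movl %eax, 8(%esp)    b = a + 200 */

    return stack[2];     /* the value now stored in b */
}
```

Calling traced_b() gives back the value that ends up in b, which is what the four instructions compute between them.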
This is where optimizations come in: the compiler will likely just store 300 directly, or even remove this code altogether since the result is never used later on. This is why you must pay attention to the optimization level when generating the assembly (unless you wish to see which code is redundant).
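
If you want the addition to survive at higher optimization levels so you can still study it, one common approach is to make the input unknown at compile time and the result observable. Here is a minimal sketch, assuming a hypothetical helper named `add_200` that is not part of the original program:

```c
/* Hypothetical helper, not from the original program. */
int add_200(int a)
{
    /* Inside this function `a` is not a compile-time constant,
       so the sum cannot simply be replaced with 300. */
    int b = 200 + a;

    /* Returning b makes the result visible to the caller,
       so the computation cannot be thrown away as unused. */
    return b;
}
```

A file containing only this function needs no main to be compiled with -S. With optimization enabled the listing should typically still contain an add (or an equivalent lea) of 200, whereas the body of the original main would most likely shrink to just returning 0.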
For clang (part of the LLVM compiler toolchain) we can generate assembly as well, in a similar syntax. The important flags are:
- -S: Only run preprocessor and compilation steps
- -O0: Do not run any extra optimizations (much the same as GCC)

    clang -S -O0 codecall.c -o codecall.s

This will generate similar output:

    movl $0, %eax
    movl $200, %ecx
    movl $0, -4(%ebp)
    movl $100, -8(%ebp)
    movl -8(%ebp), %edx
    addl %ecx, %edx
    movl %edx, -12(%ebp)
    addl $12, %esp
    popl %ebp
    ret

Do I have to use AT&T style?

You may be more familiar with Intel syntax than AT&T (for example, if you have read Dargeta's set of Intel tutorials: http://forum.codecal...e-part-1-a.html).
GCC and LLVM compilers can both attempt* to emit another assembly syntax by direct translation from the original assembly. You may pass the following flags to the appropriate compiler:

    GCC: -masm=intel
    LLVM (llc static compiler): --x86-asm-syntax=intel

GCC along with the -fverbose-asm and -masm=intel flags may generate this:

    mov DWORD PTR [esp+12], 100    # a,
    mov eax, DWORD PTR [esp+12]    # tmp61, a
    add eax, 200                   # tmp60,
    mov DWORD PTR [esp+8], eax     # b, tmp60
    mov eax, 0                     # D.1957,
    leave
    ret

*The Darwin version of GCC does not support the Intel syntax.
Those are just a few ways of viewing useful information about each instruction of your program, which is especially helpful when learning how code you have written is compiled and how fundamental optimizations can or will be applied to it.
You may review the man pages of both compiler toolchains to see what options you can pass to each compiler, some of which may increase the clarity or speed of specific portions of code.
Edited by Alexander, 23 June 2011 - 10:50 PM.
Added a reference

