
Thread: Intro to Intel Assembly Language: Part 5

  1. #1
    Join Date
    Oct 2007
    Location
    /dev/null
    Posts
    4,513
    Blog Entries
    8
    Rep Power
    59

    Intro to Intel Assembly Language: Part 5

    Last time I taught you how to make conditional jumps and create if-else and switch statements, as well as all kinds of loops. Today we're going to learn how to call functions and basic methods for passing arguments to these functions. Passing arguments to variadic functions and returning values from a function will be left for the next tutorial. This tutorial is going to assume you're going to integrate with other libraries or C/C++ code, so we're going to follow those standards. However, if you're doing your own project and it's all in assembly language, I'll also tell you which conventions you can violate and which you probably shouldn't.

    Function Calls
    There's a really simple way of calling functions: with the call mnemonic. call can be used to call hard-coded functions (i.e. the same function will always be called) or variable functions (the same code can call different functions):

    Code:
    /*declare a function pointer type FUNCPTR
    that points to a function taking a single void*
    as an argument and returning an int*/
    typedef    int (*FUNCPTR)(void *);
    
    FUNCPTR    functions[4];
    
    ...
    /*hard-coded function call*/
    foo();
    
    /*variable function call - a different function
    is called on every iteration of the loop*/
    for(int i = 0; i < 4; ++i)
        functions[i](NULL);
    So how would we do this in our code? That depends on which assembler you're using. Most will let you declare a function and then just use the function name for the call, like so:

    Code:
    call    whatever
    To use a function pointer, you can do something like the following:
    Code:
    ;assume EAX points to a function pointer in memory
    call    [eax]
    I used to use Microsoft's ancient-as-hell debug.exe, which would only allow addresses. So I had to manually calculate the addresses of my functions--a real pain in the butt because if you're off by one byte you're screwed. So my function calls looked like this:

    Code:
    ; calling foo() w/ no args
    call    08F4
    I'm going to use generic syntax that you can easily adapt for use in any assembler, whether it's NASM, MASM, YASM, debug.exe, or whatever.

    All functions have addresses. Otherwise the CPU wouldn't know where to get the instructions from. I'm going to assume you don't really have control over where your functions end up in memory. If you use something like debug.exe to assemble your stuff (not recommended), you can easily figure it out for yourself. If not, just ask me and I'll be glad to help you.

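    Aha, you ask, but how does the CPU know where to resume execution? Well, it's simple. The call instruction does two things: it pushes the return address (the address of the instruction right after the call) onto the stack, then it jumps to the location you specify. When the function finishes, ret pops that address back off the stack and jumps to it. Make a note of this; it'll be important later.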

    Anyway, I showed you above how to call a function that returns nothing and takes no arguments. This is all well and good, but what if we want to pass in arguments? What then? The stack comes to our rescue.

    The Stack
    Think of the stack as a pile of sheets of paper. If you want to remember something, scribble it on a sheet and put it on top of the pile. You can't sift around for what you want, though--you can only "push" a sheet on top, or "pop" the top sheet off. That means that if you want to access the third sheet from the top of the stack, you need to pop off the first and second sheets. There's a way around this, but I'll show you later. There are three registers controlling the stack in Intel CPUs: SS, SP, and BP. (In 32-bit ASM, it's SS/ESP/EBP and in 64-bit ASM it's SS/RSP/RBP.)
    SS - The segment of memory that the stack is in. (I will discuss segments, protected mode, real mode, etc. in later tutorials.)
    SP - Points to the top of the stack in memory (the most recently pushed item).
    BP - By convention, points to the base of the current function's stack frame. The CPU doesn't maintain it for you; your code sets it up, as shown under Passing Values.

    One would think that the stack would grow upwards in memory (i.e. from low to high addresses), but with Intel processors it's exactly the opposite. (See footnote 1 for why.) Because every push moves SP to a lower address, SP should always be less than or equal to BP. If SP equals BP, the current function has nothing on the stack below its frame base.
    There are two main instructions for manipulating the stack: push and pop:

    Code:
    ;push takes one argument, the data
    ;you want to push onto the stack.
    push    ebx
    
    ;is the same as (make room first, then store)
    sub    esp,4
    mov    [esp],ebx
    Code:
    ;pop takes one argument, the register/memory
    ;location that you want to pop the top of the
    ;stack into.
    pop    eax
    
    ;is the same as
    mov    eax,[esp]
    add    esp,4
    There are a few other instructions for manipulating the stack, but they're not really necessary/pertinent for this discussion. See footnote 2 for these extra instructions. So what can you do with push and pop?

    Code:
    ;push the contents of EAX onto the stack
    push    eax
    ;push 16 bits from where EBX points to
    push    WORD PTR [ebx]
    ;push a 32-bit value
    push    DWORD    0x8000F185
    ;ILLEGAL!
    push    al
    
    ;pop into a memory location
    pop    DWORD    ES:[0xF858]
    ;pop into a register
    pop    cx
    ;pop into a 32-bit register
    pop    edx
    ;ILLEGAL!
    pop    al
    ;ILLEGAL!
    pop    WORD    0x1234
    You can't pop into a constant for obvious reasons. But why can't you push/pop bytes? This is to keep data aligned on even byte boundaries, to avoid issues with hardware that doesn't like reading multibyte data from odd byte boundaries. For 16-bit systems, always keep the stack aligned to a 2-byte boundary, 4 bytes for 32-bit systems, and 8 bytes for 64-bit systems. If you have to waste memory...oh well.

    Passing Values
    But what does this all have to do with functions? Well, as you've probably guessed, arguments are passed on the stack. Now because the stack grows downward, we usually push arguments on backwards so that the first argument ends up at the top of the stack (the lowest address). The reason for this will become apparent momentarily.

    Let's assume that we have a function void foo(uint32_t a, uint16_t b). We could call foo like so:
    Code:
    ;assume eax=a, ebx=b
    push    ebx
    push    eax
    call    foo
    Note that I stuck a 16-bit variable in a 32-bit register for passing it to a function. Again, we need to keep the stack aligned.

    So how would foo access the arguments? Well, we could pop them off the stack...but we have a limited number of registers. What if we have a function that takes, I dunno...ten arguments? What if it takes a variable number of arguments? We'd be screwed. There's a better way of doing this: use a single register to point at a fixed spot on the stack, and add offsets to reach each argument, sort of like an array. Most compilers/coders use bp, ebp or rbp for this for historical reasons. (See footnote 3 for why.) Continuing with our above example, here's how we would access the arguments:
    Code:
    ;function entry point
    ;save EBP before we use it for our argument pointer
    push   ebp
    
    ;after this mov, EBP points to the top of the stack. this
    ;is NOT the first argument: [EBP] holds the caller's saved
    ;EBP and [EBP+4] holds the return address. assuming both
    ;are 32 bits (4 bytes) wide, our first argument is at
    ;[EBP+8], not [EBP].
    mov    ebp,esp
    
    ;allocate space for local variables
    sub    esp,TOTAL_SIZE_OF_LOCAL_VARIABLES
    ....
    
    ;add 5 to A
    add    DWORD PTR [ebp+8],5
    
    ;subtract 10 from B
    sub    WORD PTR [ebp+12],10
    
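    ;store B in a local variable. B is only 16 bits wide,
    ;so copy it through a 16-bit register.
    mov    bx, [ebp+12]
    mov    [esp], bx
    
    ;store A in a different local variable, 4 bytes further
    ;up so that it doesn't overlap B and stays aligned
    mov    eax, [ebp+8]
    mov    [esp+4], eax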
    
    ....
    
    ;return code
    ;deallocate space for local variables
    add    esp, TOTAL_SIZE_OF_LOCAL_VARIABLES
    
    ;restore EBP
    pop    ebp
    
    ;THIS CODE IS FOR NON-VARIADIC FUNCTIONS ONLY.
    ;Our function takes a fixed number of arguments--we
    ;always know how many bytes' worth of arguments are
    ;passed in. Someone has to remove the arguments from
    ;the stack: either we do it, or the function that called
    ;us does. Because we always know how many bytes were
    ;passed in, we might as well clean up. Otherwise every
    ;caller would have to clean up after every call--which
    ;is lots of code duplication. If we clean up here, the
    ;code is in only one place, so we save space.
    ;
    ;NOTE: callee cleanup is what the stdcall convention does.
    ;C's default convention (cdecl) leaves cleanup to the
    ;caller, so only use "ret N" if your callers expect it.
    ;
    ;We take one DWORD argument and one WORD argument, for a
    ;total of 8 bytes. Remember we have to keep the stack aligned,
    ;so each argument occupies a multiple of 4 bytes.
    ret     8
    Note that you need the prologue and return code so that you don't screw up the stack alignment or lose track of where your calling function's variables are. Note what happens if our function calls a subfunction: because ESP points below our local variables, anything we push for that call (its arguments and the return address) lands below our locals and doesn't overwrite them.

    Well, now you know how to call functions and pass in values! Next time I'll teach you how to deal with variadic functions (functions that take a variable number of arguments), and return values from your functions.

    Next In This Series
    Intro to Intel Assembly Language: Part 6

    Footnote 1 - Why the Intel stack is upside-down
    Way back in the day, a lot of programs used the tiny model for code layout, which dictated that everything must fit in one 64K segment - code, data, and stack. To minimize the chance of a stack overflow overwriting code and/or data, Intel engineers decided that the stack should start at the end of the segment and grow downward towards the code and data.

    Footnote 2 - Extra instructions for manipulating the stack
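    PUSHA / POPA - Push/pop all 16-bit general registers (PUSHA also pushes the original value of SP; POPA skips over that value instead of loading it)
    PUSHAD / POPAD - Push/pop all 32-bit general registers (ESP is treated the same way)
    There is no 64-bit version: PUSHA/POPA and PUSHAD/POPAD are not valid in 64-bit mode, so push and pop the registers you need one at a time.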
    PUSHF / POPF - Push/pop flags register (16-bit)
    PUSHFD / POPFD - Push/pop eflags register (32-bit)
    PUSHFQ / POPFQ - Push/pop rflags register (64-bit)

    Footnote 3 - Why EBP?
    The way Intel instructions were originally encoded, 16-bit code could only address memory through bx, bp, si, and di. Of those, only bp automatically references the stack segment; bx, si, and di automatically reference the data segment, so using one of them would require a segment override every single time a function tried to access an argument. Clearly this would slow things down and bloat code, so...we use bp, and ebp and rbp inherited the job.
    Last edited by dargueta; 11-30-2010 at 12:47 PM. Reason: Made comments easier to read, fixed grammar
    sudo rm -rf /

  3. #2
    Jordan Guest

    Re: Intro to Intel Assembly Language: Part 5

    Very well done, Dargueta! +rep

  4. #3
    Join Date
    Aug 2009
    Location
    ~/
    Posts
    918
    Rep Power
    19

    Re: Intro to Intel Assembly Language: Part 5

    Very informative +rep

  5. #4
    Join Date
    Jul 2006
    Posts
    16,491
    Blog Entries
    75
    Rep Power
    143

    Re: Intro to Intel Assembly Language: Part 5

    Very nice job. +rep
    Programming is a branch of mathematics.
    My CodeCall Blog | My Personal Blog

  6. #5
    Join Date
    Oct 2007
    Location
    /dev/null
    Posts
    4,513
    Blog Entries
    8
    Rep Power
    59

    Re: Intro to Intel Assembly Language: Part 5

    Thanks!

    EDIT: I fixed an offset error with the order of the arguments. The first argument is at EBP+8, not EBP+4 as I previously stated.
    Last edited by dargueta; 11-14-2009 at 02:18 PM.
    sudo rm -rf /

  7. #6
    kaway! is offline Newbie
    Join Date
    Mar 2010
    Posts
    1
    Rep Power
    0

    Re: Intro to Intel Assembly Language: Part 5

    I enjoyed your tutorial!
    But I'm still a beginner. Could you help me?
    I program in Delphi and wanted to call this address.
    asm
    push eax
    mov eax, 008D0C70h
    mov byte ptr [eax], 1
    pop eax
    end;
    How do I do it? :T

  8. #7
    Join Date
    Oct 2007
    Location
    /dev/null
    Posts
    4,513
    Blog Entries
    8
    Rep Power
    59

    Re: Intro to Intel Assembly Language: Part 5

    I think this is it...
    Code:
    procedure MyFunction; near;
    begin
        asm
            push   eax
            mov    eax, 008d0c70h
            mov    BYTE PTR [eax], 1
            pop    eax
            // no ret here: the compiler emits the return for a begin..end procedure
        end;
    end;
    sudo rm -rf /
