
Thread: Intro to Intel Assembly Language: Part 6

    Intro to Intel Assembly Language: Part 6

    Hello, and welcome to part 6 of my indefinitely long introduction series to the Intel assembly language. Today we're going to learn how to pass an arbitrary number of arguments to our function (like printf) and how to return various types of values as well. (Please read my previous tutorials if you haven't, otherwise you may get lost.)

    Variadic Functions
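    Variadic functions are functions that can take any number of arguments. Well...almost any. As of right now C requires at least one named parameter before the ellipsis, and in practice you want one anyway so you have some way of knowing how many arguments there are. (C++ will let you declare a function as just f(...), but then va_start has no named parameter to start from.) According to Wikipedia, C++0x's variadic templates will get around the at-least-one-argument restriction, but until then, we're stuck with this.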
    But enough ranting. Really, there's little to discuss here about variadic functions, as you should already have all the skills you need to figure it out. The only thing you're missing is a way to tell how many arguments and what type they are. You could take a few approaches:
    * The first argument is an integer indicating how many other variables are passed. This means that all of your arguments are of a particular type, say unsigned int or something like that.
    * The first argument is some sort of string, like printf, that'll tell you how many arguments you have and their types.
    * If working with pointers, you could use the first argument for whatever, and pass as many pointers as you want provided that the final pointer is null. The execl and execlp functions work like this.
    * You could combine the second and third and do something like this: reserve a special value as an escape, say 0xFFFFFFFF. Then make a rule that says that the value that comes after that indicates the type, and all variables passed in after that are of that type. Then you could end the list of arguments with 0xFFFFFFFF 0xFFFFFFFF. Personally I'd just go with the format string idea, but I'm putting this here for completeness' sake.
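
    To make the first and third approaches concrete, here is a minimal sketch in C using the standard stdarg.h macros. The function names sum_ints and total_length are made up for illustration; they aren't from any library.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Approach 1: the first argument says how many ints follow.
   (sum_ints is a hypothetical name used only for this example.) */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    assert(count >= 0);
    va_start(args, count);          /* start right after the named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(args, int); /* every extra argument must be an int */
    va_end(args);
    return total;
}

/* Approach 3: keep reading pointers until we hit a null one.
   (total_length is also a hypothetical name.) */
size_t total_length(const char *first, ...)
{
    va_list args;
    size_t total = 0;

    va_start(args, first);
    for (const char *s = first; s != NULL; s = va_arg(args, const char *))
        total += strlen(s);
    va_end(args);
    return total;
}
```

    You'd call these as sum_ints(3, 1, 2, 3) and total_length("ab", "cde", (const char *)NULL). Note the cast on the terminator: a bare NULL isn't guaranteed to be passed as a pointer through the ellipsis.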

    Returning Values
    If you're writing a program entirely in assembly language and you're never going to interface with C/C++ code (or any other language for that matter) then you can define your own protocol for returning values. What I'm going to teach you here is how to return values in a way compatible with C/C++ and other higher-level languages. The method kinda depends on what memory model you're using, i.e. 16-, 32-, or 64-bit. Unfortunately I don't know the rules for 64-bit functions, but I'd assume they're very similar to 32-bit.

    Returning Values in 16-bit Code
    Most values are returned in AL, or AX. 32-bit values such as long integers and far pointers* are returned in DX:AX, where DX holds the upper 16 bits (the segment for far pointers) and AX holds the low 16 bits (the offset for far pointers).
    Basic rule: stuff your return value in the smallest register that'll hold it. That means don't put a char or uint8_t in AX--it has to go in AL. On the other hand, trying to stuff a long int into AL would be stupid to say the least.

    * Near pointers point within a single segment, and are 16 bits wide. DS is assumed as the segment unless explicitly overridden. Far pointers can point anywhere in any segment you have access to. They're 32 bits wide, and consist of the offset followed by the segment. Nowadays with protected mode and flat memory models, pointers are all just either 32 or 64 bits wide (depending on your processor architecture).

    Returning Values in 32-bit Code
    More or less the same rules as 16-bit code: stuff your return value into the smallest register that'll take it. 32-bit integers (including long) now go in EAX instead of DX:AX, and 64-bit integers (long long) go in EDX:EAX, where EDX holds the upper 32 bits. Pointers go in EAX.
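
    If the register-pair idea is new to you, this small C sketch shows which half of a 64-bit value lands where. The helper names are hypothetical, and the code only models the split; a real compiler does this for you when a function returns a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits: the half that would be returned in EDX. */
uint32_t high_half(uint64_t value)
{
    return (uint32_t)(value >> 32);
}

/* Lower 32 bits: the half that would be returned in EAX. */
uint32_t low_half(uint64_t value)
{
    return (uint32_t)(value & 0xFFFFFFFFu);
}

/* What the caller effectively does with EDX:EAX after the call. */
uint64_t join_halves(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Sanity check: splitting and rejoining must give back the original. */
void check_round_trip(uint64_t value)
{
    assert(join_halves(high_half(value), low_half(value)) == value);
}
```

    The same picture applies to DX:AX in 16-bit code, just with 16-bit halves of a 32-bit value.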

    Returning Values in 64-bit Code
    I am guessing. As soon as I find the documentation online I'll fix this, but for now this is me making an educated guess.
    Return just about everything in RAX, or whatever the smallest division of RAX capable of holding your return value is. That means chars still go in AL.

    Returning Floating-Point Values
    Compilers differ on this. GCC makes you return stuff in ST(0) in 32-bit code, whereas some others make you do the following:
    16-bit: Return doubles in AX:BX:CX:DX, where AX holds the most significant 16 bits and DX holds the least significant 16 bits. Return floats in DX:AX, where DX holds the most significant 16 bits, AX the least significant.
    32-bit: Floats in EAX, doubles in EDX:EAX.
    64-bit: Not in the integer registers. Both the System V AMD64 and Microsoft x64 calling conventions return floats and doubles in XMM0.
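
    When a convention returns a float in an integer register, what goes in the register is the raw bit pattern of the value, not a converted integer. Here's a rough C sketch of what those bits look like, assuming IEEE-754 floats and doubles (which is what x86 uses). float_bits and double_bits are names I made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bits of a float: what a "float in EAX" convention would put in EAX. */
uint32_t float_bits(float value)
{
    uint32_t bits;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);   /* copy the bytes, don't convert */
    return bits;
}

/* Raw bits of a double: upper 32 bits would go in EDX, lower 32 in EAX. */
uint64_t double_bits(double value)
{
    uint64_t bits;

    assert(sizeof bits == sizeof value);
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

    For example, float_bits(1.0f) is 0x3F800000, which is nothing like the integer 1.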

    Putting It All Together
    I'm going to write a function called gsprintf (ghetto sprintf) that accepts a buffer, a format string, and any number of integers according to the format string. (This means that %d is the only legal specifier.) It returns a count of the number of characters it's printed to the buffer, not including the null.
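
    For comparison, here's roughly what that behavior looks like in plain C. This is a reference sketch, not a translation of the assembly: gsprintf_ref and write_decimal are hypothetical names, and write_decimal stands in for itoa(), which isn't part of standard C.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for itoa(): writes value in base 10 at out
   (no terminator) and returns the number of characters written. */
static size_t write_decimal(char *out, int value)
{
    char digits[12];                /* plenty for a 32-bit int */
    size_t n = 0, written = 0;
    unsigned int magnitude = (value < 0) ? 0u - (unsigned int)value
                                         : (unsigned int)value;

    do {                            /* collect digits, least significant first */
        digits[n++] = (char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);

    if (value < 0)
        out[written++] = '-';
    while (n > 0)                   /* emit them in the right order */
        out[written++] = digits[--n];
    return written;
}

/* Returns the number of characters written, not counting the null. */
size_t gsprintf_ref(char *buf, const char *fmt, ...)
{
    va_list args;
    size_t count = 0;

    assert(buf != NULL && fmt != NULL);
    va_start(args, fmt);
    while (*fmt != '\0') {
        if (fmt[0] == '%' && fmt[1] == 'd') {
            count += write_decimal(buf + count, va_arg(args, int));
            fmt += 2;
        } else if (fmt[0] == '%' && fmt[1] != '\0') {
            /* unknown specifier: copy '%' and the next character as-is */
            buf[count++] = *fmt++;
            buf[count++] = *fmt++;
        } else {
            buf[count++] = *fmt++;  /* ordinary character (or a trailing '%') */
        }
    }
    va_end(args);
    buf[count] = '\0';
    return count;
}
```

    Calling gsprintf_ref(buf, "x=%d, y=%d", 5, -12) fills buf with "x=5, y=-12" and returns 10. Like the assembly version, it trusts the caller to supply a big enough buffer.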

    Code:
    ; size_t gsprintf(char *buf, const char *formatstr, ...);
    ; Returns a count of the characters it has written.
    ;
    
    _gsprintf:
        ;create stack frame to access variables
        push    ebp
        mov     ebp, esp
        
        ;save registers we're going to trash; by convention
        ;we don't need to save EAX, ECX, EDX or ES. The rest
        ;of the registers need to be saved.
        push    esi
        push    edi
        push    ebx
        pushfd          ;save flags register
        
        ;make ES = DS
        push    ds
        pop     es
        
        ; DS:[ESI] will point to format string, ES:[EDI]
        ; will point to our character buffer.
        mov     edi, [ebp + 8]
        mov     esi, [ebp + 12]
        ;set the count of characters we've printed to 0.
        xor     eax, eax
        ;keep track of what argument we're at
        xor     ecx, ecx
        
        ;start processing characters!
        
        PROCESS_CHAR:
            ;load the next character from our format
            ;string into DL
            mov     dl, [esi]
    
            ;check to see if we hit a format specifier
            cmp     dl, '%'
            je      PROCESS_ARG
    
            ;no format specifier, it's something else.
            ;just write it to the buffer.
            mov     [edi], dl
    
            ;check to see if we hit a null. the null gets
            ;written to the buffer but isn't counted.
            or      dl, dl
            jz      EXIT_FUNC
    
            ;not a null. increment the pointers and the
            ;character count, go on to next char
            inc     edi
            inc     esi
            inc     eax
            jmp     PROCESS_CHAR
            
            PROCESS_ARG:
                ;skip over '%' and see what comes next.
                ;should be 'd'.
                inc     esi
                mov     dl, [esi]
                cmp     dl, 'd'
                je      PRINT_INT
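                ;invalid format specifier, print out '%'
                ;followed by whatever character comes
                ;next.
                mov     BYTE PTR [edi], '%'
                inc     edi
                inc     eax
                ;if the format string ended right after the
                ;'%', go back and let the main loop write
                ;the null and exit.
                or      dl, dl
                jz      PROCESS_CHAR
                mov     [edi], dl
                ;next character
                inc     esi
                inc     edi
                inc     eax
                jmp     PROCESS_CHAR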
                
            PRINT_INT:
                ;if we get here then it's an integer. let's
                ;cheat and call itoa() to make it a string.
                ;itoa takes three args: the int to convert,
                ;the buffer, and the radix, which'll always
                ;be 10. note that itoa returns a pointer in
                ;EAX and is free to trash ECX and EDX, so we
                ;have to save our character count and our
                ;argument counter first.
                push    eax
                push    ecx
                
                ;push arguments backwards...
                mov     ebx, 10
                push    ebx         ;radix 10
                push    edi         ;output buffer
                
                ;calculate address of next integer arg
                lea     ebx, [ebp + ecx*4 + 16]
                push    DWORD PTR [ebx]     ;integer to convert
                call    _itoa
    
                ;itoa is cdecl, so it's our job to pop its
                ;three arguments (12 bytes) off the stack
                add     esp, 12
    
                ;restore our argument counter and our
                ;character count
                pop     ecx
                pop     eax
                
                ;increment character count by whatever
                ;itoa printed...
                COUNT_ITOA_CHARS:
                    ;edi still points to the beginning of the
                    ;string that itoa printed out. we can just
                    ;keep going until we hit a null, which is
                    ;the end of the itoa string.
                    cmp     BYTE PTR [edi], 0
                    jz      END_PRINT_INT
                    
                    ;not a null, increment character count
                    ;and destination pointer
                    inc     eax
                    inc     edi
                    jmp     COUNT_ITOA_CHARS
    
                END_PRINT_INT:
                ;increment argument counter
                inc     ecx
                
                ;go on to next character in format string
                inc     esi
                jmp     PROCESS_CHAR
        
        EXIT_FUNC:
        ;restore the registers we trashed
        popfd
        pop     ebx
        pop     edi
        pop     esi
        
        ;clean up stack and return
        pop     ebp
        ret
    Well, that's all I have time for right now. In the next tutorial I'll cover how to do more (e.g. console interaction, perhaps a little graphics) with interrupts and system calls.

    Next In This Series
    Intro to Intel Assembly Language: Part 7
    Last edited by dargueta; 11-18-2010 at 07:22 PM. Reason: Clarified some stuff

    Re: Intro to Intel Assembly Language: Part 6

    Interesting... do you have any examples?

    Re: Intro to Intel Assembly Language: Part 6

    *smacks forehead*

    I'll get on those tonight.

    Re: Intro to Intel Assembly Language: Part 6

    I'd have to study the example for a while to have it really soak in, but well done! +rep