Hello, and welcome to part 6 of my indefinitely long introduction series to the Intel assembly language. Today we're going to learn how to pass an arbitrary number of arguments to our function (like printf) and how to return various types of values as well. (Please read my previous tutorials if you haven't, as you may get lost.)
Variadic Functions
Variadic functions are functions that can take any number of arguments. Well...almost any. Before C23, C required at least one named parameter before the ellipsis, which also gives you some way of knowing how many arguments there are. C++ accepts a bare (...) in a declaration, but va_start has traditionally needed a named parameter to work from, so in practice you're stuck with the same rule there.
But enough ranting. Really, there's little to discuss here about variadic functions, as you should already have all the skills you need to figure it out. The only thing you're missing is a way to tell how many arguments there are and what types they are. You could take a few approaches:
* The first argument is an integer indicating how many other variables are passed. This means that all of your arguments are of a particular type, say unsigned int or something like that.
* The first argument is some sort of string, like printf, that'll tell you how many arguments you have and their types.
* If working with pointers, you could use the first argument for whatever, and pass as many pointers as you want provided that the final pointer is null. The execl and execlp functions work like this.
* You could combine the second and third and do something like this: reserve a special value as an escape, say 0xFFFFFFFF. Then make a rule that says that the value that comes after that indicates the type, and all variables passed in after that are of that type. Then you could end the list of arguments with 0xFFFFFFFF 0xFFFFFFFF. Personally I'd just go with the format string idea, but I'm putting this here for completeness' sake.
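Here's a rough C sketch of the first and third approaches, just to show what the caller and callee have to agree on. The function names sum_counted and count_strings are made up for illustration; only va_start, va_arg and va_end come from the standard <stdarg.h> header.

```c
#include <stdarg.h>
#include <stddef.h>

/* Approach 1: the first argument says how many unsigned ints follow.
   (Hypothetical helper, named for this example only.) */
unsigned int sum_counted(unsigned int count, ...)
{
    va_list args;
    unsigned int total = 0;

    va_start(args, count);
    for (unsigned int i = 0; i < count; i++)
        total += va_arg(args, unsigned int);
    va_end(args);

    return total;
}

/* Approach 3: keep reading pointers until a null pointer shows up.
   (Hypothetical helper, named for this example only.) */
size_t count_strings(const char *first, ...)
{
    va_list args;
    size_t n = 0;

    va_start(args, first);
    for (const char *s = first; s != NULL; s = va_arg(args, const char *))
        n++;
    va_end(args);

    return n;
}
```

A call like sum_counted(3, 1u, 2u, 3u) walks exactly three arguments, while count_strings("ls", "-l", (const char *)NULL) stops at the null. In both cases, lying to the function (wrong count, missing null) sends it wandering off into whatever else is on the stack.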
Returning Values
If you're writing a program entirely in assembly language and you're never going to interface with C/C++ code (or any other language for that matter) then you can define your own protocol for returning values. What I'm going to teach you here is how to return values in a way compatible with C/C++ and other higher-level languages. The method depends on whether you're writing 16-, 32-, or 64-bit code. Unfortunately I don't know the rules for 64-bit functions, but I'd assume they're very similar to 32-bit.
Returning Values in 16-bit Code
Most values are returned in AL, or AX. 32-bit values such as long integers and far pointers* are returned in DX:AX, where DX holds the upper 16 bits (the segment for far pointers) and AX holds the low 16 bits (the offset for far pointers).
Basic rule: stuff your return value in the smallest register that'll hold it. That means don't put a char or uint8_t in AX--it has to go in AL. On the other hand, trying to stuff a long int into AL would be stupid to say the least.
* Near pointers point within a single segment, and are 16 bits wide. DS is assumed as the segment unless explicitly overridden. Far pointers can point anywhere in any segment you have access to. They're 32 bits wide, and consist of the offset followed by the segment. Nowadays with protected mode and flat memory models, pointers are all just either 32 or 64 bits wide (depending on your processor architecture).
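If the DX:AX business is hard to picture, this little C sketch does the same split in software. The struct dx_ax and the two helpers are invented for this example; they just model which half of a 32-bit value a 16-bit caller would find in which register.

```c
#include <stdint.h>

/* Models the register pair a 16-bit caller reads a 32-bit result from.
   (Hypothetical type, for illustration only.) */
struct dx_ax {
    uint16_t dx;   /* upper 16 bits (the segment of a far pointer) */
    uint16_t ax;   /* lower 16 bits (the offset of a far pointer)  */
};

/* Split a 32-bit value the way it would be returned in DX:AX. */
struct dx_ax split_long(uint32_t value)
{
    struct dx_ax regs;
    regs.dx = (uint16_t)(value >> 16);
    regs.ax = (uint16_t)(value & 0xFFFFu);
    return regs;
}

/* Put the two halves back together, as the caller would. */
uint32_t join_long(struct dx_ax regs)
{
    return ((uint32_t)regs.dx << 16) | regs.ax;
}
```

So a function returning 0x12345678 would leave 0x1234 in DX and 0x5678 in AX.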
Returning Values in 32-bit Code
More or less the same rules as 16-bit code: stuff your return value into the smallest register that'll take it. 32-bit integers (including long, which is 32 bits wide here) now go in EAX instead of DX:AX, and long long integers (i.e. 64-bit integers) go in EDX:EAX. Pointers go in EAX.
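As a memory aid, the "smallest register that'll take it" rule for 32-bit code can be written down as a lookup. This is only a sketch of the rule as stated above for integers and pointers; return_register_32 is a made-up name, and anything the text doesn't cover (structs, for instance) gets NULL.

```c
#include <stddef.h>

/* Maps the size of an integer or pointer return value to the register
   it comes back in under the 32-bit rules described in the text.
   (Hypothetical helper, for illustration only.) */
const char *return_register_32(size_t size_in_bytes)
{
    switch (size_in_bytes) {
    case 1:  return "AL";       /* char, uint8_t           */
    case 2:  return "AX";       /* short                   */
    case 4:  return "EAX";      /* int, long, pointers     */
    case 8:  return "EDX:EAX";  /* 64-bit integers         */
    default: return NULL;       /* not covered by the text */
    }
}
```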
Returning Values in 64-bit Code
I am guessing. As soon as I find the documentation online I'll fix this, but for now this is me making an educated guess.
Return just about everything in RAX, or whatever the smallest division of RAX capable of holding your return value is. That means chars still go in AL.
Returning Floating-Point Values
Compilers differ on this. GCC makes you return stuff in ST(0) in 32-bit code, whereas some others make you do the following:
16-bit: Return doubles in AX:BX:CX:DX, where AX holds the most significant 16 bits and DX holds the least significant 16 bits. Return floats in DX:AX, where DX holds the most significant 16 bits, AX the least significant.
32-bit: Floats in EAX, doubles in EDX:EAX.
64-bit: Neither of the above. Both the System V AMD64 ABI and the Microsoft x64 calling convention return floats and doubles in XMM0.
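For the compilers that hand floating-point values back in integer registers, what lands there is just the raw bit pattern of the number. The sketch below shows how to look at that pattern from C. It assumes IEEE 754 floats and doubles (4 and 8 bytes), which is what x86 uses; float_bits and double_bits are names I made up.

```c
#include <stdint.h>
#include <string.h>

/* Raw 32-bit pattern of a float: what such a compiler would put in EAX.
   Assumes a 4-byte IEEE 754 float. (Hypothetical helper.) */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Raw 64-bit pattern of a double: the upper half would go in EDX,
   the lower half in EAX. Assumes an 8-byte IEEE 754 double.
   (Hypothetical helper.) */
uint64_t double_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

For example, 1.0f is 0x3F800000, and 1.0 as a double is 0x3FF0000000000000, so EDX would hold 0x3FF00000 and EAX would hold 0.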
Putting It All Together
I'm going to write a function called gsprintf (ghetto sprintf) that accepts a buffer, a format string, and any number of integers according to the format string. (This means that %d is the only legal specifier.) It returns a count of the number of characters it's printed to the buffer, not including the null.
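Before diving into registers, it may help to see the same behavior spelled out in C. This is only a reference sketch of what gsprintf is described as doing, not a translation of anyone's compiler output. It uses the standard sprintf for the integer conversion, and its handling of an unknown specifier (copy the '%' and the character after it) is my assumption about the intended behavior.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Reference sketch: copies formatstr into buf, expanding each %d with
   the next int argument. Returns the number of characters written,
   not counting the terminating null. */
size_t gsprintf(char *buf, const char *formatstr, ...)
{
    va_list args;
    size_t count = 0;

    assert(buf != NULL && formatstr != NULL);
    va_start(args, formatstr);

    for (const char *p = formatstr; *p != '\0'; p++) {
        if (*p != '%') {
            /* ordinary character: copy it straight across */
            buf[count++] = *p;
        } else if (p[1] == 'd') {
            /* %d: convert the next int argument in place */
            count += (size_t)sprintf(buf + count, "%d", va_arg(args, int));
            p++;    /* skip the 'd' */
        } else {
            /* unknown specifier: copy '%' and whatever follows it,
               unless the format string ends right after the '%' */
            buf[count++] = '%';
            if (p[1] != '\0') {
                buf[count++] = p[1];
                p++;
            }
        }
    }

    va_end(args);
    buf[count] = '\0';
    return count;
}
```

Calling gsprintf(buf, "a%db", 42) fills buf with "a42b" and returns 4. As with the real sprintf, it's on the caller to make sure buf is big enough.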
Code:
; size_t gsprintf(char *buf, const char *formatstr, ...);
;
; Returns a count of the characters it has written,
; not including the terminating null.
;
_gsprintf:
    ;create stack frame to access variables
    push ebp
    mov ebp, esp

    ;save registers we're going to trash; by convention
    ;we don't need to save EAX, ECX or EDX. The rest
    ;of the registers need to be saved.
    push esi
    push edi
    push ebx
    pushfd              ;save flags register

    ;ESI will point to the format string, EDI will
    ;point to our character buffer.
    mov edi, [ebp + 8]
    mov esi, [ebp + 12]

    ;set the count of characters we've printed to 0.
    xor eax, eax

    ;keep track of what argument we're at. this lives
    ;in EBX rather than ECX because itoa is allowed to
    ;trash ECX but has to preserve EBX.
    xor ebx, ebx

    ;start processing characters!
PROCESS_CHAR:
    ;load the next character from our format
    ;string into DL
    mov dl, [esi]

    ;check to see if we hit a format specifier
    cmp dl, '%'
    je PROCESS_ARG

    ;no format specifier, it's something else.
    ;just write it to the buffer.
    mov [edi], dl

    ;check to see if we hit a null. we've already
    ;written it, but it doesn't get counted.
    or dl, dl
    jz EXIT_FUNC

    ;not a null. count it, increment the pointers,
    ;go on to next char
    inc edi
    inc esi
    inc eax
    jmp PROCESS_CHAR

PROCESS_ARG:
    ;skip over '%' and see what comes next.
    ;should be 'd'.
    inc esi
    mov dl, [esi]
    cmp dl, 'd'
    je PRINT_INT

    ;invalid format specifier, print out '%'
    ;followed by whatever character comes
    ;next.
    mov BYTE PTR [edi], '%'
    inc edi
    inc eax

    ;if the format string ended right after the
    ;'%', let PROCESS_CHAR write the null and exit.
    or dl, dl
    jz PROCESS_CHAR

    mov [edi], dl

    ;next character
    inc esi
    inc edi
    inc eax
    jmp PROCESS_CHAR

PRINT_INT:
    ;if we get here then it's an integer. let's
    ;cheat and call itoa() to make it a string.
    ;(itoa isn't standard C, but a lot of C
    ;runtimes provide it.) itoa takes three args:
    ;the int to convert, the buffer, and the radix,
    ;which'll always be 10. note that itoa returns
    ;a pointer in EAX, so we have to save EAX to
    ;avoid trashing our own return value.
    push eax

    ;push arguments backwards...
    push 10             ;radix 10
    push edi            ;output buffer

    ;the next integer arg lives at [ebp + 16 + 4*index]
    push DWORD PTR [ebp + ebx*4 + 16]   ;integer to convert
    call _itoa

    ;C calling convention: the caller has to remove
    ;the three arguments (12 bytes) from the stack
    add esp, 12

    ;restore our character count
    pop eax

    ;increment character count by whatever
    ;itoa printed...
COUNT_ITOA_CHARS:
    ;edi still points to the beginning of the
    ;string that itoa printed out. we can just
    ;keep going until we hit a null, which is
    ;the end of the itoa string.
    cmp BYTE PTR [edi], 0
    jz END_PRINT_INT

    ;not a null, increment character count
    ;and destination pointer
    inc eax
    inc edi
    jmp COUNT_ITOA_CHARS

END_PRINT_INT:
    ;increment argument counter
    inc ebx

    ;go on to next character in format string
    inc esi
    jmp PROCESS_CHAR

EXIT_FUNC:
    ;restore the registers we trashed
    popfd
    pop ebx
    pop edi
    pop esi

    ;clean up stack and return
    pop ebp
    ret

Well, that's all I have time for now. In the next tutorial I'll cover how to do more (e.g. console interaction, perhaps a little graphics) with interrupts and system calls.
Next In This Series
Intro to Intel Assembly Language: Part 7
Last edited by dargueta; 11-18-2010 at 07:22 PM. Reason: Clarified some stuff
Interesting... do you have any examples?
*smacks forehead*
I'll get on those tonight.
I'd have to study the example for a while to have it really soak in, but well done! +rep