TI ASM Part 3: Calling Convention

Posted by gregwarner, 27 May 2014

assembly ti-99/4a old computers

As part of my series on learning assembly for my old ’80s-era computer, I’ve been developing a call stack, since the hardware doesn’t have one built in. Today, we’re going to talk about what a calling convention is, and design one for the TI-99/4A.

Typically, a compiler targeting a particular architecture follows a defined calling convention: a precise set of steps to follow when calling a function, so that the call stack is kept in proper order and elements on the stack can be found in their expected locations. Since I don’t currently have access to a C compiler for the TI-99/4A, I’m taking the opportunity to define my own calling convention.

Each time you call a function, you generally pass in one or more inputs, and you may get a result returned when the function ends. When the computer executes a function, you may want to set up temporary local variables for that function to use. You also need to know which hardware registers will be affected by a function, and which registers will remain intact. The decisions we make for all of these will define our calling convention.

Let’s talk about registers first. The TMS9900 CPU has 16 general purpose registers: R0 through R15. We already know from last time that we can’t touch R11 (the Link Register) or R12 (the CRU base register, used for device I/O), and we reserved R13 as the Stack Pointer. As we have already seen, a function that calls another function must store R11 on the stack so it can be restored at the end of the function’s execution. Let’s also say that if we need to change R12 during our function for whatever reason, we should save and restore it as well, so that our use of R12 is transparent to the calling function.

Passing in parameters can be done with registers, stack variables, or a mixture of both. I’m going to model my convention a little after the ARM calling convention, and declare that we can use R0 through R3 to pass in parameters, and if any more space is needed beyond those four registers, we must use the stack. Any return value will be passed back to the calling function in R0.
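To make the register half of this concrete, here’s a minimal sketch of a call that passes two parameters in R0 and R1 and gets its result back in R0. SUM2 is a made-up function for illustration, not something from earlier in this series, and the snippet assumes the caller has already saved its own R11.

```asm
* Caller: pass two values in R0 and R1, get the sum back in R0.
       LI   R0,5           *First parameter.
       LI   R1,7           *Second parameter.
       BL   @SUM2          *Call SUM2; BL puts the return address in R11.
       MOV  R0,R4          *R0 holds the return value; copy it somewhere safe.
*
* SUM2 (hypothetical): returns R0+R1 in R0. It calls nothing else,
* so R11 still holds the return address when we reach RT.
SUM2   A    R1,R0          *R0 = R0 + R1
       RT                  *Return to the caller (same as B *R11).
```

Nothing touches the stack here, which is the whole appeal of passing parameters in registers when there are only a few of them.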

What this means is that R0 through R3 will get clobbered by our function. They are not guaranteed to be the same as they were before calling the function. Therefore, calling code should not keep any vital information in R0 through R3 across a function call, since that data may be lost. Instead, R4 through R10 are defined as safe storage locations for local variables. Our calling convention will dictate that, should a function require the use of any of these registers, it should first PUSH the old values to the stack, and then POP them back before exiting the function.

Since we’re effectively saving R11 and R12 as well, let’s go ahead and extend that guarantee all the way to the end of the register file: R4 through R15 are all guaranteed to hold the same values after returning from a function as they did before the function was entered.
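As a sketch of what preserving a single register looks like, here’s a hypothetical function, SCALE, that borrows R4 as scratch space. It assumes the stack grows downward with R13 pointing at the topmost word, as in the rest of this series. Because SCALE doesn’t call anything else, R11 is never overwritten and doesn’t need saving.

```asm
* SCALE (hypothetical): returns R0*4 in R0, using R4 as scratch.
SCALE  DECT R13            *PUSH R4: make room for one word...
       MOV  R4,*R13        * ...and store the caller's R4 there.
       MOV  R0,R4          *R4 is now ours to use.
       SLA  R4,2           *Shift left by 2 bits: R4 = R4 * 4.
       MOV  R4,R0          *The return value goes in R0.
       MOV  *R13+,R4       *POP R4: restore the caller's value.
       RT                  *Return to the caller.
```

From the caller’s point of view R4 never changed, while R0 was overwritten with the result, exactly as the convention promises.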

So to summarize:
R0 through R3 are for passing parameters. Any more than this, and we must use the stack.
R4 through R10 are for local variables. If a function needs more than this, it must use the stack.
R0 through R3 get clobbered when calling a function.
R4 through R15 are preserved when calling a function.
R0 is used for the return value.

Now that we have that, we need to define a stack frame. A stack frame is a section of the stack that is “owned” by a particular function. That function’s saved registers and local variables will exist in its stack frame. The order that we push objects onto the stack will define the structure of the stack frame.

The first thing we should do when entering a function is save the link register, R11, so we know where to return at the end of our function. So R11 gets pushed first. Next, we should save any registers we intend to use. Let’s say we should push those in reverse order, so that R15 is pushed first, and R4 is pushed last (skipping R11, of course, since it was already pushed). Last, if our function needs any local variables beyond what the registers can hold, it should reserve that area on the stack.

Returning from a function is as simple as reversing this order. First, we POP the area we reserved for local variables. Next, we POP and restore all the working registers in ascending order, since a stack is First In Last Out. Finally, we POP R11 and return execution to the address it holds.

If we need to pass additional parameters beyond what can fit in R0 through R3, then it is up to the calling function to allocate this space on the stack. Therefore, parameters for a called function may be found by reaching out into the calling function’s stack frame.

So a rough depiction of our stack frames will look as follows:

Attached Image
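In text form, the layout described above looks roughly like the comment block below. This is a sketch reconstructed from the prose, so treat the exact grouping as an assumption; it also assumes the stack grows toward lower addresses, as the code in this post does.

```asm
* Higher addresses
*
*   +--------------------------------+
*   | saved R11 (Link Register)      |  \
*   | saved registers (down to R4)   |   |  calling function's
*   | local variables                |   |  stack frame
*   | stack parameters for callee    |  /
*   +--------------------------------+
*   | saved R11 (Link Register)      |  \
*   | saved registers (down to R4)   |   |  called function's
*   | local variables   <- R13       |  /   stack frame
*   +--------------------------------+
*
* Lower addresses
```

R13 always points at the lowest occupied word, so everything a function owns sits at a non-negative offset from it.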

Now, let’s create a template for a function which utilizes all of the above features.

Here’s the code for the calling function. It sets up space on the stack for 3 words of memory, copies the arguments onto the stack, copies more arguments into R0 through R3, as per the calling convention, then calls the function FOO, which is defined a little bit lower.

       LI   R0,3*2         *Calculate space needed for 3 words.
       S    R0,R13         *Move the stack pointer down by that amount.
       MOV  R4,@0(R13)     *Move the parameters onto the stack.
       MOV  R5,@2(R13)     * (taken from R4, R5, and R6, for example.)
       MOV  R6,@4(R13)
       MOV  R7,R0          *Move more parameters into R0 through R3.
       MOV  R8,R1          * (taken from R7 through R10, for example.)
       MOV  R9,R2
       MOV  R10,R3
       BL   @FOO           *Finally, call FOO.
       MOV  R0,R4          *Do something with the return value.
You might be confused by the @0(R13) notation. This is the indexed addressing mode in the TMS9900: the operand’s address is the register’s value plus the offset. Since R13 points to the top of the stack, we use @0(R13) to indicate the topmost word, @2(R13) as the next word under the top one, and @4(R13) as the last word. The offset before the register must be a constant known at assembly time, either a literal number or a symbol defined with EQU.
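One step the listing above leaves out is giving the parameter space back. FOO’s epilogue only pops what FOO itself pushed, so after the call R13 still points at the three parameter words. A minimal sketch of the cleanup, assuming the caller is responsible for releasing what it allocated:

```asm
       BL   @FOO           *Call FOO as before.
       AI   R13,3*2        *Release the 3 parameter words we reserved.
       MOV  R0,R4          *Do something with the return value.
```

AI only changes R13 (and the status bits), so the return value in R0 survives the cleanup.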

Now, let’s take a look at the code for our function, FOO. Here, our function prologue first stores our Link Register (R11), then it saves all the working registers it wishes to use (R4 through R7 in this example), then it reserves space on the stack for its local variables. Here, we create 3 local variables: a buffer of length 20 bytes, a 32-bit value (two words), and a 16-bit (one word) value. The function epilogue reverses all this stuff.

Notice, also, we set up some EQUate statements to give nice names to our variables on the stack.

*Define the amount of stack space this function must use.
*Saved registers: (4 words)
SREGS  EQU  4*2
*Local variables: (20 bytes, 4 bytes, and 2 bytes)
LVARS  EQU  20+4+2

*Names and calculated positions of our variables:
BUF    EQU  4+2
INT    EQU  2
WORD   EQU  0
*Names and calculated positions of our parameters:
PAR1   EQU  LVARS+SREGS+2+0
PAR2   EQU  LVARS+SREGS+2+2
PAR3   EQU  LVARS+SREGS+2+4

*Begin our function execution
FOO    DECT R13            *PUSH Link Register
       MOV  R11,*R13
       AI   R13,-SREGS     *Reserve space for saved registers
       MOV  R7,@6(R13)     *Save registers R4 through R7
       MOV  R6,@4(R13)
       MOV  R5,@2(R13)
       MOV  R4,@0(R13)
       AI   R13,-LVARS     *Reserve space for local variables
       …
       …                   *Do our function logic
       …
*Function epilogue
       AI   R13,LVARS      *POP the local variables
       MOV  *R13+,R4       *POP and restore the saved registers
       MOV  *R13+,R5
       MOV  *R13+,R6
       MOV  *R13+,R7
       MOV  *R13+,R11      *POP and restore the Link register
       RT                  *And return to calling function.
Each portion of the function prologue/epilogue may be removed if that feature is not needed. For example, if a function doesn’t need to allocate any local variables on the stack, then just leave the relevant lines out.
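Here’s a sketch of a trimmed-down function under those rules. BAR and HELPR are hypothetical names. BAR needs no saved registers and no stack locals, but it does call another function, so the Link Register still has to be saved.

```asm
* BAR (hypothetical): returns HELPR's result plus one.
BAR    DECT R13            *PUSH Link Register
       MOV  R11,*R13
       BL   @HELPR         *This BL overwrites R11.
       INC  R0             *R0 is ours to clobber; add 1 to the result.
       MOV  *R13+,R11      *POP and restore the Link Register
       RT                  *And return to calling function.
*
* HELPR (placeholder body): returns 0 in R0.
HELPR  CLR  R0
       RT
```

HELPR shows the smallest case of all: a function that calls nothing and uses only R0 through R3 needs no prologue or epilogue whatsoever.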

(I should also mention that the line AI R13,-LVARS is actually subtracting the value LVARS from R13. Since LVARS is a constant, we must use immediate operand addressing, and the TMS9900 doesn’t have a Subtract Immediate operation, only an Add Immediate. So we add the negated value in order to effectively subtract an immediate value.)
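For comparison, here are the two ways of moving the stack pointer down that appear in this post, side by side. They are alternatives, not a sequence to run back to back.

```asm
LVARS  EQU  20+4+2         *26 bytes, as in FOO.
*
* Option 1: subtract through a scratch register (clobbers R0).
       LI   R0,LVARS
       S    R0,R13
*
* Option 2: add the negated constant (no scratch register needed).
       AI   R13,-LVARS
```

Inside a function prologue, R0 may still be holding the first parameter, which is a good reason to prefer the second form there.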

Now the one remaining thing I haven’t covered is accessing the function’s stack parameters. Since the number of stack bytes needed by a function’s prologue is known at assembly time, the offset addresses of the stack parameters can be known as well, and we’ve defined some nice friendly names using the EQUate assembler directive, so we can reference them like so:

*Copy PAR1 to R4:
       MOV  @PAR1(R13),R4
*Copy PAR2 to R5:
       MOV  @PAR2(R13),R5
*Copy PAR3 to R6:
       MOV  @PAR3(R13),R6
Notice we’re using the indexed addressing modes again, using our stack pointer as the reference address. If you look back at our definitions of our parameter variables:

PAR1   EQU  LVARS+SREGS+2+0
PAR2   EQU  LVARS+SREGS+2+2
PAR3   EQU  LVARS+SREGS+2+4
You’ll see that the parameters are defined at the location past the area allocated for local variables, past the area defined for saved registers, past the saved link register, plus a specific offset. This effectively reaches out of our own local stack frame and into the stack frame of our parent function, where the arguments have been previously placed for us. Very handy!
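Working the numbers through for FOO makes this easier to see. The comment block below lists every offset from R13 once FOO’s prologue has finished, using LVARS = 26 and SREGS = 8 from the definitions above.

```asm
* Offsets from R13 after FOO's prologue has run:
*
*   @0(R13)    WORD   16-bit local
*   @2(R13)    INT    32-bit local (2 words)
*   @6(R13)    BUF    20-byte buffer
*   @26(R13)   saved R4                    = LVARS
*   @28(R13)   saved R5
*   @30(R13)   saved R6
*   @32(R13)   saved R7
*   @34(R13)   saved R11 (Link Register)   = LVARS+SREGS
*   @36(R13)   PAR1  (caller's @0(R13))    = LVARS+SREGS+2+0
*   @38(R13)   PAR2  (caller's @2(R13))    = LVARS+SREGS+2+2
*   @40(R13)   PAR3  (caller's @4(R13))    = LVARS+SREGS+2+4
```

The +2 in each PAR definition is the saved Link Register at offset 34; skipping over it lands on the first word the caller stored.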

References to our local variables can be made similarly. However, since local variables were the last things pushed on the stack, each one is addressed by its own offset alone, indexed from R13 of course.
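A short sketch of what that looks like inside FOO, using the names defined earlier. The particular values being stored are made up for illustration, and R4 is free to use here because FOO’s prologue already saved it.

```asm
BUF    EQU  4+2            *Same offsets as defined above.
INT    EQU  2
WORD   EQU  0
*
       MOV  R0,@WORD(R13)  *Store R0 in the 16-bit local.
       CLR  @INT(R13)      *Clear the first word of the 32-bit local...
       CLR  @INT+2(R13)    * ...and the second word.
       MOV  R13,R4         *Compute the buffer's address:
       AI   R4,BUF         * R4 = R13 + BUF
       MOVB R1,*R4+        *Store R1's high byte in the buffer's first byte.
```

For the buffer we compute a pointer once and then walk it with auto-increment, which is cheaper than re-indexing from R13 for every byte.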

Well, that’s it for today’s post. Sorry this has been heavily steeped in theory and not much actual action. But we had to get all this out of the way before we move forward. From here on out, everything I write for the TI-99 will adhere to this calling convention.

Next time, let’s write a function which utilizes everything we’ve done so far! How about a nice recursive function? Let’s write the Factorial function!




Reading your blog, I'm at the same time very happy and very sad that I wasn't born sooner (when assembly was the main language).

I'm happy because, let's face it, it really seems like a lot of work for not so much. But I'm sad because with that, you really get an idea of how the computer works and thinks.

But I think reading this is the best of both worlds: I don't have to actually learn assembly, but I get a good idea of it.

Great post!


Thanks. My goal in this is to learn how modern programming constructs were first designed and implemented, so that I can understand more thoroughly how those things work. For example, I'm currently implementing heap memory management for this computer. I've always taken for granted having the malloc() and free() functions, but it's much different when you have to write those functions yourself. But that'll be in a later post.


Vaielab, if you ever read Donald Knuth's The Art of Computer Programming, you'll find that he develops everything in a HIGHLY controlled computer with an assembly-type language. Whenever I get around to reading it, I'll probably try to build a simulator for the computer as well.

Those books look pretty fascinating (albeit expensive!). What do you mean by a highly controlled computer? I'd be really interested to see some snippets from Knuth's works.

I'll be able to post more details when I get home, but from memory, it's a computer that has 4 registers with a specific byte size, and a very specific amount of RAM (1024 Bytes? don't recall). I'm pretty sure it had a limited input/output mechanism, too. I've got books 1-3, and ordered 4A today. The series used to be called "The Bible of Computer Programming". It's a series of books that will cement your thinking about how algorithms REALLY work, in part by having a known computer with known properties so that you can truly measure space/time complexity of any algorithm you run on it. It runs a trimmed-down assembly, so there's also no discussion about "compiler optimizations", etc. You are working with concrete, known algorithms on a concrete, known hardware. Suddenly, everything is measurable.


Ah, the MMIX: found the Wikipedia article on it: https://en.wikipedia.org/wiki/MMIX


This sounds very interesting; it's on my Amazon wish list for sure.

Sadly, at the moment I have so little time that the Serenity comic book series has been sitting on my desk since Christmas and I still haven't read it. But those are the kind of books I would love to read.

Thanks for the book!


No problem. I've had the books for a few years, and only gotten partway through the first one.
