Learning Objectives
- Be able to articulate integer instructions.
- Be able to write simple assembly programs.
- Be able to load and store from and to memory.
- Be able to create labels and jump to them conditionally or unconditionally.
- Be able to create and call functions using standard operating procedures.
- Be able to use MARS's system calls to print and read data.
- Be able to use directives to create strings and other data.
- Understand and use the different executable sections.
- Be able to use the stack for local storage.
- Be able to use saved and temporary registers.
- Understand what destroying registers means and what to do about it.
- Understand how branches are implemented in the architecture.
MARS is a lightweight interactive development environment (IDE) for programming in MIPS assembly language, intended for educational-level use with Patterson and Hennessy's Computer Organization and Design.
In this lecture and in this course, we will be using MARS (MIPS Assembler and Runtime Simulator), which you can download here: http://courses.missouristate.edu/kenvollmar/mars/download.htm
Assembly
The code above shows three different pieces of the assembly language. The words that start with a '.' are called directives. The words that end with a ':' are called labels. Everything else, including add and ret, is an instruction.
Directives and labels are for you (the programmer) and to change how the assembler functions. Instructions are encoded directly into machine code by the assembler. So, the important part about learning assembly is to build your vocabulary--that is, learn the instructions and what they do. Remember, everything that C++ does in the end is converted into assembly.
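To make the three pieces concrete, here is a minimal sketch written for MARS. It is not the original listing; the names value and main and the numbers are only illustrative assumptions.

```mips
        .data                   # directive: what follows is data
value:  .word 5                 # label "value" naming one 32-bit word

        .text                   # directive: what follows is code
main:                           # label marking the first instruction
        li   $t0, 1             # instruction (a pseudo-instruction): $t0 = 1
        add  $t1, $t0, $t0      # instruction: $t1 = $t0 + $t0 = 2
```

Only the last two lines turn into machine code that does work; the directives and labels just tell the assembler where things go and what they are called.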
Understanding Assembly
When we write assembly, we become the C++, WE become the order of operations. Let's take a look at a simple C++ statement and see what we can do to convert it into assembly.
When we look at the code above, we can see that we need to multiply j and k together first, then add that product to i to get the result. The order of operations was handled by C++ for us, but now that we're instructing the processor ourselves, WE are now the order of operations.
In the code above, we use li (load immediate) to load the values 10, 20, and 30 into three separate registers, t0, t1, and t2 respectively. These registers have specific names, so we can't just name them i, j, and k like we could in C++. We know that 10 + 20 * 30 will give us 610, so let's see what we get.
Our result is $t4, and the value is 0x262, which is \(2 \times 16^2 + 6 \times 16^1 + 2 \times 16^0 = 610_{10}\). Notice that we executed the mul (multiply) instruction first. In assembly, the order of operations is whenever that instruction gets executed, so we must be the arbiter.
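A sketch of the computation described above might look like the following. The registers $t0 through $t2 and the result register $t4 come from the text; using $t3 for the intermediate product is my own assumption.

```mips
        .text
        li   $t0, 10            # i = 10
        li   $t1, 20            # j = 20
        li   $t2, 30            # k = 30
        mul  $t3, $t1, $t2      # $t3 = j * k = 600 (we multiply first)
        add  $t4, $t0, $t3      # $t4 = i + 600 = 610 = 0x262
```

If we swapped the order and added before multiplying, the processor would happily do it and give us a different answer, which is exactly why we have to be the order of operations.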
Registers
Registers are very small pieces of memory inside of the CPU. For our MIPS architecture, these are 32 bits apiece. Therefore, our MIPS is a 32-bit machine. On the Hydra and Tesla machines, each register is at least 64 bits, so our Hydra and Tesla lab machines are 64 bits.
MIPS has several registers, each of which can be addressed by number, such as $0, $1, $2, or by name, such as $a0, $t0, $s0. Here are the registers and their purpose in MIPS.
The 'use' of each register is its recommended use. You have full control over these registers, but if you put in a value that MIPS is not expecting, it could have unintended consequences!
What to look for
C++ handled data types (signed and unsigned, integral, and float), data sizes, and order of operations for us. We as the assembly programmer are now required to ensure that our data sizes are correct. MIPS is known as a load/store architecture. Notice that we could load small immediates, but we can only act upon immediates or registers. What about memory? In MIPS, we have to load to get a value from memory into a register or store to put a value from a register into memory.
The table above has some examples of how to use them; however, there are essentially five parts: (1) load vs. store (l vs. s), (2) data size (b, h, or w), (3) source (store) or destination (load) register, (4) offset (can be 0), (5) the register holding the memory address, which is the destination for a store or the source for a load. The register for (5) must contain a valid memory address, or the instruction will cause the program to crash.
The lb and lh instructions will take a 1-byte or 2-byte value, respectively from memory and put it into a 32-bit register. Therefore, it needs to widen the value. These instructions will sign-extend the value. So for lb, it takes bit index 7 (the sign bit) and extends it 24 more times to make a full 32-bit value. The lh instruction takes bit index 15 (the sign bit) and extends it 16 more times to make a full 32-bit value.
Sometimes we want to zero-extend a value instead. So, instead of duplicating the sign bit, a widened value is simply padded with leading zeroes. We can modify the behavior by adding a 'u' (for unsigned) at the end of the load instructions.
We obviously don't need an sbu or shu because each store writes exactly 8 or 16 bits into memory, so nothing is widened. The lwu is not necessary either because we cannot widen a 32-bit value (a word) when the registers are only 32 bits. For a 64-bit machine, we would have a lwu to widen a 32-bit value into a 64-bit register.
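To see the difference between sign extension and zero extension, here is a small sketch. The label val and the byte value 0xFF are assumptions I picked because the sign bit (bit 7) is set.

```mips
        .data
val:    .byte 0xFF              # bit pattern 1111 1111

        .text
        la   $t0, val           # $t0 = address of val
        lb   $t1, 0($t0)        # sign-extend: $t1 = 0xFFFFFFFF (-1)
        lbu  $t2, 0($t0)        # zero-extend: $t2 = 0x000000FF (255)
        sb   $t2, 0($t0)        # store the low 8 bits of $t2 back to memory
```

The same byte in memory becomes -1 or 255 depending only on which load we choose, so picking the right one is our job now.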
Common Instructions
There are other instructions besides these, but these are your common integer instructions. Notice that there is only addi (add immediate). MARS will give us a subi, but it is an addi with a negative immediate. The immediate (-100 in the case above) is encoded into the instruction itself. Each instruction consumes exactly 32 bits (4 bytes). The majority of those bits are taken by the instruction and registers themselves, which leaves only 16 bits for the immediate. Therefore, this instruction is only useful for small immediates.
All of the logical operations that we need are supported above. Some of the instructions above are called pseudoinstructions, and the assembler will convert them to actual instructions. For example, not $t1, $t2 produces the same result as nor $t1, $t2, $zero. In a reduced-instruction-set computer (RISC), we have a limited number of instructions, so the assembler may choose an operation that is not exactly what you wrote, but it's equivalent.
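Here is a short sketch of a pseudo-instruction next to a real instruction that gives the same result. The values are arbitrary; the point is only that $t1 and $t3 end up identical. Writing the bitwise NOT as nor with $zero is one common way to do it.

```mips
        .text
        li   $t2, 0x0F          # $t2 = 0x0000000F
        not  $t1, $t2           # pseudo-instruction: $t1 = ~$t2 = 0xFFFFFFF0
        nor  $t3, $t2, $zero    # real instruction: ~($t2 | 0) = 0xFFFFFFF0
        andi $t4, $t2, 0x3      # $t4 = 0x0F & 0x03 = 0x00000003
```

After assembling in MARS, the 'Basic' column of the text segment shows what each pseudo-instruction was actually turned into.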
Jumps and Branches
We've seen how to make simple integer and logical instructions perform their magic, but a lot of times we want to conditionally or repeatedly execute code. These are if statements and loops in C++. In MIPS, we have jumps and branches. A branch can be thought of as a condition, whereas jumps are unconditional. As the names imply, they allow us to go somewhere else. Otherwise, the CPU will just execute the very next instruction in sequence.
This is where labels are important. A jump or branch instruction needs to know where to go to. A label is just an easy name we give a memory address. When we jump or branch, we can provide the label instead of hard coding the memory address, and the assembler takes care of the rest!
Who knows what this code actually does, but let's see how we can convert it into assembly. I used variable names t0, t1, and t2, but remember, variables are NOT registers! We are required to make sure that each register contains the proper value of each variable.
Let's see what the code above is doing. Notice we have labels (words ending with colons ':'). These mark memory addresses so that when we branch or jump, we can provide a name rather than an address. This is helpful because the memory addresses change as we write more instructions!
In the code above, I took the negative approach. The branch instruction (bge) stands for branch-if-greater-than-or-equal-to. Notice that that is the opposite of t0 < t1. So, we're telling MIPS to go to the else statement if t0 >= t1. If the branch is taken, we jump to the label else_statement, otherwise, the branch instruction does nothing and we execute the instruction directly underneath it.
So, why the j if_done? A label just marks a memory address. Like a case label in a C++ switch statement, it doesn't stop code from executing. So, without the j if_done, we would go right into the else statement and execute the code inside of the else statement even when t0 is indeed less than t1. That's not what we want. So, the jump instruction jumps over the else code and completes the if statement.
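Here is a sketch of the if/else pattern described above. The labels else_statement and if_done and the bge come from the text; the sample values and the two bodies (t2 = t0 + t1 and t2 = t0 - t1) are assumptions I made up, since the original C++ is not shown.

```mips
        .text
        li   $t0, 5                     # t0 = 5 (sample value)
        li   $t1, 9                     # t1 = 9 (sample value)
        bge  $t0, $t1, else_statement   # if !(t0 < t1), go to the else part
        add  $t2, $t0, $t1              # if body: t2 = t0 + t1
        j    if_done                    # jump over the else body
else_statement:
        sub  $t2, $t0, $t1              # else body: t2 = t0 - t1
if_done:
        # execution continues here either way; $t2 = 14 for these inputs
```

Try deleting the j if_done line and stepping through in MARS: the sub runs right after the add and overwrites $t2.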
We can use the same branches and jump to execute a for loop. It is helpful for us to know how a for loop works, when pieces of a for loop execute, and how many times.
Branch instructions require two registers and a label. Therefore t0 < 100 cannot be executed directly. Instead, we must use a register to hold 100.
As we can see in the for loop, we initialize the iterator ($t0) ONCE and only once. Then, just like in a for loop, we check the condition before doing anything. Again, I took the 'negative' view. In a for loop the condition is 'if it's true run the loop'. Ours is 'if the condition is false, break the loop'. We're saying the same thing except ours tells MIPS when to break the loop. Remember, when a branch is NOT taken, it just executes the very next instruction. In our case, that's $t1 += 1. We can use addi since the immediate is small. Then, as with a for loop, the step (t0++) comes next. After the step, we check the condition again. There's the loop!
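The loop walked through above can be sketched like this. The iterator $t0, the limit of 100, and the body $t1 += 1 come from the text; holding 100 in $t2 and the label names for_loop and for_done are my assumptions.

```mips
        .text
        li   $t1, 0             # t1 = 0 (the value the loop body updates)
        li   $t2, 100           # branches compare registers, so hold 100 here
        li   $t0, 0             # init: t0 = 0, executed ONCE
for_loop:
        bge  $t0, $t2, for_done # condition: leave when !(t0 < 100)
        addi $t1, $t1, 1        # body: t1 += 1
        addi $t0, $t0, 1        # step: t0++
        j    for_loop           # go back and check the condition again
for_done:
        # here $t0 = 100 and $t1 = 100
```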
As with many other instructions, some of the instructions above are pseudo-instructions. For comparing two registers against each other, MIPS really only has beq and bne. The assembler figures out how to do the other ones, such as blt and bge. blt and bge will check for negative numbers, so the sign bit does NOT contribute to the magnitude. This means that blt and bge will work with signed values. However, we still have unsigned values. In these cases, we can use the 'u' variants of blt and bge as below.
There is no such thing as bequ or bneu since these compare bit-by-bit; however, the inequalities (blt, ble, bgt, bge) all have 'u' variants.
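To see why the 'u' variants matter, here is a sketch that compares the same two bit patterns both ways. The label names and the use of $t2 as a scoreboard are my own assumptions.

```mips
        .text
        li   $t0, -1            # bit pattern 0xFFFFFFFF
        li   $t1, 1
        li   $t2, 0             # records which branches were taken
        blt  $t0, $t1, signed_less      # signed: -1 < 1, branch taken
        j    check_unsigned
signed_less:
        ori  $t2, $t2, 1        # set bit 0: signed comparison said "less"
check_unsigned:
        bltu $t0, $t1, unsigned_less    # unsigned: 4294967295 < 1 is false
        j    done
unsigned_less:
        ori  $t2, $t2, 2        # not reached for these values
done:
        # $t2 = 1 here: only the signed branch was taken
```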
It might be more helpful to envision a for loop as a while loop. Here's an example of the for loop above as a while loop. See if you can match the instructions in the assembly above with what's going on with the loop below.
I ALWAYS recommend first writing your logic in C++ and then translating to assembly. You are probably not comfortable with assembly and developing logic just complicates the issue. Even I will write more complicated code in C++ first, compile it, test it, and then translate into assembly. Doing all of the above in assembly is complicated, even for the pros!
Integer Comparisons
A branch condition can be implemented by subtracting the two operands and seeing what comes out. These can be summed up using the flags NZCV, which stand for negative, zero, carry, and overflow, respectively. If we subtract two numbers and the result is negative, that means that the first number is smaller than the second. If we subtract two numbers and the result is zero, that means both numbers were equal. Analogously, if we subtract two numbers and the result is not zero, that means the two numbers were not equal.
In MIPS, we only have three comparisons that we can do, equals, not equals, and less than. For the branches, we only have equals and not equals (bne and beq). For less than, we can use the SLT (set-on-less-than) instruction. The rest of the branch instructions are pseudo-instructions that implement a combination of slt and bne/beq.
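Here is a sketch of how a 'branch if less than' can be built by hand from slt and bne. The scratch register $t3 and the label names are assumptions; when MARS expands blt for you it uses its own scratch register.

```mips
        .text
        li   $t0, 2
        li   $t1, 10
        slt  $t3, $t0, $t1          # $t3 = 1 if $t0 < $t1, otherwise 0
        bne  $t3, $zero, is_less    # branch when the "less than" result is 1
        li   $t2, 0                 # not less: t2 = 0
        j    done
is_less:
        li   $t2, 1                 # less: t2 = 1
done:
        # $t2 = 1 here because 2 < 10
```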
In some architectures, flags are stored after a comparison and can be referred to later. As you can see with MIPS, the branch instruction itself will make the comparison and decision to go to the given memory label.
NZCV can come with different names, such as SF (sign flag), ZF (zero flag), CF (carry flag), and OF (overflow flag) in the Intel/AMD architecture. The point of this is to see that you can use a mathematical operation to compare two numbers. Since our architecture deals with numbers, hopefully it is becoming clear(er) how conditions can be implemented.
To see these conditions in action, we can see different ways to represent these flags to give a certain conditional outcome.
You can see this in action. Recall that the conditions are set by looking at the result after subtraction--the difference.
\(10 - 2 = 8\): We can see Z = 0 (not equals) and N = 0 (not less than). Since it is not equals and not less than, it therefore must be greater than.
\(2 - 10 = -8\): We can see Z = 0 (not equals) and N = 1 (less than). This means that 2 is less than 10.
To implement this in hardware, we can hook the negative flag directly to the most significant bit of our result. Recall that this is the sign bit. Then we can test all of the digits to see if they are zero for the zero flag.
Executable Sections
Notice that we write .text at the top of our assembly code. This is because executable programs have different sections where we store our data. The MARS simulator only supports two out of the four, but here are the sections and their use. The ones supported by MARS are in bold.
These sections require a label and a directive. So, we can create data using the following directives:
You can see from the example above that we specify the data size as .byte, .half, or .word for 1, 2, and 4 bytes, respectively. These directives require a value. We can create more than one piece of data by adding commas. These will be in contiguous memory. However, how do we get these values? We can use the pseudo-instruction called 'la' for load address.
The code above will print 123456 to the screen. Notice that we first have to load the address of the 'output' label. We cannot just directly dereference the value. Instead, we load the address, then do a lw (load-word) to dereference the memory address into the actual value.
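A sketch of the load-address-then-load-word pattern described above is below. The label output and the value 123456 come from the text; everything else is a straightforward assumption.

```mips
        .data
output: .word 123456            # one 32-bit global value

        .text
        la   $t0, output        # $t0 = address of the label
        lw   $a0, 0($t0)        # dereference: $a0 = 123456
        li   $v0, 1             # MARS system call 1: print integer
        syscall                 # prints 123456
```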
Recall that these are global variables! Just like in C++, we do not want to rely on these variables. So, only use globals VERY sparingly!
To see how we can use a global variable as an array of data, we can use the .space directive to create an uninitialized array of bytes and then store into it. This directive requires a parameter, which is the number of bytes that you want to reserve for the given label.
The code above demonstrates using a label as storage. We can grab the address and use the offset of a store instruction to move within it. Notice that we wrote .space 16. This means that we reserved 16 bytes. It DOES NOT mean that we reserved a space and put the value 16 in there. Instead, the memory allocated using .space must be considered garbage since it is uninitialized. MARS will probably make it all zeroes, but do not assume that!
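Here is a sketch of using a .space region as a small byte array. The label buffer and the values stored are assumptions; the 16 matches the text.

```mips
        .data
buffer: .space 16               # reserve 16 uninitialized bytes

        .text
        la   $t0, buffer        # $t0 = address of the first byte
        li   $t1, 65            # ASCII 'A'
        sb   $t1, 0($t0)        # buffer[0] = 'A'
        li   $t1, 66            # ASCII 'B'
        sb   $t1, 1($t0)        # buffer[1] = 'B'
        sb   $zero, 2($t0)      # buffer[2] = 0
        lb   $t2, 1($t0)        # read back: $t2 = 66
```

The offset in each sb or lb is how we move within the 16 bytes without changing $t0.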
Local Data Storage
Notice that all of the sections above refer to global data. This is because local storage is stored on the stack. The stack is set up for us whenever we execute a program. How do we refer to the stack? There is a special register ($sp) called the stack pointer. This is a memory address where we can store and load from. The stack starts at the higher memory addresses and grows towards the lower memory addresses. If we want to allocate memory, we subtract the number of bytes from $sp. We are then allowed to use the bytes from the new $sp up to, but not including, the old $sp. However, since we're responsible for the stack, we MUST put it back the same way we found it by adding back to it. We don't have to clear it or restore the values, but we do have to put $sp back.
In C++, the stack is used for all variables that are not global. C++ will automatically move the stack pointer around. This is helpful (and necessary) when calling functions.
Above, we allocated 8 bytes by subtracting from the stack pointer, which is two words. Since $sp points to a memory address, we can simply use load and store to read or write values on the stack. This is called a stack frame. Since we allocated 8 bytes, we are allowed to use $sp + 0, $sp + 1, ... $sp + 7 (remember 0-based indices).
The stack pointer is required to be aligned to 8 bytes, which is a fancy way of saying the stack pointer must be a multiple of 8. Therefore, even if we need 1, 2, or even 4 bytes, we must still subtract 8. In fact, anything we subtract from $sp must be a multiple of 8. What if we need 12 bytes? We round up to the next multiple of 8, which is 16.
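A sketch of allocating and releasing an 8-byte stack frame is below. The values 111 and 222 are arbitrary assumptions used only to show that what we store is what we load back.

```mips
        .text
        li   $t0, 111
        li   $t1, 222
        addi $sp, $sp, -8       # allocate 8 bytes (two words)
        sw   $t0, 0($sp)        # first word lives at $sp + 0
        sw   $t1, 4($sp)        # second word lives at $sp + 4
        lw   $t2, 0($sp)        # $t2 = 111
        lw   $t3, 4($sp)        # $t3 = 222
        addi $sp, $sp, 8        # put $sp back the way we found it
```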
Functions
Functions are just a fancy term for a label. The 'hard' part is remembering how arguments are passed to a function and how data is returned. It's not that hard and MIPS makes it much easier.
The $a0 through $a3 registers are named 'a' because they are the argument registers. So, the first argument goes into a0, the second into a1, the third into a2, and the fourth into a3. What about returns? The return value goes into $v0.
Nothing forces us to use this policy, but it is standard operating procedure called the application binary interface or ABI. This is a fancy term for the rules we agree upon to make functions work across programming languages.
A fairly simple example. So, we want to call add_one with an argument of 2. Take note of the data types!
We start by executing j main. This is because MARS starts at the top of our assembly code and works its way down. Normal executables don't do this. Instead, normal executables start at a label called _start, which eventually works its way to int main().
When we're in main, we have to set up the parameters before we call add_one. The instruction jal stands for jump-and-link. This instruction will put the memory address of the instruction directly after the jal into the $ra (return-address) register. This allows the function to find its way back to int main(). Otherwise, how would we know which memory address to go back to?
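The call described above can be sketched as follows. The names add_one and main, the j main at the top, and the argument 2 come from the text; the int signature and the print at the end are my assumptions.

```mips
        .text
        j    main               # MARS starts at the top, so skip to main

# int add_one(int value): returns value + 1
add_one:
        addi $v0, $a0, 1        # return value goes in $v0
        jr   $ra                # return to the instruction after the jal

main:
        li   $a0, 2             # first argument goes in $a0
        jal  add_one            # $ra = address of the next instruction
        move $a0, $v0           # $v0 = 3; pass it on to print integer
        li   $v0, 1             # MARS system call 1: print integer
        syscall                 # prints 3
```

The jr $ra at the end of add_one is what uses the address that jal saved for us.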
Destroying Registers
Per the same standard that says $a0 through $a3 get our arguments, these are also considered scratch registers. In fact, all $t (temporary) and $a (argument) registers are considered scratch registers. This means that when we call a function, that function is permitted to destroy whatever value is inside any temporary or argument register. Therefore, if we have important data inside of an a or t register, when we call a function, we MUST consider it destroyed.
Saved Registers ($s0..$s7)
This is where the saved registers come into play. These registers per the standard are required to have the same value before AND after a function call. We are still allowed to use them, but we must use the stack to store the original value of any saved register before we use them. Then, before we return from our function, we're required to load the old value back.
You will usually use a saved register if you find yourself in a loop loading and storing over and over again. Recall that each register is exactly 32 bits (4 bytes), so we know how much stack space we need by adding up the sizes of all of the registers we need to save. Remember, the stack must be a multiple of 8!
Also, we only have ONE $ra register. If we need to call another function, we will use the jal (jump-and-link) instruction. However, this instruction will destroy what's already in $ra! Therefore, whenever our function calls another function, we're required to save $ra too.
Notice that we can use the offset to identify the register. We subtract 16 bytes from the stack because we're storing 4 registers and \(4 \times 4 = 16\). Even if we were only storing 3 registers (12 bytes), we would still need 16 bytes from the stack since it is required to be a multiple of 8.
Since we don't know what's in any of the saved registers or in $ra, we must always store and load the maximum size, which is a word (sw and lw). That makes guessing unnecessary at least!
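Here is a sketch of a function that saves $ra and two saved registers before calling another function. The function add_both is hypothetical and exists only for illustration. Note that it saves 3 registers (12 bytes) yet still takes 16 bytes from the stack, leaving the last 4 bytes unused as padding.

```mips
        .text
        j    main

# Hypothetical function: returns add_one(a0) + add_one(a1).
add_both:
        addi $sp, $sp, -16      # 12 bytes needed, rounded up to 16
        sw   $ra, 0($sp)        # the jal below will overwrite $ra
        sw   $s0, 4($sp)        # save the caller's $s0
        sw   $s1, 8($sp)        # save the caller's $s1
        move $s0, $a1           # keep the second argument across the call
        jal  add_one            # $v0 = a0 + 1
        move $s1, $v0           # keep the first result across the next call
        move $a0, $s0
        jal  add_one            # $v0 = a1 + 1
        add  $v0, $s1, $v0      # return value
        lw   $ra, 0($sp)        # restore everything we saved
        lw   $s0, 4($sp)
        lw   $s1, 8($sp)
        addi $sp, $sp, 16       # put $sp back
        jr   $ra

add_one:
        addi $v0, $a0, 1
        jr   $ra

main:
        li   $a0, 2
        li   $a1, 5
        jal  add_both           # $v0 = 3 + 6 = 9
```

We copy $a1 into $s0 before the first call because add_one is permitted to destroy any argument register, even though this particular add_one happens not to.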
System Calls
This is just for MARS and is not a MIPS thing. If we want to read integers or print them to the screen, we would usually use printf or cout. These functions eventually make a system call to the operating system to get the output. In MARS, we have to use system calls, which can be found in 'Help > System Calls'. I will outline just the most useful below.
Here are the steps to make a system call in MARS.
- Load system call number in $v0.
- Set the parameters (see tables below).
- Execute syscall instruction.
- Read return values (if applicable)
An example of the steps above can be seen below. Notice we execute a print instruction for whatever is inside of the $t0 register. We actually make the system call using the syscall instruction.
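The four steps can be sketched like this, printing whatever is in $t0. The value 42 is just a sample I chose.

```mips
        .text
        li   $t0, 42            # sample value to print
        li   $v0, 1             # step 1: system call number (1 = print integer)
        move $a0, $t0           # step 2: the parameter goes in $a0
        syscall                 # step 3: make the system call, prints 42
                                # step 4: print integer has no return value
```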
Output
So, following the procedures, we can print an integer by putting the system code in $v0, which for print integer is 1, or for print string is 4.
In the code above, we use li to put the system call #4 (print string) into $v0. Then we use la (load address) to load the address of the output string into $a0. We then make the system call by using the syscall instruction. The string then shows up in the window at the bottom of MARS.
Notice that I used .asciiz in the .data section to create the string. In C++, string literals go in the .rodata section, but MARS doesn't have one. Also, .asciiz means 'use the ASCII table for characters and put a Z (zero) at the end of it'. C-style strings don't have a length. We know the length of a string by counting until we hit a 0--this 0 is called the null byte. This is why we have .asciiz. If we don't want a zero to automatically be put into our string, we can use just .ascii. The null byte (0) is why we call strings null-terminated: the print string system call keeps writing characters until it hits that null byte.
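A sketch of the print string call is below. The label output follows the text; the contents of the string are my own assumption.

```mips
        .data
output: .asciiz "Hello, MARS!\n"   # null-terminated string

        .text
        li   $v0, 4             # MARS system call 4: print string
        la   $a0, output        # $a0 = address of the first character
        syscall                 # prints up to, not including, the null byte
```

If we changed .asciiz to .ascii here, the system call would keep printing whatever bytes follow in memory until it happened to find a zero.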
Unfortunately, we don't have the power of printf. So, if we want to print strings and integers, we need two different system calls. The following example shows how to print an integer.
As you can see, we put the value 1 into $v0 for the 'print integer' system call, and then we put whatever value we want to print in $a0. The value is then printed in the window at the bottom of MARS.
Input
Just like printf and cout, we have scanf and cin. Unfortunately, we don't have that power with MARS. So, we need to use the input system calls as summarized below.
An example of using the read-based system calls above is below.
The code above first prompts the user to enter an integer. When they do and press ENTER, the system call puts that integer into $v0. We then move that into $a0 so that we can print it using the print integer system call.
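The prompt, read, and echo sequence described above can be sketched as follows. The label prompt and its text are assumptions.

```mips
        .data
prompt: .asciiz "Enter an integer: "

        .text
        li   $v0, 4             # MARS system call 4: print string
        la   $a0, prompt
        syscall
        li   $v0, 5             # MARS system call 5: read integer
        syscall                 # the integer the user typed lands in $v0
        move $a0, $v0           # move it into $a0 for printing
        li   $v0, 1             # MARS system call 1: print integer
        syscall
```

Notice that $v0 does double duty: it carries the system call number in and the result out, so we have to move the result before loading the next call number.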
You might see the move instruction. This is yet another pseudo-instruction that resolves to an addition with $zero. So, move $t0, $s0 has the same effect as add $t0, $s0, $zero (MARS itself expands it to addu $t0, $zero, $s0). This is why the syscall help in MARS uses add $a0, $t0, $zero instead of move $a0, $t0. They do the same thing, but the add instruction is real.
Yes, there is a register called $zero. This can be used to load a value of 0 always. If we write to this register, the value is discarded. The zero register is hardwired to 0 and cannot be changed.
'Dropped off Bottom'
The MARS simulator will end your program whenever there are no more instructions to execute. This is NOT how an actual system works. Instead, when we return from int main(), C++ will make a request to the operating system to close your program. However, for purposes of this course, whenever you want to quit your program, just jump to the very end. I always make a label called 'end' that is at the bottom of each program.
In normal situations, we would use the exit (or exit2) system calls to tell the operating system to terminate our process.
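For reference, here is a minimal sketch of ending a program with the exit system call instead of dropping off the bottom. The label main and the extra instruction after the syscall are only there for illustration.

```mips
        .text
main:
        li   $v0, 10            # MARS system call 10: exit
        syscall                 # the program stops here
        li   $t0, 1             # never executed
```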
Example
MARS MIPS simulator is an assembly language editor, assembler, simulator, and debugger for the MIPS processor, developed by Pete Sanderson and Kenneth Vollmar at Missouri State University.
MARS is free to download. To run version 4.5, you might need a suitable Java SDK for your system.
Before assembling, the environment of this simulator can be split into three segments: the editor at the upper left, where all of the code is written; the messages/output pane right beneath the editor; and the list of registers that represent the 'CPU' for our program.
After assembling (by simply pressing F3), the environment changes, with two new segments taking the position of the editor: the text segment, where
i) each line of assembly code is shown with its 'pseudoinstructions' expanded (we'll talk about those in a sec) in the 'basic' column and
ii) the machine code for each instruction appears in the 'code' column,
and the data segment, where we can have a look at a representation of the memory of a processor with little-endian order.
After assembling, we can execute our code either all at once (F5) or step by step (F7), and we can also step backwards through the execution (F8).
Now, let's see the example code from above and explain each line:
MARS accepts and exports files with the .asm filetype
But the code above prints just a character, what about the good ol' 'Hello World'? What about, dunno, adding a number or something? Well, we can change what we had a bit for just that:
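The original listing is not reproduced here, so the following is only a sketch that is consistent with the explanation further down. The label str, the value 256, and the registers $t0, $s1, and $t2 come from that explanation; the label num and the choice to add 1 are my assumptions.

```mips
        .data
num:    .word 256               # one word in memory
str:    .asciiz "Hello World\n" # null-terminated string

        .text
        li   $v0, 4             # system call 4: print string
        la   $a0, str           # $a0 = address of str
        syscall                 # prints Hello World

        la   $t0, num           # $t0 = address of num
        lw   $s1, 0($t0)        # $s1 = 256, the word stored at that address
        addi $t2, $s1, 1        # $t2 = 257
        sw   $t2, 0($t0)        # num in memory is now 257

        li   $v0, 10            # system call 10: exit
        syscall
```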
Before illustrating the results through MARS, a little more explanation about these commands is needed:
System calls are a set of services provided by the operating system. To use a system call, the call code for the needed operation must be put into the $v0 register. If a system call has arguments, those are put in the $a0-$a2 registers. The full list of system calls is in the MARS help.
li (load immediate) is a pseudo-instruction (we'll talk about that later) that loads a register with a value. la (load address) is also a pseudo-instruction; it loads an address into a register. With li $v0, 4 the $v0 register now has 4 as its value, while la $a0, str loads the address of the string str into the $a0 register.
A word is (as far as MIPS is concerned) a 32-bit sequence, with bit 31 being the Most Significant Bit and bit 0 being the Least Significant Bit.
lw (load word) transfers from memory to a register, while sw (store word) transfers from a register to memory. With the lw $s1, 0($t0) command, we loaded into the $s1 register the word stored at the memory address held in $t0 (the 0 is the byte offset added to that address), aka 256. $t0 here has the address, while $s1 has the value. sw $t2, 0($t0) does just the opposite job.
MARS uses little-endian order, meaning that the least significant byte of a word is stored at the smallest byte address of that word.
MIPS uses byte addresses, so consecutive words are 4 bytes apart.
By assembling the code from before, we can further understand how memory and registers exchange data, either by disabling 'Hexadecimal Values' in the Data Segment or by enabling 'ASCII' in the Data Segment.
Start it like this
$ java -jar Mars4_5.jar
Create this file and save it.
Press F3 to assemble it and then press run. You are now assembling and executing MIPS code.