Lab 2: RISC-V Assembler II

1 Objectives

Following completion of this lab you should be able to:

Assemble RISC-V SB, U, and UJ types.
Understand how the design of instruction types influences and limits their behavior, especially when it comes to immediates.
Explain the relationship between pseudoinstructions and core instructions.
Discuss the benefits and drawbacks of different types of addressing.

Your Tasks

Take a look at the lab worksheet before you start writing code for this lab. The methods you need to implement are flagged with a TODO: Lab 2 throughout the files.

1 Support new instruction types

You need to extend your assembler from Lab 1 to support the missing instruction types listed below. The hints and tips on the Lab 1 page are worth reviewing again before you start this lab. All the tests for this lab are in advanced_assembler_test.py.

Implement helper methods

First, start by implementing these helper methods:

Assemble
index_to_address
parse_labels
label_to_offset

It may be helpful to use the has_label and split_out_label helper functions provided.

For Assemble, you simply need to extend it as you go to support the new instruction types. The other helpers here will be useful as you write the other methods. You should also fully implement any other unimplemented helpers. Think of this as a warm-up: there are no explicit test cases for these helpers, so you'll need to figure out how to verify they are working as you go. (This is a good time to review the general requirements for labs and consider how you will meet them.) You may need to come back and change the behavior of these helpers as you move further into the lab.
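To make the relationship between instruction indices, addresses, and labels concrete, here is a minimal sketch. The function names, their signatures, and the base address of 0 are assumptions made for illustration; the helpers in your repo may take different arguments and return different shapes.

```python
# Hypothetical stand-ins for illustration only. Names, signatures, and the
# base address of 0 are assumptions, not the lab's actual helpers.

def address_of_index(index, base=0):
    """Each base RISC-V instruction is 4 bytes, so instruction i is at base + 4*i."""
    return base + 4 * index


def find_labels(lines):
    """Map each label name to the index of the instruction it points at."""
    labels = {}
    index = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue
        name, sep, rest = text.partition(":")
        if sep:
            # The label refers to the next instruction, which may be on this line.
            labels[name.strip()] = index
            text = rest.strip()
        if text:
            # Only real instructions advance the instruction index.
            index += 1
    return labels


def offset_to_label(labels, label, current_index):
    """PC-relative byte offset from the instruction at current_index to label."""
    return address_of_index(labels[label]) - address_of_index(current_index)
```

A label on a line by itself does not occupy an instruction slot in this sketch, so it shares an index with the instruction that follows it.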

You should push at least one commit to your repo that contains the implementation of all of these helpers.

Add new instruction types

You need to implement each of these methods:

Assemble_U_Type
Assemble_SB_Type
Assemble_UJ_Type

After you finish each of these methods (or pause working) you must commit and push your repo with a meaningful commit message. As you debug and fix errors you should consider doing more commits.

For the latter two, I recommend you first get them working when a number is passed in as the branch target (e.g. bne t0 t1 40); after that works, add support for labels (e.g. beq t0 t1 LABEL). All numbers used as targets for branches or jumps should be the PC-relative offset, not the immediate itself, so you need to adjust the number before you translate it into an immediate. For reference, the bne instruction above should be interpreted as branching to PC = PC + 40.
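The gap between the offset and the stored immediate comes from the instruction formats: bit 0 of the offset is never stored, and the remaining bits are placed out of order. In the standard RISC-V encoding, an SB-type instruction stores imm[12|10:5] in its top seven bits and imm[4:1|11] just above the opcode, while a UJ-type stores imm[20|10:1|11|19:12]. The sketch below only splits an offset into those fields; the function names and the binary-string return values are assumptions, not the lab's interface.

```python
# Illustrative sketch. Function names and string outputs are assumptions.

def twos_complement_bits(value, width):
    """Return value as a two's-complement bit string of the given width."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def sb_immediate_fields(offset):
    """Split a branch offset into (imm[12|10:5], imm[4:1|11]) bit strings."""
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError(f"branch offset out of range or odd: {offset}")
    bits = twos_complement_bits(offset, 13)  # bits[0] is bit 12, bits[12] is bit 0
    high = bits[0] + bits[2:8]               # imm[12] then imm[10:5]
    low = bits[8:12] + bits[1]               # imm[4:1] then imm[11]
    return high, low


def uj_immediate_field(offset):
    """Return the 20-bit imm[20|10:1|11|19:12] field for a jump offset."""
    if offset % 2 != 0 or not -(1 << 20) <= offset <= (1 << 20) - 2:
        raise ValueError(f"jump offset out of range or odd: {offset}")
    bits = twos_complement_bits(offset, 21)  # bits[0] is bit 20, bits[20] is bit 0
    return bits[0] + bits[10:20] + bits[9] + bits[1:9]
```

For the offset 40 used above, sb_immediate_fields(40) gives ("0000001", "01000"): bit 5 of the offset lands in the upper field and bit 3 in the lower one.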

Your assembler only needs to support decimal immediates; assume all numbers passed as operands to an instruction are in decimal. As you work through the test cases you may want to consider the binary or hex representation of the numbers used in the tests.
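As a small illustration of a decimal immediate turning into bits, the sketch below encodes a U-type instruction, whose standard layout is imm[31:12], then rd, then the opcode. The three-entry register table, the choice to accept only 0 to 1048575, and the 32-character binary-string output are all assumptions for the example; your Assemble_U_Type may use different helpers and a different output format.

```python
# Assumed for illustration: a tiny register table, an unsigned 20-bit
# immediate range, and a 32-character binary-string result.
REGISTERS = {"t0": 5, "t1": 6, "s0": 8}
U_OPCODES = {"lui": "0110111", "auipc": "0010111"}


def assemble_u_type_sketch(line):
    """Encode e.g. 'lui t0, 5' as imm[31:12] + rd + opcode."""
    mnemonic, rd, imm_text = line.replace(",", " ").split()
    imm = int(imm_text, 10)  # operands are decimal only
    if not 0 <= imm < 2**20:
        raise ValueError(f"immediate does not fit in 20 bits: {imm}")
    return format(imm, "020b") + format(REGISTERS[rd], "05b") + U_OPCODES[mnemonic]
```

Converting the result to hex, for example with hex(int(assemble_u_type_sketch("lui t0, 5"), 2)), is one way to compare against a simulator's listing.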

Consider using the helper method is_int() and the helpers you wrote above.

2 Support pseudoinstructions

Your assembler needs to support a few pseudoinstructions. The behavior of individual pseudoinstructions is defined in pseudoinstruction_handler.py. Note that you can see the list of methods and their docs for this file by opening docs/pseudoinstruction_handler.html in your repo.

Implement individual pseudoinstruction methods

In pseudoinstruction_handler.py you will see these methods, which you need to implement:

double
diffsums
push
li
beqz
jalif

You should push at least one commit to your repo that contains the implementation of all of these pseudoinstructions.

The behavior of each of these pseudoinstructions is defined in the code comments. Each of these methods takes two arguments; for example, a method might be called like this:

double("double t5, s0", 7)

The first argument is the actual use of the pseudoinstruction, the second argument is the line/instruction number in the assembled program where this pseudoinstruction starts. For most of these methods this is just for error output, but some of them will need to use this argument in other ways.

Each method should return a list of new core instructions that together have the same behavior as the pseudoinstruction.
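To show the shape of such a method without giving away any of the six you need to write, here is a sketch for a hypothetical pseudoinstruction, neg rd, rs, which is not part of this lab. It takes the same two arguments described above and returns a list of core instruction strings; the exact string format your assembler expects is an assumption.

```python
# "neg" is a hypothetical pseudoinstruction used only to illustrate the
# (line, line_num) -> list-of-core-instructions shape.

def neg(line, line_num):
    """Expand 'neg rd, rs' into core instructions computing rd = 0 - rs."""
    parts = line.replace(",", " ").split()
    if len(parts) != 3 or parts[0] != "neg":
        # line_num is used here only to make the error message useful.
        raise ValueError(f"line {line_num}: expected 'neg rd, rs', got {line!r}")
    _, rd, rs = parts
    return [f"sub {rd}, x0, {rs}"]
```

Calling neg("neg t5, s0", 7) returns the one-element list ["sub t5, x0, s0"]. Because the expansion reads rs before writing rd, it still works when both operands name the same register.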

Recall that pseudoinstructions should not change other registers beyond those implied by the instruction definition, with the exception of at (aka x31) which can be modified freely. Also, recall that any register could be used as any register operand in a pseudoinstruction. Additionally, think carefully about the size of immediates supported by each pseudoinstruction.
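On immediate sizes: an I-type immediate is 12 bits and signed, so it covers only -2048 to 2047. A common way to build a larger constant is to split it into an upper 20-bit part (suitable for lui) and a signed lower 12-bit part (suitable for addi), bumping the upper part by one when the lower part is negative. The sketch below shows only that arithmetic; the function name is hypothetical, and whether a given pseudoinstruction in this lab must accept 32-bit values is determined by the code comments, not by this example.

```python
# Hypothetical helper showing the arithmetic only.

def split_immediate(value):
    """Split a 32-bit value into (upper20, lower12) with lower12 signed.

    The pair satisfies (upper20 << 12) + lower12 == value, modulo 2**32.
    """
    if not -(1 << 31) <= value < (1 << 32):
        raise ValueError(f"value does not fit in 32 bits: {value}")
    # Sign-extend the low 12 bits so the result is in -2048..2047.
    lower12 = ((value & 0xFFF) ^ 0x800) - 0x800
    # Whatever remains is a multiple of 4096; keep it as a 20-bit field.
    upper20 = ((value - lower12) >> 12) & 0xFFFFF
    return upper20, lower12
```

For example, split_immediate(2048) returns (1, -2048): the upper part contributes 4096 and the lower part subtracts 2048.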

The test cases for the pseudoinstructions do not directly test if your implementation of the instruction works. Instead they test general rules about the pseudoinstructions. It will be your job to explain how you know these pseudoinstructions behave correctly in the lab worksheet. You will need to run the code produced by your pseudoinstructions in a RISC-V simulator. You could use this online one, or jump to Lab 3 and install the one we use there.

You may want to consider using the helpers: replace_all(), assembler.reverse(), assembler.is_int(), assembler.dec_to_bin(), assembler.index_to_address(), assembler.label_to_offset()

Once you've implemented all of the pseudoinstruction handlers, tests in the TestPseudos unit test category should pass, except for the one called test_pseudoinstructions_pass.

Implement the pseudoinstruction pass of the assembler

Look at the main assemble_asm() method. This function performs all the steps of the assembler; notice that after removing comments, the next step is the processing of pseudoinstructions. For now, the assembler assumes there are no pseudoinstructions. Go look at the definition of pseudoinstruction_pass().

You need to implement this function. The big picture is this: raw code comes in that may contain pseudoinstructions, and this method should return a list of core instructions (and labels) in which the pseudoinstructions have been replaced. You need to look at each line of code and determine whether it is a pseudoinstruction. If it is, call the correct pseudoinstruction-replacement method (which you wrote above); otherwise, simply leave the line unchanged. The pseudoinstruction methods are passed in as the second argument to pseudoinstruction_pass(), so to apply the double method I could do this:

new_code = pseudos_dictionary["double"](my_line, inst_num)

new_code will be a list of the new instructions that I can add to my growing program.

You should look at the other pass methods that are implemented for you if you need help starting. Keep in mind that one pseudoinstruction can become more than one core instruction; this will affect the line number/address of each instruction following a pseudoinstruction in the original code.

You should consider the different "cases" you may hit as this method goes over each line of code in a file:

case 1: a core instruction
case 2: a label (e.g. LABEL:)
case 3: a label and another instruction (e.g. LABEL: add t0, t0, t0)
case 4: a pseudoinstruction
case 5: an unknown instruction

Work on these one case at a time. Make sure that your pseudoinstruction pass returns the correct number of instructions and that labels point to the right instructions.
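The five cases above can be sketched as a single loop. This is only one possible structure under stated assumptions: the function name, the set of core mnemonics, the choice to attach a label to the first expanded instruction, and raising ValueError for unknown instructions are all illustrative, and the real pseudoinstruction_pass() may be expected to behave differently.

```python
# Illustrative sketch. CORE_MNEMONICS is a hypothetical subset, and the error
# handling and label placement are assumptions about the desired behavior.
CORE_MNEMONICS = {"add", "addi", "sub", "beq", "bne", "jal", "lui"}


def pseudoinstruction_pass_sketch(lines, pseudos_dictionary):
    output = []
    inst_num = 0  # index of the next instruction in the expanded program
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        label, sep, rest = line.partition(":")
        prefix = f"{label.strip()}: " if sep else ""
        body = rest.strip() if sep else line
        if not body:
            output.append(line)  # case 2: a label alone emits no instruction
            continue
        mnemonic = body.split()[0]
        if mnemonic in pseudos_dictionary:  # case 4 (and case 3 with a pseudo)
            expanded = list(pseudos_dictionary[mnemonic](body, inst_num))
        elif mnemonic in CORE_MNEMONICS:  # case 1 (and case 3 with a core op)
            expanded = [body]
        else:  # case 5
            raise ValueError(f"instruction {inst_num}: unknown instruction {mnemonic!r}")
        if not expanded:
            raise ValueError(f"instruction {inst_num}: {mnemonic!r} expanded to nothing")
        # Keep any label attached to the first instruction of the expansion.
        expanded[0] = prefix + expanded[0]
        output.extend(expanded)
        inst_num += len(expanded)  # later instructions shift by the expansion size
    return output
```

Note that inst_num advances by the number of instructions actually emitted, which is what keeps later labels and offsets pointing at the right place after a multi-instruction expansion.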

The tests for this method once again do not test the exact instructions you return, since implementations of pseudoinstructions can vary. Instead, they test general patterns about what the code should look like. You will need to explain how you know your code is correct on the lab worksheet.

By the time you commit this work to your repo you should have at least 6 commits with meaningful messages, if not more.

Once you've implemented pseudoinstruction_pass (and have working pseudoinstruction handlers), all tests in the TestPseudosFileAssembly and TestPseudos unit test categories should pass.

Grading Rubric

All the labs for CSSE232 have these general requirements:

General Requirements for all Labs

The solution fits the need
Aspects of performance are discussed
The solution is tested for correctness
The submission shows iteration and documentation

Some labs will hit some of these requirements more than others. But you should always be thinking about them.

Fill out the Lab Worksheet

In the worksheet, explain how you satisfy each of these items. Some guidelines:

None of these answers should be more than 100 words. (Unless otherwise indicated on the worksheet.)

You will upload this sheet to Gradescope. Make sure you indicate your partner when you upload.

Lab 2 Rubric items	Possible Points
Lab Worksheet	20
Implements U-Types	10
Implements SB-Types	15
Implements UJ-Types	15
Implements 6 pseudoinstructions	15
Implements pseudoinstruction_pass	5
Autograder test cases	20
Total out of	100

Submit your completed worksheet to Gradescope, only one per team (make sure all team members' names are included). In Gradescope you are able to add your team members' names to the submission; make sure you do so. You can find the Gradescope link on the course Moodle page.