RISC-V CPU Pipeline Simulation
1. Introduction
RISC-V is an open-source architecture and instruction set standard originating from Berkeley. This project requires you to implement a RISC-V CPU pipeline simulator based on the standard five-stage pipeline. You will need to implement a subset of the instructions from the RV32I instruction set specified in RISC-V Specification 2.2. Implementing a complete CPU simulator can effectively exercise system programming capabilities and deepen understanding of architecture-related knowledge.
2. Project introduction
2.1. Project requirements
The most important part of this project is to implement a RISC-V CPU pipeline simulator. The specific requirements are as follows:
• A command-line argument parser module that parses the path to the RISC-V binary file given on the command line. It also provides an option to enable or disable printing the history log at the end of the run. Please make sure your simulator can be run as Simulator xxx.riscv, where xxx.riscv is the path to the RISC-V binary.
• Load ELF files (already implemented in the template).
• A history module (a reference structure is provided in the template).
• Memory management (a reference structure is provided in the template; it is needed by some instructions).
• Simulate the required instructions (see Section 2.7), including handling system calls (see Section 2.6).
• Handle data hazards, control hazards and memory access hazards.
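As an illustration of the argument parser requirement, here is a minimal C++17 sketch. The Options struct, the parseArgs function and the --history flag name are assumptions made for this example; the template may use different names and flags.

```cpp
#include <optional>
#include <string>
#include <vector>

// Parsed command-line options. Field and flag names are illustrative
// assumptions, not part of the provided template.
struct Options {
    std::string elfPath;        // path to the RISC-V binary
    bool printHistory = false;  // print the history log when the run ends
};

// Returns std::nullopt when the arguments are invalid (no path, more than
// one path, or an unknown flag). `args` excludes the program name argv[0].
std::optional<Options> parseArgs(const std::vector<std::string>& args) {
    Options opts;
    for (const std::string& arg : args) {
        if (arg == "--history") {                    // hypothetical flag name
            opts.printHistory = true;
        } else if (!arg.empty() && arg[0] == '-') {
            return std::nullopt;                     // unknown option
        } else if (opts.elfPath.empty()) {
            opts.elfPath = arg;                      // first positional argument
        } else {
            return std::nullopt;                     // second binary path given
        }
    }
    if (opts.elfPath.empty()) return std::nullopt;   // the path is mandatory
    return opts;
}
```

In main(), the arguments can be passed as std::vector<std::string>(argv + 1, argv + argc), printing a usage message when the result is empty.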
2.2. Possible Structure
NOTE: This is just a possible structure. You can design your own structure.
The overview diagram (Figure 1) of the simulator code architecture is shown below. The entry point of the simulator is Main.cpp, which includes parsing parameters, loading ELF files, initializing the simulator module, and finally calling the simulate() function to enter the execution of the simulator. Unless there is an error in executing the simulator, theoretically, simulate() function will not return.
The simulator itself is designed as one large class, the Simulator class. Its data includes the PC, general registers, pipeline registers, execution history recorders, the memory module and the branch prediction module (the latter is not required and will not affect your score). Because the memory module and the branch prediction module are relatively independent, they are implemented as two separate classes, MemoryManager and BranchPredictor.
The core function of the simulator is simulate(), which performs cycle-level simulation. In each cycle, it executes five functions: fetch(), decode(), execute(), accessMemory() and writeBack(). Each takes as input the pipeline register written in the previous cycle and outputs to the pipeline register for the next cycle. At the end of a cycle, the contents of the new registers are copied into those used as inputs. During execution, each function handles its share of data hazards, control hazards and memory access hazards, and records history information at the appropriate places.
Figure 1: Simulator Architecture
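The double-buffered pipeline registers described above can be sketched as follows. This is a toy model under stated assumptions: PipeReg, ToyPipeline and tick() are hypothetical names, the payload is reduced to a raw instruction word, and no hazards are modeled.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// What one stage hands to the next. The payload is reduced to a raw
// instruction word; a real pipeline register carries decoded fields too.
struct PipeReg {
    bool bubble = true;  // true: no instruction in this slot
    uint32_t inst = 0;
};

// Toy five-stage pipeline. All names here are illustrative.
struct ToyPipeline {
    // Slots: 0 = IF/ID, 1 = ID/EX, 2 = EX/MEM, 3 = MEM/WB.
    std::array<PipeReg, 4> cur{};   // written in the previous cycle
    std::array<PipeReg, 4> next{};  // being written in this cycle
    std::vector<uint32_t> program;  // instruction words to fetch
    std::size_t pc = 0;             // index into program, not a byte address
    uint64_t cycles = 0;
    std::vector<uint32_t> retired;  // instructions that reached write-back

    void fetch() {
        PipeReg out;                // defaults to a bubble
        if (pc < program.size()) {
            out.bubble = false;
            out.inst = program[pc++];
        }
        next[0] = out;
    }
    void decode()       { next[1] = cur[0]; }
    void execute()      { next[2] = cur[1]; }
    void accessMemory() { next[3] = cur[2]; }
    void writeBack() {
        if (!cur[3].bubble) retired.push_back(cur[3].inst);
    }

    // One clock cycle. Every stage reads only `cur` and writes only `next`,
    // so the order of the five calls does not change the result.
    void tick() {
        fetch();
        decode();
        execute();
        accessMemory();
        writeBack();
        cur = next;  // end of cycle: new registers become the inputs
        ++cycles;
    }
};
```

In this model an instruction fetched in cycle 1 retires in cycle 5, so N instructions without stalls take N + 4 cycles. Stalls and bubbles would be added by letting a stage keep its input register or write a bubble into `next`.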
2.3. Memory Management
MemoryManager provides a simple memory access interface for the simulator. It must support arbitrary memory sizes and addresses, and detect illegal memory accesses. You need to load the different sections from the ELF file into the correct memory locations, based on each section’s virtual memory address and memory size (not file size). Then, when simulating load/store instructions, you only need to compute the memory address and use it directly with the MemoryManager, without any conversion in between.
The sample implementation of MemoryManager uses a mechanism similar to the two-level page table used in the x86 architecture (a single-level page table is also OK). Specifically, it logically divides the 32-bit memory space (4GB) into pages of 4KB (2^12 bytes), using the highest 10 bits of the memory address as an index into the level-one page table, the next 10 bits as an index into the level-two page table, and the lowest 12 bits as an offset within a single page.
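The 10/10/12 address split can be sketched as below. This is a minimal sketch, not the template's MemoryManager: the class name PagedMemory and its method names are assumptions, pages are allocated lazily, and unmapped accesses are reported with an exception.

```cpp
#include <array>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Lazily allocated two-level page table for a 32-bit address space.
class PagedMemory {
    static constexpr uint32_t kPageSize = 1u << 12;  // 4KB pages
    using Page = std::array<uint8_t, kPageSize>;
    using L2Table = std::array<std::unique_ptr<Page>, 1024>;
    std::array<std::unique_ptr<L2Table>, 1024> l1_;  // level-one table

    static uint32_t l1Index(uint32_t addr) { return addr >> 22; }            // highest 10 bits
    static uint32_t l2Index(uint32_t addr) { return (addr >> 12) & 0x3FF; }  // next 10 bits
    static uint32_t offset(uint32_t addr) { return addr & 0xFFF; }           // lowest 12 bits

public:
    bool isPageMapped(uint32_t addr) const {
        const auto& l2 = l1_[l1Index(addr)];
        return l2 && (*l2)[l2Index(addr)];
    }

    // Allocate the zero-filled page containing addr if it is not mapped yet.
    void addPage(uint32_t addr) {
        auto& l2 = l1_[l1Index(addr)];
        if (!l2) l2 = std::make_unique<L2Table>();
        auto& page = (*l2)[l2Index(addr)];
        if (!page) page = std::make_unique<Page>();  // value-initialized to zero
    }

    uint8_t getByte(uint32_t addr) const {
        if (!isPageMapped(addr)) throw std::out_of_range("read from unmapped address");
        return (*(*l1_[l1Index(addr)])[l2Index(addr)])[offset(addr)];
    }

    void setByte(uint32_t addr, uint8_t value) {
        if (!isPageMapped(addr)) throw std::out_of_range("write to unmapped address");
        (*(*l1_[l1Index(addr)])[l2Index(addr)])[offset(addr)] = value;
    }
};
```

Only pages that are actually touched consume memory, so the full 4GB space never has to be allocated. Wider accesses (half-word, word) can be built from getByte and setByte in little-endian order.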
2.4. ELF Load and Initialization
Following Section 3, you need to implement the ELF file loader and initialize the simulator. The ELF file loader is responsible for loading the ELF file into the simulator’s memory. The initialization process is responsible for setting the initial state of the simulator, including the initial values of the PC, the general registers, the stack pointer, and so on.
NOTE: This is a sample memory manager. You can create your own design, but you have to make sure that your ELF file is loaded into the correct position. With the above implementation, you don’t need to allocate 4GB of memory at once; you only need to allocate as needed. Of course, you can also directly allocate the memory required for loading the ELF file along with an additional stack area (Figure 2).
Figure 2: Memory Layout
Figure 2 shows the typical layout of a simple computer’s program memory with the text, various data, and stack and heap sections. The text and data segments are placed in their corresponding positions when you load the ELF file. After loading the ELF file, initialize the stack by setting the stack pointer to the top of memory and adjusting the stack size as needed, for example, to 4MB. Heap management is typically handled by software, so you don’t need to worry about it.
2.5. Simulator Implementation
For the RISC-V pipeline simulator, you need to implement the five-stage pipeline:
• Fetch: fetches the next instruction. All instructions in the RV32I instruction set have a fixed length of 4 bytes.
• Decode: translates instructions into RISC-V assembly format strings. In addition, it mimics hardware implementations by abstracting common fields such as op1, op2 and dest from instructions.
• Execute: executes the corresponding behavior for each instruction type. It also checks for data hazards, control hazards and memory access hazards based on the current instruction and the instruction in the decode stage, and takes actions accordingly. At this point, a jump or branch instruction learns whether the jump is taken, and bubbles are inserted into the pipeline registers when the wrong path was fetched.
• Memory access: performs memory read and write operations, detects data hazards and performs forwarding. When detecting data hazards, it needs to consider both general data hazards and the situation where the pipeline stalled in the previous cycle due to a memory access hazard. The priority of the forwarding sources must also be taken into account.
• Write back: writes execution results back to registers, and handles data hazards as above.
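To make the Decode step concrete, the sketch below extracts the fixed-position fields and three of the immediate formats defined by the RV32I base encoding. The struct and function names (Fields, splitFields, immI, immB, immU) are assumptions for this example.

```cpp
#include <cstdint>

// Fixed-position fields shared by the RV32I base formats.
struct Fields {
    uint32_t opcode;  // bits 6..0
    uint32_t rd;      // bits 11..7
    uint32_t funct3;  // bits 14..12
    uint32_t rs1;     // bits 19..15
    uint32_t rs2;     // bits 24..20
    uint32_t funct7;  // bits 31..25
};

Fields splitFields(uint32_t inst) {
    Fields f;
    f.opcode = inst & 0x7F;
    f.rd = (inst >> 7) & 0x1F;
    f.funct3 = (inst >> 12) & 0x7;
    f.rs1 = (inst >> 15) & 0x1F;
    f.rs2 = (inst >> 20) & 0x1F;
    f.funct7 = (inst >> 25) & 0x7F;
    return f;
}

// Sign-extend the low `bits` bits of value (upper bits must be zero).
int32_t signExtend(uint32_t value, unsigned bits) {
    const uint32_t sign = 1u << (bits - 1);
    return static_cast<int32_t>((value ^ sign) - sign);
}

// I-type immediate (addi, loads, jalr): inst[31:20].
int32_t immI(uint32_t inst) { return signExtend(inst >> 20, 12); }

// B-type immediate (branches): imm[12|10:5] in inst[31:25],
// imm[4:1|11] in inst[11:7]; bit 0 is always zero.
int32_t immB(uint32_t inst) {
    const uint32_t imm = (((inst >> 31) & 0x1) << 12) |
                         (((inst >> 7) & 0x1) << 11) |
                         (((inst >> 25) & 0x3F) << 5) |
                         (((inst >> 8) & 0xF) << 1);
    return signExtend(imm, 13);
}

// U-type immediate (lui, auipc): inst[31:12] placed in the upper 20 bits.
int32_t immU(uint32_t inst) { return static_cast<int32_t>(inst & 0xFFFFF000u); }
```

For example, the word 0x00500093 (addi x1, x0, 5) splits into opcode 0x13, rd 1, rs1 0, and immI returns 5. The S-type and J-type immediates follow the same pattern with their own bit layouts.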
For RISC-V pipeline hazards, I recommend the following links:
• RISC-V Pipeline Hazards from Berkeley
• RISC-V Pipeline Hazards from Washington
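As a rough illustration of forwarding priority and load-use stalls, here is one common textbook arrangement. It is a sketch under assumptions, not the template's logic: the InFlight struct and both function names are hypothetical, and forwarding is resolved at the point where the consumer needs its operands.

```cpp
#include <cstdint>

// Result of an older instruction that is still in flight.
struct InFlight {
    bool writesReg = false;  // does it write a destination register?
    bool isLoad = false;     // a load has no value until after memory access
    uint32_t rd = 0;         // destination register index
    uint32_t value = 0;      // result, once available
};

// True when the instruction in Decode reads a register that the load
// currently in Execute will write. One bubble is needed in that case,
// because the loaded value does not exist yet.
bool needsLoadUseStall(const InFlight& inExecute, uint32_t rs1, uint32_t rs2) {
    if (!inExecute.writesReg || !inExecute.isLoad || inExecute.rd == 0) return false;
    return inExecute.rd == rs1 || inExecute.rd == rs2;
}

// Pick the operand for source register rs. exMem holds the instruction one
// ahead of the consumer, memWb the instruction two ahead. This assumes the
// load-use stall above has already been applied, so exMem never holds a
// load whose value is still missing.
uint32_t forwardOperand(uint32_t rs, uint32_t regFileValue,
                        const InFlight& exMem, const InFlight& memWb) {
    if (rs == 0) return 0;                                      // x0 is hard-wired to zero
    if (exMem.writesReg && exMem.rd == rs) return exMem.value;  // most recent writer wins
    if (memWb.writesReg && memWb.rd == rs) return memWb.value;  // older writer
    return regFileValue;                                        // no hazard
}
```

The ordering of the two checks in forwardOperand is the priority rule: when both in-flight instructions write the same register, the younger one (exMem) holds the value the program expects.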
2.6. System Call
This project uses the following system calls. A system call is triggered by the ecall instruction. The a7 register holds the system call number, the a0 register holds the system call parameter, and the return value is stored in the a0 register.
System Call Name    System Call Number    Parameter                        Return Value
Print string        0                     The initial address of string    None
Print char          1                     The value of char                None
Print number        2                     The value of number              None
Exit program        3                     None                             None
Read char           4                     None                             The value of char
Read number         5                     None                             The value of number
The detailed information about system call can be found in test-release/lib.c.
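The system call table can be turned into a dispatch routine along these lines. This is a hedged sketch: handleSyscall, SyscallResult and the readByte callback are hypothetical names, and several details are assumptions that should be checked against test-release/lib.c (strings are taken to be NUL-terminated, numbers are printed as signed 32-bit values, and calls without a return value leave a0 unchanged).

```cpp
#include <cstdint>
#include <functional>
#include <sstream>  // std::istream / std::ostream; string streams help in tests
#include <stdexcept>

struct SyscallResult {
    bool exitRequested = false;  // set by the "Exit program" call
    uint32_t a0 = 0;             // value to leave in a0 after the call
};

// a7: system call number, a0: parameter. readByte reads one byte of
// simulated memory (for example through the MemoryManager).
SyscallResult handleSyscall(uint32_t a7, uint32_t a0,
                            const std::function<uint8_t(uint32_t)>& readByte,
                            std::istream& in, std::ostream& out) {
    SyscallResult r;
    r.a0 = a0;  // assumption: calls without a return value keep a0 as is
    switch (a7) {
    case 0:  // print string; assumes a NUL-terminated string at address a0
        for (uint32_t addr = a0;; ++addr) {
            const uint8_t c = readByte(addr);
            if (c == 0) break;
            out.put(static_cast<char>(c));
        }
        break;
    case 1:  // print char
        out.put(static_cast<char>(a0));
        break;
    case 2:  // print number; signed interpretation is an assumption
        out << static_cast<int32_t>(a0);
        break;
    case 3:  // exit program
        r.exitRequested = true;
        break;
    case 4:  // read char; end of input yields 0xFFFFFFFF here
        r.a0 = static_cast<uint32_t>(in.get());
        break;
    case 5: {  // read number
        int32_t n = 0;
        in >> n;
        r.a0 = static_cast<uint32_t>(n);
        break;
    }
    default:
        throw std::invalid_argument("unknown system call number");
    }
    return r;
}
```

Passing the streams in, rather than using std::cin and std::cout directly, keeps the handler easy to test; the simulator itself would pass std::cin and std::cout.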
2.7. Required Instructions
The following list gives the instructions that you need to implement in the simulator. You can refer to the RISC-V Specification 2.2 for detailed information about these instructions.
"lui", "auipc", "jal", "jalr", "beq", "bne", "blt", "bge", "bltu",
"bgeu", "lb", "lh", "lw", "ld", "lbu", "lhu", "sb", "sh",
"sw", "sd", "addi", "slti", "sltiu", "xori", "ori", "andi", "slli",
"srli", "srai", "add", "sub", "sll", "slt", "sltu", "xor", "srl",
"sra", "or", "and", "ecall"
2.8. History
The simulator needs to record the number of cycles and the number of instructions executed during the simulation process, and output the number of cycles and the number of instructions executed when the input parameters indicate that these need to be printed.
2.9. Advanced Features
You can implement the following advanced features to improve the simulator:
• Implement a branch prediction module to improve the performance of the simulator.
• Implement a cache module to improve the performance of the simulator. This will be related to the next project.
• Implement an out-of-order execution module to improve the performance of the simulator.
• Some other advanced features that you are interested in.
NOTE: This part is not tested by the test scripts, but you need to implement it and provide
the usage of it in your ReadMe.md. Please make sure your ReadMe is clear and detailed.
NOTE: These advanced features are not necessary for this project, and they will not affect
your score. If you have interest, you can implement them.
2.10. Test Cases
We provide some test cases for you to verify your simulator. You can find them in the test-release directory. We also have other programs to further verify your simulator; all of these test cases will count toward your final score.
How to run the test cases:
• Download the test-release.zip file from the course platform.
• Unzip the test-release.zip file in the root directory of your project. You will get the test-release directory and run-test-release.sh.
• Run the run-test-release.sh script in the root directory of your project, like bash run-test-release.sh
The example output:
> bash run-test-release.sh
Comparing ./test-release/add.out and ./test-release/add.ref
Succeed! Files ./test-release/add.out ./test-release/add.ref are the same
Comparing ./test-release/mul-div.out and ./test-release/mul-div.ref
Succeed! Files ./test-release/mul-div.out ./test-release/mul-div.ref are the same
Comparing ./test-release/n!.out and ./test-release/n!.ref
Succeed! Files ./test-release/n!.out ./test-release/n!.ref are the same
Comparing ./test-release/qsort.out and ./test-release/qsort.ref
Succeed! Files ./test-release/qsort.out ./test-release/qsort.ref are the same
Comparing ./test-release/simple-function.out and ./test-release/simple-function.ref
Succeed! Files ./test-release/simple-function.out ./test-release/simple-function.ref are the same
5 / 5 tests pass!
3. ELF File Loader
3.1. ELF File Format
There are three main types of object files in the ELF (Executable and Linking Format) format:
• Relocatable file: holds code and data suitable for linking with other object files.
• Executable file: holds a program suitable for execution.
• Shared object file: holds code and data suitable for linking in two contexts.
Object files participate in program linking (building a program) and program execution (running a program). For convenience and efficiency, the object file format provides parallel views of a file’s contents, reflecting the differing needs of these activities. Figure 3 shows the basic structure of an ELF object file.
Please make sure your executable file is named Simulator and is located in the build directory.
Figure 3: Object File Format
Section in object file format:
• Sections are used during the linking and compilation process.
• They represent different types of data within the ELF file, such as code (.text), initialized data (.data), uninitialized data (.bss), the symbol table (.symtab), the string table (.strtab), relocation information (.rel.text, .rel.data), and debugging information.
• Sections contain information that is useful for linking and for debugging, but they are not necessarily loaded into memory when the program is executed.
• The ELF file contains a section header table that lists all sections and their attributes.
Segment in object file format:
• Segments are used during the execution process.
• They are typically a collection of sections that need to be loaded into memory as a unit.
In summary, sections are for organization and use during compilation and linking, while segments are for mapping the ELF file into memory during execution. An object file segment contains one or more sections, as described under “Segment Contents” in the ELF specification.
3.2. Program Loading
As the system creates or augments a process image, it logically copies a file’s segment to a virtual memory segment. Virtual addresses and file offsets for SYSTEM V architecture segments are congruent modulo 4KB (0x1000) or larger powers of 2, which means when you divide the virtual address and the file offset by 4KB, the remainders are the same. Because 4KB is the maximum page size, the files will be suitable for mapping regardless of physical page size. Figure 4 shows the basic structure of an ELF executable file.
Figure 4: Executable File
Although the example’s file offsets and virtual addresses are congruent modulo 4KB for both text and data, up to four file pages hold impure text or data (depending on page size and file system block size).
• The first text page contains the ELF header, the program header table, and other info.
• The last text page holds a copy of the beginning of data.
• The first data page has a copy of the end of text.
• The last data page may contain file information not relevant to the running process.
Figure 5: Process Image Segments
Logically, the system enforces the memory permissions as if each segment were complete and separate; segments’ addresses are adjusted to ensure each logical page in the address space has a single set of permissions. In the example (Figure 4) above, the region of the file holding the end of text and the beginning of data will be mapped twice; at one virtual address for text and at a different virtual address for data.
The end of the data segment requires special handling for uninitialized data, which the system defines to begin with zero values. This data is often referred to as the .bss segment (Block Started by Symbol), a portion of a program’s memory reserved for variables that have not been given an explicit initial value by the programmer. Thus if a file’s last data page includes information not in the logical memory page, the extraneous data must be set to zero, not the unknown contents of the executable file. “Impurities” in the other three pages are not logically part of the process image; whether the system expunges them is unspecified. The memory image (Figure 5) for this program follows, assuming 4KB (0x1000) pages.
3.3. Program Loading Example
Here is an example of loading an ELF file into memory. You need to allocate memory for segments according to MSize (memory size). For addresses beyond FSize (file size), you need to fill the memory with 0.
NOTE: In this project, you do not need to care about “Impurities” in the pages. Just deal with the uninitialized data.
The following is the output of the simulator when loading the add.riscv file.
> ./Simulator ../test-inclass/add.riscv -s -v
==========ELF Information==========
Type: ELF32
Encoding: Little Endian
ISA: RISC-V(0xf3)
Number of Sections: 14
ID    Name                 Address    Size
[0]                        0x0        0
[1]   .text                0x100e8    8636
[2]   .eh_frame            0x13000    4
[3]   .init_array          0x13008    16
[4]   .fini_array          0x13018    8
[5]   .data                0x13020    2472
[6]   .sdata               0x139c8    32
[7]   .sbss                0x139e8    56
[8]   .bss                 0x13a20    1416
[9]   .comment             0x0        45
[10]  .riscv.attributes    0x0        28
[11]  .symtab              0x0        4632
[12]  .strtab              0x0        1478
[13]  .shstrtab            0x0        118
Number of Segments: 3
ID Flags Address FSize MSize
[0] 0x4 0x0 28 0
[1] 0x5 0x10000 8868 8868
[2] 0x6 0x13000 2536 4008
===================================
Memory Pages:
0x0-0x400000:
0x10000-0x11000
0x11000-0x12000
0x12000-0x13000
0x13000-0x14000
Fetched instruction 0x00003197 at address 0x1012c
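The MSize/FSize rule from this example can be sketched as a small loader routine. The code is a minimal sketch under assumptions: loadSegment is a hypothetical name, and SparseMemory (a byte-addressed hash map) stands in for whatever MemoryManager you use.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Stand-in for the MemoryManager: one entry per mapped byte address.
using SparseMemory = std::unordered_map<uint32_t, uint8_t>;

// Copy one loadable segment into memory. fileOffset and fileSize describe
// where the segment's bytes live in the ELF file; vaddr and memSize describe
// the region it occupies in memory. memSize may exceed fileSize.
void loadSegment(SparseMemory& mem, const std::vector<uint8_t>& file,
                 uint32_t fileOffset, uint32_t vaddr,
                 uint32_t fileSize, uint32_t memSize) {
    for (uint32_t i = 0; i < memSize; ++i) {
        // Bytes past fileSize have no backing data in the file: use zero.
        const uint8_t byte = (i < fileSize) ? file.at(fileOffset + i) : 0;
        mem[vaddr + i] = byte;
    }
}
```

For segment [2] in the output above (FSize 2536, MSize 4008), the first 2536 bytes come from the file and the remaining 1472 bytes, which correspond to .sbss and .bss, are set to zero.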
3.4. Some other information
You can find the entry point in the ELF header (the e_entry field). It gives the virtual address to which the system first transfers control, thus starting the process. If the file has no associated entry point, this field holds zero.
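If you write your own loader, reading e_entry could look like the sketch below. The field offsets follow the ELF32 header layout (e_machine at byte 18, e_entry at byte 24, header size 52 bytes); the names Elf32Summary and readElf32Header are assumptions for this example, and only little-endian ELF32 files are accepted.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Little-endian reads from a byte buffer, independent of host endianness.
uint16_t readLE16(const std::vector<uint8_t>& b, std::size_t off) {
    return static_cast<uint16_t>(b.at(off) | (b.at(off + 1) << 8));
}

uint32_t readLE32(const std::vector<uint8_t>& b, std::size_t off) {
    return static_cast<uint32_t>(b.at(off)) |
           (static_cast<uint32_t>(b.at(off + 1)) << 8) |
           (static_cast<uint32_t>(b.at(off + 2)) << 16) |
           (static_cast<uint32_t>(b.at(off + 3)) << 24);
}

struct Elf32Summary {
    uint16_t machine;  // e_machine; 0xF3 identifies RISC-V
    uint32_t entry;    // e_entry; initial value for the PC
};

Elf32Summary readElf32Header(const std::vector<uint8_t>& file) {
    if (file.size() < 52)  // an ELF32 header is 52 bytes long
        throw std::runtime_error("file too small for an ELF32 header");
    if (file[0] != 0x7F || file[1] != 'E' || file[2] != 'L' || file[3] != 'F')
        throw std::runtime_error("missing ELF magic");
    if (file[4] != 1)  // EI_CLASS: 1 means 32-bit objects
        throw std::runtime_error("not a 32-bit ELF file");
    if (file[5] != 1)  // EI_DATA: 1 means little-endian
        throw std::runtime_error("not a little-endian ELF file");
    Elf32Summary s;
    s.machine = readLE16(file, 18);
    s.entry = readLE32(file, 24);
    return s;
}
```

The returned entry value is what the simulator would load into the PC before the first fetch.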
You have the option to create your own ELF file loader or utilize existing libraries like elfio. Your choice won’t affect your score, but I recommend writing it yourself for a better understanding of the ELF file format and program loading process.
For more information about the ELF file format, you can refer to the following links: cmu-elf
NOTE: We have provided a sample ELF file loader. You may use it as is or modify it to suit your needs. Please ensure you understand it before using it.
NOTE: All ELF files used for testing are little-endian. Make sure you handle the file’s endianness accordingly.
4. Submission
For this project, you must use C/C++/Rust to implement the simulator. If you use Python, you will receive a score of 0. You need to submit the following files:
• src/*, including all source code files
• include/*, including all header files if you use C/C++
• CMakeLists.txt, the cmake file for your project if you use C/C++
• Cargo.toml, the cargo file for your project if you use Rust
• ReadMe.md, a brief introduction to your project, including the usage of your simulator, the implementation details of your simulator, the history information of your simulator, and how to compile and run your project.
• test-release/*, including all test cases provided by us; do not change the file names
• build.sh, a script to build your project, which should be able to compile your project just by running bash build.sh
Please compress all files into a single zip file and submit it to the course platform. The file name should be your student ID, like xxxxxxxxx.zip.
Please ensure that your simulator can be compiled by cmake with gcc/g++, or by cargo. Other build tools are not acceptable.
If you have any questions or suggestions about the submission process, please feel free to ask me (TA ZHANG Yanglin) in the course group or send me an email (lucky@lucky9.cyou / 119010446@link.cuhk.edu.cn).
5. Grading
For this assignment, you are to submit a RISC-V CPU pipeline simulator. If you have difficulty completing this, you may submit a sequential version of the simulator; however, you will receive a maximum of 30% of the score.
The overall score will be calculated as follows:
• Test cases not provided: 45%
• Provided test cases: 25%
• History (like Section 2.8): 10%
• ReadMe.md: 10%
• Code style and comments: 10%
• Advanced features (bonus): 5%
Some matters need attention:
• The code should be well-structured and easy to understand.
• The ReadMe.md should be clear and easy to understand. Please provide a detailed introduction to your simulator, including its usage, its implementation details, its history information, how to compile and run your project, and any other information that you consider important.
• Please make sure your project can be compiled and run on the Linux platform. If your project cannot be compiled and run, you will receive a score of 0.
• Do not plagiarize. If we discover that you have plagiarized, you will not only receive a score of zero for this project, but you will also fail this course directly. Additionally, we will report your actions to the Registry office. If plagiarism-detection software flags your submission and the TA confirms that you have indeed plagiarized, we will notify you via email.
• Please ensure that your project can pass the aforementioned test scripts (Section 2.10); we will provide you with some example tests. If your project does not yield the expected output, it will initially receive a score of zero. Following that, you may contact us with your test scripts, but the score for the part of your project pertaining to code style and comments will directly be 0.
6. Development Environment
6.1. RISC-V Environment Installation and Configuration
For convenience, this experiment is entirely based on the RISC-V 32I instruction set, with reference to the RISC-V Specification 2.2 standard.
The following steps were taken to configure the environment:
• Downloaded riscv-tools from GitHub, then configured, compiled and installed riscv-gnu-toolchain for the Linux platform.
• To use the official simulator as a reference, downloaded, compiled and installed riscv-qemu from GitHub.
Note that when compiling riscv-gnu-toolchain, you must specify that the toolchain and the C standard library use the RV32I instruction set. Otherwise, the compiler will use extended instruction sets such as RV32C and RV32D. Even if the compiler is set to emit only RV32I instructions at compile time, it would still link in standard library functions that use the extended instruction sets.
Therefore, to get an ELF program that uses only RV32I standard instructions, you must rebuild riscv-gnu-toolchain with the following options:
mkdir build; cd build
../configure --with-arch=rv32i --prefix=/path/to/riscv32i
make -j$(nproc)
During compilation, use -march=rv32i to let the compiler generate ELF programs for the RV32I standard instruction set:
riscv32-unknown-elf-gcc -march=rv32i add.c lib.c -o add.riscv
Disassemble the ELF program with the following command:
riscv32-unknown-elf-objdump -D add.riscv > add.s
NOTE: This section is for project test environment setup. You do not need to do this in your project.
7. Collection of potentially useful links
• RISC-V Pipeline Hazards from Berkeley
• RISC-V Pipeline Hazards from Washington
• elfio
• cmu-elf
• cmake-tutorial
• introduction to modern cmake
8. About template
We have provided template code in C++. You can refer to this code for your work, or modify it as per your requirements. We are not responsible for any errors that arise from your use of the template. This template is provided for reference only and its accuracy is not guaranteed (except for loading ELF). For those who choose to complete this project using Rust, I believe you have sufficient skills to do so without a template. However, you may still refer to the C++ template if necessary.