Project03 - RISC-V Emulation, Analysis, and Cache Simulation¶

Due Thu Apr 2nd by 11:59pm in your Project03 GitHub repo

Links¶

Tests: https://github.com/USF-CS631-S26/tests

Background¶

The goal of this project is to write an emulator in Rust for a subset of the RISC-V Instruction Set Architecture (ISA). Your emulator will execute real RISC-V machine code by decoding instructions and simulating their effects on processor state (registers, memory, program counter). For each target program, the driver runs three implementations (a Rust reference, the native RISC-V assembly function, and your emulator) and prints their results labeled Rust:, Asm:, and Emu: so they can be compared.

Beyond basic emulation, you will implement dynamic analysis to collect execution statistics (instruction counts by type, branch behavior) and a processor cache simulator supporting both direct-mapped and set-associative cache configurations.

Your emulator builds on lab05: reuse its instruction decoding and emulation logic and its emulator state.

Requirements¶

1. Write an emulator in Rust for a subset of the RISC-V ISA, enough to emulate the following programs:
   - quadratic_s (given)
   - midpoint_s (given)
   - max3_s (given)
   - to_upper_s (given)
   - get_bitseq_s (given)
   - get_bitseq_signed_s (given)
   - swap_s (given)
   - sort_s (given)
   - fib_rec_s (yours)
2. Support dynamic analysis of instruction execution with the -a flag (see Section 2).
3. Implement a processor cache simulator supporting four configurations (see Section 3):
   1. Direct mapped cache with a block size of 1 word
   2. Direct mapped cache with a block size of 4 words
   3. 4-way set associative cache with a block size of 1 word and LRU replacement
   4. 4-way set associative cache with a block size of 4 words and LRU replacement
4. All programs must produce output matching the automated tests exactly.

Section 1: RISC-V Emulation¶

Command Line¶

project03 [FLAGS] <prog> [args...]

Where <prog> is one of the target program names and [args...] are its arguments. Flags are described in Sections 2 and 3.

Driver Pattern¶

Your single project03 binary serves as a test harness. For each target program, the driver:

1. Calls a pure Rust reference implementation and prints the result with Rust:
2. Calls the native RISC-V assembly function via FFI and prints the result with Asm:
3. Initializes the emulator state, runs the emulator on the assembly function's machine code, and prints the result with Emu:

The Emu: output must match the Rust: and Asm: lines.

Emulator State¶

The emulator state includes 32 64-bit registers, a program counter, and an allocated stack, plus the flags and data used for dynamic analysis (Section 2) and cache simulation (Section 3):

pub struct RvState {
    pub regs: [u64; 32],
    pub pc: *const u8,
    pub stack: [u8; 8192],
    pub analyze: bool,
    pub cache_sim: bool,
    pub analysis: RvAnalysis,
    pub i_cache: Cache,
}

Initialization sets:

- pc to the address of the target assembly function
- a0–a3 (registers 10–13) to the function arguments
- ra (register 1) to null (used as the halt condition)
- sp (register 2) to the top of the stack

In addition, x0 (register 0) must always remain 0, so any write to it is discarded.

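The emulator loop fetches, decodes, and executes one instruction at a time until pc becomes null. This happens when the emulated function's outermost call returns via jalr ra, because that loads the null value placed in ra at initialization into pc.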
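The loop and its halt condition can be sketched as follows. This is a minimal illustration, not the lab05 code: MiniState, sign_extend, and run are hypothetical names, and only addi and jalr are handled so that a two-instruction function can run to completion.

```rust
/// Hypothetical cut-down state: only the fields this sketch needs.
struct MiniState {
    regs: [u64; 32],
    pc: *const u8,
}

/// Sign-extend the low `bits` bits of `value` to 64 bits.
fn sign_extend(value: u32, bits: u32) -> i64 {
    let shift = 64 - bits;
    (((value as u64) << shift) as i64) >> shift
}

/// Fetch, decode, and execute until pc becomes null.
fn run(state: &mut MiniState) {
    while !state.pc.is_null() {
        // Fetch: read the 32-bit instruction word that pc points at.
        let iw = unsafe { (state.pc as *const u32).read_unaligned() };

        // Decode the fields shared by the I-type instructions used here.
        let opcode = iw & 0x7f;
        let rd = ((iw >> 7) & 0x1f) as usize;
        let rs1 = ((iw >> 15) & 0x1f) as usize;
        let imm = sign_extend(iw >> 20, 12);

        // Execute.
        match opcode {
            // I-type arithmetic; this sketch treats every 0x13 as addi.
            0x13 => {
                state.regs[rd] = state.regs[rs1].wrapping_add(imm as u64);
                state.pc = state.pc.wrapping_add(4);
            }
            // jalr: compute the target before writing rd, in case rd == rs1.
            0x67 => {
                let target = state.regs[rs1].wrapping_add(imm as u64) & !1;
                state.regs[rd] = (state.pc as u64).wrapping_add(4);
                state.pc = target as *const u8;
            }
            _ => panic!("unsupported opcode {opcode:#x}"),
        }

        // x0 is hard-wired to zero, so undo any write to it.
        state.regs[0] = 0;
    }
}
```

With regs[1] (ra) left at 0, running the words 0x00550513 (addi a0, a0, 5) and 0x00008067 (jalr x0, 0(ra)) adds 5 to a0 and then stops, because the jalr loads null into pc.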

Target Programs¶

Program	Arguments	Description
`quadratic`	`x a b c`	Computes ax² + bx + c
`midpoint`	`start end`	Computes (start + end) / 2
`max3`	`a b c`	Returns maximum of three values
`to_upper`	`string`	Converts string to uppercase
`get_bitseq`	`n start end`	Extracts unsigned bit sequence
`get_bitseq_signed`	`n start end`	Extracts signed bit sequence
`swap`	`i j elem1 elem2 ...`	Swaps array elements at indices i and j
`sort`	`elem1 elem2 ...`	Sorts array in ascending order (insertion sort)
`fib_rec`	`n`	Computes Fibonacci(n) recursively
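As an illustration of the Rust reference side of the driver, a few of the simpler programs could be written as below. The function names and the 32-bit integer types are assumptions made for this sketch; the starter code may use different signatures.

```rust
/// Reference for `quadratic x a b c`: a*x^2 + b*x + c.
fn quadratic_rust(x: i32, a: i32, b: i32, c: i32) -> i32 {
    a * x * x + b * x + c
}

/// Reference for `midpoint start end`.
fn midpoint_rust(start: i32, end: i32) -> i32 {
    (start + end) / 2
}

/// Reference for `max3 a b c`.
fn max3_rust(a: i32, b: i32, c: i32) -> i32 {
    a.max(b).max(c)
}

/// Reference for `get_bitseq n start end`: bits start..=end of n, unsigned.
/// Assumes start <= end <= 31.
fn get_bitseq_rust(n: u32, start: u32, end: u32) -> u32 {
    let len = end - start + 1;
    let mask = if len >= 32 { u32::MAX } else { (1u32 << len) - 1 };
    (n >> start) & mask
}

/// Reference for `get_bitseq_signed n start end`: the same bits, with the
/// top bit of the sequence treated as a sign bit.
fn get_bitseq_signed_rust(n: u32, start: u32, end: u32) -> i32 {
    let len = end - start + 1;
    let shift = 32 - len;
    // Move the sequence to the top of the word, then shift back arithmetically.
    ((get_bitseq_rust(n, start, end) << shift) as i32) >> shift
}
```

For example, 94117 is 0x16FA5, so bits 4 through 7 are 0b1010, which is 10 unsigned and -6 as a signed 4-bit value.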

Examples¶

$ ./riscv-run project03 quadratic 2 1 2 3
Rust: 11
Asm: 11
Emu: 11

$ ./riscv-run project03 midpoint 0 10
Rust: 5
Asm: 5
Emu: 5

$ ./riscv-run project03 max3 3 7 5
Rust: 7
Asm: 7
Emu: 7

$ ./riscv-run project03 to_upper hello
Rust: HELLO
Asm: HELLO
Emu: HELLO

$ ./riscv-run project03 get_bitseq 94116 12 15
Rust: 6
Asm: 6
Emu: 6

$ ./riscv-run project03 get_bitseq_signed 94117 4 7
Rust: -6
Asm: -6
Emu: -6

$ ./riscv-run project03 fib_rec 10
Rust: 55
Asm: 55
Emu: 55

$ ./riscv-run project03 swap 0 3 10 20 30 40 50
Rust: 40 20 30 10 50
Asm: 40 20 30 10 50
Emu: 40 20 30 10 50

$ ./riscv-run project03 sort 5 3 1 4 2
Rust: 1 2 3 4 5
Asm: 1 2 3 4 5
Emu: 1 2 3 4 5

RISC-V Instructions¶

You do not need to emulate the entire RISC-V ISA, only enough instructions to run the target programs. You will likely need the following:

Format	Opcode	Instructions
R-type	`0x33`	`add`, `sub`, `mul`, `div`, `sll`, `srl`, `sra`, `and`
R-type (word)	`0x3B`	`sllw`, `srlw`, `sraw`
I-type (arithmetic)	`0x13`	`addi`, `slli`, `srli`, `srai`
I-type (load)	`0x03`	`lb`, `lw`, `ld`
I-type (jalr)	`0x67`	`jalr`
S-type (store)	`0x23`	`sb`, `sw`, `sd`
SB-type (branch)	`0x63`	`beq`, `bne`, `blt`, `bge`
UJ-type (jump)	`0x6F`	`jal`
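Each format in the table packs its immediate differently, so decoding usually starts with a few bit-extraction helpers. The sketch below follows the standard RV64 instruction encodings; the helper names (get_bits, sign_extend, imm_i, imm_s, imm_b, imm_j) are hypothetical and may not match the names in lab05 or bits.rs.

```rust
/// Extract `len` bits of `iw` starting at bit `start` (len < 32).
fn get_bits(iw: u32, start: u32, len: u32) -> u32 {
    (iw >> start) & ((1u32 << len) - 1)
}

/// Sign-extend the low `bits` bits of `value` to 64 bits.
fn sign_extend(value: u32, bits: u32) -> i64 {
    let shift = 64 - bits;
    (((value as u64) << shift) as i64) >> shift
}

/// I-type: imm[11:0] is in bits 31:20.
fn imm_i(iw: u32) -> i64 {
    sign_extend(get_bits(iw, 20, 12), 12)
}

/// S-type: imm[11:5] is in bits 31:25, imm[4:0] is in bits 11:7.
fn imm_s(iw: u32) -> i64 {
    let v = (get_bits(iw, 25, 7) << 5) | get_bits(iw, 7, 5);
    sign_extend(v, 12)
}

/// SB-type: imm[12|10:5] is in bits 31:25, imm[4:1|11] is in bits 11:7.
/// Bit 0 of the offset is always zero.
fn imm_b(iw: u32) -> i64 {
    let v = (get_bits(iw, 31, 1) << 12)
        | (get_bits(iw, 7, 1) << 11)
        | (get_bits(iw, 25, 6) << 5)
        | (get_bits(iw, 8, 4) << 1);
    sign_extend(v, 13)
}

/// UJ-type: imm[20|10:1|11|19:12] is in bits 31:12.
/// Bit 0 of the offset is always zero.
fn imm_j(iw: u32) -> i64 {
    let v = (get_bits(iw, 31, 1) << 20)
        | (get_bits(iw, 12, 8) << 12)
        | (get_bits(iw, 20, 1) << 11)
        | (get_bits(iw, 21, 10) << 1);
    sign_extend(v, 21)
}
```

The opcode is always get_bits(iw, 0, 7), and funct3 (bits 14:12) and funct7 (bits 31:25) then select the specific instruction within a format.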

Section 2: Dynamic Analysis¶

The -a flag enables dynamic analysis output after the emulator runs. Your emulator collects the following metrics during execution:

Metric	Description
`i_count`	Total number of instructions executed
`ir_count`	Number of R-type and I-type (arithmetic) instructions executed
`ld_count`	Number of load instructions executed
`st_count`	Number of store instructions executed
`j_count`	Number of jump instructions executed (`jal`, `jalr`)
`b_taken`	Number of conditional branches taken
`b_not_taken`	Number of conditional branches not taken

Analysis Output Format¶

=== Analysis
Instructions Executed  = N
R-type + I-type        = N (X.XX%)
Loads                  = N (X.XX%)
Stores                 = N (X.XX%)
Jumps/JAL/JALR         = N (X.XX%)
Conditional branches   = N (X.XX%)
  Branches taken       = N (X.XX%)
  Branches not taken   = N (X.XX%)

The percentages for R-type + I-type, Loads, Stores, Jumps, and Conditional branches are relative to total instruction count. The percentages for branches taken/not taken are relative to total conditional branches.
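One way to produce this report is sketched below. AnalysisCounts and format_analysis are hypothetical names (the field names come from the metrics table), and the column width of 22 is read off the format above, so check the exact spacing against the automated tests.

```rust
/// Counters named after the metrics table; the struct name is hypothetical.
#[derive(Default)]
struct AnalysisCounts {
    i_count: u64,
    ir_count: u64,
    ld_count: u64,
    st_count: u64,
    j_count: u64,
    b_taken: u64,
    b_not_taken: u64,
}

/// Percentage of `part` in `whole`, or 0 when `whole` is 0.
fn percent(part: u64, whole: u64) -> f64 {
    if whole == 0 { 0.0 } else { part as f64 * 100.0 / whole as f64 }
}

/// One "label = N (X.XX%)" row with the label padded to 22 columns.
fn row(label: &str, n: u64, whole: u64) -> String {
    format!("{:<22} = {} ({:.2}%)\n", label, n, percent(n, whole))
}

fn format_analysis(a: &AnalysisCounts) -> String {
    let branches = a.b_taken + a.b_not_taken;
    let mut out = String::from("=== Analysis\n");
    out.push_str(&format!("{:<22} = {}\n", "Instructions Executed", a.i_count));
    // These rows are relative to the total instruction count.
    out.push_str(&row("R-type + I-type", a.ir_count, a.i_count));
    out.push_str(&row("Loads", a.ld_count, a.i_count));
    out.push_str(&row("Stores", a.st_count, a.i_count));
    out.push_str(&row("Jumps/JAL/JALR", a.j_count, a.i_count));
    out.push_str(&row("Conditional branches", branches, a.i_count));
    // These rows are relative to the number of conditional branches.
    out.push_str(&row("  Branches taken", a.b_taken, branches));
    out.push_str(&row("  Branches not taken", a.b_not_taken, branches));
    out
}
```

Guarding the division in percent matters for programs that execute no conditional branches, where the taken and not-taken rows would otherwise divide by zero.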

Example¶

$ ./riscv-run project03 -a quadratic 2 1 2 3
Rust: 11
Asm: 11
Emu: 11
=== Analysis
Instructions Executed  = 10
R-type + I-type        = 8 (80.00%)
Loads                  = 0 (0.00%)
Stores                 = 0 (0.00%)
Jumps/JAL/JALR         = 1 (10.00%)
Conditional branches   = 1 (10.00%)
  Branches taken       = 0 (0.00%)
  Branches not taken   = 1 (100.00%)

Section 3: Cache Simulation¶

Your emulator includes an instruction cache simulator. The cache intercepts instruction fetches — each time the emulator fetches an instruction word, the fetch goes through the cache. See the Cache Memory Guide for background on cache memory concepts and the different cache types.

Command Line Flags¶

Flag	Arguments	Description
`-dm`	`<size> <block_size>`	Direct-mapped cache with given size (in slots) and block size (in words)
`-sa`	`<size> <block_size> <ways>`	Set-associative cache with given size (in slots), block size (in words), and number of ways
`-v`	(none)	Verbose cache output (shows individual lookups)
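Since all flags come before the program name, they can be consumed in a single pass over the arguments. The sketch below is one possible approach; Options, CacheConfig, and parse_args are hypothetical names, and args is assumed to exclude the binary name (for example, std::env::args().skip(1) collected into a Vec).

```rust
#[derive(Debug, PartialEq)]
enum CacheConfig {
    None,
    DirectMapped { size: usize, block_size: usize },
    SetAssociative { size: usize, block_size: usize, ways: usize },
}

struct Options {
    analyze: bool,
    verbose: bool,
    cache: CacheConfig,
    prog: String,
    prog_args: Vec<String>,
}

/// Parse `[FLAGS] <prog> [args...]`.
fn parse_args(args: &[String]) -> Result<Options, String> {
    let mut opts = Options {
        analyze: false,
        verbose: false,
        cache: CacheConfig::None,
        prog: String::new(),
        prog_args: Vec::new(),
    };
    // Parse the numeric flag argument at position `idx`.
    let num = |idx: usize| -> Result<usize, String> {
        args.get(idx)
            .ok_or_else(|| "missing numeric argument".to_string())?
            .parse::<usize>()
            .map_err(|e| e.to_string())
    };
    let mut i = 0;
    while i < args.len() {
        match args[i].as_str() {
            "-a" => { opts.analyze = true; i += 1; }
            "-v" => { opts.verbose = true; i += 1; }
            "-dm" => {
                opts.cache = CacheConfig::DirectMapped { size: num(i + 1)?, block_size: num(i + 2)? };
                i += 3;
            }
            "-sa" => {
                opts.cache = CacheConfig::SetAssociative {
                    size: num(i + 1)?,
                    block_size: num(i + 2)?,
                    ways: num(i + 3)?,
                };
                i += 4;
            }
            // The first non-flag word is the program; the rest belong to it.
            _ => {
                opts.prog = args[i].clone();
                opts.prog_args = args[i + 1..].to_vec();
                return Ok(opts);
            }
        }
    }
    Err("missing program name".to_string())
}
```

Stopping at the first non-flag word means program arguments are passed through unchanged, even when they begin with a dash (such as a negative number).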

Cache Data Structures¶

pub struct CacheSlot {
    pub valid: bool,
    pub tag: u64,
    pub block: [u32; CACHE_MAX_BLOCK_SIZE],
    pub timestamp: u64,
}

pub struct Cache {
    pub slots: Vec<CacheSlot>,
    pub cache_type: CacheType,
    pub size: i32,
    pub ways: i32,
    pub block_size: i32,
    // Address decomposition
    pub block_mask: u64,
    pub block_bits: u64,
    pub index_mask: u64,
    pub index_bits: u64,
    // Statistics
    pub refs: i32,
    pub hits: i32,
    pub misses: i32,
    pub misses_cold: i32,
    pub misses_hot: i32,
}

Address Decomposition¶

An instruction address is decomposed into three fields (from least to most significant bits, after removing the 2-bit word offset):

Block index — selects a word within the cache block
Set index — selects which set (or slot for direct-mapped) to check
Tag — the remaining upper bits, stored in the slot to verify a match

The number of sets is size / (block_size * ways).
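The decomposition can be sketched with shifts and masks. This assumes the block size and the number of sets are powers of two; AddrParts, num_sets, and decompose are hypothetical names used only for this illustration.

```rust
#[derive(Debug, PartialEq)]
struct AddrParts {
    block_index: u64,
    set_index: u64,
    tag: u64,
}

/// Number of sets, using the formula given above.
fn num_sets(size: u64, block_size: u64, ways: u64) -> u64 {
    size / (block_size * ways)
}

/// Split a byte address into block index, set index, and tag.
/// `block_size` (in words) and `sets` must be powers of two.
fn decompose(addr: u64, block_size: u64, sets: u64) -> AddrParts {
    let block_bits = block_size.trailing_zeros();
    let index_bits = sets.trailing_zeros();
    // Drop the 2-bit byte offset within a word.
    let word_addr = addr >> 2;
    AddrParts {
        block_index: word_addr & (block_size - 1),
        set_index: (word_addr >> block_bits) & (sets - 1),
        tag: word_addr >> (block_bits + index_bits),
    }
}
```

For example, with 4-word blocks and 16 sets, address 0x1234 becomes word address 0x48D, which splits into block index 1, set index 3, and tag 0x12.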

Direct-Mapped Cache¶

A direct-mapped cache has one slot per set (ways = 1). On a lookup:

Compute the block index, set index, and tag from the address
Check if the slot at the set index is valid and its tag matches
Hit: return the word from the block
Miss: load the entire block from memory into the slot, set the tag and valid bit, and return the requested word. Track whether this is a cold miss (slot was invalid) or a hot miss (slot was valid but tag didn't match)
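The hit and miss bookkeeping for these steps can be sketched as below. To stay short, this version tracks only valid bits and tags and omits the block data, so it counts hits and misses without returning instruction words; DmCache, Slot, and access are hypothetical names, and both constructor arguments are assumed to be powers of two.

```rust
#[derive(Clone, Default)]
struct Slot {
    valid: bool,
    tag: u64,
}

struct DmCache {
    slots: Vec<Slot>,
    block_bits: u32,
    index_bits: u32,
    refs: u64,
    hits: u64,
    misses_cold: u64,
    misses_hot: u64,
}

impl DmCache {
    /// `num_slots` slots, each holding a block of `block_size` words.
    fn new(num_slots: usize, block_size: usize) -> Self {
        DmCache {
            slots: vec![Slot::default(); num_slots],
            block_bits: block_size.trailing_zeros(),
            index_bits: num_slots.trailing_zeros(),
            refs: 0,
            hits: 0,
            misses_cold: 0,
            misses_hot: 0,
        }
    }

    /// Record one fetch of the word at `addr`; returns true on a hit.
    fn access(&mut self, addr: u64) -> bool {
        self.refs += 1;
        let word_addr = addr >> 2;
        let index = ((word_addr >> self.block_bits) & ((1u64 << self.index_bits) - 1)) as usize;
        let tag = word_addr >> (self.block_bits + self.index_bits);

        let slot = &mut self.slots[index];
        if slot.valid && slot.tag == tag {
            self.hits += 1;
            return true;
        }
        // Miss: cold if the slot was never filled, hot if it held another block.
        if slot.valid {
            self.misses_hot += 1;
        } else {
            self.misses_cold += 1;
        }
        slot.valid = true;
        slot.tag = tag;
        false
    }
}
```

With 4 slots and 1-word blocks, addresses 0 and 16 both map to slot 0, so alternating between them produces a hot miss on every access after the first.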

Set-Associative Cache¶

A set-associative cache has multiple ways (slots) per set. On a lookup:

Compute the block index, set index, and tag from the address
Check each way in the set for a valid slot with a matching tag
Hit: update the slot's timestamp (for LRU tracking) and return the word
Miss with invalid slot available: fill the first invalid slot in the set (cold miss)
Miss with all slots valid: evict the slot with the oldest (smallest) timestamp using LRU replacement (hot miss)

On every access (hit or miss), set the accessed slot's timestamp to the current reference count. A newly filled slot therefore starts out as the most recently used slot in its set.
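The lookup and LRU replacement steps can be sketched as follows. As a simplification, only valid bits, tags, and timestamps are tracked, not block data; SaCache, Slot, and access are hypothetical names, and the slots of one set are assumed to be stored contiguously in a single Vec.

```rust
#[derive(Clone, Default)]
struct Slot {
    valid: bool,
    tag: u64,
    timestamp: u64,
}

struct SaCache {
    slots: Vec<Slot>,
    ways: usize,
    block_bits: u32,
    index_bits: u32,
    refs: u64,
    hits: u64,
    misses_cold: u64,
    misses_hot: u64,
}

impl SaCache {
    /// `num_sets` and `block_size` (in words) must be powers of two.
    fn new(num_sets: usize, ways: usize, block_size: usize) -> Self {
        SaCache {
            slots: vec![Slot::default(); num_sets * ways],
            ways,
            block_bits: block_size.trailing_zeros(),
            index_bits: num_sets.trailing_zeros(),
            refs: 0,
            hits: 0,
            misses_cold: 0,
            misses_hot: 0,
        }
    }

    /// Record one fetch of the word at `addr`; returns true on a hit.
    fn access(&mut self, addr: u64) -> bool {
        self.refs += 1;
        let now = self.refs; // the reference count doubles as the LRU clock
        let word_addr = addr >> 2;
        let set = ((word_addr >> self.block_bits) & ((1u64 << self.index_bits) - 1)) as usize;
        let tag = word_addr >> (self.block_bits + self.index_bits);
        let base = set * self.ways;
        let set_slots = &mut self.slots[base..base + self.ways];

        // Hit: refresh the timestamp.
        if let Some(slot) = set_slots.iter_mut().find(|s| s.valid && s.tag == tag) {
            slot.timestamp = now;
            self.hits += 1;
            return true;
        }
        // Miss: use the first invalid slot (cold), else evict the LRU slot (hot).
        let victim = match set_slots.iter().position(|s| !s.valid) {
            Some(i) => {
                self.misses_cold += 1;
                i
            }
            None => {
                self.misses_hot += 1;
                set_slots
                    .iter()
                    .enumerate()
                    .min_by_key(|(_, s)| s.timestamp)
                    .map(|(i, _)| i)
                    .unwrap()
            }
        };
        set_slots[victim] = Slot { valid: true, tag, timestamp: now };
        false
    }
}
```

Using the reference count as the timestamp gives every access a unique, increasing value, so the smallest timestamp in a full set always identifies the least recently used slot.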

Cache Output Format¶

=== Cache (I)
Type          = direct mapped | set associative
Size          = N slots
Block size    = N words
Ways          = N
References    = N
Hits          = N (X.XX% hit ratio)
Misses        = N (X.XX% miss ratio)
Misses (cold) = N
Misses (hot)  = N
% Used        = X.XX%

% Used is the percentage of slots that have their valid bit set.
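The derived values in the report can be computed as in the sketch below. The helper names are hypothetical, and the label width of 13 is read off the format above, so verify the spacing against the automated tests.

```rust
/// Percentage of `part` in `whole`, or 0 when `whole` is 0.
fn ratio_percent(part: u64, whole: u64) -> f64 {
    if whole == 0 { 0.0 } else { part as f64 * 100.0 / whole as f64 }
}

/// % Used: share of slots whose valid bit is set.
fn percent_used(valid_bits: &[bool]) -> f64 {
    let used = valid_bits.iter().filter(|v| **v).count();
    ratio_percent(used as u64, valid_bits.len() as u64)
}

/// The "Hits" row, with the label padded to 13 columns.
fn hits_line(hits: u64, refs: u64) -> String {
    format!("{:<13} = {} ({:.2}% hit ratio)", "Hits", hits, ratio_percent(hits, refs))
}

/// The "% Used" row.
fn used_line(valid_bits: &[bool]) -> String {
    format!("{:<13} = {:.2}%", "% Used", percent_used(valid_bits))
}
```

In the full simulator the valid bits would come from the cache's slots, for example by mapping each CacheSlot to its valid field.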

Examples¶

$ ./riscv-run project03 -dm 256 1 fib_rec 10
Rust: 55
Asm: 55
Emu: 55
=== Cache (I)
Type          = direct mapped
Size          = 256 slots
Block size    = 1 words
Ways          = 1
...

$ ./riscv-run project03 -sa 256 4 4 sort 5 3 1 4 2
Rust: 1 2 3 4 5
Asm: 1 2 3 4 5
Emu: 1 2 3 4 5
=== Cache (I)
Type          = set associative
Size          = 256 slots
Block size    = 4 words
Ways          = 4
...

Section 4: Build System¶

Project Structure¶

project03/
├── Cargo.toml
├── build.rs
├── Makefile
├── Dockerfile
├── riscv-run
├── .cargo/
│   └── config.toml
├── asm/
│   ├── quadratic_s.s
│   ├── midpoint_s.s
│   ├── max3_s.s
│   ├── to_upper_s.s
│   ├── get_bitseq_s.s
│   ├── get_bitseq_signed_s.s
│   ├── fib_rec_s.s
│   ├── swap_s.s
│   └── sort_s.s
└── src/
    ├── main.rs
    ├── rv_emu.rs
    ├── cache.rs
    ├── functions.rs
    ├── bits.rs
    └── asm_ffi.rs

Cargo.toml¶

Your Cargo.toml uses the cc crate for building assembly files:

[package]
name = "project03"
version = "0.1.0"
edition = "2024"

[build-dependencies]
cc = "1"

build.rs¶

The build.rs script compiles assembly files only when targeting RISC-V:

fn main() {
    let target = std::env::var("TARGET").unwrap_or_default();

    if target.contains("riscv") {
        cc::Build::new()
            .flag("-march=rv64g")
            .file("asm/quadratic_s.s")
            .file("asm/midpoint_s.s")
            .file("asm/max3_s.s")
            .file("asm/to_upper_s.s")
            .file("asm/get_bitseq_s.s")
            .file("asm/get_bitseq_signed_s.s")
            .file("asm/fib_rec_s.s")
            .file("asm/swap_s.s")
            .file("asm/sort_s.s")
            .compile("asm_functions");

        println!("cargo:rustc-link-arg-bins=-lasm_functions");
    }

    println!("cargo:rerun-if-changed=asm/");
}

.cargo/config.toml¶

[target.riscv64gc-unknown-linux-gnu]
linker = "riscv64-linux-gnu-gcc"
runner = "qemu-riscv64 -L /usr/riscv64-linux-gnu"

Makefile¶

The Makefile auto-detects whether you're on a RISC-V machine or need Docker:

UNAME_M := $(shell uname -m)
IMAGE_NAME := project03-riscv

ifeq ($(UNAME_M),riscv64)
  CARGO_CMD = cargo
  CARGO_FLAGS =
else
  CARGO_CMD = docker run --rm -v $(CURDIR):/project $(IMAGE_NAME) cargo
  CARGO_FLAGS = --target riscv64gc-unknown-linux-gnu
endif

build: docker-image-ensure
    $(CARGO_CMD) build $(CARGO_FLAGS)

run: docker-image-ensure
    $(CARGO_CMD) run $(CARGO_FLAGS) -- $(ARGS)

clean:
    cargo clean

docker-image-ensure:
ifeq ($(UNAME_M),riscv64)
    @true
else
    @if ! docker image inspect $(IMAGE_NAME) >/dev/null 2>&1; then \
        echo "Building Docker image '$(IMAGE_NAME)'..." >&2; \
        docker build -t $(IMAGE_NAME) .; \
    fi
endif

docker-build:
    docker build -t $(IMAGE_NAME) .

docker-shell:
    docker run --rm -it -v $(CURDIR):/project $(IMAGE_NAME) bash

Key targets:

make build — build the project
make run ARGS="quadratic 2 1 2 3" — run with arguments
make docker-build — rebuild the Docker image
make docker-shell — open a shell inside the Docker container

Dockerfile¶

FROM ubuntu:24.04

RUN apt-get update && apt-get install -y \
    gcc-riscv64-linux-gnu qemu-user curl build-essential \
    && rm -rf /var/lib/apt/lists/*

# Install Rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
ENV PATH="/root/.cargo/bin:${PATH}"
RUN rustup target add riscv64gc-unknown-linux-gnu

WORKDIR /project

Running Natively on RISC-V¶

On a RISC-V machine (e.g., the class server), build and run directly with Cargo:

$ cargo build
$ cargo run -- quadratic 2 1 2 3
$ cargo run -- -a fib_rec 10
$ cargo run -- -dm 256 4 sort 5 3 1 4 2

Cross-Compilation with Docker¶

On non-RISC-V hosts (macOS, x86 Linux), use the riscv-run script or the Makefile, which handle Docker automatically. Make sure Docker is installed and running.

$ ./riscv-run project03 quadratic 2 1 2 3
$ ./riscv-run project03 -a fib_rec 10
$ make run ARGS="-dm 256 4 sort 5 3 1 4 2"

Grading¶

Tests: https://github.com/USF-CS631-S26/tests

Grading is based on automated tests (100 points total).

Code Quality¶

Code quality deductions may be applied and can be earned back. We are looking for:

Consistent spacing and indentation
Consistent naming and commenting
No commented-out ("dead") code
No redundant or overly complicated code
A clean repo: no build products, extra files, etc.