# Introduction to RISC-V Assembly

## CS 631 Systems Foundations, Feb 26, 2026

---

## Today's Agenda

1. What is RISC-V?
2. The register set and calling convention
3. Instruction syntax and types
4. Pseudo-instructions
5. Assembly file structure and the Rust interface
6. In-class examples: progressive complexity
7. Control flow patterns
8. Memory access: load and store

---

## What is RISC-V?

An **open-standard instruction set architecture** (ISA):

- Free to use: no licensing fees (unlike x86, ARM)
- Clean, modular design, created at UC Berkeley in 2010
- Small base ISA + optional extensions (M, A, F, D)
- This course: **RV64IM** (64-bit base + multiply/divide)

| ISA | Type | Open? | Common Uses |
|-----|------|-------|-------------|
| x86 / x86-64 | CISC | No | Desktops, servers |
| ARM / AArch64 | RISC | Licensed | Phones, Apple Silicon |
| RISC-V | RISC | Yes | Embedded, education, growing |

---

## Why Learn Assembly?

- **Understand performance**: see what the compiler generates
- **Debug effectively**: read disassembly, step through instructions
- **Systems software**: OS kernels, compilers, and embedded code
- **Security**: buffer overflows and exploits operate at the instruction level

In this course we write RISC-V assembly called from **Rust**, combining low-level control with a modern language.

---

## The Programming Model

*Diagram: the processor (32 registers, PC) exchanges data with memory (stack, heap, data, code) through load / store instructions.*

- **Registers**: fast on-chip storage (32 × 64-bit)
- **Memory**: byte-addressable, accessed via load/store
- **Program Counter (PC)**: points to the next instruction
- Arithmetic operates **only on registers**: data must be loaded from memory first

---

## The RISC-V Register Set

| Register | ABI Name | Usage | Preserved across calls? |
|----------|----------|-------|------------|
| x0 | `zero` | Hardwired to zero | N/A |
| x1 | `ra` | Return address | No (caller-saved) |
| x2 | `sp` | Stack pointer | Yes |
| x3 | `gp` | Global pointer | N/A |
| x4 | `tp` | Thread pointer | N/A |
| x5–x7 | `t0`–`t2` | Temporaries | No |
| x8–x9 | `s0`–`s1` | Saved (`s0` doubles as frame pointer) | Yes |
| x10–x11 | `a0`–`a1` | Args / return value | No |
| x12–x17 | `a2`–`a7` | Arguments | No |
| x18–x27 | `s2`–`s11` | Saved | Yes |
| x28–x31 | `t3`–`t6` | Temporaries | No |

---

## Calling Convention for Lab03

All six Lab03 functions are **leaf functions**: they do not call other functions.

- **Arguments** arrive in `a0`, `a1`, `a2`, `a3`
- **Return value** goes in `a0`
- **Temporaries** `t0`–`t6`: use freely for intermediate calculations
- No need to save/restore registers: only `t` and `a` registers are needed

```text
Rust:    fn add4_s(a: i32, b: i32, c: i32, d: i32) -> i32
                   ↓       ↓       ↓       ↓           ↓
RISC-V:            a0      a1      a2      a3    → ret a0
```

---

## Instruction Format

General syntax:

```asm
opcode destination, source1, source2    # comment
```

Examples:

```asm
add  a0, a0, a1    # a0 = a0 + a1
addi t0, t0, 1     # t0 = t0 + 1 (immediate)
mul  a0, a0, a1    # a0 = a0 * a1
```

Key rule: **the destination register comes first**. Stores and branches, which write no register, are the exception.

---

## Data Processing Instructions

| Instruction | Meaning | Example |
|---|---|---|
| `add rd, rs1, rs2` | rd = rs1 + rs2 | `add a0, a0, a1` |
| `sub rd, rs1, rs2` | rd = rs1 − rs2 | `sub a0, a0, a1` |
| `mul rd, rs1, rs2` | rd = rs1 × rs2 | `mul a0, a0, a1` |
| `div rd, rs1, rs2` | rd = rs1 / rs2 | `div a0, a0, a1` |
| `rem rd, rs1, rs2` | rd = rs1 % rs2 | `rem t0, a0, a1` |
| `and rd, rs1, rs2` | rd = rs1 & rs2 | `and a0, a0, a1` |
| `or rd, rs1, rs2` | rd = rs1 \| rs2 | `or a0, a0, a1` |
| `xor rd, rs1, rs2` | rd = rs1 ^ rs2 | `xor a0, a0, a1` |
| `sll rd, rs1, rs2` | rd = rs1 << rs2 | `sll a0, a0, a1` |
| `srl rd, rs1, rs2` | rd = rs1 >> rs2 (logical) | `srl a0, a0, a1` |

---

## Immediate Instructions

Replace the second source register with a constant. `addi`, `andi`, `ori`, and `xori` take a 12-bit signed immediate (−2048 to 2047); the shift instructions take a shift amount from 0 to 63 on RV64.

| Instruction | Meaning | Example |
|---|---|---|
| `addi rd, rs, imm` | rd = rs + imm | `addi a0, a0, 1` |
| `andi rd, rs, imm` | rd = rs & imm | `andi a0, a0, 0xFF` |
| `ori rd, rs, imm` | rd = rs \| imm | `ori a0, a0, 1` |
| `xori rd, rs, imm` | rd = rs ^ imm | `xori a0, a0, -1` |
| `slli rd, rs, imm` | rd = rs << imm | `slli a0, a0, 2` |
| `srli rd, rs, imm` | rd = rs >> imm | `srli a0, a0, 4` |

There is no `subi` instruction: use `addi` with a negative value, e.g. `addi a0, a0, -1`.

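
The 12-bit signed range explains both the limit on `addi` constants and why `addi a0, a0, -1` can stand in for a missing `subi`. The Rust sketch below is illustrative only: `fits_imm12` and `sign_extend_imm12` are hypothetical helper names, not part of the course code. It checks whether a constant fits the immediate field and models how the 12 encoded bits are sign-extended, so the bit pattern `0xFFF` is read as −1. Under these assumptions `fits_imm12(2048)` is false, which is why larger constants need more than one instruction.

```rust
/// Returns true if `value` fits the 12-bit signed immediate field
/// used by `addi`, `andi`, `ori`, and `xori` (range -2048..=2047).
/// Hypothetical helper for illustration.
pub fn fits_imm12(value: i64) -> bool {
    (-2048..=2047).contains(&value)
}

/// Sign-extends the low 12 bits of `raw`, modelling how an I-type
/// immediate is widened before it is used. Hypothetical helper.
pub fn sign_extend_imm12(raw: u32) -> i64 {
    // Move bit 11 up to the sign bit of an i32, then shift back
    // arithmetically so the sign bit is copied into the upper bits.
    (((raw << 20) as i32) >> 20) as i64
}
```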

---

## Pseudo-Instructions

Assembler conveniences that expand to real instructions:

| Pseudo | Expansion | Meaning |
|---|---|---|
| `li rd, imm` | `addi rd, zero, imm` | Load immediate |
| `mv rd, rs` | `addi rd, rs, 0` | Copy register |
| `not rd, rs` | `xori rd, rs, -1` | Bitwise NOT |
| `neg rd, rs` | `sub rd, zero, rs` | Negate |
| `j label` | `jal zero, label` | Unconditional jump |
| `ret` | `jalr zero, ra, 0` | Return to caller |
| `nop` | `addi zero, zero, 0` | No operation |
| `ble rs1, rs2, L` | `bge rs2, rs1, L` | Branch if ≤ |
| `bgt rs1, rs2, L` | `blt rs2, rs1, L` | Branch if > |

`li` expands to a single `addi` only when `imm` fits in 12 signed bits; for larger constants the assembler emits a longer sequence starting with `lui`.

---

## Assembly File Structure

```asm
.global function_name
function_name:
    # instructions here
    ret
```

- **`.global`** makes the label visible to the linker (so Rust can call it)
- **`function_name:`** is a label marking the function entry point
- **`ret`** returns control to the caller
- The function name must match the `extern "C"` declaration in Rust

---

## The Rust-Assembly Interface

**Rust side**

```rust
extern "C" {
    fn add3_s(a: i32, b: i32, c: i32) -> i32;
}

fn main() {
    let r = unsafe { add3_s(1, 2, 3) };
    println!("{}", r);
}
```

**Assembly side**

```asm
.global add3_s
add3_s:
    add a0, a0, a1
    add a0, a0, a2
    ret
```

- `extern "C"` → C calling convention (args in a0–a7)
- `unsafe` → Rust can't verify foreign code
- Function name must match `.global` label

---

## The Build Pipeline

*Diagram: the `cc` crate assembles the `.s` files into `libasm_functions.a`, `rustc` compiles `src/*.rs` into object files, and the linker combines both into the final binary.*

```rust
// build.rs
fn main() {
    cc::Build::new()
        .file("asm/add3_s.s")
        .file("asm/loop_s.s")
        // ...
        .compile("asm_functions");
    println!("cargo:rerun-if-changed=asm/add3_s.s");
}
```

Run with: `cargo run --bin add3 -- 1 2 3`

---

## Example 1: `first_s` - No Arguments

Computes `3 + 99` and returns the result.

**Rust**

```rust
fn first_rust() -> i32 {
    let x = 3;
    let y = 99;
    x + y
}
```

**Assembly**

```asm
.global first_s
first_s:
    li t0, 3
    li t1, 99
    add a0, t0, t1
    ret
```

**New concepts**: `li` (load immediate), `add`, return value in `a0`, `ret`

---

## Example 2: `add1_s` - One Argument

Adds 1 to its argument.

**Rust**

```rust
fn add1_rust(a: i32) -> i32 {
    a + 1
}
```

**Assembly**

```asm
.global add1_s

# int a - a0
add1_s:
    addi a0, a0, 1
    ret
```

**New concepts**: argument arrives in `a0`, `addi` (add immediate)

---

## Example 3: `add3_s` - Multiple Arguments

Adds three arguments. `add` only takes two source registers, so we chain operations.

**Rust**

```rust
fn add3_rust(a: i32, b: i32, c: i32) -> i32 {
    a + b + c
}
```

**Assembly**

```asm
.global add3_s

# a0 - a, a1 - b, a2 - c
add3_s:
    add a0, a0, a1
    add a0, a0, a2
    ret
```

**New concepts**: multiple args in `a0`, `a1`, `a2`; accumulating result into `a0`

---

## Example 4: `add3arr_s` - Array Access

Receives a pointer to an array of three `i32` values, returns their sum.

**Rust**

```rust
extern "C" {
    fn add3arr_s(arr: *const i32) -> i32;
}

fn add3arr_rust(arr: &[i32; 3]) -> i32 {
    arr[0] + arr[1] + arr[2]
}
```

**Assembly**

```asm
.global add3arr_s

# a0 - int arr[] (address)
add3arr_s:
    lw t0, (a0)
    addi a0, a0, 4
    lw t1, (a0)
    add t0, t0, t1
    addi a0, a0, 4
    lw t1, (a0)
    add t0, t0, t1
    mv a0, t0
    ret
```

**New concepts**: `lw` (load word), pointer arithmetic (+4 per `i32`), `mv`

---

## Memory Access: Load and Store

RISC-V is a **load/store architecture**: arithmetic operates only on registers.

| Instruction | Bytes | Use For |
|---|---|---|
| `lb` / `lbu` | 1 | `i8` / `u8` |
| `lh` / `lhu` | 2 | `i16` / `u16` |
| `lw` / `lwu` | 4 | `i32` / `u32` |
| `ld` | 8 | `i64` / pointers |
| `sb` / `sh` / `sw` / `sd` | 1/2/4/8 | Store equivalents |

Syntax: `lw t0, offset(base)` → address = base + offset

```asm
lw t0, 0(a0)    # load arr[0]
lw t1, 4(a0)    # load arr[1]
lw t2, 8(a0)    # load arr[2]
```

---

## Array Indexing

For an `i32` array (4 bytes per element):

| Element | Offset | Load |
|---------|--------|------|
| `arr[0]` | 0 | `lw t0, 0(a0)` |
| `arr[1]` | 4 | `lw t0, 4(a0)` |
| `arr[2]` | 8 | `lw t0, 8(a0)` |
| `arr[i]` | i×4 | compute `base + i×4`, then `lw` |

Two ways to compute the byte offset for element `i`:

- Multiply: `li t1, 4` then `mul t0, i_reg, t1` (`mul` has no immediate form, so the constant must be in a register)
- Shift (cheaper): `slli t0, i_reg, 2`, since a left shift by 2 multiplies by 4

**Lab03 findmax**: receives the array pointer in `a0` and the length in `a1`. Use `lw` with offsets that are multiples of 4.

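
The offset arithmetic above can be checked from the Rust side. The sketch below is an illustration under stated assumptions, not course code: `offset_mul`, `offset_shift`, and `load_word` are hypothetical names. It shows that multiplying the index by the element size and shifting it left by 2 give the same byte offset, and that reading four bytes at `base + offset` yields `arr[i]`, which is what `lw` does with the address it is given. For instance, `load_word(&[10, 20, 30], 2)` reads the element at byte offset 8.

```rust
use std::mem::size_of;

/// Byte offset of element `i` in an `i32` array, via multiplication.
pub fn offset_mul(i: usize) -> usize {
    i * size_of::<i32>()
}

/// Same offset via a left shift by 2, as `slli t0, i_reg, 2` computes it.
pub fn offset_shift(i: usize) -> usize {
    i << 2
}

/// Models `lw` at address base + offset: reads `arr[i]` through a raw
/// byte offset from the start of the slice.
pub fn load_word(arr: &[i32], i: usize) -> i32 {
    assert!(i < arr.len(), "index out of bounds");
    let base = arr.as_ptr() as *const u8;
    // SAFETY: `i` is in bounds, so base + 4*i points at a valid,
    // 4-byte-aligned `i32` inside `arr`.
    unsafe { (base.add(offset_shift(i)) as *const i32).read() }
}
```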

---

## Example 5: `ifelse_s` - Conditional Branching

Returns 1 if positive, 0 otherwise.

**Rust**

```rust
fn ifelse_rust(val: i32) -> i32 {
    if val > 0 {
        1
    } else {
        0
    }
}
```

**Assembly**

```asm
.global ifelse_s

# a0 - val, t0 - retval
ifelse_s:
    ble a0, zero, else
    li t0, 1
    j done
else:
    li t0, 0
done:
    mv a0, t0
    ret
```

**Key insight**: branch on the **inverse** condition. Here `ble` (≤ 0) skips the then-block.

---

## Control Flow: If/Else Pattern

To implement `if (a > b)`, branch when `a ≤ b` to skip the then-block:

```text
    ble a, b, else_label    # inverse condition
    # then body
    j end_label
else_label:
    # else body
end_label:
```

| Rust Condition | Branch to Skip |
|---|---|
| `a == b` | `bne a, b, else` |
| `a != b` | `beq a, b, else` |
| `a < b` | `bge a, b, else` |
| `a >= b` | `blt a, b, else` |
| `a > b` | `ble a, b, else` |
| `a <= b` | `bgt a, b, else` |

---

## Branch Instructions

| Instruction | Condition | Real/Pseudo |
|---|---|---|
| `beq rs1, rs2, label` | rs1 == rs2 | Real |
| `bne rs1, rs2, label` | rs1 != rs2 | Real |
| `blt rs1, rs2, label` | rs1 < rs2 | Real |
| `bge rs1, rs2, label` | rs1 >= rs2 | Real |
| `ble rs1, rs2, label` | rs1 <= rs2 | Pseudo |
| `bgt rs1, rs2, label` | rs1 > rs2 | Pseudo |
| `j label` | Always | Pseudo |

`ble` and `bgt` are pseudo-instructions: the assembler swaps the operands and uses `bge` or `blt`.

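
The operand swap can be sanity-checked with a small model. The Rust sketch below is illustrative only: the functions are named after the instructions but simply return whether a signed branch would be taken. `ble` and `bgt` are defined purely in terms of `bge` and `blt` with the operands exchanged, mirroring the expansion described above, so `ble(a, b)` should agree with `a <= b` for any pair of values.

```rust
/// Real instruction: `blt rs1, rs2, label` is taken when rs1 < rs2 (signed).
pub fn blt(rs1: i64, rs2: i64) -> bool {
    rs1 < rs2
}

/// Real instruction: `bge rs1, rs2, label` is taken when rs1 >= rs2 (signed).
pub fn bge(rs1: i64, rs2: i64) -> bool {
    rs1 >= rs2
}

/// Pseudo-instruction: `ble rs1, rs2, label` becomes `bge rs2, rs1, label`.
pub fn ble(rs1: i64, rs2: i64) -> bool {
    bge(rs2, rs1)
}

/// Pseudo-instruction: `bgt rs1, rs2, label` becomes `blt rs2, rs1, label`.
pub fn bgt(rs1: i64, rs2: i64) -> bool {
    blt(rs2, rs1)
}
```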

---

## Control Flow: While Loop Pattern

```text
loop_label:
    branch_if_NOT_condition done_label
    # loop body
    j loop_label
done_label:
```

*Diagram: `loop_label` checks the condition. If it is false, control goes to `done_label`; if it is true, the loop body runs and `j loop_label` jumps back to the check.*

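
The same shape can be written in Rust with `loop` and `break`, which makes the inverse-condition test explicit. This is a sketch under stated assumptions: `halvings` is a hypothetical function chosen for illustration (it counts how many times a value can be halved before reaching 1) and is not one of the Lab03 functions. The source-level loop is `while x > 1`, so the exit test at the top checks `x <= 1`, just as `branch_if_NOT_condition` does in the pattern. With this definition `halvings(8)` returns 3.

```rust
/// Counts how many integer halvings bring `x` down to 1 or below.
/// Hypothetical example; comments map each line to the assembly pattern.
pub fn halvings(mut x: u32) -> u32 {
    let mut steps = 0;
    loop {
        // loop_label: branch to done_label on the INVERSE of `x > 1`
        if x <= 1 {
            break;
        }
        // loop body
        x /= 2;
        steps += 1;
        // j loop_label (the end of `loop` jumps back to the top)
    }
    // done_label:
    steps
}
```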

---

## Example 6: `loop_s` - Loop with Accumulation

Sums integers from 0 to n−1.

**Rust**

```rust
fn loop_rust(n: i32) -> i32 {
    let mut sum = 0;
    for i in 0..n {
        sum += i;
    }
    sum
}
```

**Assembly**

```asm
.global loop_s

# a0 - n, t0 - i, t1 - sum
loop_s:
    li t0, 0            # i = 0
    li t1, 0            # sum = 0
loop:
    bge t0, a0, done
    add t1, t1, t0      # sum += i
    addi t0, t0, 1      # i++
    j loop
done:
    mv a0, t1
    ret
```

**New concepts**: `bge` for loop exit, `j` to loop back, accumulator pattern

---

## Lab03 Overview

Six functions to implement in RISC-V assembly, called from Rust:

| Function | What It Does | Key Concepts |
|----------|-------------|--------------|
| `add4` | a + b + c + d | Sequential adds (like `add3_s`) |
| `quadratic` | ax² + bx + c | `mul` + `add` |
| `min` | Smaller of two values | Branch + `mv` (like `ifelse_s`) |
| `findmax` | Max in an array | `lw` + loop + comparison |
| `grade` | Score → letter grade | Chained `bge` branches |
| `sum_to_n` | Sum 1 to n | Loop + accumulator (like `loop_s`) |

Every in-class example maps to a Lab03 function. Study the examples, then apply the same patterns.


---

## Lab03: `add4` Walkthrough

**Rust**: `fn add4_rust(a: i32, b: i32, c: i32, d: i32) -> i32 { a + b + c + d }`

```asm
.global add4_s
add4_s:
    add a0, a0, a1    # a0 = a + b
    add a0, a0, a2    # a0 = a + b + c
    add a0, a0, a3    # a0 = a + b + c + d
    ret
```

```text
$ cargo run --bin add4 -- 1 2 3 4
C: 10
Asm: 10
```

This extends `add3_s` with one more `add` for the fourth argument in `a3`.

---

## Lab03: Hints for Remaining Functions

- **quadratic** `(x, a, b, c) → ax²+bx+c`: use `mul` for `x*x`, then `a*x²`, then `b*x`, then `add` the terms
- **min** `(a, b) → smaller`: use `ble`/`blt` like `ifelse_s`; `a0` may already hold the answer
- **findmax** `(arr, len) → max`: loop through the array with `lw` like `add3arr_s`, track the max with `bge`/`blt`
- **grade** `(score) → grade_value`: chain `li`/`bge` pairs to check the 90, 80, 70, 60 thresholds
- **sum_to_n** `(n) → sum`: same pattern as `loop_s`, but start `i` at 1 and use `bgt` for `while i <= n`

---

## Key Takeaways

1. **RISC-V** is an open-standard ISA with a clean design, ideal for learning assembly
2. **32 registers**: `a0`–`a7` for arguments, `t0`–`t6` for temporaries, return value in `a0`
3. **Instruction format**: `opcode dest, src1, src2`, with the destination register first
4. **Pseudo-instructions** (`li`, `mv`, `ret`, `j`) make code readable
5. **Rust interface**: `extern "C"` + `unsafe` + `cc` crate builds `.s` files into libraries
6. **Control flow**: branch on the **inverse condition** to skip code blocks
7. **Memory**: load/store architecture; `lw`/`sw` for 32-bit values, offsets in multiples of the element size

---

## Further Reading

- [RISC-V ISA Specification](https://riscv.org/technical/specifications/)
- [RISC-V Assembly Programmer's Manual](https://github.com/riscv-non-isa/riscv-asm-manual/blob/main/riscv-asm.md)
- [The RISC-V Reader](http://www.riscvbook.com/) by Patterson & Waterman
- [Lab03: RISC-V Assembly Programming](/assignments/lab03)