# RISC-V Assembly Functions

## CS 631 Systems Foundations (Mar 5, 2026)

---

## Today's Agenda

1. The call/return mechanism
2. Caller-saved vs callee-saved registers
3. Stack frame management
4. Non-leaf functions: caller-saved approach
5. Non-leaf functions: callee-saved approach
6. Recursive functions

---

## The Call/Return Mechanism

Two things must happen when calling a function:

1. **Jump** to the function's code
2. **Remember** where to return

RISC-V uses the **program counter (PC)** and the **`ra`** register:

```asm
call target     # ra = PC + 4, then PC = target
                # (equivalent to: jal ra, target)

ret             # PC = ra (jump back to caller)
                # (equivalent to: jalr zero, ra, 0)
```

---

## How `call` and `ret` Work

```text
Address   Instruction
0x1000    call bar          # ra = 0x1004, PC = bar
0x1004    addi a0, a0, 1    # resumes here after ret
...
bar:
0x2000    addi a0, a0, 1    # bar's code
0x2004    ret               # PC = ra = 0x1004
```

- `call` saves the **return address** (PC + 4) into `ra`
- `ret` jumps back to the address in `ra`
- `call` is a pseudo-instruction: it can also expand to an `auipc`/`jalr` pair, in which case `ra` still holds the address of the instruction after the call. These slides assume the single 4-byte `jal` form.

---

## Example: `call_s.s`, where foo calls bar
**Rust**

```rust
fn bar(a: i32) -> i32 {
    a + 1
}

fn foo(a: i32) -> i32 {
    bar(a) + 1
}
```

**Assembly**

```asm
bar_s:
    addi a0, a0, 1
    ret

foo_s:
    addi sp, sp, -16
    sd ra, (sp)         # Save ra
    call bar_s          # Overwrites ra
    addi a0, a0, 1
    ld ra, (sp)         # Restore ra
    addi sp, sp, 16
    ret
```
---

## Why Save `ra`?

Without saving `ra`, a non-leaf function **loops forever**:

```asm
# BROKEN: ra gets overwritten
foo_s:
    call bar_s          # ra = address after this line
    addi a0, a0, 1
    ret                 # jumps to "addi" above, NOT to caller!
```

`call` **overwrites** `ra`. If you don't save it first, `ret` jumps back to the instruction after the `call` instead of to the caller, and the function spins in an **infinite loop**.
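To make the loop concrete, here is a minimal Rust sketch of a toy machine that has only a `pc`, an `ra`, and `a0`. The `Insn` enum, `run`, `broken_program`, and the step limit are hypothetical names invented for this illustration. The model uses instruction indices rather than real RISC-V byte addresses, so `call` sets `ra = pc + 1` instead of `PC + 4`.

```rust
// Toy model: instructions are numbered 0, 1, 2, ... instead of byte addresses.
#[derive(Clone, Copy)]
enum Insn {
    Call(usize), // ra = pc + 1; pc = target
    Addi,        // a0 += 1
    Ret,         // pc = ra
}

// Runs `prog` from `entry` with `ra` initially pointing at `caller`.
// Returns Some(a0) once control reaches `caller`, or None if the
// step limit is hit first (i.e. the code never returns).
fn run(prog: &[Insn], entry: usize, caller: usize, max_steps: usize) -> Option<i64> {
    let (mut pc, mut ra, mut a0) = (entry, caller, 0i64);
    for _ in 0..max_steps {
        if pc == caller {
            return Some(a0);
        }
        match prog[pc] {
            Insn::Call(target) => {
                ra = pc + 1; // the old return address is lost here
                pc = target;
            }
            Insn::Addi => {
                a0 += 1;
                pc += 1;
            }
            Insn::Ret => {
                pc = ra;
            }
        }
    }
    None
}

// The broken foo_s from the slide, with bar_s placed at index 0.
fn broken_program() -> Vec<Insn> {
    vec![
        Insn::Addi,    // 0: bar_s: addi a0, a0, 1
        Insn::Ret,     // 1:        ret
        Insn::Call(0), // 2: foo_s: call bar_s      (ra = 3)
        Insn::Addi,    // 3:        addi a0, a0, 1
        Insn::Ret,     // 4:        ret             (pc = 3 again)
    ]
}
```

Using 99 as a stand-in for the caller's address, `run(&broken_program(), 2, 99, 1000)` yields `None` because `foo_s` never gets back to its caller, while entering at the leaf `bar_s` (index 0) yields `Some(1)`.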
---

## Register Convention

The calling convention defines who preserves each register:

| Category | Registers | Preserved? | Who Saves? |
|---|---|---|---|
| **Caller-saved** | `a0`–`a7`, `t0`–`t6`, `ra` | No | Caller (if needed) |
| **Callee-saved** | `s0`–`s11`, `sp` | Yes | Callee (prologue/epilogue) |

- **Caller-saved**: the called function may freely overwrite them
- **Callee-saved**: the called function must restore them before returning

---

## Argument Passing & Return Values

Arguments are passed in `a0`–`a7`. Return value goes in `a0`.

| Argument | Register |
|---|---|
| 1st | `a0` (also return value) |
| 2nd | `a1` |
| 3rd | `a2` |
| 4th | `a3` |
| 5th–8th | `a4`–`a7` |
| 9th+ | stack |

If you need a value to survive a function call, either **save it on the stack** (caller-saved) or **put it in an `s` register** (callee-saved).
---

## Stack Frame: Prologue/Epilogue

Every non-leaf function follows this pattern:

```asm
my_function:
    # === Prologue ===
    addi sp, sp, -32    # 1. Allocate (multiple of 16)
    sd ra, 0(sp)        # 2. Save registers
    sd s0, 8(sp)

    # === Body ===
    # ... function logic, calls ...

    # === Epilogue ===
    ld s0, 8(sp)        # 3. Restore registers
    ld ra, 0(sp)
    addi sp, sp, 32     # 4. Deallocate
    ret                 # 5. Return
```

The stack must stay **16-byte aligned**. Always allocate in multiples of 16.
---

## Stack Frame Layout

```text
Higher addresses
+---------------------------+
|      Caller's frame       |
+---------------------------+ ← sp (before prologue)
|            ra             |  sp + 24
+---------------------------+
|            s0             |  sp + 16
+---------------------------+
|            s1             |  sp + 8
+---------------------------+
|      local variable       |  sp + 0
+---------------------------+ ← sp (after prologue)
Lower addresses
```

Frame size = bytes needed, rounded up to the next multiple of 16.

---

## Frame Size Calculation

Each saved register takes 8 bytes (`sd`/`ld` move 64-bit values); the local in the last row is 4 bytes.

| Values to Save | Bytes Needed | Bytes Allocated |
|---|---|---|
| ra only | 8 | 16 |
| ra + 1 s-reg | 16 | 16 |
| ra + 2 s-regs | 24 | 32 |
| ra + 3 s-regs | 32 | 32 |
| ra + 3 s-regs + 4-byte local | 36 | 48 |

Count the bytes needed, then round up to the next multiple of 16.
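The rounding rule above can be sketched in Rust. The helper names `bytes_needed` and `frame_size` are hypothetical, and the sketch assumes 8-byte registers as in the table.

```rust
/// Bytes needed for `ra`, plus `s_regs` saved s-registers, plus `local_bytes`
/// of local variables. Assumes 8-byte (64-bit) registers.
fn bytes_needed(s_regs: u64, local_bytes: u64) -> u64 {
    8 + 8 * s_regs + local_bytes
}

/// Rounds `bytes` up to the next multiple of 16 so `sp` stays 16-byte aligned.
/// Adding 15 and clearing the low four bits works because 16 is a power of two.
fn frame_size(bytes: u64) -> u64 {
    (bytes + 15) & !15
}
```

For example, `frame_size(bytes_needed(3, 4))` reproduces the last table row: 36 bytes needed, 48 bytes allocated.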
---

## Non-Leaf: Caller-Saved Approach

**Strategy**: keep values in `a` registers, save/restore on the stack around each call.

Example: `add4f(a, b, c, d) = add2(add2(a,b), add2(c,d))`

```asm
add2_s:
    add a0, a0, a1
    ret
```

We need `c`, `d`, and intermediate results to survive across calls to `add2_s`.

---

## `add4f_s`: Caller-Saved

```asm
add4f_s:
    addi sp, sp, -32
    sd ra, (sp)
    sd a2, 8(sp)        # Save c
    sd a3, 16(sp)       # Save d
    call add2_s         # a0 = a + b
    sd a0, 24(sp)       # Save a+b
    ld a0, 8(sp)        # a0 = c
    ld a1, 16(sp)       # a1 = d
    call add2_s         # a0 = c + d
    mv a1, a0           # a1 = c + d
    ld a0, 24(sp)       # a0 = a + b
    call add2_s         # a0 = (a+b) + (c+d)
    ld ra, (sp)
    addi sp, sp, 32
    ret
```

Save values on the stack **before** each call, load them back **after**.

---

## Non-Leaf: Callee-Saved Approach

**Strategy**: move values into `s` registers (preserved across calls). Save the `s` regs once in the prologue, restore them in the epilogue.

```asm
add4f_callee_s:
    addi sp, sp, -32
    sd ra, (sp)
    sd s0, 8(sp)        # Save s0, s1, s2 (we'll use them)
    sd s1, 16(sp)
    sd s2, 24(sp)
    mv s0, a2           # s0 = c (survives calls)
    mv s1, a3           # s1 = d
```

Values in `s` registers survive any function call, so there is no need to save/restore around each `call`.

---

## `add4f_callee_s`: Full

```asm
add4f_callee_s:
    addi sp, sp, -32
    sd ra, (sp)
    sd s0, 8(sp)
    sd s1, 16(sp)
    sd s2, 24(sp)
    mv s0, a2           # s0 = c
    mv s1, a3           # s1 = d
    call add2_s         # a0 = a + b
    mv s2, a0           # s2 = a + b
    mv a0, s0           # a0 = c
    mv a1, s1           # a1 = d
    call add2_s         # a0 = c + d
    mv a1, a0           # a1 = c + d
    mv a0, s2           # a0 = a + b
    call add2_s         # a0 = (a+b) + (c+d)
    ld ra, (sp)
    ld s0, 8(sp)
    ld s1, 16(sp)
    ld s2, 24(sp)
    addi sp, sp, 32
    ret
```

---

## Comparing the Two Approaches

| Aspect | Caller-Saved | Callee-Saved |
|---|---|---|
| Values live in | Stack | `s` registers |
| Save/restore | Around each call | Once (prologue/epilogue) |
| Stack accesses | More | Fewer |
| Between calls | `ld`/`sd` | `mv` |

**Callee-saved** is generally preferred when values must survive **multiple** calls. **Caller-saved** is simpler for one or two calls.
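For reference, both assembly versions compute the same thing. A minimal Rust sketch of it follows, written from the formula `add4f(a, b, c, d) = add2(add2(a,b), add2(c,d))`. The `i32` types mirror the earlier Rust example and are an assumption, as are the local names `ab` and `cd`.

```rust
fn add2(a: i32, b: i32) -> i32 {
    a + b
}

fn add4f(a: i32, b: i32, c: i32, d: i32) -> i32 {
    let ab = add2(a, b); // c and d must survive this call
    let cd = add2(c, d); // ab must survive this call
    add2(ab, cd)         // nothing needs to survive the last call
}
```

The comments mark the values that are live across each call. Those are exactly the values the assembly keeps either in stack slots (caller-saved version) or in `s0`–`s2` (callee-saved version).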
---

## Recursive Functions

A recursive function calls **itself**. Each call gets its own **stack frame**.

```rust
fn factrec(n: i32) -> i32 {
    if n <= 0 {
        1
    } else {
        n * factrec(n - 1)
    }
}
```

The stack grows with each recursive call and shrinks as calls return.

---

## `factrec_s`: Recursive Factorial

```asm
factrec_s:
    addi sp, sp, -16
    sd ra, (sp)

    # Base case: n <= 0 → return 1
    bgt a0, zero, factrec_recstep
    li a0, 1
    j factrec_done

factrec_recstep:
    sd a0, 8(sp)        # Save n
    addi a0, a0, -1     # a0 = n - 1
    call factrec_s      # a0 = factorial(n-1)
    ld t0, 8(sp)        # Restore n
    mul a0, a0, t0      # a0 = factorial(n-1) * n

factrec_done:
    ld ra, (sp)
    addi sp, sp, 16
    ret
```

---

## Stack Frames: `factrec_s(3)`

Each call creates a new 16-byte frame:

```text
+---------------------------+
|  factrec_s(3): ra, n=3    |
+---------------------------+
|  factrec_s(2): ra, n=2    |
+---------------------------+
|  factrec_s(1): ra, n=1    |
+---------------------------+
|  factrec_s(0): ra (base)  |
+---------------------------+ ← sp (deepest)
```

| Call | n | Returns |
|---|---|---|
| `factrec_s(0)` | 0 | 1 (base case) |
| `factrec_s(1)` | 1 | 1 × 1 = 1 |
| `factrec_s(2)` | 2 | 1 × 2 = 2 |
| `factrec_s(3)` | 3 | 2 × 3 = **6** |

---

## Practice: Fix the Bug

This function has a bug. Find it:

```asm
broken_func:
    addi sp, sp, -16
    sd s0, 0(sp)
    mv s0, a0
    call helper
    add a0, a0, s0
    ld s0, 0(sp)
    addi sp, sp, 16
    ret
```

**Bug**: `ra` is not saved! After `call helper`, `ra` points to the instruction after the call, not back to the original caller, so `ret` jumps to `add a0, a0, s0` instead of returning and the function loops (moving `sp` up by 16 on every pass). The fix is to save `ra` in the prologue, for example with `sd ra, 8(sp)`, which fits in the existing 16-byte frame, and to reload it with `ld ra, 8(sp)` before `ret`.
---

## Key Takeaways

- **`call`** saves PC+4 to `ra` and jumps; **`ret`** jumps to `ra`
- **Caller-saved** (`a`, `t`, `ra`): may be clobbered by calls
- **Callee-saved** (`s`, `sp`): preserved across calls
- **Prologue/epilogue**: allocate → save → body → restore → deallocate → ret
- **Caller-saved approach**: save values on stack around each call
- **Callee-saved approach**: use `s` regs, save once in prologue
- **Recursion**: each call gets its own stack frame

---

## Further Reading

- [RISC-V ISA Specification](https://riscv.org/technical/specifications/)
- [RISC-V Assembly Programmer's Manual](https://github.com/riscv-non-isa/riscv-asm-manual/blob/main/riscv-asm.md)
- [The RISC-V Reader](http://www.riscvbook.com/) (Patterson & Waterman)
- [RISC-V Calling Convention](https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf)