CS 631-01 Systems Foundations — Meeting Summary

Date/Time: March 31, 2026, 08:17 AM Pacific Time (US & Canada)
Meeting ID: 870 0988 0761

Quick Recap

  • Greg taught a session on cache memory and processor emulation, covering:
    • Cache structures: direct-mapped and set-associative
    • Temporal and spatial locality
    • Hit/miss behavior and how caches bridge the CPU–memory performance gap
  • The class discussed how to implement cache simulation for Project 3 (due Thursday, April 2), including address lookup mechanics and handling hits/misses.
  • The next class (Thursday) will begin CPU and digital design topics.

Next Steps

  • Students should:
    • Complete Project 3 (RISC-V emulation, dynamic analysis, and cache simulation) by Thursday, April 2.
    • Read the cache memory guide provided for Project 3.
    • Update the metrics struct in the emulator per project instructions.
    • Implement cache simulation in the emulator, supporting:
      • Direct-mapped and set-associative caches
      • Block sizes of 1 and 4 (as specified)
    • Attend office hours if help is needed.

Summary

Project 3 Components Overview

  • Project 3 (due Thursday, April 2) has three parts:
    • RISC-V emulation
    • Dynamic analysis
    • Cache simulation
  • Emulators provide visibility into processor operations, in contrast to hardware’s focus on efficiency over observability.
  • Dynamic analysis tracks metrics such as instruction counts. An example showed how adding a single number can nearly double instructions in a sorting routine, illustrating how small code changes can significantly affect dynamic behavior.

Big O Notation and Algorithms

  • Discussion covered algorithmic complexity:
    • Simple sorts (insertion, bubble): O(n²)
    • More efficient sorts (merge sort, quicksort): O(n log n)
  • Instruction-type analysis in emulated programs should consider:
    • R-type and I-type instructions
    • Loads and stores
    • Jumps and branches
  • Recursive functions (e.g., Fibonacci) generate loads/stores due to stack usage. Branch behavior significantly affects performance.

Cache Memory Design Concepts

  • Branch instructions force the processor to make control-flow decisions, which complicates pipelining and underscores the need for efficient memory access.
  • Cache fundamentals:
    • Exploit temporal and spatial locality
    • Most programs exhibit patterns that caches leverage effectively
  • Key design elements:
    • Address mapping to cache slots (indexing)
    • Tags and valid bits for hit detection
    • Conflict handling and replacement when full
  • Byte addresses are converted to word addresses for cache operations where appropriate.

GGL Library Development Discussion

  • The team is developing a new GGL (Golden Gate Language) library to specify digital circuits in code; Phil has built a web-based UI.
  • Greg favors schematic design for its pedagogical value but recognizes limitations in the current Digital Java tool.
  • Options under consideration:
  • Using the GGL library in coursework
  • Automatically visualizing circuits constructed from code
  • There is an ongoing effort to balance industry-standard HDLs (Verilog, VHDL) with the course’s learning goals.

Direct-Mapped Caching

  • Direct-mapped caching uses modular arithmetic to map word addresses to cache slots.
  • Structure:
    • Each slot has a valid bit, a tag, and data (block)
    • The slot index is derived from lower address bits (e.g., lower two bits for small caches)
  • Tags uniquely identify which address is stored in a slot. A hit occurs when both the valid bit is set and the stored tag matches the requested tag.

Cache Simulation Implementation

  • Each cache row can be represented as a struct containing:
    • A 64-bit tag (e.g., U64)
    • A data block (initially one word; later, multiple words for larger block sizes)
    • A valid bit
  • Direct-mapped lookup flow:
    • Extract index and tag from the address.
    • Check the valid bit and compare the tag.
    • On a hit, return the data; on a miss, fetch from memory and update the slot.
  • Later enhancements include supporting multi-word blocks to improve spatial locality.

Cache Systems and Implementation Concepts

  • Caching exists at multiple layers: hardware caches, OS buffer caches, and web caches.
  • Implementations discussed:
    • Direct-mapped caches
    • Set-associative caches with set indexing
    • Support for varying block sizes
  • Replacement policy:
    • For set-associative caches, use LRU (Least Recently Used): evict the line with the oldest timestamp when inserting new data.

Upcoming

  • The class will transition to CPU design on Thursday.
  • Project 3 is due that evening (Thursday, April 2).