Meeting Summary: CS 631-02 Systems Foundations

  • Date: Mar 31, 2026
  • Time: 03:00 PM Pacific Time (US and Canada)
  • Meeting ID: 882 2309 0019

Quick Recap

  • The lecture centered on Project 3: implementing a cache simulator for a RISC-V emulator.
  • Core topics included:
    • Dynamic analysis and runtime metric collection.
    • Cache fundamentals: temporal/spatial locality, hits/misses, hit rate, block size.
    • Cache architectures: direct-mapped, set-associative, fully associative.
    • Implementation considerations: block address calculation, tags/valid bits, and Least Recently Used (LRU) eviction using timestamps.
  • Broader context covered:
    • Modern interview practices, including effective use of AI coding assistants.
    • Performance and security implications of cache behavior (e.g., Spectre-style attacks).

Next Steps

  • Students:
    • Implement metric collection in the emulator at the specified, non-redundant points.
    • Extend the cache simulator to support 4-word blocks (in addition to 1-word); see the block-address sketch after this list:
      • Compute correct block addresses.
      • Handle data for both direct-mapped and set-associative caches.
    • Implement LRU eviction for set-associative caches:
      • Update timestamps on every access.
      • Prefer invalid (empty) slots before evicting valid entries.
      • Select the least-recently-used slot when eviction is necessary.
    • Review the provided cache initialization and parameter computation code to understand how configuration impacts the implementation.
    • Begin work on Project 3 and consult the implementation guide for details and support.
    • Prepare to start digital design content on Thursday (as announced).
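For the 4-word block extension, the core step is aligning a byte address down to its block boundary. A minimal Rust sketch, assuming 32-bit byte addresses and 4-byte words; the names (`block_address`, `word_offset`) are illustrative, not the assignment’s actual interface:

```rust
const WORDS_PER_BLOCK: u32 = 4;
const BYTES_PER_WORD: u32 = 4;
const BLOCK_BYTES: u32 = WORDS_PER_BLOCK * BYTES_PER_WORD; // 16 bytes per block

// Align a byte address down to the start of its block.
// The mask trick works because BLOCK_BYTES is a power of two.
fn block_address(byte_addr: u32) -> u32 {
    byte_addr & !(BLOCK_BYTES - 1)
}

// Which of the 4 words inside the block this byte address refers to.
fn word_offset(byte_addr: u32) -> u32 {
    (byte_addr % BLOCK_BYTES) / BYTES_PER_WORD
}

fn main() {
    // 0x2C (44) falls in the block starting at 0x20 (32), word 3 of that block.
    assert_eq!(block_address(0x2C), 0x20);
    assert_eq!(word_offset(0x2C), 3);
}
```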

Summary

Project 3 and Coding Interviews

  • Greg clarified the goals and scope of Project 3, noting it should not require extensive coding given the course’s focus on RISC-V and Rust.
  • He discussed evolving interview practices:
    • Effective use of coding assistants to explore multiple solutions and demonstrate problem-solving.
    • Emphasis on prompt quality, reasoning, and collaboration with tools during interviews.

Dynamic Analysis in Project 3

  • Dynamic analysis will collect runtime metrics within the emulator to profile behavior efficiently.
  • Metrics include (see the counter sketch below):
    • Instruction counts
    • Data-processing operations
    • Loads and stores
    • Branch usage
  • The approach leverages emulation to gain insight into processor operations without sacrificing execution efficiency.
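One plausible shape for this metric collection is a counter struct updated once per executed instruction; the `Metrics` and `InstrClass` names below are hypothetical, not the emulator’s actual types:

```rust
#[derive(Debug, Default)]
struct Metrics {
    instructions: u64, // total instructions executed
    data_ops: u64,     // data-processing (arithmetic/logic) operations
    loads: u64,
    stores: u64,
    branches: u64,
}

// Hypothetical classification produced by the emulator's decode step.
enum InstrClass {
    DataOp,
    Load,
    Store,
    Branch,
    Other,
}

impl Metrics {
    // Called once per executed instruction in the emulator's main loop.
    fn record(&mut self, class: InstrClass) {
        self.instructions += 1;
        match class {
            InstrClass::DataOp => self.data_ops += 1,
            InstrClass::Load => self.loads += 1,
            InstrClass::Store => self.stores += 1,
            InstrClass::Branch => self.branches += 1,
            InstrClass::Other => {}
        }
    }
}

fn main() {
    let mut metrics = Metrics::default();
    metrics.record(InstrClass::Load);
    metrics.record(InstrClass::DataOp);
    println!("{metrics:?}");
}
```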

Cache Memory and Performance Concepts

  • Processors have outpaced memory performance, creating a bottleneck that caches help mitigate.
  • Caches store recently used data in smaller, faster memory to reduce access latency.
  • Key outcomes (see the hit-rate sketch below):
    • Cache hit: data found in cache
    • Cache miss: data must be fetched from main memory
  • Blocked transfers (bringing in contiguous data) exploit spatial locality for higher efficiency.
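Hit rate is the fraction of accesses served by the cache; a trivial helper, assuming plain integer counters:

```rust
// Hit rate = hits / (hits + misses), guarding against division by zero.
fn hit_rate(hits: u64, misses: u64) -> f64 {
    let total = hits + misses;
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64
    }
}

fn main() {
    // e.g., 90 hits out of 100 accesses -> 0.90 hit rate
    println!("{}", hit_rate(90, 10));
}
```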

Core Caching Principles

  • Locality:
    • Temporal locality: recently accessed data is likely to be accessed again soon.
    • Spatial locality: data near recently accessed addresses is likely to be accessed soon.
  • Essential management questions (see the lookup sketch below):
    • Where to look for data? (mapping strategy)
    • How to check presence? (tags and valid bits)
    • How to resolve conflicts? (eviction policy when the cache is full)
  • Metrics to monitor: hits, misses, and hit rate.
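A minimal direct-mapped lookup sketch that answers the first two questions concretely: the index says where to look, and the valid bit plus tag comparison checks presence. The names and the simple fill-on-miss policy are illustrative assumptions, not the project’s required interface:

```rust
#[derive(Clone, Copy, Default)]
struct Slot {
    valid: bool, // has this slot ever been filled?
    tag: u32,    // identifies which block currently occupies the slot
}

// Returns true on a hit; on a miss, fills the slot with the new block's tag.
fn access(slots: &mut [Slot], block_addr: u32) -> bool {
    let num_slots = slots.len() as u32;
    let index = (block_addr % num_slots) as usize; // where to look
    let tag = block_addr / num_slots;              // how to check presence
    let slot = &mut slots[index];
    if slot.valid && slot.tag == tag {
        true // hit: the slot holds exactly this block
    } else {
        // miss: fetch from main memory (not modeled here) and fill the slot
        slot.valid = true;
        slot.tag = tag;
        false
    }
}

fn main() {
    let mut cache = [Slot::default(); 4];
    assert!(!access(&mut cache, 5)); // cold miss
    assert!(access(&mut cache, 5));  // temporal locality: now a hit
}
```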

Cache Design and AI in Software

  • Greg shared insights on cache optimization from his graduate studies.
  • He discussed AI’s growing role in software quality:
    • Large language models (e.g., Claude) aiding in bug discovery and remediation, including long-standing issues (e.g., Linux buffer overflow cases).
    • The promise and challenge of formal verification for proving correctness.
  • Security relevance:
    • Mitigations for hardware-level vulnerabilities (e.g., Spectre) that exploit cache timing side channels continue to evolve.
    • Greg and Junge examined a speculative execution bug that goes beyond simple branch prediction.
    • Modern CPUs speculatively execute multiple paths, which can be manipulated to influence cache state and leak data via timing analysis.
  • The discussion included a worked example of a direct-mapped cache:
    • A 4-slot cache using byte addresses and modulo operations to compute slot indices (reconstructed in the sketch below).
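A rough reconstruction of that worked example, assuming 1-word (4-byte) blocks; the exact addresses used in lecture are not recorded here:

```rust
fn main() {
    let num_slots = 4;   // 4-slot direct-mapped cache
    let block_bytes = 4; // 1-word blocks
    for byte_addr in [0u32, 4, 8, 12, 16, 20] {
        let block_addr = byte_addr / block_bytes;
        let slot = block_addr % num_slots; // modulo picks the slot
        // byte 0 -> slot 0, byte 4 -> slot 1, ..., byte 16 wraps around to slot 0
        println!("byte address {byte_addr:2} -> block {block_addr} -> slot {slot}");
    }
}
```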

Cache Architecture Implementation Details

  • Address mapping and structures:
    • Direct-mapped: each address maps to exactly one slot.
    • Set-associative: each address maps to a set, then to one of several ways within that set.
    • Fully associative: data can be placed in any slot.
  • Components:
    • Tags, valid bits, and block size handling.
  • LRU approximation (see the sketch after this list):
    • Use timestamps to track recent access and select eviction candidates.
    • Prefer filling invalid entries before evicting valid ones.
  • Handling multi-word blocks:
    • Compute aligned block addresses.
    • Maintain correct indexing and structural alignment consistent with hardware design.
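Putting the LRU pieces together for one set of a set-associative cache: timestamps are refreshed on every access, empty ways are filled first, and only then is the least-recently-used way evicted. A minimal sketch under those assumptions; the field names and index-based structure are illustrative:

```rust
#[derive(Clone, Copy, Default)]
struct Way {
    valid: bool,
    tag: u32,
    last_used: u64, // timestamp of the most recent access to this way
}

// Access one set; `now` is a counter the caller bumps on every cache access.
// Returns true on a hit.
fn access_set(set: &mut [Way], tag: u32, now: u64) -> bool {
    // Hit path: update the timestamp on every access so LRU stays accurate.
    if let Some(way) = set.iter_mut().find(|w| w.valid && w.tag == tag) {
        way.last_used = now;
        return true;
    }
    // Miss path: prefer an invalid (empty) way before evicting a valid entry.
    let victim = set.iter().position(|w| !w.valid).unwrap_or_else(|| {
        // All ways valid: evict the one with the smallest timestamp (LRU).
        set.iter()
            .enumerate()
            .min_by_key(|(_, w)| w.last_used)
            .map(|(i, _)| i)
            .unwrap()
    });
    set[victim] = Way { valid: true, tag, last_used: now };
    false
}

fn main() {
    let mut set = [Way::default(); 2];    // one set of a 2-way cache
    assert!(!access_set(&mut set, 1, 0)); // miss: fills an empty way
    assert!(!access_set(&mut set, 2, 1)); // miss: fills the other empty way
    assert!(access_set(&mut set, 1, 2));  // hit: refreshes tag 1's timestamp
    assert!(!access_set(&mut set, 3, 3)); // full set: evicts tag 2 (the LRU way)
    assert!(!access_set(&mut set, 2, 4)); // tag 2 was evicted, so this misses
}
```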