Meeting Summary: CS 631-02 Systems Foundations
- Date: Mar 31, 2026
- Time: 03:00 PM Pacific Time (US and Canada)
- Meeting ID: 882 2309 0019
Quick Recap
- The lecture centered on Project 3: implementing a cache simulator for a RISC-V emulator.
- Core topics included:
  - Dynamic analysis and runtime metric collection.
  - Cache fundamentals: temporal/spatial locality, hits/misses, hit rate, block size.
  - Cache architectures: direct-mapped, set-associative, fully associative.
  - Implementation considerations: block address calculation, tags/valid bits, and Least Recently Used (LRU) eviction using timestamps.
- Broader context covered:
  - Modern interview practices, including effective use of AI coding assistants.
  - Performance and security implications of cache behavior (e.g., Spectre-style attacks).
Next Steps
- Students:
  - Implement metric collection in the emulator at the designated points, avoiding redundant counting.
  - Extend the cache simulator to support 4-word blocks (in addition to 1-word):
    - Compute correct block addresses.
    - Handle data for both direct-mapped and set-associative caches.
  - Implement LRU eviction for set-associative caches:
    - Update timestamps on every access.
    - Fill invalid (empty) slots before evicting valid entries.
    - Select the least recently used slot when eviction is necessary.
  - Review the provided cache initialization and parameter computation code to understand how configuration affects the implementation.
  - Begin work on Project 3 and consult the implementation guide for details and support.
  - Prepare to start digital design content on Thursday (as announced).
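The block-address step in the task list above can be sketched in a few lines of Rust. This is a minimal sketch: the constants and function name are illustrative, not taken from the project's code, and it assumes byte addresses with 4-byte words.

```rust
/// Illustrative parameters for a 4-word block (not the project's actual config).
const WORDS_PER_BLOCK: u32 = 4;
const BYTES_PER_WORD: u32 = 4;
const BLOCK_BYTES: u32 = WORDS_PER_BLOCK * BYTES_PER_WORD; // 16 bytes

/// Clear the low-order offset bits so every address inside a block
/// maps to the same aligned block address.
fn block_address(addr: u32) -> u32 {
    addr & !(BLOCK_BYTES - 1)
}

fn main() {
    // All four words of the block starting at 0x20 share one block address.
    assert_eq!(block_address(0x20), 0x20);
    assert_eq!(block_address(0x2C), 0x20);
    assert_eq!(block_address(0x1F), 0x10);
    println!("block_address(0x2C) = {:#x}", block_address(0x2C));
}
```

Masking with `!(BLOCK_BYTES - 1)` only works because the block size is a power of two, which is the usual hardware convention.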
Summary
Project 3 and Coding Interviews
- Greg clarified the goals and scope of Project 3, noting it should not require extensive coding given the course’s focus on RISC-V and Rust.
- He discussed evolving interview practices:
  - Effective use of coding assistants to explore multiple solutions and demonstrate problem-solving.
  - Emphasis on prompt quality, reasoning, and collaboration with tools during interviews.
Dynamic Analysis in Project 3
- Dynamic analysis will collect runtime metrics within the emulator to profile behavior efficiently.
- Metrics include:
  - Instruction counts
  - Data-processing operations
  - Loads and stores
  - Branch usage
- The approach leverages emulation to gain insight into processor operations without sacrificing execution efficiency.
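A metric collector along these lines might look like the following Rust sketch. The instruction categories, struct, and method names are assumptions for illustration, not the emulator's actual API.

```rust
/// Categories of instructions an emulator might classify at runtime.
/// (Illustrative; the project's actual categories may differ.)
#[derive(Clone, Copy)]
enum InstrKind {
    DataProcessing,
    Load,
    Store,
    Branch,
}

/// Runtime counters updated once per executed instruction.
#[derive(Default)]
struct Metrics {
    instructions: u64,
    data_processing: u64,
    loads: u64,
    stores: u64,
    branches: u64,
}

impl Metrics {
    /// Record one executed instruction; intended to be called from the
    /// emulator's decode/execute loop so counting adds negligible overhead.
    fn record(&mut self, kind: InstrKind) {
        self.instructions += 1;
        match kind {
            InstrKind::DataProcessing => self.data_processing += 1,
            InstrKind::Load => self.loads += 1,
            InstrKind::Store => self.stores += 1,
            InstrKind::Branch => self.branches += 1,
        }
    }
}

fn main() {
    let mut m = Metrics::default();
    for kind in [InstrKind::DataProcessing, InstrKind::Load, InstrKind::Branch] {
        m.record(kind);
    }
    assert_eq!(m.instructions, 3);
    assert_eq!(m.loads, 1);
    assert_eq!(m.branches, 1);
}
```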
Memory Bottleneck and Caches
- Processors have outpaced memory performance, creating a bottleneck that caches help mitigate.
- Caches store recently used data in smaller, faster memory to reduce access latency.
- Key outcomes:
  - Cache hit: the data is found in the cache.
  - Cache miss: the data must be fetched from main memory.
- Block transfers (fetching contiguous data alongside the requested word) exploit spatial locality for higher efficiency.
Core Caching Principles
- Locality:
  - Temporal locality: recently accessed data is likely to be accessed again soon.
  - Spatial locality: data near a recent access is likely to be accessed next.
- Essential management questions:
  - Where to look for data? (mapping strategy)
  - How to check presence? (tags and valid bits)
  - How to resolve conflicts? (eviction policy when the cache is full)
- Metrics to monitor: hits, misses, and hit rate.
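The three management questions map naturally onto a lookup routine. Below is a minimal direct-mapped sketch in Rust; the struct, field names, and indexing scheme are illustrative assumptions, not the project's code, and it operates on block addresses rather than raw byte addresses.

```rust
/// One cache slot: a valid bit plus the tag of the block it holds.
#[derive(Clone, Default)]
struct Slot {
    valid: bool,
    tag: u32,
}

struct Cache {
    slots: Vec<Slot>,
    hits: u64,
    misses: u64,
}

impl Cache {
    fn new(num_slots: usize) -> Self {
        Cache { slots: vec![Slot::default(); num_slots], hits: 0, misses: 0 }
    }

    /// Answers the three questions: where to look (index), whether the data
    /// is present (valid bit + tag match), and records the hit/miss outcome.
    fn access(&mut self, block_addr: u32) -> bool {
        let num_slots = self.slots.len() as u32;
        let index = (block_addr % num_slots) as usize; // where to look
        let tag = block_addr / num_slots;              // which block is here
        let slot = &mut self.slots[index];
        if slot.valid && slot.tag == tag {
            self.hits += 1; // present: valid entry with matching tag
            true
        } else {
            slot.valid = true; // miss: install the new block
            slot.tag = tag;
            self.misses += 1;
            false
        }
    }

    fn hit_rate(&self) -> f64 {
        self.hits as f64 / (self.hits + self.misses) as f64
    }
}

fn main() {
    let mut c = Cache::new(4);
    c.access(8); // miss: cold cache
    c.access(8); // hit: temporal locality
    assert_eq!((c.hits, c.misses), (1, 1));
    assert_eq!(c.hit_rate(), 0.5);
}
```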
Cache Design and AI in Software
- Greg shared insights on cache optimization from his graduate studies.
- He discussed AI’s growing role in software quality:
  - Large language models (e.g., Claude) aiding in bug discovery and remediation, including long-standing issues (e.g., Linux buffer-overflow cases).
  - The promise and challenge of formal verification for proving correctness.
- Security relevance:
  - Ongoing evolution in mitigations for hardware-level vulnerabilities (e.g., Spectre) that exploit cache timing side channels.
- Greg and Junge examined a speculative execution bug that goes beyond simple branch prediction.
- Modern CPUs speculatively execute multiple paths, which can be manipulated to influence cache state and leak data via timing analysis.
- The discussion included a worked example of a direct-mapped cache:
  - A 4-slot cache using byte addresses and modulo operations to compute slot indices.
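That worked example can be reproduced in a few lines of Rust. The parameters here (4 slots, 1-word/4-byte blocks) are assumptions; the lecture's example may have used different values.

```rust
/// Slot index for a 4-slot direct-mapped cache over 1-word (4-byte) blocks.
const NUM_SLOTS: u32 = 4;
const BLOCK_BYTES: u32 = 4;

/// Convert a byte address to a block number, then take it modulo the
/// number of slots to find which slot the block maps to.
fn slot_index(byte_addr: u32) -> u32 {
    (byte_addr / BLOCK_BYTES) % NUM_SLOTS
}

fn main() {
    // Addresses 0x00, 0x10, and 0x20 all collide in slot 0:
    // same index, different tags, so each would evict the previous one.
    for addr in [0x00u32, 0x10, 0x20] {
        assert_eq!(slot_index(addr), 0);
    }
    assert_eq!(slot_index(0x04), 1);
}
```

The collision in slot 0 is exactly why set-associative designs, discussed below, add multiple ways per index.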
Cache Architecture Implementation Details
- Address mapping and structures:
  - Direct-mapped: each address maps to exactly one slot.
  - Set-associative: each address maps to a set, then to one of several ways within that set.
  - Fully associative: data can be placed in any slot.
- Components:
  - Tags, valid bits, and block-size handling.
- LRU eviction via timestamps:
  - Use timestamps to track recency and select eviction candidates.
  - Prefer filling invalid entries before evicting valid ones.
- Handling multi-word blocks:
  - Compute aligned block addresses.
  - Maintain correct indexing and structural alignment consistent with hardware design.
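The timestamp-based LRU policy with the invalid-slot preference can be sketched for a single set as follows. This is a hypothetical sketch: the struct and field names are not from the project, and a real simulator would also store tags derived from block addresses as above.

```rust
/// One way within a set: valid bit, tag, and a last-use timestamp for LRU.
#[derive(Clone, Default)]
struct Way {
    valid: bool,
    tag: u32,
    last_used: u64,
}

/// A single set of a set-associative cache; `now` is a logical clock
/// incremented on every access so timestamps establish recency order.
struct CacheSet {
    ways: Vec<Way>,
    now: u64,
}

impl CacheSet {
    fn new(associativity: usize) -> Self {
        CacheSet { ways: vec![Way::default(); associativity], now: 0 }
    }

    /// Look up `tag` in this set, returning true on a hit.
    /// On a miss, fill an invalid way first; otherwise evict the LRU way.
    fn access(&mut self, tag: u32) -> bool {
        self.now += 1;

        // Hit: refresh the timestamp so this entry stays "recent".
        if let Some(way) = self.ways.iter_mut().find(|w| w.valid && w.tag == tag) {
            way.last_used = self.now;
            return true;
        }

        // Miss: choose a victim, preferring invalid ways over LRU eviction.
        let victim_idx = match self.ways.iter().position(|w| !w.valid) {
            Some(i) => i,
            None => self
                .ways
                .iter()
                .enumerate()
                .min_by_key(|&(_, w)| w.last_used) // oldest timestamp = LRU
                .map(|(i, _)| i)
                .unwrap(),
        };
        let victim = &mut self.ways[victim_idx];
        victim.valid = true;
        victim.tag = tag;
        victim.last_used = self.now;
        false
    }
}

fn main() {
    let mut set = CacheSet::new(2); // 2-way set
    assert!(!set.access(1)); // miss: fills an invalid way
    assert!(!set.access(2)); // miss: fills the other invalid way
    assert!(set.access(1));  // hit: tag 1 becomes most recent
    assert!(!set.access(3)); // miss: evicts tag 2, the LRU entry
    assert!(set.access(1));  // tag 1 survived the eviction
    assert!(!set.access(2)); // tag 2 was evicted earlier
}
```

Updating the timestamp on every access, hit or miss, is what keeps the recency ordering correct; forgetting the hit-path update is a common source of wrong eviction choices.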