Meeting Summary: CS 631-02 Systems Foundations

  • Date: Mar 31, 2026
  • Time: 03:00 PM Pacific Time (US and Canada)
  • Meeting ID: 882 2309 0019

Quick Recap

  • The lecture centered on Project 3: implementing a cache simulator for a RISC-V emulator.
  • Core topics included:
    • Dynamic analysis and runtime metric collection.
    • Cache fundamentals: temporal/spatial locality, hits/misses, hit rate, block size.
    • Cache architectures: direct-mapped, set-associative, fully associative.
    • Implementation considerations: block address calculation, tags/valid bits, and Least Recently Used (LRU) eviction using timestamps.
  • Broader context covered:
    • Modern interview practices, including effective use of AI coding assistants.
    • Performance and security implications of cache behavior (e.g., Spectre-style attacks).

Next Steps

  • Students:
    • Implement metric collection in the emulator at the specified, non-redundant points.
    • Extend the cache simulator to support 4-word blocks (in addition to 1-word); see the block-address sketch after this list:
      • Compute correct block addresses.
      • Handle data for both direct-mapped and set-associative caches.
    • Implement LRU eviction for set-associative caches:
      • Update timestamps on every access.
      • Prefer invalid (empty) slots before evicting valid entries.
      • Select the least-recently-used slot when eviction is necessary.
    • Review the provided cache initialization and parameter computation code to understand how configuration impacts the implementation.
    • Begin work on Project 3 and consult the implementation guide for details and support.
    • Prepare to start digital design content on Thursday (as announced).
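For the 4-word block extension, the core step is aligning a byte address down to its block boundary. A minimal Rust sketch, assuming 32-bit byte addresses and 4-byte words; the names (`block_address`, `word_offset`) are illustrative, not the assignment’s actual interface:

```rust
const WORDS_PER_BLOCK: u32 = 4;
const BYTES_PER_WORD: u32 = 4;
const BLOCK_BYTES: u32 = WORDS_PER_BLOCK * BYTES_PER_WORD; // 16 bytes per block

// Align a byte address down to the start of its block.
// The mask trick works because BLOCK_BYTES is a power of two.
fn block_address(byte_addr: u32) -> u32 {
    byte_addr & !(BLOCK_BYTES - 1)
}

// Which of the 4 words inside the block this byte address refers to.
fn word_offset(byte_addr: u32) -> u32 {
    (byte_addr % BLOCK_BYTES) / BYTES_PER_WORD
}

fn main() {
    // 0x2C (44) falls in the block starting at 0x20 (32), word 3 of that block.
    assert_eq!(block_address(0x2C), 0x20);
    assert_eq!(word_offset(0x2C), 3);
}
```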

Summary

Project 3 and Coding Interviews

  • Greg clarified the goals and scope of Project 3, noting it should not require extensive coding given the course’s focus on RISC-V and Rust.
  • He discussed evolving interview practices:
    • Effective use of coding assistants to explore multiple solutions and demonstrate problem-solving.
    • Emphasis on prompt quality, reasoning, and collaboration with tools during interviews.

Dynamic Analysis in Project 3

  • Dynamic analysis will collect runtime metrics within the emulator to profile behavior efficiently.
  • Metrics include (see the counter sketch below):
    • Instruction counts
    • Data-processing operations
    • Loads and stores
    • Branch usage
  • The approach leverages emulation to gain insight into processor operations without sacrificing execution efficiency.
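One plausible shape for this metric collection is a counter struct updated once per executed instruction; the `Metrics` and `InstrClass` names below are hypothetical, not the emulator’s actual types:

```rust
#[derive(Debug, Default)]
struct Metrics {
    instructions: u64, // total instructions executed
    data_ops: u64,     // data-processing (arithmetic/logic) operations
    loads: u64,
    stores: u64,
    branches: u64,
}

// Hypothetical classification produced by the emulator's decode step.
enum InstrClass {
    DataOp,
    Load,
    Store,
    Branch,
    Other,
}

impl Metrics {
    // Called once per executed instruction in the emulator's main loop.
    fn record(&mut self, class: InstrClass) {
        self.instructions += 1;
        match class {
            InstrClass::DataOp => self.data_ops += 1,
            InstrClass::Load => self.loads += 1,
            InstrClass::Store => self.stores += 1,
            InstrClass::Branch => self.branches += 1,
            InstrClass::Other => {}
        }
    }
}

fn main() {
    let mut metrics = Metrics::default();
    metrics.record(InstrClass::Load);
    metrics.record(InstrClass::DataOp);
    println!("{metrics:?}");
}
```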

Cache Memory and Performance Concepts

  • Processors have outpaced memory performance, creating a bottleneck that caches help mitigate.
  • Caches store recently used data in smaller, faster memory to reduce access latency.
  • Key outcomes (see the hit-rate sketch below):
    • Cache hit: data found in cache
    • Cache miss: data must be fetched from main memory
  • Blocked transfers (bringing in contiguous data) exploit spatial locality for higher efficiency.
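Hit rate is the fraction of accesses served by the cache; a trivial helper, assuming plain integer counters:

```rust
// Hit rate = hits / (hits + misses), guarding against division by zero.
fn hit_rate(hits: u64, misses: u64) -> f64 {
    let total = hits + misses;
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64
    }
}

fn main() {
    // e.g., 90 hits out of 100 accesses -> 0.90 hit rate
    println!("{}", hit_rate(90, 10));
}
```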

Core Caching Principles

  • Locality:
    • Temporal locality: recently accessed data is likely to be accessed again soon.
    • Spatial locality: data near recently accessed addresses is likely to be accessed soon.
  • Essential management questions (see the lookup sketch below):
    • Where to look for data? (mapping strategy)
    • How to check presence? (tags and valid bits)
    • How to resolve conflicts? (eviction policy when the cache is full)
  • Metrics to monitor: hits, misses, and hit rate.
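A minimal direct-mapped lookup sketch that answers the first two questions concretely: the index says where to look, and the valid bit plus tag comparison checks presence. The names and the simple fill-on-miss policy are illustrative assumptions, not the project’s required interface:

```rust
#[derive(Clone, Copy, Default)]
struct Slot {
    valid: bool, // has this slot ever been filled?
    tag: u32,    // identifies which block currently occupies the slot
}

// Returns true on a hit; on a miss, fills the slot with the new block's tag.
fn access(slots: &mut [Slot], block_addr: u32) -> bool {
    let num_slots = slots.len() as u32;
    let index = (block_addr % num_slots) as usize; // where to look
    let tag = block_addr / num_slots;              // how to check presence
    let slot = &mut slots[index];
    if slot.valid && slot.tag == tag {
        true // hit: the slot holds exactly this block
    } else {
        // miss: fetch from main memory (not modeled here) and fill the slot
        slot.valid = true;
        slot.tag = tag;
        false
    }
}

fn main() {
    let mut cache = [Slot::default(); 4];
    assert!(!access(&mut cache, 5)); // cold miss
    assert!(access(&mut cache, 5));  // temporal locality: now a hit
}
```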

Cache Design and AI in Software

  • Greg shared insights on cache optimization from his graduate studies.
  • He discussed AI’s growing role in software quality:
    • Large language models (e.g., Claude) aiding in bug discovery and remediation, including long-standing issues (e.g., Linux buffer overflow cases).
    • The promise and challenge of formal verification for proving correctness.
  • Security relevance:
    • Mitigations for hardware-level vulnerabilities (e.g., Spectre) that exploit cache timing side channels continue to evolve.
    • Greg and Junge examined a speculative execution bug that goes beyond simple branch prediction.
    • Modern CPUs speculatively execute multiple paths, which can be manipulated to influence cache state and leak data via timing analysis.
  • The discussion included a worked example of a direct-mapped cache:
    • A 4-slot cache using byte addresses and modulo operations to compute slot indices (reconstructed in the sketch below).
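A rough reconstruction of that worked example, assuming 1-word (4-byte) blocks; the exact addresses used in lecture are not recorded here:

```rust
fn main() {
    let num_slots = 4;   // 4-slot direct-mapped cache
    let block_bytes = 4; // 1-word blocks
    for byte_addr in [0u32, 4, 8, 12, 16, 20] {
        let block_addr = byte_addr / block_bytes;
        let slot = block_addr % num_slots; // modulo picks the slot
        // byte 0 -> slot 0, byte 4 -> slot 1, ..., byte 16 wraps around to slot 0
        println!("byte address {byte_addr:2} -> block {block_addr} -> slot {slot}");
    }
}
```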

Cache Architecture Implementation Details

  • Address mapping and structures:
    • Direct-mapped: each address maps to exactly one slot.
    • Set-associative: each address maps to a set, then to one of several ways within that set.
    • Fully associative: data can be placed in any slot.
  • Components:
    • Tags, valid bits, and block size handling.
  • LRU approximation (see the sketch after this list):
    • Use timestamps to track recent access and select eviction candidates.
    • Prefer filling invalid entries before evicting valid ones.
  • Handling multi-word blocks:
    • Compute aligned block addresses.
    • Maintain correct indexing and structural alignment consistent with hardware design.
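Putting the LRU pieces together for one set of a set-associative cache: timestamps are refreshed on every access, empty ways are filled first, and only then is the least-recently-used way evicted. A minimal sketch under those assumptions; the field names and index-based structure are illustrative:

```rust
#[derive(Clone, Copy, Default)]
struct Way {
    valid: bool,
    tag: u32,
    last_used: u64, // timestamp of the most recent access to this way
}

// Access one set; `now` is a counter the caller bumps on every cache access.
// Returns true on a hit.
fn access_set(set: &mut [Way], tag: u32, now: u64) -> bool {
    // Hit path: update the timestamp on every access so LRU stays accurate.
    if let Some(way) = set.iter_mut().find(|w| w.valid && w.tag == tag) {
        way.last_used = now;
        return true;
    }
    // Miss path: prefer an invalid (empty) way before evicting a valid entry.
    let victim = set.iter().position(|w| !w.valid).unwrap_or_else(|| {
        // All ways valid: evict the one with the smallest timestamp (LRU).
        set.iter()
            .enumerate()
            .min_by_key(|(_, w)| w.last_used)
            .map(|(i, _)| i)
            .unwrap()
    });
    set[victim] = Way { valid: true, tag, last_used: now };
    false
}

fn main() {
    let mut set = [Way::default(); 2];    // one set of a 2-way cache
    assert!(!access_set(&mut set, 1, 0)); // miss: fills an empty way
    assert!(!access_set(&mut set, 2, 1)); // miss: fills the other empty way
    assert!(access_set(&mut set, 1, 2));  // hit: refreshes tag 1's timestamp
    assert!(!access_set(&mut set, 3, 3)); // full set: evicts tag 2 (the LRU way)
    assert!(!access_set(&mut set, 2, 4)); // tag 2 was evicted, so this misses
}
```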