
Processor Components

Overview

This lecture bridges the gap from last class's 4-bit counter to a full RISC-V processor. We start by generalizing the counter into the canonical "state + combinational logic" picture that every digital system fits into, then identify the concrete state elements a processor needs (general-purpose registers, program counter, and data memory). Next we lay out the major processor components as a block diagram — labeling each as sequential or combinational — and walk through what it means for one instruction to complete per clock cycle. The rest of the lecture zooms in on the first three components: the program counter, the instruction memory (ROM), and the register file. The ALU and the instruction decoder come next class.

Learning Objectives

  • Generalize a counter into the universal "state + combinational logic" model of digital systems
  • Identify the state elements of a RISC-V processor: PC, register file, data memory
  • Read and interpret a block diagram of the processor components, labeling each as sequential or combinational
  • Distinguish single-cycle, multi-cycle, and pipelined processor designs
  • Specify the Program Counter as a 64-bit register with synchronous clear
  • Compute the word address that drives the instruction ROM from a byte-addressed PC
  • Describe the register file interface: two read ports, one write port, write enable, x0 hard-wired to zero
  • Sketch the internal structure of a register file built from 31 registers plus a hard-wired x0, two 32:1 MUXes, and a 5-to-32 write decoder

Prerequisites


From Counter to Processor

The 4-Bit Counter, Recapped

Last class we built a 4-bit counter from three pieces:

  • A 4-bit register (four D flip-flops with synchronous enable and clear)
  • A 4-bit ripple-carry adder that computes register + 1
  • A clock that drives the rising edge

On each rising edge the register captures the adder's output. The register holds the current count; the adder produces the next count.

Every Digital System Has the Same Shape

Generalize the counter. Replace "register + adder" with "state + combinational logic" and you have the canonical picture of every synchronous digital system — including a processor:

flowchart LR
    CLK[CLK] --> STATE[STATE<br/>registers, PC,<br/>data memory]
    STATE --> CL[Combinational<br/>Logic]
    CL --> STATE

The state elements store the current values; the combinational logic computes the next values. Each rising edge promotes "next" to "current."
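
As a rough software analogy (a sketch only, not the hardware itself), the 4-bit counter can be split into a pure next-state function and a state variable that changes only on a clock edge. The names `next_state` and `rising_edge` are illustrative and do not come from the lecture:

```python
def next_state(count: int) -> int:
    """Combinational logic: the adder computes count + 1, truncated to 4 bits."""
    return (count + 1) & 0xF


def rising_edge(state: int) -> int:
    """Clock edge: the register captures whatever the logic produced."""
    return next_state(state)


state = 0  # register contents after a clear
history = []
for _ in range(18):
    history.append(state)       # "current" value during this cycle
    state = rising_edge(state)  # "next" becomes "current"

print(history)  # counts 0..15, then wraps back to 0, 1
```

The same split carries over to the processor: only the state gets bigger and the next-state function gets more elaborate.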

Processor State Elements

A RISC-V processor has three kinds of state:

State Element | Purpose | Size
General-purpose registers | x0–x31 | 32 × 64 bits
Program counter (PC) | Address of current instruction | 64 bits
Data memory | Stack and heap storage | Variable (RAM)

Everything else in the processor is combinational logic that reads from state and produces the next state.


What One Instruction Does

Consider a single RISC-V instruction:

add t0, t1, t2

Two things have to happen this clock cycle:

  1. Compute the result: read t1 and t2 from the register file, add them in the ALU, write the sum back to t0.
  2. Advance the PC: PC = PC + 4 so that the next rising edge fetches the next instruction.

Both happen simultaneously inside one clock cycle. The register file update and the PC update both occur at the rising edge.

For taken branches and for jumps, step 2 becomes PC = BTA (branch target address) instead of PC + 4.
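
The two steps can be sketched in Python as a single function that computes both next values from the current state and then commits them together. This is a simplified model under stated assumptions: `step_add` and `MASK64` are hypothetical names, registers are 64 bits wide as in the state table, and only `add` is modeled:

```python
MASK64 = (1 << 64) - 1  # keep values within 64 bits


def step_add(regs, pc, rd, rs1, rs2):
    """Model one clock cycle of `add rd, rs1, rs2`; returns (new_regs, new_pc)."""
    # Between edges: combinational logic computes both next values
    # from the *current* state.
    alu_result = (regs[rs1] + regs[rs2]) & MASK64
    pc_next = (pc + 4) & MASK64

    # On the rising edge: both state elements update together.
    new_regs = list(regs)
    if rd != 0:  # x0 always stays 0
        new_regs[rd] = alu_result
    return new_regs, pc_next


regs = [0] * 32
regs[6], regs[7] = 10, 32  # t1 is x6, t2 is x7
regs, pc = step_add(regs, 0x100, rd=5, rs1=6, rs2=7)  # add t0, t1, t2 (t0 is x5)
print(regs[5], hex(pc))  # 42 0x104
```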


Processor Components

Here is the block diagram built up on the whiteboard, with each component labeled S (sequential: holds state) or C (combinational: a pure function of its inputs):

flowchart LR
    PC["PC<br/>(S)"] -->|addr| IM["Instruction<br/>Memory<br/>(C, ROM)"]
    IM -->|IW| DEC["Instruction<br/>Decoder<br/>(C)"]
    DEC --> RF["Register<br/>File<br/>(S)"]
    RF -->|RD0, RD1| ALU["ALU<br/>(C)"]
    ALU --> DM["Data<br/>Memory<br/>(S, RAM)"]
    DM --> RF

Component Roles

Component | Type | Function
PC | Sequential | Holds the address of the current instruction
Instruction Memory | Combinational (ROM) | Maps PC to a 32-bit instruction word
Instruction Decoder | Combinational | Extracts register numbers, immediates, control lines
Register File | Sequential | 32 general-purpose registers, x0–x31
ALU | Combinational | Arithmetic, logical, shift operations
Data Memory | Sequential (RAM) | Load/store target memory

One Instruction per Clock Cycle

Here is the picture of the clock driving the processor. Each rising edge completes one instruction:

         init       add        sub        beq
          ↓          ↓          ↓          ↓
CLK  ────┐    ┌─────┐    ┌─────┐    ┌─────┐    ┌────
         └────┘     └────┘     └────┘     └────┘
         ↑          ↑          ↑          ↑
     reset PC   execute    execute    execute
                  add        sub        beq

Between edges, values flow through the combinational logic (instruction fetch → decode → read registers → ALU → memory). On the rising edge, the new values are captured into PC, register file, and data memory.


Design Evolution

Processor designs have grown more complex over time:

Design | Description
Single-cycle | Each instruction completes in one clock cycle
Multi-cycle | Instructions take a variable number of cycles (common in the 1970s)
Pipelined | Multiple instructions are in flight at once, each in a different stage

Modern CPUs combine pipelining with out-of-order execution, superscalar issue, and speculative execution. In this course we build a single-cycle processor — the simplest design that can execute every RISC-V instruction.

Trade-off In a single-cycle design, the clock period must be at least as long as the longest combinational path, typically the load-word path through the ALU and data memory. Faster instructions (like `add`) don't finish early; they wait for the next clock edge. Pipelining attacks this by splitting the path into stages, so the clock period only has to cover the slowest stage rather than the whole path.
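
To make the trade-off concrete, here is a small calculation with made-up component delays. The numbers, the `DELAY_NS` table, and the path lists are assumptions chosen for illustration, not measurements of any real design, and the model ignores flip-flop setup time and write-back wiring:

```python
# Hypothetical propagation delays in nanoseconds (illustrative values only).
DELAY_NS = {"imem": 2.0, "decode": 0.5, "regread": 1.0, "alu": 2.0, "dmem": 2.5}

# Components each instruction's data passes through between clock edges.
PATHS = {
    "add": ["imem", "decode", "regread", "alu"],
    "load": ["imem", "decode", "regread", "alu", "dmem"],
}


def path_delay(stages):
    """Total combinational delay along one path."""
    return sum(DELAY_NS[s] for s in stages)


# The clock period must cover the slowest path.
min_period_ns = max(path_delay(p) for p in PATHS.values())
# Time an `add` spends idle, waiting for the next edge.
slack_add_ns = min_period_ns - path_delay(PATHS["add"])

print(min_period_ns, slack_add_ns)  # 8.0 2.5
```

With these assumed delays, every `add` wastes 2.5 ns of an 8 ns cycle because the load path sets the clock.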

Program Counter (PC)

The PC is a 64-bit register that holds the address of the instruction currently being executed.

PC Signals

Signal | Width | Direction | Description
D | 64 | Input | Next PC value
Q | 64 | Output | Current PC value
CLK | 1 | Input | Clock
EN | 1 | Input | Enable update (normally 1)
CLR | 1 | Input | Synchronous clear to 0

Update Rule

For sequential execution, the PC advances by 4 bytes each cycle (each RISC-V instruction is 4 bytes):

PC_next = PC + 4

For taken branches and for jumps, the PC receives a calculated target:

PC_next = BTA    (branch target address)

Adding CLR in Digital

The Digital simulator's built-in register component does not have a CLR input. We add one the same way we did for the 1-bit register last class — a synchronous clear via MUX:

flowchart LR
    D[D] --> CM["CLR MUX"] --> FF["64-bit<br/>register<br/>(D, CLK, EN)"] --> Q[Q]
    ZERO["constant 0"] --> CM
    CLR[CLR] --> CM
    CLK[CLK] --> FF
    EN[EN] --> FF

When CLR = 1, the MUX routes 0 into the flip-flop's D input, so the next rising edge loads 0 instead of the incoming value. The CLR is synchronous — it still waits for the clock edge.
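
A behavioral sketch of this arrangement in Python, assuming the wiring shown in the diagram (the CLR MUX sits in front of the register's D input, and EN gates the register itself). The class name `PCRegister` and its method are illustrative, not part of Digital:

```python
MASK64 = (1 << 64) - 1  # the PC is 64 bits wide


class PCRegister:
    """64-bit register with enable, fronted by a synchronous-clear MUX."""

    def __init__(self):
        self.q = 0  # current PC value (the Q output)

    def rising_edge(self, d, en=1, clr=0):
        """Apply one clock edge and return the new Q."""
        mux_out = 0 if clr else d  # CLR MUX: route constant 0 or the incoming D
        if en:                     # with EN = 0 the register holds its value
            self.q = mux_out & MASK64
        return self.q


pc = PCRegister()
pc.rising_edge(pc.q + 4)         # sequential execution: 0 -> 4
pc.rising_edge(pc.q + 4)         # 4 -> 8
pc.rising_edge(0x40)             # taken branch or jump: load a target
pc.rising_edge(pc.q + 4, en=0)   # EN = 0: value is held
held = pc.q
pc.rising_edge(pc.q + 4, clr=1)  # CLR = 1: next edge loads 0
print(hex(held), pc.q)           # 0x40 0
```

Because the MUX only changes what the register sees on D, nothing happens until an edge arrives, which is what makes the clear synchronous. In this wiring a clear also requires EN = 1.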


Instruction Memory

Instruction memory stores the program as 32-bit instruction words. We implement it with Digital's ROM (read-only memory) component, loaded from a .hex file produced by makerom3.py.

Specifications

Parameter | Value | Notes
Data width | 32 bits | One RISC-V instruction
Address width | 8 bits (typical) | 2⁸ = 256 instructions
Type | ROM | Read-only, loaded from .hex

Byte Address vs. Word Address

The PC holds a byte address, but the ROM is indexed by word address. Since each instruction is 4 bytes, the word address is the byte address shifted right by 2:

word_addr = byte_addr >> 2

We do not build a shifter — we use a splitter on the PC and pick off the right bits. With an 8-bit ROM address, we take bits [9:2] of the PC:

flowchart LR
    PC["PC<br/>(64 bits)"] -->|"bits [9:2]"| SPL["splitter"]
    SPL -->|"8-bit word addr"| ROM["Instruction<br/>Memory (ROM)"]
    ROM -->|"32-bit IW"| OUT["instruction word"]

Why bits [9:2]?
Bits 0 and 1 of the byte address are always 0 for an aligned 4-byte instruction, since the PC advances by 4 every cycle. Bit 2 is the low bit of the word address. With 8 address bits (supporting 256 instructions), we need bits 2 through 9 of the byte address, written `PC[9:2]`.
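
In software terms, the splitter is a shift followed by a mask. The helper below is a sketch with a hypothetical name, `rom_word_address`; its `addr_bits` parameter stands for the ROM's configured address width:

```python
def rom_word_address(pc: int, addr_bits: int = 8) -> int:
    """Return the PC slice [addr_bits + 1 : 2], i.e. the ROM word address."""
    # Drop the two always-zero byte-offset bits, then keep addr_bits bits.
    return (pc >> 2) & ((1 << addr_bits) - 1)


print(rom_word_address(0x0))    # 0   (first instruction)
print(rom_word_address(0x4))    # 1   (second instruction)
print(rom_word_address(0x3FC))  # 255 (last word of a 256-entry ROM)
print(rom_word_address(0x400))  # 0   (bit 10 is outside the slice, so it wraps)
```

The last line shows a consequence of taking only a slice: PC bits above bit 9 are ignored, so addresses beyond the ROM size alias back to the start.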

Register File

The register file holds the 32 general-purpose registers x0–x31. In a single clock cycle it must:

  1. Read up to two registers (for the two ALU operands)
  2. Write at most one register (for the instruction's destination)

Interface

Signal | Width | Direction | Description
RR0 | 5 | Input | Read register number for port 0
RR1 | 5 | Input | Read register number for port 1
WR | 5 | Input | Write register number
WD | 64 | Input | Write data
WE | 1 | Input | Write enable
CLK | 1 | Input | Clock
CLR | 1 | Input | Clear all registers to 0
RD0 | 64 | Output | Value of register RR0
RD1 | 64 | Output | Value of register RR1
x0–x31 | 64 each | Output | Individual register values (debug)

Reads are combinational: once RR0 or RR1 stabilizes, the selected register's value appears on RD0 or RD1 with no clock edge needed. Writes are sequential: on the rising edge, if WE = 1, the register selected by WR captures WD.

x0 Is Hard-Wired to Zero

RISC-V's x0 always reads as 0 and ignores writes:

RD0 = (RR0 == 0) ? 0 : registers[RR0]
RD1 = (RR1 == 0) ? 0 : registers[RR1]

In the circuit, x0 is not a real register — it is literally wired to a constant 0. Any write with WR = 0 has no effect because there is no register to update at index 0.

Internal Structure

The register file is built from 31 D-flip-flop-based registers (x1–x31) plus the hard-wired x0. Two 32:1 multiplexers select the read-port outputs. A 5-to-32 decoder, AND-ed with WE, drives the individual register enables.

flowchart LR
    X0["x0 = 0<br/>(hard-wired)"] --> M0["32:1 MUX<br/>(RD0)"]
    X1["x1"] --> M0
    XN["... x31"] --> M0
    X0 --> M1["32:1 MUX<br/>(RD1)"]
    X1 --> M1
    XN --> M1
    RR0[RR0] --> M0
    RR1[RR1] --> M1
    M0 --> RD0[RD0]
    M1 --> RD1[RD1]
    WR[WR] --> DEC["5-to-32<br/>decoder"]
    WE[WE] --> DEC
    DEC --> X1
    DEC --> XN

The decoder activates exactly one register's enable line (if WE = 1); on the next rising edge only that register captures WD.
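
The write-side decoding can be sketched as a function that returns the 32 enable lines. The name `write_enables` is hypothetical, and the list index stands for the register number:

```python
def write_enables(wr: int, we: int) -> list:
    """5-to-32 decoder AND-ed with WE: at most one enable line is 1."""
    return [1 if (we and i == wr) else 0 for i in range(32)]


lines = write_enables(wr=5, we=1)
print(lines.index(1), sum(lines))    # 5 1  (only x5 is enabled)
print(sum(write_enables(wr=5, we=0)))  # 0  (WE = 0 disables every register)
```

Line 0 can still go high when WR = 0, but in the circuit it is connected to nothing, since x0 has no register to enable.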

Read vs. write timing Reads are asynchronous: the MUX output tracks `RR0`/`RR1` directly. That matters because the ALU needs `RD0` and `RD1` *this cycle*, before the next rising edge. Writes are synchronous: the destination register only changes on the rising edge. Thus within a single cycle, an instruction reads the *old* values of the registers selected by `RR0`/`RR1`, and on the rising edge that ends the cycle it writes the *new* value into the register selected by `WR`, so the read and the write never conflict.
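
A small behavioral model makes the timing visible. This is a sketch, not the Digital circuit: `RegisterFile`, `read`, and `rising_edge` are illustrative names, and x0 is modeled as a list slot that is never written rather than as a wired constant:

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


class RegisterFile:
    def __init__(self):
        # Index 0 is never written, so it always reads as 0 (models hard-wired x0).
        self.regs = [0] * 32

    def read(self, rr0, rr1):
        """Combinational read: (RD0, RD1) reflect the current state immediately."""
        return self.regs[rr0], self.regs[rr1]

    def rising_edge(self, wr, wd, we):
        """Synchronous write: takes effect only when the clock edge arrives."""
        if we and wr != 0:
            self.regs[wr] = wd & MASK64


rf = RegisterFile()
rf.regs[5], rf.regs[7] = 10, 3     # preload x5 and x7 for the demo

before = rf.read(5, 7)             # during the cycle: old value of x5
rf.rising_edge(wr=5, wd=42, we=1)  # edge at the end of the cycle
after = rf.read(5, 7)              # next cycle: new value of x5

rf.rising_edge(wr=0, wd=99, we=1)  # a write aimed at x0 is discarded
x0_value = rf.read(0, 0)[0]

print(before, after, x0_value)     # (10, 3) (42, 3) 0
```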

Coming Up

Next class we will complete the datapath by covering:

  • The ALU: a combinational unit that performs add, sub, mul, shift, and logical operations
  • The instruction decoder: extracts opcode, funct3, funct7, register numbers, and immediate values from each instruction word, and drives the control lines that tell every other component what to do

With those two pieces, the single-cycle processor will be able to run a small program end-to-end.


Practice Problems

Problem 1: PC Calculation

After executing a jal that jumps forward 20 instructions, the PC is 0x1000. What was the PC before the jal?

Solution Twenty instructions is `20 × 4 = 80 = 0x50` bytes. The PC before the jump was `0x1000 - 0x50 = 0xFB0`.
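
The arithmetic can be checked in a couple of lines of Python (the variable names are only for illustration):

```python
pc_after = 0x1000
offset_bytes = 20 * 4            # 20 instructions, 4 bytes each
pc_before = pc_after - offset_bytes

print(hex(offset_bytes), hex(pc_before))  # 0x50 0xfb0
```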

Problem 2: Writing to x0

A program contains the instruction add x0, x5, x6. What happens on the rising edge?

Solution Nothing visible. The ALU computes `x5 + x6` and `WD` is driven with that value, but `x0` is hard-wired to 0, so there is no flip-flop to update. The next read of `x0` still returns 0. The same property is why the canonical RISC-V NOP, `addi x0, x0, 0`, has no visible effect.

Problem 3: Address Extraction

A Digital ROM is configured with 10 address bits. Which PC bits do we feed to the ROM?

Solution Bits `[11:2]` of the PC. Bits 0 and 1 are always 0 for aligned 4-byte instructions; bits 2 through 11 give us the 10-bit word address needed to index 2¹⁰ = 1024 instructions.

Problem 4: Read vs. Write in the Same Cycle

An instruction has RR0 = 5, RR1 = 7, WR = 5, WE = 1. If the ALU computes WD = 42, what value does RD0 carry during this clock cycle?

Solution `RD0` carries the *old* value of `x5`. Reads are combinational and reflect the current state of the register file; the write of `42` into `x5` only happens on the rising edge at the *end* of the cycle. Starting next cycle, `x5` reads as 42.

Problem 5: Why Two 32:1 MUXes?

Why does the register file need two 32:1 MUXes on the read side instead of one?

Solution Every R-type instruction reads two source registers simultaneously (e.g., `add t0, t1, t2` reads `t1` and `t2` together). The two ALU operands `RD0` and `RD1` must both be available within a single clock cycle, so we need two independent read ports — each with its own 32:1 MUX selected by its own read-register number (`RR0` and `RR1`).

Key Concepts

Concept | Description
State + CL | Every synchronous digital system = storage elements + combinational logic
Sequential components | PC, register file, data memory: hold values across clock edges
Combinational components | Instruction memory (ROM), decoder, ALU: pure functions
Single-cycle | One instruction completes per clock cycle
Byte vs. word address | PC is byte-addressed; instruction ROM is word-addressed, so use a splitter
x0 | Hard-wired to zero; writes are silently discarded
Two reads + one write | Register file services both ALU operands and the destination in one cycle

Summary

  1. Every synchronous digital system has the same shape: state elements driven by combinational logic, connected in a feedback loop through the clock.
  2. A RISC-V processor's state is its general-purpose registers, program counter, and data memory. Everything else is pure combinational logic.
  3. In a single-cycle processor, one instruction finishes every clock cycle. The clock period must cover the longest combinational path.
  4. The program counter is a 64-bit register with a synchronous-clear MUX. It normally advances by 4; for taken branches and for jumps it takes a calculated target address.
  5. The instruction memory is a ROM addressed by a low-order slice of the PC that skips the two byte-offset bits (bits [9:2] for an 8-bit ROM address), producing a 32-bit instruction word.
  6. The register file has two combinational read ports and one synchronous write port. x0 is hard-wired to zero. Two 32:1 MUXes select the read values, and a 5-to-32 decoder AND-ed with WE selects which of the other 31 registers is written.

Further Reading

  • Processor Design Part 1 — the 2024 guide, with full diagrams for the PC, instruction memory, register file, and ALU
  • Processor Design Part 2 — the register decoder, immediate decoder, and instruction decoder (covered in later lectures)
  • Patterson & Hennessy, Computer Organization and Design, RISC-V Edition, Chapter 4 — single-cycle datapath