Project06 - Octox Kernel Tracking and the track Command¶
Due Thu May 21st by 11:59pm in your Project06 GitHub repo
Links¶
Tests: https://github.com/USF-CS631-S26/tests
Background¶
In Project 05 you wrote user programs that call into the Octox kernel through
the existing syscall interface. In Project 06 you cross to the other side of
that interface: you will extend the kernel itself with a small per-process
tracking facility, expose two new syscalls, and write a user program named
track that uses them to fork+exec a target command and print a report of
what that target did — its syscalls, byte counts, memory map, and page
tables.
The work touches almost every layer of the kernel you have seen this
semester: the syscall dispatcher (UNIX System Calls lecture), process state
and exec (OS Kernel lecture), and page tables (Page Tables lecture).
Setup¶
Clone the Octox repo into your Project06 GitHub repo. Build it once to confirm your toolchain works:
$ cd octox
$ cargo build --target riscv64gc-unknown-none-elf
$ cargo run --target riscv64gc-unknown-none-elf
You should land at the $ shell prompt. Ctrl-A x exits QEMU.
Requirements¶
- Add a TrackMetrics struct in src/kernel/syscall.rs. This is the ABI shared between kernel and user; the autograder depends on the exact field layout shown in Section 2.
- Add per-process tracking fields to ProcData in src/kernel/proc.rs.
- Add two new syscalls, track_self and track_wait, wired through the syscall enum, dispatch table, and from_usize. Numbers 24 and 25.
- Instrument the syscall dispatcher, read, write, and exec so that tracked processes accumulate counters and untracked processes pay only a single boolean check.
- Write src/user/bin/track.rs with a matching [[bin]] entry in src/user/Cargo.toml. The binary lands at /bin/track.
- All output must match the autograder spec (spacing on each line, blank lines, and page counts all included).
- Use only sys::* and ulib; do not pull in additional crates.
Section 1: The track user program¶
track runs a target command in a child process and prints a per-process
metrics report after the child exits.
Usage:
With no arguments, print exactly:
…and exit 1.
For a command without a / in it, resolve it to /bin/<cmd> before passing
it to sys::exec. (Octox's mkfs strips the leading _, so _track lands at
/bin/track, and likewise for the command you exec.)
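The resolution rule above can be sketched as a small helper. This is a host-side illustration, not Octox code: the name resolve_cmd is hypothetical, and it uses std's String for brevity, so adapt it to whatever string handling ulib gives you.

```rust
/// Hypothetical helper (not part of Octox or ulib): pick the path to pass
/// to exec. A command containing '/' is used as given; a bare name is
/// looked up under /bin.
fn resolve_cmd(cmd: &str) -> String {
    if cmd.contains('/') {
        cmd.to_string()
    } else {
        format!("/bin/{}", cmd)
    }
}
```

With this sketch, resolve_cmd("echo") yields /bin/echo, while a path such as /bin/ls passes through unchanged.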
The control flow is the same fork/exec/wait pattern you saw in
src/user/bin/ex_exec.rs, with two substitutions:
- The child calls sys::track_self() before sys::exec(...). This sets the tracked flag on the calling process so the kernel knows to start counting.
- The parent replaces sys::wait(...) with sys::track_wait(&mut xstatus, &mut metrics). This both reaps the zombie child and fills in a TrackMetrics struct with the counters the kernel accumulated.
Worked example 1: track echo hello
$ track echo hello
hello
=== track report ===
PID: 3
Name: echo
Exit: 0
System Calls (total: 4)
exit: 1
write: 3 bytes: 7
Memory (total: 118784 bytes)
TEXT: 2 pages
DATA: 2 pages
HEAP: 0 pages
STACK: 25 pages
Page Tables (5 pages, 20480 bytes)
L2: 1
L1: 2
L0: 2
Worked example 2: track ls
$ track ls
dev Dir 2 64
bin Dir 3 432
lib Dir 4 32
etc Dir 5 48
README.org File 6 5260
init File 12 171576
initcode File 21 91600
=== track report ===
PID: 3
Name: ls
Exit: 0
System Calls (total: 241)
exit: 1
read: 65 bytes: 1024
fstat: 8
sbrk: 1
open: 8
write: 150 bytes: 213
close: 8
Memory (total: 196608 bytes)
TEXT: 4 pages
DATA: 3 pages
HEAP: 16 pages
STACK: 25 pages
Page Tables (5 pages, 20480 bytes)
L2: 1
L1: 2
L0: 2
Worked example 3: head -c 4194304 README.org > /big, then
track grep zzzzzz /big. This exercises the page-table walker —
because the heap grows past 2 MB, the user page table needs more than
one L0 page to cover it, and L0 rises from 2 (the value in the
smaller programs above) to 6. The rest of the report shape is
identical.
$ head -c 4194304 README.org > /big
$ track grep zzzzzz /big
=== track report ===
PID: 4
Name: grep
Exit: 0
System Calls (total: 131084)
exit: 1
read: 131073 bytes: 4194304
sbrk: 8
open: 1
close: 1
Memory (total: 8527872 bytes)
TEXT: 4 pages
DATA: 4 pages
HEAP: 2049 pages
STACK: 25 pages
Page Tables (9 pages, 36864 bytes)
L2: 1
L1: 2
L0: 6
$ rm /big
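One way to sanity-check the L0 numbers in these examples is the arithmetic below. It rests on assumptions that the text above does not state: that the low user region is contiguous from va 0, that it contains the four reported buckets plus one guard page, and that one additional L1/L0 pair comes from a high-address mapping (as in xv6-style kernels). Treat it as a plausibility check, not a description of Octox internals.

```rust
const PGSIZE: usize = 4096;
// Sv39: a 4 KiB page-table page holds 512 eight-byte entries.
const PTES_PER_TABLE: usize = 512;

/// Bytes of virtual address space covered by one L0 table (2 MiB).
fn l0_span_bytes() -> usize {
    PTES_PER_TABLE * PGSIZE
}

/// Assumed size of the low region: the four buckets plus one guard page.
fn low_region_pages(text: usize, data: usize, heap: usize, stack: usize) -> usize {
    text + data + heap + stack + 1
}

/// L0 tables needed to map `pages` contiguous pages starting at va 0.
fn l0_tables_for_low_region(pages: usize) -> usize {
    (pages + PTES_PER_TABLE - 1) / PTES_PER_TABLE
}
```

Under those assumptions, echo's 30 low pages fit in one L0 table and grep's 2083 low pages need five; adding the one assumed high-address table gives the reported 2 and 6.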
Print format. The autograder is byte-exact per line (it strips leading/trailing whitespace on each line before comparing, but does not collapse interior runs of spaces). Use these format strings exactly:
- Header: "=== track report ===".
- Identity block (three separate lines), then a blank line: "PID: {}", "Name: {}", "Exit: {}".
- Syscall section header: "System Calls (total: {})".
- Syscall rows (skip entries with count 0):
  - normal: " {}: {}"
  - read/write: " {}: {} bytes: {}"
- Memory section: "Memory (total: {} bytes)" then four lines: " TEXT: {} pages", " DATA: {} pages", " HEAP: {} pages", " STACK: {} pages".
- Page tables: "Page Tables ({} pages, {} bytes)" then " L2: {}", " L1: {}", " L0: {}".
- One blank line between major sections (matches the examples above).
- Every detail line uses a single space between every adjacent token. No format-string width specifiers ({:>N}/{:<N}) anywhere.
You will need a SYSCALL_NAMES table in track.rs indexed by the same
numbering as the kernel's SysCalls enum. Keep the two in sync — if you add
or rename a syscall in the kernel, update the table.
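The syscall rows can be produced by walking the name table and the counts in index order. The sketch below is a host-side illustration using std: the function name syscall_rows is hypothetical, and selecting the byte-count variant by matching on the name is just one possible design (matching on the syscall number would work equally well). The leading space in each format string follows the spec above; the autograder strips leading whitespace anyway.

```rust
/// Hypothetical helper: build the syscall detail rows in table order.
/// `names` stands in for SYSCALL_NAMES and `counts` for
/// TrackMetrics::syscall_counts; rows with a zero count are skipped.
fn syscall_rows(
    names: &[&str],
    counts: &[usize],
    bytes_read: usize,
    bytes_written: usize,
) -> Vec<String> {
    let mut rows = Vec::new();
    for (id, name) in names.iter().enumerate() {
        // A missing count is treated as zero so mismatched lengths cannot panic.
        let count = counts.get(id).copied().unwrap_or(0);
        if count == 0 {
            continue;
        }
        let row = match *name {
            "read" => format!(" {}: {} bytes: {}", name, count, bytes_read),
            "write" => format!(" {}: {} bytes: {}", name, count, bytes_written),
            _ => format!(" {}: {}", name, count),
        };
        rows.push(row);
    }
    rows
}
```

Because rows are emitted in index order, the report order follows the kernel's syscall numbering, which is why the table and the enum have to stay aligned.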
Section 2: The shared TrackMetrics struct (the ABI)¶
Add this struct near the top of src/kernel/syscall.rs, before the
SysCalls enum. The exact field names, types, and order are part of the
ABI — student kernels and the autograder must agree.
pub const NUM_SYSCALLS: usize = 24;
pub const TRACK_NAME_LEN: usize = 16;
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct TrackMetrics {
    pub pid: usize,
    pub name: [u8; TRACK_NAME_LEN],
    pub total_syscalls: usize,
    pub syscall_counts: [usize; NUM_SYSCALLS],
    pub bytes_read: usize,
    pub bytes_written: usize,
    pub mem_bytes: usize,
    pub text_pages: usize,
    pub data_pages: usize,
    pub heap_pages: usize,
    pub stack_pages: usize,
    pub pt_l2_pages: usize,
    pub pt_l1_pages: usize,
    pub pt_l0_pages: usize,
    pub pt_total_bytes: usize,
}
Also provide impl Default for TrackMetrics returning the all-zero value
(the user program builds an empty one and hands it to track_wait as the
out-param), and mark the struct as kernel-copyable so Uvm::copyout will
accept it:
#[cfg(all(target_os = "none", feature = "kernel"))]
unsafe impl crate::defs::AsBytes for TrackMetrics {}
User-side, TrackMetrics, NUM_SYSCALLS, and TRACK_NAME_LEN come through
the same pub use chain that already exposes Result, FcntlCmd, etc., so
you do not need to write a manual user stub — adding the syscalls to the
SysCalls enum is enough for the build script to regenerate usys.rs.
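A minimal sketch of the all-zero Default described above. The struct and constants are repeated only so the block stands alone; in the kernel the impl would sit next to the existing definition.

```rust
pub const NUM_SYSCALLS: usize = 24;
pub const TRACK_NAME_LEN: usize = 16;

#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct TrackMetrics {
    pub pid: usize,
    pub name: [u8; TRACK_NAME_LEN],
    pub total_syscalls: usize,
    pub syscall_counts: [usize; NUM_SYSCALLS],
    pub bytes_read: usize,
    pub bytes_written: usize,
    pub mem_bytes: usize,
    pub text_pages: usize,
    pub data_pages: usize,
    pub heap_pages: usize,
    pub stack_pages: usize,
    pub pt_l2_pages: usize,
    pub pt_l1_pages: usize,
    pub pt_l0_pages: usize,
    pub pt_total_bytes: usize,
}

impl Default for TrackMetrics {
    /// Every counter zero and the name buffer filled with NUL bytes.
    fn default() -> Self {
        Self {
            pid: 0,
            name: [0; TRACK_NAME_LEN],
            total_syscalls: 0,
            syscall_counts: [0; NUM_SYSCALLS],
            bytes_read: 0,
            bytes_written: 0,
            mem_bytes: 0,
            text_pages: 0,
            data_pages: 0,
            heap_pages: 0,
            stack_pages: 0,
            pt_l2_pages: 0,
            pt_l1_pages: 0,
            pt_l0_pages: 0,
            pt_total_bytes: 0,
        }
    }
}
```

With #[repr(C)] the struct occupies 37 usize words plus the 16-byte name with no padding on a 64-bit target, which gives a quick check that kernel and user agree on the layout.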
Section 3: Per-process tracking state¶
Extend ProcData in src/kernel/proc.rs with the fields the syscall
handlers will update. You need:
- tracked: bool (the fast-path gate; checked on every syscall)
- stack_top: usize (the value of sz at the end of exec; used later to classify pages as STACK vs HEAP. Recorded once, then stable until the proc dies.)
- track_total_syscalls: usize
- track_counts: [usize; NUM_SYSCALLS]
- track_bytes_read: usize
- track_bytes_written: usize
Three rules to honor:
- Zero-initialize all fields in ProcData::new().
- Reset them in Proc::free() alongside data.sz = 0. This matters because Proc slots are recycled; a stale tracked = true left over from a previous process would silently double-count.
- fork() must not propagate tracked to the child. The parent of the tracked process is your track program itself, which should not be tracked. The defaults in Proc::new() already handle this; just make sure you do not add an explicit copy.
Section 4: Kernel instrumentation points¶
Four locations. In each, the work is small — a counter bump or a stash — but
gated on pdata.tracked so untracked processes pay only one boolean check.
- Syscall counter: src/kernel/syscall.rs, inside the syscall() dispatcher, right after syscall_id is decoded and before the table dispatch. If the current proc is tracked, bump track_counts[syscall_id] and track_total_syscalls.
- read byte counter: src/kernel/syscall.rs, inside the read() handler. Capture the byte count returned by the underlying file read, and if the proc is tracked, add it to track_bytes_read before returning.
- write byte counter: same file, inside write(). Mirror the read() change against track_bytes_written.
- Exec reset and stack_top capture: src/kernel/exec.rs, inside the "commit to the new user image" block (after proc_data.uvm.replace(...) and proc_data.sz = sz). Two things happen here:
  - Always record proc_data.stack_top = sz. This is true for every exec, tracked or not; it costs you nothing and means the field is valid whenever you need it later.
  - If proc_data.tracked, zero the four syscall counters (track_counts, track_total_syscalls, track_bytes_read, track_bytes_written). This is what makes the report reflect the target binary's lifetime rather than the track program's argv-parsing-and-fork+exec stub.
Use saturating_add for the byte counters — usize is wide enough that you
will not actually overflow in this course, but a long-running tracked
process eventually would.
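The four instrumentation points can be modelled outside the kernel as methods on a small state struct. This is a sketch under stated assumptions: TrackState and its method names are hypothetical stand-ins for the fields you add to ProcData, and the bounds-checked index is a design choice of this sketch (a tracked process that invokes a syscall numbered at or above NUM_SYSCALLS, such as the new 24 and 25, would otherwise index past the array).

```rust
pub const NUM_SYSCALLS: usize = 24;

/// Hypothetical stand-in for the tracking fields added to ProcData.
#[derive(Debug)]
pub struct TrackState {
    pub tracked: bool,
    pub stack_top: usize,
    pub track_total_syscalls: usize,
    pub track_counts: [usize; NUM_SYSCALLS],
    pub track_bytes_read: usize,
    pub track_bytes_written: usize,
}

impl TrackState {
    pub fn new() -> Self {
        Self {
            tracked: false,
            stack_top: 0,
            track_total_syscalls: 0,
            track_counts: [0; NUM_SYSCALLS],
            track_bytes_read: 0,
            track_bytes_written: 0,
        }
    }

    /// Zero the four counters; `tracked` and `stack_top` are left alone.
    pub fn clear_counters(&mut self) {
        self.track_total_syscalls = 0;
        self.track_counts = [0; NUM_SYSCALLS];
        self.track_bytes_read = 0;
        self.track_bytes_written = 0;
    }

    /// Dispatcher hook: untracked processes pay only the boolean test.
    pub fn on_syscall(&mut self, syscall_id: usize) {
        if !self.tracked {
            return;
        }
        self.track_total_syscalls += 1;
        // Checked access: ids outside the array are counted in the total only.
        if let Some(slot) = self.track_counts.get_mut(syscall_id) {
            *slot += 1;
        }
    }

    /// read() hook: `n` is the byte count the file read returned.
    pub fn on_read(&mut self, n: usize) {
        if self.tracked {
            self.track_bytes_read = self.track_bytes_read.saturating_add(n);
        }
    }

    /// write() hook, mirroring on_read.
    pub fn on_write(&mut self, n: usize) {
        if self.tracked {
            self.track_bytes_written = self.track_bytes_written.saturating_add(n);
        }
    }

    /// exec hook: always record stack_top, reset counters only when tracked.
    pub fn on_exec(&mut self, sz: usize) {
        self.stack_top = sz;
        if self.tracked {
            self.clear_counters();
        }
    }
}
```

Calling on_exec after a tracked process has already made a few syscalls leaves all four counters at zero, so only the target binary's activity reaches the report.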
Section 5: The two new syscalls¶
Add the new variants to SysCalls in src/kernel/syscall.rs:
…and matching entries in the dispatch TABLE, the from_usize match, and
the gen_usys() generator (read the surrounding code — the pattern is
mechanical).
track_self(). Mark the calling process tracked and clear its counters.
There is no userspace argument; you just look up the current proc with
Cpus::myproc().unwrap(), flip tracked, and zero the four counter
fields. Returns Ok(()).
track_wait(xstatus_addr, metrics_addr). This is a near-copy of the
existing wait() (in src/kernel/proc.rs). The shape is identical: scan
the proc table for an exited child of the calling process, block on the
parent's wait channel if none have exited yet, and once you find a zombie
child, copy its exit status out to userspace and free the slot.
The one extra step happens just before c.free(c_guard): build a
TrackMetrics struct from the zombie child's ProcData and use
Uvm::copyout to write it into the parent's metrics_addr buffer. The
fields you read from ProcData are direct copies; the only fields that
require new computation are the memory / page-table fields described in the
next section. Note that the child is in ZOMBIE state and not running, so
it is safe to read its ProcData and walk its page table directly.
Order matters: build and copy out the metrics before calling c.free(),
because free() resets data.sz, data.stack_top, and the page table you
need to walk.
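The ordering constraint can be shown with a toy model. ChildData, Snapshot, and reap are hypothetical names, and the struct is a drastically reduced stand-in for ProcData; the real track_wait also copies the exit status and the metrics out to the parent before freeing the slot.

```rust
/// Hypothetical, heavily simplified stand-in for a zombie child's ProcData.
#[derive(Debug, Default)]
struct ChildData {
    pid: usize,
    sz: usize,
    stack_top: usize,
    track_total_syscalls: usize,
}

impl ChildData {
    /// Toy version of free(): wipe the slot so it can be recycled.
    fn free(&mut self) {
        *self = ChildData::default();
    }
}

/// The values the report needs, captured while the child is still intact.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Snapshot {
    pid: usize,
    sz: usize,
    stack_top: usize,
    total_syscalls: usize,
}

fn reap(child: &mut ChildData) -> Snapshot {
    // 1. Build the snapshot first, while sz and stack_top are still valid.
    let snap = Snapshot {
        pid: child.pid,
        sz: child.sz,
        stack_top: child.stack_top,
        total_syscalls: child.track_total_syscalls,
    };
    // 2. (In the kernel, the copyout to the parent would happen here.)
    // 3. Only now release the slot.
    child.free();
    snap
}
```

Swapping steps 1 and 3 in this model would return a snapshot of all zeros, which is the failure mode the ordering rule guards against.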
Section 6: Memory and page-table classification¶
The four memory buckets (TEXT, DATA, HEAP, STACK) are not stored explicitly
in ProcData — you derive them by walking the child's user address space
one virtual page at a time and classifying each mapped page by its
permissions and where it sits relative to stack_top. This is the part of
the project where the OS Kernel and Page Tables lectures are most useful.
The layout (set up by exec in src/kernel/exec.rs):
0                                                          sz
+---------+---------+- - -+------+--------+------+- - -+----+
|  TEXT   |  DATA   |     |guard | STACK  | HEAP |     |    |
+---------+---------+- - -+------+--------+------+- - -+----+
                                          ^
                                          stack_top (= sz at end of exec)
Walk the address space from va = 0 to va = sz in PGSIZE steps,
calling page_table.walk(va, false) on each. For each mapped leaf:
- TEXT = flags & PTE_X != 0 (executable).
- STACK = stack_top - STACK_PAGE_NUM * PGSIZE <= va < stack_top (STACK_PAGE_NUM = 25, see src/kernel/exec.rs).
- HEAP = va >= stack_top (sbrk grows sz upward, above the original stack region).
- DATA = remaining user-readable pages below the stack.
The guard page that exec installs at the bottom of the stack lacks PTE_U, so
skip any leaf without PTE_U before classifying it: walk will still report
the guard page, but this filter discards it.
Once classified, fill in text_pages, data_pages, heap_pages,
stack_pages directly, and set
mem_bytes = (text + data + heap + stack) * PGSIZE.
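The classification rules can be sketched as a pure function over (va, flags) pairs, which makes them easy to test outside the kernel. The names Region, Counts, classify, and count_regions are hypothetical, and the slice of pairs stands in for the results of walking the real page table. The flag bit positions are the standard RISC-V Sv39 ones; check them against the constants your kernel already defines.

```rust
const PGSIZE: usize = 4096;
const STACK_PAGE_NUM: usize = 25;
// Standard Sv39 PTE permission bits.
const PTE_R: usize = 1 << 1;
const PTE_W: usize = 1 << 2;
const PTE_X: usize = 1 << 3;
const PTE_U: usize = 1 << 4;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Region {
    Text,
    Data,
    Heap,
    Stack,
}

#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct Counts {
    text: usize,
    data: usize,
    heap: usize,
    stack: usize,
}

impl Counts {
    fn mem_bytes(&self) -> usize {
        (self.text + self.data + self.heap + self.stack) * PGSIZE
    }
}

/// Classify one mapped leaf; None means "not a user page" (e.g. the guard).
fn classify(va: usize, flags: usize, stack_top: usize) -> Option<Region> {
    if flags & PTE_U == 0 {
        return None;
    }
    // saturating_sub keeps this safe if stack_top was never recorded.
    let stack_bottom = stack_top.saturating_sub(STACK_PAGE_NUM * PGSIZE);
    if flags & PTE_X != 0 {
        Some(Region::Text)
    } else if va >= stack_top {
        Some(Region::Heap)
    } else if va >= stack_bottom {
        Some(Region::Stack)
    } else {
        Some(Region::Data)
    }
}

/// Tally regions over (va, flags) pairs for the mapped leaves in 0..sz.
fn count_regions(pages: &[(usize, usize)], stack_top: usize) -> Counts {
    let mut c = Counts::default();
    for &(va, flags) in pages {
        match classify(va, flags, stack_top) {
            Some(Region::Text) => c.text += 1,
            Some(Region::Data) => c.data += 1,
            Some(Region::Heap) => c.heap += 1,
            Some(Region::Stack) => c.stack += 1,
            None => {}
        }
    }
    c
}
```

Feeding it a synthetic layout of 2 executable pages, 2 data pages, one non-user guard page, and 25 stack pages reproduces the 118784-byte total from the echo example.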
Page-table page counts. Add a small read-only method on PageTable<V>
in src/kernel/vm.rs that returns (l2, l1, l0) — the number of
page-table pages allocated at each Sv39 level under this root. l2 is
always 1 (the root). For l1, count the valid non-leaf entries in the
root. For l0, count the valid non-leaf entries in each of those L1
tables. Then fill pt_l2_pages, pt_l1_pages, pt_l0_pages, and
pt_total_bytes = (l2 + l1 + l0) * PGSIZE.
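The counting rule can be illustrated on a toy three-level tree. Pte, PageTablePage, and count_pt_pages are hypothetical names used only here; the real method would inspect raw PTEs in PageTable<V>, where a valid non-leaf entry is one with the valid bit set and the R, W, and X bits all clear.

```rust
/// Toy model of one page-table entry.
enum Pte {
    Invalid,
    /// A mapping to a data page (any of R/W/X set in a real PTE).
    Leaf,
    /// A valid non-leaf entry pointing at the next-level table.
    Table(Box<PageTablePage>),
}

/// Toy model of one page-table page; a real one has exactly 512 entries.
struct PageTablePage {
    entries: Vec<Pte>,
}

/// Return (l2, l1, l0): page-table pages at each level under `root`.
fn count_pt_pages(root: &PageTablePage) -> (usize, usize, usize) {
    let mut l1 = 0;
    let mut l0 = 0;
    for entry in &root.entries {
        if let Pte::Table(l1_table) = entry {
            l1 += 1;
            for inner in &l1_table.entries {
                // Leaves at this level are not page-table pages, so skip them.
                if let Pte::Table(_) = inner {
                    l0 += 1;
                }
            }
        }
    }
    // The root itself is the single L2 page.
    (1, l1, l0)
}
```

A root with two L1 tables that each point at one L0 table yields (1, 2, 2), matching the 5 pages and 20480 bytes shown in the smaller examples.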
Building and Running¶
# build kernel + user programs
$ cargo build --target riscv64gc-unknown-none-elf
# boot in QEMU
$ cargo run --target riscv64gc-unknown-none-elf
# exit QEMU
Ctrl-A x
To run a single command end-to-end (the way the autograder tests it), use
runoctox.py:
$ python3 runoctox.py "track echo hello"
$ python3 runoctox.py "echo hi > /a" "track cat /a" "rm /a"
$ python3 runoctox.py "head -c 4194304 README.org > /big" "track grep zzzzzz /big" "rm /big"
Tips and Pitfalls¶
- exec does not return on success. Always follow sys::exec(...) in the child with sys::exit(1) so the failure path is defined.
- Reset counters on exec, not only in track_self. If you reset in track_self and not in exec, your counts will include syscalls made between track_self and exec (which is just the exec itself, but still wrong) and your report will not match the autograder.
- Build the report before free(). Proc::free() zeros sz, stack_top, and the page table, which is exactly the state you need to walk.
- Keep the syscall fast path short. The tracked check inside syscall() runs on every system call from every process; keep it a single boolean test and an if-guarded counter bump, nothing more.
- Whitespace tolerance. The autograder strips leading/trailing whitespace on each line before comparing, so the 2-space indent on detail rows can be any width (or absent) without affecting grading. Interior spaces on a line still have to match, which is why every detail line in the spec uses a single space between tokens.
- Keep SYSCALL_NAMES in sync. Index 0 is the invalid slot; index 1 is fork; etc. The user-side table in track.rs and the kernel's SysCalls enum must agree, or your report will print the wrong name for the right count.
- STACK_PAGE_NUM = 25. Your STACK page count in track echo hello must be exactly 25. If it is not, your stack/heap boundary is off.
Grading¶
Tests: https://github.com/USF-CS631-S26/tests
Grading is based on automated tests (100 points total):
| Tests | Points | Description |
|---|---|---|
| track-usage | 5 | No-arg usage message |
| track-echo | 10 | Smallest tracked workload (write + exit) |
| track-wc | 15 | bytes_read matches file size |
| track-ls | 20 | Heap (sbrk) + fstat + multi-syscall report |
| track-cat | 20 | Couples bytes_read with bytes_written; self-cleaning |
| track-grep | 30 | Forces heap > 2 MB; L0 page count grows past 2 |
| Total | 100 | |
Code Quality¶
Code quality deductions may be applied and can be earned back. We are looking for:
- Consistent spacing and indentation
- Consistent naming and commenting
- No commented-out ("dead") code
- No redundant or overly complicated code
- A clean repo: no build products, extra files, etc.