Evaluation and Scripts¶
Overview¶
This lecture covers the fundamental concepts of Abstract Syntax Tree (AST) evaluation and extending an expression-based language to support scripts with statements. We examine depth-first tree traversal for expression evaluation, the extension of grammar from expressions to statements, and the implementation of an environment (symbol table) for variable storage. These concepts form the foundation for implementing NTLang scripts with variables, assignments, and print statements.
Learning Objectives¶
- Understand depth-first tree traversal for expression evaluation
- Extend a grammar from expressions to statements
- Design and implement an environment (symbol table) for variable storage
- Implement variable binding and lookup operations
- Support assignment statements and print statements with formatting options
- Apply post-order evaluation patterns for recursive tree processing
Prerequisites¶
- Understanding of parsing and Abstract Syntax Trees (ASTs)
- Basic knowledge of recursive algorithms
- Familiarity with expression evaluation and operator precedence
- Understanding of binary, decimal, and hexadecimal number bases
1. From Parsing to Evaluation¶
The Compilation Pipeline¶
Recall the stages of language processing:
In the previous lectures, we built:
- Scanner: Converts source text into tokens
- Parser: Converts tokens into an Abstract Syntax Tree (AST)
Now we focus on:
- Evaluator: Traverses the AST and computes the result
What is an AST?¶
An Abstract Syntax Tree represents the hierarchical structure of a program:
The tree structure encodes operator precedence — multiplication happens before addition because * is deeper in the tree.
Evaluation: Walking the Tree¶
To evaluate an expression tree, we perform a depth-first traversal:
- Recursively evaluate child nodes (operands)
- Apply the operation at the current node
- Return the result
This is called post-order traversal because we process children before the parent.
2. Tree Traversal and Expression Evaluation¶
Post-Order Traversal¶
Post-order means: left subtree → right subtree → root.
Evaluation order: 2, 3, 4, *, +
- Evaluate
2→ 2 - Evaluate
3→ 3 - Evaluate
4→ 4 - Evaluate
3 * 4→ 12 - Evaluate
2 + 12→ 14
Recursive Evaluation Algorithm¶
function evaluate(node):
if node is a literal:
return node.value
if node is a unary operation:
operand = evaluate(node.child)
return apply_unary_op(node.operator, operand)
if node is a binary operation:
left = evaluate(node.left)
right = evaluate(node.right)
return apply_binary_op(node.operator, left, right)
Example: Evaluating (2 + 3) * 4¶
Trace:
| Call | Node | Action | Returns |
|---|---|---|---|
eval(*) |
* |
Recurse on left and right | |
eval(+) |
+ |
Recurse on left and right | |
eval(2) |
2 |
Literal | 2 |
eval(3) |
3 |
Literal | 3 |
(back to +) |
+ |
2 + 3 |
5 |
eval(4) |
4 |
Literal | 4 |
(back to *) |
* |
5 * 4 |
20 |
Final result: 20
Operator Application¶
For binary operators, we extract the left and right values and apply the operation:
| Operator | Operation | Example |
|---|---|---|
+ |
Addition | 5 + 3 → 8 |
- |
Subtraction | 5 - 3 → 2 |
* |
Multiplication | 5 * 3 → 15 |
/ |
Division | 15 / 3 → 5 |
>> |
Logical shift right | 8 >> 2 → 2 |
<< |
Logical shift left | 2 << 2 → 8 |
& |
Bitwise AND | 0xCA & 0xF0 → 0xC0 |
\| |
Bitwise OR | 0xCA \| 0x0F → 0xCF |
^ |
Bitwise XOR | 0xCA ^ 0xFF → 0x35 |
>- |
Arithmetic shift right | -8 >- 2 → -2 |
For unary operators:
| Operator | Operation | Example |
|---|---|---|
- |
Negation (two's complement) | -5 → -5 |
~ |
Bitwise NOT | ~0x0F → 0xFFFFFFF0 |
3. From Expressions to Statements¶
Expression Mode vs Script Mode¶
NTLang operates in two modes:
Expression Mode (-e flag):
Evaluates a single expression and prints the result.
Script Mode (file input):
Executes a sequence of statements from a file.
Grammar Extension¶
Expression grammar (Section 1):
program ::= expression EOT
expression ::= operand (operator operand)*
operand ::= intlit | hexlit | binlit
| '-' operand | '~' operand
| '(' expression ')'
Script grammar (Section 2):
program ::= (statement '\n')* EOT
statement ::= assignment | print_statement
assignment ::= ident '=' expression
print_statement ::= 'print' '(' expression ',' expression ',' expression ')'
expression ::= operand (operator operand)*
operand ::= intlit | hexlit | binlit | ident
| '-' operand | '~' operand
| '(' expression ')'
Key Changes¶
- Program is now a list of statements, not a single expression
- Statements include assignments and print statements
- Operands can now be identifiers (variable names)
- Newlines separate statements
4. Variables and the Environment¶
What is an Environment?¶
An environment (or symbol table) is a mapping from variable names to values:
When we evaluate an identifier, we look it up in the environment.
Operations on Environments¶
| Operation | Description | Example |
|---|---|---|
| Lookup | Get the value of a variable | env["x"] → 10 |
| Insert | Add a new variable | env["w"] = 40 |
| Update | Change an existing variable | env["x"] = 15 |
| Upsert | Insert or update | env["x"] = value |
For NTLang, we use upsert semantics — assignment either creates a new variable or updates an existing one.
Scope¶
NTLang has a single global scope — all variables are global. There are no local scopes, functions, or blocks.
Output:
Environment Example¶
Consider this script:
Execution trace:
| Statement | Environment After | Action |
|---|---|---|
| (start) | {} |
Empty environment |
x = 2 + 3 |
{"x": 5} |
Evaluate 2 + 3 → 5, bind x to 5 |
y = x * 4 |
{"x": 5, "y": 20} |
Lookup x → 5, evaluate 5 * 4 → 20, bind y to 20 |
print(y, 10, 32) |
{"x": 5, "y": 20} |
Lookup y → 20, print as base 10, width 32 |
Output: 20
5. Statement Evaluation¶
Assignment Statement¶
Evaluation:
- Evaluate the expression on the right side
- Bind (or update) the variable name to the computed value in the environment
Example:
After execution, the environment is:
Print Statement¶
Evaluation:
- Evaluate the
expression→ get a value - Evaluate
base→ should be 2, 10, or 16 - Evaluate
width→ should be 4, 8, 16, or 32 - Format and print the value
Example:
Print Formatting Rules¶
| Base | Format | Example Value | Output |
|---|---|---|---|
| 10 | Signed decimal | 15 | 15 |
| 10 | Signed decimal | -5 | -5 |
| 16 | Hexadecimal with prefix | 15 | 0x0F (width 8) |
| 16 | Hexadecimal with prefix | 255 | 0x000000FF (width 32) |
| 2 | Binary with prefix | 15 | 0b1111 (width 4) |
| 2 | Binary with prefix | 15 | 0b00001111 (width 8) |
For base 10, values are printed as signed integers (using two's complement interpretation).
For base 16 and 2, the width determines the number of digits.
6. Complete Evaluation Algorithm¶
Program Evaluation¶
function evaluate_program(statements, env):
for each statement in statements:
evaluate_statement(statement, env)
Statement Evaluation¶
function evaluate_statement(stmt, env):
if stmt is assignment:
value = evaluate_expression(stmt.expression, env)
env[stmt.identifier] = value
if stmt is print:
value = evaluate_expression(stmt.expr, env)
base = evaluate_expression(stmt.base, env)
width = evaluate_expression(stmt.width, env)
print_formatted(value, base, width)
Expression Evaluation with Variables¶
function evaluate_expression(expr, env):
if expr is literal:
return expr.value
if expr is identifier:
if expr.name not in env:
error("undefined variable: " + expr.name)
return env[expr.name]
if expr is unary operation:
operand = evaluate_expression(expr.child, env)
return apply_unary_op(expr.operator, operand)
if expr is binary operation:
left = evaluate_expression(expr.left, env)
right = evaluate_expression(expr.right, env)
return apply_binary_op(expr.operator, left, right)
Error Handling¶
Undefined variables:
Should produce an error message and exit.
Invalid base or width:
print(5, 11, 32) # Error: base must be 2, 10, or 16
print(5, 10, 7) # Error: width must be 4, 8, 16, or 32
7. Script Examples¶
Example 1: Basic Assignment and Print¶
p1.ntl:
Output:
Explanation:
xis bound to 3- Printed in different bases and widths
Example 2: Quadratic Equation¶
p2.ntl:
Output:
Explanation:
- Computes \(q = 3 \times 7^2 + 4 \times 7 + 5 = 147 + 28 + 5 = 180\)
- Comments (starting with
#) are ignored
Example 3: Two's Complement¶
p3.ntl:
# Two's Complement
x = 241
y = (~x) + 1
print(x, 10, 32)
print(x, 2, 32)
print(y, 10, 32)
print(y, 2, 32)
print((~x) + 1, 10, 32)
print((~x) + 1, 2, 32)
Output:
241
0b00000000000000000000000011110001
-241
0b11111111111111111111111100001111
-241
0b11111111111111111111111100001111
Explanation:
x = 241y = (~x) + 1computes the two's complement negation of 241~241in 32 bits =0xFFFFFF0E= 4294967054 unsigned(~241) + 1= 4294967055 unsigned = -241 signed
Example 4: Expression in Print¶
p4.ntl:
Output:
Explanation:
- Evaluates
(2 * 3) + 4→6 + 4→ 10 - Binds
xto 10 - Prints 10
8. Implementation Considerations¶
Data Structures¶
| Component | Data Structure | Purpose |
|---|---|---|
| Environment | Hash map or array | Map variable names to values |
| AST | Tree of nodes | Represent program structure |
| Values | 32-bit unsigned integers | Store computed values |
Environment Implementation Choices¶
Hash Map (e.g., Rust HashMap, Python dict):
- Fast lookup: O(1) average case
- Dynamic size
- More memory overhead
Array (C implementation):
- Simple implementation
- Linear search: O(n) lookup
- Fixed capacity
- Lower memory overhead for small programs
For NTLang with a small number of variables, an array-based symbol table is sufficient.
Memory Management¶
- In C: Allocate environment on the stack or heap with fixed size
- In Rust: Use
HashMap<String, u32>for dynamic allocation
Passing the Environment¶
The environment must be passed to all evaluation functions:
- In C: Pass a pointer to the environment struct
- In Rust: Pass a mutable reference
&mut HashMap<String, u32>
Key Concepts¶
| Concept | Description |
|---|---|
| AST | Abstract Syntax Tree representing program structure |
| Post-order traversal | Evaluate children before parent |
| Expression evaluation | Recursive depth-first traversal of AST |
| Statement | A command that performs an action (assignment, print) |
| Environment | Mapping from variable names to values |
| Lookup | Retrieve value of a variable from environment |
| Upsert | Insert or update a variable in environment |
| Assignment | Bind a variable name to a value |
| Print statement | Format and display a value with specified base/width |
| Scope | Visibility region for variables (global in NTLang) |
Practice Problems¶
Problem 1: Trace Expression Evaluation¶
Given the AST for ((1 + 2) * 3) - 4:
Trace the evaluation order (post-order traversal).
Click to reveal solution
Problem 2: Environment Trace¶
Given the script:
Trace the environment after each statement.
Click to reveal solution
Initial environment: {}
After "a = 5":
Environment: {"a": 5}
After "b = a + 3":
Lookup a → 5
Evaluate 5 + 3 → 8
Environment: {"a": 5, "b": 8}
After "c = a * b":
Lookup a → 5
Lookup b → 8
Evaluate 5 * 8 → 40
Environment: {"a": 5, "b": 8, "c": 40}
After "print(c, 10, 32)":
Lookup c → 40
Print: 40
Environment: {"a": 5, "b": 8, "c": 40}
Problem 3: Undefined Variable¶
What happens when this script is executed?
Click to reveal solution
Problem 4: Print Formatting¶
Given x = 15, what is the output of each print statement?
Click to reveal solution
Problem 5: Grammar Extension¶
Identify what is new in the script grammar compared to the expression grammar.
Click to reveal solution
New in script grammar:
1. Program structure:
- Changed from single expression to list of statements
- Statements separated by newlines
2. Statement types:
- assignment: "ident = expression"
- print_statement: "print(expr, expr, expr)"
3. Operands:
- Added "ident" as a valid operand (variable reference)
4. Tokens:
- ident (identifiers/variable names)
- assign (=)
- comma (,)
- newline (\n)
- comments (# ... \n)
5. Semantics:
- Need environment to store variable bindings
- Need lookup operation for identifiers
- Need formatted output for print statement
Problem 6: Two's Complement Negation¶
Given x = 8, trace the evaluation of y = (~x) + 1 using 8-bit arithmetic.
Click to reveal solution
Given: x = 8
In 8-bit binary: x = 0000 1000
Step 1: Evaluate ~x (bitwise NOT)
~0000 1000 = 1111 0111
In 8-bit unsigned: 247
In 8-bit signed (two's complement): -9
Wait, let's reconsider with 32-bit as in NTLang:
Given: x = 8 (as uint32_t)
In 32-bit binary: x = 0x00000008 = 0000...0000 1000
Step 1: Evaluate ~x
~0x00000008 = 0xFFFFFFF7
In unsigned: 4,294,967,287
In signed: -9
Step 2: Evaluate (~x) + 1
0xFFFFFFF7 + 1 = 0xFFFFFFF8
In unsigned: 4,294,967,288
In signed: -8
Result: y = -8 (when interpreted as signed)
Verification: 8 + (-8) = 0 ✓
This confirms that (~x) + 1 is the two's complement negation of x.
Summary¶
-
Expression evaluation uses recursive depth-first traversal of the AST. We evaluate child nodes before applying operations at parent nodes (post-order traversal).
-
Extending from expressions to statements involves changing the program structure from a single expression to a list of statements, and adding assignment and print statement types.
-
Environment (symbol table) maps variable names to values. It supports lookup, insert, and update operations. NTLang uses a single global scope.
-
Assignment statements evaluate the right-hand expression and bind the result to a variable name in the environment (upsert semantics).
-
Print statements evaluate three expressions (value, base, width), then format and display the value accordingly. Base 10 uses signed representation.
-
Evaluation algorithm recursively processes statements, maintaining an environment that threads through all evaluation calls.
-
Identifiers in expressions require looking up the variable name in the environment. Undefined variables should produce an error.
-
Implementation choices include using hash maps or arrays for the environment, and passing the environment by pointer (C) or mutable reference (Rust) to evaluation functions.