Skip to content

Evaluation and Scripts

Overview

This lecture covers the fundamental concepts of Abstract Syntax Tree (AST) evaluation and extending an expression-based language to support scripts with statements. We examine depth-first tree traversal for expression evaluation, the extension of grammar from expressions to statements, and the implementation of an environment (symbol table) for variable storage. These concepts form the foundation for implementing NTLang scripts with variables, assignments, and print statements.

Learning Objectives

  • Understand depth-first tree traversal for expression evaluation
  • Extend a grammar from expressions to statements
  • Design and implement an environment (symbol table) for variable storage
  • Implement variable binding and lookup operations
  • Support assignment statements and print statements with formatting options
  • Apply post-order evaluation patterns for recursive tree processing

Prerequisites

  • Understanding of parsing and Abstract Syntax Trees (ASTs)
  • Basic knowledge of recursive algorithms
  • Familiarity with expression evaluation and operator precedence
  • Understanding of binary, decimal, and hexadecimal number bases

1. From Parsing to Evaluation

The Compilation Pipeline

Recall the stages of language processing:

Source Code → Scanner → Tokens → Parser → AST → Evaluator → Result

In the previous lectures, we built:

  1. Scanner: Converts source text into tokens
  2. Parser: Converts tokens into an Abstract Syntax Tree (AST)

Now we focus on:

  1. Evaluator: Traverses the AST and computes the result

What is an AST?

An Abstract Syntax Tree represents the hierarchical structure of a program:

Expression: 2 + 3 * 4

AST:
        +
       / \
      2   *
         / \
        3   4

The tree structure encodes operator precedence — multiplication happens before addition because * is deeper in the tree.

Evaluation: Walking the Tree

To evaluate an expression tree, we perform a depth-first traversal:

  1. Recursively evaluate child nodes (operands)
  2. Apply the operation at the current node
  3. Return the result

This is called post-order traversal because we process children before the parent.


2. Tree Traversal and Expression Evaluation

Post-Order Traversal

Post-order means: left subtree → right subtree → root.

        +
       / \
      2   *
         / \
        3   4

Evaluation order: 2, 3, 4, *, +

  1. Evaluate 2 → 2
  2. Evaluate 3 → 3
  3. Evaluate 4 → 4
  4. Evaluate 3 * 4 → 12
  5. Evaluate 2 + 12 → 14

Recursive Evaluation Algorithm

function evaluate(node):
    if node is a literal:
        return node.value

    if node is a unary operation:
        operand = evaluate(node.child)
        return apply_unary_op(node.operator, operand)

    if node is a binary operation:
        left = evaluate(node.left)
        right = evaluate(node.right)
        return apply_binary_op(node.operator, left, right)

Example: Evaluating (2 + 3) * 4

AST:
        *
       / \
      +   4
     / \
    2   3

Trace:

Call Node Action Returns
eval(*) * Recurse on left and right
eval(+) + Recurse on left and right
eval(2) 2 Literal 2
eval(3) 3 Literal 3
(back to +) + 2 + 3 5
eval(4) 4 Literal 4
(back to *) * 5 * 4 20

Final result: 20

Operator Application

For binary operators, we extract the left and right values and apply the operation:

Operator Operation Example
+ Addition 5 + 3 → 8
- Subtraction 5 - 3 → 2
* Multiplication 5 * 3 → 15
/ Division 15 / 3 → 5
>> Logical shift right 8 >> 2 → 2
<< Logical shift left 2 << 2 → 8
& Bitwise AND 0xCA & 0xF00xC0
\| Bitwise OR 0xCA \| 0x0F0xCF
^ Bitwise XOR 0xCA ^ 0xFF0x35
>- Arithmetic shift right -8 >- 2 → -2

For unary operators:

Operator Operation Example
- Negation (two's complement) -5 → -5
~ Bitwise NOT ~0x0F0xFFFFFFF0

3. From Expressions to Statements

Expression Mode vs Script Mode

NTLang operates in two modes:

Expression Mode (-e flag):

$ ./project01 -e "2 + 3"
5

Evaluates a single expression and prints the result.

Script Mode (file input):

$ ./project01 script.ntl

Executes a sequence of statements from a file.

Grammar Extension

Expression grammar (Section 1):

program    ::= expression EOT
expression ::= operand (operator operand)*
operand    ::= intlit | hexlit | binlit
             | '-' operand | '~' operand
             | '(' expression ')'

Script grammar (Section 2):

program    ::= (statement '\n')* EOT
statement  ::= assignment | print_statement

assignment ::= ident '=' expression
print_statement ::= 'print' '(' expression ',' expression ',' expression ')'

expression ::= operand (operator operand)*
operand    ::= intlit | hexlit | binlit | ident
             | '-' operand | '~' operand
             | '(' expression ')'

Key Changes

  1. Program is now a list of statements, not a single expression
  2. Statements include assignments and print statements
  3. Operands can now be identifiers (variable names)
  4. Newlines separate statements

4. Variables and the Environment

What is an Environment?

An environment (or symbol table) is a mapping from variable names to values:

Environment = { "x": 10, "y": 20, "z": 30 }

When we evaluate an identifier, we look it up in the environment.

Operations on Environments

Operation Description Example
Lookup Get the value of a variable env["x"] → 10
Insert Add a new variable env["w"] = 40
Update Change an existing variable env["x"] = 15
Upsert Insert or update env["x"] = value

For NTLang, we use upsert semantics — assignment either creates a new variable or updates an existing one.

Scope

NTLang has a single global scope — all variables are global. There are no local scopes, functions, or blocks.

x = 10
y = x + 5
x = 20        # Updates existing x
print(x, 10, 32)
print(y, 10, 32)

Output:

20
15

Environment Example

Consider this script:

x = 2 + 3
y = x * 4
print(y, 10, 32)

Execution trace:

Statement Environment After Action
(start) {} Empty environment
x = 2 + 3 {"x": 5} Evaluate 2 + 3 → 5, bind x to 5
y = x * 4 {"x": 5, "y": 20} Lookup x → 5, evaluate 5 * 4 → 20, bind y to 20
print(y, 10, 32) {"x": 5, "y": 20} Lookup y → 20, print as base 10, width 32

Output: 20


5. Statement Evaluation

Assignment Statement

ident = expression

Evaluation:

  1. Evaluate the expression on the right side
  2. Bind (or update) the variable name to the computed value in the environment

Example:

a = 3
b = 4
c = 5
q = (a * 7 * 7) + (b * 7) + c

After execution, the environment is:

{"a": 3, "b": 4, "c": 5, "q": 180}
print(expression, base, width)

Evaluation:

  1. Evaluate the expression → get a value
  2. Evaluate base → should be 2, 10, or 16
  3. Evaluate width → should be 4, 8, 16, or 32
  4. Format and print the value

Example:

x = 15
print(x, 10, 32)    # Prints: 15
print(x, 16, 8)     # Prints: 0x0F
print(x, 2, 4)      # Prints: 0b1111
Base Format Example Value Output
10 Signed decimal 15 15
10 Signed decimal -5 -5
16 Hexadecimal with prefix 15 0x0F (width 8)
16 Hexadecimal with prefix 255 0x000000FF (width 32)
2 Binary with prefix 15 0b1111 (width 4)
2 Binary with prefix 15 0b00001111 (width 8)

For base 10, values are printed as signed integers (using two's complement interpretation).

For base 16 and 2, the width determines the number of digits.


6. Complete Evaluation Algorithm

Program Evaluation

function evaluate_program(statements, env):
    for each statement in statements:
        evaluate_statement(statement, env)

Statement Evaluation

function evaluate_statement(stmt, env):
    if stmt is assignment:
        value = evaluate_expression(stmt.expression, env)
        env[stmt.identifier] = value

    if stmt is print:
        value = evaluate_expression(stmt.expr, env)
        base = evaluate_expression(stmt.base, env)
        width = evaluate_expression(stmt.width, env)
        print_formatted(value, base, width)

Expression Evaluation with Variables

function evaluate_expression(expr, env):
    if expr is literal:
        return expr.value

    if expr is identifier:
        if expr.name not in env:
            error("undefined variable: " + expr.name)
        return env[expr.name]

    if expr is unary operation:
        operand = evaluate_expression(expr.child, env)
        return apply_unary_op(expr.operator, operand)

    if expr is binary operation:
        left = evaluate_expression(expr.left, env)
        right = evaluate_expression(expr.right, env)
        return apply_binary_op(expr.operator, left, right)

Error Handling

Undefined variables:

print(x, 10, 32)    # Error if x is not defined

Should produce an error message and exit.

Invalid base or width:

print(5, 11, 32)    # Error: base must be 2, 10, or 16
print(5, 10, 7)     # Error: width must be 4, 8, 16, or 32

7. Script Examples

Example 1: Basic Assignment and Print

p1.ntl:

x = 1 + 2
print(x, 10, 32)
print(x, 2, 32)
print(x, 16, 32)
print(x, 2, 4)
print(x, 16, 4)

Output:

3
0b00000000000000000000000000000011
0x00000003
0b0011
0x3

Explanation:

  • x is bound to 3
  • Printed in different bases and widths

Example 2: Quadratic Equation

p2.ntl:

# Quadratic equation

x = 7
a = 3
b = 4
c = 5

q = (a * x * x) + (b * x) + c

print(q, 10, 32)

Output:

180

Explanation:

  • Computes \(q = 3 \times 7^2 + 4 \times 7 + 5 = 147 + 28 + 5 = 180\)
  • Comments (starting with #) are ignored

Example 3: Two's Complement

p3.ntl:

# Two's Complement

x = 241
y = (~x) + 1

print(x, 10, 32)
print(x, 2, 32)
print(y, 10, 32)
print(y, 2, 32)
print((~x) + 1, 10, 32)
print((~x) + 1, 2, 32)

Output:

241
0b00000000000000000000000011110001
-241
0b11111111111111111111111100001111
-241
0b11111111111111111111111100001111

Explanation:

  • x = 241
  • y = (~x) + 1 computes the two's complement negation of 241
  • ~241 in 32 bits = 0xFFFFFF0E = 4294967054 unsigned
  • (~241) + 1 = 4294967055 unsigned = -241 signed

Example 4: Expression in Print

p4.ntl:

x = (2 * 3) + 4
print(x, 10, 32)

Output:

10

Explanation:

  • Evaluates (2 * 3) + 46 + 4 → 10
  • Binds x to 10
  • Prints 10

8. Implementation Considerations

Data Structures

Component Data Structure Purpose
Environment Hash map or array Map variable names to values
AST Tree of nodes Represent program structure
Values 32-bit unsigned integers Store computed values

Environment Implementation Choices

Hash Map (e.g., Rust HashMap, Python dict):

  • Fast lookup: O(1) average case
  • Dynamic size
  • More memory overhead

Array (C implementation):

  • Simple implementation
  • Linear search: O(n) lookup
  • Fixed capacity
  • Lower memory overhead for small programs

For NTLang with a small number of variables, an array-based symbol table is sufficient.

Memory Management

  • In C: Allocate environment on the stack or heap with fixed size
  • In Rust: Use HashMap<String, u32> for dynamic allocation

Passing the Environment

The environment must be passed to all evaluation functions:

evaluate_expression(expr, env)
evaluate_statement(stmt, env)
  • In C: Pass a pointer to the environment struct
  • In Rust: Pass a mutable reference &mut HashMap<String, u32>

Key Concepts

Concept Description
AST Abstract Syntax Tree representing program structure
Post-order traversal Evaluate children before parent
Expression evaluation Recursive depth-first traversal of AST
Statement A command that performs an action (assignment, print)
Environment Mapping from variable names to values
Lookup Retrieve value of a variable from environment
Upsert Insert or update a variable in environment
Assignment Bind a variable name to a value
Print statement Format and display a value with specified base/width
Scope Visibility region for variables (global in NTLang)

Practice Problems

Problem 1: Trace Expression Evaluation

Given the AST for ((1 + 2) * 3) - 4:

        -
       / \
      *   4
     / \
    +   3
   / \
  1   2

Trace the evaluation order (post-order traversal).

Click to reveal solution
Evaluation order:

1. evaluate(1) → 1
2. evaluate(2) → 2
3. evaluate(+) → 1 + 2 = 3
4. evaluate(3) → 3
5. evaluate(*) → 3 * 3 = 9
6. evaluate(4) → 4
7. evaluate(-) → 9 - 4 = 5

Final result: 5

Problem 2: Environment Trace

Given the script:

a = 5
b = a + 3
c = a * b
print(c, 10, 32)

Trace the environment after each statement.

Click to reveal solution
Initial environment: {}

After "a = 5":
  Environment: {"a": 5}

After "b = a + 3":
  Lookup a → 5
  Evaluate 5 + 3 → 8
  Environment: {"a": 5, "b": 8}

After "c = a * b":
  Lookup a → 5
  Lookup b → 8
  Evaluate 5 * 8 → 40
  Environment: {"a": 5, "b": 8, "c": 40}

After "print(c, 10, 32)":
  Lookup c → 40
  Print: 40
  Environment: {"a": 5, "b": 8, "c": 40}

Problem 3: Undefined Variable

What happens when this script is executed?

y = x + 1
Click to reveal solution
Error: undefined variable "x"

When evaluating the expression "x + 1", we try to lookup x in the environment.
Since x has not been defined (not in the environment), we must report an error
and exit the program.

Proper error message:
  "Error: undefined variable 'x'"
or
  "undefined variable: x"

Problem 4: Print Formatting

Given x = 15, what is the output of each print statement?

print(x, 10, 32)
print(x, 16, 8)
print(x, 2, 4)
print(x, 16, 4)
Click to reveal solution
print(x, 10, 32)  →  15
  (base 10, signed, width 32 - just prints the decimal value)

print(x, 16, 8)   →  0x0F
  (base 16, 8 bits = 2 hex digits)

print(x, 2, 4)    →  0b1111
  (base 2, 4 bits)

print(x, 16, 4)   →  0xF
  (base 16, 4 bits = 1 hex digit)

Problem 5: Grammar Extension

Identify what is new in the script grammar compared to the expression grammar.

Click to reveal solution
New in script grammar:

1. Program structure:
   - Changed from single expression to list of statements
   - Statements separated by newlines

2. Statement types:
   - assignment: "ident = expression"
   - print_statement: "print(expr, expr, expr)"

3. Operands:
   - Added "ident" as a valid operand (variable reference)

4. Tokens:
   - ident (identifiers/variable names)
   - assign (=)
   - comma (,)
   - newline (\n)
   - comments (# ... \n)

5. Semantics:
   - Need environment to store variable bindings
   - Need lookup operation for identifiers
   - Need formatted output for print statement

Problem 6: Two's Complement Negation

Given x = 8, trace the evaluation of y = (~x) + 1 using 8-bit arithmetic.

Click to reveal solution
Given: x = 8

In 8-bit binary: x = 0000 1000

Step 1: Evaluate ~x (bitwise NOT)
  ~0000 1000 = 1111 0111
  In 8-bit unsigned: 247
  In 8-bit signed (two's complement): -9

Wait, let's reconsider with 32-bit as in NTLang:

Given: x = 8 (as uint32_t)

In 32-bit binary: x = 0x00000008 = 0000...0000 1000

Step 1: Evaluate ~x
  ~0x00000008 = 0xFFFFFFF7
  In unsigned: 4,294,967,287
  In signed: -9

Step 2: Evaluate (~x) + 1
  0xFFFFFFF7 + 1 = 0xFFFFFFF8
  In unsigned: 4,294,967,288
  In signed: -8

Result: y = -8 (when interpreted as signed)

Verification: 8 + (-8) = 0 ✓
This confirms that (~x) + 1 is the two's complement negation of x.

Summary

  1. Expression evaluation uses recursive depth-first traversal of the AST. We evaluate child nodes before applying operations at parent nodes (post-order traversal).

  2. Extending from expressions to statements involves changing the program structure from a single expression to a list of statements, and adding assignment and print statement types.

  3. Environment (symbol table) maps variable names to values. It supports lookup, insert, and update operations. NTLang uses a single global scope.

  4. Assignment statements evaluate the right-hand expression and bind the result to a variable name in the environment (upsert semantics).

  5. Print statements evaluate three expressions (value, base, width), then format and display the value accordingly. Base 10 uses signed representation.

  6. Evaluation algorithm recursively processes statements, maintaining an environment that threads through all evaluation calls.

  7. Identifiers in expressions require looking up the variable name in the environment. Undefined variables should produce an error.

  8. Implementation choices include using hash maps or arrays for the environment, and passing the environment by pointer (C) or mutable reference (Rust) to evaluation functions.