
Binary Representation and Number Bases

Overview

This lecture covers the fundamental concepts of binary representation and number bases that underlie all computing. We examine positional notation, binary and hexadecimal number systems, two's complement for signed integers, bitwise operations, shift operations, and the algorithms for converting between string representations and integer values. These concepts are language-agnostic and form the foundation for the C and Rust implementations covered in the next lecture.

Learning Objectives

  • Understand positional notation and its application to any base
  • Convert between binary, decimal, and hexadecimal representations
  • Apply two's complement for signed integer representation
  • Use bitwise operations (NOT, AND, OR, XOR) for bit manipulation
  • Understand shift operations and their relationship to multiplication/division
  • Describe algorithms for string-to-integer and integer-to-string conversion
  • Apply bit masking and width concepts

Prerequisites

  • Basic arithmetic and algebra
  • Understanding of powers of 2
  • Familiarity with C or Rust programming basics

1. Positional Notation

The General Formula

In a positional number system, the value of a number is determined by the position of each digit and the base of the system:

\[value = \sum_{i=0}^{n-1} d_i \times base^i\]

where \(d_i\) is the digit at position \(i\) (counting from the right, starting at 0).

Decimal (Base 10) — The Familiar Example

We use 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Example: 735 in base 10:

Position  Digit  Place Value       Contribution
2         7      \(10^2 = 100\)    700
1         3      \(10^1 = 10\)     30
0         5      \(10^0 = 1\)      5

\(735 = 7 \times 100 + 3 \times 10 + 5 \times 1 = 735\)

Generalization to Any Base

The same principle applies to any base \(b\):

  • Base 2 (binary): digits are 0, 1
  • Base 8 (octal): digits are 0-7
  • Base 10 (decimal): digits are 0-9
  • Base 16 (hexadecimal): digits are 0-9, A-F

The number of distinct digits in a base-\(b\) system is always \(b\).
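To make the formula concrete, here is a minimal Python sketch (illustrative only, not NTLang code) that evaluates the sum directly:

```python
def positional_value(digits, base):
    """Evaluate sum(d_i * base**i), where digits[0] is the rightmost digit."""
    value = 0
    for i, d in enumerate(digits):
        value += d * base ** i
    return value

# 735 in base 10: digits right-to-left are 5, 3, 7
print(positional_value([5, 3, 7], 10))  # 735
```

The same function handles any base: passing the bits of 1011 (base 2) right-to-left gives 11.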


2. Binary (Base 2)

Why Binary?

Computers use binary because digital circuits have two stable states:

State Meaning
Low voltage (0V) 0 (false)
High voltage (3.3V/5V) 1 (true)

A single binary digit is called a bit. Groups of bits form larger units:

Unit Bits Values
Bit 1 2
Nibble 4 16
Byte 8 256
Word (32-bit) 32 4,294,967,296

32-Bit Registers

In NTLang, we work with 32-bit unsigned integers (uint32_t in C, u32 in Rust):

Bit:  31 30 29 28 ... 3  2  1  0
      MSB                     LSB

MSB = Most Significant Bit (leftmost, highest value)
LSB = Least Significant Bit (rightmost, lowest value)

Powers of 2 Table

Power Value Common Name
\(2^0\) 1
\(2^1\) 2
\(2^2\) 4
\(2^3\) 8
\(2^4\) 16
\(2^7\) 128
\(2^8\) 256
\(2^{10}\) 1,024 1K
\(2^{16}\) 65,536 64K
\(2^{20}\) 1,048,576 1M
\(2^{32}\) 4,294,967,296 4G

Binary-to-Decimal Conversion

Convert 1011 (binary) to decimal:

Bit Position 3 2 1 0
Bit Value 1 0 1 1
Place Value \(2^3=8\) \(2^2=4\) \(2^1=2\) \(2^0=1\)
Contribution 8 0 2 1

\(1011_2 = 8 + 0 + 2 + 1 = 11_{10}\)

Decimal-to-Binary Conversion

Convert 13 to binary using repeated division:

13 / 2 = 6 remainder 1  (LSB)
 6 / 2 = 3 remainder 0
 3 / 2 = 1 remainder 1
 1 / 2 = 0 remainder 1  (MSB)

Read remainders bottom-to-top: 1101

\(13_{10} = 1101_2\)
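The repeated-division procedure above can be sketched in a few lines of Python (an illustrative sketch, not NTLang code):

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # each remainder is the next bit, LSB first
        n //= 2
    return "".join(reversed(bits))  # read remainders bottom-to-top

print(to_binary(13))  # 1101
```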


3. Hexadecimal (Base 16)

Why Hexadecimal?

Binary numbers are long and hard to read. Hexadecimal provides a compact representation because 4 binary digits = 1 hex digit.

Hex Digits

Hex Decimal Binary
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
A 10 1010
B 11 1011
C 12 1100
D 13 1101
E 14 1110
F 15 1111

The 4-Bit Correspondence

Since \(2^4 = 16\), each hex digit maps to exactly 4 bits:

Hex:     0x1A3F
Binary:  0001 1010 0011 1111
         ^^^^ ^^^^ ^^^^ ^^^^
          1    A    3    F

This makes hex-binary conversion trivial — just substitute each hex digit with its 4-bit pattern.
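The digit-by-digit substitution takes one line of Python (a sketch; `format(..., "04b")` zero-pads each digit's value to 4 bits):

```python
def hex_to_binary(s):
    """Substitute each hex digit with its 4-bit binary pattern."""
    return " ".join(format(int(c, 16), "04b") for c in s)

print(hex_to_binary("1A3F"))  # 0001 1010 0011 1111
```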

Hex-to-Decimal Conversion

Convert 0x2B to decimal:

\(2B_{16} = 2 \times 16^1 + 11 \times 16^0 = 32 + 11 = 43_{10}\)

Hex Prefix Conventions

Context Prefix Example
C/Rust 0x 0xFF
NTLang 0x 0x2B
Assembly 0x or $ $FF
HTML/CSS # #00543c

4. Two's Complement

The Problem: Representing Negative Numbers

With \(n\) bits, we can represent \(2^n\) different values. For unsigned integers, that's \(0\) to \(2^n - 1\). But how do we represent negative numbers?

Signed Magnitude (Naive Approach)

Use the MSB as a sign bit: 0 = positive, 1 = negative.

For 4-bit signed magnitude:

Binary Value
0101 +5
1101 -5
0000 +0
1000 -0

Problems with signed magnitude:

  1. Two zeros (+0 and -0) — wastes a bit pattern
  2. Addition is complex — need to check signs and perform add or subtract
  3. Hardware is complicated — separate adder/subtractor circuits needed

Two's Complement Solution

Two's complement is the standard representation used by virtually all modern hardware.

To negate a number: invert all bits, then add 1.

Example: negate 5 in 4 bits:

  5 = 0101
      ↓ invert all bits
      1010
      ↓ add 1
 -5 = 1011

Verify: 0101 + 1011 = 10000 (the carry out of bit 3 is discarded → result is 0000 = 0)
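The invert-and-add-1 rule is easy to check in code. A Python sketch (the `& mask` models the fixed register width and the discarded carry):

```python
def negate(x, bits=4):
    """Two's complement negation: invert all bits, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return (~x + 1) & mask  # the & discards the carry out of the top bit

neg5 = negate(0b0101)
print(format(neg5, "04b"))       # 1011
print((0b0101 + neg5) & 0b1111)  # 0  (5 + (-5) wraps to zero)
```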

Two's Complement Range

For an \(n\)-bit two's complement number:

Property      Formula          4-bit  8-bit  32-bit
Minimum       \(-2^{n-1}\)     -8     -128   -2,147,483,648
Maximum       \(2^{n-1} - 1\)  7      127    2,147,483,647
Total values  \(2^n\)          16     256    4,294,967,296

4-Bit Two's Complement Table

Binary Unsigned Signed (Two's Complement)
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 5
0110 6 6
0111 7 7
1000 8 -8
1001 9 -7
1010 10 -6
1011 11 -5
1100 12 -4
1101 13 -3
1110 14 -2
1111 15 -1

Key Properties of Two's Complement

  1. MSB indicates sign: 0 = non-negative, 1 = negative
  2. One zero: only 0000...0 represents zero
  3. Addition works the same for signed and unsigned — the hardware doesn't need to know
  4. Asymmetric range: one more negative value than positive (e.g., -8 to 7 for 4 bits)

5. Bitwise Operations

Bitwise operations work on individual bits of a value. They are fundamental to systems programming for tasks like masking, flag manipulation, and low-level data processing.

NOT (Bitwise Complement)

Inverts every bit: 0 becomes 1, 1 becomes 0.

NOT 0101 1010
  = 1010 0101

Input  Output
0      1
1      0

AND (Bitwise AND)

Result is 1 only when both bits are 1.

  1100 1010
& 1111 0000
-----------
  1100 0000

A  B  A AND B
0  0  0
0  1  0
1  0  0
1  1  1

Use case — Masking: Extract specific bits by ANDing with a mask.

value:  1011 0110
mask:   0000 1111   (keep lower 4 bits)
result: 0000 0110

OR (Bitwise OR)

Result is 1 when either (or both) bits are 1.

  1100 1010
| 0011 0000
-----------
  1111 1010

A  B  A OR B
0  0  0
0  1  1
1  0  1
1  1  1

Use case — Setting bits: Set specific bits to 1 by ORing with a pattern.

XOR (Bitwise Exclusive OR)

Result is 1 when bits are different.

  1100 1010
^ 1111 0000
-----------
  0011 1010

A  B  A XOR B
0  0  0
0  1  1
1  0  1
1  1  0

Use case — Toggling bits: Flip specific bits by XORing with a mask.

Key property: A XOR A = 0 and A XOR 0 = A
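The three use cases — masking, setting, and toggling — fit in one short Python sketch, using 8-bit patterns:

```python
value = 0b10110110

# Masking: AND keeps only the bits where the mask is 1 (lower 4 bits here)
print(format(value & 0b00001111, "08b"))  # 00000110

# Setting: OR turns on bit 6 without disturbing the others
print(format(value | 0b01000000, "08b"))  # 11110110

# Toggling: XOR flips the lower 4 bits
print(format(value ^ 0b00001111, "08b"))  # 10111001

# Key XOR properties: A XOR A = 0 and A XOR 0 = A
print(value ^ value)       # 0
print(value ^ 0 == value)  # True
```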

Summary of Bitwise Operations

Operation  Symbol            Rule              Use Case
NOT        ~ (C) / ! (Rust)  Invert all bits   Complement
AND        &                 1 only if both 1  Masking (extract bits)
OR         |                 1 if either 1     Setting bits
XOR        ^                 1 if different    Toggling bits

6. Shift Operations

Shift operations move all bits left or right by a specified number of positions.

Logical Shift Left (LSL)

Shifts bits left, filling vacated positions with 0:

Before:  0011 0101  (53)
LSL 2:   1101 0100  (212)

Bits shifted out on the left are discarded.
New bits on the right are 0.

Effect: Multiply by \(2^n\) (where \(n\) is the shift amount).

\(53 \times 2^2 = 53 \times 4 = 212\)

Logical Shift Right (LSR)

Shifts bits right, filling vacated positions with 0:

Before:  1101 0100  (212)
LSR 2:   0011 0101  (53)

Bits shifted out on the right are discarded.
New bits on the left are 0.

Effect: Unsigned divide by \(2^n\).

\(212 \div 4 = 53\)

Arithmetic Shift Right (ASR)

Shifts bits right, but preserves the sign bit (MSB):

Before:  1101 0100  (signed: -44)
ASR 2:   1111 0101  (signed: -11)

New bits on the left copy the original MSB.

Effect: Signed divide by \(2^n\) (rounds toward negative infinity).

Comparison of Shift Types

Operation Fill Bit Use For Equivalent To
LSL 0 (right side) All integers Multiply by \(2^n\)
LSR 0 (left side) Unsigned integers Unsigned divide by \(2^n\)
ASR Sign bit (left side) Signed integers Signed divide by \(2^n\)

Visual Diagram

Logical Shift Left (LSL by 1):
[b31 b30 ... b1 b0]  →  [b30 ... b1 b0 0]
 ↑ lost                              ↑ new 0

Logical Shift Right (LSR by 1):
[b31 b30 ... b1 b0]  →  [0 b31 b30 ... b1]
            lost ↑        ↑ new 0

Arithmetic Shift Right (ASR by 1):
[b31 b30 ... b1 b0]  →  [b31 b31 b30 ... b1]
            lost ↑        ↑ copy of MSB
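All three shifts can be sketched in Python. Python integers are unbounded, so a mask models the 8-bit register; for ASR we reinterpret the pattern as signed first (Python's >> on a negative int rounds toward negative infinity, which is exactly arithmetic shift):

```python
MASK8 = 0xFF

def lsl(x, n):
    """Logical shift left in 8 bits; bits shifted past bit 7 are discarded."""
    return (x << n) & MASK8

def lsr(x, n):
    """Logical shift right; vacated positions fill with 0."""
    return (x & MASK8) >> n

def asr(x, n, bits=8):
    """Arithmetic shift right: copies the sign bit into vacated positions."""
    if x & (1 << (bits - 1)):  # MSB set: negative in two's complement
        x -= 1 << bits         # reinterpret the bit pattern as signed
    return (x >> n) & MASK8    # >> on a negative int rounds toward -inf

print(lsl(0b00110101, 2))                 # 212  (53 * 4)
print(lsr(0b11010100, 2))                 # 53   (212 / 4)
print(format(asr(0b11010100, 2), "08b"))  # 11110101  (-44 / 4 = -11)
```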

7. Conversion Algorithms

Converting between string representations and integer values is a core operation in NTLang. Here we describe the algorithms at a conceptual level.

String-to-Integer: Left-to-Right Algorithm

To convert a string like "1A3" (hex) to an integer:

Algorithm: Process digits left-to-right, accumulating the value.

value = 0
for each character c in the string:
    digit = char_to_digit(c)
    value = value * base + digit

Example: Convert "1A3" in base 16:

Step  Character  Digit  Computation           Value
1     1          1      \(0 \times 16 + 1\)   1
2     A          10     \(1 \times 16 + 10\)  26
3     3          3      \(26 \times 16 + 3\)  419

Result: 0x1A3 = 419

Character-to-Digit Mapping

Converting a character to its numeric value:

Character Range Digit Value
'0' - '9' c - '0' (0-9)
'a' - 'f' c - 'a' + 10 (10-15)
'A' - 'F' c - 'A' + 10 (10-15)
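The two pieces — character-to-digit mapping and left-to-right accumulation — combine into a short Python sketch of the algorithm (illustrative, not the NTLang implementation):

```python
def char_to_digit(c):
    """Map '0'-'9', 'a'-'f', 'A'-'F' to the values 0-15."""
    if '0' <= c <= '9':
        return ord(c) - ord('0')
    if 'a' <= c <= 'f':
        return ord(c) - ord('a') + 10
    if 'A' <= c <= 'F':
        return ord(c) - ord('A') + 10
    raise ValueError(f"not a valid digit: {c!r}")

def string_to_int(s, base):
    """Left-to-right accumulation: value = value * base + digit."""
    value = 0
    for c in s:
        value = value * base + char_to_digit(c)
    return value

print(string_to_int("1A3", 16))  # 419
```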

Integer-to-String: Division/Modulo Algorithm

To convert an integer to a string representation:

Algorithm: Repeatedly divide by the base, collecting remainders (which are the digits in reverse order).

digits = empty list
while value > 0:
    remainder = value % base
    digits.append(digit_to_char(remainder))
    value = value / base
reverse(digits)

Example: Convert 419 to hex:

Step  Value  Value % 16  Digit  Value / 16
1     419    3           3      26
2     26     10          A      1
3     1      1           1      0

Reversed digits: 1A3 → Result: "0x1A3"

Digit-to-Character Mapping

Digit Value Character
0-9 '0' + digit
10-15 'a' + (digit - 10)
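And the reverse direction, mirroring the division/modulo pseudocode above (a Python sketch; it emits lowercase hex digits, matching the mapping table):

```python
def digit_to_char(d):
    """Map 0-9 to '0'-'9' and 10-15 to 'a'-'f'."""
    return chr(ord('0') + d) if d < 10 else chr(ord('a') + d - 10)

def int_to_string(value, base):
    """Repeated division/modulo; remainders are the digits in reverse order."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(digit_to_char(value % base))
        value //= base
    return "".join(reversed(digits))

print(int_to_string(419, 16))  # 1a3
```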

8. Bit Width and Masking

The Width Parameter

In NTLang, the -w flag specifies a bit width. This controls how many bits are significant in the output:

ntlang -w 4 "15"    → output shows only lower 4 bits
ntlang -w 8 "255"   → output shows only lower 8 bits
ntlang -w 32 "255"  → output shows all 32 bits (default)

Computing a Width Mask

A mask isolates the lower \(w\) bits of a value:

\[mask = (1 << w) - 1\]

Examples:

Width  Mask Computation   Mask (Binary)        Mask (Hex)
4      \((1 << 4) - 1\)   0000 1111            0x0F
8      \((1 << 8) - 1\)   1111 1111            0xFF
16     \((1 << 16) - 1\)  1111 1111 1111 1111  0xFFFF

Special case: when width = 32, computing (1 << 32) shifts a 32-bit value by its full width — undefined behavior in C, and an overflow (a panic in debug builds) in Rust — so we handle this case separately by returning 0xFFFFFFFF directly.
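A Python sketch of the mask computation, including the width-32 special case (Python's integers are unbounded, so the guard here models the 32-bit limitation rather than avoiding a real overflow):

```python
def width_mask(w):
    """Mask isolating the lower w bits; w = 32 is handled separately
    because shifting a 32-bit value by 32 is undefined behavior in C."""
    if w == 32:
        return 0xFFFFFFFF
    return (1 << w) - 1

print(hex(width_mask(4)))   # 0xf
print(hex(width_mask(8)))   # 0xff
print(hex(width_mask(32)))  # 0xffffffff
```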

Applying a Mask

To extract the lower \(w\) bits:

masked_value = value & mask

Example: Extract lower 4 bits of 0xAB:

  1010 1011   (0xAB)
& 0000 1111   (mask for width 4)
-----------
  0000 1011   (0x0B)

Sign Extension

When displaying a masked value as a signed number, the MSB of the masked value (bit \(w-1\)) determines the sign. If the MSB is 1, we need to sign-extend — fill the upper bits with 1 to maintain the negative value in the full 32-bit representation.

Width 4, value = 0xB (1011 in 4 bits):

MSB of 4-bit value is 1 (bit 3) → negative in two's complement

Sign extend to 32 bits:
  0000 0000 0000 0000 0000 0000 0000 1011  (before)
  1111 1111 1111 1111 1111 1111 1111 1011  (after sign extension)
  = 0xFFFFFFFB = -5 in 32-bit two's complement
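Sign extension as a Python sketch (it returns the 32-bit pattern as an unsigned value, the way it would sit in a register):

```python
def sign_extend(value, w):
    """Extend a w-bit two's complement value to a 32-bit pattern."""
    if value & (1 << (w - 1)):                # MSB of the w-bit value is 1
        value |= 0xFFFFFFFF ^ ((1 << w) - 1)  # fill the upper bits with 1s
    return value

print(hex(sign_extend(0xB, 4)))  # 0xfffffffb
print(hex(sign_extend(0x5, 4)))  # 0x5
```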

Key Concepts

Concept Description
Positional notation Value determined by digit position and base
Binary (base 2) Two digits (0, 1); foundation of digital computing
Hexadecimal (base 16) Compact binary representation; 4 bits per hex digit
Two's complement Standard signed integer representation; negate = invert + add 1
NOT Invert all bits
AND Masking — extract specific bits
OR Setting — turn on specific bits
XOR Toggling — flip specific bits
LSL Shift left, multiply by \(2^n\)
LSR Shift right (unsigned), divide by \(2^n\)
ASR Shift right (signed), preserves sign bit
String-to-integer Left-to-right: val = val * base + digit
Integer-to-string Division/modulo: extract digits in reverse
Width mask \((1 << w) - 1\); isolates lower \(w\) bits
Sign extension Fill upper bits with sign bit for signed interpretation

Practice Problems

Problem 1: Binary-to-Decimal

Convert 1100 1010 (binary) to decimal.

Solution:
1100 1010

Position:  7    6    5    4    3    2    1    0
Bit:       1    1    0    0    1    0    1    0
Value:     128  64   0    0    8    0    2    0

Sum: 128 + 64 + 8 + 2 = 202

Problem 2: Hex-to-Binary

Convert 0xDEAD to binary.

Solution:
D    E    A    D
1101 1110 1010 1101

0xDEAD = 1101 1110 1010 1101
       = 57005 in decimal

Problem 3: Two's Complement Negation

Negate 6 in 8-bit two's complement.

Solution:
 6 = 0000 0110

Step 1 - Invert all bits:
     1111 1001

Step 2 - Add 1:
     1111 1001
   + 0000 0001
   -----------
     1111 1010

-6 = 1111 1010 = 0xFA

Verify: 0000 0110 + 1111 1010 = 1 0000 0000 (carry discarded → 0) ✓

Problem 4: Bitwise Operations

Given A = 0xCA and B = 0xF0, compute:

  1. A & B
  2. A | B
  3. A ^ B
  4. ~A (8-bit)
Solution:
A = 0xCA = 1100 1010
B = 0xF0 = 1111 0000

1. A & B:
   1100 1010
 & 1111 0000
 -----------
   1100 0000 = 0xC0

2. A | B:
   1100 1010
 | 1111 0000
 -----------
   1111 1010 = 0xFA

3. A ^ B:
   1100 1010
 ^ 1111 0000
 -----------
   0011 1010 = 0x3A

4. ~A:
   ~1100 1010
 = 0011 0101 = 0x35

Problem 5: String-to-Integer Algorithm

Trace the left-to-right algorithm to convert "B2" from hex to decimal.

Solution:
String: "B2", base = 16

Step 1: value = 0
Step 2: c = 'B', digit = 11
        value = 0 * 16 + 11 = 11
Step 3: c = '2', digit = 2
        value = 11 * 16 + 2 = 178

Result: 0xB2 = 178

Verify: 11 * 16 + 2 = 176 + 2 = 178 ✓

Problem 6: Width Mask

What is the mask for width 6? Apply it to the value 0xFF.

Solution:
mask = (1 << 6) - 1
     = 64 - 1
     = 63
     = 0x3F
     = 0011 1111

Apply to 0xFF:
  1111 1111  (0xFF)
& 0011 1111  (mask)
-----------
  0011 1111  (0x3F = 63)

The mask keeps only the lower 6 bits.

Summary

  1. Positional notation represents values using digits weighted by powers of the base. This generalizes from familiar decimal to binary, octal, hex, and any base.

  2. Binary (base 2) is the native language of digital hardware. Each bit represents a power of 2, and 32-bit registers can hold values from 0 to \(2^{32} - 1\) (unsigned).

  3. Hexadecimal (base 16) provides a compact way to write binary, with each hex digit mapping to exactly 4 bits.

  4. Two's complement is the standard for signed integers: negate by inverting all bits and adding 1. It gives a single zero, simple addition hardware, and a range of \(-2^{n-1}\) to \(2^{n-1} - 1\).

  5. Bitwise operations (NOT, AND, OR, XOR) manipulate individual bits and are used for masking, setting, clearing, and toggling bits.

  6. Shift operations (LSL, LSR, ASR) move bits left or right, effectively multiplying or dividing by powers of 2. ASR preserves the sign for signed values.

  7. Conversion algorithms use left-to-right accumulation (string-to-integer) and division/modulo (integer-to-string) to convert between representations.

  8. Bit width and masking allow working with specific bit ranges using the formula \(mask = (1 << w) - 1\), with sign extension needed for signed interpretation of narrow values.