Binary Representation and Number Bases¶
Overview¶
This lecture covers the fundamental concepts of binary representation and number bases that underlie all computing. We examine positional notation, binary and hexadecimal number systems, two's complement for signed integers, bitwise operations, shift operations, and the algorithms for converting between string representations and integer values. These concepts are language-agnostic and form the foundation for the C and Rust implementations covered in the next lecture.
Learning Objectives¶
- Understand positional notation and its application to any base
- Convert between binary, decimal, and hexadecimal representations
- Apply two's complement for signed integer representation
- Use bitwise operations (NOT, AND, OR, XOR) for bit manipulation
- Understand shift operations and their relationship to multiplication/division
- Describe algorithms for string-to-integer and integer-to-string conversion
- Apply bit masking and width concepts
Prerequisites¶
- Basic arithmetic and algebra
- Understanding of powers of 2
- Familiarity with C or Rust programming basics
1. Positional Notation¶
The General Formula¶
In a positional number system, the value of a number is determined by the position of each digit and the base of the system:
where \(d_i\) is the digit at position \(i\) (counting from the right, starting at 0).
Decimal (Base 10) — The Familiar Example¶
We use 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
Example: 735 in base 10:
| Position | Digit | Place Value | Contribution |
|---|---|---|---|
| 2 | 7 | \(10^2 = 100\) | 700 |
| 1 | 3 | \(10^1 = 10\) | 30 |
| 0 | 5 | \(10^0 = 1\) | 5 |
\(735 = 7 \times 100 + 3 \times 10 + 5 \times 1 = 735\)
Generalization to Any Base¶
The same principle applies to any base \(b\):
- Base 2 (binary): digits are
0, 1 - Base 8 (octal): digits are
0-7 - Base 10 (decimal): digits are
0-9 - Base 16 (hexadecimal): digits are
0-9, A-F
The number of distinct digits in a base-\(b\) system is always \(b\).
2. Binary (Base 2)¶
Why Binary?¶
Computers use binary because digital circuits have two stable states:
| State | Meaning |
|---|---|
| Low voltage (0V) | 0 (false) |
| High voltage (3.3V/5V) | 1 (true) |
A single binary digit is called a bit. Groups of bits form larger units:
| Unit | Bits | Values |
|---|---|---|
| Bit | 1 | 2 |
| Nibble | 4 | 16 |
| Byte | 8 | 256 |
| Word (32-bit) | 32 | 4,294,967,296 |
32-Bit Registers¶
In NTLang, we work with 32-bit unsigned integers (uint32_t in C, u32 in Rust):
Bit: 31 30 29 28 ... 3 2 1 0
MSB LSB
MSB = Most Significant Bit (leftmost, highest value)
LSB = Least Significant Bit (rightmost, lowest value)
Powers of 2 Table¶
| Power | Value | Common Name |
|---|---|---|
| \(2^0\) | 1 | |
| \(2^1\) | 2 | |
| \(2^2\) | 4 | |
| \(2^3\) | 8 | |
| \(2^4\) | 16 | |
| \(2^7\) | 128 | |
| \(2^8\) | 256 | |
| \(2^{10}\) | 1,024 | 1K |
| \(2^{16}\) | 65,536 | 64K |
| \(2^{20}\) | 1,048,576 | 1M |
| \(2^{32}\) | 4,294,967,296 | 4G |
Binary-to-Decimal Conversion¶
Convert 1011 (binary) to decimal:
| Bit Position | 3 | 2 | 1 | 0 |
|---|---|---|---|---|
| Bit Value | 1 | 0 | 1 | 1 |
| Place Value | \(2^3=8\) | \(2^2=4\) | \(2^1=2\) | \(2^0=1\) |
| Contribution | 8 | 0 | 2 | 1 |
\(1011_2 = 8 + 0 + 2 + 1 = 11_{10}\)
Decimal-to-Binary Conversion¶
Convert 13 to binary using repeated division:
13 / 2 = 6 remainder 1 (LSB)
6 / 2 = 3 remainder 0
3 / 2 = 1 remainder 1
1 / 2 = 0 remainder 1 (MSB)
Read remainders bottom-to-top: 1101
\(13_{10} = 1101_2\)
3. Hexadecimal (Base 16)¶
Why Hexadecimal?¶
Binary numbers are long and hard to read. Hexadecimal provides a compact representation because 4 binary digits = 1 hex digit.
Hex Digits¶
| Hex | Decimal | Binary |
|---|---|---|
| 0 | 0 | 0000 |
| 1 | 1 | 0001 |
| 2 | 2 | 0010 |
| 3 | 3 | 0011 |
| 4 | 4 | 0100 |
| 5 | 5 | 0101 |
| 6 | 6 | 0110 |
| 7 | 7 | 0111 |
| 8 | 8 | 1000 |
| 9 | 9 | 1001 |
| A | 10 | 1010 |
| B | 11 | 1011 |
| C | 12 | 1100 |
| D | 13 | 1101 |
| E | 14 | 1110 |
| F | 15 | 1111 |
The 4-Bit Correspondence¶
Since \(2^4 = 16\), each hex digit maps to exactly 4 bits:
This makes hex-binary conversion trivial — just substitute each hex digit with its 4-bit pattern.
Hex-to-Decimal Conversion¶
Convert 0x2B to decimal:
\(2B_{16} = 2 \times 16^1 + 11 \times 16^0 = 32 + 11 = 43_{10}\)
Hex Prefix Conventions¶
| Context | Prefix | Example |
|---|---|---|
| C/Rust | 0x |
0xFF |
| NTLang | 0x |
0x2B |
| Assembly | 0x or $ |
$FF |
| HTML/CSS | # |
#00543c |
4. Two's Complement¶
The Problem: Representing Negative Numbers¶
With \(n\) bits, we can represent \(2^n\) different values. For unsigned integers, that's \(0\) to \(2^n - 1\). But how do we represent negative numbers?
Signed Magnitude (Naive Approach)¶
Use the MSB as a sign bit: 0 = positive, 1 = negative.
For 4-bit signed magnitude:
| Binary | Value |
|---|---|
0101 |
+5 |
1101 |
-5 |
0000 |
+0 |
1000 |
-0 |
Problems with signed magnitude:
- Two zeros (
+0and-0) — wastes a bit pattern - Addition is complex — need to check signs and perform add or subtract
- Hardware is complicated — separate adder/subtractor circuits needed
Two's Complement Solution¶
Two's complement is the standard representation used by virtually all modern hardware.
To negate a number: invert all bits, then add 1.
Example: negate 5 in 4 bits:
Verify: 0101 + 1011 = 10000 (the carry out of bit 3 is discarded → result is 0000 = 0)
Two's Complement Range¶
For an \(n\)-bit two's complement number:
| Property | Formula | 4-bit Example | 8-bit Example | 32-bit Example |
|---|---|---|---|---|
| Minimum | \(-2^{n-1}\) | -8 | -128 | -2,147,483,648 |
| Maximum | \(2^{n-1} - 1\) | 7 | 127 | 2,147,483,647 |
| Total values | \(2^n\) | 16 | 256 | 4,294,967,296 |
4-Bit Two's Complement Table¶
| Binary | Unsigned | Signed (Two's Complement) |
|---|---|---|
0000 |
0 | 0 |
0001 |
1 | 1 |
0010 |
2 | 2 |
0011 |
3 | 3 |
0100 |
4 | 4 |
0101 |
5 | 5 |
0110 |
6 | 6 |
0111 |
7 | 7 |
1000 |
8 | -8 |
1001 |
9 | -7 |
1010 |
10 | -6 |
1011 |
11 | -5 |
1100 |
12 | -4 |
1101 |
13 | -3 |
1110 |
14 | -2 |
1111 |
15 | -1 |
Key Properties of Two's Complement¶
- MSB indicates sign:
0= non-negative,1= negative - One zero: only
0000...0represents zero - Addition works the same for signed and unsigned — the hardware doesn't need to know
- Asymmetric range: one more negative value than positive (e.g., -8 to 7 for 4 bits)
5. Bitwise Operations¶
Bitwise operations work on individual bits of a value. They are fundamental to systems programming for tasks like masking, flag manipulation, and low-level data processing.
NOT (Bitwise Complement)¶
Inverts every bit: 0 becomes 1, 1 becomes 0.
| Input | Output |
|---|---|
| 0 | 1 |
| 1 | 0 |
AND (Bitwise AND)¶
Result is 1 only when both bits are 1.
| A | B | A AND B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |
Use case — Masking: Extract specific bits by ANDing with a mask.
OR (Bitwise OR)¶
Result is 1 when either (or both) bits are 1.
| A | B | A OR B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |
Use case — Setting bits: Set specific bits to 1 by ORing with a pattern.
XOR (Bitwise Exclusive OR)¶
Result is 1 when bits are different.
| A | B | A XOR B |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
Use case — Toggling bits: Flip specific bits by XORing with a mask.
Key property: A XOR A = 0 and A XOR 0 = A
Summary of Bitwise Operations¶
| Operation | Symbol | Rule | Use Case |
|---|---|---|---|
| NOT | ~ / ! |
Invert all bits | Complement |
| AND | & |
1 only if both 1 | Masking (extract bits) |
| OR | \| |
1 if either 1 | Setting bits |
| XOR | ^ |
1 if different | Toggling bits |
6. Shift Operations¶
Shift operations move all bits left or right by a specified number of positions.
Logical Shift Left (LSL)¶
Shifts bits left, filling vacated positions with 0:
Before: 0011 0101 (53)
LSL 2: 1101 0100 (212)
Bits shifted out on the left are discarded.
New bits on the right are 0.
Effect: Multiply by \(2^n\) (where \(n\) is the shift amount).
\(53 \times 2^2 = 53 \times 4 = 212\)
Logical Shift Right (LSR)¶
Shifts bits right, filling vacated positions with 0:
Before: 1101 0100 (212)
LSR 2: 0011 0101 (53)
Bits shifted out on the right are discarded.
New bits on the left are 0.
Effect: Unsigned divide by \(2^n\).
\(212 \div 4 = 53\)
Arithmetic Shift Right (ASR)¶
Shifts bits right, but preserves the sign bit (MSB):
Before: 1101 0100 (signed: -44)
ASR 2: 1111 0101 (signed: -11)
New bits on the left copy the original MSB.
Effect: Signed divide by \(2^n\) (rounds toward negative infinity).
Comparison of Shift Types¶
| Operation | Fill Bit | Use For | Equivalent To |
|---|---|---|---|
| LSL | 0 (right side) | All integers | Multiply by \(2^n\) |
| LSR | 0 (left side) | Unsigned integers | Unsigned divide by \(2^n\) |
| ASR | Sign bit (left side) | Signed integers | Signed divide by \(2^n\) |
Visual Diagram¶
Logical Shift Left (LSL by 1):
[b31 b30 ... b1 b0] → [b30 ... b1 b0 0]
↑ lost ↑ new 0
Logical Shift Right (LSR by 1):
[b31 b30 ... b1 b0] → [0 b31 b30 ... b1]
lost ↑ ↑ new 0
Arithmetic Shift Right (ASR by 1):
[b31 b30 ... b1 b0] → [b31 b31 b30 ... b1]
lost ↑ ↑ copy of MSB
7. Conversion Algorithms¶
Converting between string representations and integer values is a core operation in NTLang. Here we describe the algorithms at a conceptual level.
String-to-Integer: Left-to-Right Algorithm¶
To convert a string like "1A3" (hex) to an integer:
Algorithm: Process digits left-to-right, accumulating the value.
Example: Convert "1A3" in base 16:
| Step | Character | Digit | Computation | Value |
|---|---|---|---|---|
| 1 | 1 |
1 | \(0 \times 16 + 1\) | 1 |
| 2 | A |
10 | \(1 \times 16 + 10\) | 26 |
| 3 | 3 |
3 | \(26 \times 16 + 3\) | 419 |
Result: 0x1A3 = 419
Character-to-Digit Mapping¶
Converting a character to its numeric value:
| Character Range | Digit Value |
|---|---|
'0' - '9' |
c - '0' (0-9) |
'a' - 'f' |
c - 'a' + 10 (10-15) |
'A' - 'F' |
c - 'A' + 10 (10-15) |
Integer-to-String: Division/Modulo Algorithm¶
To convert an integer to a string representation:
Algorithm: Repeatedly divide by the base, collecting remainders (which are the digits in reverse order).
digits = empty list
while value > 0:
remainder = value % base
digits.append(digit_to_char(remainder))
value = value / base
reverse(digits)
Example: Convert 419 to hex:
| Step | Value | Value % 16 | Digit | Value / 16 |
|---|---|---|---|---|
| 1 | 419 | 3 | 3 |
26 |
| 2 | 26 | 10 | A |
1 |
| 3 | 1 | 1 | 1 |
0 |
Reversed digits: 1A3 → Result: "0x1A3"
Digit-to-Character Mapping¶
| Digit Value | Character |
|---|---|
| 0-9 | '0' + digit |
| 10-15 | 'a' + (digit - 10) |
8. Bit Width and Masking¶
The Width Parameter¶
In NTLang, the -w flag specifies a bit width. This controls how many bits are significant in the output:
ntlang -w 4 "15" → output shows only lower 4 bits
ntlang -w 8 "255" → output shows only lower 8 bits
ntlang -w 32 "255" → output shows all 32 bits (default)
Computing a Width Mask¶
A mask isolates the lower \(w\) bits of a value:
Examples:
| Width | Mask Computation | Mask (Binary) | Mask (Hex) |
|---|---|---|---|
| 4 | \((1 << 4) - 1\) | 0000 1111 |
0x0F |
| 8 | \((1 << 8) - 1\) | 1111 1111 |
0xFF |
| 16 | \((1 << 16) - 1\) | 0000...1111 1111 1111 1111 |
0xFFFF |
Special case: When width = 32, shifting by 32 causes undefined behavior (in C) or would overflow, so we handle it separately by returning 0xFFFFFFFF.
Applying a Mask¶
To extract the lower \(w\) bits:
Example: Extract lower 4 bits of 0xAB:
Sign Extension¶
When displaying a masked value as a signed number, the MSB of the masked value (bit \(w-1\)) determines the sign. If the MSB is 1, we need to sign-extend — fill the upper bits with 1 to maintain the negative value in the full 32-bit representation.
Width 4, value = 0xB (1011 in 4 bits):
MSB of 4-bit value is 1 (bit 3) → negative in two's complement
Sign extend to 32 bits:
0000 0000 0000 0000 0000 0000 0000 1011 (before)
1111 1111 1111 1111 1111 1111 1111 1011 (after sign extension)
= 0xFFFFFFFB = -5 in 32-bit two's complement
Key Concepts¶
| Concept | Description |
|---|---|
| Positional notation | Value determined by digit position and base |
| Binary (base 2) | Two digits (0, 1); foundation of digital computing |
| Hexadecimal (base 16) | Compact binary representation; 4 bits per hex digit |
| Two's complement | Standard signed integer representation; negate = invert + add 1 |
| NOT | Invert all bits |
| AND | Masking — extract specific bits |
| OR | Setting — turn on specific bits |
| XOR | Toggling — flip specific bits |
| LSL | Shift left, multiply by \(2^n\) |
| LSR | Shift right (unsigned), divide by \(2^n\) |
| ASR | Shift right (signed), preserves sign bit |
| String-to-integer | Left-to-right: val = val * base + digit |
| Integer-to-string | Division/modulo: extract digits in reverse |
| Width mask | \((1 << w) - 1\); isolates lower \(w\) bits |
| Sign extension | Fill upper bits with sign bit for signed interpretation |
Practice Problems¶
Problem 1: Binary-to-Decimal¶
Convert 1100 1010 (binary) to decimal.
Click to reveal solution
Problem 2: Hex-to-Binary¶
Convert 0xDEAD to binary.
Click to reveal solution
Problem 3: Two's Complement Negation¶
Negate 6 in 8-bit two's complement.
Click to reveal solution
Problem 4: Bitwise Operations¶
Given A = 0xCA and B = 0xF0, compute:
A & BA | BA ^ B~A(8-bit)
Click to reveal solution
Problem 5: String-to-Integer Algorithm¶
Trace the left-to-right algorithm to convert "B2" from hex to decimal.
Click to reveal solution
Problem 6: Width Mask¶
What is the mask for width 6? Apply it to the value 0xFF.
Click to reveal solution
Summary¶
-
Positional notation represents values using digits weighted by powers of the base. This generalizes from familiar decimal to binary, octal, hex, and any base.
-
Binary (base 2) is the native language of digital hardware. Each bit represents a power of 2, and 32-bit registers can hold values from 0 to \(2^{32} - 1\) (unsigned).
-
Hexadecimal (base 16) provides a compact way to write binary, with each hex digit mapping to exactly 4 bits.
-
Two's complement is the standard for signed integers: negate by inverting all bits and adding 1. It gives a single zero, simple addition hardware, and a range of \(-2^{n-1}\) to \(2^{n-1} - 1\).
-
Bitwise operations (NOT, AND, OR, XOR) manipulate individual bits and are used for masking, setting, clearing, and toggling bits.
-
Shift operations (LSL, LSR, ASR) move bits left or right, effectively multiplying or dividing by powers of 2. ASR preserves the sign for signed values.
-
Conversion algorithms use left-to-right accumulation (string-to-integer) and division/modulo (integer-to-string) to convert between representations.
-
Bit width and masking allow working with specific bit ranges using the formula \(mask = (1 << w) - 1\), with sign extension needed for signed interpretation of narrow values.