
Binary Representation and Number Bases

Overview

This lecture covers the fundamental concepts of binary representation and number bases that underlie all computing. We examine positional notation, binary and hexadecimal number systems, two's complement for signed integers, bitwise operations, shift operations, and the algorithms for converting between string representations and integer values. These concepts are language-agnostic and form the foundation for the C and Rust implementations covered in the next lecture.

Learning Objectives

  • Understand positional notation and its application to any base
  • Convert between binary, decimal, and hexadecimal representations
  • Apply two's complement for signed integer representation
  • Use bitwise operations (NOT, AND, OR, XOR) for bit manipulation
  • Understand shift operations and their relationship to multiplication/division
  • Describe algorithms for string-to-integer and integer-to-string conversion
  • Apply bit masking and width concepts

Prerequisites

  • Basic arithmetic and algebra
  • Understanding of powers of 2
  • Familiarity with C or Rust programming basics

1. Positional Notation

The General Formula

In a positional number system, the value of a number is determined by the position of each digit and the base of the system:

\[value = \sum_{i=0}^{n-1} d_i \times base^i\]

where \(d_i\) is the digit at position \(i\) (counting from the right, starting at 0).

Decimal (Base 10) — The Familiar Example

We use 10 digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Example: 735 in base 10:

Position  Digit  Place Value       Contribution
2         7      \(10^2 = 100\)    700
1         3      \(10^1 = 10\)     30
0         5      \(10^0 = 1\)      5

\(735 = 7 \times 100 + 3 \times 10 + 5 \times 1 = 735\)

Generalization to Any Base

The same principle applies to any base \(b\):

  • Base 2 (binary): digits are 0, 1
  • Base 8 (octal): digits are 0-7
  • Base 10 (decimal): digits are 0-9
  • Base 16 (hexadecimal): digits are 0-9, A-F

The number of distinct digits in a base-\(b\) system is always \(b\).
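To make the formula concrete, here is a minimal Python sketch (illustrative only, not NTLang code) that evaluates the sum directly:

```python
def positional_value(digits, base):
    """Evaluate sum(d_i * base**i), where digits[0] is the rightmost digit."""
    value = 0
    for i, d in enumerate(digits):
        value += d * base ** i
    return value

# 735 in base 10: digits right-to-left are 5, 3, 7
print(positional_value([5, 3, 7], 10))  # 735
```

The same function handles any base: passing the bits of 1011 (base 2) right-to-left gives 11.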


2. Binary (Base 2)

Why Binary?

Computers use binary because digital circuits have two stable states:

State Meaning
Low voltage (0V) 0 (false)
High voltage (3.3V/5V) 1 (true)

A single binary digit is called a bit. Groups of bits form larger units:

Unit Bits Values
Bit 1 2
Nibble 4 16
Byte 8 256
Word (32-bit) 32 4,294,967,296

32-Bit Registers

In NTLang, we work with 32-bit unsigned integers (uint32_t in C, u32 in Rust):

Bit:  31 30 29 28 ... 3  2  1  0
      MSB                     LSB

MSB = Most Significant Bit (leftmost, highest value)
LSB = Least Significant Bit (rightmost, lowest value)

Powers of 2 Table

Power Value Common Name
\(2^0\) 1
\(2^1\) 2
\(2^2\) 4
\(2^3\) 8
\(2^4\) 16
\(2^7\) 128
\(2^8\) 256
\(2^{10}\) 1,024 1K
\(2^{16}\) 65,536 64K
\(2^{20}\) 1,048,576 1M
\(2^{32}\) 4,294,967,296 4G

Binary-to-Decimal Conversion

Convert 1011 (binary) to decimal:

Bit Position 3 2 1 0
Bit Value 1 0 1 1
Place Value \(2^3=8\) \(2^2=4\) \(2^1=2\) \(2^0=1\)
Contribution 8 0 2 1

\(1011_2 = 8 + 0 + 2 + 1 = 11_{10}\)

Decimal-to-Binary Conversion

Convert 13 to binary using repeated division:

13 / 2 = 6 remainder 1  (LSB)
 6 / 2 = 3 remainder 0
 3 / 2 = 1 remainder 1
 1 / 2 = 0 remainder 1  (MSB)

Read remainders bottom-to-top: 1101

\(13_{10} = 1101_2\)
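The repeated-division procedure above can be sketched in a few lines of Python (an illustrative sketch, not NTLang code):

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by repeated division."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # each remainder is the next bit, LSB first
        n //= 2
    return "".join(reversed(bits))  # read remainders bottom-to-top

print(to_binary(13))  # 1101
```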


3. Hexadecimal (Base 16)

Why Hexadecimal?

Binary numbers are long and hard to read. Hexadecimal provides a compact representation because 4 binary digits = 1 hex digit.

Hex Digits

Hex Decimal Binary
0 0 0000
1 1 0001
2 2 0010
3 3 0011
4 4 0100
5 5 0101
6 6 0110
7 7 0111
8 8 1000
9 9 1001
A 10 1010
B 11 1011
C 12 1100
D 13 1101
E 14 1110
F 15 1111

The 4-Bit Correspondence

Since \(2^4 = 16\), each hex digit maps to exactly 4 bits:

Hex:     0x1A3F
Binary:  0001 1010 0011 1111
         ^^^^ ^^^^ ^^^^ ^^^^
          1    A    3    F

This makes hex-binary conversion trivial — just substitute each hex digit with its 4-bit pattern.
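The digit-by-digit substitution takes one line of Python (a sketch; `format(..., "04b")` zero-pads each digit's value to 4 bits):

```python
def hex_to_binary(s):
    """Substitute each hex digit with its 4-bit binary pattern."""
    return " ".join(format(int(c, 16), "04b") for c in s)

print(hex_to_binary("1A3F"))  # 0001 1010 0011 1111
```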

Hex-to-Decimal Conversion

Convert 0x2B to decimal:

\(2B_{16} = 2 \times 16^1 + 11 \times 16^0 = 32 + 11 = 43_{10}\)

Hex Prefix Conventions

Context Prefix Example
C/Rust 0x 0xFF
NTLang 0x 0x2B
Assembly 0x or $ $FF
HTML/CSS # #00543c

4. Two's Complement

The Problem: Representing Negative Numbers

With \(n\) bits, we can represent \(2^n\) different values. For unsigned integers, that's \(0\) to \(2^n - 1\). But how do we represent negative numbers?

Signed Magnitude (Naive Approach)

Use the MSB as a sign bit: 0 = positive, 1 = negative.

For 4-bit signed magnitude:

Binary Value
0101 +5
1101 -5
0000 +0
1000 -0

Problems with signed magnitude:

  1. Two zeros (+0 and -0) — wastes a bit pattern
  2. Addition is complex — need to check signs and perform add or subtract
  3. Hardware is complicated — separate adder/subtractor circuits needed

Two's Complement Solution

Two's complement is the standard representation used by virtually all modern hardware.

To negate a number: invert all bits, then add 1.

Example: negate 5 in 4 bits:

  5 = 0101
      ↓ invert all bits
      1010
      ↓ add 1
 -5 = 1011

Verify: 0101 + 1011 = 10000 (the carry out of bit 3 is discarded → result is 0000 = 0)
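The invert-and-add-1 rule is easy to check in code. A Python sketch (the `& mask` models the fixed register width and the discarded carry):

```python
def negate(x, bits=4):
    """Two's complement negation: invert all bits, add 1, keep `bits` bits."""
    mask = (1 << bits) - 1
    return (~x + 1) & mask  # the & discards the carry out of the top bit

neg5 = negate(0b0101)
print(format(neg5, "04b"))       # 1011
print((0b0101 + neg5) & 0b1111)  # 0  (5 + (-5) wraps to zero)
```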

Two's Complement Range

For an \(n\)-bit two's complement number:

Property      Formula          4-bit  8-bit  32-bit
Minimum       \(-2^{n-1}\)     -8     -128   -2,147,483,648
Maximum       \(2^{n-1} - 1\)  7      127    2,147,483,647
Total values  \(2^n\)          16     256    4,294,967,296

4-Bit Two's Complement Table

Binary Unsigned Signed (Two's Complement)
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 5
0110 6 6
0111 7 7
1000 8 -8
1001 9 -7
1010 10 -6
1011 11 -5
1100 12 -4
1101 13 -3
1110 14 -2
1111 15 -1

Key Properties of Two's Complement

  1. MSB indicates sign: 0 = non-negative, 1 = negative
  2. One zero: only 0000...0 represents zero
  3. Addition works the same for signed and unsigned — the hardware doesn't need to know
  4. Asymmetric range: one more negative value than positive (e.g., -8 to 7 for 4 bits)

5. Bitwise Operations

Bitwise operations work on individual bits of a value. They are fundamental to systems programming for tasks like masking, flag manipulation, and low-level data processing.

NOT (Bitwise Complement)

Inverts every bit: 0 becomes 1, 1 becomes 0.

NOT 0101 1010
  = 1010 0101

Input  Output
0      1
1      0

AND (Bitwise AND)

Result is 1 only when both bits are 1.

  1100 1010
& 1111 0000
-----------
  1100 0000

A  B  A AND B
0  0  0
0  1  0
1  0  0
1  1  1

Use case — Masking: Extract specific bits by ANDing with a mask.

value:  1011 0110
mask:   0000 1111   (keep lower 4 bits)
result: 0000 0110

OR (Bitwise OR)

Result is 1 when either (or both) bits are 1.

  1100 1010
| 0011 0000
-----------
  1111 1010

A  B  A OR B
0  0  0
0  1  1
1  0  1
1  1  1

Use case — Setting bits: Set specific bits to 1 by ORing with a pattern.

XOR (Bitwise Exclusive OR)

Result is 1 when bits are different.

  1100 1010
^ 1111 0000
-----------
  0011 1010

A  B  A XOR B
0  0  0
0  1  1
1  0  1
1  1  0

Use case — Toggling bits: Flip specific bits by XORing with a mask.

Key property: A XOR A = 0 and A XOR 0 = A
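The three use cases — masking, setting, and toggling — fit in one short Python sketch, using 8-bit patterns:

```python
value = 0b10110110

# Masking: AND keeps only the bits where the mask is 1 (lower 4 bits here)
print(format(value & 0b00001111, "08b"))  # 00000110

# Setting: OR turns on bit 6 without disturbing the others
print(format(value | 0b01000000, "08b"))  # 11110110

# Toggling: XOR flips the lower 4 bits
print(format(value ^ 0b00001111, "08b"))  # 10111001

# Key XOR properties: A XOR A = 0 and A XOR 0 = A
print(value ^ value)       # 0
print(value ^ 0 == value)  # True
```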

Summary of Bitwise Operations

Operation  Symbol            Rule              Use Case
NOT        ~ (C) / ! (Rust)  Invert all bits   Complement
AND        &                 1 only if both 1  Masking (extract bits)
OR         |                 1 if either 1     Setting bits
XOR        ^                 1 if different    Toggling bits

6. Shift Operations

Shift operations move all bits left or right by a specified number of positions.

Logical Shift Left (LSL)

Shifts bits left, filling vacated positions with 0:

Before:  0011 0101  (53)
LSL 2:   1101 0100  (212)

Bits shifted out on the left are discarded.
New bits on the right are 0.

Effect: Multiply by \(2^n\) (where \(n\) is the shift amount).

\(53 \times 2^2 = 53 \times 4 = 212\)

Logical Shift Right (LSR)

Shifts bits right, filling vacated positions with 0:

Before:  1101 0100  (212)
LSR 2:   0011 0101  (53)

Bits shifted out on the right are discarded.
New bits on the left are 0.

Effect: Unsigned divide by \(2^n\).

\(212 \div 4 = 53\)

Arithmetic Shift Right (ASR)

Shifts bits right, but preserves the sign bit (MSB):

Before:  1101 0100  (signed: -44)
ASR 2:   1111 0101  (signed: -11)

New bits on the left copy the original MSB.

Effect: Signed divide by \(2^n\) (rounds toward negative infinity).

Comparison of Shift Types

Operation Fill Bit Use For Equivalent To
LSL 0 (right side) All integers Multiply by \(2^n\)
LSR 0 (left side) Unsigned integers Unsigned divide by \(2^n\)
ASR Sign bit (left side) Signed integers Signed divide by \(2^n\)

Visual Diagram

Logical Shift Left (LSL by 1):
[b31 b30 ... b1 b0]  →  [b30 ... b1 b0 0]
 ↑ lost                              ↑ new 0

Logical Shift Right (LSR by 1):
[b31 b30 ... b1 b0]  →  [0 b31 b30 ... b1]
            lost ↑        ↑ new 0

Arithmetic Shift Right (ASR by 1):
[b31 b30 ... b1 b0]  →  [b31 b31 b30 ... b1]
            lost ↑        ↑ copy of MSB
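All three shifts can be sketched in Python. Python integers are unbounded, so a mask models the 8-bit register; for ASR we reinterpret the pattern as signed first (Python's >> on a negative int rounds toward negative infinity, which is exactly arithmetic shift):

```python
MASK8 = 0xFF

def lsl(x, n):
    """Logical shift left in 8 bits; bits shifted past bit 7 are discarded."""
    return (x << n) & MASK8

def lsr(x, n):
    """Logical shift right; vacated positions fill with 0."""
    return (x & MASK8) >> n

def asr(x, n, bits=8):
    """Arithmetic shift right: copies the sign bit into vacated positions."""
    if x & (1 << (bits - 1)):  # MSB set: negative in two's complement
        x -= 1 << bits         # reinterpret the bit pattern as signed
    return (x >> n) & MASK8    # >> on a negative int rounds toward -inf

print(lsl(0b00110101, 2))                 # 212  (53 * 4)
print(lsr(0b11010100, 2))                 # 53   (212 / 4)
print(format(asr(0b11010100, 2), "08b"))  # 11110101  (-44 / 4 = -11)
```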

7. Conversion Algorithms

Converting between string representations and integer values is a core operation in NTLang. Here we describe the algorithms at a conceptual level.

String-to-Integer: Left-to-Right Algorithm

To convert a string like "1A3" (hex) to an integer:

Algorithm: Process digits left-to-right, accumulating the value.

value = 0
for each character c in the string:
    digit = char_to_digit(c)
    value = value * base + digit

Example: Convert "1A3" in base 16:

Step  Character  Digit  Computation           Value
1     1          1      \(0 \times 16 + 1\)   1
2     A          10     \(1 \times 16 + 10\)  26
3     3          3      \(26 \times 16 + 3\)  419

Result: 0x1A3 = 419

Character-to-Digit Mapping

Converting a character to its numeric value:

Character Range Digit Value
'0' - '9' c - '0' (0-9)
'a' - 'f' c - 'a' + 10 (10-15)
'A' - 'F' c - 'A' + 10 (10-15)
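The two pieces — character-to-digit mapping and left-to-right accumulation — combine into a short Python sketch of the algorithm (illustrative, not the NTLang implementation):

```python
def char_to_digit(c):
    """Map '0'-'9', 'a'-'f', 'A'-'F' to the values 0-15."""
    if '0' <= c <= '9':
        return ord(c) - ord('0')
    if 'a' <= c <= 'f':
        return ord(c) - ord('a') + 10
    if 'A' <= c <= 'F':
        return ord(c) - ord('A') + 10
    raise ValueError(f"not a valid digit: {c!r}")

def string_to_int(s, base):
    """Left-to-right accumulation: value = value * base + digit."""
    value = 0
    for c in s:
        value = value * base + char_to_digit(c)
    return value

print(string_to_int("1A3", 16))  # 419
```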

Integer-to-String: Division/Modulo Algorithm

To convert an integer to a string representation:

Algorithm: Repeatedly divide by the base, collecting remainders (which are the digits in reverse order).

digits = empty list
while value > 0:
    remainder = value % base
    digits.append(digit_to_char(remainder))
    value = value / base
reverse(digits)

Example: Convert 419 to hex:

Step  Value  Value % 16  Digit  Value / 16
1     419    3           3      26
2     26     10          A      1
3     1      1           1      0

Reversed digits: 1A3 → Result: "0x1A3"

Digit-to-Character Mapping

Digit Value Character
0-9 '0' + digit
10-15 'a' + (digit - 10)
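And the reverse direction, mirroring the division/modulo pseudocode above (a Python sketch; it emits lowercase hex digits, matching the mapping table):

```python
def digit_to_char(d):
    """Map 0-9 to '0'-'9' and 10-15 to 'a'-'f'."""
    return chr(ord('0') + d) if d < 10 else chr(ord('a') + d - 10)

def int_to_string(value, base):
    """Repeated division/modulo; remainders are the digits in reverse order."""
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        digits.append(digit_to_char(value % base))
        value //= base
    return "".join(reversed(digits))

print(int_to_string(419, 16))  # 1a3
```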

8. Bit Width and Masking

The Width Parameter

In NTLang, the -w flag specifies a bit width. This controls how many bits are significant in the output:

ntlang -w 4 "15"    → output shows only lower 4 bits
ntlang -w 8 "255"   → output shows only lower 8 bits
ntlang -w 32 "255"  → output shows all 32 bits (default)

Computing a Width Mask

A mask isolates the lower \(w\) bits of a value:

\[mask = (1 << w) - 1\]

Examples:

Width  Mask Computation   Mask (Binary)        Mask (Hex)
4      \((1 << 4) - 1\)   0000 1111            0x0F
8      \((1 << 8) - 1\)   1111 1111            0xFF
16     \((1 << 16) - 1\)  1111 1111 1111 1111  0xFFFF

Special case: when width = 32, computing (1 << 32) shifts a 32-bit value by its full width — undefined behavior in C, and an overflow (a panic in debug builds) in Rust — so we handle this case separately by returning 0xFFFFFFFF directly.
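A Python sketch of the mask computation, including the width-32 special case (Python's integers are unbounded, so the guard here models the 32-bit limitation rather than avoiding a real overflow):

```python
def width_mask(w):
    """Mask isolating the lower w bits; w = 32 is handled separately
    because shifting a 32-bit value by 32 is undefined behavior in C."""
    if w == 32:
        return 0xFFFFFFFF
    return (1 << w) - 1

print(hex(width_mask(4)))   # 0xf
print(hex(width_mask(8)))   # 0xff
print(hex(width_mask(32)))  # 0xffffffff
```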

Applying a Mask

To extract the lower \(w\) bits:

masked_value = value & mask

Example: Extract lower 4 bits of 0xAB:

  1010 1011   (0xAB)
& 0000 1111   (mask for width 4)
-----------
  0000 1011   (0x0B)

Sign Extension

When displaying a masked value as a signed number, the MSB of the masked value (bit \(w-1\)) determines the sign. If the MSB is 1, we need to sign-extend — fill the upper bits with 1 to maintain the negative value in the full 32-bit representation.

Width 4, value = 0xB (1011 in 4 bits):

MSB of 4-bit value is 1 (bit 3) → negative in two's complement

Sign extend to 32 bits:
  0000 0000 0000 0000 0000 0000 0000 1011  (before)
  1111 1111 1111 1111 1111 1111 1111 1011  (after sign extension)
  = 0xFFFFFFFB = -5 in 32-bit two's complement
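Sign extension as a Python sketch (it returns the 32-bit pattern as an unsigned value, the way it would sit in a register):

```python
def sign_extend(value, w):
    """Extend a w-bit two's complement value to a 32-bit pattern."""
    if value & (1 << (w - 1)):                # MSB of the w-bit value is 1
        value |= 0xFFFFFFFF ^ ((1 << w) - 1)  # fill the upper bits with 1s
    return value

print(hex(sign_extend(0xB, 4)))  # 0xfffffffb
print(hex(sign_extend(0x5, 4)))  # 0x5
```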

Key Concepts

Concept Description
Positional notation Value determined by digit position and base
Binary (base 2) Two digits (0, 1); foundation of digital computing
Hexadecimal (base 16) Compact binary representation; 4 bits per hex digit
Two's complement Standard signed integer representation; negate = invert + add 1
NOT Invert all bits
AND Masking — extract specific bits
OR Setting — turn on specific bits
XOR Toggling — flip specific bits
LSL Shift left, multiply by \(2^n\)
LSR Shift right (unsigned), divide by \(2^n\)
ASR Shift right (signed), preserves sign bit
String-to-integer Left-to-right: val = val * base + digit
Integer-to-string Division/modulo: extract digits in reverse
Width mask \((1 << w) - 1\); isolates lower \(w\) bits
Sign extension Fill upper bits with sign bit for signed interpretation

Practice Problems

Problem 1: Binary-to-Decimal

Convert 1100 1010 (binary) to decimal.

Solution:
1100 1010

Position:  7    6    5    4    3    2    1    0
Bit:       1    1    0    0    1    0    1    0
Value:     128  64   0    0    8    0    2    0

Sum: 128 + 64 + 8 + 2 = 202

Problem 2: Hex-to-Binary

Convert 0xDEAD to binary.

Solution:
D    E    A    D
1101 1110 1010 1101

0xDEAD = 1101 1110 1010 1101
       = 57005 in decimal

Problem 3: Two's Complement Negation

Negate 6 in 8-bit two's complement.

Solution:
 6 = 0000 0110

Step 1 - Invert all bits:
     1111 1001

Step 2 - Add 1:
     1111 1001
   + 0000 0001
   -----------
     1111 1010

-6 = 1111 1010 = 0xFA

Verify: 0000 0110 + 1111 1010 = 1 0000 0000 (carry discarded → 0) ✓

Problem 4: Bitwise Operations

Given A = 0xCA and B = 0xF0, compute:

  1. A & B
  2. A | B
  3. A ^ B
  4. ~A (8-bit)
Solution:
A = 0xCA = 1100 1010
B = 0xF0 = 1111 0000

1. A & B:
   1100 1010
 & 1111 0000
 -----------
   1100 0000 = 0xC0

2. A | B:
   1100 1010
 | 1111 0000
 -----------
   1111 1010 = 0xFA

3. A ^ B:
   1100 1010
 ^ 1111 0000
 -----------
   0011 1010 = 0x3A

4. ~A:
   ~1100 1010
 = 0011 0101 = 0x35

Problem 5: String-to-Integer Algorithm

Trace the left-to-right algorithm to convert "B2" from hex to decimal.

Solution:
String: "B2", base = 16

Step 1: value = 0
Step 2: c = 'B', digit = 11
        value = 0 * 16 + 11 = 11
Step 3: c = '2', digit = 2
        value = 11 * 16 + 2 = 178

Result: 0xB2 = 178

Verify: 11 * 16 + 2 = 176 + 2 = 178 ✓

Problem 6: Width Mask

What is the mask for width 6? Apply it to the value 0xFF.

Solution:
mask = (1 << 6) - 1
     = 64 - 1
     = 63
     = 0x3F
     = 0011 1111

Apply to 0xFF:
  1111 1111  (0xFF)
& 0011 1111  (mask)
-----------
  0011 1111  (0x3F = 63)

The mask keeps only the lower 6 bits.

Summary

  1. Positional notation represents values using digits weighted by powers of the base. This generalizes from familiar decimal to binary, octal, hex, and any base.

  2. Binary (base 2) is the native language of digital hardware. Each bit represents a power of 2, and 32-bit registers can hold values from 0 to \(2^{32} - 1\) (unsigned).

  3. Hexadecimal (base 16) provides a compact way to write binary, with each hex digit mapping to exactly 4 bits.

  4. Two's complement is the standard for signed integers: negate by inverting all bits and adding 1. It gives a single zero, simple addition hardware, and a range of \(-2^{n-1}\) to \(2^{n-1} - 1\).

  5. Bitwise operations (NOT, AND, OR, XOR) manipulate individual bits and are used for masking, setting, clearing, and toggling bits.

  6. Shift operations (LSL, LSR, ASR) move bits left or right, effectively multiplying or dividing by powers of 2. ASR preserves the sign for signed values.

  7. Conversion algorithms use left-to-right accumulation (string-to-integer) and division/modulo (integer-to-string) to convert between representations.

  8. Bit width and masking allow working with specific bit ranges using the formula \(mask = (1 << w) - 1\), with sign extension needed for signed interpretation of narrow values.