8086 Instruction Decoder

A complete 8086 instruction decoder implementing perfect round-trip verification: machine code → assembly → machine code with byte-identical output. The project demonstrates low-level systems programming, bit manipulation, and state management.

Core Metrics

256 opcode handlers with O(1) dispatch
100% round-trip accuracy on the test programs, verified via binary diff
Opcode table built once during static initialization (no per-instruction table setup)
Handles all 8086 addressing modes, including edge cases

Technical Architecture

Opcode Dispatch Table

Instead of a switch statement with 256 cases, I used a function pointer table that is built once during static initialization:

struct OpcodeEntry {
    void(InstructionDecoder::*handler)(u8, std::string);
    std::string mnemonic;
};

static std::array<OpcodeEntry, 256> table = []() {
    std::array<OpcodeEntry, 256> t;
    t.fill({&InstructionDecoder::unknownOpcode, ""});
    // Configure 256 entries...
    return t;
}();

Performance characteristics:

O(1) dispatch: table[opcode].handler
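A single indirect call per instruction (still subject to indirect-branch prediction)
Table is built once, before the first decode (the std::string members rule out true compile-time initialization)
Comparable to the jump table a compiler emits for a dense switch, with each handler kept as a separate function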
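To make the dispatch mechanism concrete, here is a minimal, self-contained sketch of the same idea. The class name MiniDecoder, its handlers, and the three populated opcodes are illustrative assumptions, not the project's actual table; it also uses const char* mnemonics to keep the entries trivially copyable.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using u8 = std::uint8_t;

// Hypothetical, stripped-down decoder: all names here are illustrative only.
class MiniDecoder {
public:
    explicit MiniDecoder(std::vector<u8> bytes) : code(std::move(bytes)) {}

    // Decodes the whole stream and returns one line per instruction.
    std::vector<std::string> run() {
        std::vector<std::string> out;
        while (pc < code.size()) {
            const u8 opcode = code[pc];
            const Entry& e = table()[opcode];  // O(1) lookup by opcode byte
            out.push_back((this->*e.handler)(opcode, e.mnemonic));
        }
        return out;
    }

private:
    using Handler = std::string (MiniDecoder::*)(u8, const char*);
    struct Entry { Handler handler; const char* mnemonic; };

    // One-byte instruction with no operands.
    std::string nullary(u8, const char* m) { pc += 1; return m; }

    // Fallback: emit the raw byte so the stream stays in sync.
    std::string unknown(u8 op, const char*) {
        pc += 1;
        return "db " + std::to_string(op);
    }

    // Built once, on first use (function-local static), not at compile time.
    static const std::array<Entry, 256>& table() {
        static const std::array<Entry, 256> t = [] {
            std::array<Entry, 256> a{};
            a.fill({&MiniDecoder::unknown, ""});
            a[0x90] = {&MiniDecoder::nullary, "nop"};
            a[0xC3] = {&MiniDecoder::nullary, "ret"};
            a[0xF4] = {&MiniDecoder::nullary, "hlt"};
            return a;
        }();
        return t;
    }

    std::vector<u8> code;
    std::size_t pc = 0;
};
```

Feeding it the bytes 90 C3 0F F4 yields the lines nop, ret, db 15, hlt. Adding an instruction touches only the table, never the loop in run().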

State Management: RAII for Transient Context

8086 segment prefixes (es:, cs:, ss:, ds:) modify only the following instruction. Traditional approaches use manual cleanup:

// Fragile - what if exception occurs?
ctx = "es:";
decodeInstruction();
ctx = "";  // Might not execute

I implemented RAII-based automatic cleanup:

class ContextGuard {
    std::string& ctx_ref;
public:
    ContextGuard(std::string& ctx) : ctx_ref(ctx) {}
    ~ContextGuard() { ctx_ref = ""; }
};

std::string getRM(...) {
    ContextGuard guard(this->ctx);  // Automatically clears on scope exit
    std::string segment_prefix = this->ctx;
    // ... rest of function
}

Why this matters: Exception-safe state management ensures cleanup even during error paths, similar to lock guards in concurrent code.
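The exception-safety claim can be checked with a small self-contained sketch. The guard mirrors the class above; formatOperand and prefixAfterDecode are hypothetical helpers invented for this illustration, and the thrown exception stands in for whatever error a real decode step might raise.

```cpp
#include <stdexcept>
#include <string>

// Same shape as the ContextGuard above, repeated so this sketch compiles alone.
class ContextGuard {
    std::string& ctx_ref;
public:
    explicit ContextGuard(std::string& ctx) : ctx_ref(ctx) {}
    ~ContextGuard() { ctx_ref.clear(); }
    ContextGuard(const ContextGuard&) = delete;
    ContextGuard& operator=(const ContextGuard&) = delete;
};

// Hypothetical decode step: consumes the pending prefix and may throw.
std::string formatOperand(std::string& ctx, bool truncated) {
    ContextGuard guard(ctx);
    if (truncated) throw std::out_of_range("instruction stream ended early");
    return ctx + "[bx]";  // The result is built before the guard clears ctx
}

// Returns whatever is left in ctx after the call, to show that the guard ran.
std::string prefixAfterDecode(bool truncated) {
    std::string ctx = "es:";
    try {
        formatOperand(ctx, truncated);
    } catch (const std::out_of_range&) {
        // Error path: the guard's destructor already ran during unwinding.
    }
    return ctx;
}
```

Both prefixAfterDecode(false) and prefixAfterDecode(true) return an empty string, so the prefix cannot leak into the next instruction on either path.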

Bit-Level Decoding: ModR/M Byte Processing

The ModR/M byte encodes addressing modes in 8 bits:

| mod (2) | reg (3) | r/m (3) |

Key implementation details:

u8 modrm = code[pc + 1];
u8 mod = (modrm >> 6) & 3;      // Bits 7-6
u8 reg = (modrm >> 3) & 7;      // Bits 5-3
u8 rm = modrm & 7;              // Bits 2-0

Edge case handling:

When mod=00 and rm=110: Direct addressing [address], not [bp]
When mod=01: 8-bit signed displacement
When mod=10: 16-bit signed displacement
When mod=11: Register-to-register (no memory)

This required careful state tracking. Reading the displacement bytes advances pc, so the memory operand must be decoded before the immediate:

std::string mem = getRM(mod, rm, w);  // Advances pc for displacement
u16 data = readU16();                  // Now read immediate data

Wrong ordering would read displacement bytes as immediate data.
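The mod/rm cases and the ordering constraint can be seen together in one sketch of mov r/m, imm (opcodes C6/C7). Cursor, memoryOperand, and decodeMovImmediate are hypothetical names, bounds checks are omitted, and the register form (mod=11) is left out for brevity.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using u8 = std::uint8_t;
using u16 = std::uint16_t;

// Hypothetical cursor over the code bytes; no bounds checks, for brevity.
struct Cursor {
    std::vector<u8> code;
    std::size_t pc = 0;
    u8 readU8() { return code[pc++]; }
    u16 readU16() {
        const u16 lo = code[pc], hi = code[pc + 1];
        pc += 2;
        return static_cast<u16>(lo | (hi << 8));
    }
};

// Formats the memory operand for mod != 3, consuming any displacement bytes.
std::string memoryOperand(Cursor& c, u8 mod, u8 rm) {
    static const char* const base[8] = {"bx + si", "bx + di", "bp + si", "bp + di",
                                        "si", "di", "bp", "bx"};
    if (mod == 0 && rm == 6)  // Direct address, not [bp]
        return "[" + std::to_string(c.readU16()) + "]";
    int disp = 0;
    if (mod == 1) disp = static_cast<std::int8_t>(c.readU8());
    if (mod == 2) disp = static_cast<std::int16_t>(c.readU16());
    std::string s = std::string("[") + base[rm];
    if (disp > 0) s += " + " + std::to_string(disp);
    if (disp < 0) s += " - " + std::to_string(-disp);
    return s + "]";
}

// Decodes C6/C7 (mov r/m, imm) starting at the opcode byte; assumes mod != 3.
std::string decodeMovImmediate(Cursor& c) {
    const u8 w = c.readU8() & 1;
    const u8 modrm = c.readU8();
    const u8 mod = (modrm >> 6) & 3, rm = modrm & 7;
    const std::string mem = memoryOperand(c, mod, rm);  // Displacement first
    const unsigned imm = w ? c.readU16() : c.readU8();  // Then the immediate
    return std::string("mov ") + (w ? "word " : "byte ") + mem + ", " + std::to_string(imm);
}
```

For the bytes C7 06 64 00 32 00 this produces mov word [100], 50. Swapping the two reads would instead report the address as 50 and the immediate as 100.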

Endianness and Sign Extension

Little-endian multi-byte reads:

u16 readU16() {
    u16 value = static_cast<u16>((code[pc + 1] << 8) | code[pc]);  // LSB at lower address
    pc += 2;
    return value;
}

Sign extension for immediate operands:
The s bit in opcodes 0x80-0x83 controls sign extension:

s=0, w=1: Read 16-bit immediate
s=1, w=1: Read 8-bit immediate, sign-extend to 16 bits
s=0, w=0: Read 8-bit immediate

u8 s = (opcode >> 1) & 1;
u16 data = (w && !s) ? readU16() : readU8();

I let NASM handle the actual sign extension rather than manually extending, since the assembler verifies the value fits.
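For comparison, this is roughly what explicit handling would look like. It is a sketch of the alternative, not what the decoder does; immediateSize and signExtend8 are hypothetical helper names.

```cpp
#include <cstdint>

using u8 = std::uint8_t;
using u16 = std::uint16_t;

// Number of immediate bytes that follow for opcodes 0x80-0x83:
// two only when w=1 and s=0, otherwise one.
constexpr int immediateSize(u8 opcode) {
    return ((opcode & 1) && !((opcode >> 1) & 1)) ? 2 : 1;
}

// Widens an 8-bit immediate to 16 bits, copying bit 7 into the upper byte.
constexpr u16 signExtend8(u8 imm) {
    return static_cast<u16>(static_cast<std::int16_t>(static_cast<std::int8_t>(imm)));
}
```

With these, opcode 0x83 followed by the immediate byte 0xFE denotes the 16-bit value 0xFFFE (that is, -2), while 0x81 reads a full two-byte immediate.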

Jump Displacement Calculation

Relative jumps encode signed displacements from the end of the instruction:

i8 disp = static_cast<i8>(readU8());
// Target = (start of instruction + 2) + disp
// NASM $ syntax: address of the current instruction
int offset = disp + 2;  // Pick the sign after adding the instruction length
std::cout << mnemonic << ((offset >= 0) ? " $+ " : " $- ")
          << ((offset >= 0) ? offset : -offset) << '\n';

The +2 accounts for instruction length (opcode + displacement byte).
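The sign has to be chosen after the +2 is applied, because a displacement of -1 still lands one byte past the start of the instruction. A small sketch that returns the operand as a string makes the boundary cases easy to check; shortJumpOperand is a hypothetical name.

```cpp
#include <cstdint>
#include <string>

// Formats a short jump operand relative to NASM's $ (start of the instruction).
// rawDisp is the displacement byte exactly as it appears in the machine code.
std::string shortJumpOperand(std::uint8_t rawDisp) {
    const int disp = static_cast<std::int8_t>(rawDisp);
    const int offset = disp + 2;  // 2 = opcode byte + displacement byte
    if (offset >= 0) return "$+" + std::to_string(offset);
    return "$-" + std::to_string(-offset);
}
```

A displacement byte of 0x02 gives $+4, 0xFC gives $-2, and 0xFF (that is, -1) gives $+1 rather than a negative offset.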

Template-Based Displacement Formatting

Type-safe displacement handling using templates:

template<typename T>
std::string formatDisplacement() {
    auto disp_unsigned = (sizeof(T) == 1) ? readU8() : readU16();
    auto disp_signed = static_cast<T>(disp_unsigned);

    if (disp_signed > 0) return " + " + std::to_string(disp_signed);
    else if (disp_signed < 0) return " - " + std::to_string(-disp_signed);
    return "";
}

Usage:

return "[" + addr[rm] + formatDisplacement<i8>() + "]";   // 8-bit
return "[" + addr[rm] + formatDisplacement<i16>() + "]";  // 16-bit

The template parameter selects both the read width and the signed type, so an 8-bit displacement cannot accidentally be interpreted as a 16-bit one.

Verification Methodology

Round-trip testing ensures correctness:

nasm program.asm -o program          # Assemble ground truth
./build/sim8086 program > decoded.asm      # Decode to assembly
nasm decoded.asm -o reassembled      # Reassemble
diff program reassembled             # Must be byte-identical

This works like a property test: if assemble(decode(b)) == b for every test binary b, the decoder is correct for the instructions those binaries exercise.

Why this matters more than unit tests: Unit tests verify known cases. Round-trip testing verifies the decoder's understanding of the encoding scheme is complete and correct. Any semantic misunderstanding produces detectable binary differences.
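diff only says whether the two files differ. When they do, the offset of the first differing byte points straight at the instruction that was misdecoded. A minimal sketch of that comparison, with firstMismatch as a hypothetical helper operating on buffers that are assumed to be already loaded:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Returns the offset of the first differing byte, or std::nullopt if identical.
// A length mismatch is reported at the end of the shorter buffer.
std::optional<std::size_t> firstMismatch(const std::vector<std::uint8_t>& a,
                                         const std::vector<std::uint8_t>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i)
        if (a[i] != b[i]) return i;
    if (a.size() != b.size()) return n;
    return std::nullopt;
}
```

Note that byte identity is a strict criterion: 89 D9 and 8B CB both encode mov cx, bx, yet they mismatch at offset 0. The round trip therefore relies on the reassembler choosing the same encoding as the original binary.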

Critical Edge Cases

1. Ambiguous operand sizes:

mov [100], 50    ; Error: byte or word?
mov [100], byte 50   ; Must specify

Implementation:

std::string size = (mod != 3) ? (w ? "word " : "byte ") : "";  // Size keyword only for memory operands
std::cout << "mov " << size << mem << ", " << data << '\n';

2. Group opcodes (sub-operations in ModR/M):

Opcode 0xF6/F7 with different reg fields:

const char* instr[] = {"test", "", "not", "neg", "mul", "imul", "div", "idiv"};
std::cout << instr[reg] << " " << operand << '\n';

The empty string at index 1 marks reg=001, which has no documented meaning for these opcodes; real hardware behavior varies.
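One way to keep the undefined slot from silently printing an empty mnemonic is to make it explicit in the return type. This is a sketch under that assumption; group3Mnemonic is a hypothetical name and the decoder itself uses the plain array shown above.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Maps the reg field of an F6/F7 ModR/M byte to its mnemonic.
// Returns std::nullopt for reg == 1, which has no documented meaning.
std::optional<std::string> group3Mnemonic(std::uint8_t modrm) {
    static const char* const names[8] = {"test", nullptr, "not", "neg",
                                         "mul", "imul", "div", "idiv"};
    const char* name = names[(modrm >> 3) & 7];
    if (name == nullptr) return std::nullopt;
    return std::string(name);
}
```

For example, the ModR/M byte D8 (as in F7 D8, neg ax) has reg=011 and maps to neg, while C8 has reg=001 and yields no mnemonic, leaving the caller to decide how to report it.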

3. Two-byte instructions:

aam and aad are encoded with a mandatory second byte (0x0A):

void decodeNullaryInstructionTwoBytes(u8 opcode, std::string mnemonic) {
    pc += 2;  // Must consume both bytes
}

Missing this would desynchronize the instruction stream.
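A slightly fuller sketch of the same handler, written as a free function so it can be exercised in isolation. The name decodeAsciiAdjust and the choice to reject a second byte other than 0x0A are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Decodes aam (D4 0A) or aad (D5 0A) at pc and advances past both bytes.
// Returns std::nullopt, leaving pc untouched, if the bytes do not match.
std::optional<std::string> decodeAsciiAdjust(const std::vector<std::uint8_t>& code,
                                             std::size_t& pc) {
    if (pc + 2 > code.size()) return std::nullopt;  // Truncated stream
    const std::uint8_t opcode = code[pc];
    if (opcode != 0xD4 && opcode != 0xD5) return std::nullopt;
    if (code[pc + 1] != 0x0A) return std::nullopt;  // Plain mnemonic would not round-trip
    pc += 2;                                        // Consume both bytes
    return std::string(opcode == 0xD4 ? "aam" : "aad");
}
```

Decoding D4 0A D5 0A from offset 0 yields aam and then aad, with pc ending at 4. Advancing by only one byte would instead try to decode 0A as the next opcode.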

Performance Considerations

Why function pointers over virtual dispatch?

No vtable pointer load before the call (one table index instead)
All handlers are reached through one contiguous table indexed by opcode
No inheritance hierarchy overhead
Handler and mnemonic sit side by side in the same table entry

Memory efficiency:

Single pass through instruction stream
O(n) time complexity where n is code size
No intermediate representation or AST construction
Direct streaming output

Const correctness:

std::string getRegister(u8 reg, u8 w) const;  // Doesn't modify state
bool isIndexValid(std::size_t bytes) const;   // Read-only checks

Communicates intent and lets the compiler reject accidental state changes in read-only helpers.

Production Considerations

Error handling:

bool isIndexValid(std::size_t bytes) const {
    return pc + bytes <= code.size();
}

All read operations check bounds. In production, I'd add:

Structured error reporting with byte offsets
Recovery mechanisms for malformed instruction streams
Logging for debugging partial decodes
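As a rough idea of what structured error reporting with byte offsets could look like, here is a sketch of a bounds-checked reader that records the first failure. DecodeError, CheckedReader, and the message format are all hypothetical; none of this exists in the current decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical error record: where decoding stopped and why.
struct DecodeError {
    std::size_t offset;
    std::string message;
};

// Hypothetical reader that records the first failure instead of reading out of range.
class CheckedReader {
public:
    explicit CheckedReader(std::vector<std::uint8_t> bytes) : code(std::move(bytes)) {}

    // Little-endian 16-bit read; returns std::nullopt if fewer than 2 bytes remain.
    std::optional<std::uint16_t> readU16() {
        if (!isIndexValid(2)) {
            if (!error)
                error = DecodeError{pc, "expected 2 bytes, found " +
                                            std::to_string(code.size() - pc)};
            return std::nullopt;
        }
        const auto value = static_cast<std::uint16_t>(code[pc] | (code[pc + 1] << 8));
        pc += 2;
        return value;
    }

    const std::optional<DecodeError>& firstError() const { return error; }

private:
    // pc never exceeds code.size(), so the subtraction cannot wrap.
    bool isIndexValid(std::size_t bytes) const { return bytes <= code.size() - pc; }

    std::vector<std::uint8_t> code;
    std::size_t pc = 0;
    std::optional<DecodeError> error;
};
```

On the three-byte input 34 12 FF, the first readU16() returns 0x1234 and the second fails, recording offset 2 with the message "expected 2 bytes, found 1".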

Extensibility:
Adding new instructions requires:

Define handler function
Add entry to opcode table
Round-trip test validates automatically

No changes to dispatch logic needed.

Skills Demonstrated

Low-level bit manipulation: Extracting packed fields, handling signed/unsigned conversions

State management: RAII patterns for automatic cleanup

Verification methodology: Property-style testing via round-trip validation

Performance awareness: O(1) dispatch, table built once at startup

Edge case handling: Comprehensive coverage of encoding corner cases

Type safety: Template-based approach to displacement formatting

Build & Run

# Build the decoder
./build.sh

# Run on a binary file
./build/sim8086 binary_file

# Round-trip verification
nasm program.asm -o program
./build/sim8086 program > decoded.asm
nasm decoded.asm -o reassembled
diff program reassembled  # Should be identical

Dependencies: None for the decoder (standard library only); NASM for round-trip verification
Build system: Shell script (build.sh)
