8086 Instruction Decoder

A complete 8086 instruction decoder implementing perfect round-trip verification: machine code → assembly → machine code with byte-identical output. The project demonstrates low-level systems programming, bit manipulation, and state management.

Core Metrics

  • 256-entry opcode table with O(1) dispatch
  • 100% round-trip accuracy on the test programs, verified via binary diff
  • Opcode table built once at program startup, never rebuilt or resized during decoding
  • Handles all 8086 addressing modes including edge cases

Technical Architecture

Opcode Dispatch Table

Instead of a switch statement with 256 cases, I used a table of member-function pointers that is built once, during static initialization:

    struct OpcodeEntry {
        void (InstructionDecoder::*handler)(u8, std::string);
        std::string mnemonic;
    };

    static std::array<OpcodeEntry, 256> table = []() {
        std::array<OpcodeEntry, 256> t;
        t.fill({&InstructionDecoder::unknownOpcode, ""});
        // Configure 256 entries...
        return t;
    }();

Performance characteristics:

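  • O(1) dispatch: table[opcode].handler
  • One indirect call per instruction instead of a chain of compare-and-branch tests (the indirect branch can still be mispredicted)
  • The lambda runs once at startup; the std::string members rule out constexpr as written, but const char* or std::string_view mnemonics would allow it
  • Contiguous array layout, comparable to the jump table a compiler typically generates for a dense switch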
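For a table that is genuinely built at compile time, one possible shape is sketched below. It assumes free-function handlers and std::string_view mnemonics rather than the member-function pointers and std::string shown above, since std::string generally cannot be stored in a constexpr table. The names DecodeState, decodeUnknown, decodeOneByte, makeTable, kTable and dispatch are hypothetical and not taken from the project.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

using u8 = std::uint8_t;

// Hypothetical decoder state; a real decoder tracks more than a counter.
struct DecodeState { std::size_t pc = 0; };

using Handler = void (*)(DecodeState&, u8 opcode);

struct OpcodeEntry {
    Handler handler;
    std::string_view mnemonic;
};

// Hypothetical handlers: both consume only the opcode byte.
inline void decodeUnknown(DecodeState& s, u8) { s.pc += 1; }
inline void decodeOneByte(DecodeState& s, u8) { s.pc += 1; }

// Evaluated by the compiler (C++17): every slot starts as "unknown",
// then known opcodes overwrite their entries.
constexpr std::array<OpcodeEntry, 256> makeTable() {
    std::array<OpcodeEntry, 256> t{};
    for (auto& e : t) e = {decodeUnknown, "(unknown)"};
    t[0x90] = {decodeOneByte, "nop"};
    t[0xF4] = {decodeOneByte, "hlt"};
    return t;
}

constexpr auto kTable = makeTable();
static_assert(kTable[0x90].mnemonic == "nop", "table is usable at compile time");

// O(1) dispatch: index the table, call the handler, report the mnemonic.
inline std::string_view dispatch(DecodeState& s, u8 opcode) {
    const OpcodeEntry& e = kTable[opcode];
    e.handler(s, opcode);
    return e.mnemonic;
}
```

The static_assert only compiles if the table was constructed during compilation, which makes the compile-time claim checkable.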

State Management: RAII for Transient Context

8086 segment prefixes (es:, cs:, ss:, ds:) modify only the following instruction. Traditional approaches use manual cleanup:

    // Fragile - what if exception occurs?
    ctx = "es:";
    decodeInstruction();
    ctx = "";  // Might not execute

I implemented RAII-based automatic cleanup:

    class ContextGuard {
        std::string& ctx_ref;
    public:
        ContextGuard(std::string& ctx) : ctx_ref(ctx) {}
        ~ContextGuard() { ctx_ref = ""; }
    };

    std::string getRM(...) {
        ContextGuard guard(this->ctx);  // Automatically clears on scope exit
        std::string segment_prefix = this->ctx;
        // ... rest of function
    }

Why this matters: Exception-safe state management ensures cleanup even during error paths, similar to lock guards in concurrent code.
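The exception-safety claim can be exercised in isolation. The sketch below repeats the guard as a standalone class and pairs it with a hypothetical decodeWithPrefix function that can be told to throw; neither the function nor its fail flag exists in the decoder.

```cpp
#include <stdexcept>
#include <string>

// Clears the referenced prefix string when the guard leaves scope,
// whether by normal return or by stack unwinding.
class ContextGuard {
    std::string& ctx_ref;
public:
    explicit ContextGuard(std::string& ctx) : ctx_ref(ctx) {}
    ~ContextGuard() { ctx_ref.clear(); }
    ContextGuard(const ContextGuard&) = delete;
    ContextGuard& operator=(const ContextGuard&) = delete;
};

// Hypothetical stand-in for a decode step that may fail midway.
inline std::string decodeWithPrefix(std::string& ctx, bool fail) {
    ContextGuard guard(ctx);
    std::string operand = ctx + "[bx]";  // use the prefix before it is cleared
    if (fail) throw std::runtime_error("truncated instruction");
    return operand;
}
```

In both the normal and the throwing path, ctx is empty once decodeWithPrefix has exited, so a stale prefix cannot leak into the next instruction.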

Bit-Level Decoding: ModR/M Byte Processing

The ModR/M byte encodes addressing modes in 8 bits:

| mod (2) | reg (3) | r/m (3) |

Key implementation details:

    u8 modrm = code[pc + 1];
    u8 mod = (modrm >> 6) & 3;  // Bits 7-6
    u8 reg = (modrm >> 3) & 7;  // Bits 5-3
    u8 rm  = modrm & 7;         // Bits 2-0

Edge case handling:

  • When mod=00 and rm=110: Direct addressing [address], not [bp]
  • When mod=01: 8-bit signed displacement
  • When mod=10: 16-bit signed displacement
  • When mod=11: Register-to-register (no memory)
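The rules above reduce to a small function that reports how many displacement bytes follow the ModR/M byte. This is a minimal sketch; the ModRM struct and the names splitModRM and displacementBytes are illustrative and not part of the decoder.

```cpp
#include <cstdint>

using u8 = std::uint8_t;

struct ModRM {
    u8 mod;
    u8 reg;
    u8 rm;
};

// Split the byte into its three packed fields.
constexpr ModRM splitModRM(u8 byte) {
    return ModRM{static_cast<u8>((byte >> 6) & 3),
                 static_cast<u8>((byte >> 3) & 7),
                 static_cast<u8>(byte & 7)};
}

// Number of displacement bytes that follow the ModR/M byte.
constexpr int displacementBytes(ModRM m) {
    if (m.mod == 0) return (m.rm == 6) ? 2 : 0;  // mod=00, rm=110: 16-bit direct address
    if (m.mod == 1) return 1;                    // 8-bit signed displacement
    if (m.mod == 2) return 2;                    // 16-bit displacement
    return 0;                                    // mod=11: register operand
}
```

For example, 0x06 (mod=00, rm=110) is followed by a two-byte direct address, while 0x46 (mod=01, rm=110) is [bp] plus a one-byte displacement.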

This required careful state tracking: reading the displacement bytes advances pc, so the operations have to run in the right order:

    std::string mem = getRM(mod, rm, w);  // Advances pc for displacement
    u16 data = readU16();                 // Now read immediate data

Wrong ordering would read displacement bytes as immediate data.
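A worked byte sequence makes the ordering concrete. The bytes C7 06 E8 03 64 00 encode mov word [1000], 100: opcode, ModR/M (mod=00, rm=110), a 16-bit direct address, then a 16-bit immediate. The sketch below decodes only that one form; MovDirectImm16 and decodeMovDirectImm16 are hypothetical names introduced for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using u8 = std::uint8_t;
using u16 = std::uint16_t;

struct MovDirectImm16 {
    u16 address;
    u16 immediate;
};

// Decodes only the direct-address form of opcode C7 (mod=00, rm=110).
// code.at() throws std::out_of_range if the stream is too short.
inline MovDirectImm16 decodeMovDirectImm16(const std::vector<u8>& code) {
    std::size_t pc = 2;  // opcode and ModR/M byte already consumed
    auto readU16 = [&]() {
        const u16 v = static_cast<u16>(code.at(pc) | (code.at(pc + 1) << 8));
        pc += 2;
        return v;
    };
    const u16 address = readU16();    // displacement comes first in the stream
    const u16 immediate = readU16();  // immediate follows the displacement
    return {address, immediate};
}
```

Swapping the two readU16 calls would report address 100 and immediate 1000, which is exactly the desynchronization described above.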

Endianness and Sign Extension

Little-endian multi-byte reads:

    u16 readU16() {
        u16 value = (code[pc + 1] << 8) | code[pc];  // LSB at lower address
        pc += 2;
        return value;
    }

Sign extension for immediate operands:
The s bit in opcodes 0x80-0x83 controls sign extension:

  • s=0, w=1: Read 16-bit immediate
  • s=1, w=1: Read 8-bit immediate, sign-extend to 16 bits
  • s=0, w=0: Read 8-bit immediate
  • s=1, w=0: Read 8-bit immediate

    u8 s = (opcode >> 1) & 1;
    u16 data = (w && !s) ? readU16() : readU8();

I let NASM handle the actual sign extension rather than manually extending, since the assembler verifies the value fits.
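For comparison, a decoder that performs the extension itself could look like the sketch below. It is not what this project does; readImmediate and the Immediate struct are hypothetical, and the function works on a raw byte pointer to stay self-contained.

```cpp
#include <cstddef>
#include <cstdint>

using u8 = std::uint8_t;
using u16 = std::uint16_t;
using i8 = std::int8_t;

struct Immediate {
    u16 value;         // operand value after any widening
    std::size_t size;  // bytes consumed from the stream
};

// bytes must point at two readable bytes when w=1 and s=0, else at one.
inline Immediate readImmediate(const u8* bytes, bool s, bool w) {
    if (w && !s) {
        // Full 16-bit immediate, little-endian.
        const u16 v = static_cast<u16>(bytes[0] | (bytes[1] << 8));
        return {v, 2};
    }
    if (w && s) {
        // One byte in the stream, widened with its sign bit replicated.
        return {static_cast<u16>(static_cast<i8>(bytes[0])), 1};
    }
    // Byte operation: one byte, no widening.
    return {bytes[0], 1};
}
```

With s=1 and w=1, the stream byte 0xFE becomes 0xFFFE, which is -2 as a 16-bit value.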

Jump Displacement Calculation

Relative jumps encode signed displacements from the end of the instruction:

    i8 disp = static_cast<i8>(readU8());
    // Target = (pc + 2) + disp, where pc points at current instruction
    // NASM $ syntax: current address
    int offset = disp + 2;  // Offset from the start of the instruction
    std::cout << mnemonic << ((offset >= 0) ? " $+" : " $-")
              << ((offset >= 0) ? offset : -offset) << '\n';

The +2 accounts for instruction length (opcode + displacement byte).
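The same arithmetic gives the absolute target when the offset of the jump instruction is known. The helper below is a hypothetical sketch that assumes a two-byte short jump and 16-bit wraparound of the instruction pointer.

```cpp
#include <cstdint>

using u16 = std::uint16_t;
using i8 = std::int8_t;

// instrOffset is the offset of the jump's opcode byte.
// The displacement is applied to the address of the next instruction
// (instrOffset + 2), and the result wraps within 16 bits.
constexpr u16 shortJumpTarget(u16 instrOffset, i8 disp) {
    return static_cast<u16>(instrOffset + 2 + disp);
}

static_assert(shortJumpTarget(0x0100, -2) == 0x0100, "disp = -2 jumps to itself");
```

A displacement of -2 therefore encodes a jump to the instruction's own address, which is why it prints as $+0 in NASM syntax.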

Template-Based Displacement Formatting

Type-safe displacement handling using templates:

    template<typename T>
    std::string formatDisplacement() {
        auto disp_unsigned = (sizeof(T) == 1) ? readU8() : readU16();
        auto disp_signed = static_cast<T>(disp_unsigned);
        if (disp_signed > 0)
            return " + " + std::to_string(disp_signed);
        else if (disp_signed < 0)
            return " - " + std::to_string(-disp_signed);
        return "";
    }

Usage:

    return "[" + addr[rm] + formatDisplacement<i8>() + "]";   // 8-bit
    return "[" + addr[rm] + formatDisplacement<i16>() + "]";  // 16-bit

The template argument fixes the signed interpretation of the displacement at compile time.

Verification Methodology

Round-trip testing ensures correctness:

    nasm program.asm -o program            # Assemble ground truth
    ./build/sim8086 program > decoded.asm  # Decode to assembly
    nasm decoded.asm -o reassembled        # Reassemble
    diff program reassembled               # Must be byte-identical

This is a round-trip property test: if assemble(decode(b)) == b for every test binary b, the decoder is shown to be correct for the test domain.

Why this matters more than unit tests: Unit tests verify known cases. Round-trip testing checks that the decoder's understanding of the encoding scheme is complete and correct for everything the test programs exercise. Any misreading of an encoding in those programs shows up as a binary difference.
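When the binaries do differ, the offset of the first differing byte points at the instruction that was decoded wrongly. A small helper along these lines could report it; firstMismatch is a hypothetical name and is not part of the project's tooling.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Returns the offset of the first differing byte, the length of the
// shorter input if one is a strict prefix of the other, or std::nullopt
// when the two byte sequences are identical.
inline std::optional<std::size_t> firstMismatch(const std::vector<std::uint8_t>& a,
                                                const std::vector<std::uint8_t>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i] != b[i]) return i;
    }
    if (a.size() != b.size()) return n;
    return std::nullopt;
}
```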

Critical Edge Cases

1. Ambiguous operand sizes:

    mov [100], 50        ; Error: byte or word?
    mov [100], byte 50   ; Must specify

Implementation:

    std::string size = (mod != 3) ? (w ? " word" : " byte") : "";
    std::cout << "mov" << size << " " << mem << ", " << data << '\n';

2. Group opcodes (sub-operations in ModR/M):

Opcode 0xF6/F7 with different reg fields:

    const char* instr[] = {"test", "", "not", "neg", "mul", "imul", "div", "idiv"};
    std::cout << instr[reg] << " " << operand << '\n';

The empty string at index 1 marks reg=001, which has no documented meaning for these opcodes; real hardware behavior varies.
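One way to keep the undocumented slot from printing an empty mnemonic is to make the lookup return an optional. This is a sketch under that assumption; kGroup3 and group3Mnemonic are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <optional>
#include <string_view>

using u8 = std::uint8_t;

// Mnemonics for opcodes 0xF6/0xF7, indexed by the ModR/M reg field.
// Slot 1 is left empty because it has no documented meaning.
constexpr std::array<std::string_view, 8> kGroup3 = {
    "test", "", "not", "neg", "mul", "imul", "div", "idiv"};

// Returns the mnemonic selected by the reg field, or std::nullopt
// for the undocumented slot so the caller can report an error.
constexpr std::optional<std::string_view> group3Mnemonic(u8 modrm) {
    const std::string_view name = kGroup3[(modrm >> 3) & 7];
    if (name.empty()) return std::nullopt;
    return name;
}
```

The caller can then turn the empty case into a decode error instead of emitting a line that NASM would reject.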

3. Two-byte instructions:

aam and aad are encoded as two bytes (D4 0A and D5 0A); the second byte is mandatory:

    void decodeNullaryInstructionTwoBytes(u8 opcode, std::string mnemonic) {
        pc += 2;  // Must consume both bytes
    }

Missing this would desynchronize the instruction stream.
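A stricter variant could also check the second byte before consuming it. The sketch below is hypothetical (decodeAsciiAdjust is not a function in the project), and treating any second byte other than 0x0A as an error is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

using u8 = std::uint8_t;

// Returns the mnemonic and advances pc by two bytes. Returns std::nullopt
// and leaves pc untouched when the stream ends early or the byte pair is
// not D4 0A (aam) or D5 0A (aad).
inline std::optional<std::string> decodeAsciiAdjust(const std::vector<u8>& code,
                                                    std::size_t& pc) {
    if (pc + 2 > code.size()) return std::nullopt;
    const u8 opcode = code[pc];
    const bool known = (opcode == 0xD4 || opcode == 0xD5);
    if (!known || code[pc + 1] != 0x0A) return std::nullopt;
    pc += 2;  // consume opcode and the mandatory second byte
    return std::string(opcode == 0xD4 ? "aam" : "aad");
}
```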

Performance Considerations

Why function pointers over virtual dispatch?

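  • No vtable lookup: the handler address comes straight from the table
  • Handlers are reached through one contiguous 256-entry array
  • No inheritance hierarchy overhead
  • The indirect call itself is not inlined, but each handler body can still inline the helpers it calls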

Memory efficiency:

  • Single pass through instruction stream
  • O(n) time complexity where n is code size
  • No intermediate representation or AST construction
  • Direct streaming output

Const correctness:

    std::string getRegister(u8 reg, u8 w) const;  // Doesn't modify state
    bool isIndexValid(std::size_t bytes) const;   // Read-only checks

Communicates intent and lets the compiler reject accidental state changes in read-only helpers.

Production Considerations

Error handling:

    bool isIndexValid(std::size_t bytes) const {
        return pc + bytes <= code.size();
    }

All read operations check bounds. In production, I'd add:

  • Structured error reporting with byte offsets
  • Recovery mechanisms for malformed instruction streams
  • Logging for debugging partial decodes
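Structured error reporting with byte offsets could take roughly the following shape. This is a sketch of one option, not the project's design: DecodeError, ReadResult and readU16At are hypothetical names, and std::variant is just one way to return either a value or an error.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

using u8 = std::uint8_t;
using u16 = std::uint16_t;

// Hypothetical error record: where decoding stopped and why.
struct DecodeError {
    std::size_t offset;
    std::string message;
};

using ReadResult = std::variant<u16, DecodeError>;

// Reads a little-endian 16-bit value at pc. On failure, pc is left
// untouched and the error carries the offending byte offset.
inline ReadResult readU16At(const std::vector<u8>& code, std::size_t& pc) {
    const std::size_t left = (pc < code.size()) ? code.size() - pc : 0;
    if (left < 2) {
        return DecodeError{pc, "need 2 bytes, " + std::to_string(left) + " available"};
    }
    const u16 value = static_cast<u16>(code[pc] | (code[pc + 1] << 8));
    pc += 2;
    return value;
}
```

A caller can test the result with std::holds_alternative<DecodeError> and print the offset alongside the partial decode.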

Extensibility:
Adding new instructions requires:

  1. Define handler function
  2. Add entry to opcode table
  3. Round-trip test validates automatically

No changes to dispatch logic needed.

Skills Demonstrated

  • Low-level bit manipulation: Extracting packed fields, handling signed/unsigned conversions
  • State management: RAII patterns for automatic cleanup
  • Verification methodology: Property-based testing via round-trip validation
  • Performance awareness: O(1) dispatch, compile-time optimization
  • Edge case handling: Comprehensive coverage of encoding corner cases
  • Type safety: Template-based approach to displacement formatting

Build & Run

    # Build the decoder
    ./build.sh

    # Run on a binary file
    ./build/sim8086 binary_file

    # Round-trip verification
    nasm program.asm -o program
    ./build/sim8086 program > decoded.asm
    nasm decoded.asm -o reassembled
    diff program reassembled  # Should be identical

Dependencies: None (standard library only)
Build system: Shell script (build.sh)