Binary Protocol Parsers: Designing and Parsing Wire Formats From Scratch
JSON is human-readable but wastes bandwidth. Binary protocols are 2-10× smaller, 10-100× faster to parse, and power every high-performance system: TCP/IP headers, Protocol Buffers, MessagePack, DNS, TLS, WebSocket frames, and database wire protocols. Today we build binary protocol parsers from the ground up — type-length-value encoding, schema evolution, zero-copy decoding, and a complete protocol implementation.
Why Binary Protocols
JSON: {"id":12345,"name":"Alice","age":30,"active":true}
Bytes: 50 bytes, requires full text parsing + string allocation
Binary (TLV): [01 00 04 00 00 30 39] [02 00 05 41 6C 69 63 65] ...
Bytes: ~20 bytes, parsed via pointer arithmetic
| | JSON | Binary (Protobuf) |
|---|---|---|
| Payload size | 100% | 20-40% |
| Parse time | 100% | 5-10% |
| Memory allocations | many | zero (zero-copy) |
| Schema evolution | ✅ (flexible) | ✅ (with field numbers) |
| Human readable | ✅ | ❌ |
| Debug-friendly | ✅ | ❌ (need tooling) |
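As a rough sanity check of the size claims, here is a minimal sketch that hand-packs the sample record and compares it to its JSON encoding. The field layout (varint id, length-prefixed name, one byte each for age and active) is illustrative, not a real wire format:

```typescript
// Hand-pack {id, name, age, active} as: varint id, length-prefixed name,
// uint8 age, uint8 active. Illustrative layout only.
function packRecord(id: number, name: string, age: number, active: boolean): Uint8Array {
  const nameBytes = new TextEncoder().encode(name);
  const out: number[] = [];
  let v = id >>> 0;
  while (v > 0x7f) { out.push((v & 0x7f) | 0x80); v >>>= 7; } // varint id
  out.push(v);
  out.push(nameBytes.length);          // length prefix
  nameBytes.forEach((b) => out.push(b));
  out.push(age & 0xff, active ? 1 : 0);
  return Uint8Array.from(out);
}

const record = { id: 12345, name: 'Alice', age: 30, active: true };
const jsonSize = JSON.stringify(record).length;               // 50 bytes
const binSize = packRecord(12345, 'Alice', 30, true).length;  // 10 bytes
console.log(`JSON ${jsonSize}B vs binary ${binSize}B`);
```

Even this naive packing lands at 20% of the JSON size, in line with the table above.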
1. Binary Reader/Writer Primitives
/**
* Low-level binary buffer reader with cursor management.
* Foundation for ALL binary protocol parsers.
*/
class BinaryReader {
private view: DataView;
private offset: number = 0;
private readonly buffer: ArrayBuffer;
constructor(buffer: ArrayBuffer) {
this.buffer = buffer;
this.view = new DataView(buffer);
}
// --- Integer types ---
readUint8(): number {
const value = this.view.getUint8(this.offset);
this.offset += 1;
return value;
}
readUint16LE(): number {
const value = this.view.getUint16(this.offset, true);
this.offset += 2;
return value;
}
readUint16BE(): number {
const value = this.view.getUint16(this.offset, false);
this.offset += 2;
return value;
}
readUint32LE(): number {
const value = this.view.getUint32(this.offset, true);
this.offset += 4;
return value;
}
readUint32BE(): number {
const value = this.view.getUint32(this.offset, false);
this.offset += 4;
return value;
}
readInt32LE(): number {
const value = this.view.getInt32(this.offset, true);
this.offset += 4;
return value;
}
readUint64LE(): bigint {
const value = this.view.getBigUint64(this.offset, true);
this.offset += 8;
return value;
}
readFloat32LE(): number {
const value = this.view.getFloat32(this.offset, true);
this.offset += 4;
return value;
}
readFloat64LE(): number {
const value = this.view.getFloat64(this.offset, true);
this.offset += 8;
return value;
}
/**
* Variable-length integer (varint) — same as Protocol Buffers.
* Each byte uses 7 bits for data, 1 bit to indicate continuation.
*
* Value 300 = 0b100101100:
* Byte 1: 1_0101100 (continuation=1, data=0101100)
* Byte 2: 0_0000010 (continuation=0, data=0000010)
* Result: 0000010_0101100 = 300
*/
readVarint(): number {
let result = 0;
let shift = 0;
while (true) {
const byte = this.readUint8();
result |= (byte & 0x7F) << shift;
if ((byte & 0x80) === 0) break; // No continuation bit
shift += 7;
if (shift >= 35) throw new Error('Varint too long for uint32 (max 5 bytes)');
}
return result >>> 0; // Ensure unsigned
}
/**
* Signed varint using ZigZag encoding.
* Maps signed integers to unsigned:
* 0 → 0, -1 → 1, 1 → 2, -2 → 3, 2 → 4, ...
*
* This way, small negative numbers are still small.
*/
readSignedVarint(): number {
const unsigned = this.readVarint();
return (unsigned >>> 1) ^ -(unsigned & 1);
}
// --- String / Bytes ---
/**
* Length-prefixed string (varint length + UTF-8 bytes).
* The byte view is zero-copy; decoding still allocates the resulting string.
*/
readString(): string {
const length = this.readVarint();
const bytes = new Uint8Array(this.buffer, this.offset, length);
this.offset += length;
return new TextDecoder().decode(bytes);
}
readBytes(length: number): Uint8Array {
const bytes = new Uint8Array(this.buffer, this.offset, length);
this.offset += length;
return bytes;
}
readLengthPrefixedBytes(): Uint8Array {
const length = this.readVarint();
return this.readBytes(length);
}
// --- Cursor management ---
getOffset(): number { return this.offset; }
setOffset(offset: number): void { this.offset = offset; }
remaining(): number { return this.buffer.byteLength - this.offset; }
isEOF(): boolean { return this.offset >= this.buffer.byteLength; }
/**
* Peek without advancing cursor.
*/
peek<T>(fn: () => T): T {
const savedOffset = this.offset;
const result = fn();
this.offset = savedOffset;
return result;
}
/**
* Sub-reader for parsing nested messages.
* Note: buffer.slice() copies the sub-range here (simple, but not zero-copy).
*/
subReader(length: number): BinaryReader {
const sub = new BinaryReader(
this.buffer.slice(this.offset, this.offset + length)
);
this.offset += length;
return sub;
}
}
class BinaryWriter {
private buffer: ArrayBuffer;
private view: DataView;
private offset: number = 0;
private capacity: number;
constructor(initialCapacity: number = 1024) {
this.capacity = initialCapacity;
this.buffer = new ArrayBuffer(initialCapacity);
this.view = new DataView(this.buffer);
}
// Auto-grow
private ensureCapacity(needed: number): void {
if (this.offset + needed <= this.capacity) return;
while (this.capacity < this.offset + needed) {
this.capacity *= 2;
}
const newBuffer = new ArrayBuffer(this.capacity);
new Uint8Array(newBuffer).set(new Uint8Array(this.buffer));
this.buffer = newBuffer;
this.view = new DataView(this.buffer);
}
writeUint8(value: number): void {
this.ensureCapacity(1);
this.view.setUint8(this.offset, value);
this.offset += 1;
}
writeUint16LE(value: number): void {
this.ensureCapacity(2);
this.view.setUint16(this.offset, value, true);
this.offset += 2;
}
writeUint16BE(value: number): void {
this.ensureCapacity(2);
this.view.setUint16(this.offset, value, false);
this.offset += 2;
}
writeUint32LE(value: number): void {
this.ensureCapacity(4);
this.view.setUint32(this.offset, value, true);
this.offset += 4;
}
writeUint32BE(value: number): void {
this.ensureCapacity(4);
this.view.setUint32(this.offset, value, false);
this.offset += 4;
}
writeFloat64LE(value: number): void {
this.ensureCapacity(8);
this.view.setFloat64(this.offset, value, true);
this.offset += 8;
}
writeVarint(value: number): void {
value = value >>> 0;
while (value > 0x7F) {
this.writeUint8((value & 0x7F) | 0x80);
value >>>= 7;
}
this.writeUint8(value);
}
writeSignedVarint(value: number): void {
// ZigZag encode: (n << 1) ^ (n >> 31)
const zigzag = (value << 1) ^ (value >> 31);
this.writeVarint(zigzag >>> 0);
}
writeString(str: string): void {
const encoded = new TextEncoder().encode(str);
this.writeVarint(encoded.length);
this.writeByteArray(encoded);
}
writeByteArray(bytes: Uint8Array): void {
this.ensureCapacity(bytes.length);
new Uint8Array(this.buffer).set(bytes, this.offset);
this.offset += bytes.length;
}
/**
* Get the final buffer (trimmed to actual content).
*/
finish(): ArrayBuffer {
return this.buffer.slice(0, this.offset);
}
getOffset(): number { return this.offset; }
}
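A quick standalone round trip (independent of the classes above) confirms the varint and ZigZag examples from the comments, including the 300 → `[0xAC, 0x02]` encoding:

```typescript
// Encode/decode an unsigned varint, plus ZigZag for signed values.
function encodeVarint(value: number): number[] {
  const out: number[] = [];
  let v = value >>> 0;
  while (v > 0x7f) { out.push((v & 0x7f) | 0x80); v >>>= 7; }
  out.push(v);
  return out;
}

function decodeVarint(bytes: number[]): number {
  let result = 0, shift = 0;
  for (const b of bytes) {
    result |= (b & 0x7f) << shift;
    if ((b & 0x80) === 0) break;
    shift += 7;
  }
  return result >>> 0;
}

const zigzagEncode = (n: number): number => ((n << 1) ^ (n >> 31)) >>> 0;
const zigzagDecode = (u: number): number => (u >>> 1) ^ -(u & 1);

console.log(encodeVarint(300)); // [0xAC, 0x02], as in the comment above
console.log(zigzagEncode(-2));  // 3
```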
2. TLV Protocol (Type-Length-Value)
/**
* TLV (Type-Length-Value) — the foundation of many binary protocols.
* Used in: ASN.1/BER, RADIUS, DHCP options, DNS resource records.
*
* Each field is self-describing:
* ┌──────┬────────┬─────────────────┐
* │ Type │ Length │ Value │
* │ 1-2B │ 1-4B │ (Length bytes) │
* └──────┴────────┴─────────────────┘
*/
enum TLVType {
// Wire types
VARINT = 0, // int32, int64, bool, enum
FIXED64 = 1, // fixed64, double
BYTES = 2, // string, bytes, embedded messages
FIXED32 = 5, // fixed32, float
}
interface TLVField {
fieldNumber: number;
wireType: TLVType;
value: number | bigint | Uint8Array | string;
}
class TLVCodec {
/**
* Encode a message as TLV fields.
*
* Field key format (same as Protocol Buffers):
* key = (field_number << 3) | wire_type
* Encoded as varint.
*/
static encode(fields: TLVField[]): ArrayBuffer {
const writer = new BinaryWriter();
for (const field of fields) {
const key = (field.fieldNumber << 3) | field.wireType;
writer.writeVarint(key);
switch (field.wireType) {
case TLVType.VARINT:
writer.writeVarint(field.value as number);
break;
case TLVType.FIXED64:
writer.writeFloat64LE(field.value as number);
break;
case TLVType.BYTES:
if (typeof field.value === 'string') {
writer.writeString(field.value);
} else {
const bytes = field.value as Uint8Array;
writer.writeVarint(bytes.length);
writer.writeByteArray(bytes);
}
break;
case TLVType.FIXED32:
writer.writeUint32LE(field.value as number);
break;
}
}
return writer.finish();
}
/**
* Decode TLV bytes into fields.
* Unknown fields are preserved (forward compatibility!).
*/
static decode(buffer: ArrayBuffer): TLVField[] {
const reader = new BinaryReader(buffer);
const fields: TLVField[] = [];
while (!reader.isEOF()) {
const key = reader.readVarint();
const fieldNumber = key >>> 3;
const wireType = (key & 0x7) as TLVType;
let value: TLVField['value'];
switch (wireType) {
case TLVType.VARINT:
value = reader.readVarint();
break;
case TLVType.FIXED64:
value = reader.readFloat64LE();
break;
case TLVType.BYTES:
value = reader.readLengthPrefixedBytes();
break;
case TLVType.FIXED32:
value = reader.readUint32LE();
break;
default:
throw new Error(`Unknown wire type: ${wireType}`);
}
fields.push({ fieldNumber, wireType, value });
}
return fields;
}
}
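The key packing is worth internalizing, so here is the key math alone as a standalone sketch (no dependency on the classes above):

```typescript
// Pack and unpack the (field_number, wire_type) key used by the TLV codec.
const packKey = (fieldNumber: number, wireType: number): number =>
  (fieldNumber << 3) | wireType;

function unpackKey(key: number): { fieldNumber: number; wireType: number } {
  return { fieldNumber: key >>> 3, wireType: key & 0x7 };
}

console.log(packKey(1, 0));   // 0x08: field 1, VARINT (protobuf's classic first byte)
console.log(unpackKey(0x12)); // { fieldNumber: 2, wireType: 2 }: field 2, BYTES
```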
3. Schema-Driven Protocol (like Protobuf)
/**
* A schema-driven binary protocol with:
* - Field numbers for backward/forward compatibility
* - Nested messages
* - Repeated fields
* - Optional fields (just omit from wire)
* - Default values (not serialized)
*
* Schema definition:
* message User {
* 1: uint32 id
* 2: string name
* 3: string email
* 4: repeated string tags
* 5: Address address // Nested message
* 6: bool active = true // Default value
* }
*/
type FieldType =
| 'uint32' | 'int32' | 'uint64' | 'int64'
| 'float' | 'double'
| 'bool' | 'string' | 'bytes'
| 'message';
interface FieldDef {
number: number;
name: string;
type: FieldType;
repeated?: boolean;
messageSchema?: MessageSchema;
defaultValue?: any;
}
interface MessageSchema {
name: string;
fields: Map<number, FieldDef>;
}
class SchemaSerializer {
private schema: MessageSchema;
constructor(schema: MessageSchema) {
this.schema = schema;
}
/**
* Serialize an object according to schema.
*/
serialize(obj: Record<string, any>): ArrayBuffer {
const writer = new BinaryWriter();
this.writeMessage(writer, obj, this.schema);
return writer.finish();
}
private writeMessage(
writer: BinaryWriter,
obj: Record<string, any>,
schema: MessageSchema
): void {
for (const [fieldNum, fieldDef] of schema.fields) {
const value = obj[fieldDef.name];
// Skip missing or default-valued fields
if (value === undefined || value === null) continue;
if (value === fieldDef.defaultValue) continue;
if (fieldDef.repeated && Array.isArray(value)) {
// Repeated field: write each element with the same field number
for (const item of value) {
this.writeField(writer, fieldNum, fieldDef, item);
}
} else {
this.writeField(writer, fieldNum, fieldDef, value);
}
}
}
private writeField(
writer: BinaryWriter,
fieldNum: number,
fieldDef: FieldDef,
value: any
): void {
switch (fieldDef.type) {
case 'uint32':
case 'int32':
case 'bool': {
const key = (fieldNum << 3) | TLVType.VARINT;
writer.writeVarint(key);
writer.writeVarint(fieldDef.type === 'bool' ? (value ? 1 : 0) : value);
break;
}
case 'int64':
case 'uint64': {
const key = (fieldNum << 3) | TLVType.VARINT;
writer.writeVarint(key);
// NOTE: writeVarint truncates to 32 bits; full 64-bit support needs BigInt varints
writer.writeVarint(Number(value));
break;
}
case 'double': {
const key = (fieldNum << 3) | TLVType.FIXED64;
writer.writeVarint(key);
writer.writeFloat64LE(value);
break;
}
case 'float': {
const key = (fieldNum << 3) | TLVType.FIXED32;
writer.writeVarint(key);
// Bit-cast the float to its IEEE-754 32-bit pattern before writing
const tmp = new DataView(new ArrayBuffer(4));
tmp.setFloat32(0, value, true);
writer.writeUint32LE(tmp.getUint32(0, true));
break;
}
case 'string': {
const key = (fieldNum << 3) | TLVType.BYTES;
writer.writeVarint(key);
writer.writeString(value);
break;
}
case 'bytes': {
const key = (fieldNum << 3) | TLVType.BYTES;
writer.writeVarint(key);
writer.writeVarint(value.length);
writer.writeByteArray(value);
break;
}
case 'message': {
// Serialize nested message to bytes, then write as length-delimited
const nested = new SchemaSerializer(fieldDef.messageSchema!);
const nestedBuf = nested.serialize(value);
const key = (fieldNum << 3) | TLVType.BYTES;
writer.writeVarint(key);
writer.writeVarint(nestedBuf.byteLength);
writer.writeByteArray(new Uint8Array(nestedBuf));
break;
}
}
}
/**
* Deserialize bytes according to schema.
*/
deserialize(buffer: ArrayBuffer): Record<string, any> {
const reader = new BinaryReader(buffer);
return this.readMessage(reader, this.schema, buffer.byteLength);
}
private readMessage(
reader: BinaryReader,
schema: MessageSchema,
length: number
): Record<string, any> {
const result: Record<string, any> = {};
const endOffset = reader.getOffset() + length;
// Initialize defaults
for (const [, fieldDef] of schema.fields) {
if (fieldDef.defaultValue !== undefined) {
result[fieldDef.name] = fieldDef.defaultValue;
}
if (fieldDef.repeated) {
result[fieldDef.name] = [];
}
}
while (reader.getOffset() < endOffset) {
const key = reader.readVarint();
const fieldNum = key >>> 3;
const wireType = key & 0x7;
const fieldDef = schema.fields.get(fieldNum);
if (!fieldDef) {
// Unknown field — skip it (forward compatibility!)
this.skipUnknownField(reader, wireType);
continue;
}
const value = this.readFieldValue(reader, fieldDef, wireType);
if (fieldDef.repeated) {
result[fieldDef.name].push(value);
} else {
result[fieldDef.name] = value;
}
}
return result;
}
private readFieldValue(
reader: BinaryReader,
fieldDef: FieldDef,
wireType: number
): any {
switch (fieldDef.type) {
case 'uint32':
case 'int32':
case 'uint64':
case 'int64':
return reader.readVarint();
case 'bool':
return reader.readVarint() !== 0;
case 'float':
return reader.readFloat32LE();
case 'double':
return reader.readFloat64LE();
case 'string': {
const len = reader.readVarint();
const bytes = reader.readBytes(len);
return new TextDecoder().decode(bytes);
}
case 'bytes': {
const len = reader.readVarint();
return reader.readBytes(len);
}
case 'message': {
const len = reader.readVarint();
return this.readMessage(reader, fieldDef.messageSchema!, len);
}
default:
this.skipUnknownField(reader, wireType);
return undefined;
}
}
private skipUnknownField(reader: BinaryReader, wireType: number): void {
switch (wireType) {
case TLVType.VARINT:
reader.readVarint();
break;
case TLVType.FIXED64:
reader.readBytes(8);
break;
case TLVType.BYTES: {
const len = reader.readVarint();
reader.readBytes(len);
break;
}
case TLVType.FIXED32:
reader.readBytes(4);
break;
}
}
}
/**
* Demo: define and use a schema.
*/
function schemaDemo(): void {
const addressSchema: MessageSchema = {
name: 'Address',
fields: new Map([
[1, { number: 1, name: 'street', type: 'string' }],
[2, { number: 2, name: 'city', type: 'string' }],
[3, { number: 3, name: 'zip', type: 'string' }],
]),
};
const userSchema: MessageSchema = {
name: 'User',
fields: new Map([
[1, { number: 1, name: 'id', type: 'uint32' }],
[2, { number: 2, name: 'name', type: 'string' }],
[3, { number: 3, name: 'email', type: 'string' }],
[4, { number: 4, name: 'tags', type: 'string', repeated: true }],
[5, {
number: 5, name: 'address', type: 'message',
messageSchema: addressSchema,
}],
[6, { number: 6, name: 'active', type: 'bool', defaultValue: true }],
]),
};
const serializer = new SchemaSerializer(userSchema);
const user = {
id: 12345,
name: 'Alice',
email: 'alice@example.com',
tags: ['admin', 'dev'],
address: { street: '123 Main St', city: 'Springfield', zip: '62701' },
active: true, // Won't be serialized (it's the default!)
};
const binary = serializer.serialize(user);
console.log(`JSON size: ${JSON.stringify(user).length} bytes`);
console.log(`Binary size: ${binary.byteLength} bytes`);
console.log(`Ratio: ${(binary.byteLength / JSON.stringify(user).length * 100).toFixed(0)}%`);
const decoded = serializer.deserialize(binary);
console.log('Decoded:', decoded);
}
4. Framing Protocol (like WebSocket/gRPC)
/**
* Message framing — how to send multiple messages over a stream.
* TCP is a byte stream, not a message stream.
* You need to delimit message boundaries.
*
* FRAMING STRATEGIES:
* 1. Length-prefixed (protobuf, gRPC, most binary protocols)
* 2. Delimiter-based (HTTP headers use \r\n\r\n)
* 3. Fixed-size (some network protocols)
*
* We implement length-prefixed framing like gRPC/HTTP2.
*
* Frame format:
* ┌──────────┬──────────┬──────────────────┐
* │ Flags │ Length │ Payload │
* │ (1 byte) │ (4 bytes)│ (Length bytes) │
* └──────────┴──────────┴──────────────────┘
*/
const enum FrameFlag {
NONE = 0x00,
COMPRESSED = 0x01,
END_STREAM = 0x02,
HEARTBEAT = 0x04,
}
interface Frame {
flags: number;
payload: Uint8Array;
}
class FrameCodec {
private readonly maxFrameSize: number;
constructor(maxFrameSize: number = 16 * 1024 * 1024) { // 16MB default
this.maxFrameSize = maxFrameSize;
}
/**
* Encode a single frame.
*/
encodeFrame(payload: Uint8Array, flags: number = FrameFlag.NONE): ArrayBuffer {
const writer = new BinaryWriter(5 + payload.length);
writer.writeUint8(flags);
writer.writeUint32BE(payload.length);
writer.writeByteArray(payload);
return writer.finish();
}
/**
* Streaming frame decoder.
* Handles partial reads (TCP doesn't guarantee complete messages).
*
* This is the core challenge of binary protocol parsing over streams:
* you might receive half a frame, need to buffer it, and continue
* when more data arrives.
*/
createStreamDecoder(): StreamFrameDecoder {
return new StreamFrameDecoder(this.maxFrameSize);
}
}
class StreamFrameDecoder {
private chunks: Uint8Array[] = [];
private totalBuffered: number = 0;
private readonly maxFrameSize: number;
// Parser state machine
private state: 'header' | 'payload' = 'header';
private currentFlags: number = 0;
private currentLength: number = 0;
constructor(maxFrameSize: number) {
this.maxFrameSize = maxFrameSize;
}
/**
* Feed incoming bytes. Returns decoded frames.
* May return 0 frames (incomplete data) or multiple frames.
*/
feed(data: Uint8Array): Frame[] {
this.chunks.push(data);
this.totalBuffered += data.length;
const frames: Frame[] = [];
while (true) {
if (this.state === 'header') {
// Need 5 bytes: 1 (flags) + 4 (length)
if (this.totalBuffered < 5) break;
const header = this.consume(5);
this.currentFlags = header[0];
this.currentLength =
(header[1] << 24) | (header[2] << 16) |
(header[3] << 8) | header[4];
if (this.currentLength > this.maxFrameSize) {
throw new Error(
`Frame too large: ${this.currentLength} > ${this.maxFrameSize}`
);
}
this.state = 'payload';
}
if (this.state === 'payload') {
if (this.totalBuffered < this.currentLength) break;
const payload = this.consume(this.currentLength);
frames.push({
flags: this.currentFlags,
payload,
});
this.state = 'header';
}
}
return frames;
}
/**
* Consume exactly `n` bytes from the buffer.
*/
private consume(n: number): Uint8Array {
const result = new Uint8Array(n);
let copied = 0;
while (copied < n) {
const chunk = this.chunks[0];
const needed = n - copied;
if (chunk.length <= needed) {
result.set(chunk, copied);
copied += chunk.length;
this.chunks.shift();
} else {
result.set(chunk.subarray(0, needed), copied);
this.chunks[0] = chunk.subarray(needed);
copied += needed;
}
}
this.totalBuffered -= n;
return result;
}
}
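To make the frame layout concrete, here is a standalone encoder for the 5-byte header format above (flags byte plus big-endian length), independent of the classes in this section:

```typescript
// Build one frame: [flags:1][length:4 BE][payload], matching the format above.
function encodeFrameBytes(payload: Uint8Array, flags = 0): Uint8Array {
  const frame = new Uint8Array(5 + payload.length);
  frame[0] = flags;
  new DataView(frame.buffer).setUint32(1, payload.length, false); // big-endian
  frame.set(payload, 5);
  return frame;
}

const f = encodeFrameBytes(new TextEncoder().encode('hi'), 0x02 /* END_STREAM */);
console.log(Array.from(f)); // [2, 0, 0, 0, 2, 104, 105]
```

Feeding these bytes into StreamFrameDecoder, even split one byte at a time, yields the same single frame back.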
5. Complete RPC Protocol
/**
* A complete binary RPC protocol (simplified gRPC-like).
*
* Message types:
* REQUEST: [msgType=1] [requestId] [methodLen] [method] [payloadLen] [payload]
* RESPONSE: [msgType=2] [requestId] [statusCode] [payloadLen] [payload]
* STREAM: [msgType=3] [streamId] [payloadLen] [payload]
* PING: [msgType=4] [timestamp]
* PONG: [msgType=5] [timestamp]
*/
enum RPCMessageType {
REQUEST = 1,
RESPONSE = 2,
STREAM = 3,
PING = 4,
PONG = 5,
}
enum RPCStatus {
OK = 0,
ERROR = 1,
NOT_FOUND = 2,
INVALID_ARGUMENT = 3,
TIMEOUT = 4,
INTERNAL_ERROR = 5,
}
interface RPCRequest {
type: RPCMessageType.REQUEST;
requestId: number;
method: string;
payload: Uint8Array;
}
interface RPCResponse {
type: RPCMessageType.RESPONSE;
requestId: number;
status: RPCStatus;
payload: Uint8Array;
}
interface RPCStream {
type: RPCMessageType.STREAM;
streamId: number;
payload: Uint8Array;
}
interface RPCPing {
type: RPCMessageType.PING;
timestamp: number;
}
interface RPCPong {
type: RPCMessageType.PONG;
timestamp: number;
}
type RPCMessage = RPCRequest | RPCResponse | RPCStream | RPCPing | RPCPong;
class RPCCodec {
static encode(msg: RPCMessage): ArrayBuffer {
const writer = new BinaryWriter();
writer.writeUint8(msg.type);
switch (msg.type) {
case RPCMessageType.REQUEST:
writer.writeUint32LE(msg.requestId);
writer.writeString(msg.method);
writer.writeVarint(msg.payload.length);
writer.writeByteArray(msg.payload);
break;
case RPCMessageType.RESPONSE:
writer.writeUint32LE(msg.requestId);
writer.writeUint8(msg.status);
writer.writeVarint(msg.payload.length);
writer.writeByteArray(msg.payload);
break;
case RPCMessageType.STREAM:
writer.writeUint32LE(msg.streamId);
writer.writeVarint(msg.payload.length);
writer.writeByteArray(msg.payload);
break;
case RPCMessageType.PING:
case RPCMessageType.PONG:
writer.writeFloat64LE(msg.timestamp);
break;
}
return writer.finish();
}
static decode(buffer: ArrayBuffer): RPCMessage {
const reader = new BinaryReader(buffer);
const type = reader.readUint8() as RPCMessageType;
switch (type) {
case RPCMessageType.REQUEST:
return {
type,
requestId: reader.readUint32LE(),
method: reader.readString(),
payload: reader.readLengthPrefixedBytes(),
};
case RPCMessageType.RESPONSE:
return {
type,
requestId: reader.readUint32LE(),
status: reader.readUint8() as RPCStatus,
payload: reader.readLengthPrefixedBytes(),
};
case RPCMessageType.STREAM:
return {
type,
streamId: reader.readUint32LE(),
payload: reader.readLengthPrefixedBytes(),
};
case RPCMessageType.PING:
return { type, timestamp: reader.readFloat64LE() };
case RPCMessageType.PONG:
return { type, timestamp: reader.readFloat64LE() };
default:
throw new Error(`Unknown message type: ${type}`);
}
}
}
/**
* Full RPC client with request tracking and timeouts.
*/
class RPCClient {
private nextId: number = 1;
private pending: Map<number, {
resolve: (resp: RPCResponse) => void;
reject: (err: Error) => void;
timer: ReturnType<typeof setTimeout>;
}> = new Map();
private frameCodec = new FrameCodec();
private decoder = this.frameCodec.createStreamDecoder();
async call(
method: string,
payload: Uint8Array,
timeout: number = 5000
): Promise<RPCResponse> {
const requestId = this.nextId++;
const request: RPCRequest = {
type: RPCMessageType.REQUEST,
requestId,
method,
payload,
};
const encoded = RPCCodec.encode(request);
const frame = this.frameCodec.encodeFrame(new Uint8Array(encoded));
// Send frame over transport (TCP, WebSocket, etc.)
this.send(new Uint8Array(frame));
return new Promise((resolve, reject) => {
const timer = setTimeout(() => {
this.pending.delete(requestId);
reject(new Error(`RPC timeout: ${method} (${timeout}ms)`));
}, timeout);
this.pending.set(requestId, { resolve, reject, timer });
});
}
/**
* Called when bytes arrive from the transport.
*/
onData(data: Uint8Array): void {
const frames = this.decoder.feed(data);
for (const frame of frames) {
const msg = RPCCodec.decode(frame.payload.buffer); // safe: consume() allocates an exact-sized buffer
if (msg.type === RPCMessageType.RESPONSE) {
const pending = this.pending.get(msg.requestId);
if (pending) {
clearTimeout(pending.timer);
this.pending.delete(msg.requestId);
if (msg.status === RPCStatus.OK) {
pending.resolve(msg);
} else {
pending.reject(
new Error(`RPC error: status=${RPCStatus[msg.status]}`)
);
}
}
}
if (msg.type === RPCMessageType.PING) {
// Respond with PONG
const pong = RPCCodec.encode({
type: RPCMessageType.PONG,
timestamp: msg.timestamp,
});
this.send(new Uint8Array(this.frameCodec.encodeFrame(new Uint8Array(pong))));
}
}
}
private send(data: Uint8Array): void {
// Placeholder: in production, write to TCP socket / WebSocket
console.log(`[RPC] Sending ${data.length} bytes`);
}
}
Protocol Comparison
| Protocol | Encoding | Schema | Size | Parse Speed | Streaming | Human Readable |
|---|---|---|---|---|---|---|
| JSON | Text | No | Largest | Slow | No | Yes |
| MessagePack | Binary | No | ~60% of JSON | Fast | No | No |
| Protobuf | Binary TLV | Yes (.proto) | ~30% of JSON | Very Fast | Yes (gRPC) | No |
| FlatBuffers | Binary (zero-copy) | Yes (.fbs) | ~35% of JSON | Instant | No | No |
| Cap'n Proto | Binary (zero-copy) | Yes | ~35% of JSON | Instant | Yes | No |
| Avro | Binary | Yes (.avsc) | ~30% of JSON | Fast | Yes | No |
| CBOR | Binary | No | ~50% of JSON | Fast | No | No |
Schema Evolution Rules
SAFE SCHEMA CHANGES (backward + forward compatible):
✅ ADD optional field (new number)
Old reader: ignores unknown field number
New reader: uses default if field missing
✅ REMOVE optional field
Old reader: still reads, just ignores if missing
New reader: skips unknown field
✅ RENAME field (field number stays the same)
Wire format uses numbers, not names
✅ Change int32 ↔ int64 (widening)
Varint encoding handles both
UNSAFE CHANGES:
❌ CHANGE field number
Old and new readers see different fields!
❌ Change wire type (e.g., string → int)
Parser will misinterpret bytes
❌ Make optional → required
Old writers don't send it; new reader rejects
❌ Reuse deleted field number
Old data with that number → wrong semantics
PROTOBUF BEST PRACTICES:
- Never reuse field numbers (use `reserved`)
- Start field numbers at 1 (1-15 use 1-byte key)
- Use field numbers 1-15 for frequently-set fields
- All fields are effectively optional in proto3
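The 1-15 rule follows directly from the key encoding: `(15 << 3) | 7 = 127` is the largest key that still fits in one varint byte. A quick standalone check, assuming the protobuf key format:

```typescript
// Size in bytes of a varint-encoded field key.
function keySize(fieldNumber: number, wireType: number): number {
  let key = ((fieldNumber << 3) | wireType) >>> 0;
  let bytes = 1;
  while (key > 0x7f) { key >>>= 7; bytes++; }
  return bytes;
}

console.log(keySize(15, 7)); // 1: largest field number with a 1-byte key
console.log(keySize(16, 0)); // 2: field 16 already needs two key bytes
```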
Interview Questions & Answers
Q: How would you design a binary protocol for a real-time multiplayer game?
Key requirements: minimal latency, small packets (fit in MTU ~1400 bytes), tolerant of packet loss (UDP). Design: (1) Fixed header: 1-byte message type + 2-byte sequence number + 4-byte timestamp (7 bytes). (2) Delta encoding: send only changes since last acknowledged state, not full state. (3) Bit packing: a player position (x,y,z) doesn't need float64 — quantize to 16-bit fixed-point (0.01 unit precision in a 655-unit world). That's 6 bytes instead of 24. (4) No length-prefixed framing: each UDP datagram is one message. (5) No varints for hot fields: fixed-size fields are faster to parse (direct offset, no loops). Varints save space but cost parse time. (6) Prediction + correction: client predicts, server sends corrections only when prediction error exceeds threshold. Total packet: 20-100 bytes for typical state update, supporting 60Hz tick rate.
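The quantization idea in point (3) can be sketched as follows; the 0.01-unit precision and 16-bit width are the assumptions stated above:

```typescript
// Quantize a coordinate to 16-bit fixed point with 0.01-unit precision.
// Representable range is 0..655.35; 2 bytes instead of 8 for a float64.
function quantize(coord: number): number {
  const q = Math.round(coord * 100);
  return Math.max(0, Math.min(0xffff, q)); // clamp to uint16
}

const dequantize = (q: number): number => q / 100;

console.log(quantize(123.456));             // 12346
console.log(dequantize(quantize(123.456))); // 123.46 (within the ±0.005 error budget)
```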
Q: What is zero-copy parsing and when should you use it?
Zero-copy parsing means reading data directly from the receive buffer without copying or allocating new objects. FlatBuffers and Cap'n Proto serialize data in a format that IS the in-memory representation — to "parse" a FlatBuffer, you just cast the buffer pointer and start reading. Benefits: no parse step (instant access), no memory allocation (no GC pressure), great for large messages. Tradeoffs: (1) Random access is fast, but sequential scan may be cache-unfriendly (data isn't compact). (2) The buffer must be kept alive as long as any reference exists. (3) Byte order must match (typically little-endian). (4) Schema changes are constrained (can't reorder fields). Use it for: large messages, latency-sensitive systems (game engines, real-time analytics), or when you only need a few fields from a large message (access is O(1), not O(n) like Protobuf).
Q: How does Protocol Buffers achieve backward and forward compatibility?
Via field numbers + wire types. Each field is identified by a number, not a name. On the wire, each field's key is `(field_number << 3) | wire_type`, encoded as a varint and followed by the value. For backward compatibility (new reader, old data): new fields have defaults, so missing fields get default values. For forward compatibility (old reader, new data): unknown field numbers are skipped, using the wire type to determine how many bytes to skip (varint: read until no continuation bit; length-delimited: read length then skip; fixed32/64: skip 4/8 bytes). This means producers and consumers can be updated independently, which is critical for microservices and mobile apps where you can't deploy all clients simultaneously.
Real-World Problems & How to Solve Them
Problem 1: Parser crashes on truncated packets
Symptom: Production logs show RangeError or random decode values near network boundaries.
Root cause: Reader methods advance cursor without checking remaining bytes before each primitive read.
Fix — add explicit bounds checks in low-level readers:
class SafeBinaryReader {
private view: DataView;
private offset = 0;
constructor(private readonly buffer: ArrayBuffer) {
this.view = new DataView(buffer);
}
private ensure(bytes: number): void {
if (this.offset + bytes > this.buffer.byteLength) {
throw new Error(`Truncated buffer: need ${bytes} bytes`);
}
}
readUint32LE(): number {
this.ensure(4);
const value = this.view.getUint32(this.offset, true);
this.offset += 4;
return value;
}
}
Problem 2: Parsed numbers look byte-swapped
Symptom: Message length 1024 becomes 262144 or IDs are wildly incorrect.
Root cause: Producer writes little-endian while parser reads big-endian (or vice versa).
Fix — lock endianness in protocol header and use one read path:
interface FrameHeader {
magic: number;
version: number;
length: number;
}
function readHeaderLE(reader: BinaryReader): FrameHeader {
const magic = reader.readUint16LE();
if (magic !== 0xCAFE) throw new Error('bad magic');
const version = reader.readUint8();
const length = reader.readUint32LE();
return { magic, version, length };
}
Problem 3: Malformed varints hang decode loop
Symptom: A single bad packet pegs CPU and the parser never returns.
Root cause: Varint decoder loops until continuation bit clears, but malformed input can keep it set forever.
Fix — cap bytes and fail fast on overlong varints:
function readVarintSafe(reader: BinaryReader): number {
let value = 0;
let shift = 0;
for (let i = 0; i < 5; i++) {
const b = reader.readUint8();
value |= (b & 0x7f) << shift;
if ((b & 0x80) === 0) return value >>> 0;
shift += 7;
}
throw new Error('Invalid varint: exceeds 5 bytes for uint32');
}
Problem 4: TCP packet splitting breaks parser state
Symptom: Decoder fails intermittently even though payloads are valid when inspected in full.
Root cause: Code assumes each socket data event contains one whole message; TCP is a byte stream.
Fix — add a frame accumulator with length-prefix decoding:
class LengthPrefixedFramer {
private pending = new Uint8Array(0);
push(chunk: Uint8Array): Uint8Array[] {
const merged = new Uint8Array(this.pending.length + chunk.length);
merged.set(this.pending, 0);
merged.set(chunk, this.pending.length);
const frames: Uint8Array[] = [];
let offset = 0;
while (offset + 4 <= merged.length) {
const len = new DataView(merged.buffer, merged.byteOffset + offset, 4).getUint32(0, true);
if (offset + 4 + len > merged.length) break;
frames.push(merged.subarray(offset + 4, offset + 4 + len));
offset += 4 + len;
}
this.pending = merged.subarray(offset);
return frames;
}
}
Problem 5: High GC pauses under throughput spikes
Symptom: CPU time is spent in garbage collection, not parsing.
Root cause: Decoder repeatedly copies payload slices (buffer.slice) instead of using views.
Fix — use zero-copy Uint8Array views over the original buffer:
function readBytesView(
buffer: ArrayBuffer,
byteOffset: number,
length: number
): Uint8Array {
return new Uint8Array(buffer, byteOffset, length); // zero-copy view
}
// Avoid:
// buffer.slice(byteOffset, byteOffset + length) // allocates + copies
Problem 6: Nested message parser overreads parent frame
Symptom: Decoding one nested field corrupts subsequent top-level fields.
Root cause: Length-delimited submessages are parsed against the parent reader with no strict boundary.
Fix — parse nested sections with bounded sub-readers and consumption checks:
function parseNestedMessage(parent: BinaryReader): Record<string, unknown> {
const length = parent.readVarint();
const start = parent.getOffset();
const nested = parent.subReader(length);
const result: Record<string, unknown> = {
code: nested.readUint16LE(),
value: nested.readString(),
};
if (!nested.isEOF()) {
throw new Error('Nested message has trailing bytes');
}
parent.setOffset(start + length);
return result;
}
Problem 7: Unknown TLV fields crash older clients
Symptom: New server releases break old mobile clients with “unknown type” errors.
Root cause: Parser treats unknown field types as fatal instead of skipping length-delimited payloads.
Fix — implement skip logic for unknown-but-well-formed fields:
function parseTLVField(reader: BinaryReader): { type: number; value: Uint8Array } | null {
if (reader.isEOF()) return null;
const type = reader.readUint8();
const length = reader.readUint16LE();
// Known types: 1..5
if (type >= 1 && type <= 5) {
return { type, value: reader.readBytes(length) };
}
reader.setOffset(reader.getOffset() + length); // skip unknown type
return null;
}
Key Takeaways
BINARY PROTOCOL DESIGN CHECKLIST:
┌─ WIRE FORMAT ──────────────────────────────┐
│ □ Endianness decided (LE is standard) │
│ □ Integer encoding (fixed vs varint) │
│ □ String encoding (length-prefixed UTF-8) │
│ □ Null/optional representation │
└────────────────────────────────────────────┘
┌─ FRAMING ──────────────────────────────────┐
│ □ Message boundaries (length-prefix) │
│ □ Max message size limit │
│ □ Streaming/chunking support │
│ □ Partial read handling (state machine) │
└────────────────────────────────────────────┘
┌─ EVOLUTION ────────────────────────────────┐
│ □ Field numbers (not names) on wire │
│ □ Unknown field skipping │
│ □ Default values for new fields │
│ □ Never reuse field numbers │
└────────────────────────────────────────────┘
┌─ PERFORMANCE ──────────────────────────────┐
│ □ Fixed-size hot path fields │
│ □ Zero-copy where possible │
│ □ Varint for variable-range cold fields │
│ □ Minimize allocations (buffer reuse) │
└────────────────────────────────────────────┘