Theta-lang Implementation Plan: FlatBuffer-Backed Procedural Dataflow VM

Project: Theta Virtual Machine

Target Language: C11

Architecture: Three-layer VM with procedural execution, a dataflow engine, and FlatBuffer-backed shared memory

Core Design Decisions:

  • Memory Model: Arena allocation with bump-pointer allocators
  • Register Architecture: 256 typed registers (r0-r255)
  • Language: Statically typed, C-like procedural language
  • Storage: Columnar table layout with typed columns
  • Execution: Pull-based lazy evaluation with DAG interpreter

Phase 1: Foundation - FlatBuffers Schema Definition

1.1 Core Type System Schema

  • Define primitive type enumeration (i8, i16, i32, i64, f32, f64, bool, string)
  • Define reference types (ptr, table_ref, node_ref)
  • Create type descriptor tables with nullability, stride, and alignment metadata
  • Design schema versioning strategy

1.2 Memory Layout Schema (Arena-Based)

  • Define Arena table (base pointer, current offset, capacity, generation ID)
  • Define ArenaPool table for managing multiple arena generations
  • Create buffer metadata (arena reference, offset, size)
  • Design arena lifecycle tracking (creation, active, retired, freed)
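
The arena model above can be sketched in a few lines of C. This is a minimal illustration, not the final design: the `Arena` struct here mirrors the fields listed in 1.2, but the function names (`arena_init`, `arena_alloc`, `arena_reset`) and the use of `malloc` for the backing buffer are assumptions for the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Bump-pointer arena: base pointer, current offset, capacity,
   generation ID, matching the Arena table fields in 1.2. */
typedef struct {
    uint8_t *base;       /* start of the backing buffer */
    size_t   offset;     /* current bump position */
    size_t   capacity;   /* total bytes available */
    uint32_t generation; /* generation ID for lifecycle tracking */
} Arena;

int arena_init(Arena *a, size_t capacity) {
    a->base = malloc(capacity);
    if (!a->base) return -1;
    a->offset = 0;
    a->capacity = capacity;
    a->generation = 0;
    return 0;
}

/* Allocate `size` bytes aligned to `align` (a power of two);
   returns NULL when the arena is exhausted. */
void *arena_alloc(Arena *a, size_t size, size_t align) {
    size_t aligned = (a->offset + (align - 1)) & ~(align - 1);
    if (aligned > a->capacity || size > a->capacity - aligned) return NULL;
    a->offset = aligned + size;
    return a->base + aligned;
}

/* Retire everything in one step: bump the generation and reset
   the bump pointer; outstanding pointers become invalid. */
void arena_reset(Arena *a) {
    a->offset = 0;
    a->generation++;
}
```

Resetting in O(1) is the point of the arena: individual frees never happen, and the generation counter lets stale references be detected.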

1.3 Value Representation Schema

  • Define Value union type (scalar, table reference, dataflow node reference)
  • Create ScalarValue table (type tag, raw bytes)
  • Design TableValue table (schema reference, row count, column references)
  • Define DataflowNodeReference (node ID, output port)

1.4 Schema Evolution Strategy

  • Implement schema version numbering
  • Design backward compatibility layer for schema changes
  • Create validation utilities for schema conformance

Deliverables:

  • schemas/types.fbs - Type system definitions
  • schemas/memory.fbs - Memory management structures
  • schemas/values.fbs - Runtime value representations
  • schemas/versioning.fbs - Schema evolution support

Phase 2: Procedural Layer Foundation

2.1 Instruction Set Architecture (Typed Instructions)

  • Define typed instruction enumeration (LOAD_I32, STORE_F64, ADD_I64, etc.)
  • Create Instruction table (opcode, typed operands)
  • Design register addressing (r0-r255 with type tags)
  • Define control flow instructions (BR, BR_IF, CALL, RET)
  • Implement type validation for instruction operands
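
One plausible in-memory shape for a decoded instruction is sketched below. The opcode names follow the enumeration in 2.1, but the packed struct layout is an assumption made for illustration; the plan actually encodes instructions as FlatBuffer tables.

```c
#include <stdint.h>

/* Illustrative typed-instruction encoding for 2.1. */
typedef enum {
    OP_LOAD_I32,  /* rd <- i32 constant-pool entry */
    OP_STORE_F64, /* memory[imm] <- rs1 (f64) */
    OP_ADD_I64,   /* rd <- rs1 + rs2 (i64) */
    OP_BR,        /* unconditional branch to imm */
    OP_BR_IF,     /* branch to imm when rs1 != 0 */
    OP_CALL,
    OP_RET
} Opcode;

typedef struct {
    uint8_t opcode; /* Opcode value */
    uint8_t rd;     /* destination register, r0-r255 */
    uint8_t rs1;    /* first source register */
    uint8_t rs2;    /* second source register */
    int32_t imm;    /* immediate / branch target / constant index */
} Instruction;
```

Because opcodes carry their operand type (ADD_I64 vs. a hypothetical ADD_F64), the load-time validator in 2.1 can check register type tags against the opcode without inspecting values.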

2.2 Program Representation

  • Define Function table (name, typed parameters, typed locals, typed return, instruction sequence)
  • Create Module table (function list with signatures, typed constant pool, import/export tables)
  • Design ConstantPool table (typed constants: scalar, string, table_ref)
  • Implement function signature validation (type checking at load time)

2.3 Execution State Schema (Typed Register File)

  • Define TypedRegister table (register number, type tag, value union)
  • Create RegisterFile table (256 registers: r0-r255)
    • r0: constant zero (read-only)
    • r1: function return value
    • r2-r15: argument passing
    • r16-r255: general purpose
  • Design CallFrame table (return address, base register, saved register snapshot)
  • Define CallStack table (array of call frames, frame pointer)
  • Create ExecutionState table (program counter, typed register file, call stack, current arena)
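
The typed register file above might look like the following in C. The register conventions (r0 zero, r1 return, r2-r15 arguments) come from 2.3; the type-tag enum, struct names, and `reg_store_i64` helper are illustrative assumptions.

```c
#include <stdint.h>
#include <stdbool.h>

/* Sketch of the typed register file from 2.3. */
typedef enum { T_I32, T_I64, T_F32, T_F64, T_BOOL, T_TABLE_REF } TypeTag;

typedef struct {
    TypeTag tag;
    union {
        int32_t  i32;
        int64_t  i64;
        float    f32;
        double   f64;
        bool     b;
        uint32_t table_ref; /* index into the buffer pool */
    } v;
} TypedRegister;

typedef struct {
    TypedRegister r[256]; /* r0 = constant zero, r1 = return value,
                             r2-r15 = arguments, r16-r255 = general */
} RegisterFile;

/* A typed store that enforces r0's read-only-zero convention;
   returns 0 on success, -1 on an attempted write to r0. */
int reg_store_i64(RegisterFile *rf, uint8_t n, int64_t value) {
    if (n == 0) return -1; /* r0 is the read-only zero register */
    rf->r[n].tag = T_I64;
    rf->r[n].v.i64 = value;
    return 0;
}
```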

2.4 C11 Interpreter Core

  • Implement instruction decoder (FlatBuffer accessor → operation)
  • Create register file manipulator (load/store operations)
  • Build call stack manager (push/pop frames)
  • Implement control flow executor (branches, loops)
  • Design error handling model (trap codes, error propagation)
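
The decode-execute loop from 2.4 reduces to a switch over opcodes. In the sketch below, plain C structs stand in for the FlatBuffer accessors, the opcode set is trimmed to four entries, and the trap-code names are assumptions; only the overall shape (fetch, decode, dispatch, trap instead of abort) reflects the plan.

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal decode-execute loop; structs stand in for FlatBuffer data. */
typedef enum { OP_LOAD_I64, OP_ADD_I64, OP_BR_IF, OP_HALT } Op;

typedef struct { uint8_t op, rd, rs1, rs2; int64_t imm; } Insn;

typedef enum { TRAP_NONE, TRAP_BAD_OPCODE } Trap;

/* Execute until HALT or end of code; returns a trap code rather
   than aborting, per the error model in 2.4. */
Trap run(const Insn *code, size_t n, int64_t reg[256]) {
    size_t pc = 0;
    while (pc < n) {
        const Insn *i = &code[pc];
        switch (i->op) {
        case OP_LOAD_I64: reg[i->rd] = i->imm;                    pc++; break;
        case OP_ADD_I64:  reg[i->rd] = reg[i->rs1] + reg[i->rs2]; pc++; break;
        case OP_BR_IF:    pc = reg[i->rs1] ? (size_t)i->imm : pc + 1;   break;
        case OP_HALT:     return TRAP_NONE;
        default:          return TRAP_BAD_OPCODE; /* trap, don't abort */
        }
    }
    return TRAP_NONE;
}
```

The switch form is the portable baseline; the dispatch optimizations deferred to Phase 10 (computed goto, threaded code) replace only this inner loop.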

Deliverables:

  • schemas/instructions.fbs - Instruction set schema
  • schemas/program.fbs - Program representation
  • schemas/execution.fbs - Runtime execution state
  • src/interpreter/decoder.c - Instruction decoder
  • src/interpreter/executor.c - Execution engine
  • src/interpreter/callstack.c - Call stack management

Phase 3: Dataflow Layer Foundation

3.1 Operator Schema Definition

  • Define operator enumeration (SCAN, FILTER, PROJECT, MAP, REDUCE, JOIN, AGGREGATE, SORT)
  • Create Operator table (operator type, parameters, metadata)
  • Design operator parameter encoding (predicate expressions, projection lists, join conditions)

3.2 Dataflow Graph Schema (DAG)

  • Define DataflowNode table (node ID, operator, input node references, output schema)
  • Create DataflowGraph table (node list, topological ordering for validation)
  • Design node dependency tracking (enforce DAG constraint, detect cycles)
  • Implement graph validation (no cycles, valid input references)

3.3 Table Schema Definition (Columnar Layout)

  • Define Column table:
    • name: string
    • type: ColumnType (i8, i16, i32, i64, f32, f64, bool, string)
    • nullable: bool
    • null_bitmap: [u8] (bit-packed, 1 bit per row)
    • data: [u8] (typed data buffer or string offsets+data)
  • Create Table table (schema, column array, row count)
  • Design StringColumn (offsets array + concatenated UTF-8 data)
  • Implement schema inference utilities
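
Cell access against this columnar layout is cheap and branch-light. The sketch below assumes the bit convention "bit set = value present" for the null bitmap (an assumption; the schema could equally use the inverse) and shows why `memcpy` rather than pointer casting is used to read typed cells.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>
#include <string.h>

/* Null-bitmap access for the columnar layout in 3.3:
   1 bit per row, bit set meaning "value present" (assumed). */
bool column_is_valid(const uint8_t *null_bitmap, size_t row) {
    return (null_bitmap[row / 8] >> (row % 8)) & 1;
}

/* Read an i32 cell from a column's raw data buffer. memcpy avoids
   unaligned-access UB on strict platforms (relevant to 7.3). */
int32_t column_get_i32(const uint8_t *data, size_t row) {
    int32_t v;
    memcpy(&v, data + row * sizeof(int32_t), sizeof v);
    return v;
}
```

String columns follow the same pattern with one extra hop: `offsets[i]` to `offsets[i+1]` delimits row *i*'s bytes inside the concatenated UTF-8 buffer.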

3.4 Dataflow Executor (Pull-Based Iterator Model)

  • Define NodeIterator interface (next(), reset() operations)
  • Implement operator iterator dispatch (node type → iterator constructor)
  • Create SCAN iterator (read from columnar FlatBuffer table)
  • Implement FILTER iterator (pull from input, evaluate predicate, pass matching rows)
  • Build PROJECT iterator (pull from input, select/reorder columns)
  • Design MAP iterator (pull from input, apply transformation per row)
  • Implement materialize() function (pull all rows, build new FlatBuffer in arena)
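
The pull-based model above can be sketched as a small vtable-style interface. The `next()`/`reset()` operations come from 3.4; everything else (struct layout, a SCAN over a plain i64 array standing in for a columnar table, row-at-a-time rather than batch granularity) is an illustrative simplification.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Pull-based iterator interface from 3.4. */
typedef struct NodeIterator NodeIterator;
struct NodeIterator {
    /* Fills *row and returns true while rows remain; false at end. */
    bool (*next)(NodeIterator *self, int64_t *row);
    void (*reset)(NodeIterator *self);
    void *state;
};

/* SCAN over a plain i64 array (stand-in for a columnar table). */
typedef struct { const int64_t *data; size_t n, pos; } ScanState;
bool scan_next(NodeIterator *it, int64_t *row) {
    ScanState *s = it->state;
    if (s->pos >= s->n) return false;
    *row = s->data[s->pos++];
    return true;
}
void scan_reset(NodeIterator *it) { ((ScanState *)it->state)->pos = 0; }

/* FILTER pulls from its input and forwards rows passing the predicate. */
typedef struct { NodeIterator *input; bool (*pred)(int64_t); } FilterState;
bool filter_next(NodeIterator *it, int64_t *row) {
    FilterState *f = it->state;
    while (f->input->next(f->input, row))
        if (f->pred(*row)) return true;
    return false;
}
void filter_reset(NodeIterator *it) {
    ((FilterState *)it->state)->input->reset(((FilterState *)it->state)->input);
}

/* Example predicate for wiring a pipeline together. */
bool is_even(int64_t x) { return x % 2 == 0; }
```

Because FILTER only pulls what it needs, no intermediate table materializes; `materialize()` is the one place rows are drained eagerly into a new FlatBuffer.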

3.5 Expression Evaluation

  • Define expression tree schema (binary ops, unary ops, literals, column references)
  • Implement expression evaluator (recursive descent on expression tree)
  • Create predicate evaluator for FILTER operations
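
Recursive descent over the expression tree is direct to write down. The node kinds below mirror the schema bullets (binary ops, literals, column references); the `Expr` struct itself is a stand-in for the FlatBuffer encoding, and evaluating everything as i64 is a simplification of the typed value model.

```c
#include <stdint.h>
#include <stddef.h>

/* Expression tree from 3.5, i64-only for brevity. */
typedef enum { EXPR_LIT, EXPR_COL, EXPR_ADD, EXPR_MUL, EXPR_LT } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int64_t  lit;                 /* EXPR_LIT */
    size_t   col;                 /* EXPR_COL: column index */
    const struct Expr *lhs, *rhs; /* binary ops */
} Expr;

/* `row` supplies the current row's column values for EXPR_COL;
   comparison ops yield 1/0, so the same evaluator serves FILTER. */
int64_t expr_eval(const Expr *e, const int64_t *row) {
    switch (e->kind) {
    case EXPR_LIT: return e->lit;
    case EXPR_COL: return row[e->col];
    case EXPR_ADD: return expr_eval(e->lhs, row) + expr_eval(e->rhs, row);
    case EXPR_MUL: return expr_eval(e->lhs, row) * expr_eval(e->rhs, row);
    case EXPR_LT:  return expr_eval(e->lhs, row) < expr_eval(e->rhs, row);
    }
    return 0;
}
```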

Deliverables:

  • schemas/operators.fbs - Dataflow operator definitions
  • schemas/dataflow.fbs - Dataflow graph structure
  • schemas/tables.fbs - Table and column schemas
  • schemas/expressions.fbs - Expression trees
  • src/dataflow/executor.c - Operator executor
  • src/dataflow/operators/ - Individual operator implementations
  • src/dataflow/expressions.c - Expression evaluator

Phase 4: Integration Layer

4.1 Procedural-Dataflow Bridge

  • Design instruction set for dataflow operations (CREATE_NODE, EXECUTE_NODE, MATERIALIZE)
  • Implement dataflow node creation from procedural code
  • Create mechanism to pass dataflow results to registers
  • Build reference counting for dataflow node lifecycle

4.2 FlatBuffer Builder Integration

  • Create wrapper for FlatBuffers builder API
  • Implement result materialization (dataflow output → new FlatBuffer)
  • Design buffer pool management (allocation, reuse, freeing)
  • Build buffer versioning layer (maintain multiple generations)

4.3 Immutability Enforcement

  • Implement copy-on-write semantics for modified data
  • Design buffer snapshot mechanism
  • Create buffer lineage tracking (derivation chains)
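
Copy-on-write plus lineage tracking can be reduced to a tiny sketch: source buffers are never mutated, a "write" clones into fresh memory, bumps the version, and records its parent. All names here are illustrative, `malloc` stands in for the arena allocator, and error handling is omitted.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Immutable buffer with version and derivation chain (4.3 sketch). */
typedef struct Buffer {
    uint8_t *data;
    size_t size;
    uint32_t version;
    const struct Buffer *parent; /* lineage: buffer this was derived from */
} Buffer;

/* "Write" = clone, mutate the clone, link back to the parent.
   malloc stands in for arena allocation; failure handling omitted. */
Buffer buffer_write(const Buffer *src, size_t offset, uint8_t byte) {
    Buffer out = { malloc(src->size), src->size, src->version + 1, src };
    memcpy(out.data, src->data, src->size);
    out.data[offset] = byte; /* only the copy is mutated */
    return out;
}
```

The parent pointer is what makes snapshotting trivial: any older version remains a valid, self-consistent buffer for hosts still holding it.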

Deliverables:

  • src/integration/bridge.c - Procedural-dataflow bridge
  • src/integration/builder.c - FlatBuffer builder wrapper
  • src/integration/buffers.c - Buffer pool manager
  • schemas/integration.fbs - Integration layer schemas

Phase 5: Host Interoperability

5.1 Host API Definition

  • Define VM initialization API (vm_create, vm_destroy)
  • Create program loading API (vm_load_module, vm_unload_module)
  • Design execution API (vm_execute_function, vm_step, vm_resume)
  • Implement state inspection API (vm_get_state, vm_get_register, vm_get_dataflow_graph)
  • Build result extraction API (vm_get_result_table, vm_materialize_node)

5.2 Zero-Copy Access Guarantees

  • Document buffer lifetime semantics
  • Implement buffer pinning mechanism (prevent premature freeing)
  • Create buffer accessor utilities for host languages
  • Design error propagation to host (error codes, structured error objects)
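
One simple realization of buffer pinning is a pin count checked against a retirement flag: the host pins before reading zero-copy, and the VM may only free buffers that are both retired and unpinned. The struct and function names below are assumptions for the sketch (and a real implementation would need atomics if hosts pin from other threads).

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Pin-counted buffer for the zero-copy guarantees in 5.2. */
typedef struct {
    const uint8_t *data;
    size_t size;
    uint32_t pins;  /* outstanding host pins */
    bool retired;   /* VM no longer needs this buffer */
} PinnedBuffer;

/* Host side: pin before reading; fails once the buffer is retired. */
bool buffer_pin(PinnedBuffer *b) {
    if (b->retired) return false;
    b->pins++;
    return true;
}

/* Returns true when the buffer just became freeable. */
bool buffer_unpin(PinnedBuffer *b) {
    if (b->pins > 0) b->pins--;
    return b->retired && b->pins == 0;
}

/* VM side: called instead of freeing immediately; frees now only
   if no host still holds a pin. */
bool buffer_retire(PinnedBuffer *b) {
    b->retired = true;
    return b->pins == 0;
}
```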

5.3 Host Language Bindings

  • Generate C header files from FlatBuffer schemas
  • Create Python binding layer (using FlatBuffers Python library)
  • Implement Rust binding layer (using FlatBuffers Rust library)
  • Design Go binding layer (using FlatBuffers Go library)

Deliverables:

  • include/theta_vm.h - Public C API
  • bindings/python/ - Python host bindings
  • bindings/rust/ - Rust host bindings
  • bindings/go/ - Go host bindings
  • docs/host_integration.md - Integration guide

Phase 6: Advanced Dataflow Operations

6.1 Join Operators

  • Implement hash join (build hash table, probe)
  • Create nested loop join (fallback for non-equi joins)
  • Design sort-merge join (for sorted inputs)
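
The hash join's build/probe split is shown below in miniature over i64 keys. Open addressing with linear probing, the fixed power-of-two table size, and the multiplicative hash are all simplifications; a real build side must also size the table from the input and chain duplicate keys, which this sketch does not handle.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>
#include <string.h>

#define HJ_SLOTS 64 /* power of two; real code sizes this from the input */

typedef struct { int64_t key; size_t row; bool used; } HjSlot;

size_t hj_hash(int64_t k) {
    return (size_t)((uint64_t)k * 0x9E3779B97F4A7C15ULL) & (HJ_SLOTS - 1);
}

/* Build phase: insert every build-side key (assumes n << HJ_SLOTS
   and distinct keys; duplicates would need per-slot chains). */
void hj_build(HjSlot *table, const int64_t *keys, size_t n) {
    memset(table, 0, HJ_SLOTS * sizeof *table);
    for (size_t i = 0; i < n; i++) {
        size_t s = hj_hash(keys[i]);
        while (table[s].used) s = (s + 1) & (HJ_SLOTS - 1); /* linear probe */
        table[s] = (HjSlot){ keys[i], i, true };
    }
}

/* Probe phase: return the build-side row for `key`, SIZE_MAX on miss. */
size_t hj_probe(const HjSlot *table, int64_t key) {
    size_t s = hj_hash(key);
    while (table[s].used) {
        if (table[s].key == key) return table[s].row;
        s = (s + 1) & (HJ_SLOTS - 1);
    }
    return SIZE_MAX;
}
```

Building over the smaller input keeps the table cache-resident, which is why the build/probe sides are chosen by cardinality in practice.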

6.2 Aggregation Operators

  • Implement GROUP BY execution (hash-based grouping)
  • Create aggregate functions (SUM, COUNT, AVG, MIN, MAX)
  • Design streaming aggregation for memory efficiency

6.3 Sort Operator

  • Implement multi-column sort with collation
  • Create external sort for large datasets (spill to disk)

6.4 Query Optimization (Pre-Execution)

  • Implement predicate pushdown (move filters closer to scans)
  • Create projection pushdown (only read needed columns from source)
  • Design operator fusion (combine filter→project→map into single pass)
  • Implement common subexpression elimination (reuse shared DAG subtrees)
  • Note: Lazy pull-based evaluation is already implemented in Phase 3.4

Deliverables:

  • src/dataflow/operators/join.c - Join implementations
  • src/dataflow/operators/aggregate.c - Aggregation operators
  • src/dataflow/operators/sort.c - Sort operator
  • src/dataflow/optimizer.c - Query optimization

Phase 7: Safety and Correctness

7.1 Memory Safety

  • Implement bounds checking for all buffer accesses
  • Create assertion framework for invariant checking
  • Design buffer validity verification (detect corruption)
  • Build memory leak detection utilities
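
The bounds check itself is two comparisons, but the order matters: the naive form `offset + len <= size` can wrap on hostile inputs. A sketch of the overflow-safe check (names and the `Status` enum are illustrative):

```c
#include <stddef.h>

/* Bounds checking from 7.1: validate before touching memory and
   return a status code instead of invoking UB. */
typedef enum { OK = 0, ERR_BOUNDS = 1 } Status;

Status check_range(size_t buf_size, size_t offset, size_t len) {
    if (offset > buf_size) return ERR_BOUNDS;
    if (len > buf_size - offset) return ERR_BOUNDS; /* cannot overflow */
    return OK;
}
```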

7.2 Determinism Validation

  • Create reproducibility test suite (same input → same output)
  • Implement instruction execution tracing
  • Design regression testing framework

7.3 Platform Portability

  • Test on x86-64, ARM64, RISC-V architectures
  • Validate endianness handling
  • Test alignment requirement compliance
  • Verify float/double precision consistency

7.4 Error Handling

  • Define error code enumeration (type errors, bounds errors, execution errors)
  • Create structured error reporting (error message, location, stack trace)
  • Implement graceful degradation (trap and unwind vs. abort)

Deliverables:

  • src/safety/bounds_check.c - Bounds checking utilities
  • src/safety/validator.c - Buffer validation
  • tests/determinism/ - Determinism test suite
  • tests/portability/ - Cross-platform tests
  • src/errors/ - Error handling infrastructure

Phase 8: Tooling and Developer Experience

8.1 VM Introspection Tools

  • Build execution state dumper (print current VM state)
  • Create dataflow graph visualizer (DOT format export)
  • Implement instruction disassembler (bytecode → human-readable)
  • Design execution tracer (log each instruction)

8.2 Debugging Support

  • Implement breakpoint mechanism
  • Create single-step execution mode
  • Build register watch facility
  • Design dataflow node inspection utilities

8.3 Performance Analysis

  • Create execution profiler (instruction counts, operator timings)
  • Implement memory usage analyzer
  • Build dataflow bottleneck detector

8.4 Documentation

  • Write architecture overview documentation
  • Create FlatBuffer schema documentation
  • Document instruction set reference
  • Write operator semantics guide
  • Create host integration tutorial

Deliverables:

  • tools/vm_inspect - State inspection tool
  • tools/df_visualizer - Dataflow graph visualizer
  • tools/disassembler - Bytecode disassembler
  • tools/profiler - Performance profiler
  • docs/architecture.md - Architecture documentation
  • docs/instruction_set.md - ISA reference
  • docs/operators.md - Dataflow operator reference

Phase 9: Standard Library and Built-ins

9.1 Built-in Functions

  • Implement string manipulation functions
  • Create mathematical functions (sin, cos, sqrt, log, exp)
  • Design date/time manipulation functions
  • Build type conversion functions

9.2 I/O Operations

  • Design file reading operators (CSV, JSON, Parquet)
  • Implement table serialization (export to formats)
  • Create streaming I/O for large datasets

9.3 Standard Dataflow Patterns

  • Create common query templates (group-aggregate, window functions)
  • Implement reusable transformation pipelines

Deliverables:

  • src/stdlib/ - Standard library implementations
  • src/io/ - I/O operation implementations
  • schemas/stdlib.fbs - Standard library schemas
  • docs/stdlib.md - Standard library reference

Phase 10: Optimization and Production Readiness

10.1 Performance Optimization

  • Implement SIMD vectorization for hot paths (columnar operations, filters)
  • Create instruction dispatch optimization (computed goto, threaded code)
  • Optimize FlatBuffer access patterns (cache locality, prefetching)
  • Tune arena allocation parameters (arena size, growth strategy)

10.2 Benchmarking

  • Create benchmark suite (micro-benchmarks, end-to-end scenarios)
  • Compare against Lua (procedural execution speed)
  • Compare against DuckDB (dataflow operation speed)
  • Measure memory overhead

10.3 Production Hardening

  • Implement comprehensive error recovery
  • Create resource limitation enforcement (memory caps, execution timeouts)
  • Design security sandboxing (restrict host access)
  • Build graceful shutdown mechanism

10.4 Release Engineering

  • Create build system (CMake or Meson)
  • Set up continuous integration
  • Implement versioning and release process
  • Design upgrade path for schema evolution

Deliverables:

  • Optimized production build
  • Benchmark results and analysis
  • Security audit report
  • Release artifacts (binaries, headers, documentation)

Dependencies and Prerequisites

Required Tools

  • C11-compliant compiler (GCC 4.9+, Clang 3.4+, or MSVC 2019 16.8+, which added /std:c11)
  • FlatBuffers compiler (flatc)
  • CMake 3.15+ or Meson build system
  • Python 3.8+ (for tooling and bindings)

External Libraries

  • FlatBuffers C library
  • Testing framework (e.g., Check, Unity, or cmocka)

Risk Assessment

Technical Risks

  1. FlatBuffer performance overhead - Accessor indirection may slow hot paths

    • Mitigation: Benchmark early, optimize critical sections with SIMD, cache hot paths
  2. Arena memory overhead - Long-running computations may accumulate large arenas

    • Mitigation: Implement arena compaction, allow arena reuse for similar-sized allocations
  3. Determinism on edge cases - Floating-point operations may vary across platforms

    • Mitigation: Use strict FP modes, document platform requirements
  4. Schema evolution complexity - Breaking changes may complicate upgrades

    • Mitigation: Version all schemas, maintain compatibility layer

Project Risks

  1. Scope creep - Feature additions may delay core implementation

    • Mitigation: Strict adherence to phased plan, defer non-essential features
  2. Documentation debt - Complex design may be hard to document

    • Mitigation: Document schemas and APIs during development, not after

Success Criteria

  1. VM can execute procedural scripts with Lua-like semantics
  2. Dataflow operations match DuckDB’s expressiveness for core operators
  3. Host languages can access VM state without copying memory
  4. All FlatBuffer regions remain valid and accessible to host
  5. Execution is deterministic across supported platforms
  6. Memory safety guaranteed (no UB, no leaks in normal operation)
  7. Performance within 2x of native C for procedural code
  8. Performance within 3x of DuckDB for dataflow operations

Timeline Estimation (Rough Order)

  • Phase 1: 2-3 weeks (schema design is iterative)
  • Phase 2: 4-6 weeks (interpreter core is complex)
  • Phase 3: 4-6 weeks (dataflow engine requires careful design)
  • Phase 4: 2-3 weeks (integration layer)
  • Phase 5: 3-4 weeks (host bindings for multiple languages)
  • Phase 6: 4-5 weeks (advanced operators are intricate)
  • Phase 7: 3-4 weeks (safety testing is thorough)
  • Phase 8: 2-3 weeks (tooling development)
  • Phase 9: 2-3 weeks (standard library)
  • Phase 10: 3-4 weeks (optimization and hardening)

Total estimated effort: 29-41 weeks (single developer, full-time)

For a team of 2-3 developers, overlapping phases could reduce calendar time to 6-9 months.


Next Steps

  1. Review and validate this plan with stakeholders
  2. Set up project repository and build infrastructure
  3. Begin Phase 1: Design and iterate on FlatBuffer schemas
  4. Create minimal working prototype (simple procedural interpreter)
  5. Iterate on design based on early implementation learnings