Theta-lang: Implementation Plan: FlatBuffer-Backed Procedural Dataflow VM
Project: Theta Virtual Machine
Target Language: C11
Architecture: Three-layer VM with procedural execution, dataflow engine, and FlatBuffer-backed shared memory
Core Design Decisions:
- Memory Model: Arena allocation with bump-pointer allocators
- Register Architecture: 256 typed registers (r0-r255)
- Language: Statically typed, C-like procedural language
- Storage: Columnar table layout with typed columns
- Execution: Pull-based lazy evaluation with DAG interpreter
Phase 1: Foundation - FlatBuffers Schema Definition
1.1 Core Type System Schema
- Define primitive type enumeration (i8, i16, i32, i64, f32, f64, bool, string)
- Define reference types (ptr, table_ref, node_ref)
- Create type descriptor tables with nullability, stride, and alignment metadata
- Design schema versioning strategy
1.2 Memory Layout Schema (Arena-Based)
- Define Arena table (base pointer, current offset, capacity, generation ID)
- Define ArenaPool table for managing multiple arena generations
- Create buffer metadata (arena reference, offset, size)
- Design arena lifecycle tracking (creation, active, retired, freed)
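The arena lifecycle above can be sketched in C11 as follows. This is a minimal illustration, not the project's final API; the struct layout, names, and alignment strategy are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bump-pointer arena matching the Arena table fields:
   base pointer, current offset, capacity, generation ID. */
typedef struct {
    uint8_t *base;       /* start of the backing buffer            */
    size_t   offset;     /* current bump pointer                   */
    size_t   capacity;   /* total bytes available                  */
    uint32_t generation; /* incremented on each retire/reset cycle */
} Arena;

static int arena_init(Arena *a, size_t capacity) {
    a->base = malloc(capacity);
    if (!a->base) return -1;
    a->offset = 0;
    a->capacity = capacity;
    a->generation = 0;
    return 0;
}

/* Align the bump pointer, then hand out a slice; NULL on exhaustion. */
static void *arena_alloc(Arena *a, size_t size, size_t align) {
    size_t aligned = (a->offset + align - 1) & ~(align - 1);
    if (aligned + size > a->capacity) return NULL;
    void *p = a->base + aligned;
    a->offset = aligned + size;
    return p;
}

/* Retiring an arena just rewinds the pointer and bumps the generation,
   invalidating all outstanding references at once. */
static void arena_reset(Arena *a) {
    a->offset = 0;
    a->generation++;
}
```

The generation counter is what makes stale references detectable: any cached (arena, generation, offset) triple can be validated before use.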
1.3 Value Representation Schema
- Define Value union type (scalar, table reference, dataflow node reference)
- Create ScalarValue table (type tag, raw bytes)
- Design TableValue table (schema reference, row count, column references)
- Define DataflowNodeReference (node ID, output port)
1.4 Schema Evolution Strategy
- Implement schema version numbering
- Design backward compatibility layer for schema changes
- Create validation utilities for schema conformance
Deliverables:
- schemas/types.fbs - Type system definitions
- schemas/memory.fbs - Memory management structures
- schemas/values.fbs - Runtime value representations
- schemas/versioning.fbs - Schema evolution support
Phase 2: Procedural Layer Foundation
2.1 Instruction Set Architecture (Typed Instructions)
- Define typed instruction enumeration (LOAD_I32, STORE_F64, ADD_I64, etc.)
- Create Instruction table (opcode, typed operands)
- Design register addressing (r0-r255 with type tags)
- Define control flow instructions (BR, BR_IF, CALL, RET)
- Implement type validation for instruction operands
2.2 Program Representation
- Define Function table (name, typed parameters, typed locals, typed return, instruction sequence)
- Create Module table (function list with signatures, typed constant pool, import/export tables)
- Design ConstantPool table (typed constants: scalar, string, table_ref)
- Implement function signature validation (type checking at load time)
2.3 Execution State Schema (Typed Register File)
- Define TypedRegister table (register number, type tag, value union)
- Create RegisterFile table (256 registers: r0-r255)
- r0: constant zero (read-only)
- r1: function return value
- r2-r15: argument passing
- r16-r255: general purpose
- Design CallFrame table (return address, base register, saved register snapshot)
- Define CallStack table (array of call frames, frame pointer)
- Create ExecutionState table (program counter, typed register file, call stack, current arena)
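The register conventions above can be sketched as a C struct. The type tags, union members, and write-protection behavior shown here are illustrative assumptions, not the final schema.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the 256-entry typed register file. */
typedef enum { T_I32, T_I64, T_F64, T_BOOL, T_TABLE_REF } TypeTag;

typedef struct {
    TypeTag tag;
    union {
        int32_t  i32;
        int64_t  i64;
        double   f64;
        bool     b;
        uint32_t table_ref; /* index into the buffer pool */
    } v;
} TypedRegister;

typedef struct {
    TypedRegister r[256]; /* r0..r255 */
} RegisterFile;

/* Calling-convention register roles from the plan. */
enum { REG_ZERO = 0, REG_RET = 1, REG_ARG0 = 2, REG_ARG_LAST = 15 };

/* Writes to r0 are rejected, preserving the constant-zero convention;
   a production VM might raise a trap instead of returning false. */
static bool reg_write_i64(RegisterFile *rf, uint8_t n, int64_t val) {
    if (n == REG_ZERO) return false;
    rf->r[n].tag = T_I64;
    rf->r[n].v.i64 = val;
    return true;
}

static void regfile_init(RegisterFile *rf) {
    memset(rf, 0, sizeof *rf);
    rf->r[REG_ZERO].tag = T_I64; /* constant zero, read-only */
}
```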
2.4 C11 Interpreter Core
- Implement instruction decoder (FlatBuffer accessor → operation)
- Create register file manipulator (load/store operations)
- Build call stack manager (push/pop frames)
- Implement control flow executor (branches, loops)
- Design error handling model (trap codes, error propagation)
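The decode-dispatch-execute cycle can be sketched as a switch-based loop over a toy three-operand ISA. The opcodes, instruction encoding, and trap codes here are placeholders; the real decoder would read operands through FlatBuffer accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy instruction encoding for illustration only. */
typedef enum { OP_LOAD_IMM, OP_ADD_I64, OP_BR_IF, OP_RET } OpCode;

typedef struct {
    uint8_t op, dst, a, b;
    int64_t imm; /* immediate value or branch target */
} Instr;

typedef enum { TRAP_NONE, TRAP_BAD_OPCODE } TrapCode;

/* Decode and execute until RET; errors surface as trap codes rather
   than aborting, matching the trap-and-propagate error model. */
static TrapCode run(const Instr *code, size_t len, int64_t regs[256]) {
    size_t pc = 0;
    while (pc < len) {
        const Instr *in = &code[pc];
        switch ((OpCode)in->op) {
        case OP_LOAD_IMM: regs[in->dst] = in->imm; pc++; break;
        case OP_ADD_I64:  regs[in->dst] = regs[in->a] + regs[in->b]; pc++; break;
        case OP_BR_IF:    pc = regs[in->a] ? (size_t)in->imm : pc + 1; break;
        case OP_RET:      return TRAP_NONE; /* result lives in r1 */
        default:          return TRAP_BAD_OPCODE;
        }
    }
    return TRAP_NONE;
}
```

A small program that adds two constants into the return register r1 exercises the whole loop: two LOAD_IMMs into general-purpose registers, one ADD_I64 targeting r1, then RET.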
Deliverables:
- schemas/instructions.fbs - Instruction set schema
- schemas/program.fbs - Program representation
- schemas/execution.fbs - Runtime execution state
- src/interpreter/decoder.c - Instruction decoder
- src/interpreter/executor.c - Execution engine
- src/interpreter/callstack.c - Call stack management
Phase 3: Dataflow Layer Foundation
3.1 Operator Schema Definition
- Define operator enumeration (SCAN, FILTER, PROJECT, MAP, REDUCE, JOIN, AGGREGATE, SORT)
- Create Operator table (operator type, parameters, metadata)
- Design operator parameter encoding (predicate expressions, projection lists, join conditions)
3.2 Dataflow Graph Schema (DAG)
- Define DataflowNode table (node ID, operator, input node references, output schema)
- Create DataflowGraph table (node list, topological ordering for validation)
- Design node dependency tracking (enforce DAG constraint, detect cycles)
- Implement graph validation (no cycles, valid input references)
3.3 Table Schema Definition (Columnar Layout)
- Define Column table:
- name: string
- type: ColumnType (i8, i16, i32, i64, f32, f64, bool, string)
- nullable: bool
- null_bitmap: [u8] (bit-packed, 1 bit per row)
- data: [u8] (typed data buffer or string offsets+data)
- Create Table table (schema, column array, row count)
- Design StringColumn (offsets array + concatenated UTF-8 data)
- Implement schema inference utilities
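The column layout above can be sketched for a single i32 column. The struct names and the bitmap convention (bit set means non-null) are assumptions for illustration; the real layout lives in the FlatBuffer schema.

```c
#include <stdbool.h>
#include <stdint.h>

/* One typed data buffer plus a bit-packed null bitmap, 1 bit per row. */
typedef struct {
    const char    *name;
    const int32_t *data;        /* i32 column shown for simplicity     */
    const uint8_t *null_bitmap; /* bit i set => row i holds a value    */
    uint64_t       row_count;
} ColumnI32;

static bool col_is_null(const ColumnI32 *c, uint64_t row) {
    if (!c->null_bitmap) return false; /* non-nullable column */
    return (c->null_bitmap[row >> 3] & (1u << (row & 7))) == 0;
}

/* Sum of non-null values: the kind of tight sequential loop the
   columnar layout makes cache-friendly and SIMD-amenable. */
static int64_t col_sum(const ColumnI32 *c) {
    int64_t sum = 0;
    for (uint64_t i = 0; i < c->row_count; i++)
        if (!col_is_null(c, i)) sum += c->data[i];
    return sum;
}
```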
3.4 Dataflow Executor (Pull-Based Iterator Model)
- Define NodeIterator interface (next(), reset() operations)
- Implement operator iterator dispatch (node type → iterator constructor)
- Create SCAN iterator (read from columnar FlatBuffer table)
- Implement FILTER iterator (pull from input, evaluate predicate, pass matching rows)
- Build PROJECT iterator (pull from input, select/reorder columns)
- Design MAP iterator (pull from input, apply transformation per row)
- Implement materialize() function (pull all rows, build new FlatBuffer in arena)
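The pull-based iterator model can be sketched with a vtable-style NodeIterator and a FILTER that pulls from its input. Names, the single-int64 "row" simplification, and the predicate signature are all illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Each node exposes next()/reset() through a small vtable. */
typedef struct NodeIterator NodeIterator;
struct NodeIterator {
    bool (*next)(NodeIterator *self, int64_t *row_out);
    void (*reset)(NodeIterator *self);
};

/* SCAN over an in-memory array (stand-in for a columnar table read). */
typedef struct {
    NodeIterator base;
    const int64_t *rows;
    size_t len, pos;
} ScanIter;

static bool scan_next(NodeIterator *it, int64_t *out) {
    ScanIter *s = (ScanIter *)it;
    if (s->pos >= s->len) return false;
    *out = s->rows[s->pos++];
    return true;
}
static void scan_reset(NodeIterator *it) { ((ScanIter *)it)->pos = 0; }

/* FILTER pulls from its input and forwards only matching rows. */
typedef struct {
    NodeIterator base;
    NodeIterator *input;
    bool (*pred)(int64_t);
} FilterIter;

static bool filter_next(NodeIterator *it, int64_t *out) {
    FilterIter *f = (FilterIter *)it;
    int64_t v;
    while (f->input->next(f->input, &v))
        if (f->pred(v)) { *out = v; return true; }
    return false;
}
static void filter_reset(NodeIterator *it) {
    FilterIter *f = (FilterIter *)it;
    f->input->reset(f->input);
}

/* Example predicate for demonstration. */
static bool pred_even(int64_t v) { return (v & 1) == 0; }
```

Because FILTER only calls next() on demand, no intermediate result is ever materialized until a sink (such as materialize()) drains the pipeline.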
3.5 Expression Evaluation
- Define expression tree schema (binary ops, unary ops, literals, column references)
- Implement expression evaluator (recursive descent on expression tree)
- Create predicate evaluator for FILTER operations
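Recursive descent over the expression tree can be sketched as below. The node kinds and the flat int64 row representation are assumptions; the real tree would be read through FlatBuffer accessors and support the full type system.

```c
#include <stdint.h>

/* Illustrative expression-tree nodes: literals, column references,
   two binary arithmetic ops, and one comparison for predicates. */
typedef enum { EXPR_LIT, EXPR_COL, EXPR_ADD, EXPR_MUL, EXPR_LT } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int64_t  lit;                 /* EXPR_LIT                         */
    int      col;                 /* EXPR_COL: column index into row  */
    const struct Expr *lhs, *rhs; /* binary operands                  */
} Expr;

/* Evaluate one expression against one row; comparisons yield 0/1,
   which is what the FILTER predicate evaluator consumes. */
static int64_t expr_eval(const Expr *e, const int64_t *row) {
    switch (e->kind) {
    case EXPR_LIT: return e->lit;
    case EXPR_COL: return row[e->col];
    case EXPR_ADD: return expr_eval(e->lhs, row) + expr_eval(e->rhs, row);
    case EXPR_MUL: return expr_eval(e->lhs, row) * expr_eval(e->rhs, row);
    case EXPR_LT:  return expr_eval(e->lhs, row) <  expr_eval(e->rhs, row);
    }
    return 0;
}
```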
Deliverables:
- schemas/operators.fbs - Dataflow operator definitions
- schemas/dataflow.fbs - Dataflow graph structure
- schemas/tables.fbs - Table and column schemas
- schemas/expressions.fbs - Expression trees
- src/dataflow/executor.c - Operator executor
- src/dataflow/operators/ - Individual operator implementations
- src/dataflow/expressions.c - Expression evaluator
Phase 4: Integration Layer
4.1 Procedural-Dataflow Bridge
- Design instruction set for dataflow operations (CREATE_NODE, EXECUTE_NODE, MATERIALIZE)
- Implement dataflow node creation from procedural code
- Create mechanism to pass dataflow results to registers
- Build reference counting for dataflow node lifecycle
4.2 FlatBuffer Builder Integration
- Create wrapper for FlatBuffers builder API
- Implement result materialization (dataflow output → new FlatBuffer)
- Design buffer pool management (allocation, reuse, freeing)
- Build buffer versioning layer (maintain multiple generations)
4.3 Immutability Enforcement
- Implement copy-on-write semantics for modified data
- Design buffer snapshot mechanism
- Create buffer lineage tracking (derivation chains)
Deliverables:
- src/integration/bridge.c - Procedural-dataflow bridge
- src/integration/builder.c - FlatBuffer builder wrapper
- src/integration/buffers.c - Buffer pool manager
- schemas/integration.fbs - Integration layer schemas
Phase 5: Host Interoperability
5.1 Host API Definition
- Define VM initialization API (vm_create, vm_destroy)
- Create program loading API (vm_load_module, vm_unload_module)
- Design execution API (vm_execute_function, vm_step, vm_resume)
- Implement state inspection API (vm_get_state, vm_get_register, vm_get_dataflow_graph)
- Build result extraction API (vm_get_result_table, vm_materialize_node)
5.2 Zero-Copy Access Guarantees
- Document buffer lifetime semantics
- Implement buffer pinning mechanism (prevent premature freeing)
- Create buffer accessor utilities for host languages
- Design error propagation to host (error codes, structured error objects)
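Buffer pinning can be sketched as a simple pin count with deferred freeing. The struct and function names are illustrative assumptions; the point is that a buffer with outstanding host pins is retired, not freed, so zero-copy views stay valid.

```c
#include <stdbool.h>
#include <stdint.h>

/* A buffer with pins > 0 cannot be freed; "retire" defers the free
   until the last host pin is released. */
typedef struct {
    void    *data;
    uint32_t pin_count;
    bool     retired; /* owner asked to free; waiting for pins to drop */
} PinnedBuffer;

static void buf_pin(PinnedBuffer *b) { b->pin_count++; }

/* Returns true when the buffer just became actually freeable. */
static bool buf_unpin(PinnedBuffer *b) {
    if (b->pin_count > 0) b->pin_count--;
    return b->retired && b->pin_count == 0;
}

/* Returns true if the buffer can be freed immediately. */
static bool buf_retire(PinnedBuffer *b) {
    b->retired = true;
    return b->pin_count == 0;
}
```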
5.3 Host Language Bindings
- Generate C header files from FlatBuffer schemas
- Create Python binding layer (using FlatBuffers Python library)
- Implement Rust binding layer (using FlatBuffers Rust library)
- Design Go binding layer (using FlatBuffers Go library)
Deliverables:
- include/theta_vm.h - Public C API
- bindings/python/ - Python host bindings
- bindings/rust/ - Rust host bindings
- bindings/go/ - Go host bindings
- docs/host_integration.md - Integration guide
Phase 6: Advanced Dataflow Operations
6.1 Join Operators
- Implement hash join (build hash table, probe)
- Create nested loop join (fallback for non-equi joins)
- Design sort-merge join (for sorted inputs)
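The hash join's build/probe structure can be sketched with a fixed-size open-addressing table. The table size, hash function, and single-value payload are simplifying assumptions; a real implementation would grow the table and handle duplicate keys.

```c
#include <stdint.h>
#include <string.h>

#define HJ_SLOTS 64 /* power of two, assumed larger than the build side */

typedef struct { int64_t key; int64_t val; int used; } HjSlot;

/* Build phase: insert every row of the smaller input, linear probing
   on collision. */
static void hj_build(HjSlot table[HJ_SLOTS],
                     const int64_t *keys, const int64_t *vals, size_t n) {
    memset(table, 0, sizeof(HjSlot) * HJ_SLOTS);
    for (size_t i = 0; i < n; i++) {
        size_t h = (size_t)(keys[i] * 2654435761u) & (HJ_SLOTS - 1);
        while (table[h].used) h = (h + 1) & (HJ_SLOTS - 1);
        table[h] = (HjSlot){ keys[i], vals[i], 1 };
    }
}

/* Probe phase: returns 1 and fills *val_out on a key match. */
static int hj_probe(const HjSlot table[HJ_SLOTS], int64_t key,
                    int64_t *val_out) {
    size_t h = (size_t)(key * 2654435761u) & (HJ_SLOTS - 1);
    while (table[h].used) {
        if (table[h].key == key) { *val_out = table[h].val; return 1; }
        h = (h + 1) & (HJ_SLOTS - 1);
    }
    return 0;
}
```

In the pull-based model, the build phase runs once on the first next() call of the join iterator; each subsequent next() probes with one row from the other input.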
6.2 Aggregation Operators
- Implement GROUP BY execution (hash-based grouping)
- Create aggregate functions (SUM, COUNT, AVG, MIN, MAX)
- Design streaming aggregation for memory efficiency
6.3 Sort Operator
- Implement multi-column sort with collation
- Create external sort for large datasets (spill to disk)
6.4 Query Optimization (Pre-Execution)
- Implement predicate pushdown (move filters closer to scans)
- Create projection pushdown (only read needed columns from source)
- Design operator fusion (combine filter→project→map into single pass)
- Implement common subexpression elimination (reuse shared DAG subtrees)
- Note: Lazy pull-based evaluation is already implemented in Phase 3.4
Deliverables:
- src/dataflow/operators/join.c - Join implementations
- src/dataflow/operators/aggregate.c - Aggregation operators
- src/dataflow/operators/sort.c - Sort operator
- src/dataflow/optimizer.c - Query optimization
Phase 7: Safety and Correctness
7.1 Memory Safety
- Implement bounds checking for all buffer accesses
- Create assertion framework for invariant checking
- Design buffer validity verification (detect corruption)
- Build memory leak detection utilities
7.2 Determinism Validation
- Create reproducibility test suite (same input → same output)
- Implement instruction execution tracing
- Design regression testing framework
7.3 Platform Portability
- Test on x86-64, ARM64, RISC-V architectures
- Validate endianness handling
- Test alignment requirement compliance
- Verify float/double precision consistency
7.4 Error Handling
- Define error code enumeration (type errors, bounds errors, execution errors)
- Create structured error reporting (error message, location, stack trace)
- Implement graceful degradation (trap and unwind vs. abort)
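The trap-and-unwind model can be sketched as a status enum plus a small error value that carries location context. Codes, field names, and the static-string message convention are illustrative assumptions.

```c
#include <stdint.h>

/* Illustrative error codes covering the categories in the plan. */
typedef enum {
    THETA_OK = 0,
    THETA_ERR_TYPE,   /* operand type mismatch            */
    THETA_ERR_BOUNDS, /* buffer or register out of range  */
    THETA_ERR_TRAP,   /* explicit trap instruction        */
    THETA_ERR_OOM     /* arena exhausted                  */
} ThetaStatus;

typedef struct {
    ThetaStatus status;
    uint32_t    pc;      /* instruction index where the trap fired      */
    const char *message; /* static string: no allocation on error paths */
} ThetaError;

/* Constructor callers use to propagate the first failure upward
   (trap and unwind) instead of aborting the process. */
static ThetaError theta_trap(ThetaStatus s, uint32_t pc, const char *msg) {
    ThetaError e = { s, pc, msg };
    return e;
}

static int theta_is_ok(ThetaError e) { return e.status == THETA_OK; }
```

Keeping the message as a static string means error construction itself can never fail, which matters when the trap being reported is arena exhaustion.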
Deliverables:
- src/safety/bounds_check.c - Bounds checking utilities
- src/safety/validator.c - Buffer validation
- tests/determinism/ - Determinism test suite
- tests/portability/ - Cross-platform tests
- src/errors/ - Error handling infrastructure
Phase 8: Tooling and Developer Experience
8.1 VM Introspection Tools
- Build execution state dumper (print current VM state)
- Create dataflow graph visualizer (DOT format export)
- Implement instruction disassembler (bytecode → human-readable)
- Design execution tracer (log each instruction)
8.2 Debugging Support
- Implement breakpoint mechanism
- Create single-step execution mode
- Build register watch facility
- Design dataflow node inspection utilities
8.3 Performance Analysis
- Create execution profiler (instruction counts, operator timings)
- Implement memory usage analyzer
- Build dataflow bottleneck detector
8.4 Documentation
- Write architecture overview documentation
- Create FlatBuffer schema documentation
- Document instruction set reference
- Write operator semantics guide
- Create host integration tutorial
Deliverables:
- tools/vm_inspect - State inspection tool
- tools/df_visualizer - Dataflow graph visualizer
- tools/disassembler - Bytecode disassembler
- tools/profiler - Performance profiler
- docs/architecture.md - Architecture documentation
- docs/instruction_set.md - ISA reference
- docs/operators.md - Dataflow operator reference
Phase 9: Standard Library and Built-ins
9.1 Built-in Functions
- Implement string manipulation functions
- Create mathematical functions (sin, cos, sqrt, log, exp)
- Design date/time manipulation functions
- Build type conversion functions
9.2 I/O Operations
- Design file reading operators (CSV, JSON, Parquet)
- Implement table serialization (export to formats)
- Create streaming I/O for large datasets
9.3 Standard Dataflow Patterns
- Create common query templates (group-aggregate, window functions)
- Implement reusable transformation pipelines
Deliverables:
- src/stdlib/ - Standard library implementations
- src/io/ - I/O operation implementations
- schemas/stdlib.fbs - Standard library schemas
- docs/stdlib.md - Standard library reference
Phase 10: Optimization and Production Readiness
10.1 Performance Optimization
- Implement SIMD vectorization for hot paths (columnar operations, filters)
- Create instruction dispatch optimization (computed goto, threaded code)
- Optimize FlatBuffer access patterns (cache locality, prefetching)
- Tune arena allocation parameters (arena size, growth strategy)
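Threaded-code dispatch can be sketched with the GCC/Clang labels-as-values extension (computed goto), which removes the per-instruction bounds check and branch misprediction cost of a switch. This is a non-standard extension, so a portable build would keep the switch loop as a fallback; the opcodes below are toy placeholders.

```c
#include <stddef.h>
#include <stdint.h>

enum { TOP_PUSH, TOP_ADD, TOP_HALT };

/* Each handler ends by jumping directly to the next opcode's handler,
   so the dispatch branch is spread across handlers instead of being a
   single highly unpredictable jump. */
static int64_t threaded_run(const int64_t *code) {
    static const void *dispatch[] = { &&op_push, &&op_add, &&op_halt };
    int64_t stack[64];
    size_t sp = 0;
    const int64_t *ip = code;
    goto *dispatch[*ip++];
op_push:
    stack[sp++] = *ip++;
    goto *dispatch[*ip++];
op_add:
    sp--;
    stack[sp - 1] += stack[sp];
    goto *dispatch[*ip++];
op_halt:
    return stack[sp - 1];
}
```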
10.2 Benchmarking
- Create benchmark suite (micro-benchmarks, end-to-end scenarios)
- Compare against Lua (procedural execution speed)
- Compare against DuckDB (dataflow operation speed)
- Measure memory overhead
10.3 Production Hardening
- Implement comprehensive error recovery
- Create resource limitation enforcement (memory caps, execution timeouts)
- Design security sandboxing (restrict host access)
- Build graceful shutdown mechanism
10.4 Release Engineering
- Create build system (CMake or Meson)
- Set up continuous integration
- Implement versioning and release process
- Design upgrade path for schema evolution
Deliverables:
- Optimized production build
- Benchmark results and analysis
- Security audit report
- Release artifacts (binaries, headers, documentation)
Dependencies and Prerequisites
Required Tools
- C11-compliant compiler (GCC 4.9+, Clang 3.4+, or MSVC 2019 16.8+, which added C11 support)
- FlatBuffers compiler (flatc)
- CMake 3.15+ or Meson build system
- Python 3.8+ (for tooling and bindings)
External Libraries
- FlatBuffers C library
- Testing framework (e.g., Check, Unity, or cmocka)
Risk Assessment
Technical Risks
FlatBuffer performance overhead - Accessor indirection may slow hot paths
- Mitigation: Benchmark early, optimize critical sections with SIMD, cache hot paths
Arena memory overhead - Long-running computations may accumulate large arenas
- Mitigation: Implement arena compaction, allow arena reuse for similar-sized allocations
Determinism on edge cases - Floating-point operations may vary across platforms
- Mitigation: Use strict FP modes, document platform requirements
Schema evolution complexity - Breaking changes may complicate upgrades
- Mitigation: Version all schemas, maintain compatibility layer
Project Risks
Scope creep - Feature additions may delay core implementation
- Mitigation: Strict adherence to phased plan, defer non-essential features
Documentation debt - Complex design may be hard to document
- Mitigation: Document schemas and APIs during development, not after
Success Criteria
- VM can execute procedural scripts with Lua-like semantics
- Dataflow operations match DuckDB’s expressiveness for core operators
- Host languages can access VM state without copying memory
- All FlatBuffer regions remain valid and accessible to host
- Execution is deterministic across supported platforms
- Memory safety guaranteed (no UB, no leaks in normal operation)
- Performance within 2x of native C for procedural code
- Performance within 3x of DuckDB for dataflow operations
Timeline Estimation (Rough Order)
- Phase 1: 2-3 weeks (schema design is iterative)
- Phase 2: 4-6 weeks (interpreter core is complex)
- Phase 3: 4-6 weeks (dataflow engine requires careful design)
- Phase 4: 2-3 weeks (integration layer)
- Phase 5: 3-4 weeks (host bindings for multiple languages)
- Phase 6: 4-5 weeks (advanced operators are intricate)
- Phase 7: 3-4 weeks (safety testing is thorough)
- Phase 8: 2-3 weeks (tooling development)
- Phase 9: 2-3 weeks (standard library)
- Phase 10: 3-4 weeks (optimization and hardening)
Total estimated effort: 29-41 weeks (single developer, full-time)
For a team of 2-3 developers, overlapping phases could reduce calendar time to 6-9 months.
Next Steps
- Review and validate this plan with stakeholders
- Set up project repository and build infrastructure
- Begin Phase 1: Design and iterate on FlatBuffer schemas
- Create minimal working prototype (simple procedural interpreter)
- Iterate on design based on early implementation learnings