Technical Depth

Read the Algorithms

Every optimization in TabbyDB is documented at the algorithmic level. Read the white paper before you decide. Then run the benchmark.

01

Constraint Propagation Rule Optimization

Describes the new algorithm that solves the constraint blow-up caused by the permutational nature of the constraint-propagation logic in stock Spark. Ensures the number of constraints never exceeds the number of filters, preventing the exponential growth that causes 1–8 hour compile times.
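
To make the blow-up concrete, here is a toy model in Python. It is not Spark's or TabbyDB's code, and the white paper's actual algorithm is not reproduced here. It assumes a single filter over three columns, each of which has two aliases, and compares a naive expansion (one constraint copy per combination of alias choices) with keeping one constraint plus an alias lookup table. All function and column names are hypothetical.

```python
from itertools import product

def permuted_constraints(columns, aliases):
    """Naive expansion: one constraint copy per combination of alias choices."""
    choices = [[col] + aliases.get(col, []) for col in columns]
    return [" + ".join(combo) + " > 10" for combo in product(*choices)]

def canonical_constraints(columns, aliases):
    """Keep one constraint over canonical names, plus an alias -> column lookup."""
    lookup = {alias: col for col, names in aliases.items() for alias in names}
    return [" + ".join(columns) + " > 10"], lookup

# Hypothetical filter: a + b + c > 10, where every column has two aliases.
columns = ["a", "b", "c"]
aliases = {"a": ["a1", "a2"], "b": ["b1", "b2"], "c": ["c1", "c2"]}

naive = permuted_constraints(columns, aliases)
compact, lookup = canonical_constraints(columns, aliases)
print(len(naive), len(compact))
```

In this model the naive count is (aliases per column + 1) raised to the number of columns, so it grows exponentially with the width of the filter, while the canonical form stays at one constraint per filter.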

02

Capping the Query Plan Size

Describes collapsing project nodes in the analysis phase to prevent extremely large tree sizes for query plans created via DataFrame APIs. Ensures optimizer rules run on the most compact possible tree — preventing OutOfMemory errors.
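
As a rough illustration of the idea, the Python sketch below models a plan as a chain of project nodes, one per added column, and merges the chain into a single node by inlining each column definition. It is a simplified assumption of how such a collapse could work, not TabbyDB's implementation; the `Project` class, the tuple-based expression encoding, and the column names are all hypothetical. A production rule would also have to handle non-deterministic or expensive expressions, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Expressions: ("col", name) | ("lit", value) | ("add", left, right)

@dataclass
class Project:
    exprs: Dict[str, tuple]        # output column -> expression over child columns
    child: Optional["Project"]     # None stands for the base relation

def substitute(expr, defs):
    """Replace column references with the child's definitions."""
    if expr[0] == "col":
        return defs.get(expr[1], expr)
    if expr[0] == "add":
        return ("add", substitute(expr[1], defs), substitute(expr[2], defs))
    return expr  # literal

def collapse(plan):
    """Merge a chain of Project nodes into a single Project."""
    if plan is None or plan.child is None:
        return plan
    child = collapse(plan.child)
    merged = {name: substitute(e, child.exprs) for name, e in plan.exprs.items()}
    return Project(merged, child.child)

def depth(plan):
    return 0 if plan is None else 1 + depth(plan.child)

# Emulate five successive "add a column" calls, each stacking a new Project.
plan, cols = None, ["x"]
for i in range(5):
    exprs = {c: ("col", c) for c in cols}
    exprs[f"c{i}"] = ("add", ("col", cols[-1]), ("lit", 1))
    plan = Project(exprs, plan)
    cols.append(f"c{i}")

flat = collapse(plan)
print(depth(plan), depth(flat))
```

The collapsed plan has a single node regardless of how many columns were added, so later rules traverse one node instead of one per API call.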

03

Pushdown of Broadcast Hash Join Keys as Runtime Filters

Describes the runtime enhancement of using broadcasted keys as dynamic filters for file pruning on non-partitioned column joins. Delivers 13% improvement on TPC-DS at 1TB and 2TB, and 46%+ with Apache Iceberg.
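
The general technique can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not TabbyDB's implementation: it assumes an inner join, assumes each probe-side file carries min/max statistics for the join column, and uses hypothetical file names and a hypothetical `prune_files` helper. A file is kept only if at least one broadcast key falls inside its min/max range.

```python
import bisect
from dataclasses import dataclass

@dataclass(frozen=True)
class FileStats:
    path: str       # hypothetical file name
    min_key: int    # minimum join-key value stored in the file
    max_key: int    # maximum join-key value stored in the file

def prune_files(files, broadcast_keys):
    """Keep only files whose [min_key, max_key] range contains a broadcast key."""
    keys = sorted(set(broadcast_keys))
    kept = []
    for f in files:
        # Smallest broadcast key that is >= the file's minimum, if any.
        i = bisect.bisect_left(keys, f.min_key)
        if i < len(keys) and keys[i] <= f.max_key:
            kept.append(f)
    return kept

files = [
    FileStats("part-0", 0, 99),
    FileStats("part-1", 100, 199),
    FileStats("part-2", 200, 299),
    FileStats("part-3", 300, 399),
]
broadcast_keys = [105, 110, 350]   # keys collected from the small (build) side

kept = prune_files(files, broadcast_keys)
print([f.path for f in kept])
```

Because the filter is derived from the join keys at run time, it can skip files even when the join column is not a partition column, as long as per-file statistics are available.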

04

TPC-DS Benchmark Details

Full breakdown of methodology, configuration, and results for the 1TB and 2TB benchmarks on AWS (6 r6gd.4xlarge nodes; 768GB RAM and 96 vCPUs in total). Includes query-by-query timing and data generation setup.

05

Extraction of Repeated Sub-Expressions

Describes applying expensive optimizer rules only once to complex repeated sub-expressions — marking them immutable so subsequent rule batches skip them entirely. Eliminates redundant tree traversal.
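
A toy Python sketch of the idea follows. It is an assumption-laden illustration rather than TabbyDB's implementation: constant folding stands in for an "expensive rule", expressions are plain tuples, and a dictionary named `done` plays the role of the immutability marker. A subtree recorded in `done` is returned as-is on every later encounter, including in later rule batches.

```python
def fold(expr, done, stats):
    """Constant-fold bottom-up, skipping subtrees already recorded in `done`."""
    if not isinstance(expr, tuple):
        return expr                          # leaf: column name or literal
    if done is not None and expr in done:
        return done[expr]                    # already processed: skip traversal
    stats["visits"] += 1
    op, *args = expr
    args = [fold(a, done, stats) for a in args]
    if op == "add" and all(isinstance(a, int) for a in args):
        result = sum(args)                   # fold constant addition
    else:
        result = (op, *args)
    if done is not None:
        done[expr] = result
        if isinstance(result, tuple):
            done[result] = result            # the output is final too
    return result

# The same sub-expression appears twice in one predicate.
shared = ("mul", ("add", 1, 2), "x")
expr = ("and", ("gt", shared, 0), ("lt", shared, 100))

naive_stats = {"visits": 0}
fold(expr, None, naive_stats)                # no marking: every copy is traversed

cached_stats, done = {"visits": 0}, {}
out = fold(expr, done, cached_stats)         # first batch
fold(out, done, cached_stats)                # second batch: skipped entirely
print(naive_stats["visits"], cached_stats["visits"])
```

With marking enabled, the repeated subtree is traversed once, and the second batch adds no visits at all because the whole output is already recorded as final.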