Get Started

Download TabbyDB. Free Trial.

Trial valid for approximately 3 months. Maximum 8 executors. No code changes required. Full rollback in 30 seconds.

Download TabbyDB as a complete Spark installation. 100% compatible with the corresponding Apache Spark version.

Iceberg Performance · TabbyDB 4.1.1

Apache Iceberg Runtime — Drop-in Replacement

The iceberg-tabbydb-runtime jar enables Broadcast Hash Join key pushdown at the scan level when querying Iceberg tables, a capability the standard iceberg-spark-runtime jar does not provide. In a single-node test on an M4 Mac against 50 GB of non-partitioned Iceberg data, runtime dropped by 46% (1608 s → 862 s) for queries sorted on the date column. Larger benchmarks at 1 TB and 2 TB are underway.
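
To make "key pushdown at the scan level" concrete, here is a minimal conceptual sketch in Python. It is not TabbyDB or Iceberg code: the `Split` class, the `prune_splits` function, and the sample dates are all hypothetical, and the sketch assumes each scan unit carries min/max statistics for the join column. The idea it illustrates is that once the small (broadcast) side of an inner join is known, its keys can be used to skip splits that cannot contain a match.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Split:
    """Hypothetical scan unit with min/max statistics on the date column."""
    name: str
    min_date: date
    max_date: date


def prune_splits(splits, build_keys):
    """Keep only splits whose [min_date, max_date] range contains a build-side key.

    Assumes inner-join semantics: with no build-side keys, nothing can match.
    """
    if not build_keys:
        return []
    return [
        s for s in splits
        if any(s.min_date <= k <= s.max_date for k in build_keys)
    ]


# Three splits covering consecutive quarters (illustrative values only).
splits = [
    Split("split-0", date(2001, 1, 1), date(2001, 3, 31)),
    Split("split-1", date(2001, 4, 1), date(2001, 6, 30)),
    Split("split-2", date(2001, 7, 1), date(2001, 9, 30)),
]

# Keys collected from the small, broadcast side of the join.
build_keys = {date(2001, 5, 15), date(2001, 5, 16)}

kept = prune_splits(splits, build_keys)
print([s.name for s in kept])  # prints ['split-1']
```

In this toy case only `split-1` survives, so the other two splits would never be read. The real runtime's pruning logic may differ.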

  • Already included in the Fresh Install (.tgz) of TabbyDB 4.1.1 — no action needed.
  • For the Convert Existing Spark path: download the jar below and replace your existing iceberg-spark-runtime-4.1.1 jar with it.
  • The default iceberg-spark-runtime jar remains 100% compatible with TabbyDB 4.1.1 — you just won't get the Broadcast Hash Join scan-level pushdown gains without this drop-in replacement.
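
For the Convert Existing Spark path, the jar swap could be scripted along these lines. This is a sketch under assumptions: the function name, the `.bak` suffix, the glob pattern, and the example file names are hypothetical, so adjust them to the actual jar names in your installation. It moves the existing runtime jar aside rather than deleting it, so the change is easy to undo.

```python
import shutil
from pathlib import Path


def swap_iceberg_runtime(jars_dir, replacement_jar,
                         pattern="iceberg-spark-runtime-*.jar",
                         backup_suffix=".bak"):
    """Back up matching runtime jars in jars_dir and copy in the replacement.

    Returns the list of backup paths. Raises FileNotFoundError, before any
    file is touched, if the replacement or an existing jar is missing.
    """
    jars_dir = Path(jars_dir)
    replacement_jar = Path(replacement_jar)
    if not replacement_jar.is_file():
        raise FileNotFoundError(f"replacement jar not found: {replacement_jar}")

    existing = sorted(jars_dir.glob(pattern))
    if not existing:
        raise FileNotFoundError(f"no jar matching {pattern} in {jars_dir}")

    backups = []
    for old in existing:
        # Renaming to *.jar.bak keeps the file for rollback; it no longer
        # ends in .jar, so it is not picked up from the jars directory.
        backup = old.with_name(old.name + backup_suffix)
        old.rename(backup)
        backups.append(backup)

    shutil.copy2(replacement_jar, jars_dir / replacement_jar.name)
    return backups


# Example call (paths and file name are placeholders, not real download names):
#   swap_iceberg_runtime("/opt/spark/jars", "/tmp/iceberg-tabbydb-runtime.jar")
```

To roll back, delete the copied jar and rename the `.bak` file to its original name.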

TPC-DS Benchmark · Patch

Non-Partitioned, Date-Sorted Splits Patch

If you're running TPC-DS on TabbyDB and seeing numbers similar to stock Spark, the toolkit is generating partitioned tables by default, which hides TabbyDB's gains. Apply this patch to the Databricks TPC-DS toolkit source to generate non-partitioned, locally date-sorted splits instead. With the patch, data generation becomes 6–7× faster and benchmark results show ~13% better performance at 1 TB to 2 TB scale.
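
As a rough illustration of the difference between the two layouts, the PySpark-style sketch below writes one table partitioned by date and one non-partitioned with rows sorted inside each output split. It is not the patch itself: the function names, paths, and the `ss_sold_date_sk` example column are assumptions for illustration, and the toolkit's actual generation code may work differently. Only standard DataFrame calls (`sortWithinPartitions`, `write.mode`, `write.partitionBy`, `write.parquet`) are used.

```python
def write_partitioned(df, path, date_col):
    """Partitioned layout: one directory per distinct value of date_col."""
    df.write.mode("overwrite").partitionBy(date_col).parquet(path)


def write_non_partitioned_sorted(df, path, date_col):
    """Non-partitioned layout: rows sorted by date_col inside each output split.

    sortWithinPartitions sorts locally, without a global shuffle-and-sort.
    """
    df.sortWithinPartitions(date_col).write.mode("overwrite").parquet(path)


# Example (assumes an active SparkSession named `spark` and an existing
# `store_sales` table; both are placeholders here):
#   df = spark.table("store_sales")
#   write_non_partitioned_sorted(df, "/tmp/store_sales_sorted", "ss_sold_date_sk")
```

Both functions take any Spark DataFrame, so the same input can be written both ways and the two layouts benchmarked side by side.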

  • Once-only application of expensive optimizer rules to complex repeated sub-expressions regardless of batch iterations (SPARK-36786)
  • Improved Broadcast Hash Join key pushdown performance at scan level (SPARK-44662)

Compare Performance in Real Time

Run the same query on stock Spark vs TabbyDB in our hosted Zeppelin notebooks.