Miles Cole makes an announcement:
I’m excited to formally announce LakeBench, now in version v0.3, the first Python-based multi-modal benchmarking library that supports multiple data processing engines on multiple benchmarks. You can find it on GitHub and PyPI.
Traditional benchmarks like TPC-DS and TPC-H focus heavily on analytical queries, but they miss the reality of modern data engineering: building complex ELT pipelines. LakeBench bridges this gap by introducing novel benchmarks that measure not just query performance, but also data loading, transformation, incremental processing, and maintenance operations. The first of such benchmarks is called ELTBench and is initially available in light mode.
Click through to see how it works and grab a copy if you’re interested.