Where Columnar Databases Struggle

Teo Lachev makes a good point regarding columnar databases:

A large company uses the SAP HANA ERP system. Users requires real-time access to transactional data. To avoid performance degradation, SLT replication (trigger-based change data capture) replicates data to another SAP HANA system that is used solely for reporting. The problem is that the more detailed the report gets and the more columns it has, the slower it gets and SAP HANA throws out of memory exceptions.

SAP HANA is an in-memory columnar database like Tabular. So, it stores data in columns, not rows. Columnar databases are primarily designed for analytical reports which typically have a few columns (sales by customer, product, date), but can potentially aggregate large datasets. As the reporting grain lowers and more columns are added (order number, order line item, customer name, phone number, etc.), a columnar database has to cross-join more and more columns. This is not efficient and performance quickly degrades irrespective that storage is fast. SSAS Tabular and Power BI are no different. SAP HANA complicates the issue further by preventing direct access to tables and requiring “analytical” views that join tables and potentially nest other views.

Read the whole thing.  Teo has a great point:  there are trade-offs between different data platform technologies, and choosing the right one is important.

Related Posts

When Rowstore Compression Beats Columnstore

Joe Obbish looks at scenarios where page-level compression on rowstore tables can beat columnstore compression in terms of resultant table size: It’s certainly more difficult to come up with a demo that works without string columns, but consider how the page compression algorithm works. Data can be compressed on page basis, which includes both multiple rows […]

Read More

Clustered Columnstore Index Online Rebuild

Niko Neugebauer looks at a feature which will pop up in SQL Server vNext: The current state of the Clustered Columnstore Index ONLINE rebuild points to be an unfinished version, which will definitely get vastly improved before being released & supported in SQL Server. I have seen a couple of deadlocks and canceled transactions and […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Categories

July 2018
MTWTFSS
« Jun  
 1
2345678
9101112131415
16171819202122
23242526272829
3031