The Value Of Sparse Columns

Erin Stellato discusses sparse columns:

In conclusion, we see a significant reduction in disk space and IO when sparse columns are used, and they perform slightly better than non-sparse columns in our simple data modification tests (note that retrieval performance should also be considered; perhaps the subject of another post).

Sparse columns have a lot of potential value, but in my experience, they fall short in one huge way:  you cannot compress tables with sparse columns.  Given that both sparse columns and data compression are things which benefit from scale, it’s important to make the right choice upfront.

Related Posts

Power BI Architecture Diagram V4

Dustin Ryan has a new version of the Power BI Architecture Diagram: First and most importantly, I updated the Power BI logo in the diagram to the latest version of the logo! Secondly, I included Power BI Dataflows in the diagram tagged #6. Power BI Dataflows are used to ingest, transform, integrate, and enrich big […]

Read More

Kafka And The Differing Aims Of Data Professionals

Kai Waehner argues that there is an impedence mismatch between data engineers, data scientists, and ML production engineers: Data scientists love Python, period. Therefore, the majority of machine learning/deep learning frameworks focus on Python APIs. Both the stablest and most cutting edge APIs, as well as the majority of examples and tutorials use Python APIs. […]

Read More

Categories

February 2016
MTWTFSS
« Jan Mar »
1234567
891011121314
15161718192021
22232425262728
29