Dimensional Modeling

Jen Underwood explains the basics of dimensional modeling:

A dimensional model is also commonly called a star schema. It provides a way to improve report query performance without affecting data integrity. This type of model is popular in data warehousing because it can provide better query performance than transactional, normalized, OLTP data models. It also allows for data history to be stored accurately over time for reporting. Another reason why dimensional models are created…they are easier for non-technical users to navigate. Creating reports by joining many OLTP database tables together becomes overwhelming quickly.

Dimensional models contain facts surrounded by descriptive data called dimensions. Facts contains numerical values of what you measure such as sales or user counts that are additive, or semi-additive in nature. Fact tables also contain the keys/links to associated dimension tables. Compared to most dimension tables, fact tables typically have a large number of rows.

Jen’s post was built off of an early SQL Saturday presentation.  It’s still quite relevant today.

Related Posts

Predicting Advertising Budgets With Kafka Streams

Boyang Chen explains how Pinterest uses Kafka Streams to reduce advertising overdelivery: Overdelivery occurs when free ads are shown for out-of-budget advertisers. This reduces opportunities for advertisers with available budget to have their products and services discovered by potential customers. Overdelivery is a difficult problem to solve for two reason: Real-time spend data: Information about […]

Read More

Using Kafka To Drive Machine Learning

Kai Waehner has a nice architectural post on using Kafka as the focal point for machine learning training and prediction: The essence of this architecture is that it uses Kafka as an intermediary between the various data sources from which feature data is collected, the model building environment where the model is fit, and the […]

Read More


July 2017
« Jun Aug »