Dimensional Modeling

Jen Underwood explains the basics of dimensional modeling:

A dimensional model is also commonly called a star schema. It provides a way to improve report query performance without affecting data integrity. This type of model is popular in data warehousing because it can provide better query performance than transactional, normalized, OLTP data models. It also allows for data history to be stored accurately over time for reporting. Another reason why dimensional models are created…they are easier for non-technical users to navigate. Creating reports by joining many OLTP database tables together becomes overwhelming quickly.

Dimensional models contain facts surrounded by descriptive data called dimensions. Facts contains numerical values of what you measure such as sales or user counts that are additive, or semi-additive in nature. Fact tables also contain the keys/links to associated dimension tables. Compared to most dimension tables, fact tables typically have a large number of rows.

Jen’s post was built off of an early SQL Saturday presentation.  It’s still quite relevant today.

Related Posts

Page Ranking With Kafka Streams

Hunter Kelly walks through a page ranking algorithm: Once you have the adjacency matrix, you perform some straightforward matrix calculations to calculate a vector of Hub scores and a vector of Authority scores as follows: Sum across the columns and normalize, this becomes your Hub vector Multiply the Hub vector element-wise across the adjacency matrix […]

Read More

Data Lakes Aren’t New

Shannon Lowder reveals one of the deep, dark data lake secrets: Turns out there are three basic zones or areas to a data lake. Raw, Managed, and Presentation. The raw zone should be optimized for fast storage.  The goal is to get the data in as quickly as possible.  Don’t make any changes to this […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31