A common practice in the data warehousing world is to use a Date Key as the unique identifier in a date dimension. This attribute is usually an integer in the format yyyymmdd. I’m not going to dive into all the reasons why it is used in data warehouse environments here, but (for fun) let’s change our data model to use the Date Key attribute in the relationship between the two tables.
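For readers who have not worked with this format before, here is a minimal Python sketch of how a yyyymmdd Date Key relates to a calendar date. The function names `to_date_key` and `from_date_key` are made up for illustration and are not part of any tool discussed here; the sample date is arbitrary.

```python
from datetime import date, datetime


def to_date_key(d: date) -> int:
    """Build an integer key in yyyymmdd form from a calendar date."""
    return d.year * 10000 + d.month * 100 + d.day


def from_date_key(key: int) -> date:
    """Parse a yyyymmdd integer key back into a calendar date."""
    return datetime.strptime(str(key), "%Y%m%d").date()


key = to_date_key(date(2016, 3, 7))
print(key)                 # 20160307
print(from_date_key(key))  # 2016-03-07
```

The key sorts in the same order as the dates it represents, but it is a plain integer, not a date value, so anything that relies on date semantics has to be pointed at the actual date column instead.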
If we look at our visualizations again, we see a very different picture. Sales by date still looks the same, but sales by month seems a little out of whack (image below). Cumulative sales at any other aggregated level (quarter, year, etc.) would be incorrect as well.
The reason for this is not immediately intuitive, so it’s good to know about it ahead of time rather than have to struggle with it later.