Making Wide World Importers Bigger

Kevin Feasel

2016-08-15

Data

Koen Verbeeck wants bigger fact tables for Wide World Importers:

Microsoft released a new sample database a couple of months back: Wide World Importers. It’s quite great: not every (unnecessary feature) is included but only features you’d actually use, lots of sample scripts are provided and – most importantly – you can generate data until the current date. One small drawback: it’s quite tiny. Especially the data warehouse is really small. The biggest table, Fact.Order, has about 266,000 rows and uses around 280MB on disk. Your numbers may vary, because I have generated data until the current date (12th of August 2016) and I generated data with more random samples per day. So most likely, other versions of WideWorldImportersDW might be even smaller. That’s right. Even smaller.

260 thousand rows is nothing for a fact table.  I was hoping that the data generator would allow for a bigger range of results, from “I only want a few thousand records” like it does up to “I need a reason to buy a new hard drive.”  Koen helps out by giving us a script to expand the primary fact table.

Related Posts

L-Diversity versus K-Anonymity

Duncan Greaves explains the concepts behind l-diversity: There are problems with K-anonymous datasets, namely the homogeneous pattern attack, and the background knowledge attack, details of which are in my original post. A slightly different approach to anonymising public datasets comes in the form of ℓ -diversity, a way of introducing further entropy/diversity into a dataset. […]

Read More

Economic Articles With Data Included

Kevin Feasel

2019-02-22

Data, R

Sebastian Kranz has a Shiny app to help you find economic papers with included data: One gets some information about the size of the data files and the used code files. I also tried to find and extract a README file from each supplement. Most README files explain whether all results can be replicated with […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031