It’s easy enough to create a handful of records for testing in SQL Server. What if you want 100 rows or 1000 rows? What if you want data that looks more legitimate compared to gibberish? In this post, we’ll look at different ways to generate mock data.
One of the trickiest things about creating mock data is getting the distributions right. For example,
ABS(CHECKSUM(NEWID()) is great (just as
RAND(CHECKSUM(NEWID())), but the results follow a uniform distribution because of the nature of checksums and random number generators. This makes charting numeric values look unnatural. Here’s an example I put together of generating data off of a normal distribution. It does take more effort, but if you’re generating this fake data to show it to users in tools like Power BI or Tableau, having data follow reasonable distributions is a good thing. That is, use whatever distribution makes sense for the particular data element: uniform, normal, Pareto (power law), gamma, etc.