Press "Enter" to skip to content

Generating Fake Data

Rich Benner shows us how to use the Faker library in Python to generate test data:

There are far more options when using Faker. Looking at the official documentation you’ll see the list of different data types you can generate as well as options such as region specific data.

Go have fun trying this, it’s a small setup for a large amount of time saved.

These types of tools can be great for generating a bunch of data but come with a couple of risks. One is that in the fake addresses Rich shows, ZIP codes don’t match their states at all, so if your application needs valid combos, it can cause issues. The other problem comes from distributions: generated data often gets created off of a uniform distribution, so you might not find skewness-related problems (e.g., parameter sniffing issues) strictly in your test data.

That said, easily generating test data is powerful and I don’t want to let the good be the enemy of the great.