Generating Artificial Data with the Databricks Labs Data Generator

Ust Oldfield shows off a new tool:

Databricks Labs is a relatively new offering from Databricks which showcases what their teams have been creating in the field to help their customers. As a Consultant, this makes my life a lot easier as I don’t have to re-invent the wheel and I can use it to demonstrate value in partnering with Databricks. There’s plenty of use cases that I’ll be using, and extending, with my client but the one I want to focus on in this post is the Data Generator.

Read on for an example of how this works. One thing Ust's post doesn't cover but is worth mentioning: you can control the distribution of random numeric features, rather than being stuck with uniformly distributed values. That's functionality you don't often see in data generators.
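To make the distribution point concrete, here is a minimal sketch in plain Python using only the standard library. It is not the Data Generator's own API. The function name `generate_feature` and its parameters are hypothetical, chosen purely for illustration. It shows how the same numeric range can yield very different data depending on which distribution the values are drawn from.

```python
import random
import statistics


def generate_feature(n, distribution="uniform", low=0.0, high=100.0, seed=42):
    """Return n floats in [low, high] drawn from the named distribution.

    Hypothetical helper for illustration; not part of any Databricks library.
    """
    rng = random.Random(seed)      # seeded so runs are repeatable
    midpoint = (low + high) / 2
    spread = (high - low) / 6      # puts low/high about 3 std devs from the midpoint

    values = []
    for _ in range(n):
        if distribution == "uniform":
            x = rng.uniform(low, high)
        elif distribution == "normal":
            x = rng.gauss(midpoint, spread)
        elif distribution == "exponential":
            # expovariate takes a rate (1 / mean); shift so values start at low
            x = low + rng.expovariate(1.0 / spread)
        else:
            raise ValueError(f"unknown distribution: {distribution}")
        values.append(min(max(x, low), high))  # clamp rare outliers into range
    return values


uniform_vals = generate_feature(10_000, "uniform")
normal_vals = generate_feature(10_000, "normal")
skewed_vals = generate_feature(10_000, "exponential")

for name, vals in [("uniform", uniform_vals),
                   ("normal", normal_vals),
                   ("exponential", skewed_vals)]:
    print(f"{name:12s} mean={statistics.mean(vals):6.2f} "
          f"median={statistics.median(vals):6.2f} "
          f"stdev={statistics.stdev(vals):6.2f}")
```

All three features share the same bounds, but the normal one clusters around the midpoint and the exponential one is skewed toward the low end. That matters when synthetic data needs to resemble real-world measures such as order amounts or response times.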

