Generating Test Data In Kafka

Yeva Byzek takes us through the Kafka Connect Datagen connector:

Short of using real data from a real source, you do have a few options on how to generate more interesting test data for your topics. One option is to write your own client. Kafka has many programming language options—you choose: Java, Python, Go, .NET, Erlang, Rust—the list goes on. You can write your own Kafka client applications that produce any kind of records to a Kafka topic, and then you’re set.
But wouldn’t it be great if you could generate data locally to just fill topics with messages? Fortunately, you’re in luck, because we have those data generators.
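
To make the "write your own client" option concrete, here is a minimal Python sketch of the data-generating half of such a client. Everything in it is an assumption for illustration: the pageview fields, the `USERS` and `PAGES` values, and the `make_record` and `generate` helpers are hypothetical and do not come from the linked article. The `send` callback stands in for a real producer, so the sketch runs without a broker.

```python
import json
import random
import time

# Hypothetical field values for illustration; not taken from the article.
USERS = ["user_1", "user_2", "user_3"]
PAGES = ["/home", "/cart", "/checkout"]


def make_record(rng, now=None):
    """Build one fake pageview event as a (key, value) pair of bytes."""
    event = {
        "user": rng.choice(USERS),
        "page": rng.choice(PAGES),
        "ts_ms": int((now if now is not None else time.time()) * 1000),
    }
    key = event["user"].encode("utf-8")
    value = json.dumps(event, sort_keys=True).encode("utf-8")
    return key, value


def generate(count, send, seed=None):
    """Create `count` records and hand each one to `send(key, value)`."""
    rng = random.Random(seed)
    for _ in range(count):
        key, value = make_record(rng)
        send(key, value)


if __name__ == "__main__":
    # Stand-in for a real producer: print instead of writing to a topic.
    generate(5, lambda k, v: print(k.decode(), v.decode()), seed=42)
```

In a real client, `send` would wrap whatever produce call your Kafka library of choice exposes. Keeping generation separate from delivery means the fake data can be checked without a cluster, which is also roughly the convenience a ready-made data generator gives you.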

Click through for a demonstration.

