Confluent Hub: A Central Repo For Kafka Connect

Kevin Feasel

2018-06-25

Hadoop

Tim Berglund announces Confluent Hub:

Connect has been an integral part of Apache Kafka since version 0.9, released late 2015. It has proved to be an effective framework for streaming data in and out of Kafka from nearby systems like relational databases, Amazon S3, HDFS clusters, and even nonstandard legacy systems that typically show themselves in the enterprise. Connect is an API on which the connectors themselves are built, plus a run-time framework that runs them in a scalable, fault-tolerant way. The intent was for the community to provide its own connectors to plug into this framework and do the work of data integration while saving everyone a bunch of unrewarding coding that was near-boilerplate and didn’t add a lot of differentiated value to the business.

So where would those connectors live? Well, GitHub, for starters. At the time of this writing, there were 660 repositories matching the search phrase “Kafka Connect” on the popular hosting service, all in various stages of repair and levels of maintenance. Beyond those, Confluent’s popular Connectors page has proven to be one of the best ways to find connectors, some of which are supported by Confluent, and others of which have robust community support behind them. The Connectors page listed for each entry the type of connector, the developer, a few tags, and how you could obtain the code—but that was really all it did. You still had to go find the released JARs for the connector, download them, and know how to install them properly. And if there were no released JARs available, you had to clone the repository, figure out how to run the build, and then install the JARs into your own Kafka Connect installation. Maybe not rocket science, but we all know it’s never as simple as it sounds. And besides, this was just connectors—no transformations or converters were available on this page.

We knew there was a better way. We wanted something that was easier to use, would spare you from building a connector from source every time you wanted an update (and from learning a new build tool every now and then), and would be built on top of a meaningful and functional discovery mechanism. And most importantly, we wanted to avoid the pitfalls of manually moving JARs around and having to debug why Connect didn’t find them.
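To make the manual process described above concrete: before the Hub, installing a connector typically meant building it yourself and dropping the JARs into a directory listed in the Connect worker's plugin.path property. A rough sketch of that workflow (the repository and paths below are hypothetical; plugin.path is the standard worker setting):

    # Build the connector from source (build tool varies by project)
    git clone https://github.com/example/kafka-connect-foo.git    # hypothetical repo
    cd kafka-connect-foo
    mvn package

    # Copy the built JARs into a directory the worker scans
    mkdir -p /opt/connect-plugins/kafka-connect-foo
    cp target/kafka-connect-foo-*.jar /opt/connect-plugins/kafka-connect-foo/

    # In connect-distributed.properties (or connect-standalone.properties):
    #   plugin.path=/opt/connect-plugins
    # Then restart the worker so it picks up the new plugin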

This looks like a good addition to the Kafka ecosystem.
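Confluent also ships a confluent-hub command-line client alongside the Hub, which collapses that whole dance into roughly one command (the connector coordinates below are illustrative):

    confluent-hub install confluentinc/kafka-connect-jdbc:latest

The client downloads the packaged connector and handles putting it where Connect can load it, rather than leaving you to shuttle JARs around by hand.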
