Stephane Maarek explains the value of using Apache Avro as a schema structure for your Kafka topics:
- Avro has support for primitive types (int, string, long, bytes, etc…), complex types (enum, arrays, unions, optionals), logical types (dates, timestamp-millis, decimal), and data record (name and namespace). All the types you’ll ever need.
- Avro has support for embedded documentation. Although documentation is optional, in my workflow I will reject any Avro Schema PR (pull request) that does not document every single field, even if obvious. By embedding documentation in the schema, you reduce data interpretation misunderstandings, you allow other teams to know about your data without searching a wiki, and you allow your devs to document your schema where they define it. It’s a win-win for everyone.
- Avro schemas are defined using JSON. Because every developer knows or can easily learn JSON, there’s a very low barrier to entry.
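To make the three points above concrete, here is a minimal sketch of an Avro schema written as JSON, together with a small check in the spirit of the "every field must be documented" rule. The record name `Customer`, the namespace `com.example.customers`, all field names, and the helper `undocumented_fields` are hypothetical, invented for illustration. The check uses only Python's standard `json` module and is an assumption about how such a review rule could be automated, not a description of Maarek's actual tooling.

```python
import json

# Hypothetical Avro schema (all names are assumptions for illustration).
# It shows a record with a name and namespace, primitive types, complex
# types (enum, array, union used as an optional), and logical types.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Customer",
  "namespace": "com.example.customers",
  "doc": "A customer of the example store.",
  "fields": [
    {"name": "id", "type": "long", "doc": "Unique customer identifier."},
    {"name": "full_name", "type": "string", "doc": "Full name of the customer."},
    {"name": "tier", "doc": "Loyalty tier.",
     "type": {"type": "enum", "name": "Tier",
              "symbols": ["BRONZE", "SILVER", "GOLD"]}},
    {"name": "emails", "doc": "Known email addresses.",
     "type": {"type": "array", "items": "string"}},
    {"name": "nickname", "doc": "Optional nickname; null when absent.",
     "type": ["null", "string"], "default": null},
    {"name": "signup_time", "doc": "Signup time in milliseconds since the epoch.",
     "type": {"type": "long", "logicalType": "timestamp-millis"}},
    {"name": "balance", "doc": "Account balance with two decimal places.",
     "type": {"type": "bytes", "logicalType": "decimal",
              "precision": 10, "scale": 2}}
  ]
}
"""


def undocumented_fields(schema):
    """Return the names of top-level record fields with a missing or blank "doc"."""
    return [
        field["name"]
        for field in schema.get("fields", [])
        if not str(field.get("doc", "")).strip()
    ]


schema = json.loads(SCHEMA_JSON)
missing = undocumented_fields(schema)
print("Undocumented fields:", missing)
```

A check like this could run in a pull-request pipeline and fail the build whenever the returned list is non-empty. It only inspects top-level fields, so nested records would need a recursive variant.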
Read on for more about Avro as well as the possibilities of using other techniques for defining schemas in Kafka.