Kafka Analytics Patterns In HDP 3.1

Kevin Feasel

2018-11-23

Hadoop

George Vetticaden walks us through what’s coming with Apache Kafka in Hortonworks Data Platform 3.1:

A summary of these three new access patterns:

  • Stream Processing: Kafka Streams Support – With existing support for Spark Streaming, SAM/Storm, Kafka Streams addition provides developers with more options for their stream processing and microservice needs.

  • SQL Analytics: New Hive Kafka Storage Handler – View Kafka topics as tables and execute SQL via Hive with full SQL Support for joins, windowing, aggregations, etc.

  • OLAP Analytics: New Druid Kafka Indexing Service – View Kafka topics as cubes and perform OLAP style analytics on streaming events in Kafka using Druid.

Click through for high-level explanations of each.  George promises more detailed explanations as well.

Related Posts

MRAppMaster Errors Running MapReduce Jobs

I have a post looking at potential causes when PolyBase MapReduce jobs are unable to find the MRAppMaster class: Let me tell you about one of my least favorite things I like to see in PolyBase: Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster This error is not limited to PolyBase but is instead […]

Read More

Database-First or Kafka-First for Event Streaming

Gwen Shapiro takes us through a scenario where database-first writes for event streaming makes the most sense: Note that the DB does quite a lot for you: it enforces serializability, locks, your logical constraints, etc. If the DB is distributed (Vitesse, Cockroach, Spanner, Yugabyte), it does even more. If you were to go Kafka-first… well, […]

Read More

Categories

November 2018
MTWTFSS
« Oct Dec »
 1234
567891011
12131415161718
19202122232425
2627282930