Storm 1.0 Microbenchmarks

Kevin Feasel

2016-07-18

Hadoop

Roshan Naik and Sapin Amin have Storm 1.0 benchmarks on a small cluster:

Numbers suggest that Storm has come a long way in terms of performance but it still has room go faster. Here are some of the broad areas that should improve performance in future:

  • An effort to rewrite much of Storm’s Clojure code in Java is underway. Profiling has shown many hotspots in Clojure code.

  • Better scheduling of workers. Yahoo is experimenting with a Load Aware Scheduler for Storm to be smarter about the way in which topologies are scheduled on the cluster.

  • Based on microbenchmarking and discussions with other Storm developers there appears potential for streamlining the internal queueing for faster message transfer.

  • Operator coalescing (executing consecutive spouts/bolts in a single thread when possible) is another area that reduces intertask messaging and improve throughput.

Even with these potential improvements, Storm has come a long way—their benchmarks show around 5x throughput and a tiny fraction of the latency of Storm 0.9.1.

Related Posts

Working With The Databricks API Via Powershell

Gerhard Brueckl has a Powershell module for interacting with Databricks, either Azure or AWS: As most of our deployments use PowerShell I wrote some cmdlets to easily work with the Databricks API in my scripts. These included managing clusters (create, start, stop, …), deploying content/notebooks, adding secrets, executing jobs/notebooks, etc. After some time I ended […]

Read More

Kafka Connect Converters And Serialization

Robin Moffatt goes into great detail on Apache Kafka Connect converters and serialization techniques: Kafka Connect is modular in nature, providing a very powerful way of handling integration requirements. Some key components include: Connectors – the JAR files that define how to integrate with the data store itself Converters – handling serialization and deserialization of […]

Read More

Categories

July 2016
MTWTFSS
« Jun Aug »
 123
45678910
11121314151617
18192021222324
25262728293031