Optimizing HBase In HDInsight

Ashish Thapliyal links to a 30-minute presentation on HBase optimization:

This session was presented by Nitin Verma (Sr. Software Engineer) and Pravin Mittal (Principal Engineering Manager) @ HBaseCon 2016. The session goes deeper into success story of enabling a big internal customer on HDInsight HBase.

HBase design is a totally different mindset from relational design, so you have to unlearn a lot of habits when moving over to it.

Related Posts

Page Ranking With Kafka Streams

Hunter Kelly walks through a page ranking algorithm: Once you have the adjacency matrix, you perform some straightforward matrix calculations to calculate a vector of Hub scores and a vector of Authority scores as follows: Sum across the columns and normalize, this becomes your Hub vector Multiply the Hub vector element-wise across the adjacency matrix […]

Read More

Stateful Processing In Spark Streaming

Bill Chambers and Jules Damji look at a couple of stateful scenarios within Spark Streaming: No streaming events are free of duplicate entries. Dropping duplicate entries in record-at-a-time systems is imperative—and often a cumbersome operation for a couple of reasons. First, you’ll have to process small or large batches of records at time to discard […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031