Using The Spark-HBase Connector

Anunay Tiwari shows how to use the Spark-HBase connector in HDInsight:

The Spark-HBase Connector provides an easy way to store and access data in HBase clusters from Spark jobs. HBase works well at very large data scales, so existing Spark customers should explore this storage option. Similarly, customers who already have HDInsight HBase clusters and want to access that data from Spark jobs do not need to move it to another storage medium. In both cases, the connector is extremely useful.

I’m not the biggest fan of HBase, but if it’s part of your environment, you should definitely look at this Spark connector.
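
To give a rough sense of what "access data from Spark jobs" looks like in practice, here is a minimal PySpark sketch. It assumes the open-source Spark-HBase Connector (SHC) is on the job's classpath and that its data source name and `catalog` / `newtable` options match what the linked post uses; the `Contacts` table, its column families, and the helper names (`build_catalog`, `read_hbase`, `write_hbase`) are hypothetical and only for illustration. The catalog is a JSON document that maps DataFrame columns onto HBase column families and qualifiers.

```python
import json

# Data source name used by the open-source SHC connector (assumed here;
# check the connector version deployed on your cluster).
SHC_FORMAT = "org.apache.spark.sql.execution.datasources.hbase"


def build_catalog(table, columns, namespace="default", rowkey_field="rowkey"):
    """Build an SHC-style catalog as a JSON string.

    `columns` maps a DataFrame column name to a tuple of
    (HBase column family, HBase qualifier, type name).
    """
    # The row key is declared as a column in the special "rowkey" family.
    cols = {rowkey_field: {"cf": "rowkey", "col": "key", "type": "string"}}
    for name, (family, qualifier, type_name) in columns.items():
        cols[name] = {"cf": family, "col": qualifier, "type": type_name}
    return json.dumps({
        "table": {"namespace": namespace, "name": table},
        "rowkey": "key",
        "columns": cols,
    })


def read_hbase(spark, catalog):
    """Load an HBase table as a DataFrame (needs Spark plus the connector jar)."""
    return spark.read.options(catalog=catalog).format(SHC_FORMAT).load()


def write_hbase(df, catalog, new_table_regions=5):
    """Write a DataFrame to HBase; `newtable` asks the connector to create
    the table with that many regions if it does not exist (assumed option)."""
    (df.write
       .options(catalog=catalog, newtable=str(new_table_regions))
       .format(SHC_FORMAT)
       .save())


# Hypothetical table layout, for illustration only.
catalog = build_catalog("Contacts", {
    "officeAddress": ("Office", "Address", "string"),
    "personalName": ("Personal", "Name", "string"),
})
```

With a live `SparkSession` named `spark`, `read_hbase(spark, catalog)` would return a DataFrame that can be queried with Spark SQL, without copying the data out of HBase first.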
