Landon Robinson shows how you can use Spark to extract data from Cassandra and move it into HDFS:
Cassandra is a great open-source solution for accessing data at web scale, thanks in no small part to its low-latency performance. And if you’re a power user of Cassandra, there’s a high probability you’ll want to analyze the data it contains to create reports, apply machine learning, or just do some good old fashioned digging.
However, Cassandra can prove difficult to use as an analytical warehouse, especially if you’re using it to serve data in production around the clock. But one approach you can take is quite simple: copy the data to Hadoop (HDFS).
Read on to learn how.