Ordering and Sorting Data in Spark

Landon Robinson shows how to sort data in Spark RDDs and DataFrames:

In the analysis section of Spark Starter Guide 4.6: How to Aggregate Data, we asked these questions: “Who is the youngest cat in the data? Who is the oldest?”

Let’s use ordering in Spark as an alternative method to answer those same questions, and achieve the same result. Specifically, let’s again find the youngest and oldest cats in the data.

Click through for plenty of examples.