The Hadoop in Real World team compares two functions that operate on RDDs in Spark:
Let’s examine the aggregateByKey call below. The first parameter, 0, is the initial value and also indicates the type of the output.
The first _+_ function is the map-side combine, which folds each value into the running result within a partition. The second _+_ function is the reduce-side combine, which merges the per-partition results for each key. Both functions are the same in this case.
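The call described above could look roughly like the following sketch. The sample data, the object and method names, and the local two-partition setup are assumptions made for illustration and are not taken from the original post. The second method is an added variation in which the initial value is a pair, so the output type differs from the value type and the two functions are no longer the same.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// Hypothetical names and data, for illustration only.
object AggregateByKeySketch {

  // Initial value 0 makes the output type Int.
  // First _ + _ : map-side combine (accumulator + value, within a partition).
  // Second _ + _: reduce-side combine (accumulator + accumulator, across partitions).
  def sumByKey(pairs: RDD[(String, Int)]): RDD[(String, Int)] =
    pairs.aggregateByKey(0)(_ + _, _ + _)

  // Initial value (0, 0) makes the output type (Int, Int): a (sum, count) pair.
  // Here the two functions differ because their argument types differ.
  def sumAndCountByKey(pairs: RDD[(String, Int)]): RDD[(String, (Int, Int))] =
    pairs.aggregateByKey((0, 0))(
      (acc, v) => (acc._1 + v, acc._2 + 1),   // map side: fold one value into the pair
      (x, y) => (x._1 + y._1, x._2 + y._2)    // reduce side: merge two partial pairs
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("aggregateByKey-sketch")
      .master("local[2]") // assumed local run with two worker threads
      .getOrCreate()
    val sc = spark.sparkContext

    // Two partitions, so both the map-side and reduce-side functions can run.
    val pairs = sc.parallelize(
      Seq(("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)), numSlices = 2)

    sumByKey(pairs).collect().sorted.foreach(println)
    // (a,9)
    // (b,6)

    sumAndCountByKey(pairs).collect().sortBy(_._1).foreach(println)
    // (a,(9,3))
    // (b,(6,2))

    spark.stop()
  }
}
```

Because the initial value is folded in once per key in each partition, it should be a neutral element for the map-side function (0 for addition); otherwise the result would depend on how the data happens to be partitioned.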
This is a demo-driven post, so check it out.