Press "Enter" to skip to content

Why 200 Tasks for a Spark Execution?

The Hadoop in Real World team explains why you might see 200 tasks when running a Spark job:

It is quite common to see 200 tasks in one of your stages and more specifically at a stage which requires wide transformation. The reason for this is, wide transformations in Spark requires a shuffle. Operations like join, group by etc. are wide transform operations and they trigger a shuffle.

Read on to learn why 200, and whether 200 is the right number for you.