Mikhail Stolpner gives us some tips on how to optimize Apache Spark clusters:
There are four major resources: memory, compute (CPU), disk, and network. Memory and compute are by far the most expensive. Understanding how much compute and memory your application requires is crucial for optimization.
You can configure how much memory and how many CPUs each executor gets. While the number of CPUs for each task is fixed, executor memory is shared among all the tasks running on a single executor.
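The arithmetic implied here can be sketched quickly. This is not Spark API code, just a back-of-the-envelope calculation; the variable names mirror the Spark configuration properties, and the values are illustrative assumptions:

```python
# Illustrative executor settings (assumed values, not defaults):
executor_memory_gb = 8   # spark.executor.memory
executor_cores = 4       # spark.executor.cores
task_cpus = 1            # spark.task.cpus

# CPUs per task are fixed by spark.task.cpus, so the number of tasks
# an executor runs concurrently is:
concurrent_tasks = executor_cores // task_cpus

# Executor memory is shared, so when the executor is fully loaded,
# each task gets roughly an even share of the heap:
memory_per_task_gb = executor_memory_gb / concurrent_tasks

print(concurrent_tasks, memory_per_task_gb)
```

With these numbers, each executor runs 4 tasks at once with roughly 2 GB available per task, which is why raising `spark.executor.cores` without raising `spark.executor.memory` squeezes each task's memory share.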
A few key parameters have the greatest impact on the resources Spark uses: spark.executor.memory, spark.executor.cores, spark.task.cpus, spark.executor.instances, and spark.qubole.max.executors.
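For reference, these properties can be set via `--conf` flags on `spark-submit`. The values below are placeholders, not recommendations, and `spark.qubole.max.executors` applies only on Qubole (a rough open-source analogue is `spark.dynamicAllocation.maxExecutors`):

```shell
spark-submit \
  --conf spark.executor.memory=8g \
  --conf spark.executor.cores=4 \
  --conf spark.task.cpus=1 \
  --conf spark.executor.instances=10 \
  --conf spark.qubole.max.executors=20 \
  my_job.py
```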
This article gives us some idea of the levers we have available as well as when to pull them. Though the article itself is vendor-specific, a lot of the advice is general.