The folks at Beginner’s Hadoop take us through resource allocation in Spark applications:
Tiny executors essentially means one executor per core. The following table depicts the values of our spark-config params with this approach:
Analysis: With only one executor per core, as discussed above, we won't be able to take advantage of running multiple tasks in the same JVM. Also, shared/cached variables like broadcast variables and accumulators will be replicated once per core on each node, i.e. 16 times. Moreover, we are not leaving enough memory overhead for Hadoop/YARN daemon processes, and we are not accounting for the ApplicationMaster. NOT GOOD!
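For concreteness, here is a minimal sketch (in Scala) of what such a "tiny executor" layout might look like. The 16 cores per node comes from the excerpt above; the 64 GB of RAM per node and the 10-node cluster size are assumptions for illustration only, not figures from the linked post.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative sketch of a "tiny executor" configuration.
// Assumed cluster: 10 nodes, 16 cores and 64 GB RAM each (hypothetical values).
val spark = SparkSession.builder()
  .appName("tiny-executors-example")
  // One core per executor, so every core on a node gets its own executor/JVM.
  .config("spark.executor.cores", "1")
  // 16 executors per node means each gets roughly 64 GB / 16 = 4 GB.
  .config("spark.executor.memory", "4g")
  // Total executors = 16 cores x 10 nodes.
  .config("spark.executor.instances", "160")
  .getOrCreate()
```

With this layout every broadcast variable is copied into 16 separate JVMs per node, which is exactly the duplication the analysis warns about.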
Read on for the full analysis.