Memory Management in Spark

Rishitesh Mishra has started a new series on slow or failing Spark applications and starts with the big reason:

If we were to got all Spark developers to vote, out of memory (OOM) conditions would surely be the number one problem everyone has faced. This comes as no big surprise as Spark’s architecture is memory-centric. Some of the most common causes of OOM are:
* Incorrect usage of Spark
* High concurrency
* Inefficient queries
* Incorrect configuration

Definitely worth the read.

Related Posts

Flink’s State Processor API

Seth Wiesman and Fabian Hueske show off Apache Flink’s State Processor API: The State Processor API that comes with Flink 1.9 is a true game-changer in how you can work with application state! In a nutshell, it extends the DataSet API with Input and OutputFormats to read and write savepoint or checkpoint data. Due to […]

Read More

Derivative Event Sourcing

Anna McDonald explains the concept of derivative event sourcing: If you happen to be the proud owner of a single order service, then you are all set to begin. But what if you have more than one order service? Something that tends to happen at companies that have been around for more than a sprint […]

Read More

Categories

April 2019
MTWTFSS
« Mar May »
1234567
891011121314
15161718192021
22232425262728
2930