This guide will be completely online and completely free. A book’s worth of content, containing exercises in Python and Scala to teach you Spark, at your fingertips. Again, free.
This section introduces the concept of data pipelines: how data is processed from one form into another. More broadly, a data pipeline is the generic term for how data is consumed from one location or form, altered or transformed along the way, and delivered to another location or form.
You’ll be introduced to Spark functions like join, filter, and aggregate to process data in a variety of forms. You’ll learn it all through interactive Spark exercises in Scala and Python.
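To make the idea of a pipeline concrete before any Spark is involved, here is a minimal sketch in plain Python. It is not Spark code and does not use the Spark API; it only mirrors the shape of a filter, a join, and an aggregate chained together. The data and every name in it (orders, customers, filter_orders, join_country, total_by_country, run_pipeline) are hypothetical and exist purely for illustration.

```python
from collections import defaultdict

# Hypothetical input data: two small "tables" as lists of dicts.
orders = [
    {"order_id": 1, "customer_id": "a", "amount": 20.0},
    {"order_id": 2, "customer_id": "b", "amount": 5.0},
    {"order_id": 3, "customer_id": "a", "amount": 15.0},
    {"order_id": 4, "customer_id": "c", "amount": 50.0},
    {"order_id": 5, "customer_id": "b", "amount": 30.0},
]
customers = [
    {"customer_id": "a", "country": "US"},
    {"customer_id": "b", "country": "DE"},
    {"customer_id": "c", "country": "US"},
]


def filter_orders(rows, min_amount):
    """Filter step: keep only orders at or above min_amount."""
    return [row for row in rows if row["amount"] >= min_amount]


def join_country(order_rows, customer_rows):
    """Join step: attach each order's country via customer_id (inner join)."""
    country_by_id = {c["customer_id"]: c["country"] for c in customer_rows}
    return [
        {**order, "country": country_by_id[order["customer_id"]]}
        for order in order_rows
        if order["customer_id"] in country_by_id
    ]


def total_by_country(rows):
    """Aggregate step: sum order amounts per country."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)


def run_pipeline(order_rows, customer_rows, min_amount=10.0):
    """Chain the three steps: raw orders in, per-country totals out."""
    filtered = filter_orders(order_rows, min_amount)
    joined = join_country(filtered, customer_rows)
    return total_by_country(joined)


result = run_pipeline(orders, customers)
print(result)
```

Each function takes data in one form and hands it on in another, which is the essence of a pipeline. The Spark exercises apply the same pattern with Spark's own join, filter, and aggregate functions rather than hand-written loops.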
This is very early in the process, but I’m excited.