A Spark Job Definition (SJD) is effectively a way to run a packaged Spark application, Fabric's version of executing a spark-submit job. You define:
- what code should run (the entry point),
- which files or resources should be shipped with it,
- and which command-line arguments should control its behavior.
Unlike a notebook, there is no interactive editor or cell output. That is arguably not a missing feature but the whole point: an SJD is not meant for exploration; it is meant to run a Spark application deterministically.
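To make the idea concrete, here is a minimal sketch of what such an entry-point script could look like in PySpark. The file name, the --input-path/--output-path arguments, and the "category" column are purely illustrative assumptions, not anything Fabric requires; the point is simply that the job's behavior is driven by command-line arguments rather than interactive cells.

```python
# main.py -- a hypothetical entry-point script for a Spark Job Definition.
# The argument names below are illustrative; pass whatever your job needs
# via the SJD's command-line arguments setting.
import argparse

from pyspark.sql import SparkSession


def main() -> None:
    # Command-line arguments supplied in the SJD configuration control the run.
    parser = argparse.ArgumentParser()
    parser.add_argument("--input-path", required=True)
    parser.add_argument("--output-path", required=True)
    args = parser.parse_args()

    spark = SparkSession.builder.appName("sample-sjd").getOrCreate()

    # Deterministic, non-interactive work: read, transform, write.
    df = spark.read.parquet(args.input_path)
    df.groupBy("category").count().write.mode("overwrite").parquet(args.output_path)

    spark.stop()


if __name__ == "__main__":
    main()
```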
With that concept in mind, click through for the process.