Spark and dotnet in a Single Container

Ed Elliott shows how you can combine Spark and .NET Core in a single Docker container:

This is quite new syntax in Docker and you need at least Docker 17.05 (client and daemon). After the image's “FROM blah” you can specify a name, “core” in this case, and then later copy from the first image to the second using “--from=” on the “COPY” command.
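As a rough illustration of the multi-stage pattern described here, a minimal sketch might look like the following. The image tags and the `/usr/share/dotnet` path are assumptions chosen for illustration and are not taken from Ed's Dockerfile; only the stage name “core” comes from the quote.

```dockerfile
# Stage 1: a .NET Core SDK image, named "core" so later stages can refer to it.
# (Image tag is an assumption, not necessarily the one Ed uses.)
FROM mcr.microsoft.com/dotnet/core/sdk:2.2 AS core

# Stage 2: a base image with Java, which Spark needs at runtime.
# (Also an assumption; any JDK 8 base image would serve for the sketch.)
FROM openjdk:8-jdk

# Copy the .NET installation out of the "core" stage into this image.
# /usr/share/dotnet is assumed to be where the SDK image installs dotnet.
COPY --from=core /usr/share/dotnet /usr/share/dotnet

# Put the dotnet executable on the PATH of the final image.
RUN ln -s /usr/share/dotnet/dotnet /usr/bin/dotnet
```

Only the last stage ends up in the built image, so the final result carries the copied .NET files without inheriting the rest of the SDK image.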

In this Dockerfile I have added Spark 2.4.3 and the default environment variables we need to get Spark running. If you grab this Dockerfile and run “docker build -t dotnet-spark .” you should get an image you can then run, which includes the dependencies for dotnet as well as Spark.
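For a sense of what adding Spark 2.4.3 and its environment variables could look like, here is a hedged sketch of that part on its own. The base image, the Hadoop 2.7 build, the `/opt/spark` location, and the exact variables are all assumptions for illustration; Ed's post has the real Dockerfile.

```dockerfile
# Base image with Java 8 (an assumption; Spark 2.4.x runs on Java 8).
FROM openjdk:8-jdk

# Version and install location, set as environment variables.
# The Hadoop build and /opt/spark path are illustrative choices.
ENV SPARK_VERSION=2.4.3 \
    HADOOP_VERSION=2.7 \
    SPARK_HOME=/opt/spark

# Make spark-submit and friends available on the PATH.
ENV PATH="${SPARK_HOME}/bin:${PATH}"

# Download the Spark release from the Apache archive and unpack it to SPARK_HOME.
RUN curl -fsSL "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}.tgz" \
      | tar -xz -C /opt \
 && mv "/opt/spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}" "${SPARK_HOME}"
```

Building this with the command from the quote, “docker build -t dotnet-spark .”, and then running “docker run -it dotnet-spark bash” would drop you into a shell in the resulting image.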

Ed includes all of the scripts needed to test this out, too.
