Running PySpark In Visual Studio Code

Jenny Jiang shows how to run PySpark on HDInsight in VSCode:

We are excited to introduce the integration of HDInsight PySpark into Visual Studio Code (VSCode), which allows developers to easily edit Python scripts and submit PySpark statements to HDInsight clusters. For PySpark developers who value productivity of Python language, VSCode HDInsight Tools offer you a quick Python editor with simple getting started experiences, and enable you to submit PySpark statements to HDInsight clusters with interactive responses. This interactivity brings the best properties of Python and Spark to developers and empowers you to gain faster insights.

Click through to see how it’s done.

Related Posts

Joining Multiple Types Of Data With KSQL

Robin Moffatt has an example where he enriches streaming CSV data with information stored in MySQL: This is a continuous query that executes in the background until explicitly terminated by the user. In effect, these are stream processing applications, and all we need to create them is SQL! Here all we’ve done is an enrichment (joining two […]

Read More

Multi-Class Text Classification In Python

Susan Li has a series on multi-class text classification in Python.  First up is analysis with PySpark: Our task is to classify San Francisco Crime Description into 33 pre-defined categories. The data can be downloaded from Kaggle. Given a new crime description comes in, we want to assign it to one of 33 categories. The classifier […]

Read More


November 2017
« Oct Dec »