Press "Enter" to skip to content

Category: Notebooks

Using Jupyter Notebooks in SQL Agent Jobs

Rob Sewell shows us how to run an Azure Data Studio notebook as a SQL Agent job:

Azure Data Studio is a great tool for connecting with your data platform, whether it is in Azure or on your own hardware. Jupyter Notebooks are fantastic: you can have words, pictures, code, and code results all saved in one document.

I have created a repository on my GitHub, https://beard.media/Notebooks, where I have stored a number of Jupyter notebooks, both for Azure Data Studio and the new .NET Interactive notebooks.

Another thing that you can do with notebooks is run them as Agent Jobs and save the results of the run.

Read on to learn how.
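To make the idea concrete, here is a minimal sketch of one way to do this in PowerShell: the SqlServer module's Invoke-SqlNotebook cmdlet executes a notebook and saves the run's results, and dbatools can wrap the same call in an Agent job step. This is an illustration rather than Rob's exact code, and the instance name and file paths are made up.

```powershell
# Minimal sketch (not Rob's exact code): execute a notebook and save the
# results, then wrap the call in a SQL Agent PowerShell job step.
# 'SQL01' and the file paths below are hypothetical.

# Run the notebook once; the output file is the executed copy with results
Invoke-SqlNotebook -ServerInstance 'SQL01' -Database 'master' `
    -InputFile 'C:\Notebooks\DailyChecks.ipynb' `
    -OutputFile "C:\Notebooks\Output\DailyChecks_$(Get-Date -Format yyyyMMdd).ipynb"

# Create an Agent job whose step runs that same command on a schedule
New-DbaAgentJob -SqlInstance 'SQL01' -Job 'Run Daily Checks Notebook'
New-DbaAgentJobStep -SqlInstance 'SQL01' -Job 'Run Daily Checks Notebook' `
    -StepName 'Execute notebook' -Subsystem PowerShell `
    -Command "Invoke-SqlNotebook -ServerInstance 'SQL01' -Database 'master' -InputFile 'C:\Notebooks\DailyChecks.ipynb' -OutputFile 'C:\Notebooks\Output\DailyChecks.ipynb'"
```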


Working with Spark.Net on Azure Synapse Analytics

Paul Andrew takes a look at Spark.NET (or Spark.Net or dotnet-spark or however I’m calling it this time):

The main reason I wanted access to Synapse is to play around with Spark.Net via the Synapse workspace Notebooks. Currently, if deploying Synapse via the public Azure portal, you only get the option to create a SQL compute pool, formerly known as an Azure SQLDW. While this is good, it gives us none of the exciting things that we were shown about Synapse back in November last year during the Microsoft Ignite conference.

To get the good stuff in Azure Synapse Analytics, you need access to the full developer UI and Synapse Workspace.

Click through to learn more about the experience.


Azure Data Studio March 2020 Release

Alan Yu announces the March 2020 release of Azure Data Studio:

Now you can add visualizations using a T-SQL query. In addition, as the GIF illustrates, you can also customize your visualization, whether it is a scatter plot or a time series graph.

You can also copy your visualization or save the image so that you can quickly add it to an email or report for other team members.

We will continue to bring improvements to charting over the next few months.

They’ve put a lot of time and effort into notebooks. They’re still missing some of the quality-of-life improvements I want to see before moving to them full-time, but they’re consistently getting better.


Using Pester with .NET Powershell Notebooks

Rob Sewell has Powershell in notebooks, so of course Rob is going to write tests:

Using Pester to validate that an environment is as you expect is a good resource for incident resolution, potentially enabling you to quickly establish an area to concentrate on for the issue. However, if you try to run Pester in a .NET Notebook, you will receive an error.

Click through for the reason why this error appears and a workaround until it’s fixed for real.
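For context on what such environment checks look like, here is a generic Pester sketch (standard Describe/It/Should syntax, not the code from Rob's post; the service names are assumptions for a default SQL Server instance on Windows):

```powershell
# A generic Pester environment check (not the code from Rob's post).
# Service names assume a default SQL Server instance on Windows.
Describe "SQL Server environment" {
    It "has the database engine service running" {
        (Get-Service -Name 'MSSQLSERVER').Status | Should -Be 'Running'
    }
    It "has the SQL Agent service running" {
        (Get-Service -Name 'SQLSERVERAGENT').Status | Should -Be 'Running'
    }
}
```

Running a file of tests like this is then just a call to Invoke-Pester.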


Installing .NET Notebooks for Powershell

Max Trinidad shows us how to install .NET Interactive on Linux:

In Windows, it just takes a few steps to set it up. For Linux, it takes a few extra steps, but it is still quick enough to get you started.

For Windows, follow the instructions found on the .NET Interactive page on GitHub.

For Linux, on Ubuntu 18.04, follow the blog post “Ubuntu 18.04 Package Manager – Install .NET Core”.

Basically, in either operating system, you install the following (sketched in the commands below):

the .NET Core SDK
the ASP.NET Core runtime
the .NET Core runtime
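Here is roughly what those steps look like on Ubuntu 18.04, ending with the kernel registration that makes the languages available to Jupyter. The apt package names assume the .NET Core 3.1 era of this post and presume the Microsoft package repository is already registered, so verify them against the linked instructions.

```powershell
# Sketch of the Ubuntu 18.04 steps (works from bash or pwsh); package names
# assume .NET Core 3.1 and the Microsoft apt repository already registered.
sudo apt-get update
sudo apt-get install -y dotnet-sdk-3.1          # .NET Core SDK
sudo apt-get install -y aspnetcore-runtime-3.1  # ASP.NET Core runtime
sudo apt-get install -y dotnet-runtime-3.1      # .NET Core runtime

# Install the .NET Interactive global tool and register its Jupyter kernels
dotnet tool install -g Microsoft.dotnet-interactive
dotnet interactive jupyter install
```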

Click through for the step-by-step instructions. Once you have it done, you get not only Powershell but also F# and C#.


.NET and Powershell 7 Notebooks

Rob Sewell forwards on some exciting news:

A notebook experience for PowerShell 7? That sounds amazing. This will enable a true cross-platform PowerShell Notebook experience, which is lacking from the Python version, as it uses Windows PowerShell on Windows and PowerShell Core on other OSes.

The first thing I asked was: will this come to Azure Data Studio? I got an immediate response from Sydney Smith, PowerShell Program Manager, saying it is on the roadmap.

Two notes of importance. First, these are kernels for Jupyter Notebooks and not Azure Data Studio or VS Code (yet). Second, Rob buried the lede on the most important language in there: F#. You can also read the full announcement from Maria Naggaga, to which Rob linked.
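If you go the Jupyter route, a quick way to confirm the install worked is to list the registered kernels; the .net-* kernel names below are what .NET Interactive registered at the time, so treat them as an assumption:

```powershell
# List Jupyter kernels after 'dotnet interactive jupyter install'
jupyter kernelspec list
# Expected to include entries along these lines (names are assumptions):
#   .net-csharp
#   .net-fsharp
#   .net-powershell
```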


Updating the Powershell Kernel in Azure Data Studio Notebooks

Bob Pusateri has a two-parter on Powershell notebooks. First up is the problem:

PowerShell Notebooks are a great new feature in Azure Data Studio, first becoming available in the November 2019 release. Like SQL notebooks, PowerShell notebooks are based on the Jupyter Notebook format: interactive documents containing text and executable code blocks.

Having some working PowerShell code that I wanted to share along with explanations and examples, I created a PowerShell Notebook. The only problem was my functions would never initialize. Actually, they would never stop initializing: I would run the cell they were defined in, and it would just keep running forever.

And then Bob has the solution:

It turns out I did not have the latest version of the PowerShell Kernel running on my machine. The latest version is currently 0.1.3, and I had 0.1.2. Upgrading appears to have solved this issue for me – yay!

This solution also raises the issue that there is no notification from Azure Data Studio that a PowerShell Kernel exists or is in need of updating. I (and probably others) will just believe that as long as Azure Data Studio is up to date, we’re good to go. So how does one update their PowerShell kernel? Well, it’s simple, but not intuitive.

Read on to see how.
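As a rough sketch of the fix, the PowerShell kernel is distributed as a Python package, so the upgrade presumably runs through pip against the Python environment your notebooks use. The package name powershell_kernel is my assumption here; confirm the exact steps against Bob's post.

```powershell
# Sketch of the upgrade (package name is an assumption; see Bob's post).
# Run pip from the Python environment that your notebooks actually use.
python -m pip install --upgrade powershell_kernel

# Confirm the version you now have
python -m pip show powershell_kernel
```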


Databricks Automated Deployment and Testing

Li Yu, et al, explain how to use Databricks notebooks and MLflow to automate deployment and testing of Spark solutions:

Today many data science (DS) organizations are accelerating the agile analytics development process using Databricks notebooks. Fully leveraging the distributed computing power of Apache Spark™, these organizations are able to interact easily with data at multi-terabyte scale, from exploration to fast prototyping and all the way to productionizing sophisticated machine learning (ML) models. As fast iteration is achieved at high velocity, it has become increasingly evident that it is non-trivial to manage the DS life cycle for efficiency, reproducibility, and high quality. The challenge multiplies in large enterprises where data volume grows exponentially, the expectation of ROI is high on getting business value from data, and cross-functional collaborations are common.

In this blog, we introduce a joint work with Iterable that hardens the DS process with best practices from software development. This approach automates the building, testing, and deployment of the DS workflow from inside Databricks notebooks and integrates fully with MLflow and the Databricks CLI. It enables proper version control and comprehensive logging of important metrics, including functional and integration tests, model performance metrics, and data lineage. All of this is achieved without the need to maintain a separate build server.

Read on to see how.
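To make the moving parts a bit more concrete, here is a stand-in sketch of deploy-and-run automation with the (legacy) Databricks CLI from a shell; the blog's actual approach drives this from inside notebooks with MLflow, and the workspace path and job ID below are made up.

```powershell
# Stand-in sketch using the legacy Databricks CLI (the post's approach runs
# from inside notebooks instead). Path and job ID are hypothetical.

# Push notebook source from version control into the workspace
databricks workspace import_dir ./notebooks /Shared/my-project --overwrite

# Kick off an existing job that runs the tests against the new code
databricks jobs run-now --job-id 42
```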
