
Category: Notebooks

Executing Parameterized Notebooks via Azure Data Studio

Julie Koesmarno takes us through three methods for executing parameterized notebooks in Azure Data Studio:

In the February 2021 release, Azure Data Studio (v1.26.1) added parameterized URI execution. See the “Preview of passing parameters through URI” section and the Parameterization of Notebooks in Azure Data Studio article on Microsoft Docs.

So, in total, there are three ways of executing a parameterized notebook (from another notebook). Check out the demo files here:

Click through for the notebooks.
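Azure Data Studio's notebook parameterization follows the Papermill convention (a cell tagged "parameters" whose values are overridden at run time), so executing one notebook from another can look like the minimal sketch below. The notebook paths and parameter names here are hypothetical.

```python
# A sketch, assuming papermill is installed (pip install papermill) and that
# input-notebook.ipynb contains a cell tagged "parameters".
# Paths and parameter names are hypothetical placeholders.
import papermill as pm

pm.execute_notebook(
    "input-notebook.ipynb",   # source notebook with the tagged parameters cell
    "output-notebook.ipynb",  # executed copy with the injected parameter values
    parameters={"server_name": "localhost", "top_n": 10},  # overrides the defaults
)
```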


Survival Analysis Notebooks

Dan Morris, et al., walk us through a survival analysis scenario:

In contrast to other methods that may seem similar on the surface, such as linear regression, survival analysis takes censoring into account. Censoring occurs when the start and/or end of a measured value is unknown. For example, suppose our historical data includes records for the two customers below. In the case of customer A, we know the precise duration of the subscription because the customer churned in December 2020. For customer B, we know that the contract started four months ago and is still active, but we do not know how much longer they will be a customer. This is an example of right censoring because we do not yet know the end date for the measured value. Right censoring is what we most commonly see with this form of analysis.

Click through for an intro as well as a half-dozen notebooks.
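To make the censoring idea concrete, here is a minimal sketch using the lifelines library (one common Python choice; the linked notebooks may use something else). Customer A's 14-month duration and the column names are hypothetical; the key point is the event flag that marks customer B as right-censored.

```python
# A minimal sketch of fitting a Kaplan-Meier estimator to right-censored data.
# Durations and column names are hypothetical.
import pandas as pd
from lifelines import KaplanMeierFitter

df = pd.DataFrame({
    "months_subscribed": [14, 4],  # customer A churned at 14 months; B is at 4
    "churned": [1, 0],             # 0 = still active, i.e. right-censored
})

kmf = KaplanMeierFitter()
kmf.fit(durations=df["months_subscribed"], event_observed=df["churned"])
print(kmf.survival_function_)  # estimated probability of remaining a customer past time t
```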


Using Spark Pools in Azure Synapse Analytics

Rahul Mehta shows how to create and use an Apache Spark pool in Azure Synapse Analytics:

In the last part of the Azure Synapse Analytics article series, we learned how to create a dedicated SQL pool. Azure Synapse supports three different types of pools – on-demand SQL pool, dedicated SQL pool, and Spark pool. Spark provides an in-memory distributed processing framework for big data analytics, which suits many big data analytics use cases. Azure Synapse Analytics provides mechanisms to use the on-demand SQL pool to query data as a service, the dedicated SQL pool for data warehousing using a distributed data processing engine, and the Spark pool for analytics using an in-memory big data processing engine. This article shows how to create a Spark pool in Azure Synapse Analytics and then how to process data using it.

Click through for a demo on setup and a sample notebook to get started.
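For a flavor of what processing data on the pool looks like, here is a minimal PySpark sketch of the kind you could run in a Synapse Spark pool notebook, where the `spark` session is predefined. The storage account, container, and file path are hypothetical placeholders.

```python
# A minimal sketch for a Synapse Spark pool notebook; `spark` is predefined.
# The abfss path below is a hypothetical placeholder.
df = spark.read.csv(
    "abfss://container@account.dfs.core.windows.net/data/sales.csv",
    header=True,
    inferSchema=True,
)

# A simple distributed aggregation to confirm the pool is processing the data
df.groupBy("region").count().show()
```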


Spark Streaming in a Databricks Notebook

Tomaz Kastrun shows off Spark Streaming in a Databricks notebook:

Spark Streaming is a process that can analyze not only batches of data but also streams of data in near real-time. It powers interactive and analytical applications across both hot and cold data (streaming data and historical data). Spark Streaming is a fault-tolerant system: thanks to the lineage of operations, Spark always remembers where you stopped, and in case of a worker failure, another worker can recreate all of the data transformations from the partitioned RDD (assuming that all of the RDD transformations are deterministic).

Click through for the demo.
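For a feel of the API, here is a minimal Structured Streaming sketch that runs in a Databricks notebook (where `spark` is predefined), using the built-in `rate` source rather than whatever source the demo itself uses.

```python
# A minimal Structured Streaming sketch using the built-in rate source,
# which emits a (timestamp, value) row on a schedule.
stream_df = (
    spark.readStream
         .format("rate")              # synthetic streaming source for demos
         .option("rowsPerSecond", 5)
         .load()
)

query = (
    stream_df.writeStream
             .format("memory")        # land results in an in-memory table
             .queryName("rate_demo")
             .outputMode("append")
             .start()
)

# After letting it run for a few seconds:
# spark.sql("SELECT * FROM rate_demo").show()
# query.stop()
```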


Linking between Notebooks in Azure Data Studio

Julie Koesmarno shows us the rules of linking notebooks in Azure Data Studio:

When writing a notebook, it can be very handy to be able to refer to a specific part of a notebook and allow readers to jump to that part, i.e., linking or anchoring. Using this technique, you can also create an index, a table of contents, or cross-references to parts of other notebooks. Check out my demo notebook for this linking topic, from the MsSQLGirl GitHub repo.

Read on for those rules.
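As an illustration of the general technique (not necessarily the exact rules Julie documents for Azure Data Studio), Markdown anchoring in a notebook tends to look like the sketch below. The heading text, file name, and the lowercase-hyphenated anchor convention are assumptions; exact anchor rules vary by renderer.

```markdown
## Set Up the Environment

Later in the same notebook, link back to the heading above:
[Jump to setup](#set-up-the-environment)

Cross-reference a section in a different notebook:
[See the cleanup steps](./other-notebook.ipynb#cleanup)
```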


Using Notebooks to Load Data into the Databricks File System

Tomaz Kastrun is putting together an Advent of Azure Databricks:

Yesterday we started working toward data import and how to use the drop zone to import data to DBFS. We also created our first notebook, and this is where I would like to start today: with a light introduction to notebooks.

Read on for a depiction of notebooks, as well as an example which loads data into the Databricks File System (DBFS).
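For reference, here is a minimal sketch of reading a file that has landed in DBFS from a Databricks notebook, where `dbutils` and `spark` are predefined; the file path is a hypothetical placeholder for something uploaded via the drop zone.

```python
# A minimal sketch for a Databricks notebook; `dbutils` and `spark` are predefined.
# The path below is a hypothetical placeholder.
files = dbutils.fs.ls("/FileStore/tables")  # list files uploaded to DBFS
print(files)

df = spark.read.csv(
    "dbfs:/FileStore/tables/iris.csv",
    header=True,
    inferSchema=True,
)
df.show(5)
```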


Kusto Queries in Azure Data Studio Notebooks

Julie Koesmarno shows off the Kusto Query Language magic in Azure Data Studio notebooks:

To do this, you’ll need to ensure that you have Kqlmagic installed. See Install and set up Kqlmagic in a notebook. Then in a notebook, you can load Kqlmagic with %reload_ext Kqlmagic in a code cell.

The next step, in a new code cell, is to start connecting to a Log Analytics workspace. There are three ways to do so (roughly, as I'm also still learning in this space); a sketch of the first option follows the list:

1. Using Azure Active Directory device login authentication
2. Using Az CLI login
3. Using a client secret
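Here is a rough sketch of option 1 in a notebook code cell. The Kqlmagic connection-string syntax below is my reading of the Kqlmagic docs, so treat it as an assumption and verify it against the version you install; the workspace ID and query are placeholders.

```python
# A sketch of connecting Kqlmagic to Log Analytics with device login;
# connection-string syntax is an assumption, so check the Kqlmagic docs.
%reload_ext Kqlmagic

# Device-login authentication: Kqlmagic prints a code to enter at
# https://microsoft.com/devicelogin. '<workspace-id>' is a placeholder.
%kql loganalytics://code;workspace='<workspace-id>'

# Then query the workspace (Heartbeat is a standard Log Analytics table):
%kql Heartbeat | summarize count() by Computer | take 5
```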

Read on for one example using Azure AD authentication.
