Press "Enter" to skip to content

Category: Notebooks

A New Notebook Tool: Polynote

Jeremy Smith, et al., announce a new product:

We are pleased to announce the open-source launch of Polynote: a new, polyglot notebook with first-class Scala support, Apache Spark integration, multi-language interoperability including Scala, Python, and SQL, as-you-type autocomplete, and more.

Polynote provides data scientists and machine learning researchers with a notebook environment that allows them the freedom to seamlessly integrate our JVM-based ML platform — which makes heavy use of Scala — with the Python ecosystem’s popular machine learning and visualization libraries. It has seen substantial adoption among Netflix’s personalization and recommendation teams, and it is now being integrated with the rest of our research platform.

There are some nice pieces to it, especially around language interop.


Financial Time Series Analysis in Databricks

Ricardo Portilla shares a demo of financial time series analysis in Databricks:

We’ve shown a merging technique above, so now let’s focus on a standard aggregation, namely Volume-Weighted Average Price (VWAP), which is the average price weighted by volume. This metric is an indicator of the trend and value of the security throughout the day.  The vwap function within our wrapper class (in the attached notebook) shows where the VWAP falls above or below the trading price of the security. In particular, we can now identify the window during which the VWAP (in orange) falls below the trade price, showing that the stock is overbought.

Click through for the article, as well as a notebook you can try out.
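
If you want a feel for the technique before opening the notebook, here is a minimal sketch of a VWAP calculation in PySpark. The column names and sample rows are my own illustration, not taken from the attached notebook:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("vwap-sketch").getOrCreate()

# Toy trade data; in the article this comes from real tick data.
trades = spark.createDataFrame(
    [("AAPL", "2019-10-01 09:30:01", 224.59, 100),
     ("AAPL", "2019-10-01 09:30:05", 224.61, 250),
     ("AAPL", "2019-10-01 09:31:02", 224.48, 400)],
    ["symbol", "event_ts", "price", "volume"],
)

# VWAP = sum(price * volume) / sum(volume), computed here per symbol
# per one-minute window.
vwap = (
    trades
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .groupBy("symbol", F.window("event_ts", "1 minute"))
    .agg((F.sum(F.col("price") * F.col("volume")) / F.sum("volume")).alias("vwap"))
)

vwap.show(truncate=False)
```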


Automating Azure Data Studio Notebooks

Aaron Nelson has two separate ways of scheduling Azure Data Studio notebooks for us:

There are two new options for automating your SQL Notebooks with your SQL Servers. Earlier this month, the Insiders build of Azure Data Studio received the ability to add SQL Notebooks in SQL Agent. This past Friday (September 20th, 2019) a new version of the SqlServer PowerShell module was posted to the Gallery, with a new Invoke-SqlNotebook cmdlet.

Read on for demos of both.


Creating Big Data Clusters with Azure Data Studio

Niels Berglund takes us through the creation of a Big Data Cluster by using Azure Data Studio to generate a notebook:

I wrote a blog post back in November 2018 about how to install and deploy SQL Server 2019 Big Data Cluster on Azure Kubernetes Service. Back then, SQL Server 2019 Big Data Cluster was in private preview (CTP 2.1, I believe), and you had to sign up to get access to the “bits”. Well, you did not really get any “bits”; what you did get was access to Python deployment scripts.

Now, September 2019, the BDC is in public preview (you do not have to sign up), and it has reached Release Candidate (RC) status, RC 1. The install method has changed, or rather, in addition to installing via deployment scripts, you can now also install using Azure Data Studio deployment notebooks, and that is what this blog post is about.

Having gone through this myself, I can say there’s quite a bit of reading involved in the setup, but they make the process pretty smooth. This also shows off one of the key benefits of notebooks: documentation and code living together.


Develop BDC PySpark Jobs in Visual Studio Code

Jenny Jiang announces a new capability in Visual Studio Code:

With the Visual Studio Code extension, you can enjoy native Python programming experiences such as linting, debugging support, language service, and so on. You can run the current line, run selected lines of code, or run all for your PY file. You can import and export a .ipynb notebook and perform a notebook-like query including Run Cell, Run Above, or Run Below. You can also enjoy a notebook-like interactive experience that includes your source code and markdown comments along with the running results and output. You can remove the unneeded sections, enter comments, or type additional code in the interactive results window. Moreover, you can visualize your results in a graphic format through matplotlib, just like in a Jupyter Notebook. The integration with SQL Server 2019 Big Data Clusters empowers you to quickly submit a PySpark batch job to the big data cluster and monitor job progress.

This is rather useful for developers, though I greatly prefer the Azure Data Studio notebook interface.
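
For context, the kind of job you would submit this way is just an ordinary PySpark script. Here is a minimal sketch; the file path and column name are made up for illustration:

```python
from pyspark.sql import SparkSession

# A small PySpark batch job of the sort you might submit from VS Code
# to a Big Data Cluster. The path and column name are illustrative only.
spark = SparkSession.builder.appName("bdc-batch-demo").getOrCreate()

df = spark.read.csv("/tmp/clickstream.csv", header=True, inferSchema=True)

# A simple aggregation: top ten pages by hit count.
(df.groupBy("page")
   .count()
   .orderBy("count", ascending=False)
   .show(10))

spark.stop()
```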


IDEs and Cloudera Data Science Workbench

Bethann Noble walks us through some of the options available for IDEs operating against Cloudera Data Science Workbench:

Other coders on the team including ML and DevOps engineers often work in local IDEs such as PyCharm.  These applications run locally on the user’s computer and connect to CDSW remotely over SSH for code completion and execution.  They must be configured per user and are not associated at the project level in CDSW. The documentation provides sample instructions for the Professional Edition of PyCharm v2019.1.

They support both browser-based and local IDEs.


The SQL Notebook Experience, Featuring Powershell

Rob Sewell takes a break from book-writing and talks about using Powershell in SQL Notebooks:

Yes, it’s funny but also it carries a serious warning. Without understanding what it is doing, please don’t enable PowerShell to be run in a SQL Notebook that someone sent you in an email or you find on a GitHub. In the same way as you don’t open the Word document attachment which will get a thousand million trillion pounddollars into your bank account or run code you copy from the internet on production without understanding what it does, this could be a very dangerous thing to do.

With that warning out of the way, there are loads of really useful and fantastic use cases for this. SQL Notebooks make great run-books or incident response recorders and PowerShell is an obvious tool for this. (If only we could save the PowerShell output in a SQL Notebook, this would be even better)

“It’s a bit hacky” is a generous statement, but it’s really cool that Rob figured out a way to do this. There is a Powershell kernel for Jupyter, but I’ve not had the best experience adding new kernels to Azure Data Studio (at least not F#’s kernel, which I tried).
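
For a sense of the shape of the hack: one way to get PowerShell output into a notebook is to shell out from a Python code cell. This is my own sketch of the general idea, not necessarily the mechanism Rob uses:

```python
import subprocess

# Run a PowerShell command from a Python notebook cell and capture the
# text output. 'pwsh' is PowerShell Core; on Windows with Windows
# PowerShell you would call 'powershell' instead.
result = subprocess.run(
    ["pwsh", "-NoProfile", "-Command", "Get-Process | Select-Object -First 5"],
    capture_output=True,
    text=True,
)

print(result.stdout)
```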


Notebooks in Azure Databricks

Brad Llewellyn takes us through Azure Databricks notebooks:

Azure Databricks Notebooks support four programming languages: Python, Scala, SQL, and R. However, selecting a language in this drop-down doesn’t limit us to only using that language. Instead, it makes that the default language of the notebook. Every code block in the notebook is run independently, and we can manually specify the language for each code block.

Before we get to the actual coding, we need to attach our new notebook to an existing cluster. As we said, notebooks are nothing more than an interface for interactive code. The processing is all done on the underlying cluster.

Read on to learn how heavily Databricks leans on the notebook metaphor in how you interact with it.
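
As a quick illustration of the per-cell language idea, here is my own sketch. Databricks notebooks predefine the spark session, and the view name below is made up:

```python
# Python cell in a notebook whose default language is Python.
# 'spark' is provided by Databricks; no session builder is needed.
df = spark.range(1, 6).toDF("n")

# Temp views are a common way to hand data to cells in other languages.
df.createOrReplaceTempView("numbers")

# A separate cell can override the default language with a magic command:
# %sql
# SELECT n, n * n AS n_squared FROM numbers

# Or switch to Scala the same way:
# %scala
# display(spark.table("numbers"))
```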


July Azure Data Studio Update

Alan Yu announces some great things in the July update to Azure Data Studio:

One of the most requested features from customers around the world is enhanced execution plan support. Although we have basic query plan support in Azure Data Studio, it’s not as robust as similar functionality built into SQL Server Management Studio and what other vendors provide.

Today, we’re pleased to announce that one of our valued Microsoft partners, SentryOne, is shipping their SentryOne Plan Explorer extension for Azure Data Studio. This is a free extension that provides enhanced plan diagrams for queries that are run in Azure Data Studio, with optimized layout algorithms and intuitive color-coding to help quickly identify the most expensive operators affecting query performance.

The other big thing I like is that notebooks have keyboard shortcuts. These were two of the things keeping me from using ADS as much as I’d wanted. Now I’m that much closer to full-on migration.
