Press "Enter" to skip to content

Category: Notebooks

Power BI, Event Streaming, and Notebooks in Microsoft Fabric

Tomaz Kastrun continues a series on Microsoft Fabric. Day 18 has us looking at Power BI:

We have created a Power BI report directly from the data lake, and today we will check how to do the same with dashboards and paginated reports.

Day 19 covers event streaming:

In Fabric, you can create a streaming semantic model, and when selecting one you will get the usual sources:

Day 20 shows how you can work with notebooks in Microsoft Fabric:

Notebooks have been around for a long time, and the community of practitioners has proven the usability, practicality, versioning, and reliability of notebooks, not to mention their clarity and hygiene. But opinions are also divided.

The purpose of today's post is to look at a couple of notebook functionalities that might not be that straightforward.


Looping through Lakehouses in Microsoft Fabric Spark Jobs

Dennes Torres builds a loop:

I have published videos and articles before about Lakehouse maintenance. In this article I want to address a missing point for a lot of Fabric administrators: How to do maintenance on multiple lakehouses that are located in different workspaces.

One of the videos I have published explains the maintenance of multiple lakehouses, but only addresses maintenance in a single workspace. Is it a good idea to keep multiple lakehouses in the same workspace? Probably not.

Click through for the process.
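Dennes has his own implementation; as a rough, hedged sketch of the general shape (not his code), something like the following could enumerate lakehouses across workspaces with semantic-link (sempy) and run Delta maintenance against their OneLake paths. The item-listing calls, the returned column names, and the abfss path format are assumptions based on the library's documented surface.

```python
# A minimal sketch, not Dennes's code: loop over every workspace and lakehouse
# visible to the notebook and run OPTIMIZE on each Delta table via its OneLake
# path. Column names returned by sempy may differ by version.
import sempy.fabric as fabric
from notebookutils import mssparkutils  # built into Fabric Spark notebooks

for _, ws in fabric.list_workspaces().iterrows():
    items = fabric.list_items(workspace=ws["Id"])
    for _, lh in items[items["Type"] == "Lakehouse"].iterrows():
        tables_root = (
            f"abfss://{ws['Id']}@onelake.dfs.fabric.microsoft.com/"
            f"{lh['Id']}/Tables"
        )
        for tbl in mssparkutils.fs.ls(tables_root):
            # OPTIMIZE compacts small files; add VACUUM here if needed
            spark.sql(f"OPTIMIZE delta.`{tbl.path}`")
```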


Creating Charts in Microsoft Fabric Notebooks using Vega

Phil Seamark tries out Vega in a Microsoft Fabric notebook:

I recently needed to generate a quick visual inside a Microsoft Fabric notebook. After a little internet searching, I found there are many good-quality charting libraries in Python; however, it was going to take too long to figure out how to create a very specific type of chart.

This is where Vega came to the rescue. The purpose of this short article is to share a very simple implementation of generating a Vega chart using a Microsoft Fabric notebook.

Click through for the example code.
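Phil's post has the real example; as a flavour of the general idea (not his code), here's a minimal sketch that renders a chart in a notebook by embedding a spec with vega-embed loaded from a CDN. It uses a tiny Vega-Lite bar chart with made-up data purely for illustration, and whether external scripts load depends on the notebook environment.

```python
# A minimal sketch (not Phil's exact code): render a Vega-Lite spec inside a
# notebook by embedding it with vega-embed loaded from a CDN.
import json
from IPython.display import HTML, display

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": [{"x": "A", "y": 28}, {"x": "B", "y": 55}, {"x": "C", "y": 43}]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "x", "type": "nominal"},
        "y": {"field": "y", "type": "quantitative"},
    },
}

html = f"""
<div id="vis"></div>
<script src="https://cdn.jsdelivr.net/npm/vega@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-lite@5"></script>
<script src="https://cdn.jsdelivr.net/npm/vega-embed@6"></script>
<script>vegaEmbed("#vis", {json.dumps(spec)});</script>
"""
display(HTML(html))
```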


Visualizing a Power BI Refresh with the semantic-link Library

Phil Seamark builds a notebook:

A few blogs back I shared a technique using Power BI Profiler (or VS Code) to run and capture a trace over a refresh of a Power BI semantic model (the object formerly known as a dataset).

I’ve since received a lot of positive feedback from people saying how useful it was to visualize each internal step within a problematic Power BI refresh. Naturally, in the age of Fabric, I’m keen to share how the same approach works using a Microsoft Fabric Notebook.

Click through to see how you can do it.
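As a small taster (a hedged sketch, not Phil's code), semantic-link exposes the refresh itself from a notebook; capturing and visualizing each internal step of the trace is what the post walks through. The dataset and workspace names below are placeholders, and it assumes the sempy package is available in the Fabric notebook.

```python
# A minimal sketch: kick off a semantic model refresh from a Fabric notebook
# with semantic-link (sempy). Dataset and workspace names are placeholders;
# capturing and charting the refresh trace is the part Phil's post covers.
import sempy.fabric as fabric

print(fabric.list_datasets())   # semantic models visible to the notebook

fabric.refresh_dataset(
    dataset="Sales Model",      # placeholder semantic model name
    workspace="Analytics",      # placeholder workspace name
    refresh_type="full",
)
```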


Loading Data from SharePoint Lists into Microsoft Fabric

Stepan Resl loads some data:

In the era of Fabric, it's worth pointing out our three options for data ingestion:

  • Data Pipelines with Copy Activity
  • Dataflows Gen 2
  • Notebooks

We must compare them to understand what each can offer us from different perspectives. To compare them thoroughly, we need to set some guardrails so that every approach is tested the same way.

My biggest takeaway from this is: don't load important business data into SharePoint lists to begin with.
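For the notebook option specifically, here is a rough sketch of the shape it takes (placeholders throughout, not Stepan's code): call the SharePoint REST API, then land the rows in a lakehouse table. Token acquisition is deliberately elided.

```python
# A rough sketch of the notebook approach, not Stepan's code: read a SharePoint
# list via its REST API and write the rows to a lakehouse Delta table. The site
# URL, list name, and access token are placeholders; acquiring the token (for
# example via a service principal) is elided.
import requests

site = "https://contoso.sharepoint.com/sites/Finance"       # placeholder site
token = "<access token acquired for SharePoint>"            # placeholder
resp = requests.get(
    f"{site}/_api/web/lists/getbytitle('Invoices')/items",  # placeholder list
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/json;odata=nometadata",
    },
)
resp.raise_for_status()
rows = resp.json()["value"]

df = spark.createDataFrame(rows)  # spark session is predefined in Fabric notebooks
df.write.mode("overwrite").saveAsTable("invoices_from_sharepoint")
```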


Enabling Python and R Support for VS Code Polyglot Notebooks

Joy George Kunjikkur enables a preview option:

Obviously, we should have Polyglot Notebooks up and running. The first step to enable the Python preview is to install Jupyter on the machine and make sure the Python kernel spec is available. Run the command below to make sure it is there.

It looks like what the preview is doing is shelling out to Jupyter notebooks, so I’d imagine variables won’t cross over between languages.
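The check the quote alludes to is presumably the standard Jupyter kernel-spec listing; here's a hedged version run from Python (the exact command in Joy's post may differ):

```python
# Presumably the kind of check the post describes: confirm Jupyter is installed
# and a Python kernel spec is registered. "jupyter kernelspec list" is a
# standard Jupyter CLI command; whether it matches the post exactly is a guess.
import subprocess

result = subprocess.run(
    ["jupyter", "kernelspec", "list"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # expect an entry such as: python3  /path/to/kernels/python3
```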


Cache Management and Semantic Link in Fabric Notebooks

Marc Lelijveld warms up the cache:

In the previous blog, I wrote about data temperature as part of Fabric when you're using Direct Lake storage mode. In that blog, I explained how you can get insights into the temperature of a column, what that temperature means, and what the impact of the temperature is on columns that are queried more often.

In this blog, I will continue this story by elaborating on a process called framing and how you can influence data eviction to drop data from memory. Finally, this blog goes into more detail on how you could use Semantic Link in Fabric notebooks to warm up the data for optimal end-user performance.

The SQL Server analog here is having some automated queries which keep specific pages in the buffer pool, like a warm-up script for an instance with plenty of memory but slow disks.
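In that spirit, a warm-up from a Fabric notebook could be as simple as issuing a cheap DAX query per column through semantic-link so Direct Lake pages those columns into memory. The dataset, table, and column names and the specific DAX pattern below are illustrative assumptions, not Marc's exact approach.

```python
# A minimal warm-up sketch, not Marc's exact code: touch a handful of frequently
# queried columns so Direct Lake loads them into memory before users hit the
# report. All names here are placeholders.
import sempy.fabric as fabric

columns_to_warm = [("Sales", "Amount"), ("Sales", "OrderDate"), ("Customer", "Region")]

for table, column in columns_to_warm:
    dax = f"EVALUATE TOPN(1, VALUES('{table}'[{column}]))"  # cheap query that reads the column
    fabric.evaluate_dax("Sales Model", dax, workspace="Analytics")
```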


Parameterizing Databricks Notebooks with Widgets

Meagan Longoria adds some widgets:

Widgets provide a way to parameterize notebooks in Databricks. If you need to call the same process for different values, you can create widgets to allow you to pass the variable values into the notebook, making your notebook code more reusable. You can then refer to those values throughout the notebook.

Click through to learn more about the four types of widgets and how they work.
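For a flavour of the API, the four types look like this in a Databricks notebook (the widget names, defaults, and choices below are invented):

```python
# dbutils is available by default in Databricks notebooks; names and values
# below are made up for illustration.
dbutils.widgets.text("run_date", "2023-12-01", "Run date")
dbutils.widgets.dropdown("env", "dev", ["dev", "test", "prod"], "Environment")
dbutils.widgets.combobox("region", "emea", ["emea", "amer", "apac"], "Region")
dbutils.widgets.multiselect("channels", "web", ["web", "store", "partner"], "Channels")

# Read the values back anywhere in the notebook
run_date = dbutils.widgets.get("run_date")
env = dbutils.widgets.get("env")
```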


ggplot2 in Python Notebooks

John Mount runs R in Python with rpy2:

For an article on A/B testing that I am preparing, I asked my partner Dr. Nina Zumel if she could do me a favor and write some code to produce the diagrams. She prepared an excellent parameterized diagram generator. However, being the author of the book Practical Data Science with R, she built it in R using ggplot2. This would be great, except the A/B testing article is being developed in Python, as it targets programmers familiar with Python.

As the production of the diagrams is not part of the proposed article, I decided to use the rpy2 package to integrate the R diagrams directly into the new worksheet. Alternatively, I could translate her code into Python using one of: Seaborn objects, plotnine, ggpy, or others. The large number of options is evidence of how influential Leland Wilkinson's grammar of graphics (gg) is.

Click through to see how you can execute R code within the context of Python, similar to how you can use the reticulate package to execute Python code in the context of R.
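The typical notebook pattern (a minimal sketch with made-up data, not the authors' diagram generator) uses rpy2's IPython extension so a cell can hand a pandas data frame to ggplot2:

```python
# A minimal sketch, not the article's code: run ggplot2 from a Python notebook
# via rpy2's IPython extension. Requires R, ggplot2, and rpy2 to be installed.
import pandas as pd

df = pd.DataFrame({"x": range(10), "y": [v * v for v in range(10)]})

# In a notebook, load the extension once:
#   %load_ext rpy2.ipython
# then a cell starting with the %%R magic can use the data frame directly:
#   %%R -i df
#   library(ggplot2)
#   ggplot(df, aes(x = x, y = y)) + geom_line()
```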


Generating Reproducible Reports with Jupyter and Quarto

Parisa Gregg and Myles Mitchell don’t need to copy and paste for their TPS reports:

Quarto is free-to-use, open-source software based on Pandoc that enables users to convert plain-text files into a range of formats, including PDF, HTML, and PowerPoint presentations. These documents can contain a mixture of narrative text, Python code, and figures that are dynamically generated by the embedded code.

This has many use-cases:

  • Your company may have a weekly board meeting to go over the latest sales figures. By having a Quarto presentation that pulls in the latest company sales data, you can regenerate the presentation slides each week at the click of a button.
  • As a researcher you may be preparing a report for publication. By having the code that generates data tables and figures embedded within the report, regenerating the draft as the experimental data floods in is a breeze!

Read on for a fun example of how you could automate a research-driven report.
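As a minimal illustration of the idea (the file name, data, and chunk contents are made up, and it assumes the quarto CLI, Jupyter, pandas, and matplotlib are installed), a tiny .qmd mixes a YAML header, narrative text, and a Python chunk, and rendering it regenerates the output whenever the data changes:

````python
# A minimal sketch: write a tiny Quarto document from Python and render it to
# HTML. File name and contents are made up for illustration.
from pathlib import Path
import subprocess

qmd = """---
title: "Weekly sales"
format: html
---

The latest figures, regenerated from the embedded code:

```{python}
import pandas as pd
sales = pd.DataFrame({"week": [1, 2, 3], "revenue": [100, 120, 140]})
sales.plot(x="week", y="revenue")
```
"""

Path("report.qmd").write_text(qmd)
subprocess.run(["quarto", "render", "report.qmd"], check=True)  # writes report.html
````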
