Press "Enter" to skip to content

Author: Kevin Feasel

Embedding Notebooks on a Website

Eduardo Pivaral shows how to embed the results of a Jupyter notebook created in Azure Data Studio on a website:

Notebooks are a functionality available in Azure Data Studio, that allows you to create and share documents that may contain text, code, images, and query results. These documents are helpful to be able to share database insights and create runbooks that you can share easily.

Are you new to notebooks? don’t know what are the uses for it? want to know how to create your first notebook? then you can get started in ADS notebooks checking my article for MSSQLTips.com here.

Once you have created your first notebooks and share them among your team, maybe you want to share it on your website or blog for public access.
even when you can share the file for download, you can also embed it on the HTML code.

Be sure to read the comments too. Rendering notebooks is…an imperfect operation.

Comments closed

Recompile Hints and Query Store

Erin Stellato answers two questions relating to recompile hints and whether those queries show up in Query Store:

Last week in our IEPTO2 class I was asked about queries with OPTION (RECOMPILE) and Query Store. Specifically: Do queries that have the OPTION (RECOMPILE) hint go into Query Store, AND do queries in a stored procedure created with the RECOMPILE option go into Query Store? I knew the answer to the first question, and was pretty sure I know the answer to the second one, but I wanted to test to verify. Let’s take a look.

Erin gives you a tl;dr version but I’m going to ask you to read the whole thing anyhow.

Comments closed

Defining Tidy Data

John Mount shares thoughts about the concept of tidy data:

A question is: is such a data set “tidy”? The paper itself claims the above definitions are “Codd’s 3rd normal form.” So, no the above table is not “tidy” under that paper’s definition. The the winner’s date of birth is a fact about the winner alone, and not a fact about the joint row keys (the tournament plus year) as required by the rules of Codd’s 3rd normal form. The critique being: this data presentation does not express the intended data invariant that Al Fredrickson must have the same “Winner Date of Birth” in all rows.

My spin on it is that tidy data is Boyce-Codd Normal Form but may subsequently be denormalized. This may reintroduce violations of 3NF (as in Mount’s example) and sometimes 2NF, but does not change the shape of the variables themselves—that is, a variable still represents a single thing and exists per observation.

Comments closed

Visualizing Earthquake Data

Giorgio Garziano continues a series on analyzing earthquake data:

This is the third part of our post series about the exploratory analysis of a publicly available dataset reporting earthquakes and similar events within a specific 30 days time span. In this post, we are going to show static, interactive and animated earthquakes maps of different flavors by using the functionalities provided by a pool of R packages as specifically explained herein below.

Giorgio looks at 9 separate R mapping packages, so you get your money’s worth here.

Comments closed

Testing Maximum Rows in Table Value Constructors

Solomon Rutzky dives into how big a table value constructor can be in terms of rows:

On 2019-05-08, a helpful individual, Michael B, commented on my answer saying that the 1000-row limit only existed when using a TVC as the VALUES clause of an INSERT statement. And, that there was no limit when using a TVC as a derived table.

Could this be true?

Solomon finds out. Click through and so can you.

Comments closed

Automating Log File Expansion

Max Vernon shows how you can automatically expand log files to optimize VLF counts:

SQL Server Database Log file expansion can be fairly tedious if you need to make the log bigger in many reasonably-sized growth increments. It can be tedious because you may need to write and execute a large number of ALTER DATABASE ... MODIFY FILE ... commands.

The following code automatically grows a SQL Server Database log file, using the size and growth increments you configure in the script. If you set the @DebugOnly flag to 1, the script will only print the commands required, instead of executing them. This allows you to see what exactly will be executed ahead of time. Alternately, you could copy-and-paste the commands into a query window and execute them one-by-one.

Click through for that code.

Comments closed

Azure Data Studio May Release

Alan Yu announces the May release of Azure Data Studio:

Since its release two months ago, the community continues to love SQL Notebooks. This month, we had a laser-eyed focus on quality of life bug fixes instead of new features. These improvements include:

– Markdown rendering improvements, including better support for notes and tables
– Usability improvements to the toolbar
– Markdown links for trusted notebooks no longer requires Command/Ctrl + click and can be clicked directly
– Improvements in cleaning up Jupyter processes after closing notebooks and reducing errors when starting multiple notebooks concurrently
– Improvements to SQL Notebook connections to ensure errors don’t occur when running two notebooks against the same database
– Improvements to notebook auto-scrolling to the currently executing cell when clicking the run cells button from the toolbar
– General stability and performance improvements

And based on some of the GitHub comments, I’m going to really like the June release if those changes all make it in.

Comments closed

Memory Pressure and Azure SQL Managed Instances

Jovan Popovic takes us through determining whether we have enough memory on an Azure SQL Managed Instance:

Managed Instance has memory that is proportional to the number of cores. As an example, in Gen5 architecture you have 5.1GB of memory per vCore, meaning that 8-core instance will have 41GB memory. Periodically you should check is this amount of memory good for your workload.
 
Do not monitor does the Managed Instance use ~100% of available memory. Some people believe that this is an issue (like hitting 100% CPU), but this is expected and desired behavior. Managed Instance should use as much as possible memory to cache the pages from disk into the buffer pool. The only case where you will not see near 100% usage of memory is the case where you have the databases much smaller that the available memory size so all of them can fit into the memory.

The spoiler version is that it’s the same process as on-prem.

Comments closed

Azure Data Factory: Mapping and Wrangling Data Flows

Cathrine Wilhelmsen explains the difference between Mapping Data Flows and Wrangling Data Flows in Azure Data Factory:

Now, we all know that the consultant answer to “which should I use?” is It Depends ™ 🙂 But what does it depend on?

To me, it boils down to a few key questions you need to ask:
– What is the task or problem you are trying to solve?
– Where and how will you use the output?
– Which tool are you most comfortable using?

Read on to see how they both work.

Comments closed