Press "Enter" to skip to content

Month: November 2020

Kusto Queries in Azure Data Studio Notebooks

Julie Koesmarno shows off the Kusto Query Language magic in Azure Data Studio notebooks:

To do this, you’ll need to ensure that you have Kqlmagic installed. See Install and set up Kqlmagic in a notebook. Then in a notebook, you can load Kqlmagic with %reload_ext Kqlmagic in a code cell.

The next step is then in a new code cell, you can start connecting to a Log Analytics workspace. There are three ways to do so (roughly – as I’m also learning in this space too):

1. Using Azure Active Directory Device Login authentication.
2. Using Az CLI login
3. Using Client Secret

Read on for one example using Azure AD authentication.

Leave a Comment

Deploying to Multiple SQL Server SKUs with Azure DevOps

Kevin Chant wants to deploy to all of the SQL Servers:

To give the above pipeline a bit more context the below types of SQL Server databases were updated after being unit tested (initial unit testing yaml courtesy of Sander):

– Three SQL Server 2019 instances in three Docker containers. Representing Integration, Staging and Production environments.
– At the same time the Git repository in Azure Repos would sync with a GitHub Repo. Which would then start a GitHub Action to update another database.
– An Azure SQL Database.
– Finally, a Synapse Analytics SQL Pool was also updated.

Read on to learn how.

Leave a Comment

Pruning Indexes on a Table

John McCormack takes us through an index pruning exercise:

Your SQL Server indexes can’t always be perfect. What I mean by that is you can’t index for every query on a busy SQL server. In this post I’m going to describe just one table from a server I have been tuning recently. It had 26 indexes, and that was not unusual for this database. This most likely happened over time as a result of developers creating an index per query using the table, as well as following some perceived best practices (with the best of intentions).

John got it down from 26 to 11 and shares thoughts on how. If you have that many separate indexes, it most likely stems from a failure in normalization—it’s uncommon that a properly-described table has that many unique access patterns. And that will often lead you to a hard floor which includes more indexes than you’d like without reimagining the table and going through a lot of developer pain.

Leave a Comment

Azure Data Factory Integration Runtimes

Tino Zishiri takes us through the concept of the Integration Runtime:

An Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities such as Data Flows and Data Movement. It has access to resources in either public networks, or hybrid scenarios (public and private networks).

Read on to learn more about what they do and the variety of Integration Runtimes available to you.

Leave a Comment

Using Delimiters when Performing Change Detection with Hashbytes

Andy Leonard clues us in on an important character:

A student of my SSIS training asked why I place vertical pipe delimiters between column values when calculating the hash value for a row. I sent back a quick response and then informed him that he’d inspired a blog post.

Hence, this blog post.

To understand what I am talking about, you need to know that I define a 100% efficient data integration operation as a data integration process – such as an SSIS package or ADF pipeline – that loads only new and changed rows from a source. I explain this in the video. To achieve 100% efficiency, I use the T-SQL HashBytes function for change detection.

Check out the full reasoning. A secondary reason for using delimiters is that if you’re using the CONCAT() function to combine rather than calling CAST() or CONVERT() on everything individually, CONCAT() requires at least two inputs. If you always put a delimiter at the end as well, you have at least two inputs, guaranteed. It’s a small thing, but adds to consistency when, say, you only have one non-key, type 2 field on a dimension.

Leave a Comment

Image Modification with R in Machine Learning Services

Rajendra Gupta messes with The Mouse:

There is a famous adage in English: “A picture is worth a thousand words”. You can represent your information using the image in various formats such as JPEG, PNG, GIF. Usually, we use various client tools such as MS Paint, Photo, photoshop or other client applications for working with the images. You can convert image format, modify the size, applying various effects, multiple animated images.

SQL Machine Learning language – R makes us capable of working with the images directly with the SQL Server. In this article, we will use SQL Machine Learning using R scripts for image processing.

Click through for examples.

Leave a Comment

The Performance Cost of AT TIME ZONE

Erik Darling shows that AT TIME ZONE does not scale well when used in filters against columns:

Databases really do make you pay dearly for mistakes, and new linguistic functionality is not implemented with performance in mind.

I’ve written before about how to approach date math in where clauses: Where To Do Date Math In Your Where Clause

And it turns out that this lesson is brutally true if you need to pass time zones around, too.

Read the whole thing. In this respect, AT TIME ZONE is similar to pretty much all other date operators and functions.

Leave a Comment

GREATEST and LEAST in Azure SQL Database

Arun Sirpal shows off some missing functionality in SQL Server:

Being in the cloud does have many benefits, from lower administration to fast scaling but another “side effect” of operating in Azure SQL Database is the cloud first nature of changes. By this I basically mean new features always get pushed to Azure first before the classic on-premises version so some gems come to light.

Here is one for you. Have you ever wanted MySQL’s functinality to apply LEAST() and GREATEST() argument? Well, you can now, in Azure.

I can’t say that I would use this every day or anything, but I have felt the pain of not having it. There are workarounds, though nothing as convenient as syntax. Hopefully this shows up on-prem in the next version of SQL Server.

Leave a Comment

Saving Transcripts in Powershell

Garry Bargsley shows how easy it is to save a transcript in Powershell:

This week we are going to launch a blog series geared towards folks that are new to PowerShell. The growing popularity of automation is seeing people getting started with the PowerShell scripting language.

The Start-Transcript is a built-in command that allows you to quickly build a log of actions being taken by your script. There are other ways to build logs, but for beginners using the commands that are available to you is pretty easy.

Read on for a demo.

Leave a Comment

Beyond 10GB for Power BI Users

Gilbert Quevauvilliers wants to go to infinity and beyond:

By default, when using Power BI Premium or Power BI Premium per user the dataset size is set to 10GB.

I have had the wonderful experience of refreshing my dataset and getting the following error:

In the steps below I will show you how to change this setting to allow for larger dataset sizes.

There are a few steps involved, but hey, if you’re paying for Premium, it’s worth a few steps to get this.

Leave a Comment