Press "Enter" to skip to content

Month: August 2021

Scanning and Classification with Azure Purview

Angela Henry continues a series on Azure Purview:

In our previous article for this series, Purview Part 2: Data Catalog, we examined the portion of the end user experience where people will spend the majority of their time. But the question is, how does that Data Catalog get populated? The Data Catalog is populated by the Scanning and Classification features of Purview, which is the focus of this article.

Click through to see what you need to set up and how the process works.

Comments closed

Getting Distinct Values before STRING_AGG

Greg Dodd shows how to remove duplicate values from a list before passing them to the STRING_AGG() function:

SQL introduced the new STRING_AGG feature in SQL 2017, and it works just like it suggests it would: it’s an aggregate function that takes all of the string values and joins them together with a separator. To see how it works, I’m using the StackOverflow users table, and let’s say we want to create a list of Display Names and we’re going to group it based on Location:

Click through for two methods, one of which is considerably better than the other.

Comments closed

Sending an E-Mail upon Database Creation

Thomas Williams has a job for SQL Server:

As part of commissioning a SQL Server, I create a server trigger to send an e-mail when a new database is created (adapted from https://www.mssqltips.com/sqlservertip/2864/email-alerts-when-new-databases-are-created-in-sql-server/), and a SQL Agent alert which e-mails when a database is restored (inspired by Jeremy Dearduff’s comment at https://www.brentozar.com/archive/2017/06/tracking-restores-hard/#comment-2446362).

After receiving the e-mail I can follow up and include the database in an inventory. See below for the trigger and alert scripts – feel free to use these as a basis for your own monitoring and inventory.

Click through for the script, as well as an important disclaimer.

Comments closed

The Cost of a Checkbox: Power Apps Edition

Paul Thurrott looks at a security issue:

Over 1000 web apps created with Microsoft’s Power Apps inadvertently exposed the data from over 38 million users thanks to a misconfiguration, according to a new report in Wired. The good news? The issue has been fixed and no customers are known to have been compromised.

“We found [a web app created with Power Apps] that was misconfigured to expose data and we thought, we’ve never heard of this, is this a one-off thing or is this a systemic issue?” UpGuard vice president Greg Pollock told Wired. “Because of the way the Power Apps portals product works, it’s very easy to quickly do a survey. And we discovered there are tons of these exposed. It was wild.”

“Known to have been compromised” probably needs a “yet” in there somewhere. Read the whole thing.

Comments closed

Improving Dataset Refresh with Query Folding and the Dataflows Connector

Chris Webb has a performance tip for us:

You may have noticed that a new dataflows connector was announced in the August 2021 release of Power BI Desktop, and that it now supports query folding between a dataset and a dataflow – which you may be surprised to learn was not possible before. In this post I thought I’d take a look at how much of an improvement in performance this can make to dataset refresh performance.

Click through for the demonstration.

Comments closed

Power BI Cleaner Gen2

Imke Feldmann has an update to the Power BI Cleaner:

In this post I’m sharing a new version of my Power BI Cleaner tool. Besides from some bug fixes it can now automatically identify and analyze reports that are connected to an Analysis Services Model in tabular mode.

Just a warning that the live connections to Analysis Service or shared Power BI dataset might be slow, as Power Query will retrieve the data from all datasets that are available for your from the service. I might include a feature to pre-select datasets in a future release if there is interest. So please add comments if you want to see this feature.

Read on to see how to set everything up, as well as the things you can do with it.

Comments closed

Query Waits in the QDS Toolbox

Jared Poche continues a series on QDS Toolbox:

Carrying on from my previous post on the QDS Toolbox, let’s review the Query Waits component.

This component provides details about the wait types associated with a given object, query, or plan. This doesn’t pull data from the reports generated by the Server Top Queries component; this gets the data directly from the Query Store in the database you specify. So you can use this to review the waits on a given procedure, even if you haven’t generated any reports with Server Top Queries .

That being said, reports from Server Top Queries could identify problem queries and give you the PlanID, QueryID, or ObjectName you need to run the procedure. A query that spends a lot of time waiting would tend to have a higher duration and lower active metrics like CPU usage or logical reads. So those are the queries I would want to run the Query Waits report against.

Read on to see how to call this procedure and what you might expect.

Comments closed

From API Call to ML Services Prediction

Tomaz Kastrun continues a series:

From the previous two blog posts:

Creating REST API for reading data from Microsoft SQL Server in web browser

Writing Data to Microsoft SQL Server from web browser using REST API and node.js

We have looked into the installation process of Node.js, setup of Microsoft SQL Server and made couple of examples on reading the data from database through REST API and how to insert data back to database.

In this post, we will be looking the R predictions using API calls against a sample dataset.

Click through to see it in action.

Comments closed

Projecting Disk Space Available

Constantine Kokkinos predicts the future:

The first question I wanted to model out was a bigger issue with on-premises databases – when are we going to run out of storage?

Back in the day I’d cheat with msdb backups, comparing compressed sized to actuals, and moving on. However I don’t have a historical reference for Stack Overflow… so what can I do?

Taking a look at the tables we see a commonality in many tables – CreationDate! It looks like the rows faithfully are stamped when they are created.

Constantine does at the end hit on something we tend to forget: most operations in life aren’t quite linear. We often get lucky in that certain stretches are close enough to be linear that we can model them that way, but even in this dataset, you can see the effects of polynomial growth slowly build up. Still, this is a good way of taking us through what an analysis and projection can look like.

Comments closed

Changing the Slow Query Log Threshold in RDS

John McCormack wants to know about those slow queries:

The slow query log will record all queries which are above the threshold level. The default value is 10 (seconds) but you can set it higher or lower depending on your requirements. It is useful for finding slow queries and allows you to pick out candidates for tuning.

If you set the threshold too low, it can increase I/O overhead on your instance and use a lot of valuable disk space. If you set it too high, it might not capture enough useful information.

This is a setting in AWS Relational Database Services and mimics functionality in MySQL

Comments closed