Press "Enter" to skip to content

Author: Kevin Feasel

sp_PressureDetector Updates

Erik Darling continues to be busy:

I haven’t run into a bug with this thing in just about… Well, ever. This proc has been pretty solid.

As I use it though, especially in front of clients, I find myself wanting some additional data points all in one place. With this release, I’ve added a few small things.

Click through for the list of updates.

Comments closed

Power BI and Synapse Book Roundup

Chris Webb checks out some books:

I like free stuff and I like books, so of course I like free books – and it seems that the more I provide free publicity for relevant books here the more free books I get sent. I’ve now got enough to merit writing another post covering those I’ve received recently from various publishers and authors. As always these are not reviews, just short summaries of books you might want to check out.

Read on for the not-reviews.

Comments closed

Pre-Processing Data Explorer Data with Spark

Hauke Mallow does some data engineering:

We often see customer scenarios where historical data has to be migrated to Azure Data Explorer (ADX). Although ADX has very powerful data-transformation capabilities via update policies, sometimes more or less complex data engineering tasks must be done upfront. This happens if the original data structure is too complex or just single data elements being too big, hitting data explorer limits of dynamic columns of 1 MB or maximum ingest file-size of 1 GB for uncompressed data (see also Comparing ingestion methods and tools) .

Let’s think about an Industrial Internet-of-Things (IIoT) use-case where you get data from several production lines. In the production line several devices read humidity, pressure, etc. The following example shows a scenario where a one-to-many relationship is implemented within an array. With this you might get very large columns (with millions of device readings per production line) that might exceed the limit of 1 MB in Azure Data Explorer for dynamic columns. In this case you need to do some pre-processing.

Click through to see how you can do this with an Azure Synapse Analytics Spark pool prior to ingesting it with a Data Explorer pool.

Comments closed

Replacing Error Values on All Columns in Power BI

Kristyna Hughes needs to clear out some errors:

End users that are not trained in data governance but are actively involved in maintaining a data set can easily make data entry mistakes or create inconsistent data types within columns. For example, you may have a column in the dataset called “Sales” and instead of 0, someone may type “None” or “NA”. When this gets loaded into Power BI, Power BI will not know how to convert the text value “None” to a number, and it will throw an error on the refresh of the report.

One way to mitigate the impact of user-entered data is to replace errors with null values. This is not ideal since it doesn’t fix the data entry issues, but it does enable reports to still be refreshed and used while the data issues are addressed. In Power Query, you can manually replace the errors with null by going to the “Transform” tab then selecting the drop down for “Replace Values” and choosing “Replace Errors”.

Click through for a script which does this for all columns in Power Query.

Comments closed

CETAS and the Serverless SQL Pool

Liliam Leme takes us through the Create External Table as Select operation in the Azure Synapse Analytics serverless SQL pool:

Serverless SQL pool  has a very cool way to handle CREATE EXTERNAL TABLE AS SELECT (CETAS).  I mean, you can use a serverless SQL pool to create a CETAS which will materialize the query results. It means a heavy query in which, for example, the results would be part of future large join with other queries or aggregations that can be consolidated for reporting purposes. Those scenarios are examples that could be used as a CETAS. CETAS will be stored as a parquet file accessed by an external table in your storage and the performance is awesome. Later I am suggesting how to recreate the CETAS using pipeline against serverless SQL Pool.

Click through to see it in action.

Comments closed

Pinned SSMS Tabs in a Separate Row

Kenneth Fisher keeps ’em separated:

I use this all of the time. I have a solution in SSMS to store all of the queries I use on a regular basis. I open one of the queries and then pin it to the top to keep it sperate from the rest of my sometimes literally dozens of queries. Things like my query to figure out where the backups for this instance are, and my diagnostic queries are almost always up there at the top left. So I open the file, pin the file. A few days later when I end up re-opening SSMS I open the file, and I pin the file. Not hard, but rather repetitive.

The other day though I found a fantastic option.

Click through to learn what Kenneth found.

Comments closed

Deleting from Multiple Tables by Key

Guy Glantser needs to delete some data:

Sometimes you need to delete data from multiple tables in a database. For example, you might have a multi-tenant database, and you need to delete all the data that belong to some tenant.

The problem is that there are many tables in the database that contain data, which you need to delete. If you have a column like “TenantId” in all tables, then your life is easier, because you have a simple predicate to apply to your DELETE statements against all tables. But even then, if there are foreign keys between tables to enforce referential integrity, then things get more complicated.

Read through for Guy’s answer, which definitely works and can be the quickest solution. If you can’t drop foreign key constraints (even temporarily), I have a post from a while back on tracing foreign keys to “levels.” The post only covers finding the ordering but could be extended to delete data one level at a time.

Comments closed

The Benefit of Tick Marks on a Visual

Alex Velez lays out the case for tick marks:

Lately, I’ve noticed that more and more graphs don’t include gridlines. If it’s unclear, I believe this to be a positive trend. I, myself, rarely use gridlines, and often remove them when I find them in a graph I’m reviewing. But I don’t stop there. 

More often than not, if a chart has gridlines, it will be lacking tick marks along the axis, and possibly an axis line as well. 

Read on to see why.

Comments closed

Sorting in KQL

Robert Cain continues a series on KQL:

Like most query languages, the Kusto Query Language as the ability to sort the output. It works almost, but not quite, like you expect. So let’s take a look at the KQL sort operator.

Read on to get the general idea but also some of the nuance behind this operator.

Comments closed

Storing sp_BlitzIndex Details between Reboots

Tracy Boggiano continues a series on starting a new job as a DBA:

I mentioned in my New Database Job – The 90 Day Plan blog how I have a trick for storing index usage stats up until close to the next reboot of the SQL Server.  You really can do this for any DMV related query that you gets reset at the reboot of a SQL Server instance.  With this I am able to have data of the most the current index usage stats up to the midnight before a reboot of my SQL instances to analyze.

This is a good one to store, as is the output of sp_WhoIsActive.

Comments closed