Press "Enter" to skip to content

Curated SQL Posts

Delta Lake Schema Enforcement

Burak Yavuz, et al, explain the concept of schema enforcement with Databricks Delta Lake:

Schema enforcement, also known as schema validation, is a safeguard in Delta Lake that ensures data quality by rejecting writes to a table that do not match the table’s schema. Like the front desk manager at a busy restaurant that only accepts reservations, it checks to see whether each column in data inserted into the table is on its list of expected columns (in other words, whether each one has a “reservation”), and rejects any writes with columns that aren’t on the list.

Something something “relational database” something something. They also walk us through some examples in a Databricks notebook, so check that out.

Comments closed

Performance Spools Plus Nested Loop Joins

Paul White takes us through the different types of performance-related spools you might see in an execution plan:

All performance spools are lazy. The spool’s worktable is gradually populated, a row at a time, as rows stream through the spool. (Eager spools, by contrast, consume all input from their child operator before returning any rows to their parent).

Performance spools always appear on the inner side (the lower input in graphical execution plans) of a nested loops join or apply operator. The general idea is to cache and replay results, saving repeated executions of inner-side operators wherever possible.

When a spool is able to replay cached results, this is known as a rewind. When the spool has to execute its child operators to obtain correct data, a rebind occurs.

You may find it helpful to think of a spool rebind as a cache miss, and a rewind as a cache hit.

Read the whole thing. I am admittedly the type of person Paul mentions at the end (I reflexively hiss when I have a query performing poorly and see a spool).

Comments closed

Power BI Admin API Updates

Matthew Roche takes us through a few updates to the Power BI Admin API:

Even with these new experiences, there will still be times when you want or need to use the Power BI API to get insight into all the workspaces in the Power BI tenant for which you are an administrator. This ability isn’t new, but some recent updates to the Power BI admin API have made it easier.

I don’t do a lot of development these days, so when a post titled Avoiding workspace loops by expanding navigation properties in the GetGroupsAsAdmin API showed up on the Power BI blog, I didn’t pay much attention. I should have.

Read on for the full set of updates.

Comments closed

SSIS Term Extraction

Tim Mitchell takes us through term extraction in SQL Server Integration Services:

The SSIS term extraction component is a data flow transformation that will aggregate the frequency of words found in a specified column supplied by an upstream data source. This component expects a Unicode text or text stream field as an input, and calculates as an output the frequency of the nouns and/or noun phrases in the specified source column. As shown below on the data flow surface, the term extraction component (circled) is always used as a transformation, accepting exactly one input and expecting either one or two outputs (the second being an optional error output).

This is one component I’ve never used before.

Comments closed

Automating Azure Data Studio Notebooks

Aaron Nelson has two separate ways of scheduling Azure Data Studio notebooks for us:

There are two new options for automating your SQL Notebooks with your SQL Servers. Earlier this month, the Insiders build of Azure Data Studio received the ability to add SQL Notebooks in SQL Agent. This past Friday (September 20th, 2019) a new version of the SqlServer PowerShell module was posted to the Gallery, with a new Invoke-SqlNotebook cmdlet.

Read on for demos of both.

Comments closed

SQL Server Management Studio 18.3

Dinakar Nethi announces a new version of SQL Server Management Studio:

Fixed the hard to see dialog for Query Options when the user right-clicks on the T-SQL editor window by making it resizable.

The Completion time message that’s visible in the result grid/file  that was introduced in SQL Server Management Studio 18.2 is now configurable under Tools > Options > Query Execution > SQL Server > Advanced > Show completion time.

In the connection dialog, we replaced Active Directory – Password and Active Directory – Integrated with Azure Active Directory – Password and Azure Active Directory – Integrated, respectively.

There are a pretty good number of bugfixes in this release.

Comments closed

Record Transformation with cdata

John Mount shows off one of the advantages of using cdata to define data-driven record transformation specifications:

We have a tutorial on how to design such transforms by writing down the shape your incoming data records are arranged in, and also the shape you wish your outgoing data records to be arranged in.

This simple data transform is in fact not a single pivot/un-pivot, as the result records spread data-values over multiple rows and multiple columns at the same time. We call the transform simple, because from a user point of view: it takes records of one form to another form (with the details left to the implementation).

Read the whole thing.

Comments closed

Ring Buffer CPU Over 100%

Thomas Rushton ran into an oddity with sys.dm_os_ring_buffers:

Wait! Process Utilization + Idle adds up to waaaaay over 100…What’s going on?

My first thought was that the CPU was being throttled, so I checked the windows power options – for some really irritating reason, Windows defaults to a “balanced” power setting, which is rubbish for SQL Server.

That looks OK, but… if you check the CPU page of the Task Manager, things are a bit more interesting:

Unfortunately, the story doesn’t have a conclusion, but Thomas’s conjecture does make sense.

Comments closed

Documenting Maintenance Plans

Dave Mason wants to document each SQL Agent job, including maintenance plans:

I’m not a regular user of Maintenance Plans for SQL Server, but I run into them from time to time. I had a task to document all of the SQL Agent jobs, which for a number of environments, included some Maintenance Plans. This became a more time consuming task than I had anticipated!

I had known beforehand that Maintenance Plans were SSIS packages under the covers. So I started with a query on msdb.dbo.sysssispackages. I also knew that SSIS packages are essentially XML data. 

Take advantage of the bulk of time Dave had to sink into this.

Comments closed

Query Store Q&A

Erin Stellato had a lot of questions about Query Store, and the answers turned into a blog post:

5. If you have 3+ plans how does SQL Server decide which plan to use?
A: I assume this is specific to the Automatic Plan Correction feature, and if so, it will force the last good plan (most recent plan that performed better than the current plan).  More details in my Automatic Plan Correction in SQL Server post.
 
6. What equivalent options we have for lower versions?
A:  There is an open-source tool called Open Query Store for versions prior to SQL Server 2016.

Click through for all 19 of the questions.

Comments closed