Press "Enter" to skip to content

Month: September 2019

Documenting Integration Services Packages

Dave Mason continues a quest for documentation:

The output is a tabular result showing each SSIS package, their Names and Descriptions, and the top-level task Names and Descriptions of each subplan. (In addition to the “DailyETLMain” package, we see metadata for some other plans related to the Management Data Warehouse Data Collector.) Note the 16 rows of metadata for the “DailyETLMain” package correspond to the 16 top-level objects of the package–the query doesn’t recurse into containers to obtain their object metadata. I decided not to attempt that–it seemed like overkill for documentation purposes. Another caveat: the results order may not match the order that’s mandated by Precedence Constraints in the Visual Studio designer.

Click through for the full example.

Comments closed

SQL Server and Integer Math

Bert Wagner reminds us that integer math is a thing:

To determine how much lumber I would need for building the new walls, I decided to write a SQL query to help with my framing calculations. I was building a 6 foot wall and wanted to put a stud every 16 inches. Easy enough to do the math on this:

SELECT (6*12)/16

The output of the query above was 4, indicating the number of studs I would need for one wall section.

What’s interesting is that if we do this same equation in a calculator, we get a slightly different answer: 4.5.

Click through for Bert’s thoughts on the issue. Other languages and platforms do this as well, so it’s not unique to SQL Server, but if you’re not used to it, you might be in for a surprise.

Comments closed

Delta Lake Schema Enforcement

Burak Yavuz, et al, explain the concept of schema enforcement with Databricks Delta Lake:

Schema enforcement, also known as schema validation, is a safeguard in Delta Lake that ensures data quality by rejecting writes to a table that do not match the table’s schema. Like the front desk manager at a busy restaurant that only accepts reservations, it checks to see whether each column in data inserted into the table is on its list of expected columns (in other words, whether each one has a “reservation”), and rejects any writes with columns that aren’t on the list.

Something something “relational database” something something. They also walk us through some examples in a Databricks notebook, so check that out.

Comments closed

Performance Spools Plus Nested Loop Joins

Paul White takes us through the different types of performance-related spools you might see in an execution plan:

All performance spools are lazy. The spool’s worktable is gradually populated, a row at a time, as rows stream through the spool. (Eager spools, by contrast, consume all input from their child operator before returning any rows to their parent).

Performance spools always appear on the inner side (the lower input in graphical execution plans) of a nested loops join or apply operator. The general idea is to cache and replay results, saving repeated executions of inner-side operators wherever possible.

When a spool is able to replay cached results, this is known as a rewind. When the spool has to execute its child operators to obtain correct data, a rebind occurs.

You may find it helpful to think of a spool rebind as a cache miss, and a rewind as a cache hit.

Read the whole thing. I am admittedly the type of person Paul mentions at the end (I reflexively hiss when I have a query performing poorly and see a spool).

Comments closed

Power BI Admin API Updates

Matthew Roche takes us through a few updates to the Power BI Admin API:

Even with these new experiences, there will still be times when you want or need to use the Power BI API to get insight into all the workspaces in the Power BI tenant for which you are an administrator. This ability isn’t new, but some recent updates to the Power BI admin API have made it easier.

I don’t do a lot of development these days, so when a post titled Avoiding workspace loops by expanding navigation properties in the GetGroupsAsAdmin API showed up on the Power BI blog, I didn’t pay much attention. I should have.

Read on for the full set of updates.

Comments closed

SSIS Term Extraction

Tim Mitchell takes us through term extraction in SQL Server Integration Services:

The SSIS term extraction component is a data flow transformation that will aggregate the frequency of words found in a specified column supplied by an upstream data source. This component expects a Unicode text or text stream field as an input, and calculates as an output the frequency of the nouns and/or noun phrases in the specified source column. As shown below on the data flow surface, the term extraction component (circled) is always used as a transformation, accepting exactly one input and expecting either one or two outputs (the second being an optional error output).

This is one component I’ve never used before.

Comments closed

Automating Azure Data Studio Notebooks

Aaron Nelson has two separate ways of scheduling Azure Data Studio notebooks for us:

There are two new options for automating your SQL Notebooks with your SQL Servers. Earlier this month, the Insiders build of Azure Data Studio received the ability to add SQL Notebooks in SQL Agent. This past Friday (September 20th, 2019) a new version of the SqlServer PowerShell module was posted to the Gallery, with a new Invoke-SqlNotebook cmdlet.

Read on for demos of both.

Comments closed

SQL Server Management Studio 18.3

Dinakar Nethi announces a new version of SQL Server Management Studio:

Fixed the hard to see dialog for Query Options when the user right-clicks on the T-SQL editor window by making it resizable.

The Completion time message that’s visible in the result grid/file  that was introduced in SQL Server Management Studio 18.2 is now configurable under Tools > Options > Query Execution > SQL Server > Advanced > Show completion time.

In the connection dialog, we replaced Active Directory – Password and Active Directory – Integrated with Azure Active Directory – Password and Azure Active Directory – Integrated, respectively.

There are a pretty good number of bugfixes in this release.

Comments closed

Record Transformation with cdata

John Mount shows off one of the advantages of using cdata to define data-driven record transformation specifications:

We have a tutorial on how to design such transforms by writing down the shape your incoming data records are arranged in, and also the shape you wish your outgoing data records to be arranged in.

This simple data transform is in fact not a single pivot/un-pivot, as the result records spread data-values over multiple rows and multiple columns at the same time. We call the transform simple, because from a user point of view: it takes records of one form to another form (with the details left to the implementation).

Read the whole thing.

Comments closed

Ring Buffer CPU Over 100%

Thomas Rushton ran into an oddity with sys.dm_os_ring_buffers:

Wait! Process Utilization + Idle adds up to waaaaay over 100…What’s going on?

My first thought was that the CPU was being throttled, so I checked the windows power options – for some really irritating reason, Windows defaults to a “balanced” power setting, which is rubbish for SQL Server.

That looks OK, but… if you check the CPU page of the Task Manager, things are a bit more interesting:

Unfortunately, the story doesn’t have a conclusion, but Thomas’s conjecture does make sense.

Comments closed