Press "Enter" to skip to content

Day: September 25, 2019

SQL Server and Integer Math

Bert Wagner reminds us that integer math is a thing:

To determine how much lumber I would need for building the new walls, I decided to write a SQL query to help with my framing calculations. I was building a 6 foot wall and wanted to put a stud every 16 inches. Easy enough to do the math on this:

SELECT (6*12)/16

The output of the query above was 4, indicating the number of studs I would need for one wall section.

What’s interesting is that if we do this same equation in a calculator, we get a slightly different answer: 4.5.

Click through for Bert’s thoughts on the issue. Other languages and platforms do this as well, so it’s not unique to SQL Server, but if you’re not used to it, you might be in for a surprise.

Comments closed

Delta Lake Schema Enforcement

Burak Yavuz, et al, explain the concept of schema enforcement with Databricks Delta Lake:

Schema enforcement, also known as schema validation, is a safeguard in Delta Lake that ensures data quality by rejecting writes to a table that do not match the table’s schema. Like the front desk manager at a busy restaurant that only accepts reservations, it checks to see whether each column in data inserted into the table is on its list of expected columns (in other words, whether each one has a “reservation”), and rejects any writes with columns that aren’t on the list.

Something something “relational database” something something. They also walk us through some examples in a Databricks notebook, so check that out.

Comments closed

Performance Spools Plus Nested Loop Joins

Paul White takes us through the different types of performance-related spools you might see in an execution plan:

All performance spools are lazy. The spool’s worktable is gradually populated, a row at a time, as rows stream through the spool. (Eager spools, by contrast, consume all input from their child operator before returning any rows to their parent).

Performance spools always appear on the inner side (the lower input in graphical execution plans) of a nested loops join or apply operator. The general idea is to cache and replay results, saving repeated executions of inner-side operators wherever possible.

When a spool is able to replay cached results, this is known as a rewind. When the spool has to execute its child operators to obtain correct data, a rebind occurs.

You may find it helpful to think of a spool rebind as a cache miss, and a rewind as a cache hit.

Read the whole thing. I am admittedly the type of person Paul mentions at the end (I reflexively hiss when I have a query performing poorly and see a spool).

Comments closed

Power BI Admin API Updates

Matthew Roche takes us through a few updates to the Power BI Admin API:

Even with these new experiences, there will still be times when you want or need to use the Power BI API to get insight into all the workspaces in the Power BI tenant for which you are an administrator. This ability isn’t new, but some recent updates to the Power BI admin API have made it easier.

I don’t do a lot of development these days, so when a post titled Avoiding workspace loops by expanding navigation properties in the GetGroupsAsAdmin API showed up on the Power BI blog, I didn’t pay much attention. I should have.

Read on for the full set of updates.

Comments closed

SSIS Term Extraction

Tim Mitchell takes us through term extraction in SQL Server Integration Services:

The SSIS term extraction component is a data flow transformation that will aggregate the frequency of words found in a specified column supplied by an upstream data source. This component expects a Unicode text or text stream field as an input, and calculates as an output the frequency of the nouns and/or noun phrases in the specified source column. As shown below on the data flow surface, the term extraction component (circled) is always used as a transformation, accepting exactly one input and expecting either one or two outputs (the second being an optional error output).

This is one component I’ve never used before.

Comments closed

Automating Azure Data Studio Notebooks

Aaron Nelson has two separate ways of scheduling Azure Data Studio notebooks for us:

There are two new options for automating your SQL Notebooks with your SQL Servers. Earlier this month, the Insiders build of Azure Data Studio received the ability to add SQL Notebooks in SQL Agent. This past Friday (September 20th, 2019) a new version of the SqlServer PowerShell module was posted to the Gallery, with a new Invoke-SqlNotebook cmdlet.

Read on for demos of both.

Comments closed