Curated SQL Posts

Blocking Operators

Gail Shaw explains the difference between blocking and non-blocking operators:

A non-blocking operator is one that consumes and produces rows at the same time. Nested loop joins are non-blocking operators.

A blocking operator is one that requires that all rows from the input have been consumed before a single row can be produced. Sorts are blocking operators.

Some operators can be somewhere between the two, requiring a group of rows to be consumed before an output row can be produced. Stream aggregates are an example here.

Gail ends by explaining that this is why “Which way do you read an execution plan, right-to-left or left-to-right?” has a correct answer:  both ways.  This understanding of blocking, non-blocking, and partially-blocking operators will also help you optimize Integration Services data flows by making you think in terms of streams.
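
To see the distinction in practice, here is a minimal T-SQL sketch (the table names and the join hint are hypothetical illustrations, not from Gail's post): a nested loop join streams rows as it finds them, while an ORDER BY on an unindexed column introduces a blocking Sort operator.

    -- Hypothetical tables; the nested loop join can stream rows as they arrive.
    SELECT o.OrderID, c.CustomerName
    FROM dbo.Orders AS o
    INNER JOIN dbo.Customers AS c
        ON c.CustomerID = o.CustomerID
    OPTION (LOOP JOIN);  -- force the non-blocking join type, for illustration

    -- The Sort operator must consume every input row before emitting the first one.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    ORDER BY OrderDate;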

Ragged Right Files

Sifiso Ndlovu walks us through ragged right formatted files in Integration Services:

The configuration of columns is perhaps a critical part of the entire ETL process, as it helps us build mapping metadata for the ETL. In fact, regardless of whether or not SSIS/SSMS can detect delimiters, if you skip the Column Mapping section, your ETL will fail validation. In order to clarify how ragged right formatted files work, I have gone a step back and used Figure 4 to display a preview of our fictitious Fruits transaction dataset from Notepad++. It can already be seen from Notepad++ that the file only has a row delimiter, in the form of CRLF.

Read the whole thing.

Restoration Failures

Tibor Karaszi shows us cases in which Management Studio can generate an invalid database restoration sequence:

Above, the GUI incorrectly bases the restore on a copy-only backup. After using the timeline dialog to point to an earlier point in time, you can see that the GUI has now changed so that it bases the restore on this potentially non-existent copy-only backup. Not a nice situation to be in if the person doing the restore hasn’t practiced using the T-SQL RESTORE commands.

It’s important to be able to write the relevant T-SQL queries to restore your database, just in case you run into one of these issues.
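
As a rough sketch of what that looks like (the database name, file paths, and stop time here are all hypothetical), a point-in-time restore sequence written by hand might be:

    -- Restore the full backup without recovering the database.
    RESTORE DATABASE MyDb
        FROM DISK = N'C:\Backups\MyDb_full.bak'
        WITH NORECOVERY, REPLACE;

    -- Roll forward log backups, stopping at the desired point in time.
    RESTORE LOG MyDb
        FROM DISK = N'C:\Backups\MyDb_log_1.trn'
        WITH NORECOVERY, STOPAT = N'2016-06-01T12:30:00';

    -- Bring the database online.
    RESTORE DATABASE MyDb WITH RECOVERY;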

Powershell With Azure SQL Database

Mike Fal introduces us to Azure SQL Database operations using Powershell:

What is this all about? It took me a bit of digging, but what it boils down to is that Microsoft made a fundamental change to how things are managed within Azure. You will now find documentation on two different deployment models: Classic Deployments and Resource Manager Deployments. These two different sets of Powershell cmdlets reflect these different models: anything for Classic Deployments is handled by cmdlets in the Azure and Azure.Storage modules, while all the Resource Manager Deployment stuff is handled by the AzureRM* modules.

This is the first in a series and serves as an introduction to the topic.

AT TIME ZONE

Tom LaRock discusses the AT TIME ZONE function in SQL Server 2016:

Of course you will need to know what is allowed for you to use for the time zone name. Fortunately for us, this list is stored in the registry of the server. In other words, you can use whatever timezones are installed on the server. For a complete list you can query the sys.time_zone_info DMV:
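
Tom’s query isn’t reproduced in the excerpt, but a minimal version of both steps might look like this:

    -- List the time zone names the server knows about.
    SELECT name, current_utc_offset, is_currently_dst
    FROM sys.time_zone_info;

    -- Convert the current time to one of those zones.
    SELECT SYSDATETIMEOFFSET() AT TIME ZONE 'Central European Standard Time';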

If you work at a company with international dealings, you probably already have a time zone table somewhere, but this is a nice way of encapsulating possibly-slow time zone conversion and calculation operations.

R And SSH Tunnels

Steph Locke shows how to set up an SSH tunnel to connect to an external server within R:

Whilst down the rabbit hole, I discovered just in passing via a beanstalk article that there’s actually been a command line interface for PuTTY called plink. D’oh! This changed the whole direction of the solution to what I present throughout.

Using plink.exe as the command line interface for PuTTY we can then connect to our remote network using the key pre-authenticated via pageant. As a consequence, we can now use the shell() command in R to use plink. We can then connect to our database using the standard Postgres driver.

PuTTY is a must-have for any Windows box.

Get Directory Information For SSAS

Jens Vestergaard shows us how to get the Data, Log, Temp, and Backup directories for Analysis Services using Powershell:

Just recently a reply was made to the Connect item, highlighting the fact that the current values of the Data/Log/Temp and Backup Directories – meaning the currently configured values – are exposed through the Server.ServerProperties collection. According to the answer, only public property values are exposed.

Using PowerShell, we can now retrieve the desired information from any given instance of Analysis Services. Doing so would look something like this:

It’s good to know that this information is available via Powershell.

Compression Performance

Rolf Tesmer digs into compression performance when building an index whose leading column has low cardinality:

That first one is a cracker – it hit me once when compressing a SQL Server table (600M+ rows) on a 64 core Enterprise SQL Server.  After benchmarking several other data compression activities I thought I had a basic “rule of thumb” (based on GB data size and number of rows)… which just happened to be coincidence!

This also begs the question: why would you use low-selectivity indexes?  Well I can think of a few cases – but the one which stands out the most is the identification of a small number of rows within a greater collection – such as an index on TYPE columns (i.e., [ProcessingStatusFlag] CHAR(1) = [P]rocessed, [U]nprocessed, [W]orking, [F]ailed, etc.)

… AND SO – let’s do some testing to validate this puppy!

There’s a significant difference here, so check out Rolf’s post for the details.
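
For reference, a compressed low-selectivity index of the sort Rolf describes might be built like this (the table, column, and index names are hypothetical):

    -- The leading column has only a handful of distinct values.
    CREATE NONCLUSTERED INDEX IX_Orders_ProcessingStatusFlag
        ON dbo.Orders (ProcessingStatusFlag)
        WITH (DATA_COMPRESSION = PAGE);

    -- The typical use case: find the few unprocessed rows among many.
    SELECT OrderID
    FROM dbo.Orders
    WHERE ProcessingStatusFlag = 'U';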

Using The Default Trace

Jon Morisi shows how to use the default trace:

Oftentimes while troubleshooting an issue, you’ll want more details than you can find in the application log or SQL log.  In the background, SQL Server runs a default trace which includes a lot of items to help with troubleshooting, including (but not limited to) errors, warnings, and audit data.  I often run the following script as a quick way to find additional details for “ERROR” items from the default trace.
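
Jon’s exact script isn’t included above, but the general pattern is to find the default trace file via sys.traces and read it with fn_trace_gettable; a minimal sketch:

    -- Locate the default trace file.
    DECLARE @path NVARCHAR(260);
    SELECT @path = path
    FROM sys.traces
    WHERE is_default = 1;

    -- Pull recent events whose text mentions an error.
    SELECT StartTime, TextData, HostName, ApplicationName, LoginName
    FROM fn_trace_gettable(@path, DEFAULT)
    WHERE TextData LIKE N'%error%'
    ORDER BY StartTime DESC;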

Jon notes that the default trace has been put on the deprecation list, so keep that in mind if you do use it.

Pulling Non-Clustered Index Data

Kenneth Fisher shows how you might use a non-clustered index to reconstruct data lost to corruption in a clustered index:

So why would you want to do this? Well, let’s say for example you have a table in a database where the clustered index has become corrupted. Let’s further say that no one mentioned this to you for .. say a year. (No judging!) So your only option at this point might be to use the REPAIR_ALLOW_DATA_LOSS option of DBCC CHECKDB. But when you are done, how much data has actually been lost? Can you get any of it back?

If you’ve lived a good life and are very lucky, you might recover all data this way.  Otherwise, it’s a good idea to run CHECKDB more frequently and check those backups regularly as well.
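
A minimal sketch of the idea, with hypothetical object names: force the non-clustered index so the query never touches the damaged clustered index, and read back whatever key and included columns that index carries.

    -- Only the columns present in the non-clustered index survive this way.
    SELECT OrderID, CustomerID, OrderDate
    FROM dbo.Orders WITH (INDEX (IX_Orders_CustomerID));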
