Curated SQL Posts

Naming Indexes

Monica Rathbun hits one of my hobby horses:

As you can see from above, none of the names gave a complete indication of what the index encompassed. The first one did indicate it was a Non-Clustered Index, so that was good, but it includes the date, which to me is not needed. At least I knew it was not a Clustered Index. The second index did note it is a “Covering Index,” which gave some indication that many columns could be included. I also know it was created with the Database Tuning Advisor due to the dta prefix. The third index was also created with dta, but it was left with the default dta naming convention. Like the first one, I know the date it was created, but instead of the word Cover2, I know there are 16 key columns, noted by the “K#,” and the regular numbers tell me those are included columns. However, I still have no idea exactly what these numbers denote without looking deeper into the index. The last index is closer to what I like; however, the name only tells me one column name, when in fact the index encompasses five key columns and two included columns.

I absolutely love seeing lots and lots of “_dta_” indexes; it’s a sign that I have a long day ahead of me.
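
As a rough sketch of the naming convention Monica is arguing for, here is a hypothetical index (the table and column names are made up) whose name spells out the index type, the key columns, and the included columns:

-- Hypothetical example: the name encodes type, key columns, and included columns
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerID_OrderDate_INCL_TotalDue_Status
ON dbo.SalesOrder (CustomerID, OrderDate)
INCLUDE (TotalDue, [Status]);

Anyone reading a query plan can then tell at a glance what the index covers without scripting it out.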

Comparing Databases On SQL Server Instances

Andrew Pruski has written a quick comparison script to check if two instances have the same set of databases or the same configuration settings:

For instance, last week I was working with two instances that contained databases that were part of an Always On availability group, and I needed to ensure that all databases on the primary were on the secondary.

Now I know there are a few ways to do this, but I needed a quick and easy method as there were over 300 dbs involved.

The method I used to do this implemented the PowerShell cmdlet Compare-Object.

What this does is pretty much what it says on the tin. The cmdlet takes two objects and compares them based on an input property (in this case it’ll be database name).

This can turn into a much more complicated comparison, but Andrew shows that the basic concept is quite straightforward.
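
Andrew’s approach stays in PowerShell with Compare-Object. As a rough T-SQL alternative (a sketch only, and it assumes a linked server to the other instance, called SecondaryServer here purely for illustration), an EXCEPT against sys.databases lists databases present on the primary but missing from the secondary:

-- Assumes a linked server named SecondaryServer pointing at the other instance
SELECT name
FROM sys.databases
EXCEPT
SELECT name
FROM [SecondaryServer].master.sys.databases;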

More SSMS Tricks

Wayne Sheffield continues his SSMS tips & tricks series.  He first covers SQLCMD:

Notice that all of the SQLCMD commands start with a colon (:).

The complete reference for how all of these commands work is at http://msdn.microsoft.com/en-us/library/ms162773.aspx. I will cover just a few of them in this blog post.

SETVAR allows you to set a variable for use throughout the script. Its syntax is :SETVAR <variable> <value>. For instance, “:SETVAR SourceServer .\SQL2005” will set the variable SourceServer to “.\SQL2005”, which is the name of the SQL 2005 instance on my laptop.

Then he talks about multi-server queries, including local server groups and using a Central Management Server.  Having a CMS is critical when you have more than a couple of instances, and frankly, even then it can be helpful when bringing new people up to speed.

Wayne also covers the basics of regular expressions in SSMS.

All three of these are powerful tips.
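
To make the :SETVAR syntax concrete, here is a minimal sketch to run with SQLCMD Mode enabled in SSMS (Query menu > SQLCMD Mode); the server and database names are placeholders:

:SETVAR SourceServer ".\SQL2005"
:SETVAR SourceDB "AdventureWorks"
:CONNECT $(SourceServer)
USE $(SourceDB);
SELECT @@SERVERNAME AS ServerName, DB_NAME() AS DatabaseName;
GO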

Collecting Statistics Usage Info

Grant Fritchey shows us how (safely) to collect data on statistics usage:

Years ago I was of the opinion that it wasn’t really possible to see the statistics used in the generation of a query plan. If you read the comments here, I was corrected of that notion. However, I’ve never been a fan of using undocumented trace flags. Yeah, super heroes like Fabiano Amorim and Paul White use them, but for regular individuals like me, it seems like a recipe for disaster. Further, if you read about these trace flags, they cause problems on your system. Even Fabiano was getting the occasional crash.

So, what’s a safe way to get that information? First up, Extended Events. If you use the auto_stats event, you can see the statistics getting created and getting loaded and used. Even if they’re not created, you can see them getting loaded. It’s an easy way to quickly see which statistics were used to generate a plan. One note, you’ll have to compile or recompile a given query to see this in action.

Read on for more.
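
Here is a minimal sketch of the sort of Extended Events session Grant describes, capturing the auto_stats event (the session name and file path are assumptions, and you would want additional filtering before running it on a busy server):

CREATE EVENT SESSION StatsUsage ON SERVER
ADD EVENT sqlserver.auto_stats
    (ACTION (sqlserver.sql_text, sqlserver.database_name))
ADD TARGET package0.event_file (SET filename = N'C:\Temp\StatsUsage.xel');
GO
ALTER EVENT SESSION StatsUsage ON SERVER STATE = START;

As Grant notes, the event only fires when a plan is actually compiled, so you may need to force a recompile to see it.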

Altering Procedures And Object Definitions

Solomon Rutzky shows what metadata doesn’t get updated when you call sp_rename on a T-SQL procedure, function, view, or trigger:

This behavior is noted in the Microsoft documentation for sp_rename:

Renaming a stored procedure, function, view, or trigger will not change the name of the corresponding object name in the definition column of the sys.sql_modules catalog view. Therefore, we recommend that sp_rename not be used to rename these object types. Instead, drop and re-create the object with its new name.

Ok, so we have all been warned, at least when it comes to using sp_rename. But that is not the end of the story. There is, indeed, another way to change the object such that the definition does not reflect its current state. And that other way has to do with something missing from the examples shown thus far, something that wouldn’t be missing had I been following best-practices.

Click through to see what else doesn’t get updated.
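
A quick sketch of the behavior the documentation warns about (the object names here are invented): after sp_rename, the definition column in sys.sql_modules still carries the old name.

CREATE PROCEDURE dbo.GetOrders AS SELECT 1 AS OrderID;
GO
EXEC sp_rename 'dbo.GetOrders', 'GetOrdersRenamed';
GO
-- CurrentName shows the new name, but the definition text still reads CREATE PROCEDURE dbo.GetOrders
SELECT OBJECT_NAME(object_id) AS CurrentName, definition
FROM sys.sql_modules
WHERE object_id = OBJECT_ID('dbo.GetOrdersRenamed');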

Time Series Forecasting With DeepAR

Tim Januschowski, et al., introduce DeepAR on AWS:

Today we are launching Amazon SageMaker DeepAR as the latest built-in algorithm for Amazon SageMaker. DeepAR is a supervised learning algorithm for time series forecasting that uses recurrent neural networks (RNN) to produce both point and probabilistic forecasts. We’re excited to give developers access to this scalable, highly accurate forecasting algorithm that drives mission-critical decisions within Amazon. Just as with other Amazon SageMaker built-in algorithms, the DeepAR algorithm can be used without the need to set up and maintain infrastructure for training and inference.

Click through for a product demonstration.

Geocoding With OpenStreetMap

Dmitry Kisler shows how to geocode addresses in R using the OpenStreetMap API:

When scraping data from the web, it is quite likely that you will get address info but not geo-coordinates, which may be required for further analysis like clustering. Thus, geocoding is often needed to get a location’s coordinates by its address.

There are several options, including one of the most popular, the Google geocoding API. This option can be easily implemented in R with the geocode function from the ggmap library. It has a limit of 2,500 requests a day (when used free of charge); see details here.

To increase the number of free-of-charge geocoding requests, the OpenStreetMap (OSM) Nominatim API can be used. OSM allows up to 1 request per second (see the usage policy), which gives about 35 times more API calls compared to the Google geocoding API.

Click through for the script.

Compare-Object And Null Values

Mike Robbins notes that the Compare-Object cmdlet in PowerShell does not like to handle null values:

I thought I’d run into a bug with the Compare-Object cmdlet in PowerShell version 5.1 earlier today.

Compare-Object : Cannot bind argument to parameter ‘ReferenceObject’ because it is null.
At line:1 char:33
+ Compare-Object -ReferenceObject $DriveLetters -DifferenceObject $Driv …
+ ~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (:) [Compare-Object], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,Microsoft.PowerShell.Commands.CompareObjectCommand

Read on for the description of the problem as well as the solution.

Deleting From Large Tables

Andy Galbraith tells a tale of a failed query and a lot of log file growths:

The App01 database was in SIMPLE recovery (when I find this in the wild, I always question it – isn’t point-in-time recovery important? – but that is another discussion).  The relevance here is that in SIMPLE recovery, LOG backups are irrelevant (and actually *can’t* be run) – once a transaction or batch completes, the LDF/LOG file space is marked available for re-use, meaning that *usually* the LDF file doesn’t grow very large.

A database in SIMPLE recovery growing large LDF/LOG files almost always means a long-running unit of work or accidentally open transactions (a BEGIN TRAN with no subsequent COMMIT TRAN) – looking at the errors in the SQL Error Log the previous night, the 9002 log file full errors stopped at 1:45:51am server time, which means the offending unit of work ended then one way or another (crash or success).

Sure enough, when I filtered the XEvents event file to things with a duration > 100 microseconds and then scanned down to 1:45:00, I quickly saw the row shown above.  Note this doesn’t mean the unit of work was excessively large in CPU/RAM/IO/etc. (and in fact the query errored out due to lack of LOG space), but the excessive duration made all of the tiny units of work over the previous 105 minutes have to persist in the transaction LDF/LOG file until this unit of work completed, preventing all of the LOG from that time from being marked for re-use until this statement ended.

It turns out that deleting millions of records in a single transaction is an expensive operation.
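
A common mitigation (a sketch, not the specific fix from Andy's post; the table name, filter, and batch size are placeholders) is to delete in smaller batches, so that each transaction commits and, in SIMPLE recovery, the log space can be reused between batches:

-- Delete in chunks so the log can clear between batches
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (50000)
    FROM dbo.BigTable
    WHERE CreatedDate < '20150101';

    SET @rows = @@ROWCOUNT;
END;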

Introducing PowerApps

Jason Thomas has a great introduction to PowerApps:

You will be able to pass context-aware data to a PowerApps app, which updates in real time as you make changes to your report. Now, your users can derive business insights and take actions from right within their Power BI reports and dashboards. No need to switch tabs to open the separate apps, copy and paste data from one window to another, or worry about fat-fingering the wrong customer id or invoice amount.

If you think about it, this is a game changer – you finally have a BI tool that allows you to collaborate and take actions right within the report. How many times have you looked at a report, found an insight, and wished that you could send an email to the account manager, only to forget later? Well, now you don’t have to worry about that, as I am going to show you an example of how we can collaborate by adding comments within the report (not just comments, but context-aware comments, based on what you are selecting) as well as show how to send emails (to the appropriate people based on your selection).

This is an interesting concept and Jason has a detailed overview of it.
