Press "Enter" to skip to content

Curated SQL Posts

More SSMS Tricks

Wayne Sheffield continues his SSMS tips & tricks series.  He first covers SQLCMD:

Notice that all of the SQLCMD commands start with a colon (:).

The complete reference for how all of these commands work is at http://msdn.microsoft.com/en-us/library/ms162773.aspx. I will cover just a few of them in this blog post.

SETVAR allows you to set a variable for use throughout the script. Its syntax is :SETVAR <variable> <value>. For instance, “:SETVAR SourceServer .\SQL2005” will set the variable SourceServer to “.\SQL2005”, which is the name of the SQL 2005 instance on my laptop.
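As a quick illustration of how the pieces fit together, here is a minimal SQLCMD-mode sketch (the server name, database, and query are placeholders rather than anything from Wayne's post); keep in mind that SSMS only honors these directives when SQLCMD Mode is enabled from the Query menu:

:SETVAR SourceServer ".\SQL2005"
:SETVAR SourceDB "AdventureWorks"

:CONNECT $(SourceServer)

-- The $(variable) token is substituted before the batch is sent to the server
USE [$(SourceDB)];
GO

SELECT @@SERVERNAME AS ServerName, DB_NAME() AS DatabaseName;
GO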

Then he talks about multi-server queries, including local server groups and using a Central Management Server.  Having a CMS is critical when you have more than a couple of instances, and frankly, even then it can be helpful when bringing new people up to speed.
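If you inherit an existing CMS and want a quick inventory of what is registered on it, the shared registrations live in msdb. This is a rough sketch against the documented view (run it on the instance hosting the CMS; adjust if your version exposes different columns):

-- Lists servers registered in the Central Management Server
SELECT server_name, name AS display_name, description
FROM msdb.dbo.sysmanagement_shared_registered_servers
ORDER BY server_name;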

Wayne also covers the basics of regular expressions in SSMS.

All three of these are powerful tips.


Collecting Statistics Usage Info

Grant Fritchey shows us how to safely collect data on statistics usage:

Years ago I was of the opinion that it wasn’t really possible to see the statistics used in the generation of a query plan. If you read the comments here, I was corrected of that notion. However, I’ve never been a fan of using undocumented trace flags. Yeah, super heroes like Fabiano Amorim and Paul White use them, but for regular individuals like me, it seems like a recipe for disaster. Further, if you read about these trace flags, they cause problems on your system. Even Fabiano was getting the occasional crash.

So, what’s a safe way to get that information? First up, Extended Events. If you use the auto_stats event, you can see the statistics getting created and getting loaded and used. Even if they’re not created, you can see them getting loaded. It’s an easy way to quickly see which statistics were used to generate a plan. One note, you’ll have to compile or recompile a given query to see this in action.

Read on for more.
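This is not Grant's exact session definition, but a minimal sketch of an Extended Events session built on the auto_stats event (the session name, target file name, and database filter are placeholders):

CREATE EVENT SESSION [TrackStatsUsage] ON SERVER
ADD EVENT sqlserver.auto_stats
(
    ACTION (sqlserver.sql_text, sqlserver.database_name)
    WHERE (sqlserver.database_name = N'AdventureWorks')   -- placeholder database name
)
ADD TARGET package0.event_file (SET filename = N'TrackStatsUsage');
GO

ALTER EVENT SESSION [TrackStatsUsage] ON SERVER STATE = START;
GO

-- Remember Grant's note: the query has to compile or recompile
-- (e.g. via OPTION (RECOMPILE)) for auto_stats events to show up.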


Altering Procedures And Object Definitions

Solomon Rutzky shows what metadata doesn’t get updated when you call sp_rename on a T-SQL procedure, function, view, or trigger:

This behavior is noted in the Microsoft documentation for sp_rename:

Renaming a stored procedure, function, view, or trigger will not change the name of the corresponding object name in the definition column of the sys.sql_modules catalog view. Therefore, we recommend that sp_rename not be used to rename these object types. Instead, drop and re-create the object with its new name.

Ok, so we have all been warned, at least when it comes to using sp_rename. But that is not the end of the story. There is, indeed, another way to change the object such that the definition does not reflect its current state. And that other way has to do with something missing from the examples shown thus far, something that wouldn’t be missing had I been following best-practices.

Click through to see what else doesn’t get updated.
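To see the mismatch for yourself before clicking through, here is a quick sketch (the object names are invented for the illustration; this is not Solomon's exact repro):

CREATE PROCEDURE dbo.OldName
AS
    SELECT 1 AS Result;
GO

EXEC sp_rename N'dbo.OldName', N'NewName';
GO

-- The object now answers to dbo.NewName, but the stored definition still reads CREATE PROCEDURE dbo.OldName
SELECT OBJECT_NAME(object_id) AS current_name, definition
FROM sys.sql_modules
WHERE object_id = OBJECT_ID(N'dbo.NewName');
GO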


Time Series Forecasting With DeepAR

Tim Januschowski, et al., introduce DeepAR on AWS:

Today we are launching Amazon SageMaker DeepAR as the latest built-in algorithm for Amazon SageMaker. DeepAR is a supervised learning algorithm for time series forecasting that uses recurrent neural networks (RNN) to produce both point and probabilistic forecasts. We’re excited to give developers access to this scalable, highly accurate forecasting algorithm that drives mission-critical decisions within Amazon. Just as with other Amazon SageMaker built-in algorithms, the DeepAR algorithm can be used without the need to set up and maintain infrastructure for training and inference.

Click through for a product demonstration.


Geocoding With OpenStreetMap

Dmitry Kisler shows how to geocode addresses in R using the OpenStreetMap API:

When scraping data from the web, you are quite likely to get address info but not geo-coordinates, which may be required for further analysis like clustering. Thus, geocoding is often needed to get a location’s coordinates from its address.

There are several options, including one of the most popular, the Google geocoding API. This option can be easily implemented in R with the geocode function from the ggmap library. It has a limit of 2,500 requests a day (when used free of charge); see details here.

To increase the number of free-of-charge geocoding requests, the OpenStreetMap (OSM) Nominatim API can be used. OSM allows up to 1 request per second (see the usage policy), which gives about 35 times more API calls per day compared to the Google geocoding API.

Click through for the script.


Compare-Object And Null Values

Mike Robbins notes that the Compare-Object cmdlet in PowerShell does not like to handle null values:

I thought I’d run into a bug with the Compare-Object cmdlet in PowerShell version 5.1 earlier today.

Compare-Object : Cannot bind argument to parameter ‘ReferenceObject’ because it is null.
At line:1 char:33
+ Compare-Object -ReferenceObject $DriveLetters -DifferenceObject $Driv …
+ ~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (:) [Compare-Object], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,Microsoft.PowerShell.Commands.CompareObjectCommand

Read on for the description of the problem as well as the solution.


Deleting From Large Tables

Andy Galbraith tells a tale of a failed query and a lot of log file growths:

The App01 database was in SIMPLE recovery (when I find this in the wild, I always question it – isn’t point-in-time recovery important? – but that is another discussion).  The relevance here is that in SIMPLE recovery, LOG backups are irrelevant (and actually *can’t* be run) – once a transaction or batch completes, the LDF/LOG file space is marked available for re-use, meaning that *usually* the LDF file doesn’t grow very large.

A database in SIMPLE recovery growing large LDF/LOG files almost always means a long-running unit of work or accidentally open transactions (a BEGIN TRAN with no subsequent COMMIT TRAN or ROLLBACK) – looking at the errors in the SQL Error Log the previous night, the 9002 log file full errors stopped at 1:45:51am server time, which means the offending unit of work ended then one way or another (crash or success).

Sure enough, when I filtered the XEvents event file to things with a duration > 100 microseconds and then scanned down to 1:45:00, I quickly saw the row shown above. Note this doesn’t mean the unit of work was excessively large in CPU/RAM/IO/etc. (and in fact the query errored out due to lack of LOG space), but the excessive duration forced all of the tiny units of work over the previous 105 minutes to persist in the transaction LDF/LOG file until this unit of work completed, preventing any of the LOG from that time from being marked for re-use until this statement ended.

It turns out that deleting millions of records in a single transaction is an expensive operation.
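A common mitigation, and not necessarily what happened in Andy's case, is to purge in small batches so that each chunk commits and, in SIMPLE recovery, its log space can be marked for re-use. A rough sketch with an invented table and purge predicate:

DECLARE @BatchSize int = 10000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.BigTable                                   -- invented table name
    WHERE CreateDate < DATEADD(YEAR, -2, GETDATE());    -- invented purge predicate

    IF @@ROWCOUNT < @BatchSize
        BREAK;   -- last (partial) batch deleted; we are done
END;

-- If the log still grows, sys.databases.log_reuse_wait_desc shows what is holding it up.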


Introducing PowerApps

Jason Thomas has a great introduction to PowerApps:

You will be able to pass context-aware data to a PowerApps app, which updates in real time as you make changes to your report. Now, your users can derive business insights and take actions from right within their Power BI reports and dashboards. No need to switch tabs to open the separate apps, copy and paste data from one window to another, or worry about fat-fingering the wrong customer id or invoice amount.

If you think about it, this is a game changer – you finally have a BI tool that allows you to collaborate and take actions right within the report. How many times have you looked at a report, found an insight, and wished that you could send an email to the account manager, only to forget later? Well, now you don’t have to worry about that, as I am going to show you an example of how we can collaborate by adding comments within the report (not just comments, but context-aware comments, based on what you are selecting) as well as show how to send emails (to the appropriate people based on your selection).

This is an interesting concept and Jason has a detailed overview of it.


Customizing SQL Operations Studio

Samir Behara shows how to differentiate environments with custom tab colors:

The initial January release insiders build focuses on bug fixes and minor feature improvements. One thing that caught my attention was the ‘SQL Editor Tab Color’ option to differentiate between query tabs inside the IDE.

Using a custom color to differentiate between environments is one of my favorite features inside SQL Server Management Studio. The color is displayed in the SSMS status bar, at the bottom. Hence, when you connect to a particular environment, it shows the assigned color. This presents a visual indication of the environment in which you are running your scripts. A lot of third-party tools from Redgate and ApexSQL also have their own versions of setting different colors when connecting to different environments.

Using environment-specific color schemes can be a life-saver.


The Premise Of Cloud Data Warehousing

Derik Hammer explains how cloud data warehouses differ from their on-prem cousins:

Given the data processing needs of a data warehouse, they tend to be implemented on massively parallel processing (MPP) systems. The MPP architecture relies upon a shared-nothing concept for distributing data across various slices. Compute nodes are layered on top of the storage and process queries for data residing in their local slices. The control node is responsible for taking a query and dividing it up into smaller queries to be run in parallel on the compute nodes.

Read the whole thing.
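For a concrete picture of what distributing data across slices looks like in practice, here is the sort of DDL you would write in Azure SQL Data Warehouse, one such MPP platform (the table and column names are invented for the example):

CREATE TABLE dbo.FactSales
(
    SaleKey      bigint         NOT NULL,
    CustomerKey  int            NOT NULL,
    SaleAmount   decimal(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),   -- rows with the same CustomerKey land on the same slice
    CLUSTERED COLUMNSTORE INDEX
);

Joins and aggregations on CustomerKey can then be satisfied by each compute node against its local slice, with the control node stitching the partial results back together.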
