2017-05-26 – Curated SQL

Multi-Shot Games

Published 2017-05-26 by Kevin Feasel

Dan Goldstein explains a counter-intuitive probability exercise:

Peter Ayton is giving a talk today at the London Judgement and Decision Making Seminar

Imagine being obliged to play Russian roulette – twice (if you are lucky enough to survive the first game). Each time you must spin the chambers of a six-chambered revolver before pulling the trigger. However you do have one choice: You can choose to either (a) use a revolver which contains only 2 bullets or (b) blindly pick one of two other revolvers: one revolver contains 3 bullets; the other just 1 bullet. Whichever particular gun you pick you must use every time you play. Surprisingly, option (b) offers a better chance of survival

My recommendation is to avoid playing Russian roulette.

Comments closed

Whitelisting SQL Server Access

Published 2017-05-26 by Kevin Feasel

Patrick Keisler has a script to whitelist access to SQL Server:

A while back, I posted an article about creating a WhiteList for access to SQL Server. Since then I have received a bit of feedback that it was not working as designed. My apologies for taking so long, but I believe I have come up with a fix.

The main issue is the trigger will block some or even all access to the server after it’s created. As it turns out, the issues were really permission being denied. To see it in action, let’s create everything using the original code from here.

This is an interesting concept. Resource whitelisting makes sense, though we tend only to use authentication-based whitelisting (i.e., creating logins).

Comments closed

Visualizing Emergency Room Visits

Published 2017-05-26 by Kevin Feasel

Eugene Joh has a great blog post showing how to parse ICD-9 codes using regular expressions and then visualize the results as a treemap:

It looks like there is a header/title at [1], numeric grouping at [2] “1.\tINFECTIOUS AND PARASITIC DISEASES”, subgrouping by ICD-9 code ranges, at [3] “Intestinal infectious diseases (001-009)” and then 3-digit ICD-9 codes followed by a specific diagnosis, at [10] “007\tOther protozoal intestinal diseases”. At the end we want to produce three separate data frames that we’ll categorize as:

Groups: the title which contains the general diagnosis grouping
Subgroups: the range of ICD-9 codes that contain a certain diagnosis subgroup
Classification: the specific 3-digit ICD-9 code that corresponds with a diagnosis

It’s a beefy article full of insight.

Comments closed

Subplots In Maps

Published 2017-05-26 by Kevin Feasel

Ilya Kashnitsky shows how to embed subplots within a map using ggplot2:

So, with this map I want to show the location of more and less urbanized NUTS-2 regions of Europe. But I also want to show – with subplots – how I defined the three subregions of Europe (Eastern, Southern, and Western) and what is the relative frequency of the three categories of regions (Predominantly Rural, Intermediate, and Predominantly Rural) within each of the subregions. The logic of actions is simple: first prepare all the components, then assemble them in a composite plot. Let’s go!

This is very useful information, well worth the read.

Comments closed

Scripting Foreign Key Constraints

Published 2017-05-26 by Kevin Feasel

Louis Davidson has a process to script out foreign key constraints and includes an example which loads those constraints into a utility table:

As noted in my previous post, I am in the middle of building a utility (for work, and for my next SQLBLOG post), that will help when you need to drop the foreign key constraints on a table, but then replace them without having to save off the script manually. In that post, I showed how to manually create a foreign key constraint in three possible states.

Next, in order to create the utility to script the FOREIGN KEY constraints before dropping them, I need a way to script the constraint. In this post I present a user defined function that will generate a script that does the scripting of a foreign key constraint. The code is largely based on some code from Aaron Bertrand in this blog (with his permission naturally!) with a few edits to script constraints as enabled, disabled or untrusted if the source constraint was in that condition (or you can force the constraints to a certain way if you so desire as well.)

This could be particularly useful for ETL jobs.

Comments closed

Computed Column Performance

Published 2017-05-26 by Kevin Feasel

Paul White has a great article on when computed columns perform poorly:

A major cause of poor performance is a simple failure to use an indexed or persisted computed column value as expected. I have lost count of the number of questions I have had over the years asking why the optimizer would choose a terrible execution plan when an obviously better plan using an indexed or persisted computed column exists.

The precise cause in each case varies, but is almost always either a faulty cost-based decision (because scalars are assigned a low fixed cost); or a failure to match an expanded expression back to a persisted computed column or index.

The match-back failures are especially interesting to me, because they often involve complex interactions with orthogonal engine features. Equally often, the failure to “match back” leaves an expression (rather than a column) in a position in the internal query tree that prevents an important optimization rule from matching. In either case, the outcome is the same: a sub-optimal execution plan.

Definitely read the whole thing if you’re thinking about setting trace flag 176 on.

Comments closed

Finding Queries Which Generate Waits

Published 2017-05-26 by Kevin Feasel

Kendra Little explains how to find which queries are causing waits in SQL Server:

I once had an extremely busy system where we had growing CMEMTHREAD waits. This is an unusual wait, and our question was: is this being caused by a single query, just a few queries, or all queries? We were able to answer this by setting up an extended events trace and looking at waits by query, but this had some downsides:

Extended Events has no GUI in 2008R2, and setting up and testing the scripts took a bit of time (more minor issue)

Generating wait information by query on a very busy system generates a lot of output, so we had to be careful to set up sampling and filtering so we didn’t impact performance (more major issue)

But we were able to use this to figure out that the wait was associated with all queries, not a few queries, which helped us down our troubleshooting path.

Kendra answers this (mostly) in the context of SQL Server 2008 R2, as that was the version the questioner had, but she does mention where later versions make life easier.

Comments closed

Waits: External But Not Preemptive

Published 2017-05-26 by Kevin Feasel

Ewald Cress has some thoughts on external wait types:

Previously I dug into preemptive waits in SQLOS, and to be honest, I equated “preemptive” with “external”. For the most part the two go hand in hand after all.

To recap, a preemptive wait isn’t necessarily a wait at all. What happens is that a worker needs to run some code that can’t be trusted to play by cooperative scheduling rules. And rather than put the SQLOS scheduler (and all its sibling workers) at the mercy of that code, the worker detaches itself from the scheduler and cedes control to a sibling runnable worker.

Read the whole thing.

Comments closed

Built-In SQL Server Functions

Published 2017-05-26 by Kevin Feasel

Tywan Terrell has a listing of various functions built into SQL Server:

SQL Server starting with 2012 ship with a robust set of functions that can be used to make code perform faster, with fewer lines of code. The functions can be used in ETL Process to provide better error handling. A example of this would be the Try_Parse function that allows you to check if a value can be converted.

Another example would be using the FIRST_VALUE() and LAST_VALUE() functions which work against a set of data. These functions are very useful when looking for things like month over month averages and when doing calculations. The table below contain a list of function that are supported starting with 2012 along with some examples of how to use them.

He breaks them down into four categories and provides examples. Functions can bring their own set of problems with query performance, but most of them can be very useful.

Comments closed

Finding Database Growth Events

Published 2017-05-26 by Kevin Feasel

Arun Sirpal has an Extended Events session which tracks growth events on a database file:

A quick post that is hopefully useful, I wanted a quick way to find the time, size of the database file size change and who caused it.

I went down the extended events route using sqlserver.database_file_size_change event. By the way I am no Extended Events expert, I write a lot via trial and error I am trying to wean off Profiler.

Read on for the script as well as a query which shreds the XML and returns a result set.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Day: May 26, 2017