Versioning R Code In SQL Server

Steph Locke shows how to combine R models and SQL Server temporal tables for versioning:

If we’re storing our R model objects in SQL Server then we can utilise another SQL Server capability, temporal tables, to take the pain out of versioning and make it super simple.

Temporal tables will track changes automatically so you would overwrite the previous model with the new one and it would keep a copy of the old one automagically in a history table. You get to always use the latest version via the main table but you can then write temporal queries to extract any version of the model that’s ever been implemented. Super neat!

I do exactly this.  In my case, it’s to give me the ability to review those models after the fact once I know whether they generated good outcomes or not.

NULL Values In The Histogram

Taiob Ali explains how NULL values show up in the SQL Server histogram when you create statistics:

In the density_vector section ‘All density’ value for column ‘PickingCompletedWhen’ is 0.0004705882 which was calculated from: 1/(Number of distinct values of column ‘PickingCompletedWhen’).
In this case which is 1/2125. All NULL values were considered as one. If you do a count of distinct values you will get a result of 2124. Reason is explained here. If you do a select of all distinct values NULL will show along with other 2124 values.

Taiob explains that there’s really nothing special about NULL when it comes to statistics.

Extended Event Handler Application

Dave Mason whipped up an application to track when a particular extended event fires:

Event Responses: when an event is handled, specify how you want to respond to it. In the “Event Responses” section, there are two options: play a system sound or send email (you must choose at least one option). Pick the system sound desired and/or enter an email address.

Dave has made the source code available as well so you can extend the functionality.

Re-Running Powershell History

Claudio Silva shows how to re-run statements in your Powershell history:

While we were running some commands we talked about Get-History cmdlet.

For those who don’t know

this cmdlet lists all the commands that you already ran on the current session.

I wondered if it is possible to pipe the Get-History output and run it again. I could bet yes but I never tried before.

The answer is that yes, you can, and Claudio shows how.

Availability Group Backup Preferences

Shaun Stuart points out that the Backup Preferences tab of the Availability Group Properties for an AG is a little tricky:

The default, and the way my AG was configured, was Prefer Secondary. As the image shows, this means backups will be made on the secondary, unless the secondary is unavailable, in which case, they will be made on the primary.

There are a couple of things to note when you use this setting:

  1. Full backups made on the secondary are Copy Only backups. This means they won’t reset the differential bitmap and your differentials will continue to increase in size until a full backup is made on the primary.

  2. Differential backups cannot be made on the secondary.

  3. Transaction log backups can be made on the secondary and they do clear the log, so your log file will not continue to grow.

Read on for more details.

Columnstore Performance Counters

Niko Neugebauer talks about perfmon counters available for understanding what’s going on with columnstore indexes:

As mentioned right in the beginning of this article, Sunil Agarwal lead the development team into adding greatly valuable performance counters with it’s own object ‘MSSQL:Columnstore‘ that provides some incredible insight on some of the internal operations that are not exposed in other ways.
This was a very much needed step, because SQL Server 2014 has brought a lot of different performance counters and objects for the In-Memory (XTP), while Columnstore Indexes were deserving a good treatment of their own.

This gives you one more avenue for research if you’re experiencing columnstore-related issues.

Genetic Algorithms

Melanie Mitchell provides an introduction to how genetic algorithms work:

Many computational problems require a computer program to be adaptive—to continue to perform well in a changing environment. This is typified by problems in robot control in which a robot has to perform a task in a variable environment, or computer interfaces that need to adapt to the idiosyncrasies of an individual user. Other problems require computers to be innovative—to construct something truly new and original, such as a new algorithm for accomplishing a computational task, or even a new scientific discovery. Finally, many computational problems require complex solutions that are difficult to program by hand. A striking example is the problem of creating artificial intelligence. Early on, AI practitioners believed that it would be straightforward to encode the rules that would confer intelligence in a program; expert systems are a good example. Nowadays, many AI researchers believe that the “rules” underlying intelligence are too complex for scientists to encode in a “top-down” fashion, and that the best route to artificial intelligence is through a “bottom-up” paradigm. In such a paradigm, human programmers encode simple rules, and complex behaviors such as intelligence emerge from these simple rules. Connectionism (i.e., the study of computer programs inspired by neural systems) is one example of this philosophy (Smolensky, 1988); evolutionary computation is another.

For fun and completely inappropriate implementations of genetic algorithms in T-SQL, William Talada and Gail Shaw have us covered.

Happy Memorial Day

Kevin Feasel

2017-05-29

Meta

It’s Memorial Day in the US today, so Curated SQL will go into holiday mode, where we cover old but good works.  We’ll be back tomorrow with the normal fare.

Multi-Shot Games

Dan Goldstein explains a counter-intuitive probability exercise:

Peter Ayton is giving a talk today at the London Judgement and Decision Making Seminar

Imagine being obliged to play Russian roulette – twice (if you are lucky enough to survive the first game). Each time you must spin the chambers of a six-chambered revolver before pulling the trigger. However you do have one choice: You can choose to either (a) use a revolver which contains only 2 bullets or (b) blindly pick one of two other revolvers: one revolver contains 3 bullets; the other just 1 bullet. Whichever particular gun you pick you must use every time you play. Surprisingly, option (b) offers a better chance of survival

My recommendation is to avoid playing Russian roulette.

Whitelisting SQL Server Access

Patrick Keisler has a script to whitelist access to SQL Server:

A while back, I posted an article about creating a WhiteList for access to SQL Server. Since then I have received a bit of feedback that it was not working as designed. My apologies for taking so long, but I believe I have come up with a fix.

The main issue is the trigger will block some or even all access to the server after it’s created. As it turns out, the issues were really permission being denied. To see it in action, let’s create everything using the original code from here.

This is an interesting concept.  Resource whitelisting makes sense, though we tend only to use authentication-based whitelisting (i.e., creating logins).

Categories