Press "Enter" to skip to content

Author: Kevin Feasel

Check Where That Backup’s Restoring To

Shane O’Neill “has a friend” who learned an important lesson about the database restore GUI:

GUIs are good for….

…discovery.

They give you the option to script out the configurations you have chosen. If my friend had chosen to script out the restore, rather then clicking “OK” to run it, maybe he would have caught this mistake when reviewing it – rather than overwriting the Live database with 2 week old data and spending a weekend in the office with 3 colleagues fixing it.

Plus if you ever want to ensure that you know something, try and script it out from scratch.

Read the whole thing; good thing that totally didn’t happen to Shane and was just his friend!

Comments closed

Save Early, Save Often

Kenneth Fisher relays an important life lesson:

So years and years ago, when I was in college, one of my favorite classes was Assembly Language. We were working with Mac Assembly in case anyone is interested (yes I used a Mac at school, one of the big ones that had the monitor built into it). Somewhere around week three or four, we were supposed to print something to the screen. I spent several hours (this was only my second programming class so even Hello World was a challenge) and got my program ready to test. It worked! Sort of.

Hello World was written to the top of the screen! Then a second or so later the bottom half of the screen turned into random ASCII garbage. Then a second or so later the computer rebooted. Well, that’s not good. Time to debug!

So the computer comes back up, I take a look, and I don’t have ANY code. I hadn’t saved (and this was long enough ago there was no auto-save). I had to start ALL over again. In the end, I did manage to re-write my code, got it working and even got an A. I also learned that I needed to save my work before running it. Well, learned my lesson for the first time (of many).

I have attempted to put a sanguine spin on this mishap, based on something Phil Factor once wrote:  if you throw away (or lose) the code the first time around, the second time you write it, the code will probably be better.  This is because the first time you’re writing a set of code, you’re trying to force the pieces together and get the code working; the second time around, you have a working algorithm in mind, so the code will likely be much cleaner.

1 Comment

Changing A Large Table’s Clustered Index

Nate Johnson explains how to change the clustered index on a very large table:

I call this the “setup, dump, & swap”.  Essentially we need to create an empty copy of the table, with the desired index(es), dump all the data into it, and then swap it in.  There are couple ways you can do this, but it boils down to the same basic premise: It’s “better” (probably not in terms of speed, but definitely in terms of efficiency and overhead) to fill this new copy of the table & its indexes, than it is to build the desired index on the existing table.

As Nate notes, “very large” here will depend on your environment, but this is a useful technique because the old table can be live until the moment of the swap.  As Nate writes this, I’m actually in the middle of one of these sorts of swaps—one that will take a week or two to finish due to pacing.

Comments closed

SQL Server Roll-Your-Own Cryptography

David Fowler has an interesting article on a simple cryptographic algorithm in SQL Server:

As this post was prompted by my post on bitwise logic, we’re going to base our algorithm around the XOR cypher.  Basically this cypher works by taking a key which for simplicity sake will be a single byte and XOR-ing that against the message (or plain text) that we want to encrypt.

Let’s look at an example of how this is going to work.

Let’s say that we want to encrypt the plain text ‘SQLUndercover’.  How are we going to do this?  Firstly we need to remember that all text characters are represented by a single byte as an ASCII code.  ‘SQLUndercover’ is represented by the following set of ASCII codes, 83 81 76 85 110 100 101 114 99 111 118 101 114.

I definitely recommend reading this article for two reasons:  first, because it’s interesting; second, because it shows how easy it is to break amateur crypto.  If you feel the need to roll your own cryptographic algorithm, follow my Official Crypto Flow Chart (patent pending):  Do you have a PhD in mathematics with a specialty in cryptanalysis?  If not, don’t do roll your own crypto; if so, maybe do it but even then probably not.

Comments closed

Availability Group Extended Events

Tracy Boggiano shows how to enhance the built-in Extended Events session for Availability Groups:

Now that the extended events session is setup we can use some queries to query information about our AGs and error messages happening on our servers.  First to query when the server failover and becomes your primary server query the event availability_replica_state_change.  The first 10 lines just read in the files for the extended event session so you don’t have to identify the exact filenames.  Then we parse the xml event for get the timestamp, previous state, current state, repliace name, and group name for the event FROM the filenames we collected before.  In the WHERE clause were are looking for when the state has changed to PRIMARY_NORMAL indicating a failover to the server.

Read on for scripts showing what she extends and also how to query this data.

Comments closed

Tibbles In R

Tristan Mahr explains what tibbles and tribbles are and how they compare to built-in data frames:

The name “tribble” is short for “transposed tibble” (the transposed part referring to change from column-wise creation in tibble() to row-wise creation in tribble()).

I like to use light-weight tribbles for two particular tasks:

  • Recoding: Create a tribble of, say, labels for a plot and join it onto a dataset.

  • Exclusion: Identify observations to exclude, and remove them with an anti-join.

I’ve been more used to data frames than tibbles, but this post shows some interesting things you can do with tibbles a lot more easily than with data frames.  It’s enough to make me want to use tibbles more frequently.  H/T R-bloggers

Comments closed

Measuring Model Accuracy

Fabio Veronesi shows several methods of testing model accuracy:

Mean Squared Deviation or Mean Squared Error

This is simply the numerator of the previous equation, but it is not used often. The issue with both the RMSE and the MSE is that since they square the residuals they tend to be more affected by large residuals. This means that even if our model explains the large majority of the variation in the data very well, with few exceptions; these exceptions will inflate the value of RMSE.

Click through for several calculations.  H/T R-bloggers

Comments closed

Watch Those Indexes

Kennie Nybo Pontoppidan explains why that one university registration system was always throwing errors:

Remember that myITU was implemented on an Oracle database. Unlike on SQL Server, all tables in an Oracle database are physically represented as heaps, unless explicitly specified otherwise. That means no indexes. And I didn’t know anything about database performance back then, so I didn’t add any. Effectively any query against the course table would give a full table scans in the EnumerationType and EnumerationValue tables as a side effect. Fast forward to course enrolments…

At ITU, we had implemented a selection algorithm, which considered both the study program you were enrolled into as well as how early you registered for a course. Early birds got the course, so students would be ready when course enrolment was opened at 12pm. And at 12.05 myITU would start failing with 500 Internal Server Errors.

This is a pretty common occurrence, followed up by the “let’s add all the indexes” phase.

Comments closed

Partitioned Views With Memory-Optimized Tables

Ned Otter ran into an issue building partitioned views from a combination of disk-based and memory-optimized tables:

Let’s assume that we have two tables that belong to a partitioned view. Both tables can be memory-optimized, or one table can be memory-optimized, and the other on-disk.

Success condition

an UPDATE occurs to a row in a table, and the UPDATE does not change where the row would reside, i.e. does not cause it to “move” to another table, based on defined CONSTRAINTS

Failure conditions:

   a. UPDATE occurs to a row in the memory-optimized table that causes it to move to either another memory-optimized table, or a on-disk table

   b. UPDATE occurs to a row in the on-disk table that causes it to move to the memory-optimized table

Read the whole thing.  The good news is that if you’re splitting in a way that doesn’t require updating from one type of table to the other, partitioned views work fine.

Comments closed

Bashing Windows

Steve Jones shows how to install Bash on Windows 10:

With SQL Server coming on Linux, some people will want to learn a bit of Linux. Or perhaps they need to get re-acquainted with the OS, which is my situation. I grew up on DOS, but moved to Unix in university. I’ve dabbled in Linux over the years, but with no real use for it over Windows, I abandoned it a decade ago.

Now, I’m trying to re-learn some things as I play with SQL Server on Linux.

Recently I saw a quick video from Scott Hanselman on the Bash subsystem in Windows. I actually first saw this live at the Build 2016 announcement, but when it was added in Beta to Windows 10, I didn’t add it. I’ve been meaning to, but hadn’t.

Until today.

Read on to see how to set this up on your Windows 10 machine.

Comments closed