Press "Enter" to skip to content

Author: Kevin Feasel

Locks And Partitioning

Erik Darling looks at the confusing mess that is SQL Server partitioning:

In the Chicago perf class last month, we had a student ask if partition level locks would ever escalate to a table level lock. I wrote up a demo and everything, but we ran out of time before I could go over it.

Not that I’m complaining — partitioning, and especially partition level locking, can be pretty confusing to look at.

If you really wanna learn about it, you should talk to Kendra — after all, this post is where I usually send folks who don’t believe me about the performance stuff.

Click through for that demo and explanation.

Comments closed

Getting Started With Always Encrypted

Monica Rathbun kicks off a series on Always Encrypted:

There are two possibilities Deterministic and Randomized.

MSDN defines Deterministic encryption as always generates the same encrypted value for any given plain text value. Which means that if you have a birthdate of 01/03/1958 it will always be encrypted with the same value each time such as ABCACBACB. This allows you to index it, use it in WHERE clauses, GROUP BY and JOINS.

Randomized encryption per MSDN- uses a method that encrypts data in a less predictable manner. This makes Randomized encryption more secure, because using the example above each encrypted value of 01/03/1958 will be different. It could be ABCACBACB, BBBCCAA, or CCCAAABBB. All three encrypted values are subsequently decrypted to the same value. Since the encrypted value is random you cannot perform search operations etc. as you can with Deterministic.

Part 1 is about building the certificates and keys needed to encrypt data.

Comments closed

Upgrading That Expired Evaluation Copy Of SQL Server

Cody Konior finds a way to extricate the poor souls who need to upgrade expired evaluation copies of SQL Server from their mess:

Common advice here is to set the clock backwards. My problem with that is that you’re probably doing this on an unsupported unknown black-box flaming garbage can of a system set up by someone who wasn’t meant to do it – because otherwise they wouldn’t be using the evaluation edition. So what are the repercussions of setting the clock backwards? Perhaps their application spawning silently in the background and trashing this or other databases with bad date information? Perhaps you’ll lose your RDP connection and then be unable to connect back in because of the SSPI error generated by a clock mismatch?

No thanks. Instead you need to do some detective work.

Read the whole thing.

Comments closed

Using Pester For Configuration Checks

Andrew Pruski shows how to use Pester to audit SQL Server configuration settings:

One Pester test running!

What I like about this is that it can be easily dropped into a job scheduler (e.g.- Jenkins) and then you’ve got a way to routinely check (and correct) all the configuration settings of the SQL instances that you monitor.

Pester would not have been my first thought for configuration checking, but it does serve as another useful option.

Comments closed

Bridging The R-Python Gap

Siddarth Ramesh argues that revoscalepy helps R developers acquaint themselves with Python:

I’m an R programmer. To me, R has been great for data exploration, transformation, statistical modeling, and visualizations. However, there is a huge community of Data Scientists and Analysts who turn to Python for these tasks. Moreover, both R and Python experts exist in most analytics organizations, and it is important for both languages to coexist.

Many times, this means that R coders will develop a workflow in R but then must redesign and recode it in Python for their production systems. If the coder is lucky, this is easy, and the R model can be exported as a serialized object and read into Python. There are packages that do this, such as pmml. Unfortunately, many times, this is more challenging because the production system might demand that the entire end to end workflow is built exclusively in Python. That’s sometimes tough because there are aspects of statistical model building in R which are more intuitive than Python.

Python has many strengths, such as its robust data structures such as Dictionaries, compatibility with Deep Learning and Spark, and its ability to be a multipurpose language. However, many scenarios in enterprise analytics require people to go back to basic statistics and Machine Learning, which the classic Data Science packages in Python are not as intuitive as R for. The key difference is that many statistical methods are built into R natively. As a result, there is a gap for when R users must build workflows in Python. To try to bridge this gap, this post will discuss a relatively new package developed by Microsoft, revoscalepy.

Having worked with both, my loyalties tend to lie with R for a couple of reasons.  But this might help some people bridge the gap.

Comments closed

Using Keras To Predict Customer Churn

Matt Dancho has an example of building a neural net using Keras to predict customer churn:

Pro Tip: A quick test is to see if the log transformation increases the magnitude of the correlation between “TotalCharges” and “Churn”. We’ll use a few dplyr operations along with the corrr package to perform a quick correlation.

  • correlate(): Performs tidy correlations on numeric data

  • focus(): Similar to select(). Takes columns and focuses on only the rows/columns of importance.

  • fashion(): Makes the formatting aesthetically easier to read.

This is a very useful tutorial.

Comments closed

Powershell Speed Testing

Shane O’Neill shows off a Powershell script which allows you to simplify performance testing:

Apart from catching up on news during my commute I only really use notifications for a certain number of hashtags i.e. #SqlServer, #tsql2sday, #sqlhelp, and #PowerShell.

So during work, every so often a little notification will pop up on the bottom right of my window and I can quickly glance down and decide whether to ignore it or check it out.

That’s what happened with the following tweet:

Click through for Shane’s demo.

Comments closed

Vertical Selection In SSMS

Bert Wagner shows off vertical selection in SSMS (using the Alt key):

Sometimes when writing an ad hoc query you might want to take the results of one query and put them into an IN() statement of another query.

Sure, you can write a subquery to put into your IN() statement…but that’s too much work for a one-time use disposable query.

What you can do instead is:

  1. Copy your values of interest

  2. Paste them into your IN() statement

  3. Hold down the ALT key while dragging the mouse down in front of all of your pasted values

  4. Type a comma (see video above for an easier demonstration).

For SSMS speedrunning strats, you can also hold down ALT + SHIFT and use your keyboard arrow keys instead of using the mouse.

Comments closed

How Non-Clustered Index Key Columns Are Stored

Kendra Little walks through page-level details on a non-clustered index:

Just like in the root page and the intermediate pages, the FirstName and RowID columns are present.

Also in the leaf: CharCol, our included column appears! It was not in any of the other levels we inspected, because included columns only exist in the leaf of a nonclustered index.

Kendra does a great job of explaining the topic.

Comments closed