Press "Enter" to skip to content

Author: Kevin Feasel

Creating Minidumps In SQL Server

Joe Obbish shows how to read the call stack by creating a minidump:

We can use minidumps to generate small files that contain SQL Server call stacks. Note that if you aren’t careful you can end up writing the full contents of memory to a dump file. I believe that writing the file is a single-threaded process, so this can take hours and hundreds of GBs on a real server. Also SQL Server is frozen while the dump is happening, so don’t do it in production.

Dumps are most useful when you want to examine a long running, very simple query or a query that seems to be stuck at a certain point. They aren’t useful to see a full call stack of a non-simple query or to see something that happens just once during execution, like a specific task during query compilation. All that you can get is a snapshot and the snapshot may not contain the information that you’re looking for.

There are multiple ways to generate and read dump files. For this blog post I’ll be using sqldumper.exe and WinDbg.

Click through to read the whole thing.

Comments closed

Working With AT TIME ZONE In SQL Server

Louis Davidson has a post up showing how to use AT TIME ZONE:

I have a series of blogs about temporal tables in SQL Server 2016, which you can see here. However, one thing I never had mentioned was that the Temporal Row Start and End Times are always stored in UTC times (which is generally inconvenient to work with unless you are from the Great Britain area).  However, when you are looking at your data and want to see what the data looked like at ‘Yesterday at 12:00 at the computer you are working from, this can be a bit more work.

In 2016 SQL Server, a new feature was added that totally passed me by. AT TIME ZONE. It is used to manipulate the time zone offset of a point in time value,either to append the time zone to a value, or to shift the time zone value of a DATETIMEOFFSET value. The most common practical use that I will show is converting a value in your local time to UTC (and vice versa).

This is something I tend to forget about, but can be useful when building quick reports from UTC.  I’d store all data in UTC, just to make sure dates and times are consistent across servers in different locations, but you don’t have to do those calculations in your head all the time.

Comments closed

Join Elimination

Lukas Eder has a nice post explaining different forms of automatic join elimination:

We intended to fetch all customers and their addresses. But observe: We project only columns from the CUSTOMER table and we don’t have any predicates at all, specifically not predicates using the ADDRESS table. So, we’re completely ignoring any contributions from the ADDRESS table. We never really needed the JOIN in the first place!

And in fact, the optimiser can prove this too, because of the FOREIGN KEY constraint on C.ADDRESS_ID, which guarantees that every CUSTOMER record has exactly one corresponding ADDRESS record. The JOIN does not duplicate, nor remove any CUSTOMER rows, so it is unneeded and thus eliminated (by some, not all databases, will list each database at the end of the article).

So, the database can rewrite the SQL statement to the following, equivalent SQL statement in the presence of said FOREIGN KEY

Read on for a comparison across different products as well.

Comments closed

Text Featurizing With Microsoft R Server

David Smith has a post summarizing sentiment analysis with Microsoft R Server:

Tsuyoshi Matsuzaki demonstrates the process in a post at the MSDN Blog. The post explores the Multi-Domain Sentiment Dataset, a collection of product reviews from Amazon.com. The dataset includes reviews from 975,194 products on Amazon.com from a variety of domains, and for each product there is a text review and a star rating of 1, 2, 4, or 5. (There are no 3-star rated reviews in the data set.) Here’s one example, selected at random:

What a useful reference! I bought this book hoping to brush up on my French after a few years of absence, and found it to be indispensable. It’s great for quickly looking up grammatical rules and structures as well as vocabulary-building using the helpful vocabulary lists throughout the book. My personal favorite feature of this text is Part V, Idiomatic Usage. This section contains extensive lists of idioms, grouped by their root nouns or verbs. Memorizing one or two of these a day will do wonders for your confidence in French. This book is highly recommended either as a standalone text, or, preferably, as a supplement to a more traditional textbook. In either case, it will serve you well in your continuing education in the French language.

The review contains many positive terms (“useful”, “indespensable”, “highly recommended”), and in fact is associated with a 5-star rating for this book. The goal of the blog post was to find the terms most associated with positive (or negative) reviews. One way to do this is to use the featurizeText function in thje Microsoft ML package included with Microsoft R Client and Microsoft R Server. Among other things, this function can be used to extract ngrams (sequences of one, two, or more words) from arbitrary text. In this example, we extract all of the one and two-word sequences represented at least 500 times in the reviews. Then, to assess which have the most impact on ratings, we use their presence or absence as predictors in a linear model:

If you’re thinking about sentiment analysis, read the whole thing.

Comments closed

Kafka Cruise Control

Jiangjie Qin announces Cruise Control, an automated workload management system for Kafka:

Intelligent automation is critical under these circumstances, which is why we developed Cruise Control: a general-purpose system that continually monitors our clusters and automatically adjusts the resources allocated to them to meet pre-defined performance goals. In essence, users specify goals, Cruise Control monitors for violations of these goals, analyzes the existing workload on the cluster, and automatically executes administrative operations to satisfy those goals. You can see a video here about Cruise Control at the Stream Processing Meet Up last fall.

Today we are pleased to announce that we have open sourced Cruise Control and it is now available on Github. In this post, we’ll describe Cruise Control’s uses both generally and at LinkedIn, its architecture, and some unique challenges we faced when creating it. For further details about Kafka terminology used throughout this post, this reference can be a helpful guide.

This isn’t a monitoring tool per se, but rather a resource balancing tool.  And it’s now freely available to all.

Comments closed

Getting Distinct Rows In R

Rob J. Hyndman shows four different techniques (one “classic” and three tidyverse) for getting a distinct subset of a data set in R:

So that looks much better — clean, short, and easy to understand. But is it fast? Rather than grabbing the first lines of each group, it has to go searching for duplicates. But avoiding grouping and ungrouping must save some time.

So I ran some microbenchmark timings:

Click through for techniques and timings.  I’m not surprised that the “classic” method won out in terms of time, but for explanatory value, I’d definitely prefer trying to explain the tidyverse distinct version.  H/T R-Bloggers

Comments closed

Azure SQL Database Multi-Factor Authentication

Arun Sirpal notes that the latest version of SQL Server Management Studio supports Multi-Factor Authentication with Azure Active Directory:

Quite a mouth full for a title but never the less very exciting. With the new version of SQL Server Management Studio (SSMS) 17.2 You now have the option to use Azure AD authentication for Universal Authentication with Multi-factor authentication (MFA) enabled, by that I mean use a login via SSMS that is enabled for MFA where below I will show you the two step verification using a push notification to my iPhone. (Yes iPhone I love it)

Download SSMS 17.2 from this link. https://docs.microsoft.com/en-us/sql/ssms/download-sql-server-management-studio-ssms

Once installed you will see new Authentication options, the option that I want is the one highlighted below – “Active Directory – Universal with MFA support”

Click through for a demo of this.  I wonder if (when?) something like this comes to on-prem, maybe in conjunction with a third-party multi-factor authentication service.

Comments closed

The Downside Of Trusted Assemblies

Solomon Rutzky does not like the Trusted Assembly solution to SQL Server 2017 CLR:

Hopefully, Microsoft removes all traces of “Trusted Assemblies” (as I have suggested here). In either case, please just use Certificates (and possibly Asymmetric Keys, depending on your preference and situation) as I have demonstrated in these past three posts (i.e. Parts 2, 3, and 4). Even better, especially for those using SSDT, would be if Microsoft implemented my suggestion to allow Asymmetric Keys to be created from a binary hex bytes string. But, even without that convenience, there is still no reason to ever, ever, use the “Trusted Assemblies” feature.

He’s given three alternatives so far, so if you’re interested in CLR security, there’s plenty of food for thought.

Comments closed

SQL Code Smells

Phil Factor has updated his code smells compendium:

Some time ago, Phil Factor wrote his booklet ‘SQL Code Smells’, collecting together a whole range of SQL Coding practices that could be considered to indicate the need for a review of the code. It was published as 119 code smells, even though there were 120 of them at the time. Phil Factor has continued to collect them and the current state of the art is reflected in this article. There are now around 150 of these smells and SQL Code Guard is committed to cover as many as possible of them.

I loved this booklet when it came out almost as much as I loved his Confessions of an IT Manager.  If you’re looking for some light reading over a long weekend, you can do a lot worse than this.

Comments closed