2021-05-11 – Curated SQL

Hot, Cool, and Large Numbers

Published 2021-05-11 by Kevin Feasel

Holger von Jouanne-Diedrich hits the casino:

The longest streak in roulette purportedly happened in 1943 in the US when the colour red won 32 consecutive times in a row! A quick calculation shows that the probability of this happening seems to be beyond crazy:
0.5^32[1] 2.328306e-10
So, what is going on here? For once streaks and clustering happen quite naturally in random sequences: if you got something like “red, black, red, black, red, black” and so on I would worry if there was any randomness involved at all (read more about this here: Learning Statistics: Randomness is a strange beast). The point is that any sequence that is defined beforehand is as probable as any other (see also my post last week: The Solution to my Viral Coin Tossing Poll). Yet streaks catch our eye, they stick out.

There’s one critical assumption in this post, which is that the game is fair, in that each event has an equal probability of happening. But as a Bayesian, if a roulette table hits red 32 times in a row, it certainly opens the door to the idea that maybe the odds on that table with that dealer aren’t quite equal between red and black.

Comments closed

Creating Diagrams from Code

Published 2021-05-11 by Kevin Feasel

Sheldon Hull walks us through the diagrams package in Python:

LucidChart, Draw.io and other tools are great for a quick solution.
Mermaid also provides a nice simple text based diagramming tool that is integrated with many markdown tools.
For me, this just never fit. I like a bit of polish and beauty in a visual presentation and most of these are very utilitarian in their output.
I came across diagrams and found it a perfect fit for intuitive and beautiful diagram rendering of cloud architecture, and figured it would be worth a blog post to share this.

Back when GitPitch was still a viable concern, I had just gotten into using the diagrams package. It takes some getting used to and has very strong preferences on the sorts of relationships diagram elements can have, but it’s good at its job.

Comments closed

Building a Better sp_help_revlogin

Published 2021-05-11 by Kevin Feasel

Eitan Blumin remembers:

Anyways, with that obvious answer out of the way, let’s talk about something more interesting, like sp_help_revlogin.
Remember sp_help_revlogin? It’s that stored procedure that Microsoft published more than 20 years ago, that never found its way into the SQL Server built-in system procedures. Microsoft still maintains that same KB page till this day (by “maintains”, I mean copy-and-pasting it from one place to another as they change their KB platforms).

Read on to understand what this is and several ways of doing the same thing better, including a new sp_help_revlogin2 that Eitan has put together.

Comments closed

The Mechanisms of Page Splits

Published 2021-05-11 by Kevin Feasel

Deepthi Gogrui takes us through page splitting in SQL Server:

Page splits can happen with the logical fragmentation and low page density causing the transactional log to be huge. Page splits are very expensive. As we learned from the previous post, page splits happens when SQL Server tries to insert a row and there is no more space on the page to fit in the page so page will split the page to give the space to fit it in that record. When SQL Server has to insert any rows in to the pages, first the free space on the page is checked at the header information of the page and if the record is within the mentioned free bytes, the record fits in. If the space is not contiguous within the page but there is total space to fit in the record, then the in-memory compaction of the page making the amount of free space on the page contiguous. This is not page split. Page split occurs when the space is not available on the page to fit in the record, in that case page split into half and the split point of the page is usually 50 percent each. Sometimes, the split can happen at the different point on the page is chosen by the storage engine as the obvious split point which is known as skewed page split. This is even more expensive when compared with the regular page split as this will create much larger transaction log.
What causes the page splits?

Read on for the full article.

Comments closed

When RCSI Is Not Enough

Published 2021-05-11 by Kevin Feasel

David Klee diagnoses an issue:

Basically, the use of the WITH (NOLOCK) query hint performs a dirty read, of which I’m sure you can find oodles of examples on the Internet about. Microsoft introduced RCSI in SQL Server 2005 to help you reduce the amount of blocks, and with RCSI exclusively in use, database readers no longer block other readers or writers. I love enabling RCSI wherever appropriate, as long as the TempDB database is monitored to make sure that RCSI’s version store is not causing any issues (and it usually is just fine). It also goes without saying that with RCSI enabled, you should work to remove the WITH (NOLOCK) query hints from your code, as RCSI is superseded by NOLOCK and can still invoke dirty reads.
But… this Client is still seeing serializable connections that will occasionally cause issues. What else can we check?

Read on for a few tips.

Comments closed

Visualizing Power BI Refresh in Real-Time

Published 2021-05-11 by Kevin Feasel

Phil Seamark doesn’t have time to wait:

I recently wrote an article showing how you can visualise a dataset refresh using Power BI. It was a pretty cool way to show some of the internal workings of what otherwise is a black box. The idea from my earlier article uses SSMS Profiler to run a trace against a database hosted in Azure AS, or Power BI Premium. Once the refresh is complete, you import the results of the SSMS Profiler trace into a Power BI report to analyse. The approach requires you to wait until the refresh is complete before you can start exploring the data.
Also recently, I had the opportunity work on some large models that took a long time to refresh. I wondered what might be required to update the earlier process to study the results while the refresh was underway. Does that make me too impatient? Here is what I ended up building.

Click through to enjoy the fruits of Phil’s impatience. This is quite the interesting solution, especially if you’re twiddling your thumbs and wondering if this refresh will ever wrap up.

Comments closed

Registering an SPN for SQL Server

Published 2021-05-11 by Kevin Feasel

Jack Vamvas explains how to register an SPN for SQL Server:

I received a question from a colleague asking how to register a Service Principal Name (SPN) for SQL Server. The specific situation relates to upgrades and new SQL Server instances deployed onto new servers\server names .
Defining an SPN results in an Active Directory name – allowing a client to uniquely identify the service instance. The SPN is comprised of a service name with a computer and user account resulting in a Service id.

Read on for the process.

Comments closed

Fun with MERGE and Deadlocks

Published 2021-05-11 by Kevin Feasel

Daniel Hutmacher walks us through another reason to avoid using the MERGE operator:

I recently ran into a curious deadlock issue. I have a process that performs a lot of updates in a “state” table using multiple, concurrent connections. The business logic in the application guarantees that two connections won’t try to update the same item, so we shouldn’t ever run into any locking issues. And yet, we keep getting deadlocks.
What’s going on here? Hint: it has to do with isolation levels and range locks.

Read on for the problem-causing query and a few ways to resolve the problem.

Comments closed

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

Day: May 11, 2021