# Day: May 11, 2021

The longest streak in roulette purportedly happened in 1943 in the US when the colour red won 32 consecutive times in a row! A quick calculation shows that the probability of this happening seems to be beyond crazy:

`0.5^32[1] 2.328306e-10`

So, what is going on here? For once streaks and clustering happen quite naturally in random sequences: if you got something like “red, black, red, black, red, black” and so on I would worry if there was any randomness involved at all (read more about this here: Learning Statistics: Randomness is a strange beast). The point is that any sequence that is defined beforehand is as probable as any other (see also my post last week: The Solution to my Viral Coin Tossing Poll). Yet streaks catch our eye, they stick out.

There’s one critical assumption in this post, which is that the game is fair, in that each event has an equal probability of happening. But as a Bayesian, if a roulette table hits red 32 times in a row, it certainly opens the door to the idea that maybe the odds on that table with that dealer aren’t quite equal between red and black.

LucidChart, Draw.io and other tools are great for a quick solution.

Mermaid also provides a nice simple text based diagramming tool that is integrated with many markdown tools.

For me, this just never fit. I like a bit of polish and beauty in a visual presentation and most of these are very utilitarian in their output.

I came across diagrams and found it a perfect fit for intuitive and beautiful diagram rendering of cloud architecture, and figured it would be worth a blog post to share this.

Back when GitPitch was still a viable concern, I had just gotten into using the diagrams package. It takes some getting used to and has very strong preferences on the sorts of relationships diagram elements can have, but it’s good at its job.

Anyways, with that obvious answer out of the way, let’s talk about something more interesting, like sp_help_revlogin.

Remember sp_help_revlogin? It’s that stored procedure that Microsoft published more than 20 years ago, that never found its way into the SQL Server built-in system procedures. Microsoft still maintains that same KB page till this day (by “maintains”, I mean copy-and-pasting it from one place to another as they change their KB platforms).

Read on to understand what this is and several ways of doing the same thing better, including a new `sp_help_revlogin2` that Eitan has put together.

Page splits can happen with the logical fragmentation and low page density causing the transactional log to be huge. Page splits are very expensive. As we learned from the previous post, page splits happens when SQL Server tries to insert a row and there is no more space on the page to fit in the page so page will split the page to give the space to fit it in that record. When SQL Server has to insert any rows in to the pages, first the free space on the page is checked at the header information of the page and if the record is within the mentioned free bytes, the record fits in. If the space is not contiguous within the page but there is total space to fit in the record, then the in-memory compaction of the page making the amount of free space on the page contiguous. This is not page split. Page split occurs when the space is not available on the page to fit in the record, in that case page split into half and the split point of the page is usually 50 percent each. Sometimes, the split can happen at the different point on the page is chosen by the storage engine as the obvious split point which is known as skewed page split. This is even more expensive when compared with the regular page split as this will create much larger transaction log.

What causes the page splits?

Basically, the use of the WITH (NOLOCK) query hint performs a dirty read, of which I’m sure you can find oodles of examples on the Internet about. Microsoft introduced RCSI in SQL Server 2005 to help you reduce the amount of blocks, and with RCSI exclusively in use, database readers no longer block other readers or writers. I love enabling RCSI wherever appropriate, as long as the TempDB database is monitored to make sure that RCSI’s version store is not causing any issues (and it usually is just fine). It also goes without saying that with RCSI enabled, you should work to remove the WITH (NOLOCK) query hints from your code, as RCSI is superseded by NOLOCK and can still invoke dirty reads.

But… this Client is still seeing serializable connections that will occasionally cause issues. What else can we check?

Read on for a few tips.

I recently wrote an article showing how you can visualise a dataset refresh using Power BI. It was a pretty cool way to show some of the internal workings of what otherwise is a black box. The idea from my earlier article uses SSMS Profiler to run a trace against a database hosted in Azure AS, or Power BI Premium. Once the refresh is complete, you import the results of the SSMS Profiler trace into a Power BI report to analyse. The approach requires you to wait until the refresh is complete before you can start exploring the data.

Also recently, I had the opportunity work on some large models that took a long time to refresh. I wondered what might be required to update the earlier process to study the results while the refresh was underway. Does that make me too impatient? Here is what I ended up building.

Click through to enjoy the fruits of Phil’s impatience. This is quite the interesting solution, especially if you’re twiddling your thumbs and wondering if this refresh will ever wrap up.

I received a question from a colleague asking how to register a Service Principal Name (SPN) for SQL Server. The specific situation relates to upgrades and new SQL Server instances deployed onto new servers\server names .

Defining an SPN results in an Active Directory name – allowing a client to uniquely identify  the service instance.  The SPN is comprised of a service name with a computer and user account resulting in a Service id.