Press "Enter" to skip to content

Curated SQL Posts

Fun with MERGE and Deadlocks

Daniel Hutmacher walks us through another reason to avoid using the MERGE operator:

I recently ran into a curious deadlock issue. I have a process that performs a lot of updates in a “state” table using multiple, concurrent connections. The business logic in the application guarantees that two connections won’t try to update the same item, so we shouldn’t ever run into any locking issues. And yet, we keep getting deadlocks.

What’s going on here? Hint: it has to do with isolation levels and range locks.

Read on for the problem-causing query and a few ways to resolve the problem.

Comments closed

Understanding Confidence & Credible Interval Widths

John Cook takes us through the notion of confidence intervals and credible intervals:

Suppose you do N trials of something that can succeed or fail. After your experiment you want to present a point estimate and a confidence interval. Or if you’re a Bayesian, you want to present a posterior mean and a credible interval. The numerical results hardly differ, though the two interpretations differ.

If you got half successes, you will report a confidence interval centered around 0.5. The more unbalanced your results were, the smaller your confidence interval will be. That is, the confidence interval will be smallest if you had no successes and widest if you had half successes.

What can we say about how the width of your confidence varies as a function of your point estimate p

Read on to learn that answer.

Comments closed

Containerizing a Shiny App

Peter Solymos takes us through the process of running a Shiny app in a Docker container:

Docker provides isolation to applications. Images are immutable. Running multiple instances of the same image can serve many users at the same time. All these general advantages of containerized applications apply to Shiny apps too.

All the general advantages of containerized applications apply to Shiny apps. Docker provides isolation to applications. Images are immutable: once build it cannot be changes, and if the app is working, it will work the same in the future. Another important consideration is scaling. Shiny apps are single threaded, but running multiple instances of the same image can serve many users at the same time. Let’s dive into the details of how to achieve this.

Click through for a walkthrough. Containerizing these sorts of apps has been a boon for my team, as it lets us spin up appropriately-sized servers on the cheap. H/T R-Bloggers

Comments closed

Ditch the Powershell ISE

Jeffrey Hicks has a bit of advice:

Gladys’ big challenge, probably like many of you, is trying to break the habit of launching the PowerShell ISE. It doesn’t matter if you are running the ise alias at a PowerShell prompt or using a Start Menu shortcut. My idea is that if you can’t stop yourself from doing this, then why not have the ISE turn around and launch VS Code? I gave Gladys a few lines of code to put in her PowerShell ISE profile script that launched VS Code and then killed the ISE. I’ve since, refined that code and I thought I’d share it with everyone else.

There are a few assumptions in the process. First, is that when you installed VS Code, you included the option to include the application in your %PATH%. You can test this. At a PowerShell prompt run code. If VS Code launches, you are ready to continue.

Click through to see how.

Comments closed

Dynamic Date Formats in Power BI

Joe Billingham shows how to ensure that users see dates in the format most appropriate to them:

Which date format styles should we use if we are building a report that is being consumed internationally?

Remember, 01/12/2021 is December 1st or January 12th depending in which part of the world it is being read.

The decision may be taken from our hands if there is a company policy in place. If the company is based in the USA, for example, they may choose to use US formatted date fields as a standard for reporting across the entire business, however, if the field needs to be truly dynamic depending on the consumers location, the answer lies in this tool tip:

Click through to see how.

Comments closed

Goals in Power BI

Treb Gatte starts a new series:

Microsoft released a new feature called Power BI Goals to public preview. In this blog series, you’ll learn about using Power BI Goals and Scoreboards to monitor your business, learn ways to structure your Scoreboards, understand how to implement this functionality and be clear on requisite licensing and security questions. I will refer to relevant books and blogs which will benefit you as you implement Power BI Goals.

The image attached to this blog post is a real Scoreboard developed for a manufacturing CEO. It spans the organization and provides a glanceable view of the health of the organization. Many of the due dates are set to the end of the fiscal year as they are annual in nature.

Read on for the first post in the series, which introduces the concept of SMART goals and goals in Power BI.

Comments closed

Securing Amazon Managed Streaming for Kafka

Stephane Maarek has some security advice for us:

AWS launched IAM Access Control for Amazon MSK, which is a security option offered at no additional cost that simplifies cluster authentication and Apache Kafka API authorization using AWS Identity and Access Management (IAM) roles or user policies to control access. This eliminates the need for administrators to run an unfamiliar system to control access to Apache Kafka on Amazon MSK, and learn intricate details and specific commands to manage Apache Kafka access control lists (ACLs).

This is a game-changer from a security perspective for AWS customers who use Apache Kafka: I recommend Amazon MSK customers use IAM Access Control unless they have a specific need for using mutual TLS or SASL/SCRAM authN/Z.

Read on to see how it works.

Comments closed

Finding Jobs Run over a Timeframe

Kenneth Fisher has a query for us:

Short post today. Simple query that will tell you every job that ran between two datetime stamps. Note: this of course assumes that the data exists. Job history is only kept for so long so depending on your settings the data may already have been deleted. Also if a job is currently running it’s first step it’s not going to show up in the history table so obviously it won’t show up in the results for this query.

Click through for the query. It uses CROSS APPLY twice, so I like it twice as much.

Comments closed

Types of Fragmentation on Index Pages in SQL Server

Deepthi Goguri explains what sorts of fragmentation can occur on an index in SQL Server:

Logical Fragmentation occurs when the logical order of the leaf level pages (logical order meaning the next key values in order) no longer the continuous page to the next physical data file page. Because of these pages which are out of order will affect the read ahead mechanism and the scan performance. Because of this logical fragmentation, read ahead have to do smaller read ahead reads.

If the logical fragmentation pages are already in the memory than the read ahead mechanism will not be affected in that case. Logical fragmentation will cause the problem for bigger indexes and not for the smaller ones usually (smaller indexes having pages 1000-5000 pages). You can monitor the amount of the index logical fragmentation by using the DMV sys.dm_db_index_physical_stats.

Read on to learn more about logical fragmentation, as well peers extent fragmentation and low page density.

Comments closed