Press "Enter" to skip to content

Curated SQL Posts

Uniquifiers Doing Heavy Lifting

Michael J. Swart is one of a kind:

If you define a clustered index that’s not unique, SQL Server will add a hidden 4-byte column called UNIQUIFIER. You can’t see it directly but it’s there. When you add a row whose key is a duplicate of an existing row, the new row gets a new unique value for it’s uniqueifier. If you add over 2.1 billion rows with the same key, the uniquifier value exceeds the limit and you will see error 666.

A while ago, we nearly got into trouble because of a bad choice for clustering key that went undetected for so long.

Click through for a query to see how many clustered indexes need uniquifiers and which have the most duplication of key fields.

Comments closed

Data Integrity and the LIKE Operator

Joe Celko provides some context:

The LIKE operator has been in SQL since the very beginning. It’s a very weak form of pattern matching, that has only two simple wildcards. The wildcards are ‘_‘ (underscore) and ‘%‘ (percent). They respectively stand for any single character and for zero or more characters in a substring. The syntax is pretty straightforward:

Read on for examples of LIKE in action, including in check constraints.

Comments closed

Cost Threshold for Parallelism and Missing Indexes

Jared Westover explains a phenomenon:

Did you know that the Cost Threshold for Parallelism (CTFP) affects SQL Server’s choice of a trivial execution plan? One area that can suffer from this setting is the optimizer’s ability to suggest index recommendations. When SQL Server picks a trivial plan, it skips suggesting any missing indexes. So, if you’ve set a high CTFP and run simple queries, you might never get those handy index recommendations.

Click through to learn more.

Comments closed

Rating a Dashboard

Martin Schoombee explains the challenge of dashboard review:

A little while ago I was asked to rate a Power BI dashboard. The person who asked, participated in a Power BI challenge (I’ll call it that because that’s the way they are being marketed) and wanted some feedback on the submission. I agreed on the condition that the feedback would be public and in the form of a blog post.

Martin has a thoughtful explanation of the difficulty of providing a review (especially without important context around what the end users intend to do) but then does yeoman’s work talking about the visuals.

Comments closed

Always Encrypted and Key Vault

Rod Edwards starts a series on Always Encrypted:

For the purposes of this post, i’m going to show how to set up Always Encrypted for column data held in an OnPrem SQL Server (which could be MI, or Azure SQL DB ) using keys stored in Azure Key Vault. And showing you some of the many potential issues that you need to be aware of along the way.

I’m not going to talk about the Secure Enclaves flavour here, just the ‘common or garden’ Always Encrypted.

Read on to learn more.

Comments closed

Authentication Monitoring in Postgres

Rafia Sabih announces a new extension:

How about a situation when you want to log all the login attempts made to your PostgreSQL server. Yes, one way would be to read your log files and find out all the information related to login attempts and then transfer to them some other file, etc. to further use this information. Now, this is one way of doing it but it might pose some challenges like time and memory consumption in reading from those long files, additionally this is going to incur a lot of I/O which might be a serious issue when you are on cloud. During my time working for Zalando, we came across this problem and decided to write a Postgres extension to handle this efficiently.

Click through to learn more about the pg_auth_mon extension and how it works, including a link to the GitHub repository.

Comments closed

Speech to Text with Streamlit and Azure AI

I have a new video:

In this video, I show how we can integrate with the Azure AI Services Speech service, using two different methods to capture speech from the microphone via our Streamlit application and submit that to Azure OpenAI.

Check out the video and final set of code. There’s an intermediate set of code for detecting a single utterance. But I think the final product works out pretty well.

Comments closed

Comparing grep() and grepl() in R

Steven Sanderson compares two functions:

Both grep() and grepl() are functions in R that help us search for patterns in text. Think of them as detectives looking for clues in a big pile of words!

  • grep(): This function is like a pointer. It tells you where it found the pattern you’re looking for.
  • grepl(): This one is more like a yes/no checker. It tells you if the pattern exists or not.

Read on for examples of each.

Comments closed

Performance Tuning via Query History in Snowflake

Kevin Wilkie gets down to tuning:

In our last post, we talked about some of my favorite queries I use in Snowflake to see various items of interest – such as finding the worst-performing queries. For today’s post, though, I want to talk about performance tuning.

Yes, you read that right. We’re going to use query history to do some fun performance tuning.

Click through for two queries that can help you find what you may need to tune.

Comments closed

Building a Graph Visual with Fabric and KQL

Brian Sherwin builds a graph:

For quite a while, I have been extremely interested in data visualization. Over the last few years, I have been focused on ways to visualize graph databases (regardless of where the data comes from Using force directed graphs to highlight the similarities or “connected communities” in data is incredibly powerful. The purpose of this post is to highlight the recent work that the Kusto.Explorer team has done to visualize graphs in Azure Data Explorer database with data coming from a Fabric KQL Database.

Note: The Kusto.Explorer application used to visualize the graph is currently only supported on Windows.

I’ll just sit here in the corner with my Linux laptop, not jealous or anything.

Comments closed