May 2024 – Curated SQL

Building a 100% Stacked Bar Chart in R

Published 2024-05-31 by Kevin Feasel

Calculating cumulative percentage or percentage per group for each time can sometimes be a task with a slight twist. Let’s check this with ggplot2 and tidyverse.

Click through for three separate ways of doing this.
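The arithmetic behind a 100% stacked bar chart is just "each group's share of its period total." The linked post does this in R with ggplot2 and tidyverse; what follows is only a minimal Python sketch of the same calculation, and the function name `percent_per_group` and the sample rows are made up for illustration.

```python
from collections import defaultdict

def percent_per_group(rows):
    """Return {(period, group): share of that period's total, in percent}.

    rows is an iterable of (period, group, value) tuples (hypothetical shape).
    """
    rows = list(rows)
    totals = defaultdict(float)
    for period, _group, value in rows:
        totals[period] += value  # denominator: the total for each period
    return {
        (period, group): 100.0 * value / totals[period]
        for period, group, value in rows
    }

# Illustrative data only; not taken from the linked post.
sample = [("Q1", "A", 30), ("Q1", "B", 70), ("Q2", "A", 1), ("Q2", "B", 3)]
shares = percent_per_group(sample)
```

Within each period the shares sum to 100, which is what makes every stacked bar the same height.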

Data Encryption Options in Postgres

Published 2024-05-31 by Kevin Feasel

Greg Nokes gives us the options:

Operating system or disk-level encryption protects entire file systems or disks. This method is application-agnostic and offers encryption with minimal overhead. Think technologies like LUKS in Linux or FileVault in macOS.

Read on for four options. They’re very similar to options available in SQL Server, so it’s easy enough to compare implementation ideas.

Uniqueidentifier Ordering in SQL Server

Published 2024-05-31 by Kevin Feasel

Jose Manuel Jurado Diaz clears up the mystery:

Today, I worked on a service request in which our customer asked how SQL Server sorts the uniqueidentifier data type. We know that uniqueidentifier stores globally unique identifiers (GUIDs). GUIDs are widely used for unique keys due to their extremely low probability of duplication. One common method to generate a GUID in SQL Server is by using the NEWID() function. However, the ordering of GUIDs, especially those generated by NEWID(), can appear non-intuitive. I would like to share my lessons learned on how to determine the ordering method using uniqueidentifier and NEWID().

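Yeah, things get pretty weird because x86 and x64 processors (Intel and AMD both) are Little Endian, so the first three chunks of a GUID are stored Little Endian, and only the last two chunks are stored in the Big Endian order you see in the string form.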
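To see why the ordering looks odd, here is a rough Python sketch that sorts GUIDs the way SQL Server is generally described as doing it. It assumes the byte comparison order used by .NET's SqlGuid type (the last six bytes first, then the preceding groups working backward) applied to the mixed-endian byte layout. The function name `sql_server_sort_key` is hypothetical, and the linked post remains the authority on the actual behavior.

```python
import uuid

# Assumed comparison order over the 16-byte mixed-endian layout
# (the order used by .NET's SqlGuid; treat it as an assumption here).
SQL_GUID_BYTE_ORDER = (10, 11, 12, 13, 14, 15, 8, 9, 6, 7, 4, 5, 0, 1, 2, 3)

def sql_server_sort_key(guid):
    """Build a sort key that mimics uniqueidentifier ordering (sketch)."""
    raw = guid.bytes_le  # first three fields little-endian, the rest as-is
    return tuple(raw[i] for i in SQL_GUID_BYTE_ORDER)

g1 = uuid.UUID("ffffffff-ffff-ffff-ffff-000000000001")
g2 = uuid.UUID("00000000-0000-0000-0000-000000000002")

as_text = sorted([g1, g2], key=str)                      # left-to-right text order
as_sql = sorted([g1, g2], key=sql_server_sort_key)       # last group decides first
```

Sorted as text, `g2` comes first because it starts with zeros. Under the assumed SQL Server ordering, `g1` comes first because its final six bytes are smaller.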

Finding Duplicate Post Titles and Tuning the Query

Published 2024-05-31 by Kevin Feasel

Erik Darling makes a friend:

I’m going to be totally open and honest with you, dear reader: I’ve been experimenting with… AI.

See, I’m just a lonely independent consultant, and sometimes it’s just nice to have someone to talk to. It’s also kind of fun to take a query idea you have, and ask “someone” else to write it to see what they’d come up with.

ChatGPT (for reference, 4 and 4o) does a rather okay job sometimes. In fact, when I ask it to write a query, it usually comes up with a query that looks a lot like the ones that I have to fix when I’m working with clients.

Considering that the clients probably stole the query idea from Stack Overflow as well, that makes sense. But there was a clever trick that the query returned, so check it out.

Unhelpful Error Restoring Azure SQL MI Database

Published 2024-05-31 by Kevin Feasel

Kendra Little encounters an error:

What’s it like to be a Database Administrator for managed databases in Azure? Sometimes it’s a painful guessing game when a routine, core operation – restoring a database – fails with a most unhelpful error.

In this case, if the restore is run via PowerShell, following Microsoft guidance, the error message is:

Restore-AzSqlInstanceDatabase: Long running operation failed with status ‘Failed’. Additional Info: An unexpected error occured while processing the request. [sic]

Somehow the misspelling of ‘occurred’ stings a bit more. Did anyone review the PR for this code?

I’m trying to weigh in my mind whether this error is worse than “String or binary data would be truncated.” On the one hand, the spelling is correct in the latter error message. On the other hand, it uses passive voice. On the gripping hand, they’re both nigh-useless error messages. Hopefully the SQL MI team fixes Kendra’s error message at least as well as the database engine fixed the latter.

Updates to the TidyDensity R Package

Published 2024-05-30 by Kevin Feasel

Steven Sanderson has a round of updates to share:

util_negative_binomial_aic(): Calculate the Akaike Information Criterion (AIC) for the negative binomial distribution. This function aids in model selection, helping you determine the best-fitting model for your data.

Read on for utility functions for a dozen or so distributions, as well as a few other small fixes.
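For a sense of what an AIC helper computes, the standard formula is AIC = 2k - 2 ln(L), where k is the number of fitted parameters and L is the likelihood. The Python sketch below applies it to a negative binomial with parameters supplied by the caller. The function names are hypothetical, and this makes no claim about how TidyDensity estimates its parameters or implements util_negative_binomial_aic().

```python
import math

def negative_binomial_log_pmf(k, size, prob):
    """Log of P(X = k) for a negative binomial with the given size and prob."""
    return (
        math.lgamma(k + size) - math.lgamma(k + 1) - math.lgamma(size)
        + size * math.log(prob) + k * math.log(1.0 - prob)
    )

def negative_binomial_aic(data, size, prob):
    """AIC = 2k - 2 ln(L), with k = 2 parameters (size and prob).

    Assumes size and prob were already estimated elsewhere.
    """
    log_likelihood = sum(negative_binomial_log_pmf(k, size, prob) for k in data)
    n_params = 2
    return 2 * n_params - 2 * log_likelihood
```

Lower AIC values indicate a better trade-off between fit and parameter count when comparing candidate distributions on the same data.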

Managing SSIS Project Parameters in Visual Studio

Published 2024-05-30 by Kevin Feasel

Andy Brownsword continues diving into SSIS projects:

Last week we looked at project parameters in SSIS to help us configure our projects. This can help us to tailor the project to the package it’s in, for example. Here we’ll look at using Configurations within Visual Studio to manage those parameters for multiple environments.

Read the whole thing.

Encryption Options in Postgres

Published 2024-05-30 by Kevin Feasel

Umair Shahid goes through the list:

For any financial company that handles sensitive data as part of its operations, the protection of personally identifiable information (PII) is paramount. With the increasing frequency and sophistication of cyberattacks, it is crucial for these companies to implement robust security measures to safeguard PII.

This includes ensuring that even in the event of a breach, unauthorized individuals cannot read or misuse the data. One of the most effective ways to achieve this is through the encryption of data both in motion and at rest.

This blog will delve into the importance of encryption, the methods used to secure data in PostgreSQL databases, and the compliance regulations that mandate these practices.

Click through for the article. The set of capabilities is rather similar to what we have in SQL Server as well.

Fun with Implicit Conversion and Table Partitioning

Published 2024-05-30 by Kevin Feasel

Rod Edwards takes us through an issue:

CONVERT_IMPLICIT(nvarchar(200),[xxxxxxxxxxxxxxxxxxx].[Category],0)=[@Category] AND CONVERT_IMPLICIT(nvarchar(200),[xxxxxxxxxxxxxxxxxxx].[Id],0)=[@Id]

Oh dear, what’s all that then? We have a fat Residual Predicate, where SQL is performing an implicit conversion on our query predicates.

Time to look at our datatypes.

This is one of several reasons why I espouse the philosophy of NVARCHAR Everywhere. You can’t have implicit conversion between VARCHAR and NVARCHAR if you’re always using NVARCHAR over VARCHAR.

Using the CONVERT() Function in T-SQL

Published 2024-05-30 by Kevin Feasel

Joe Gavin shows how to use a function:

A common task while working with data in Microsoft SQL Server is converting from one data type to another. Most often, it’s done to change the way data is presented, but sometimes it is needed to make sure the right data types are being used for comparisons, joins, or sorting.

The SQL CONVERT function, which has long been part of the SQL language, can, as the name implies, be used to convert a value of one data type into a specified data type with optional formatting attributes. CONVERT gives you the ability to format, whereas the ISO-compliant CAST function does not.

My very strong recommendation for 99% or so of the audience: use TRY_CONVERT() instead. TRY_CONVERT() came out in SQL Server 2012 (sorry for the 1% stuck pre-2012) and has the same performance profile as CONVERT(), except that, when conversion fails, TRY_CONVERT() returns NULL rather than throwing an error.

There is also a TRY_CAST() that does exactly what you think it would.
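The practical difference is "raise an error" versus "hand back NULL." As a loose analogy only, here is that contract sketched in Python, with `None` standing in for NULL. The helper name `try_convert` is hypothetical, and this is not how SQL Server implements the function.

```python
def try_convert(target_type, value):
    """Return target_type(value), or None when the conversion fails.

    Mirrors the TRY_CONVERT() contract described above: a failed conversion
    yields a null-like result instead of raising an error.
    """
    try:
        return target_type(value)
    except (ValueError, TypeError):
        return None

good = try_convert(int, "42")      # conversion succeeds
bad = try_convert(int, "forty")    # conversion fails, so no exception escapes
```

A caller can then filter or count the `None` results to find bad rows, rather than having one malformed value abort the whole operation.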
