Press "Enter" to skip to content

Curated SQL Posts

The Power of Rounding

Denny Cherry makes a change:

So, I ran across a problem with QuickBooks that involves some of the most basic math that we were all taught in elementary school: how to round numbers properly. You’d think that a company that makes accounting and invoicing software for a living would understand how rounding of numbers works. But based on the last hour of having to edit the data that gets send to QuickBooks from our internal system you’d be wrong.

Denny’s example is $3.18497736 and rounding happens after four decimal spots, so it’s $3.1849 or $3.1850. Denny expects $3.1850 and QuickBooks gives $3.1849.

In this case, Denny’s right. The part that confuses people is banker’s rounding, which has you round to the even number if your last digit is a 5. For example, if it were $3.18495 and you round to four spots after the decimal, that would be $3.1850. With $3.18485, it would round to $3.1848.

.NET uses banker’s rounding by default, which can confuse people unfamiliar with the concept. SQL Server, meanwhile, rounds the way that Denny expects: 5 or higher rounds up, 0-4 rounds down.

Comments closed

Deployment in Azure AI Foundry

Tomaz Kastrun continues a series on Azure AI:

When you are in Azure AI Foundry, on the left navigation bar, select “Model Catalog”.

For this demo, I will be selecting multimodal model “gpt-4” that can work with images and text.

Click “> Deploy” and select the deployment type and also customize the deployment details.

Tomaz has some step-by-step instructions, a bit of detail on deployment types, and a bit of info on how to consume the results.

Comments closed

Against ODBC Functions

Andy Brownsword stakes a claim:

It’s a function, but not quite as we know it. They’re surrounded by { curly braces } and prefixed with fn. The kind of thing I’d expect to see in PowerShell.

It turns out they were introduced in ODBC versions 1-3, with the latter being based on the SQL-92 specification. These were introduced in 1995 and 1992 respectively, which likely explains why I don’t see them used.

I’m thinking hard and don’t believe I’ve ever seen anyone use these in code I’ve maintained. Thus, I agree with Andy: you probably don’t need them.

Comments closed

Alternatives to Long IN() Lists in SQL Server

David Fowler still has a list:

This post comes off the back of my last, where I looked at issues caused by explicitly declaring a large number of values in an IN clause. The query processor ran out of internal resources and could not produce a query plan – When You’ve Managed to Confuse SQL With a Crazy IN Clause.

My suggestion was to put the values into a temp table and carry out a semi-join.

That got me thinking, which of the two methods would perform better.

I’m going to look at the two methods for different amounts of values and have a look at how they get on in terms of reads and total time.

Read on for the answer. If I’m surprised by anything in it, it’s that the threshold where temp tables out-perform the IN() clause is so high.

Comments closed

Unique Constraint Violations Cause Bloat in PostgreSQL

Josef Machytka shows something annoying:

The issue of table and index bloat due to failed inserts on unique constraints is well known and has been discussed in various articles across the internet. However, these discussions sometimes lack a clear, practical example with measurements to illustrate the impact. And despite the familiarity of this issue, we still frequently see this design pattern—or rather, anti-pattern—in real-world applications. Developers often rely on unique constraints to prevent duplicate values from being inserted into tables. While this approach is straightforward, versatile, and generally considered effective, in PostgreSQL, inserts that fail due to unique constraint violations unfortunately always lead to table and index bloat. And on high-traffic systems, this unnecessary bloat can significantly increase disk I/O and the frequency of autovacuum runs. In this article, we aim to highlight this problem once again and provide a straightforward example with measurements to illustrate it. We suggest simple improvement that can help mitigate this issue and reduce autovacuum workload and disk I/O.

I’ll sometimes create unique constraints in SQL Server and set the flag to ignore duplicates, most often in queue tables or situations where I know whatever gets inserted will always be the same, but that multiple agents could act to insert a record and the calling code doesn’t have any sort of protection. The idea that this pattern can hurt you in PostgreSQL seems weird to me.

Comments closed

Using Stored Procedures in Power BI

Annamarie Van Wyk makes a call:

The aim of this blog is to take you through one example of executing SQL stored procedures in Power BI. Similar to SQL Server Reporting Services (SSRS) we can create parameters that will be used to return a result set based on the output parameters that were selected.  The parameter selection will be available on the Power BI home page. The data used in his example is production line data. The production line is inside a factory where products are being packaged during a certain shift and at a certain production run date. It is literally a factory line with products being packed.

Read on for a simple answer, followed by how to parameterize the procedure call.

Comments closed

Creating a Project in Azure AI Foundry

Tomaz Kastrun continues a series on Azure AI:

Azure AI models inference service provides access to the most powerful models available in the Azure AI model catalog. Coming from the key model providers in the industry including OpenAI, Microsoft, Meta, Mistral, Cohere, G42, and AI21 Labs; these models can be integrated with software solutions to deliver a wide range of tasks including content generation, summarization, image understanding, semantic search, and code generation.

The Azure AI model inference service provides a way to consume models as APIs without hosting them on your infrastructure. Models are hosted in a Microsoft-managed infrastructure, which enables API-based access to the model provider’s model. API-based access can dramatically reduce the cost of accessing a model and simplify the provisioning experience.

Read on to learn more about what you get when you create a project.

Comments closed

Backing up SQL Server to S3 and Blob Storage

I have a new video:

In this video, I show how to back up a database directly to AWS S3-compatible storage (in SQL Server 2022) and Azure Blob Storage.

For the S3 portion, I actually use MinIO because I didn’t want to spend several pennies making this video. But it does turn out that the mechanisms are the same as how you would work in AWS S3, once you have the URL and access key.

Comments closed

Delta Tables in Microsoft Fabric with Polars

Sandeep Pawar tries out the Polars library:

The much-anticipated Python notebook in Fabric is finally available and the Fabric users have already developed cool libraries and blogged about the usefulness of these notebooks. Duckdb is everyone’s favorite, but I am a Python guy so here is quick overview of how you can use Polars in the Python notebook.

Polars is an open-source library that uses a Rust engine and supports multi-threaded execution. This means it’s significantly faster than pandas and, in some cases, even faster than Spark. It can efficiently use the limited resources available in Python notebooks (2 cores, 16GB RAM). Polars v1.6 is installed in the default Python notebook environment. So, let’s see how to perform some common operations.

Read on to see how you can load and write out files via Polars.

Comments closed