July 2024 – Page 2 – Curated SQL

Rounding Options in T-SQL

Published 2024-07-29 by Kevin Feasel

Please compare the SQL Server round function to banker’s rounding in T-SQL for converting decimal values to integer values. I seek a framework for assessing how closely banker’s rounding results versus SQL Server Round function results match the underlying decimal values. Please provide a couple of empirical comparisons with the framework to indicate which set of rounded values are closer to the underlying decimals and by how much.

Rick talks about what banker’s rounding is and shows how its results adhere more closely to the underlying distribution. Rick does show a user-defined function that generates a rounded number, but if you’re doing this with large enough amounts of data, using CLR and the System.Math.Round() function will likely give you better performance. Incidentally, this is also why T-SQL code and .NET code that round the same decimal numbers can give slightly different results: T-SQL’s ROUND() rounds midpoint values away from zero, whereas .NET’s Math.Round() uses banker’s rounding (round half to even) by default.
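
To make the midpoint difference concrete, here is a minimal sketch in Python rather than T-SQL or .NET. It assumes that the `decimal` module's `ROUND_HALF_UP` mode (ties away from zero) stands in for the behavior of T-SQL's ROUND() and that `ROUND_HALF_EVEN` stands in for .NET's default. The function names and sample values are made up for illustration and do not come from Rick's article.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_away(value: str) -> int:
    """Round to an integer, sending ties away from zero."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def round_half_even(value: str) -> int:
    """Round to an integer, sending ties to the nearest even number."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


# Hypothetical sample made up entirely of midpoint values.
samples = ["0.5", "1.5", "2.5", "3.5"]

away = [round_half_away(s) for s in samples]
even = [round_half_even(s) for s in samples]
true_total = sum(Decimal(s) for s in samples)

print("true total:       ", true_total)
print("ties away total:  ", sum(away))
print("banker's total:   ", sum(even))
```

On this sample, rounding ties away from zero pushes every midpoint upward, so the rounded total drifts above the true total, while round-half-even splits the ties in both directions and the total stays put. Real data contains far fewer exact midpoints, so the gap will usually be smaller.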

Comments closed

Restoring a MySQL Table from Filesystem Backup

Published 2024-07-29 by Kevin Feasel

Chad Callihan recovers from a missing database backup:

There may be no worse feeling than needing a database backup and not having one. It ranks right up there with running a DELETE statement and missing the WHERE clause. God help you if you suffer both of those together. If you come across that situation with a MySQL database, you might be able to recover what you need.

Read on to see how. Even so, I’d be concerned about what happens if there are foreign key constraints involved.

SQL Server AGs and Kubernetes

Published 2024-07-29 by Kevin Feasel

Andrew Pruski shakes his head:

Say we have a database that we want to migrate a copy of into Kubernetes for test/dev purposes, and we don’t want to backup/restore.

How can it be done?

Well, with cross platform availability groups! We can deploy a pod to our Kubernetes cluster, create the availability group, and then auto-seed our database!

The caveat is, this probably isn’t a good idea. But then again, when has that ever stopped anyone?

Working with grep in R

Published 2024-07-26 by Kevin Feasel

Steven Sanderson performs a pattern match:

In R, finding patterns in text is a common task, and one of the most powerful functions to do this is grep(). This function is used to search for patterns in strings, allowing you to locate elements that match a specific pattern. Today, we’ll explore how to use wildcard characters with grep() to enhance your string searching capabilities. Let’s dive in!

Read on to learn more about how to use the grep() function.

Handling Multiple Snapshots on a Database

Published 2024-07-26 by Kevin Feasel

Andy Brownsword lets things get out of hand:

Last week we looked at using Database Snapshots to help with rolling back upgrades. The snapshot maintained a point in time copy of the database which could be later restored.

We can go further – a database can have multiple snapshots.

Let’s suppose we want to take one before an upgrade, another once the upgrade is complete, and another before the start of business the following day. This would provide us multiple points to restore to.

This however makes restoring more complicated.

My recollection is that it’s not just restoration that gets more complicated, but also any database activity, to the point where too many database snapshots on a single database can have a considerable performance impact.

Creating a Custom Shape Map in Power BI

Published 2024-07-26 by Kevin Feasel

Elena Drakulevska builds a map:

Are you aiming for that WOW effect when your client opens your report? I’ve found that people often experience an immediate sense of awe when they see a map visual, especially when it’s conditionally formatted to highlight, for example, which country has the highest sales. So, I thought I’d share how you can achieve this and go beyond the built-in shape maps in Power BI. Let’s transform your global data into compelling visual stories!

Click through to learn more about the technique.

Defining the Default Lakehouse for a Fabric Notebook

Published 2024-07-26 by Kevin Feasel

Sandeep Pawar sets up a default lakehouse:

I wrote a blog post a while ago on mounting a lakehouse (or generally speaking a storage location) to all nodes in a Fabric spark notebook. This allows you to use the File API file path from the mounted lakehouse.

Mounting a lakehouse using mssparkutils.fs.mount() doesn’t define the default lakehouse of a notebook. To do so, you can use the configure magic as below:

Read on for that command, as well as some notes around using it.

Contoso Data Generator v2

Published 2024-07-26 by Kevin Feasel

Marco Russo announces an updated product:

I am proud to announce the second version of the Contoso Data Generator!

In January 2022, we released the first version of an open-source project to create a sample relational database for semantic models in Power BI and Analysis Services. That version focused on creating a SQL Server database as a starting point for the semantic model.

We invested in a new version to support more scenarios and products! Yes, Power BI is our primary focus, but 90% of our work could have been helpful for other platforms and architectures, so… why not?

Read on to see how you can use this and generate as much data as you want.

Working with Managed Identities in Azure SQL DB

Published 2024-07-26 by Kevin Feasel

Josephine Bush creates and uses a managed identity:

Benefits of Using Managed Identities and Entra Groups

Enhanced Security: Using managed identities eliminates the need to manage credentials, reducing the risk of credential theft.

Simplified Management: Entra Groups streamline the management of permissions for multiple users or managed identities, making it easier to apply consistent access policies.

Scalability: As your organization grows, you can easily manage access by adding new users or managed identities to Entra Groups without needing to update database permissions individually.

Read on to see how you can create one and what you can do with it.

Changing Distributions and Simpson’s Paradox

Published 2024-07-25 by Kevin Feasel

Jerry Tuttle describes a paradox:

So you spent hours, or maybe days, cranking out thousands of numbers, you submit it to your boss just at the deadline, your boss quickly peruses your exhibit of numbers, points to a single number and says, “This number doesn’t look right.” Bosses have an uncanny ability to do this.

Your boss is pointing to something like this: Your company sells property insurance on both personal and commercial properties. The average personal property premium increased 10% in 2024. The average commercial property premium increased 10% in 2024. But you say the combined average property premium decreased 3% in 2024. You realize that negative 3% does not look right.

Although the blog post doesn’t explicitly mention Simpson’s paradox, I’d argue that this is a good example of the idea. H/T R-Bloggers.
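
The arithmetic behind that surprising number is easier to trust once you see it worked through. Below is a small sketch in Python; the policy counts and average premiums are hypothetical values chosen for illustration (they are not from Jerry's post), picked so that each segment rises 10% while the combined average falls 3%, because the mix shifts toward the cheaper personal policies.

```python
# Hypothetical book of business: (policy count, average premium) per segment.
book_2023 = {"personal": (1000, 1000), "commercial": (1000, 21000)}
book_2024 = {"personal": (1130, 1100), "commercial": (870, 23100)}


def combined_average(book):
    """Policy-weighted average premium across all segments."""
    total_premium = sum(count * avg for count, avg in book.values())
    total_policies = sum(count for count, _ in book.values())
    return total_premium / total_policies


def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Change in average premium within each segment.
segment_changes = {
    seg: pct_change(book_2023[seg][1], book_2024[seg][1]) for seg in book_2023
}

# Change in the combined average, which also reflects the shift in mix.
combined_change = pct_change(
    combined_average(book_2023), combined_average(book_2024)
)

for seg, change in segment_changes.items():
    print(f"{seg}: {change:+.1f}%")
print(f"combined: {combined_change:+.1f}%")
```

The combined average is a weighted average, so it moves with the weights as well as with the segment averages. In this made-up example, the personal share of policies grows from 50% to 56.5%, which is enough to outweigh a 10% increase in both segments.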
