2024-11-20 – Curated SQL

E-Mailing Query Results in Snowflake

Published 2024-11-20 by Kevin Feasel

In our last post, we discussed the most basic way to send an email from Snowflake. It was pretty simple, straight text – nothing to really grab the attention of our readers – which we know is the way to craft an email, right?

To do this, we’re going to have some fun in Python. Yes, delve deep into your bag of Python tricks as we get up to some shenanigans with Snowflake and Python.

Read on for a procedure to e-mail the prior result set in HTML format.
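As a rough illustration of the HTML-formatting step only, here is a minimal sketch in plain Python. It is not the procedure from the linked post, and it does not talk to Snowflake at all: the function name `rows_to_html_table` and the sample columns and rows are made up for the example. The idea is simply to turn column names and row tuples into an HTML table string that could serve as the body of an HTML e-mail.

```python
from html import escape


def rows_to_html_table(columns, rows):
    """Render column names and row tuples as a simple HTML table string."""
    # Escape every value so characters like & and < cannot break the markup.
    header = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


# Hypothetical result set standing in for the prior query's output.
html_body = rows_to_html_table(
    ["REGION", "SALES"],
    [("East", 100), ("West & Central", 250)],
)
```

Escaping each value matters more than it might seem, as query results regularly contain ampersands and angle brackets that would otherwise corrupt the markup.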

Creating a Parquet File in Python

Published 2024-11-20 by Kevin Feasel

Ed Pollack has part one of a two-parter:

This article dives into the Apache Parquet file format, how it works, and how it can be used to export and import data directly to SQL Server, even when a data platform that supports Parquet files natively is unavailable to assist.

In the second part of this article, customizations and more advanced options will be highlighted, showing the flexibility of Python as a tool to solve analytic data movement challenges.

I like how Ed covers the Parquet file format. It’s not all that complicated, but understanding it does give you an idea of why so many operations on Parquet data can be so fast.
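One reason commonly given for that speed is that Parquet groups rows into row groups and keeps per-column statistics such as minimum and maximum values, so a reader can skip whole groups that cannot match a filter. The toy model below sketches that idea in plain Python. It is not taken from Ed's article and does not read or write real Parquet files; `RowGroup`, `build_row_groups`, `scan_greater_than`, and the sample column are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Toy stand-in for a Parquet row group holding one column chunk."""
    values: list
    min_value: int
    max_value: int


def build_row_groups(values, group_size):
    """Split a column into row groups and record min/max statistics for each."""
    groups = []
    for start in range(0, len(values), group_size):
        chunk = values[start:start + group_size]
        groups.append(RowGroup(chunk, min(chunk), max(chunk)))
    return groups


def scan_greater_than(groups, threshold):
    """Return matching values plus the number of row groups actually read."""
    matches, groups_read = [], 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # statistics prove nothing here can match, so skip the group
        groups_read += 1
        matches.extend(v for v in group.values if v > threshold)
    return matches, groups_read


order_ids = list(range(1, 13))  # hypothetical sorted column holding 1..12
groups = build_row_groups(order_ids, group_size=4)
matches, groups_read = scan_greater_than(groups, threshold=9)
```

In this sketch, only one of the three row groups has to be read to answer the filter. The benefit depends heavily on how the data is ordered: a column with values scattered at random across groups would give the statistics little to eliminate.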

SQL Database in Microsoft Fabric

Published 2024-11-20 by Kevin Feasel

Nikola Ilic covers a new addition to the Microsoft Fabric family:

Now, let’s get back to the previous point: SQL database in Fabric is a SaaS Azure SQL DB…Generally speaking, in SaaS solutions, “everything just works” (or at least should work) – without (too much) intervention from your side.

In the context of the SQL database in Fabric, creating a database is probably the most straightforward process of database creation you will ever experience, as I’ll show you in the “HOW TO” section of this article. From that point, everything happens automatically: the database will be automatically configured and will automatically scale both in terms of compute resources and storage. In addition, database backups are performed automatically, indexing also happens the same way, as well as all patches and software/hardware fixes. You want more? No more complex firewall rules and permission settings – this time, everything is done via Fabric workspace roles and item permissions, while the well-known SQL native features allow for more granular control.

This is more of a head-scratcher for me than a brilliant solution. I get that there’s a challenge in figuring out what you want with Azure SQL Database: single database or elastic pool, serverless or provisioned, vCore or DTU-based pricing model, General Purpose or Hyperscale or Business Critical (for vCore), Basic or Standard or Premium (for DTU), one of about five separate hardware configurations, etc.

From the standpoint of “I just want a database, please,” Fabric SQL Database is a lot easier. The problem comes in when you hit the use cases that necessitated all of these options to begin with, at which point you’re back to the original creation screen and outside of Fabric once more.

What Happens upon Row Modification in Postgres

Published 2024-11-20 by Kevin Feasel

Semab Tariq takes us on a journey:

But behind the scenes, it’s more complex to ensure that data remains consistent and accessible. In today’s blog, I’ll answer some frequently asked questions from our customers and dive into why PostgreSQL relies on a process called VACUUM to efficiently manage data updates.

Updating a row in PostgreSQL isn’t as straightforward as directly changing the existing data. Instead, PostgreSQL avoids in-place updates, meaning it doesn’t overwrite rows directly.

But what does this actually mean?

Read on to learn what that actually means.
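To make the "no in-place updates" idea a bit more concrete, here is a toy model in plain Python. It is a simplification under stated assumptions, not how PostgreSQL is implemented: real row versions carry transaction visibility information, and VACUUM only reclaims versions that no running transaction can still see. The class `ToyHeapTable` and its methods are hypothetical names for illustration.

```python
class ToyHeapTable:
    """Toy table whose UPDATE writes a new row version instead of overwriting."""

    def __init__(self):
        self.versions = []  # each entry: {"key", "value", "dead"}

    def insert(self, key, value):
        self.versions.append({"key": key, "value": value, "dead": False})

    def update(self, key, new_value):
        for version in self.versions:
            if version["key"] == key and not version["dead"]:
                version["dead"] = True  # old version stays put, only marked dead
        self.insert(key, new_value)  # the new version is appended

    def select(self, key):
        """Return the value of the live version for this key, if any."""
        for version in self.versions:
            if version["key"] == key and not version["dead"]:
                return version["value"]
        return None

    def vacuum(self):
        """Reclaim space held by dead versions; return how many were removed."""
        before = len(self.versions)
        self.versions = [v for v in self.versions if not v["dead"]]
        return before - len(self.versions)


table = ToyHeapTable()
table.insert(1, "draft")
table.update(1, "review")
table.update(1, "final")
stored_before_vacuum = len(table.versions)
removed = table.vacuum()
```

Two updates to a single logical row leave three stored versions behind until the cleanup step runs, which is the bloat that VACUUM exists to manage.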

Metadata-Only NOT NULL Column Insertion

Published 2024-11-20 by Kevin Feasel

Andy Brownsword has the need for speed:

When adding a new column and wanting to default the value for existing records, it used to be a painful task. As of SQL Server 2012 that became much easier.

But nobody told me, until Simon casually mentioned it in conversation recently. I had to see it for myself, so I thought I’d share for those who weren’t aware.

Read on to see how. I rarely self-promote in other people’s blog posts (hush, person who knows all the times I’ve done it), but I do have a talk on the topic of near-zero downtime database deployment strategies which includes this and quite a few other notes on what you can do without blocking others. For these sorts of changes, what you’re looking for is asynchronous processing and a Sch-M (schema modification) lock at the very end, such as when rebuilding an index with ONLINE = ON in Enterprise Edition. Alternatively, look for a Sch-M lock only on a metadata table and not the actual data. Andy’s post is an example of the latter.
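The general shape of a metadata-only column addition can be sketched with a toy model in plain Python. This is an illustration under assumptions, not SQL Server's storage engine and not code from Andy's post: the class `ToyTable` and its methods are hypothetical. The default is recorded once in the table's metadata, existing rows are left untouched, and reads fill in the default for any row written before the column existed.

```python
class ToyTable:
    """Toy table where an added column's default lives in metadata, not old rows."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.defaults = {}  # column name -> default recorded when the column is added
        self.rows = []      # stored rows hold only the columns physically written

    def insert(self, **values):
        row = {c: values.get(c, self.defaults.get(c)) for c in self.columns}
        self.rows.append(row)

    def add_not_null_column(self, name, default):
        """Metadata-only change: no existing row is touched."""
        self.columns.append(name)
        self.defaults[name] = default

    def select_all(self):
        """Supply the recorded default for rows written before the column existed."""
        return [
            {c: row.get(c, self.defaults.get(c)) for c in self.columns}
            for row in self.rows
        ]


table = ToyTable(["id", "name"])
table.insert(id=1, name="first")
table.insert(id=2, name="second")
stored_before = [dict(row) for row in table.rows]

table.add_not_null_column("is_active", default=1)
rows_untouched = table.rows == stored_before
results = table.select_all()
```

The cost of `add_not_null_column` here does not depend on the number of rows, which is the property that makes this kind of change friendly to a busy system.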

What’s New in SQL Server 2025

Published 2024-11-20 by Kevin Feasel

Brent Ozar has a list:

Today at Microsoft Ignite, Microsoft announced SQL Server 2025. The biggest new features focus on AI and bringing the latest Azure SQL DB features down to your own servers. Here are the top features:

Call AI services like ChatGPT directly from T-SQL – using T-SQL commands and sp_invoke_external_rest_endpoint

Vector searches, a native vector data type, and vector indexes with DiskANN – more information in the documentation

Fabric mirroring – so your data warehouse in Fabric can more easily have a near-real-time copy of your OLTP data sources – similar to how it works in Azure SQL DB

Regular expression support in T-SQL – to learn more about that, read how it works in Azure

Optimized locking – like how it works in Azure SQL DB, and you might have noticed a theme here

Of this list, I’m excited for regular expression support and optimized locking. On the latter, I would like RCSI to be the default, but I can understand that this would have a significant “Who moved my cheese?” factor for DBAs, and there are just enough cases in which turning on RCSI changes how existing code behaves.

Fabric mirroring is also fine, and sp_invoke_external_rest_endpoint is something that can be useful, not just for making external calls. As for storing vector data, I will want to see its performance, both in terms of how quickly it loads and retrieves data and how accurate the results are, before I get too excited. DiskANN is a good algorithm and it’s in Cosmos DB right now, but I haven’t been that impressed with the quality of results so far from it in Cosmos.
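One common way to put a number on result quality for an approximate vector index is recall at k: of the true k nearest neighbors found by brute force, what fraction did the approximate search return? The sketch below shows that calculation in plain Python. It does not use SQL Server, Cosmos DB, or DiskANN; the vectors, the query, and the "approximate" result list are invented for the example, and `exact_top_k` and `recall_at_k` are hypothetical helper names.

```python
import math


def exact_top_k(query, vectors, k):
    """Brute-force nearest neighbors by Euclidean distance; returns vector ids."""
    ranked = sorted(vectors, key=lambda vid: math.dist(query, vectors[vid]))
    return ranked[:k]


def recall_at_k(approximate_ids, exact_ids):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    return len(set(approximate_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical two-dimensional vectors keyed by id.
vectors = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 2.0), "d": (5.0, 5.0)}
query = (0.1, 0.1)

exact = exact_top_k(query, vectors, k=3)
approximate = ["a", "b", "d"]  # made-up output standing in for an approximate index
recall = recall_at_k(approximate, exact)
```

A measurement like this needs a brute-force baseline, so in practice it tends to be run on a sample of queries rather than the full workload.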

Day: November 20, 2024
