Press "Enter" to skip to content

Curated SQL Posts

A Story of Bad Data Modeling

Kendra Little unravels a puzzle:

I double-checked my queries. Had I goofed in my sql? Nope. Next, I looked into if some of the data was in an inconsistent state.

What I found was worse than what I’d imagined. As a data person, it made me feel sad and icky.

That’s because it’s usually not too hard to clean up bad data. It’s almost always much harder to fix a badly designed data model which is already established in production.

Read on for a tale as old as time: the clarion call of expediency now causing pain later.

Comments closed

T-SQL Query Transformations and Performance

Erik Darling isn’t satisfied with “good enough”:

Query tuning is hard work, especially when queries run for a long time and you have to test various changes trying to get it down to a reasonable runtime.

There are, of course, things you pick up intuitively over time, and figuring out where problems in actual execution plans has gotten easier with operator runtimes.

Beyond basic query tuning intuition comes the really creative stuff. The stuff you’re amazed anyone ever thought of.

Click through for two really interesting examples.

Comments closed

Datatables and Calculations in KQL

Robert Cain follows up on a prior post:

In the conclusion of my last post, Fun With KQL – Datatable, I mentioned a datatable can also be useful when you need to do special calculations. I decided it really needed further explanation and created this post to dive in a bit deeper.

If you haven’t read my datatable post yet, you should as I’ll refer back to it. Go ahead, I’ll wait.

Click through to see what Robert has in mind.

Comments closed

Indexing in SQL Server Graph

Hugo Kornelis offers a mea culpa:

In my post, I pointed out that SQL Server automatically creates a unique nonclustered index on the (internal and hidden) column graph_id, that you can delete, but can’t modify. I added that users cannot specify indexes on that column, nor on the other internal and hidden columns that exist in edge tables (from_obj_id, from_id, to_obj_id, and to_id). These statements are factually correct, but misleading.

I then concluded from those facts that once you delete the automatically generated index, you can not re-create it. This is actually incorrect.

Read on for the correct answer, as well as more information on indexing practices and advice on the right kinds of indexes to create on graph tables.

Comments closed

Fabric Data Integration

Teo Lachev reviews the primary methods for data ingestion in Microsoft Fabric:

Fabric supports three options for automated data integration: Data Pipeline (Azure Data Factory pipeline), Dataflow Gen2 (Power BI dataflow), and Notebook (Spark). I summarize these three options in the following table, which loosely resembles the Microsoft comparison table but with my take on it.

Read on for Teo’s thoughts on the matter.

Comments closed

VBS Enclaves for Always Encrypted in Azure SQL DB Elastic Pools

Pieter Vanhove makes an announcement:

A few months ago we announced the support for virtualization-based security (VBS) enclaves in Azure SQL Database. This announcement brings numerous advantages, including robust confidential queries and seamless cryptographic operations, to all Azure SQL Database offerings, independent from the underlying hardware. You can use the feature with any compute tier (provisioned or serverless), purchasing model (vCore or DTU), compute size and region that aligns with your workload needs. And, since VBS enclaves are available in existing hardware offerings, there is no additional cost.

In addition to this preview, we are excited to announce the preview of VBS enclaves in Azure SQL Database elastic pools!

Read on to learn more about how to enable enclaves and add a database to an elastic pool.

Comments closed

Microsoft Fabric Architectural Icons

Marc Lelijveld imports some icons:

In the past, I’ve made a draw.io file for Power BI to help you using the right icons to design your solutions and make architectural diagrams. With Fabric, a bunch of new services and icons have been introduced. This asks for a new draw.io file.

With this blog, I will provide the draw.io file for all new icons and elements of Fabric.

Click through for that link. Also note that you might be more familiar with the new name of draw.io, diagrams.net.

Comments closed

Creating an HTTP Header Hash in R

Bob Rudis creates an R package:

HTTP Headers Hashing (HHHash) is a technique developed by Alexandre Dulaunoy to generate a fingerprint of an HTTP server based on the headers it returns. It employs one-way hashing to generate a hash value from the list of header keys returned by the server. The HHHash value is calculated by concatenating the list of headers returned, ordered by sequence, with each header value separated by a colon. The SHA256 of this concatenated list is then taken to generate the HHHash value. HHHash incorporates a version identifier to enable updates to new hashing functions.

Read on to see when it might be useful and other things you should know about the package. H/T R-Bloggers.

Comments closed

Log Analysis by Hand in Postgres

Laetitia Avrot looks at the logs:

If you’re one of my customers, you might know how I insist on monitoring your Postgres logs and digging into them to find precious insights on what’s going on with your Postgres. For a long time now, there is pgBadger. For PGSQL Phriday #010, Alicja asks us to focus on pgBadger.

You might be surprised to find out I am not using pgBadger. I will explain why later, but keep assured that I do think pgBadger is a good tool. It will help DBAs get better performance and follow how their instance is doing before there is a very bad problem.

Click through for Laetitia’s reasons as well as an alternative way of analyzing log files.

Comments closed