Curated SQL – Page 19 – A Fine Slice Of SQL Server

LightSHAP

Published 2025-10-20 by Kevin Feasel

Michael Mayer announces a new Python package:

LightSHAP is here – a new, lightweight SHAP implementation for tabular data. While heavily inspired from the famous shap package, it has no dependency on it. LightSHAP simplifies working with dataframes (pandas, polars) and categorical data.

Read on to see how it works. As of this post, the current version is 0.1.12, and it is available via PyPI.

Linux Huge Pages and PostgreSQL

Published 2025-10-20 by Kevin Feasel

Umair Shahid explains the value of huge pages when running PostgreSQL:

Huge pages are a Linux kernel feature that allocates larger memory pages (typically 2 MB or 1 GB instead of the normal 4 KB). PostgreSQL’s shared buffer pool and dynamic shared memory segments are often tens of gigabytes, and using huge pages reduces the number of pages the processor must manage. Fewer page‑table entries mean fewer translation‑lookaside‑buffer (TLB) misses and fewer page table walks, which reduces CPU overhead and improves query throughput and parallel query performance. The PostgreSQL documentation notes that huge pages “reduce overhead … resulting in smaller page tables and less CPU time spent on memory management”

One thing I found interesting here is that the advice for PostgreSQL is to disable Transparent Huge Pages (THP), whereas for SQL Server on Linux, Microsoft’s recommendation is to keep THP enabled.
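
To make the page-count argument in the quoted passage concrete, here is a minimal sketch of the arithmetic in Python. The 32 GiB shared memory size is a hypothetical figure chosen for illustration, not a number from Umair's post, and the helper name pages_needed is my own.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes: int, page_bytes: int) -> int:
    """Return how many pages of page_bytes are required to map region_bytes."""
    # Ceiling division: a partially used page still needs a page-table entry.
    return -(-region_bytes // page_bytes)


# Hypothetical shared memory region; pick a value matching your own server.
shared_memory = 32 * GIB

for label, page_size in (("4 KB", 4 * KIB), ("2 MB", 2 * MIB), ("1 GB", GIB)):
    count = pages_needed(shared_memory, page_size)
    print(f"{label} pages: {count:,}")
```

Under that assumption, the same region needs millions of entries with 4 KB pages but only thousands with 2 MB pages, which is the reduction in page-table size the quote describes.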

Job-Level Bursting in Microsoft Fabric Spark Jobs

Published 2025-10-20 by Kevin Feasel

Santhosh Kumar Ravindran announces a new feature:

Enabled (Default): When enabled, a single Spark job can leverage the full burst limit, consuming up to 3× CUs. This is ideal for demanding ETL processes or large analytical tasks that benefit from maximum immediate compute power.

Disabled: If you disable this switch, individual Spark jobs will be capped at the base capacity allocation. This prevents a single job from monopolizing the burst capacity, thereby preserving concurrency and improving the experience for multi-user, interactive scenarios.

Read on for the list of caveats and the note that it will cost extra money to flip that switch.
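
As a rough illustration of the two settings quoted above, the following Python sketch computes the per-job ceiling under each one. This is plain arithmetic based on the 3× figure in the quote, not a Fabric API; the function name and the 64 CU base capacity are hypothetical.

```python
BURST_MULTIPLIER = 3  # "up to 3x CUs", per the quoted announcement


def max_job_cus(base_cus: int, job_level_bursting: bool = True) -> int:
    """Return the most capacity units a single Spark job could consume."""
    if job_level_bursting:
        # Enabled (the default): one job may use the full burst limit.
        return base_cus * BURST_MULTIPLIER
    # Disabled: each job is capped at the base capacity allocation.
    return base_cus


base = 64  # hypothetical base capacity in CUs
print("Bursting enabled: ", max_job_cus(base, True))
print("Bursting disabled:", max_job_cus(base, False))
```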

Business Continuity Options for SQL Server

Published 2025-10-20 by Kevin Feasel

Aleksey Vitsko goes through the list:

SQL Server and Azure SQL offer many options for high availability and disaster recovery. In this article, we look at various options for SQL Server high availability and SQL Server disaster recovery.

Read on to see the list of options and some information about each.

SSMS Query Hint Recommendation Tool

Published 2025-10-20 by Kevin Feasel

Brent Ozar tries out a new feature of SQL Server Management Studio:

The maximum tuning time defaults to 300 seconds, but I tacked on a couple zeroes because my slow query already took ~20 seconds to run on its own, and I wanted to give the wizard time to wave his little wand around. The tool actually runs your query repeatedly with different hints, so if you have a 5-minute query, you’ll need to give the tool more time.

Click Start, and it begins running your query with different hints. A couple minutes later, I got:

Brent’s review is quite positive, in a “This is way better than the alternative of doing nothing” sense.

Monitoring Microsoft Fabric Costs

Published 2025-10-20 by Kevin Feasel

Chris Webb uses a report:

Following on from my blog post a few months ago about cool stuff in the Fabric Toolbox, there is now another really useful solution available there that anyone with Fabric capacities should check out: Fabric Cost Analysis (or FCA). If you have Fabric capacities it’s important to be able to monitor your Azure costs relating to them, so why not monitor your Fabric costs using a solution built using Fabric itself? This is what the folks behind FCA (who include Romain Casteres, author of this very useful blog post on FinOps for Fabric, plus Cédric Dupui, Manel Omani and Antoine Richet) decided to build and share freely with the community.

Click through to see how it works, and check out the FCA link in the graf above to get the code.

The Downside of Zero-Copy Integration between Kafka and Iceberg

Published 2025-10-16 by Kevin Feasel

Jack Vanlightly lays out an argument:

Over the past few months, I’ve seen a growing number of posts on social media promoting the idea of a “zero-copy” integration between Apache Kafka and Apache Iceberg. The idea is that Kafka topics could live directly as Iceberg tables. On the surface it sounds efficient: one copy of the data, unified access for both streaming and analytics. But from a systems point of view, I think this is the wrong direction for the Apache Kafka project. In this post, I’ll explain why.

Read on for an explanation of what “zero-copy” means here, as well as Jack’s position on the matter. I think it’s a solid argument and worth the read.

Updates to sp_CheckSecurity

Published 2025-10-16 by Kevin Feasel

Jeff Iannucci has been busy:

It’s been a while since we made some improvements to the public version of sp_CheckSecurity, but internally we’ve been busy fine tuning checks and adding even more to discover potential vulnerabilities in your SQL Server instances.

Today we’re announcing a new version that includes additions, corrections, and a few other adjustments that should be helpful. Here’s what’s new!

Read on to see what has changed.

String Comparisons in Oracle

Published 2025-10-16 by Kevin Feasel

Brendan Tierney gets comparing:

When comparing text strings we have a number of functions on Oracle Database to help us. These include SOUNDEX, PHONIC_ENCODE and FUZZY_MATCH. Let’s have a look at what each of these can do.

These are some classic word comparison techniques, but they work pretty well in specific circumstances.
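
For a sense of how one of these classic techniques works, here is a minimal Python sketch of the textbook American Soundex algorithm. It is an illustration of the general technique only; I am assuming nothing about how Oracle implements SOUNDEX internally, and edge cases may differ from the database function.

```python
# Map each coded consonant to its Soundex digit.
CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word: str) -> str:
    """Return the four-character Soundex code for word, or '' if it has no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]                # the first letter is kept as-is
    prev = CODES.get(letters[0], "")   # but its code still suppresses a repeat
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = CODES.get(ch, "")       # vowels and Y have no code
        if code and code != prev:
            result += code
        prev = code                    # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]        # pad with zeros, truncate to four


print(soundex("Robert"), soundex("Rupert"))  # R163 R163
```

Names that sound alike collapse to the same code, which is why the approach works well for matching English surnames and poorly for most other text.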

Customer-Managed Keys in Microsoft Fabric

Published 2025-10-16 by Kevin Feasel

Sumiran Tandon makes an announcement:

Customer managed keys were launched in preview, offering workspace administrators the ability to use keys in Azure Key Vault and Managed HSM, to protect data in certain Fabric items. Now, we are extending the encryption support to more Fabric workloads. You can now create Fabric Warehouses, Notebooks and utilize the SQL Analytics Endpoint in workspaces enabled with encryption using your keys. The changes are rolling out and should be available in all regions over the next few days.

Freddie Santos digs into what this means for Fabric Warehouse and the SQL analytics endpoint:

Fabric already ensures that your data is encrypted at rest using Microsoft-managed keys. But for many organizations—especially in regulated industries—encryption alone isn’t enough. They need the ability to control and manage the keys that protect their data, aligning with internal compliance requirements, regulatory standards, and governance best practices.

I know that there are enough companies where this is absolutely necessary for adoption of a product, but I should point out that even without bringing your own key, Microsoft does use their own generated keys to encrypt your data at rest.
