Day: June 11, 2025

Spark Streaming plus Drools

Ram Ghadiyaram builds a tool:

Near real-time decision-making systems are critical for modern business applications. Integrating Apache Spark (Streaming) and Drools provides scalability and flexibility, enabling efficient handling of rule-based decision-making at scale. This article showcases their integration through a loan approval system, demonstrating its architecture, implementation, and advantages.  

Click through for a bit of sample code.

Vector Search from Scratch

Kanwal Mehreen does a bit of searching:

In this article, I’ll walk you through every step from generating vector representations to searching using cosine similarity, and we’ll even visualize what’s happening behind the scenes. By the end, you’ll not only understand how vector search works but also have a working implementation you can build on. So, let’s get started.

It’s kind of funny how simple this is, but it really is that simple. Most of the complexity lies in data quality operations and in optimizing the search process.
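
To give a sense of how little code the core idea needs, here is a minimal sketch of brute-force vector search using cosine similarity. It is not taken from the linked article: the function names, the tiny three-dimensional vectors, and the document labels are all made up for illustration, and a real system would get its vectors from an embedding model and use an index rather than scanning everything.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction, so treat it as matching nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, corpus, top_k=2):
    """Score every vector in the corpus against the query and keep the best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical hand-written "embeddings", purely for illustration.
corpus = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "databases": [0.0, 0.2, 0.9],
}

query = [1.0, 0.0, 0.0]
results = search(query, corpus)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

The full scan is the part that stops scaling, which is where the search optimization work comes in.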

Reshaping Data with the APPLY Operator

I have a new video:

In this video, I show how we can use the APPLY operator to reshape datasets, allowing us to unpivot tables and also calculate the greatest and least values for a row.

If you look closely at the scripts, you’ll see 08 and 10. In the source control repo, I also have a script 09 that covers splitting strings. Using APPLY to split strings has always been a bit of a niche case, but prior to SQL Server 2016’s introduction of STRING_SPLIT() and SQL Server 2022’s improvement of the function, I could make the case that it sometimes made sense to know how to split strings via APPLY. Today, not so much, which is why I tossed that demo from the video.

Standard Developer Edition in SQL Server 2025

Joey D’Antoni explains why this is a big deal:

No, it’s not dark mode for SQL Server Management Studio, though the votes are probably close. (Note: I feel like I’m the only IT pro who doesn’t use dark mode, and it’s because I record/present so much—pro tip: you shouldn’t present in dark mode). It’s also not the enhancements to Always On Availability Groups, but you’ll read more about those either here or over at Redmond in the coming months. The most requested feature in SQL Server 2025 is Standard Developer Edition (which is an unfortunate name, but in my discussions with the product group, there just wasn’t anything better they could come up, and legal wouldn’t approve Standard McDatabaseyface).

My hot take is that I don’t use dark mode either. More people should just have proper task lighting when they’re working.

Testing ZSTD Backup Compression in SQL Server 2025

Aaron Bertrand runs some tests:

Whether you are a bank or a hot dog stand, creating backups is a boring but essential part of managing databases. Compressing backups – like other types of data compression – can save time and storage space, at the usually unavoidable cost of CPU. There has been little change in compression throughout SQL Server’s long history, but this year, in SQL Server 2025, there is an exciting change coming.

This set of results from Aaron is a bit different from what we’ve seen from Andy Yun and Anthony Nocentino. That’s a big part of why it’s important to get several data points, and to do your own testing in your own environment with your own equipment.

Microsoft Fabric Mirroring and Live Monitoring

Teo Lachev is waiting for a message:

A current project called for mirroring a Google BigQuery dataset to Fabric. This feature is currently in private preview so don’t try to find it. However, the tips I share here should be applicable to other available mirroring scenarios, such as mirroring from Azure SQL Database.

One of the GBQ tables was a transaction fact table with some 130 million rows. The issue was that the mirroring window would show this table as a normally replicating table with Running green status, but we waited and waited and nothing was happening…

Read on to learn more, including how Teo was able to get a better idea of how the initial sync was progressing.

Troubleshooting with Extended Events

Grant Fritchey knows one way to solve the problem:

A client asked us to tell them when a query ran long. Simple. We have a long running query alert, all built in to Redgate Monitor, so, done. No, see, we like getting alerted when queries run long, but not really long, plus we’re more concerned with just one database.

Click through for the story and how Grant was able to help out the client. Also, read the comments for an entry by Special Guest Star Erik Darling.

Troubleshooting Weird Issues

Chad Callihan says sometimes, the best answer is not to play the game:

After some database infrastructure changes related to phasing out the use of linked servers, I encountered issues with a setup tool used to build out new databases and other related features. One section of the tool was failing, and the errors indicated that there were still stored procedures utilizing linked servers, which was causing the problem. I asked myself a few questions on how best to proceed. Does the setup tool need to be updated? Do the related database procedures using linked servers need to be updated? Do the linked server changes made need to be rolled back altogether?

Read on for a proper Gordian Knot solution.