February 2025 – Page 2

Thoughts on Scaling Elasticsearch

Published 2025-02-27 by Kevin Feasel

With the evolution of modern applications serving increasing needs for real-time data processing and retrieval, scalability does, too. One such open-source, distributed search and analytics engine is Elasticsearch, which is very efficient at handling data in large sets and high-velocity queries. However, the process for effectively scaling Elasticsearch can be nuanced, since one needs a proper understanding of the architecture behind it and of performance tradeoffs.

Click through for those considerations and the trade-offs you might see.

Comments closed

Step Outputs to Help Troubleshoot Failed SQL Agent Jobs

Published 2025-02-27 by Kevin Feasel

Jim Evans gives us a reminder:

When troubleshooting SQL Agent jobs, often the Job history output is truncated or poorly formatted, making it hard to read. This is especially true when calling SSIS Packages, running jobs like DBCC CheckDB or when running T-SQL code that returns a lot of output. Are there options to get more readable Job output to aid in troubleshooting?

There are a few settings here that we can use to make troubleshooting SQL Agent jobs a little bit easier. In addition to these, it’s also a good idea to retain more history for longer, especially if you’re not in a position to track those job outputs each day.

Comments closed

Microsoft Fabric February 2025 Feature Round-Up

Published 2025-02-27 by Kevin Feasel

Patrick LeBlanc tells us what’s new:

There are a lot of exciting features for you this month! Here are some highlights: In Power BI, Explore from Copilot visual answers which lets you do easy ad-hoc exploration. In Data Warehouse, Browse files with OPENROWSET (Preview) and Copilot for Data Warehouse Chat (Preview). For Data Science, AI Skill is now conversational.

These are just some of the great features this month, keep reading to learn about all of what’s happened in Fabric this month.

Click through for the full report.

Comments closed

Vertical Partitioning Rarely Works

Published 2025-02-26 by Kevin Feasel

Brent Ozar lays out an argument:

You’re looking at a wide table with 100-200 columns.

Years ago, it started as a “normal” table with maybe 10-20, but over the years, people kept gradually adding one column after another. Now, this behemoth is causing you problems because:

The table’s size on disk is huge

Queries are slow, especially when they do table scans

People are still asking for more columns, and you feel guilty saying yes

You’ve started to think about vertical partitioning: splitting the table up into one table with the commonly used columns, and another table with the rarely used columns. You figure you’ll only join to the rarely-used table when you need data from it.

Read on to understand why this is rarely a good idea and what you can do instead.

I will say that I’ve had success with vertical partitioning in very specific circumstances:

There are large columns, like blobs of JSON, binary data, or very large text strings.
There exists a subset of columns the application (or caller) rarely needs.
Those large columns are in the subset of columns the caller rarely needs, or can access via point lookup.

For a concrete example, my team at a prior company worked on a product that performed demand forecasting on approximately 10 million products across the customer base. For each product, we had the choice between using a common model (if the sales fit a common pattern) or generating a unique model. Because we were using SQL Server Machine Learning Services, we needed to store those custom models in the database. But each model, even when compressed, could run in the kilobytes to megabytes in size. We only needed to retrieve the model during training or inference, not reporting, but we did have reports that tracked what kind of model we were using, whether it was a standard model or custom (and if custom, what algorithm we used). Thus, we had the model binary in its own table, separate from the remaining model data.

Comments closed

Sales KPIs in T-SQL

Published 2025-02-26 by Kevin Feasel

I have a new video:

In this video, we will take a look at several useful KPIs for measuring sales. We will also demonstrate the utility of the APPLY operator, the WINDOW clause, and the APPROX_PERCENTILE_CONT() function.

I enjoyed putting this one together, because I enjoy it any time I have a chance to talk about the APPLY operator.

Comments closed

Capturing Stored Procedure Executions via Extended Events

Published 2025-02-26 by Kevin Feasel

Haripriya Naidu creates an Extended Events session:

If you’re a DBA, how many times have developers asked you to check whether a particular stored procedure is hitting SQL Server?

Yes, we’ve all been there. Typically, you either run Profiler or set up an Extended Events session to trace it. But the question is, have you ever been confused about which event action to use?

Read on for the list of candidates and when you should use each.

I do mildly disagree with Haripriya’s last note around running Extended Events in production. There’s a small performance hit for most events. There are a few events that do have a much stronger impact, but for the large majority of events, you can (and probably should) have them running to provide the relevant diagnostic information. This isn’t like Profiler or even a server-side trace, where there’s a fairly significant overhead cost.

Comments closed

Microsoft Fabric Quotas

Published 2025-02-26 by Kevin Feasel

Mihir Wagle puts the kibosh on things:

On February 24, 2025, we launched Microsoft Fabric Quotas, a new feature designed to control resource governance for the acquisition of your Microsoft Fabric capacities. Fabric quotas aimed at helping customers ensure that Fabric resources are used efficiently and help manage the overall performance and reliability of the Azure platform while preventing misuse.

Note that these are not quotas you set on your users, but rather quotas that Microsoft sets on you.

Comments closed

Maxing Out on Stored Procedure Parameters

Published 2025-02-26 by Kevin Feasel

Louis Davidson tries it out:

So I replied: “Challenge Accepted”, well, actually I replied with a gif of Neo saying it, but the effect was the same. So, I decided to just see, what would that look like. Coincidentally I am testing the new template for the Simple Talk site, and a function with 2100 parameters seemed like some code that screams out: “BIG!” Testing is should always be about pressing the limits of your code, so why not.

It turns out that you cannot have 2101 parameters in a single stored procedure definition. But this is definitely an example of Swart’s 10% Rule. It also nets Louis my most coveted category: Wacky Ideas.

1 Comment

Building a RegEx Emulator in SQL Server

Published 2025-02-26 by Kevin Feasel

Sebastiao Pereira offers a fourth-best solution:

Regular expressions (REGEX) let you adaptively investigate, employ, and authenticate text data. This makes it easy to search for simple or complex string patters. There is no direct way to do this in SQL Server, but in this article we look at some SQL functions you can create to emulate regex like functionality.

Regular expressions are coming to SQL Server 2025 and are in Azure SQL Database, so that’s the best option when it becomes available. The second-best option is to use CLR and offload your regular expressions work to .NET, especially if you’re using a library like SQLSharp to do so. The third-best option would be to do this in Python or R with ML Services, though that’s going to be a bit of setup effort and will probably be somewhat limiting. And if all else fails, this is an admirable fallback.

Comments closed

Working with JSON_OBJECTAGG() and JSON_ARRAYAGG() in Azure SQL

Published 2025-02-25 by Kevin Feasel

Koen Verbeek tries out a couple of fairly new functions:

I need to construct JSON from data in our database, but I find the existing FOR JSON PATH limited when the data is not located in one single row but rather scattered over multiple rows. Is there another method on how to handle JSON data in SQL Server? Learn how to use the new SQL Server JSON Functions JSON_OBJECTAGG and JSON_ARRAYAGG in this article.

These aren’t available on-premises yet, though given that there’s a new version of SQL Server coming out in 2025, there’s a good chance we’ll have it by then.

Comments closed

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28

Month: February 2025