Press "Enter" to skip to content

Day: January 9, 2025

Prevalence Adjustment in Binary Classifiers

David Lindelöf deal with an issue in analyzing classification models:

When you run a binary classifier over a population you get an estimate of the proportion of true positives in that population. This is known as the prevalence.

But that estimate is biased, because no classifier is perfect. 

Read on to learn what this means for precision, as well as one technique for tracking prevalence changes over itme.

Leave a Comment

Entity Framework and Default Data Lengths

Brent Ozar points out one issue you might run into when using Entity Framework:

Most of the time, I love Entity Framework, and ORMs in general. These tools make it easier for companies to ship applications. Are the apps perfect? Of course not – but they’re good enough to get to market, bring in revenue to pay salaries, and move a company forwards.

However, just like any tool, if you don’t know how to use it, you’re gonna get hurt.

One classic example popped up again last month with a client who’d used EF Core to design their database for them. The developers just had to say which columns were numbers, dates, or strings, and EF Core handled the rest.

Read on for the scenario.

Leave a Comment

Window Function Ranges: UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING

Chad Callihan engages the limit breaker:

I’m familiar with using the OVER clause and don’t think it’s too uncommon to see it used for including row numbers by using ROW_NUMBER() and aggregating data. But even though they’ve been around since SQL Server 2012, I’m not too familiar with using the OVER clause with the UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING to affect the window being queried.

Let’s take a look at a couple of examples using UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING.

Click through for those examples. The default ranges for window functions usually make a lot of sense, but it’s good to understand your options for frames: ROWS vs RANGE, as well as the frame values (UNBOUNDED PRECEDING, {N} PRECEDING, CURRENT ROW, {N} FOLLOWING, and UNBOUNDED FOLLOWING).

Leave a Comment

Reading a SQL Server XML Deadlock Report

Stephen Planck reads a report:

SQL Server includes an Extended Events session called system_health, which runs by default and, among other things, captures information about deadlocks as they occur. When two or more sessions block each other in such a way that no progress can be made (a deadlock), SQL Server chooses one session as the “victim,” rolls back its transaction, and frees resources so other sessions can continue. By reviewing the deadlock report in the system_health session’s XML output, you can see precisely why the deadlock happened and identify which queries or procedures were involved.

Below is a walkthrough of how to interpret a sample XML deadlock report, followed by a brief note on how to access this output.

Read on for that walkthrough.

Leave a Comment