Day: September 22, 2022

Survival Analysis Model Explanations with survex

Mikolaj Spytek promotes an R package:

You can learn about it in this blog, but long story short, survival models (most often) predict a survival function. It tells us what is the probability of an event not happening until a given time t. The output can also be a single value (e.g., risk score) but these scores are always some aggregates of the survival function and this naturally leads to a loss of information included in the prediction.

The complexity of the output of survival models means that standard explanation methods cannot be applied directly.

Because of this, we (I and the team: Mateusz Krzyziński, Hubert Baniecki, and Przemyslaw Biecek) developed an R package, survex, which provides explanations for survival models. We hope this tool allows for more widespread usage of complex machine learning survival analysis models. Until now, simpler statistical models such as Cox Proportional Hazards were preferred due to their interpretability, which is vital in areas such as medicine, even though they were frequently outperformed by complex machine learning models.

Read on to dive into the topic. H/T R-Bloggers.
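
To make the "survival function versus single score" point concrete, here is a minimal Python sketch. It is not survex (which is an R package) and not any model from the post: it assumes a plain Kaplan-Meier estimate on made-up data, and the function names `kaplan_meier` and `restricted_mean` are my own. The first function returns the whole step curve S(t); the second collapses that curve into one number, which is the kind of aggregation the excerpt says loses information.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, S(time)) steps of the Kaplan-Meier survival estimate.

    times  -- observed follow-up time per subject
    events -- 1 if the event occurred at that time, 0 if the subject was censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Probability of surviving past t, given survival up to t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def restricted_mean(curve, horizon):
    """Area under the step survival curve up to `horizon` (one possible single-number summary)."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    area += prev_s * (horizon - prev_t)
    return area


# Made-up data: five subjects, two of them censored (event flag 0).
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
score = restricted_mean(curve, horizon=8)
```

Different curves can share the same `score`, which is why explanations built only on the aggregate can miss when in time a feature matters.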

The Bullet Chart

Amy Esselman explains what bullet charts are and when they are useful:

A bullet graph, or a bullet chart, is a variation of a bar chart, typically consisting of a primary bar layered on top of a secondary stack of less-prominent bars. Bullet graphs are best used for making comparisons, such as showing progress against a target or series of thresholds. For example, an organization may want to measure the current year’s sales against a goal, while contrasting it with the performance of the prior year. 

Bullet graphs leverage our familiarity with bar graphs to deliver a lot of information in a compact space. If you want to display metric performance against a goal or reference point, a bullet graph offers a nicely consolidated design. 

Read on for examples and alternatives.
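
The layering described above (background bands, a primary bar on top, and a target marker) can be sketched in a few lines of Python. This is only a text-mode illustration under my own assumptions: the function name `bullet_line`, the characters used, and the sample numbers are all hypothetical and do not come from Amy's post.

```python
def bullet_line(label, actual, target, thresholds, width=40):
    """Render one text bullet graph row.

    thresholds -- ascending upper bounds of the qualitative bands; the last one
                  sets the scale of the whole row.
    """
    scale = thresholds[-1]
    shades = ".:-"  # light characters for the background bands (cycled if more bands)

    def col(value):
        # Map a value to a column index, clamped to the row width.
        return min(width - 1, max(0, round(value / scale * (width - 1))))

    cells = []
    for i in range(width):
        value_here = i / (width - 1) * scale
        band = next(k for k, upper in enumerate(thresholds) if value_here <= upper)
        cells.append(shades[band % len(shades)])
    # Primary bar drawn on top of the bands.
    for i in range(col(actual) + 1):
        cells[i] = "#"
    # Target marker drawn last so it stays visible.
    cells[col(target)] = "|"
    return f"{label:<10}{''.join(cells)}"


# Hypothetical numbers: sales of 70 against a target of 80, bands at 50/75/100.
row = bullet_line("Sales", actual=70, target=80, thresholds=[50, 75, 100], width=21)
print(row)
```

Drawing the bands first, the bar second, and the target last mirrors the visual priority of a bullet graph: the marker and the primary bar should never be hidden by the context behind them.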

Bit Manipulation in SQL Server 2022

Itzik Ben-Gan twiddles some bits:

The need to manipulate data at the bit level with bitwise operations isn’t common in T-SQL, but you might stumble into such a need in some specialized scenarios. Some implementations store a set of flags (yes/no, on/off, true/false) in a single integer or binary-typed column, where each bit represents a different flag. One example is using a bitwise representation of a set of user/role permissions. Another example is using a bitwise representation of a set of settings turned on or off in a given environment. Even SQL Server stores some flag-based data using bitwise representation.

Here’s the deal. I don’t mind that this new syntax exists, particularly because, as Itzik points out, there are areas built into SQL Server which use integers to store bit flags. In application code, however, this gets a sharp “No!” from me in any code review. If you need to decompose values in your table as a matter of course, your table is not in first normal form. Having a table not be in 1NF isn’t the end of the world, but at that point I think the onus is on the developer to defend the violation.
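
For anyone who hasn't worked with packed flags, here is a small Python sketch of both sides of the argument. It does not use the new T-SQL functions; the `Permission` flags, the `decompose` helper, and the user ID are all hypothetical. The first half shows the bitwise style (one integer holding several flags), and `decompose` shows what the same data looks like as one row per fact, which is the shape a table in first normal form would store.

```python
from enum import IntFlag


class Permission(IntFlag):
    """Hypothetical permission flags, one bit each."""
    READ = 1
    WRITE = 2
    DELETE = 4
    ADMIN = 8


def decompose(user_id: int, packed: int):
    """Turn one packed integer into one (user_id, permission) row per set bit."""
    return [(user_id, p.name) for p in Permission if packed & p]


packed = Permission.READ | Permission.DELETE   # stored as the integer 5
has_write = bool(packed & Permission.WRITE)    # bitwise AND tests a single flag
packed_with_write = packed | Permission.WRITE  # bitwise OR sets a flag
rows = decompose(42, int(packed))              # normalized shape: one row per permission
```

With the row-per-permission shape, "who can delete?" is an ordinary equality filter; with the packed integer, every such question needs a bitwise expression against the column.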

Releasing a Tabular Model without Users or Roles

Olivier Van Steenlandt hit a deployment problem:

A couple of weeks ago my team & I ran into an issue with SQL Server Analysis Services (SSAS): due to a network split between companies, we were no longer able to manage access to our SSAS Tabular Model. Since deploying a Tabular Model using Visual Studio also overwrites members & roles, we needed to find a valid alternative to execute our deployments, manually at first and automated in the end.

Read on to see how they used Azure DevOps pipelines to solve the issue.

Careful Batching

Michael J. Swart follows up on an older post:

When I wrote Take Care When Scripting Batches, I wanted to guard against a common pitfall when implementing a batching solution (n-squared performance). I suggested a way to be careful. But I knew that my solution was not going to be universally applicable to everyone else’s situation. So I wrote that post with a focus on how to evaluate candidate solutions.

But we developers love recipes for problem solving. I wish it was the case that for whatever kind of problem you got, you just stick the right formula in and problem solved. But unfortunately everyone’s situation is different and the majority of questions I get are of the form “What about my situation?” I’m afraid that without extra details, the best advice remains to do the work to set up the tests and find out for yourself.

Definitely read the original article first. My normal approach is the naive + index method, so I’ll have to try out Michael’s method as well next time I need to delete a chunk of records.
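
As a rough illustration of the n-squared pitfall, here is a Python sketch using the standard library's `sqlite3` module. It is my own simplification, not the code from either of Michael's posts: the `events` table, its columns, and the `delete_in_batches` function are hypothetical. The idea is to remember the last key examined and resume from there, so each pass covers a fresh key range instead of rescanning from the start of the table.

```python
import sqlite3


def delete_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows with created < cutoff, walking the primary key in ranges.

    Each pass resumes from the last key it examined, so no pass rescans rows
    that earlier passes already covered.
    """
    last_id = 0
    deleted = 0
    while True:
        # Find the upper key bound of the next batch of rows, in key order.
        row = conn.execute(
            "SELECT MAX(id) FROM (SELECT id FROM events WHERE id > ? ORDER BY id LIMIT ?)",
            (last_id, batch_size),
        ).fetchone()
        if row[0] is None:
            return deleted  # no keys left beyond last_id
        upper = row[0]
        cur = conn.execute(
            "DELETE FROM events WHERE id > ? AND id <= ? AND created < ?",
            (last_id, upper, cutoff),
        )
        conn.commit()  # one short transaction per batch
        deleted += cur.rowcount
        last_id = upper


# Hypothetical data: 100 rows, half of them "old" (created < 5).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created INTEGER)")
conn.executemany(
    "INSERT INTO events (id, created) VALUES (?, ?)",
    [(i, i % 10) for i in range(1, 101)],
)
removed = delete_in_batches(conn, cutoff=5, batch_size=30)
remaining = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Whether this beats a simpler approach on your tables is exactly the kind of thing Michael says to test rather than assume.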

dbops PowerShell Module

Kevin Chant looks at a useful PowerShell module:

Before covering the dbops PowerShell module I want to quickly cover DbUp.

DbUp is a .NET library that you can use to do migration-based deployments. It is open-source and is licensed under the MIT license, which you can read about in the DbUp license file.

According to the official list of supported databases, it allows you to do migration-based deployments to various databases, such as SQL Server and MySQL. As you will discover later in this post, it also works with a newer Azure service.

DbUp has been on my to-learn list for a little while, though I haven’t had a chance to dig into it yet.
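
For readers new to the term, the general idea of a migration-based deployment can be sketched in Python with `sqlite3`. This is not DbUp's or dbops's API; the `deploy` function, the `schema_versions` journal table, and the script names are all hypothetical. The sketch assumes the usual pattern: run scripts in a fixed order, record each one once it succeeds, and skip anything already recorded on later runs.

```python
import sqlite3


def deploy(conn, scripts):
    """Apply each named script once, in name order, recording it in a journal table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_versions (script_name TEXT PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT script_name FROM schema_versions")}
    ran = []
    for name in sorted(scripts):
        if name in applied:
            continue  # already deployed to this database
        conn.executescript(scripts[name])
        conn.execute("INSERT INTO schema_versions (script_name) VALUES (?)", (name,))
        conn.commit()
        ran.append(name)
    return ran


# Hypothetical migration scripts, keyed by file name.
scripts = {
    "001_create_customers.sql": "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);",
    "002_add_email.sql": "ALTER TABLE customers ADD COLUMN email TEXT;",
}
conn = sqlite3.connect(":memory:")
first_run = deploy(conn, scripts)   # applies both scripts
second_run = deploy(conn, scripts)  # nothing left to apply
```

The journal is what makes the deployment repeatable: running it twice against the same database is a no-op the second time.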

DAX OFFSET

Marc Lelijveld looks back on things:

Over the past few days, I attended the Power BI Next Step conference in Legoland, Denmark. During the keynote, Will Thompson, PM on the Power BI team, showed a new DAX function that is available to all of us already, but was very well hidden in the latest builds of Power BI Desktop. This new function, called OFFSET, allows us to do in-context comparisons between two values, without writing extremely lengthy and complex DAX.

I gave it a go and in this post I share my first experiences with this new function and how I think this will make our life easier!

This looks a bit like the combination of LAG() and LEAD() in SQL. Marc shows off what’s available now and notes what appears to be forthcoming.
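
To show what I mean by the LAG()/LEAD() comparison, here is a small Python sketch of an offset-style lookup. It is not DAX and makes no claim about OFFSET's actual signature; the `offset_compare` function and the sales figures are hypothetical. A negative `delta` looks back in the ordering and a positive one looks ahead.

```python
def offset_compare(rows, key, value, delta=-1):
    """Pair each row's value with the value `delta` positions away in key order.

    delta=-1 looks one row back (like LAG), delta=+1 one row ahead (like LEAD).
    Rows with no neighbour at that offset get None.
    """
    ordered = sorted(rows, key=lambda r: r[key])
    result = []
    for i, row in enumerate(ordered):
        j = i + delta
        other = ordered[j][value] if 0 <= j < len(ordered) else None
        diff = row[value] - other if other is not None else None
        result.append((row[key], row[value], other, diff))
    return result


# Hypothetical yearly sales, deliberately out of order.
sales = [
    {"year": 2021, "amount": 120},
    {"year": 2020, "amount": 100},
    {"year": 2022, "amount": 150},
]
by_year = offset_compare(sales, "year", "amount", delta=-1)
```

Each tuple is (year, amount, previous amount, difference), so year-over-year change falls out of a single pass over the ordered rows.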
