Press "Enter" to skip to content

Day: January 8, 2024

Explaining Models with Classic Methods and SHAP

Michael Mayer has some ‘splainin to do:

Let’s explain a {tidymodels} random forest by classic explainability methods (permutation importance, partial dependence plots (PDP), Friedman’s H statistics), and also fancy SHAP.

Disclaimer: {hstats}, {kernelshap} and {shapviz} are three of my own packages.

What I really appreciate in here is that Michael includes classic methods here. It can be easy to say “Oh, this is old and therefore no longer relevant.” But that would be quite wrong.

Comments closed

Digging into Execution Plans for LAG() and LEAD()

Hugo Kornelis looks at a pair of useful window functions:

LAG and LEAD were introduced in SQL Server 2012. They require an OVER clause, but it can only specify PARTITION BY and ORDER BY. No ROWS / RANGE specification for a window frame. Which makes them the stand out as unusual in this series.

By default, they return a value from the last row before the current row, or from the first row after the current row, based on the specified sort order and while observing the specified partition boundaries. But there are two optional parameters, an offset to specify that you want, for instance, the third-last row or the second-next row. And the default parameter specifies a value to be used instead of NULL when the indicated row falls outside of the partition.

Click through to see what the plans look like, as well as how very welcome though potentially performance-impacting changes in SQL Server 2022 have affected this.

Comments closed

Against Publishing Power BI Model Changes from PBI Desktop

Soheil Bakhshi has some thoughts:

In a previous post, I shared a comprehensive guide on implementing Incremental Data Refresh in Power BI Desktop. We covered essential concepts such as truncation and load versus incremental load, understanding historical and incremental ranges, and the significant benefits of adopting incremental refresh for large tables. If you missed that post, I highly recommend giving it a read to get a solid foundation on the topic.

Now, let’s dive into Part 2 of this series where we will explore tips and tricks for implementing Incremental Data Refresh in more complex scenarios. This blog follows up on the insights provided in the first part, offering a deeper understanding of how Incremental Data Refresh works in Power BI. Whether you’re a seasoned Power BI user or just getting started, this post will provide valuable information on optimising your data refresh strategies. So, let’s begin.

Read on for plenty of detail, including your available options and how to use them.

Comments closed

Dynamic Parameters in Powershell

Laerte Junior explains how dynamic parameters work in Powershell:

Have you ever been in a situation that you want to call a cmdlet or a function with a parameter that depends on a conditional criteria that is available as a list? In this article I will show a technique where you can use PowerShell Dynamic Parameters to assist the user with parameter values.

In the documentation of Dynamic Parameters found at about_Functions_Advanced_Parameters in get-help it is defined as “parameters of a cmdlet, function, or script that are available only under certain conditions.” And can be created so that appears “only when another parameter is used in the function command or when another parameter has a certain value.” So, we can say that PowerShell Dynamic Parameters are used when the result of a parameter depends on the previous parameter.

Click through for examples.

Comments closed

Stored Procedure Wrapup

Erik Darling wraps up a series on stored procedures. First, cursors and loops:

You will, for better or worse, run into occasions in your database career that necessitate the use of loops and cursors.

While I do spend a goodly amount of time reworking this sort of code to not use loops and cursors, there are plenty of reasonable uses for them.

I do think we push the “don’t use cursors or loops” thing a little too hard in the SQL Server world, but I also think that a majority of cases in which you’re doing something in a loop, you should be doing it in code outside of SQL Server.

Erik then wraps things up for real:

The general idea of the series was to teach developers about the types of things I always seem to be fixing and adjusting, so that I can hopefully fix really interesting problems in the future.

Of course, that all depends on folks finding these and reading them. If that were the general sway of the world, I’d probably never had been in business in the first place.

Click through for a listing of all of the posts in the series.

Comments closed