Press "Enter" to skip to content

Author: Kevin Feasel

Explaining Black Box Models with LIME

Holger von Jouanne-Diedrich takes us through the intuition of LIME:

There is a new hot area of research to make black-box models interpretable, called Explainable Artificial Intelligence (XAI). If you want to gain some intuition on one such approach (called LIME), read on!

Before we dive right into it, it is important to point out when and why you would need interpretability of an AI. While it might be a desirable goal in itself, it is not necessary in many fields, at least not for users of an AI. With text translation, character recognition, and speech recognition, for example, it is not that important why they do what they do, only that they work.

In other areas, like medical applications (determining whether tissue is malignant), financial applications (granting a loan to a customer) or applications in the criminal-justice system (gauging the risk of recidivism) it is of the utmost importance (and sometimes even required by law) to know why the machine arrived at its conclusions.

One approach to make AI models explainable is called LIME for Local Interpretable Model-Agnostic Explanations. There is already a lot in this name!

LIME is not trivial to use and it can be very slow, but it is a great way to visualize models.
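As a rough illustration of the idea (not from Holger's post, which works in R), here is a minimal sketch using the Python lime package against a scikit-learn classifier; the dataset and model choice are arbitrary:

```python
# Minimal LIME sketch: explain one prediction of a black-box classifier.
# Assumes the 'lime' and 'scikit-learn' packages are installed.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

iris = load_iris()
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(iris.data, iris.target)

# LIME perturbs the instance, scores the perturbed samples with the
# black-box model, and fits a simple local surrogate model to the results.
explainer = LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    mode="classification",
)

explanation = explainer.explain_instance(
    iris.data[0],            # the single prediction to explain
    model.predict_proba,     # the black box, used only through its outputs
    num_features=4,
)
print(explanation.as_list())  # (feature, local weight) pairs
```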


Auditing Login Events Using Service Broker

Max Vernon takes us through using Service Broker to audit login events:

Logging to the SQL Server Error Log or the Windows Security Event Log means you’ll need some kind of tool to slice-and-dice the data, postmortem. It’s difficult to respond to events as they happen with this kind of auditing, and hard to create simple T-SQL queries to inspect the data. You could create a login trigger at the server level, but that will only allow you to capture successful logins. Coding the trigger incorrectly can result in everyone being locked out of the server. You’ll need to use the Dedicated Administrator Connection, otherwise known as the DAC, to log in to the server and disable the errant trigger. Not fun.

Luckily, there is a much better option: using SQL Server’s built-in Event Notification service to receive login events through Service Broker. This event stream is asynchronous to the login process, meaning it won’t interrupt or slow down the login process, and it allows you to capture both successful and failed logins to a table, either locally or remotely. For larger SQL Server infrastructures, it’s not uncommon to set up a single SQL Server instance to gather this information for central analysis.

This blog post shows how to set up a database locally for auditing login events via SQL Server Event Notifications and Service Broker.

Click through for a script-heavy post which helps you all the way through the process.
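For a sense of the moving parts before you click through (the full, authoritative script is in Max's post), here is a rough sketch of the server-level event notification setup, driven from Python via pyodbc. All object names here are made up for illustration, and the activation procedure that drains the queue into an audit table is omitted; Max covers that in detail.

```python
# Rough sketch of wiring up a server-level login event notification.
# Assumes a database with Service Broker enabled
# (ALTER DATABASE ... SET ENABLE_BROKER); object names are illustrative.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "DATABASE=AuditDB;Trusted_Connection=yes;",
    autocommit=True,
)

statements = [
    # A queue to hold incoming event messages.
    "CREATE QUEUE dbo.LoginAuditQueue;",
    # A service bound to the built-in event notification contract.
    """CREATE SERVICE LoginAuditService ON QUEUE dbo.LoginAuditQueue
       ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);""",
    # The server-level notification: successful and failed logins are
    # delivered asynchronously to the service above.
    """CREATE EVENT NOTIFICATION LoginAuditNotification ON SERVER
       FOR AUDIT_LOGIN, AUDIT_LOGIN_FAILED
       TO SERVICE 'LoginAuditService', 'current database';""",
]

for sql in statements:
    conn.execute(sql)
```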


2 New Undocumented Items in SQL Server 2019

Solomon Rutzky takes us through a couple of undocumented additions to SQL Server 2019:

SQL Server 2019 introduced, among other things, two new filesystem-related items:

1. a system stored procedure, sys.xp_delete_files(), and
2. an instance-level configuration option, 'allow filesystem enumeration'

Both are undocumented, so let’s see if we can figure out what they do.

Click through to see how both work.
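If you just want to check whether the new configuration option is visible on your instance, a quick query of sys.configurations works; this only shows its presence and current value, not what it controls, which is what Solomon digs into. A small Python-driven sketch, assuming a local default instance:

```python
# Check whether the undocumented 'allow filesystem enumeration' option
# exists on an instance and what it is currently set to.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "Trusted_Connection=yes;"
)
row = conn.execute(
    """SELECT name, value, value_in_use
       FROM sys.configurations
       WHERE name = N'allow filesystem enumeration';"""
).fetchone()

print(row if row else "Option not present (pre-2019 instance?)")
```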


Viewing Power BI Audit and Activity Logs

Jeff Pries gives us the rundown on auditing in Power BI:

When using the cloud-based Power BI Service, powerbi.com, every action taken while logged into the portal, whether it is viewing or publishing a report, creating a new workspace, or even signing up for a pro trial license, is logged within the Microsoft servers as part of the Office 365 audit logs.

Accessing these logs can be accomplished via a couple of different methods: either through the Office 365 Audit Log functionality, using the Office 365 Admin Center or PowerShell cmdlets; or through the new Power BI Activity Log (Power BI Get Activity Events) functionality, accessible via a PowerShell cmdlet (Get-PowerBIActivityEvent) and an API. There are a few examples out there already on how to use these commands to access the data (and I have a post on accessing the data using the Power BI API and C# coming out in a few weeks), but there doesn’t seem to be a lot out there about the data itself, which is what I plan to focus on here.

Read on for more details as well as the structure around a forthcoming application to parse these logs and store them locally in SQL Server.
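Jeff's examples use the Get-PowerBIActivityEvent cmdlet; as a rough Python sketch against the underlying Activity Events REST API (the endpoint, parameter formats, and field names below are from memory rather than from Jeff's post, so treat them as assumptions), pulling one UTC day of events might look like this:

```python
# Rough sketch: pull one UTC day of Power BI activity events from the
# admin Activity Events REST API.  Assumes you already hold an Azure AD
# access token with Power BI admin permissions (via MSAL, the Power BI
# cmdlets, etc.); endpoint and field names are assumptions.
import requests

ACCESS_TOKEN = "<azure-ad-access-token>"
BASE = "https://api.powerbi.com/v1.0/myorg/admin/activityevents"

params = {
    # Both timestamps must fall on the same UTC day and be single-quoted.
    "startDateTime": "'2020-01-20T00:00:00Z'",
    "endDateTime": "'2020-01-20T23:59:59Z'",
}
headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

events, url = [], BASE
while url:
    resp = requests.get(url, headers=headers, params=params)
    resp.raise_for_status()
    body = resp.json()
    events.extend(body.get("activityEventEntities", []))
    url = body.get("continuationUri")  # follow paging until exhausted
    params = None                      # the continuation URI carries its own parameters

print(f"Retrieved {len(events)} activity events")
```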


Add-ClusterNode Error: Keyset Does Not Exist

Jonathan Kehayias troubleshoots a Windows Server clustering problem:

While working on a video recording for Paul this week I ran into an interesting problem with one of my Windows Server 2016 clusters. While attempting to add a new node to the cluster I ran into an exception calling Add-ClusterNode:

The server ‘SQL2K16-AG03.SQLskillsDemos.com’ could not be added to the cluster.
An error occurred while adding node ‘SQL2K16-AG03.SQLskillsDemos.com’ to cluster ‘SQL2K16-WSFC’.

Keyset does not exist

The Windows account I was using was the domain administrator account, and I had just recently made modifications that involved the certificate store on this specific VM, so I decided to take a backup of the VMDK and then revert to a snapshot to try again, and this time it worked. So, needless to say, I was intrigued as to what I could have done that would be causing this error to happen.

Read on to see what the root cause was and how you can fix it.


Generating Synthetic Data with R

Sidharth Macherla uses the conjurer package in R to generate synthetic data:

If you are building data science applications and need some data to demonstrate the prototype to a potential client, you will most likely need synthetic data. In this article, we discuss the steps to generating synthetic data using the R package ‘conjurer’. 

One of the toughest problems of generating data is making it look realistic enough. It’s one level of difficulty to build “steady-state” data, but if you want data to follow a combination of trend and random walk…that’s when things get dicey. H/T R-Bloggers
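To illustrate that trend-plus-random-walk point, here is a small NumPy sketch (deliberately not using conjurer, and with made-up parameters) of what generating a "realistic-looking" series often boils down to:

```python
# Synthetic daily series = trend + seasonality + random walk noise.
# A toy illustration of why realistic data is harder than steady-state data.
import numpy as np

rng = np.random.default_rng(42)
days = np.arange(365)

trend = 100 + 0.25 * days                               # slow upward drift
seasonality = 10 * np.sin(2 * np.pi * days / 7)         # weekly cycle
random_walk = np.cumsum(rng.normal(0, 2, days.size))    # accumulated shocks

sales = trend + seasonality + random_walk
print(sales[:5].round(2))
```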


Dealing with Big Ranges in a Graph

Alex Velez shows how we can work with a particular class of problem:

Today’s post is about a common challenge: when one data series is so large relative to the others that a single scale makes it nearly impossible to see any details. Consider the following line graph. It displays state and local revenue by transportation mode, which I created using data from the Bureau of Transportation Statistics 2018 Report.

Alex has one solution. Another idea could be to change the Y axis to log scale, especially because you’re dealing with money. That would tighten up the series and allow for more information to be displayed on the single graph.
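For what that alternative looks like in code (a generic matplotlib sketch with invented numbers, not Alex's workflow), switching the axis to log scale is a one-liner:

```python
# Plot two series of very different magnitude on one chart by using a
# logarithmic y-axis, per the suggestion above.  Values are illustrative.
import numpy as np
import matplotlib.pyplot as plt

years = np.arange(2000, 2019)
highways = 90_000 + 2_000 * (years - 2000)   # large series
transit = 900 + 40 * (years - 2000)          # small series

fig, ax = plt.subplots()
ax.plot(years, highways, label="Highways")
ax.plot(years, transit, label="Transit")
ax.set_yscale("log")                          # the key change: log-scale y-axis
ax.set_ylabel("Revenue ($M, log scale)")
ax.legend()
plt.show()
```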


Building a Dual-Axis Line Chart in Power BI

Matt Allington shows how you can build a dual-axis line chart in Power BI:

Unfortunately, Power BI does not support a dual axis line chart as a standard visual at this time. The good news, however, is that there is a custom visual called “Multiple Axes chart by xViz” that can do this in Power BI. This visual has been around for a while, but there have been some formatting issues (in my view) that prevented it being a solution to this problem (those are now fixed). I will demonstrate how to set up a dual axis chart using the Adventure Works database and this visual.

Honestly, I’m pretty happy that Power BI does not support a dual-axis line chart. It is the cause of so many instances of spurious correlation that I’d err on the side of not including multiple axes.


Find Installed ODBC Drivers with Powershell

Jack Vamvas answers a question:

Question: I had a question from a developer who was troubleshooting an application-to-SQL Server connection via a DSN using ODBC. They wanted the SQL Server ODBC driver names and platform (32-bit or 64-bit) used to connect. They will use this information to check application compatibility.

Although I can just RDP onto the server and grab the information through the ODBC GUI, how can I use PowerShell to get these ODBC details?

Click through for the answer.
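The answer in the post is PowerShell, but if Python happens to be handy on the box, pyodbc exposes the same driver list; note that it only sees drivers matching the bitness of the Python process:

```python
# List the ODBC driver names visible to this process.  A 64-bit Python
# only sees 64-bit drivers (and vice versa), so run it under both
# bitnesses if you need the full picture.
import pyodbc

for name in pyodbc.drivers():
    print(name)
```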


Wait Stats: Necessary but not Sufficient

Greg Gonzalez explains how wait stats are not the only thing you should look at to determine system health:

Waits and Queues has been used as a SQL Server performance tuning methodology since Tom Davidson published the above article as well as the well-known SQL Server 2005 Waits and Queues whitepaper in 2006. When used in combination with resource metrics, waits can be valuable for assessing certain performance characteristics of the workload and aid in steering tuning efforts. Waits data is surfaced by many SQL Server performance monitoring solutions, and I’ve been an advocate of tuning using this methodology since the beginning. The approach was influential in the design of the SQL Sentry performance dashboard, which presents waits flanked by queues (key resource metrics) to deliver a comprehensive view of server performance.

However, some seem to have missed Davidson’s point regarding the importance of resources and rely almost entirely on waits to present a picture of query performance and system health. Waitstats come directly from the SQL Server engine and are easy to consume and categorize. Waiting queries mean waiting applications and users, and no one likes to wait! From a marketing standpoint this is pure gold for a SQL Server monitoring tools vendor – it is easier to evangelize waits analysis as a singular solution for making queries and applications faster than the full story, which is more involved.

Unfortunately, a waits-focused approach to the exclusion of resource analysis can mislead users, and worst-case leave them flying blind. SentryOne team members Kevin Kline and Steve Wright have previously touched on this here and here. In this post I’m going to take a deeper dive into some recent research made possible by Query Store that has shed new light on how deficient waits-focused tuning can truly be.

Interesting research and Greg does a great job of explaining it.
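If you want to poke at the raw waits yourself while keeping Greg's caveat in mind, sys.dm_os_wait_stats is the usual starting point. A quick Python/pyodbc pull of the top waits might look like the sketch below (connection details and the short list of excluded benign waits are placeholders); remember these numbers are cumulative since instance start and say nothing about resource pressure on their own.

```python
# Pull the top cumulative waits from sys.dm_os_wait_stats.  Waits alone
# don't tell the whole story, so pair this with resource metrics before
# drawing conclusions.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;"
    "Trusted_Connection=yes;"
)
rows = conn.execute(
    """SELECT TOP (10)
              wait_type,
              waiting_tasks_count,
              wait_time_ms,
              signal_wait_time_ms
       FROM sys.dm_os_wait_stats
       WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'BROKER_TASK_STOP')
       ORDER BY wait_time_ms DESC;"""
).fetchall()

for r in rows:
    print(r.wait_type, r.wait_time_ms)
```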
