2022-10-06 – Curated SQL

Variable Definition and Programmatic ggplot2

Published 2022-10-06 by Kevin Feasel

Sebastian Sauer takes us through an interesting scenario.

No lede here because it’s almost 100% code and headers. A quick description of this is that we can see ways to parse columns in an R DataFrame and plot visuals without hard-coding the column name in our plot definition, using a variable instead.

And I had to rewrite the synopsis above because I used the data science term “variable” until hitting a wall when describing the programming term “variable.”

Comments closed

Debugging Code in R

Published 2022-10-06 by Kevin Feasel

Cosima Meyer explains how debugging works in R with RStudio:

Three basic commands in RStudio let you do the debugging: debug(function_name), browser(), and undebug(function_name).
With debug(function_name) you start the debugging of your function – it’s basically like a mole that digs in. When you’re in debug mode, you can also call the objects in your function.

Read the whole thing to learn the power of debugging beyond the print() statement. H/T R-Bloggers.

Comments closed

Kernel SHAP in R and Python

Published 2022-10-06 by Kevin Feasel

Michael Mayer and Christian Lorentzen team up:

SHAP is one of the most used model interpretation technique in Machine Learning. It decomposes predictions into additive contributions of the features in a fair way. For tree-based methods, the fast TreeSHAP algorithm exists. For general models, one has to resort to computationally expensive Monte-Carlo sampling or the faster Kernel SHAP algorithm. Kernel SHAP uses a regression trick to get the SHAP values of an observation with a comparably small number of calls to the predict function of the model. Still, it is much slower than TreeSHAP.

Read on to see how to do this in both R and Python. With libraries the way they are, the code is very similar and the results are basically the same.

Comments closed

Debouncing RMarkdown Input

Published 2022-10-06 by Kevin Feasel

Thomas Williams waits for the keystroke:

This R Markdown snippet demonstrates “debouncing”: waiting until a user stops changing an input, before updating dependent charts and tables. Debouncing is often used in web sites to prevent the user interface “jumping” as data is being entered, especially when the update takes a noticeable amount of time – for instance calling an API or database, or doing a calculation.

Read on to see an abridged example, as well as a link to the full version.

Comments closed

Diagnosing High CPU on SQL Server when It’s SQL Server’s Fault

Published 2022-10-06 by Kevin Feasel

Ajay Dwivedi continues a series on high CPU utilization:

In the last blog post Live Troubleshooting High CPU On SQL Server – Part 1, we worked on a scenario where we saw a high CPU on SQL Server due to some external OS level task. In this blog, we are going to explore a scenario where a high CPU issue is present because of the workload running on SQL Server.
Just like in the last blog post scenario, when I have to troubleshoot a “slow” SQL Server, if my SQL Server is baselined with the SQLMonitor tool, then I first visit my Monitoring – Live – All Servers dashboard which displays all the metrics of specific SQL Servers that need DBA help.

As normal, we see a scenario with SQLMonitor, as well as other options including sp_BlitzFirst.

Comments closed

Measure Killer: A Power BI Tool

Published 2022-10-06 by Kevin Feasel

Matt Allington reviews a tool:

This week I am sharing a brand new third party Power BI external tool called Measure Killer created by Gregor Brunner. At this writing, the tool is still in beta, so make sure you backup your Power BI workbooks before using this tool. In fact, that is good advice for all Power BI authoring, regardless of what you are doing.

Read on for Matt’s review of the product as well.

Comments closed

Data Virtualization in SQL Server 2022

Published 2022-10-06 by Kevin Feasel

Hugo Queiroz provides an overview of data virtualization options in SQL Server 2022:

SQL Server 2022 now supports CSV, Parquet, and Deltafiles stored on Azure Storage Account v2, Azure Data Lake Storage Gen2, or any simple storage service (S3)–compliant object storage—the last as an on-premises offering or in the cloud. Finally, SQL Server 2022 can now use Create External Table as Select (CETAS), together with commands like OPENROWSET, Create External Table (CET), and all the new T-SQL enhancements. SQL Server 2022 is a powerful data hub.

The post doesn’t get too deep into the topic, though a search here will find you links to articles with concrete examples.

Comments closed

Lock Escalation Thresholds and Parallelism

Published 2022-10-06 by Kevin Feasel

Paul White complicates matters:

In part one of this series, I explained how the lock manager maintains a count of held locks per transaction, and individual access methods keep a separate lock count per HoBt for each plan operator performing data access.
Parallelism adds an extra layer of complexity because both lock counts are maintained per thread.
There are two underlying reasons for this:

Paul explains the reasons and shows off plenty of tests along the way.

Comments closed

Tracking Power BI Desktop Activity in SQL Server

Published 2022-10-06 by Kevin Feasel

Chris Webb looks in on things:

Something I do all the time when performance tuning Power BI is use SQL Server Profiler to monitor query and refresh activity. There’s a handy external tool that lets you open up Profiler with a connection to Power BI Desktop; if you’re using Power BI Premium you can also connect Profiler up to a published dataset using the XMLA Endpoint. Profiler is a bit old-school though and likely to lead to snarky comments from SQL Server DBAs (it’s not deprecated for Analysis Services though!) who’ll say you should be using Extended Events (aka xEvents) instead. And guess what – did you know you can use Analysis Services xEvents to monitor activity in Power BI Desktop?

It’s the power of Extended Events.

Comments closed

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28	29	30
31

Day: October 6, 2022