
Author: Kevin Feasel

Cumulative Costs and Scale

Michael J. Swart talks about costs:

Think of this another way. The triangle above is a graph of the amount of data you have over time. If you pay for storage as an operational expense, such as when you’re renting storage in the cloud (as opposed to purchasing physical drives), then the cost of storage is the area of the graph. The monthly bills are ever-increasing, so half of the total cost of storage will be for the most recent 29%.

Put yet another way: If you started creating a large file in the cloud every day since March 2014, then the amount you paid to the cloud provider before the pandemic started is the same amount you paid after the pandemic started (as of August 2022).

I was told there would be no math here.
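For anyone who wants the math anyway, here is a minimal Python sketch of where the 29% comes from. It assumes data grows linearly and that each bill is proportional to the data stored so far, so cumulative cost grows with the square of elapsed time. The function names are hypothetical, the date example is simplified to one bill per month, and March 2020 is my assumption for when the pandemic started.

```python
import math
from datetime import date


def recent_fraction_for_half_cost():
    """Fraction of the timeline (the most recent part) that accounts for half the total cost.

    With linear growth, cumulative cost up to fraction t of the timeline is
    proportional to t**2, so half the total is reached at t = sqrt(1/2).
    """
    return 1 - math.sqrt(0.5)


def months_between(start, end):
    """Whole calendar months from start to end."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def cumulative_cost(months, cost_per_unit=1.0):
    """Total paid after `months` bills, where bill m covers m units of stored data."""
    return sum(m * cost_per_unit for m in range(1, months + 1))


fraction = recent_fraction_for_half_cost()

# Dates from the quote; the pandemic start month is an assumption.
start = date(2014, 3, 1)
pandemic = date(2020, 3, 1)
as_of = date(2022, 8, 1)

paid_before = cumulative_cost(months_between(start, pandemic))
paid_total = cumulative_cost(months_between(start, as_of))
share_before = paid_before / paid_total

print(f"Most recent fraction holding half the cost: {fraction:.3f}")
print(f"Share of total paid before the pandemic: {share_before:.3f}")
```

Under these assumptions the share paid before March 2020 comes out at roughly half, which matches the claim in the quote.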


SQL Server Non-Vulnerabilities

Sean Gallardy has an A+++ 10/10 would read again rant:

I get asked if I know anything about <newest SQL vulnerability as reported by random website>, quite often. Generally, my answer is that I don’t for two main reasons… the first being that none of them are actual vulnerabilities, and the second being that none of them are particularly new but merely items from the same bag of tricks everyone else uses and isn’t a buffer overrun/privilege escalation/etc. item. My normal response after taking a quick peek at whatever article is referenced is generally the same response as The Dude, “Yeah, well, you know, that’s just like uh, your opinion, man.”, as all of these items are purported to be vulnerabilities but yet none actually exploit any vulnerability.

Did you know that if you steal someone’s username and password from the sticky note on their monitor, you can use that to connect to a SQL Server? Amazing vulnerability there—it doesn’t even check that you’re the real person who should have those credentials!


Direct Permission is Just the Start

Kenneth Fisher has access to many permissions:

What you have access to is not just what you have direct permissions to. The other day I needed to copy some backups from one location to another. Unfortunately my network id doesn’t have access to either location. Guess what does though. The service account running the SQL Server instance where the backups were taken. Now, since I’m a sysadmin on that instance when I use xp_cmdshell it uses that service account. I don’t have to know the password or log in as the service account, xp_cmdshell will do it for me.

Click through to learn more.


Debugging Code in R

Cosima Meyer explains how debugging works in R with RStudio:

Three basic commands in RStudio let you do the debugging: debug(function_name), browser(), and undebug(function_name).

With debug(function_name) you start the debugging of your function – it’s basically like a mole that digs in. When you’re in debug mode, you can also call the objects in your function.

Read the whole thing to learn the power of debugging beyond the print() statement. H/T R-Bloggers.


Variable Definition and Programmatic ggplot2

Sebastian Sauer takes us through an interesting scenario.

No lede here because it’s almost 100% code and headers. A quick description of this is that we can see ways to parse columns in an R DataFrame and plot visuals without hard-coding the column name in our plot definition, using a variable instead.

And I had to rewrite the synopsis above because I used the data science term “variable” until hitting a wall when describing the programming term “variable.”


Debouncing RMarkdown Input

Thomas Williams waits for the keystroke:

This R Markdown snippet demonstrates “debouncing”: waiting until a user stops changing an input, before updating dependent charts and tables. Debouncing is often used in web sites to prevent the user interface “jumping” as data is being entered, especially when the update takes a noticeable amount of time – for instance calling an API or database, or doing a calculation.

Read on to see an abridged example, as well as a link to the full version.
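To make the idea concrete outside of R Markdown, here is a minimal Python sketch of debouncing. It is not Thomas's code: the Debouncer class and its trigger method are hypothetical names, and the approach (restarting a threading.Timer on every change) is just one simple way to wait until input settles.

```python
import threading
import time


class Debouncer:
    """Run callback(value) only after wait_seconds pass with no new trigger."""

    def __init__(self, wait_seconds, callback):
        self.wait_seconds = wait_seconds
        self.callback = callback
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self, value):
        with self._lock:
            # A new change arrived: cancel the pending update and restart the wait.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(
                self.wait_seconds, self.callback, args=(value,)
            )
            self._timer.daemon = True
            self._timer.start()


# Simulate a user typing quickly, then pausing.
updates = []
debounced = Debouncer(0.2, updates.append)
for text in ["a", "ab", "abc"]:
    debounced.trigger(text)
    time.sleep(0.01)  # keystrokes arrive faster than the debounce window

time.sleep(0.5)  # the user stops typing; the pending update fires once
print(updates)
```

Only the last value reaches the callback, so an expensive step such as an API call or database query would run once instead of three times.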


Diagnosing High CPU on SQL Server when It’s SQL Server’s Fault

Ajay Dwivedi continues a series on high CPU utilization:

In the last blog post Live Troubleshooting High CPU On SQL Server – Part 1, we worked on a scenario where we saw a high CPU on SQL Server due to some external OS level task. In this blog, we are going to explore a scenario where a high CPU issue is present because of the workload running on SQL Server.

Just like in the last blog post scenario, when I have to troubleshoot a “slow” SQL Server, if my SQL Server is baselined with the SQLMonitor tool, then I first visit my Monitoring – Live – All Servers dashboard which displays all the metrics of specific SQL Servers that need DBA help.

As normal, we see a scenario with SQLMonitor, as well as other options including sp_BlitzFirst.


Kernel SHAP in R and Python

Michael Mayer and Christian Lorentzen team up:

SHAP is one of the most used model interpretation techniques in Machine Learning. It decomposes predictions into additive contributions of the features in a fair way. For tree-based methods, the fast TreeSHAP algorithm exists. For general models, one has to resort to computationally expensive Monte-Carlo sampling or the faster Kernel SHAP algorithm. Kernel SHAP uses a regression trick to get the SHAP values of an observation with a comparably small number of calls to the predict function of the model. Still, it is much slower than TreeSHAP.

Read on to see how to do this in both R and Python. With libraries the way they are, the code is very similar and the results are basically the same.
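If the "additive contributions" part sounds abstract, here is a small Python sketch that computes exact Shapley values by brute force for a toy model. To be clear, this is not Kernel SHAP and not the authors' code: it enumerates every feature coalition, which is only feasible for a handful of features, and that cost is the reason approximations like Kernel SHAP exist. The toy model, the single all-zeros baseline row, and the function names are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to one baseline row."""
    n = len(x)

    def value(subset):
        # Features in the coalition take their observed value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(row):
    """Hypothetical model with one main effect and one interaction."""
    a, b, c = row
    return 2 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)

print("Contributions:", phi)
print("Sum of contributions:", sum(phi))
print("Prediction minus baseline:", toy_model(x) - toy_model(baseline))
```

The contributions add up to the gap between the prediction and the baseline prediction, and the interaction term b * c is split evenly between b and c, which is the "fair" part of the decomposition.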


Data Virtualization in SQL Server 2022

Hugo Queiroz provides an overview of data virtualization options in SQL Server 2022:

SQL Server 2022 now supports CSV, Parquet, and Delta files stored on Azure Storage Account v2, Azure Data Lake Storage Gen2, or any simple storage service (S3)–compliant object storage, the last as an on-premises offering or in the cloud. Finally, SQL Server 2022 can now use Create External Table as Select (CETAS), together with commands like OPENROWSET, Create External Table (CET), and all the new T-SQL enhancements. SQL Server 2022 is a powerful data hub.

The post doesn’t get too deep into the topic, though a search here will find you links to articles with concrete examples.
