Press "Enter" to skip to content

Author: Kevin Feasel

Monitoring for Distribution Changes

Nina Zumel explains how we can track if something has changed by monitoring its distribution:

A client recently came to us with a question: what’s a good way to monitor data or model output for changes? That is, how can you tell if new data is distributed differently from previous data, or if the distribution of scores returned by a model have changed? This client, like many others who have faced the same problem, simply checked whether the mean and standard deviation of the data had changed more than some amount, where the threshold value they checked against was selected in a more or less ad-hoc manner. But they were curious whether there was some other, perhaps more principled way, to check for a change in distribution.

The answer is, of course, that there is. Click through to see a few of the techniques.

Comments closed

SELECT * Doesn’t Keep Views up to Date

Reitse Eskens busts a myth:

Last week i read a comment or a blog somewhere (i can’t remember it anymore but please let me know so i can credit!), that a select * in a view gets a full definition under the covers and doesn’t get updated when the underlying table is updated.

So, i decided to take that for a test, see what i can reproduce.

Spoilers: it doesn’t. Click through for the proof, as well as what does update a view’s definition.

Comments closed

Handling Azure SQL Database Scale Changes

Arun Sirpal shows us how to handle scaling events in Azure SQL Database:

For some reason I have friends / colleagues telling me that when scaling (up and down for this example) that no downtime occurs. Well, not only does Microsoft documentation say differently, I will show it. So let’s test it out. Before the practical test, this is the official stance. “There is a switch over period where connectivity is lost to the database for a short amount of time, which can be mitigated using retry logic”.

Retry logic is an important part of any application. We tend to forget about it with on-prem applications talking to on-prem databases, but that’s a mistake.

Comments closed

SQL Agent Job History Limits

Greg Dodd shows how you can control the number of history records for SQL Agent jobs:

Looking at the Job Activity Monitor, I could see that the Last Run was Saturday morning, and the output of the job was there, I knew that the job was running, but when viewing the Agent History, it showed no history.

Digging further, I noticed that it wasn’t just the history of this job missing, but almost all jobs were missing their history or only had 1 or two runs in them.

Read on to find out what happened. Setting Agent job history limits is important, but you also have to leave enough records on there to diagnose issues.

Comments closed

Guids in Persisted Calculated Columns

Slava Murygin shows us some odd things which occur when you try to make a persisted calculated column out from a UNIQUEIDENFITIER data type:

This post is for you in case you decide to use Uniqueidentifier column in your table and then you think about including it into a Persisted calculated column.
You also might see it useful if you like weird or funny SQL Server behavior.

Slava has an Azure feedback item and it looks like someone tested the behavior in SQL Server 2019 and it works as you’d expect, so this must have been fixed sometime between then and now.

Comments closed

SSMS Tips

Taiob Ali has a few tips for us around using SQL Server Management Studio:

Opening SSMS as a different user

I have two Active Directory Accounts. Once I use to login to my workstation and everything except working inside Databases. For that, I needed to open the SQL Server Management Studio (SSMS) with my admin account. I created an instance of runas application on my desktop to do that. Saves me a few clicks and type in my user name and password every time I launch SSMS. Text of the target.

Click through to see how that works. This is quite useful when you’re working in multiple environments or across multiple domains, especially if you slap on the /netonly parameter.

Comments closed

Copying Measures between Power BI Datasets

Tomaz Kastrun walks us through copying measures between datasets and files in Power BI:

Measures in Power BI document are DAX based aggregations for slicing and dicing the tabular datasets using evaluate and filter context and and data relationships. Short hand, measures are DAX formulas stored in Power BI as part the dataset.

For showing how easy (once you know) is to copy measures between different dataset, I have created a sample Power BI with diamonds dataset (also available on Github).

I’d call the process convoluted. It does work, but it seems like Power BI should have an easier method.

Comments closed

Changes in the R foreach Package

Hong Ooi announces some changes to the foreach package in R:

This post is to announce some new and upcoming changes in the foreach package.

First, foreach can now be found on GitHub! The repository is at https://github.com/RevolutionAnalytics/foreach, replacing its old home on R-Forge. Right now the repo hosts both the foreach and iterators packages, but that may change later.

There are also some changes to the package itself, so read on for those.

Comments closed

Building a Cache in ksqlDB

Michael Drogalis shows how to build a materialized cache to reduce the load on your Kafka Streams servers:

There are a lot of ways that you can introduce a materialized cache into your architecture. One such way is to leverage ksqlDB, an event streaming database purpose-built for stream processing applications. With native Kafka integration, ksqlDB makes it easy to replicate the pattern of scaling out many sets of distributed caches.

Let’s look at how this works in action with an example application. Imagine that you have a database storing geospatial data of pings from drivers at a ridesharing company. You have a particular piece of logic that you want to move out of the database—a frequently run query to aggregate how active a territory is. You can build a materialized cache for it using ksqlDB.

The tutorial starts you from “grab the Docker container” and takes you through the process.

Comments closed

SQL Server Life Hacks

Kevin Hill has a few life hacks for us:

Since I am an independent consultant here under the Dallas DBAs name, I am frequently asked to come in and quickly find out why the server is so slow. Sometimes by existing customers, sometimes out of nowhere. Many of these times I cannot make any “permanent” or “lasting” changes, so my go-to is to run sp_whoisactive as a temporary stored procedure (code in link). Works the same, nothing left behind, no corporate policies violated.

Read on for another useful tip.

Comments closed