2020-01-03 – Curated SQL

The confusion matrix is perhaps the most important thing to look at when evaluating a classification model. It contains a large amount of insight for such a small table. Despite its name, the confusion matrix is actually quite simple. It is a matrix that visualises the count of actual class instances against predicted class instances. This allows you to quickly see the number of correct and incorrect predictions for each category, and whether any bias exists, and if so, where it is.

The example is specifically around Azure ML, but applies across the board. I think people get a little bit too hung up on accuracy and forget about important measures like positive and negative predictive value.
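
To make the point about predictive value concrete, here is a minimal sketch in plain Python rather than Azure ML. The labels are made up for illustration, and the helper names (`confusion_counts`, `summarize`) are hypothetical. It counts the four cells of a two-class confusion matrix and derives accuracy, positive predictive value (PPV), and negative predictive value (NPV) from them.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive):
    """Count TP, FP, FN, TN, treating one class label as 'positive'."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")

def summarize(actual, predicted, positive):
    """Derive accuracy, PPV, and NPV from the confusion-matrix cells."""
    c = confusion_counts(actual, predicted, positive)
    total = sum(c.values())
    return {
        "accuracy": ratio(c["tp"] + c["tn"], total),
        "ppv": ratio(c["tp"], c["tp"] + c["fp"]),  # of predicted positives, share correct
        "npv": ratio(c["tn"], c["tn"] + c["fn"]),  # of predicted negatives, share correct
    }

# Made-up labels: 10 cases, 2 truly positive; the model predicts "pos" twice.
actual = ["pos", "pos"] + ["neg"] * 8
predicted = ["pos", "neg"] + ["neg"] * 7 + ["pos"]

metrics = summarize(actual, predicted, positive="pos")
print(metrics)
```

With these made-up labels, accuracy looks respectable while only half of the positive predictions are right, which is the kind of gap the confusion matrix exposes.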

Time Series + Power BI Aggregations

Published 2020-01-03 by Kevin Feasel

Shabnam Watson answers a couple of questions around aggregations and time series in Power BI:

I have received a couple of questions about Aggregations in Power BI and whether they can be used to cover time series calculations such as Year to Date, Quarter to Date, and Month To Date. The answer is yes. Since time series calculations break down into calculations over a series of days, an aggregation table defined at day level with the basic summarization methods (min, max, sum, count) and the right relationship with a Date dimension, can answer Year to Date, Quarter to Date, and Month To Date calculations.
Let’s take a quick look at one such calculation and how it can be covered with an aggregation. I am going to use the same version of AdventureWorks sample database and Power BI model that I used in my previous blog post on aggregations, with a few changes.

Read on for a demonstration.
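
The reasoning in the excerpt (a year-to-date total is just a sum over days) can be sketched outside Power BI. The following is a plain-Python illustration, not Power BI or DAX: the rows, the `build_daily_aggregate` helper, and the `year_to_date` helper are all hypothetical. It shows that a day-level table holding sums and counts gives the same year-to-date answer as the detail rows.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detail rows: (order date, sales amount).
detail_rows = [
    (date(2019, 12, 31), 50.0),
    (date(2020, 1, 1), 100.0),
    (date(2020, 1, 1), 25.0),
    (date(2020, 1, 2), 40.0),
    (date(2020, 2, 1), 10.0),
]

def build_daily_aggregate(rows):
    """Collapse detail rows to one sum and one count per day."""
    agg = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for day, amount in rows:
        agg[day]["sum"] += amount
        agg[day]["count"] += 1
    return dict(agg)

def year_to_date(daily, as_of):
    """Add up daily sums from January 1 of as_of's year through as_of."""
    start = date(as_of.year, 1, 1)
    return sum(v["sum"] for d, v in daily.items() if start <= d <= as_of)

daily = build_daily_aggregate(detail_rows)
as_of = date(2020, 1, 2)

ytd_from_aggregate = year_to_date(daily, as_of)
# Same calculation straight from the detail rows, for comparison.
ytd_from_detail = sum(
    amount for day, amount in detail_rows if date(2020, 1, 1) <= day <= as_of
)
print(ytd_from_aggregate, ytd_from_detail)
```

The day grain matters here: a table aggregated to month level could not answer a year-to-date question that ends mid-month.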

Drillthrough in Power BI

Published 2020-01-03 by Kevin Feasel

Cecilia Brusatori takes us through Power BI’s drillthrough capabilities:

The drillthrough feature in Power BI will let you go into more detail about a specific column in a visualization.
In its simplest form, you can enable a feature that lets you right-click a visual containing the specified column and takes you to a different page in your report, where you have created more content to provide further detail on that same column.

Click through for an example.

Determining if a UDF Call Inlined

Published 2020-01-03 by Kevin Feasel

Taiob Ali helps us figure out when SQL Server inlines your scalar UDF:

Starting with SSMS 18.2, a new attribute was added to QueryPlan. When the inline scalar UDF feature is enabled, the ‘ContainsInlineScalarTsqludfs’ value will be true. Let’s look at this in action. Run the T-SQL below using the latest version of SSMS, after turning on the actual execution plan.

Read on to see it in action.
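
If you save an execution plan as XML, the same check can be scripted. This is a rough sketch using only Python's standard library, under a few assumptions: the plan XML is already in a string, the attribute sits on a `QueryPlan` element as the excerpt describes, and the name is matched case-insensitively. The sample fragment is hand-written and heavily trimmed for illustration; it is not real SSMS output, and `plan_inlined_udf` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Attribute name from the excerpt, lowercased for a case-insensitive match.
TARGET_ATTRIBUTE = "containsinlinescalartsqludfs"

def plan_inlined_udf(showplan_xml):
    """Return True if any QueryPlan element carries the attribute set to true."""
    root = ET.fromstring(showplan_xml)
    for element in root.iter():
        # Strip the XML namespace, if any, to get the bare element name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "QueryPlan":
            continue
        for name, value in element.attrib.items():
            if name.lower() == TARGET_ATTRIBUTE and value.lower() in ("true", "1"):
                return True
    return False

# Hand-written, trimmed fragments for illustration only (not real plan output).
inlined_sample = """<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <QueryPlan ContainsInlineScalarTsqlUdfs="true" />
</ShowPlanXML>"""

plain_sample = """<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <QueryPlan />
</ShowPlanXML>"""

print(plan_inlined_udf(inlined_sample), plan_inlined_udf(plain_sample))
```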

Fun With Database Names

Published 2020-01-03 by Kevin Feasel

Jason Brimhall takes us through database names you shouldn’t use:

Let’s figure we have a requirement to create a database with sensitive data. Due to the sensitivity of the data, it is classified confidential (for your eyes only, don’t talk about it and plug your ears if somebody starts talking about it). This is so sensitive that an apt name for the database could be anything like 🙈 or 🙉 or 🙊. Being smart, you know there are two more databases coming down the line so you only want to pick one of those for the name and not all three (though all three could make sense for a single database name).

Just because you can doesn’t mean you should…

Gap and Island Analysis

Published 2020-01-03 by Kevin Feasel

Ed Pollack covers a topic of importance for database developers:

Within a data set, an island of data is any ordered sequence where each row is in close proximity to the rows around it. For some data types and analysis, “close proximity” will mean consecutive. Dates, integers, and letters of the alphabet can be ordered sequentially where two adjacent values will not be able to have additional values in between them.
For example, there are no dates between October 23rd and October 24th. Similarly, there are no integers between 17 and 18 and no English letters between E and F. For these examples, an island of data could be defined as a sequence of consecutive values. A gap can be defined as a sequence of missing values.

There are a lot of difficult problems which gap & island analysis makes much easier by pivoting the way you think about the problem.
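
To illustrate the definitions in the excerpt, here is a small Python sketch for the integer case. It is not the T-SQL approach from the linked article; the input values and the `find_islands` and `find_gaps` helpers are made up for this example. An island is a run of consecutive integers, and a gap is the run of missing values between two islands.

```python
def find_islands(values):
    """Return (start, end) pairs for each run of consecutive integers."""
    islands = []
    for v in sorted(set(values)):
        if islands and v == islands[-1][1] + 1:
            islands[-1][1] = v  # extend the current island by one value
        else:
            islands.append([v, v])  # start a new island
    return [tuple(island) for island in islands]

def find_gaps(islands):
    """Return (start, end) pairs for the missing values between islands."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(islands, islands[1:])
    ]

# Made-up sample data.
values = [1, 2, 3, 7, 8, 12, 17, 18]

islands = find_islands(values)
gaps = find_gaps(islands)
print(islands)
print(gaps)
```

Note that the gaps fall out of the islands for free: once the boundaries of each run are known, the missing ranges are just the spaces between adjacent runs.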

Fun With Secure Enclaves

Published 2020-01-03 by Kevin Feasel

Ned Otter continues a series on SQL Server 2019 Always Encrypted with Secure Enclaves:

In the first post of this series, we explored the requirements for using Always Encrypted with secure enclaves, as well as some of the limitations.
For this post, we’ll be using PowerShell to install and configure the HGS server (required for “attestation”) as well as executing the steps required to configure the SQL 2019 server to work with HGS.

Read on for a few disclaimers and a detailed setup article.

Day: January 3, 2020

Evaluating Classification Models

Time Series + Power BI Aggregations

Drillthrough in Power BI

Determining if a UDF Call Inlined

Fun With Database Names

Gap and Island Analysis

Fun With Secure Enclaves