Press "Enter" to skip to content

February 21, 2022

Getting Started with the Databricks Feature Store

Gavita Regunath gives us an introduction to a useful Databricks feature:

Databricks announced the launch of the Databricks Feature Store last year, in May 2021. It is the first of its kind that has been co-designed with Delta Lake and MLflow to accelerate ML deployments.

In this article, we will leverage Databricks Feature Store to store features, create a training dataset by looking up relevant features, and subsequently train an ML model. Follow this step-by-step guide to get started on Databricks Feature Store.

Click through to learn more.
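
As a minimal sketch of the workflow the article walks through, assuming a Databricks cluster with the databricks-feature-store client available; the table, column, and feature names below are hypothetical, not taken from the article:

    from databricks.feature_store import FeatureStoreClient, FeatureLookup

    fs = FeatureStoreClient()

    # Engineered features keyed by customer_id (made-up example data).
    # "spark" is the SparkSession that Databricks notebooks provide implicitly.
    features_df = spark.createDataFrame(
        [(1, 250.0, 4), (2, 980.5, 12)],
        ["customer_id", "total_spend", "order_count"],
    )

    # Register the DataFrame as a feature table in the Feature Store.
    fs.create_table(
        name="demo.customer_features",    # hypothetical schema.table name
        primary_keys=["customer_id"],
        df=features_df,
        description="Example customer features",
    )

    # Build a training set: features are looked up by the shared key and
    # joined to a label DataFrame, ready for model training.
    labels_df = spark.createDataFrame([(1, 0), (2, 1)], ["customer_id", "churned"])
    training_set = fs.create_training_set(
        df=labels_df,
        feature_lookups=[
            FeatureLookup(
                table_name="demo.customer_features",
                feature_names=["total_spend", "order_count"],
                lookup_key="customer_id",
            )
        ],
        label="churned",
    )
    training_df = training_set.load_df()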

Fun with XESmartTarget

Gianluca Sartori shows off a useful project with a new series of posts:

Some time ago, I started a project called XESmartTarget. I find it super useful and you should probably know about it. It’s totally my fault if you’re not using it, and I apologize for all the pain it could have saved you but didn’t, because I did not promote it enough.

Now I want to remedy my mistake with a 10-day series of blog posts on XESmartTarget, which will show you how useful it can be and how you can use it to accomplish your daily DBA tasks using Extended Events.

In this first post of the series, I will introduce XESmartTarget and show how it works and how to configure it. Each day for the next 10 days, I will publish a post showing you how to solve a specific problem using XESmartTarget. Let’s go!

Click through to get off to a good start.

Storage Pools and Volumes

John Morehouse illuminates us on storage:

I think there are a couple of lines of thought related to this. I’m one person with a NAS, so I don’t need multiple volumes. I can certainly get by with a single volume on each storage pool, and this will simplify management of things.

If you were working with enterprise-grade storage in a corporate environment, having multiple volumes makes sense. I think of this as carving up disk space for production SQL Servers, where each drive letter corresponds to a given volume which resides on a given storage pool. A volume can serve multiple folders.

You know a blog post is going to be good when it starts with “In hindsight, I should have done this differently.”

Goals in Power BI

Gogula Aryalingam takes us through Power BI goals:

The feature is currently in preview, introduced some 8 months ago, and has quite a lot of promise. For me, it is particularly exciting since I am working with a large customer, who is a perfect candidate to implement goals for. So, what is Goals in Power BI?

Let us take a quick scenario first: Organizations regularly (if not frequently) monitor indicators of their business performance to ensure their goals and aspirations are met. Sometimes these aspirations are difficult to keep track of due to various complexities. Consider a goal called Reduce employee turnover and increase satisfaction (something that I picked up from here). To effectively understand and track its progress, the organization would probably have a few key performance indicators (KPIs) that make it easy to look at reducing employee turnover and increasing satisfaction objectively. One such KPI could be a low human capital Turnover Rate, while another could be a high Employee Satisfaction Indicator. Collectively, these KPIs will help determine the achievement of the goal within a stipulated period (such as a calendar year). Similarly, an organization will have many goals that are aligned to organizational KPIs or metrics. Sometimes, certain KPIs/metrics may cascade down the organization’s departments, where each department’s performance determines the overall organizational performance.

Read on to see how Goals work and one use case involving KPIs.
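
To make the KPI-versus-target arithmetic in that scenario concrete, here is a small Python sketch with made-up numbers; note that the turnover KPI counts as on track when it falls below its target, while the satisfaction KPI must rise above its target:

    # Hypothetical KPI values and targets, purely for illustration.
    # Turnover is "lower is better"; satisfaction is "higher is better".
    kpis = [
        {"name": "Turnover Rate (%)",     "value": 12.0, "target": 10.0, "lower_is_better": True},
        {"name": "Employee Satisfaction", "value": 8.2,  "target": 8.0,  "lower_is_better": False},
    ]

    for kpi in kpis:
        if kpi["lower_is_better"]:
            on_track = kpi["value"] <= kpi["target"]
        else:
            on_track = kpi["value"] >= kpi["target"]
        status = "on track" if on_track else "behind"
        print(f"{kpi['name']}: {kpi['value']} vs. target {kpi['target']} -> {status}")

    # The goal "Reduce employee turnover and increase satisfaction" is met
    # only when all of its constituent KPIs are on track.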

Table-Valued Functions and Dynamic M Parameters

Chris Webb uses dynamic M parameters:

My favourite – and it seems many other people’s favourite – new feature in the February 2022 Power BI Desktop release is support for more datasources (including SQL Server, Azure SQL DB and Synapse) with dynamic M parameters. In my opinion dynamic M parameters are extremely important for anyone planning to use DirectQuery: they give you a lot more control over the SQL that is generated by Power BI and therefore give you a lot more control over query performance.

Teo Lachev has already stolen my thunder and blogged about how the new functionality allows you to use a TSQL stored procedure as the source of a table in DirectQuery mode. In this post I’m going to show you something very similar – but instead of using a stored procedure, I’m going to show a simple example of how to use a TSQL table-valued function, which I think has a slight advantage in terms of ease-of-use.

Leaving aside thoughts on table-valued functions in general, dynamic M parameters look like a really nice feature and, as Chris notes, they also work for things like stored procedures.

Multi-Column Transformations in Power Query

Imke Feldmann has the need for speed:

In this article I’m going to present a method for transforming multiple columns at once in a fast way. This method also allows you to reference columns that already exist in your table. As I have described in a previous article, this cannot be done using the native Table.TransformColumns function that is applied when you do column transformations using the UI in Power Query. The function I am sharing here allows you to enter a list of column names to be transformed and a function that defines the transformation itself. So you have to be familiar with defining custom functions to use this approach.

Click through for Imke’s function and explanation but also check out the comments for another take on the problem.
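
Imke’s function is written in Power Query M, but the underlying idea travels well. As a rough analog only (this is not Imke’s code), here is a Python/pandas sketch of transforming a list of columns with one function that can still reference other existing columns:

    import pandas as pd

    def transform_columns(df, columns, fn):
        # Apply fn to each listed column; fn receives the whole row,
        # so the transformation can reference other existing columns.
        out = df.copy()
        for col in columns:
            out[col] = df.apply(lambda row: fn(row, col), axis=1)
        return out

    df = pd.DataFrame({"a": [1, 2], "b": [10, 20], "factor": [2, 3]})
    # Scale columns "a" and "b" by the existing "factor" column in one call.
    result = transform_columns(df, ["a", "b"], lambda row, col: row[col] * row["factor"])
    print(result)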

Basics of Risk Management

Matthew Roche lays out some of the basics of risk management:

One simple and lightweight approach for risk management involves looking at two factors: risk likelihood, and risk impact.

Risk likelihood is just what it sounds like: how likely the risk is to occur. Once you’re aware that a risk exists, you can measure or estimate how likely that risk is to be realized. In many situations an educated guess is good enough. You don’t need to have a perfectly accurate number – you just need a number that no key stakeholders disagree with too much.[3] Rather than assigning a percentage value, I prefer to use a simple 1-10 scale. This helps make it clear that it’s just an approximation, and can help prevent unproductive discussions about whether a given risk is 25% likely or 26% likely.

Risk impact is also what it sounds like: how bad would it be if the risk did occur? I also like to use a simple 1-10 scale for measuring risk impact, which is more obviously subjective than the risk likelihood. So long as everyone who needs to agree agrees that the impact of a given risk is 3 or 4 or whatever, that’s what matters.

Read on for a summary of the topic and a good leaving-off point to learn more.
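
Matthew’s post describes the two factors themselves; one common way to combine them (an assumption on my part, not necessarily his approach) is to multiply likelihood by impact and triage the highest scores first. A minimal Python sketch with invented risks:

    # Made-up risks scored on the 1-10 likelihood and impact scales.
    risks = [
        {"name": "Key vendor misses delivery",    "likelihood": 6, "impact": 8},
        {"name": "Staff turnover on the team",    "likelihood": 4, "impact": 5},
        {"name": "Regulatory change mid-project", "likelihood": 2, "impact": 9},
    ]

    # Multiplying the two factors is one common scoring convention;
    # review the highest-scoring risks first.
    for r in sorted(risks, key=lambda r: r["likelihood"] * r["impact"], reverse=True):
        print(f"{r['name']}: {r['likelihood']} x {r['impact']} = {r['likelihood'] * r['impact']}")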
