Kevin Feasel – Page 840

Using package in R is easy. You install from CRAN using install.packages("packagename"), it resolves dependencies and you’re good to go. What R natively doesn’t handle so well is installing a particular package version without jumping through hoops. Technically you need the source file of the package version you want to install AND all source files of the dependencies (in the correct version, of course). This has been made almost seamless with packages packrat and recently, renv.
This comes handy when you are constructing a Docker file to run in production. Usually you want to run this defensively and do not want things to change from one image build to another. To get there, you can save all your package names and version into a file (renv.lock) and use that to reconstruct the now defined package structure with predictable versions (see renv vignette here).

This is quite useful as R package developers tend not to covet backwards compatibility, and one of the key benefits of containers is to have the option to keep the same code base and configuration in all environments.

Comments closed

Unsupported Versions of SQL Server on Windows Server 2019

Published 2020-06-26 by Kevin Feasel

Randolph West finds a need and fills it:

Last year when Windows Server 2019 was released I wanted to see which versions of SQL Server I could run on it, testing more the unwritten backward compatibility promise Microsoft has maintained over the last 45 years, rather than what the documentation says.
Speaking of documentation, Glenn Berry has a nifty compatibility matrix to show what versions of SQL Server are supported on each version of Windows Server. For official purposes, this is the list you should refer to:
But I know you’re not here for supported versions, because this post is about what Randolph managed to get running on Windows Server 2019, which as you know is a 64-bit operating system.

SQL Server 6.5 surprised me. This idea also nets my most coveted category.

Comments closed

More Fun with 1-Column Fusion in DAX

Published 2020-06-26 by Kevin Feasel

Phil Seamark continues a discussion on single-column fusion:

You may notice the [Package – Bag] measure uses the * (multiplication) operator in line 4 between a measure and what looks like a column filter.
The [base measure] performs a simple COUNTROWS aggregation and returns an INTEGER. This side of the equation makes sense, so let’s see what is going on with the DAX on the right-hand side of the * operator? The filter statement is encompassed with parenthesis, which automatically converts the expression to a TRUE/FALSE boolean value – which is then implicitly converted to an INTEGER value of either 0 or 1. It can’t be anything else.
The result of this double conversion is that we end up with a INTEGER * INTEGER to produce the number we see in the visual.

Click through for plenty more where that came from.

Comments closed

Balancing SQL Server Core Licenses Across NUMA Nodes

Published 2020-06-26 by Kevin Feasel

Glenn Berry explains how to get yourself out of a pickle in Standard Edition:

Ok, that is bad enough, but it gets worse. Unless you fix it, SQL Server 2019 Standard Edition will use 32 logical cores on NUMA node 0, but only 16 logical cores on NUMA node 1. This can have a significant negative effect on performance. What you want is for SQL Server to use 24 logical cores on each NUMA node.

Glenn then explains how to pull this off.

Comments closed

Creating a Time Dimension with Time Bands in Power BI

Published 2020-06-26 by Kevin Feasel

Soheil Bakhshi shares how you can create a time dimension with a granularity of seconds in Power BI and SSAS Tabular:

I wrote some other posts on this topic in the past, you can find them here and here. In the first post I explain how to create “Time” dimension with time bands at minutes granularity. Then one of my customers required the “Time” dimension at seconds granularity which encouraged me to write the second blogpost. In the second blogpost though I didn’t do time bands, so here I am, writing the third post which is a variation of the second post supporting time bands of 5 min, 15 min, 30 min, 45 min and 60 min while the grain of the “Time” dimension is down to second. in this quick post I jump directly to the point and show you how to generate the “Time” dimension in three different ways, using T-SQL in SQL Server, using Power Query (M) and DAX. Here it is then:

Click through for the code, which includes several sample bands (e.g., 5 minutes, 15 minutes) that you can also control.

Comments closed

CTEs Don’t Control Plan Shape

Published 2020-06-26 by Kevin Feasel

Erik Darling dispels a myth:

I’ve heard many times incorrectly over the years that CTEs somehow materialize data.
But a new one to me was that CTEs execute procedurally, and you could use that to influence plan shapes by always doing certain things first.
Unfortunately, that’s not true of them either, even when you use TOP.

Read the whole thing. Though I do chain common table expressions for readability’s sake, but that’s usually because I’m performing a series of repetitive calculations that I can’t simplify via APPLY.

Comments closed

Publishing Azure Data Factory via Azure DevOps

Published 2020-06-26 by Kevin Feasel

Kamil Nowinski shares how to deploy Azure Data Factory flows via Azure DevOps:

Struggling with #ADF deployment? adf_publish branch doesn’t suit your purposes? Don’t have skills with PowerShell? I have good news for you. There is a new tool in the market. It’s a task for Azure DevOps Release Pipeline to deploy whole ADF from code (JSON files) to ADF instance in Azure. Behind the scenes, it runs the PowerShell module which does all job for you.
Sounds unbelievable? But it’s real! Check it out for yourself.

Click through for a video.

Comments closed

Thoughts on Snowflake Database Provisioning

Published 2020-06-25 by Kevin Feasel

David Stelfox takes us through some thoughts on provisioning instances of Snowflake:

For this example, I’ve chosen an open dataset of 2017 taxi rides in New York City. There are a few options for interacting with Snowflake: a dialog box approach in the web-based GUI, using SQL statements in the Worksheets tab in the GUI or a CLI called SnowSQL. For this example, I used SQL statements as I find them easier to follow what’s happening. Once you have set up your account (or trial) and logged in, you need to create your first database.

Click through for some how-to as well as thoughts about cost and performance.

Comments closed

Using INLA for Spatial Regression in R

Published 2020-06-25 by Kevin Feasel

Lionel Hertzog continues a series on spatial regression:

INLA is a package that allows to fit a broad range of model, it uses Laplace approximation to fit Bayesian models much, much faster than algorithms such as MCMC. INLA allows for fitting geostatistical models via stochastic partial differential equation (SPDE), a good place for more background informations on this are these two gitbooks: spde-gitbook and inla-gitbook.

This is not the gentlest introduction, so if you’re new to the concept go back and read part 1.

Comments closed

PolyBase, Excel, and the TOP Operator

Published 2020-06-25 by Kevin Feasel

I follow up on a bug from SQL Server 2019 CU2:

Back with SQL Server 2019 CU2, I reported an error with PolyBase connecting to Excel when trying to select TOP(10) from the table. I’m using the Microsoft Access Database Engine 2016 Redistributable’s Excel driver.

Click through for an explanation of what’s happened since then on this front.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Author: Kevin Feasel

Using Specific R Package Versions in Docker Images

Unsupported Versions of SQL Server on Windows Server 2019

More Fun with 1-Column Fusion in DAX

Balancing SQL Server Core Licenses Across NUMA Nodes

Creating a Time Dimension with Time Bands in Power BI

CTEs Don’t Control Plan Shape

Publishing Azure Data Factory via Azure DevOps

Thoughts on Snowflake Database Provisioning

Using INLA for Spatial Regression in R

PolyBase, Excel, and the TOP Operator