2019-08-02 – Curated SQL

PolyBase and External Column Names

Published 2019-08-02 by Kevin Feasel

I have another post looking at external columns on PolyBase V2 data sources:

I’m going to use two external tables in this experiment. In the left corner, we have some ORC files stored in Azure Blob Storage which we’ll represent as FireIncidents2017. In the right corner, we have data stored in a remote SQL Server instance which we’ll call LineItem. The data doesn’t really matter that much, but to give you an idea of where we’re going, I’ll show each table.

There’s quite a bit you can do here.

Validating Errors in A/B Testing

Published 2019-08-02 by Kevin Feasel

Roland Stevenson shows us how to validate Type I and Type II errors when performing A/B tests in R:

In this post, we seek to develop an intuitive sense of what type I (false-positive) and type II (false-negative) errors represent when comparing metrics in A/B tests, in order to gain an appreciation for “peeking”, one of the major problems plaguing the analysis of A/B tests today.
To better understand what “peeking” is, it helps to first understand how to properly run a test. We will focus on the case of testing whether there is a difference between the conversion rates cr_a and cr_b for groups A and B. We define conversion rate as the total number of conversions in a group divided by the total number of subjects. The basic idea is that we create two experiences, A and B, and give half of the randomly-selected subjects experience A and half B. Then, after some number of users have gone through our test, we measure how many conversions happened in each group. The important question is: how many users do we need to have in groups A and B in order to measure a difference in conversion rates of a particular size?

Read the whole thing. H/T R-Bloggers
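
The closing question in the excerpt (how many users each group needs) can be estimated with a textbook normal-approximation formula for comparing two proportions. The sketch below is a minimal illustration in Python, not the method from the linked post, which works in R; the function name `sample_size_per_group` and the default `alpha` and `power` values are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def sample_size_per_group(cr_a, cr_b, alpha=0.05, power=0.8):
    """Approximate subjects needed per group to detect cr_a vs. cr_b.

    Uses the normal approximation for a two-sided test of two proportions.
    alpha is the accepted type I error rate; 1 - power is the type II rate.
    """
    if cr_a == cr_b:
        raise ValueError("conversion rates must differ to size a test")
    # Critical values of the standard normal distribution.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the per-subject variances of the two conversion indicators.
    variance = cr_a * (1 - cr_a) + cr_b * (1 - cr_b)
    n = (z_alpha + z_beta) ** 2 * variance / (cr_a - cr_b) ** 2
    return math.ceil(n)
```

Calling `sample_size_per_group(0.10, 0.12)` gives a rough per-group size for detecting a two-point lift. Fixing that number before the test starts, and only evaluating once it is reached, is what guards against the “peeking” problem the post describes.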

Biml Support in Visual Studio Code

Published 2019-08-02 by Kevin Feasel

Cathrine Wilhelmsen takes us through Biml support in Visual Studio Code:

Please note that you only get syntax highlighting with this extension. You do not get the full Biml or .NET intellisense, the BimlScript preview pane, or the ability to generate SSIS packages from Biml. For those things, you will still need BimlExpress for Visual Studio.
However! If you simply want to view your Biml files in a lightweight editor, the Biml Support extension works beautifully.

It’s not full support, but it’s something.

Database Page Allocations Function

Published 2019-08-02 by Kevin Feasel

Max Vernon takes us through the sys.dm_db_database_page_allocations Dynamic Management Function:

sys.dm_db_database_page_allocations is an undocumented SQL Server T-SQL Dynamic Management Function. This DMF provides details about allocated pages, allocation units, and allocation extents.

Read on for additional details. This is an undocumented function, so it might change between versions, but it will give you an idea of how page allocation works under the covers.

Using Bookmarks for Power BI Filters

Published 2019-08-02 by Kevin Feasel

Marc Lelijveld continues a series on storytelling with Power BI:

As said, being dynamic is a broad concept. Let’s use the example shown above. As a report author, we can define that the end-user should be looking at a top 10 ranking of countries (right side of the report). Since the difference between number 9 and 10 in the ranking is so small, you might want to know what the difference is to number 11. Now, we can’t see that. We need to change the filter context to see the rest of the ranking.

Click through for a step by step example of what to do.

Managed Instance Challenges

Published 2019-08-02 by Kevin Feasel

Joey D’Antoni has a few real-world challenges with migrating to Azure SQL Managed Instances:

While DMS is pretty interesting tooling, I had mostly ignored it until recently. Functionally, the tool works pretty well. The problem is it requires a lot of privileges–you have to have someone who can create a service principal and you need to have the following ports open between your source machine and your managed instance:
– 443
– 53
– 9354
– 445
– 12000
While the scope of those firewall rules is limited, in a larger enterprise, explaining why you need port 445 open to anything is going to be challenging.

The technology is intriguing, though it does seem like there are still some kinks to work out.
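
Before opening a firewall request, it can help to confirm which of the ports quoted above are already reachable from the source machine. The following is a minimal Python sketch under stated assumptions: the helper name `check_tcp_ports` is made up for this example, it only tests whether a TCP connection can be opened, and it says nothing about whether DMS itself will work.

```python
import socket

# Port list copied from the excerpt above.
REQUIRED_PORTS = [443, 53, 9354, 445, 12000]


def check_tcp_ports(host, ports, timeout=3.0):
    """Return {port: bool} showing whether a TCP connection to host succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and attempts a TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            results[port] = False
    return results
```

A call such as `check_tcp_ports("my-instance.example.com", REQUIRED_PORTS)` (the host name is a placeholder) returns a dictionary you can hand to the network team alongside the request.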
