2020-10-12 – Curated SQL

At a high level we are connecting a time series of regional sales to regional offline and online ad impressions over the trailing thirty days. By using ML to compare the different kinds of measurements (TV impressions or GRPs versus digital banner clicks versus social likes) across all regions, we then correlate the type of engagement to incremental regional sales in order to build attribution and forecasting models. The challenge comes in merging advertising KPIs such as impressions, clicks, and page views from different data sources with different schemas (e.g., one source might use day parts to measure impressions while another uses exact time and date; location might be by zip code in one source and by metropolitan area in another).
As an example, we are using a SafeGraph rich dataset for foot traffic data to restaurants from the same chain. While we are using mocked offline store visits for this example, you can just as easily plug in offline and online sales data provided you have region and date included in your sales data. We will read in different locations’ in-store visit data, explore the data in PySpark and Spark SQL, and make the data clean, reliable and analytics ready for the ML task. For this example, the marketing team wants to find out which of the online media channels is the most effective channel to drive in-store visits.

Click through for the article as well as notebooks.
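
To make the schema-mismatch problem from the excerpt concrete, here is a minimal sketch in plain Python (not the PySpark code from the linked notebooks). It assumes one source that reports impressions by metropolitan area and day part, and another that reports them by zip code and exact timestamp. The field names, the `ZIP_TO_METRO` lookup, and the sample values are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical lookup from zip code to metropolitan area (illustrative values only).
ZIP_TO_METRO = {"10001": "New York", "94105": "San Francisco"}


def normalize_daypart_row(row):
    """Assumed source A: metro area, calendar date, day part, impressions."""
    # The day part is dropped because the common grain is one row per region per day.
    return row["metro"], row["date"], row["impressions"]


def normalize_timestamp_row(row):
    """Assumed source B: zip code, exact ISO timestamp, impressions."""
    metro = ZIP_TO_METRO[row["zip"]]
    day = datetime.fromisoformat(row["timestamp"]).date().isoformat()
    return metro, day, row["impressions"]


def daily_impressions(daypart_rows, timestamp_rows):
    """Sum impressions from both sources at a shared (metro, date) grain."""
    totals = defaultdict(int)
    for row in daypart_rows:
        metro, day, count = normalize_daypart_row(row)
        totals[(metro, day)] += count
    for row in timestamp_rows:
        metro, day, count = normalize_timestamp_row(row)
        totals[(metro, day)] += count
    return dict(totals)


# Made-up sample rows in each assumed schema.
tv_rows = [
    {"metro": "New York", "date": "2020-10-01", "daypart": "prime", "impressions": 100},
    {"metro": "New York", "date": "2020-10-01", "daypart": "morning", "impressions": 50},
]
web_rows = [
    {"zip": "10001", "timestamp": "2020-10-01T14:30:00", "impressions": 25},
]
combined = daily_impressions(tv_rows, web_rows)
```

Once both sources share a (region, date) key, they can be joined to regional sales on that same key, which is the precondition the excerpt gives for plugging in sales data.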

Comments closed

Polymorphism in GraphQL

Published 2020-10-12 by Kevin Feasel

Dan Freeman takes us through the concept of polymorphism as it applies to GraphQL:

In APIs (and in domain modeling in general) it’s common to want to represent fields that may point to one (or more) of several different types of object, a.k.a. polymorphism. In GraphQL’s type system, this can be accomplished with either a union or an interface, depending on whether the objects in question are expected to have anything in common.
What’s not always obvious to newcomers to GraphQL, though, is how to best handle that data on the receiving end, when you need to tell what concrete type of object you’re dealing with.

It’s interesting to see how this is handled in GraphQL versus object-oriented languages.
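
One common way to tell concrete types apart on the receiving end is to select GraphQL's built-in `__typename` meta-field and branch on it. The sketch below is not taken from the linked article; the `search` field, the `Book` and `Author` types, and the hard-coded response are hypothetical, and the query string is shown only to indicate where `__typename` would be requested.

```python
# Hypothetical query against a union-typed field; only __typename is a GraphQL built-in.
QUERY = """
query {
  search(text: "graph") {
    __typename
    ... on Book { title }
    ... on Author { name }
  }
}
"""

# Hand-written stand-in for what a server might return for QUERY.
response = {
    "data": {
        "search": [
            {"__typename": "Book", "title": "Learning GraphQL"},
            {"__typename": "Author", "name": "Ada"},
        ]
    }
}


def describe(node):
    """Branch on the concrete type reported by the __typename meta-field."""
    kind = node.get("__typename")
    if kind == "Book":
        return f"Book: {node['title']}"
    if kind == "Author":
        return f"Author: {node['name']}"
    # Fail loudly on types this client does not know about yet.
    raise ValueError(f"unexpected type: {kind!r}")


descriptions = [describe(node) for node in response["data"]["search"]]
```

Unlike an object-oriented language, where the runtime carries the object's class, the client here only knows the type because it asked for it in the selection set.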


Passing Parameters from Power Query to SQL Server

Published 2020-10-12 by Kevin Feasel

Gilbert Quevauvilliers has an interesting solution to a common problem:

I had a requirement where the client wanted the capability to decide how much data to load from a SQL Server Query (TSQL). This was so that they could limit the dataset returned, as at times they did not need all the data.
So below I demonstrate how to achieve this.
NOTE: This will be slightly advanced because I had to manually add some code in the Advanced Editor in Power Query.

Maybe it’s because of the number of times I had to do this with Reporting Services, but this seems like it should be a lot easier than it is.


Q&A: Migrating to Azure DevOps for SQL Server Deployments

Published 2020-10-12 by Kevin Feasel

Kevin Chant has some frequently asked questions:

5: Have you moved from various services into Azure DevOps?
Yes, I have. In fact, one team used various services and applications and we went all in with Azure DevOps. Organizing our work using the boards, using the Git repos and using Azure Pipelines a lot.

Click through for additional questions, as well as answers.


Validating Data Model Results in Power BI

Published 2020-10-12 by Kevin Feasel

Paul Turley continues a series on doing Power BI the right way:

When designing a new data model, this is typically the first thing I do… For every fact table and for each large dimension table, I create a measure that returns the record count for that table. Users normally think about their data in business terms (like sums, ratios and averages) and not about how many records there are in a table. Record counts are a convenient sanity check for record completeness and distribution across groups; and may also be a good indicator for model size and performance.

Paul takes several passes at the problem, getting a bit deeper into it each time.


Breaking Down the sql_handle

Published 2020-10-12 by Kevin Feasel

Paul White unravels the mysteries of sql_handle:

This article describes the structure of a sql_handle and shows how the batch text hash component is calculated.

Read on to learn more.


Using Query Store over the Plan Cache

Published 2020-10-12 by Kevin Feasel

Erik Darling has a dream:

I used to think the plan cache was so cool.
– You can find queries that aren’t good there
– Plans are full of details (and XML)
– Supporting DMVs give you extra insights about resource usage
But most of the time now, I’m totally frustrated with it.
It clears out a lot, plans aren’t there for some queries, and the plans that are there can be very misleading.
Can you really tell someone what their worst performing queries are when everything in there is from the last 15 minutes?
No.

Read on for what’s nice about Query Store, as well as a few fixes which need to be there before it’s really useful. I’ve used Query Store in big environments to good effect (though our DBAs had to rewrite the cleanup processes because they’re bad) and I’ve had to turn it off in medium-sized environments running 2016 because it was harming performance. It’s a great concept and reasonable implementation with a few too many sharp edges.


The Power BI Release Plan

Published 2020-10-12 by Kevin Feasel

Matthew Roche clues us in on what’s coming for Power BI:

The Power BI team at Microsoft publishes a “release plan,” which is essentially the public product roadmap. Anyone can use it to understand what new capabilities and improvements are planned, and when they’re expected to be released.
One challenge with the official release plan comes from the fact that it is a set of online documents, and that for each “release wave” there is a new set of docs – it’s not always clear where to look for the latest information on a given feature.

But thanks to Alex Powers, this is a lot clearer now. Click through to learn how.



Day: October 12, 2020

Measuring Advertising Effectiveness
