Curated SQL – Page 745 – A Fine Slice Of SQL Server

Contrasting Data Warehouses with Power BI Dataflows

Published 2021-01-06 by Kevin Feasel

Dataflow is the data transformation service in Power BI, and also some other Power Platform services. Data Warehouse is the cloud storage and also compute engine for data. I often get this question that: “Now that we have dataflow in Power BI, should we not use the Data warehouse? What are the differences? which is better? When to use what?” This article and video, explains answer to these questions.

I’m probably a bit lower on self-service BI compared to others. When I see something like Dataflows, it reminds me too much of a mess of Excel spreadsheets on shared drives. There’s a lot of relevant business knowledge embedded in those disbursed locations, and bringing it together becomes as much a forensic exercise as it is architectural.

Comments closed

Don’t Fear the tempdb

Published 2021-01-06 by Kevin Feasel

Erik Darling puts his pants on one leg at a time and once his pants are on, he makes gold records:

One persistent idea is that tempdb is something to be avoided. Either because it was “slow” or to avoid contention.
Granted, if a query has been around long enough, these may have been valid concerns at some point. In general though, temp tables (the # kind, not the @ kind) can be quite useful when query tuning.

Erik is absolutely right in this post. Ceteris paribus I’d rather not directly use tempdb because I’d prefer one query over multiple queries. But once performance comes into question, working on smaller subsets of data one step at a time will typically give you at least an acceptable solution.

Comments closed

Deploying ADS Database Projects Manually

Published 2021-01-06 by Kevin Feasel

Elizabeth Noble continues a series of videos on database projects in Azure Data Studio:

This week, we’ll talk about one of the easier ways to deploy your database changes. One of the benefits of database projects is that they can generate data-tier applications (DAC). The data-tier applications can be bundled into what is called a DACPAC. This is a collection of files that can be used to deploy your database.

Click through for the video.

Comments closed

From R data.frame to Pandas DataFrame

Published 2021-01-05 by Kevin Feasel

Tomaz Kastrun builds a function:

Once we have a data.frame we want to generate Python dictionary that will hold schema and data for direct creation (or import) into your favourite Python environment. Function RtoPy does the needed transformation:

Click through for the code.

Comments closed

Cross-Validation in Azure ML Studio

Published 2021-01-05 by Kevin Feasel

Dinesh Asanka takes us through the cross-validation component in Azure ML Studio:

Let us look at implementing Cross-Validation in Azure Machine Learning. Let us use the sample Adventure Works database that we used for all the articles.
Then Cross Validate Model is dragged and dropped to the experiment. The Cross Validate model has two inputs and two outputs. Two inputs are data input and the relation to the Machine Learning technique. Let us use the Two-Class Decision Jungle as the Machine Learning Technique. Then the first output is connected to the Evaluate Model as shown in the following figure:

Click through for the process.

Comments closed

Adjusting Database Settings with Powershell

Published 2021-01-05 by Kevin Feasel

Eric Cobb takes a look at some nice functionality in dbatools:

There may be times that you want to ensure certain settings are applied to a database, or multiple databases. For example, if you restore a Production database to a QA environment, you may need to change the Recovery Model. Or if you’re migrating databases to a new SQL Server version you want to make sure to update the Compatibility Level. With dbatools this is really, really easy. Here are some examples:

Read on for examples around setting the database owner, changing the compatibility level, and setting the recovery model.

Comments closed

Data Professional Salary Survey Results

Published 2021-01-05 by Kevin Feasel

Brent Ozar has another year of salary data:

We asked what you make, and 1,747 of you in 69 countries answered. Altogether, you made $171,879,034 this year. Hubba hubba, there’s some earning power in this audience.
Download the Data Professional Salary Survey results (XLSX).

Click through for more details on the process.

Comments closed

Power BI End-to-End Diagram Update

Published 2021-01-05 by Kevin Feasel

Melissa Coates has an update for us:

An updated version of the Power BI End-to-End diagram is now available as of January 4, 2021.

This update is as of January 4th and includes things like Azure Purview. Melissa also shows how it has evolved over time, which is particularly interesting for Power BI given its rate of change.

Comments closed

Kerberos vs NTLM

Published 2021-01-05 by Kevin Feasel

Jack Vamvas contrasts Kerberos and NTLM:

There is a message found in SQL Server Error Logs similar to
The SQL Server Network Interface library could not register the Service Principal Name (SPN) [ MSSQLSvc/myserver.net:60000 ] for the SQL Server service. Windows return code: 0x200b, state: 15. Failure to register a SPN might cause integrated authentication to use NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies and if the SPN has not been manually registered.
The line I’m interested in reviewing is Failure to register a SPN might cause integrated authentication to use NTLM instead of Kerberos. It’s good to first understand the differences between Kerberos & NTLM – both supported by SQL Server during AD authentication

Read the whole thing. It’s easy to fall into the trap of “Windows authentication = Kerberos”—I do that myself far too often.

Comments closed

PFS Contention and Heaps

Published 2021-01-05 by Kevin Feasel

Uwe Ricken continues a series on heaps in SQL Server:

The PFS page “can” become a bottleneck for a heap if many data records are entered in the heap in the shortest possible time. How often the PFS page has to be updated depends mostly on the data record’s size to be saved.
This procedure does not apply to clustered indexes since data records in an index must ALWAYS be “sorted” into the data volume according to the defined index value. Therefore, the search for a “free” space is not carried out via the PFS page but via the value of the key attribute!

Read on for more detail.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Curated SQL Posts