Curated SQL – Page 672 – A Fine Slice Of SQL Server

Converting from XML in PowerShell

Published 2021-06-09 by Kevin Feasel

Phil Factor has unleashed the full power of XML:

I want to convert reasonably small XML files to hash tables and PowerShell objects. PowerShell never had a ConvertFrom-XML cmdlet because gulping a large XML file into a PowerShell data object is expensive in resources. It is the sheer time it takes to consume a large XML file. Instead, you have to use the XMLDocument object to navigate to the data you want or use an XPath query. It is all well and good to handle XML in this way, but it is inconsistent to have no ConvertFrom-XML cmdlet. After all, there is a ConvertFrom cmdlet for CSV, JSON, and a variety of text-based data. It would be good to have one for XML as well. Usually, I just want to consume relatively small XML files and just pick out the data I want. I hoped that one that worked would turn up but somehow it never did. So I wrote my own.

Click through for that script, as well as some considerations about using it.
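
To make the two approaches in the excerpt concrete, here is a minimal sketch of the general idea. It is written in Python rather than PowerShell and is not Phil's script. The sample XML, the `element_to_dict` helper, and the `#text` key are all made up for illustration. The first half converts a small document into nested dictionaries and lists (the "hash table" approach), and the second half picks out one value with an XPath-style query instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<servers>
  <server name="SQL01"><version>2019</version></server>
  <server name="SQL02"><version>2017</version></server>
</servers>
""".strip()


def element_to_dict(element):
    """Recursively convert an Element into plain dicts, lists, and strings."""
    node = dict(element.attrib)  # attributes become ordinary keys
    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            # A repeated tag is collected into a list of values.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (element.text or "").strip()
    if not node:
        return text  # leaf element with no attributes: just its text
    if text:
        node["#text"] = text  # mixed content keeps its text under an assumed key
    return node


root = ET.fromstring(SAMPLE)

# Approach 1: convert the whole (small) document into nested data.
as_dict = element_to_dict(root)

# Approach 2: leave the XML as a tree and query only what is needed.
names = [s.get("name") for s in root.findall("./server[version='2019']")]

print(as_dict)
print(names)
```

The whole-document conversion is convenient for small files, which is the case the excerpt cares about. For large files, the query approach avoids building a second copy of the data.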

Wait Stats Not in Query Store

Published 2021-06-09 by Kevin Feasel

Erik Darling says wait, wait, don’t tell me:

There are some oddities in the documentation for query store wait stats.
One is that RESOURCE_SEMAPHORE_QUERY_COMPILE is listed as a collected wait, but with an asterisk that says it’s not actually collected. I’ve tested workloads that generate lots of that wait, and just like the docs say, it doesn’t end up there.
Of course, since I added wait stats recently to sp_QuickieStore, I wanted to make sure other waits that I care about actually show up in there.

Read on to see which wait stats you can find in Query Store and which you’ll have to get from someplace else.

Developing a Patch Strategy

Published 2021-06-09 by Kevin Feasel

Brent Ozar shares a patching strategy:

Decide how you’re going to detect problems. Every now and then, an update breaks something. For example, SQL Server 2019 CU7 broke snapshots, SQL Server 2019 CU2 broke Agent, and so many more, but my personal favorite was when SQL Server 2014 SP1 CU6 broke NOLOCK. Sure, sometimes the update installer will just outright fail – but sometimes the installer succeeds, but your SQL Server installation is broken anyway, and it may take hours or days to detect the problem. You need to monitor for new and unusual failures or performance problems.

Click through to see the high-level strategy elements.

Ranger and Jersey Clients

Published 2021-06-08 by Kevin Feasel

Jon Morisi troubleshoots an irksome issue:

Just a quick blog here about an issue I had with HDP-3.1.4.0. I recently was setting up a new user with specific rights in Ranger for Hive access. After creating the new policy and attempting to validate it, I received an error message stating that the hive user does not have use privilege. This error was produced even though I had just created the policy specifically granting those privileges.
Upon further review, I noticed that the plugin was downloading the policy, but not applying it.

Read on to learn what the problem was and how Jon corrected it.

Announcements from Data+AI Summit

Published 2021-06-08 by Kevin Feasel

Ryan Boyd summarizes Databricks announcements:

The Delta Lake open source project is a key enabler of the lakehouse, as it fixes many of the limitations of data lakes: data quality, performance and governance. The project has come a long way since its initial release, and the Delta Lake 1.0 release was just certified by the community. The release represents a variety of new features, including generated columns and cloud independence with multi-cluster writes and my favorite: Delta Lake standalone, which reads from Delta tables but doesn’t require Apache Spark™.
We also announced a bunch of new committers to the Delta Lake project, including QP Hou, R.Tyler Croy, Christian Williams, Mykhailo Osypov and Florian Valeye.
Learn more about Delta Lake 1.0 in the keynotes from co-creator and Distinguished Engineer Michael Armbrust.

Read on for a variety of announcements in this vein.

Querying AWS Athena via PowerShell

Published 2021-06-08 by Kevin Feasel

Michael Bourgon needs to get some data out of S3:

I was running into issues with the Linked Server lopping off long JSON that I’m having to pull out from the raw files. I can’t explain it – doesn’t appear to be SSMS. See previous post.
But I needed to automate this, rather than use SQL Workbench, save to “Excel” (it was XML), then open it again and save it so that instead of 250 MB, it’s 30 MB. Runs against the previous month, one day at a time (walking the partitions), and then saves to a file. You got your Athena, your ODBC, your Export-Excel…

Incidentally, that previous post was around trying to use a linked server to pull the data in via SQL Server.

Empty Parallel Zones in SQL Server

Published 2021-06-08 by Kevin Feasel

Paul White clues us in on an interesting phenomenon:

An empty parallel zone is an area of the plan bounded by exchanges (or the leaf level) containing no operators.
How and why does SQL Server sometimes generate a parallel plan with an empty parallel zone?

Read on for an example as well as the explanation.

Understanding SUMMARIZE in DAX

Published 2021-06-08 by Kevin Feasel

Alberto Ferrari dives into a DAX operator:

If you like to follow best practices, you can just read this paragraph out of the entire article. If you are using SUMMARIZE to calculate new columns, stop. Seriously, stop doing it. Right now. Open your existing DAX code, search for SUMMARIZE and if you find that you are using SUMMARIZE to compute new columns, add them instead by using ADDCOLUMNS.
At SQLBI we are so strong on this position that we deliberately omitted a part of the detailed description of the behavior of SUMMARIZE in our book. We understand how SUMMARIZE works but we do not want your code to return inaccurate results, just because you use a function without understanding when its result might be different from the result you expected.

Read on as Alberto explains why, as well as the details of SUMMARIZE and how easily you can find yourself in a mess with it.

Database Snapshot Creator in Azure Data Studio

Published 2021-06-08 by Kevin Feasel

Haroon Ashraf takes a look at an extension in Azure Data Studio:

This article talks about the steps required to add and use the DB Snapshot Creator extension in Azure Data Studio.
Additionally, the readers are going to get a conceptual understanding of database snapshots and their use in professional life scenarios. This article highlights the importance of preserving database structure for future reference.
Let us get familiar with the extension prior to its use.

Click through to learn more. The one thing I’d like to see clarified (if it’s not already and I just missed it) is that you really don’t want more than one database snapshot on a given database at any time. Having two or more database snapshots active on a database can cause fairly significant performance issues on non-trivial databases, and I’d prefer to see the tool include that knowledge rather than leave people to remember an eight-year-old article from Jonathan Kehayias. But hey, I guess that’s what I’m here for…

A Review of Tabular Editor 3

Published 2021-06-08 by Kevin Feasel

Matt Allington reviews a paid product:

Tabular Editor is a Power BI Tabular Modelling productivity tool developed by Daniel Otykier. I blogged about Version 2 of the Tabular Editor in this article here. The 3rd edition of Tabular Editor has just been released, and it is a major upgrade from version 2. TE 3 is not free, but in my view, the productivity benefits make it a must have piece of software for anyone that is regularly writing DAX in Power BI Desktop.

Read on for the review.
