Press "Enter" to skip to content

Author: Kevin Feasel

Ranger and Jersey Clients

Jon Morisi troubleshoots an irksome issue:

Just a quick blog here about an issue I had with HDP-3.1.4.0.  I recently was setting up a new user with specific rights in Ranger for Hive access.  After creating the new policy and attempting to validate it, I received an error message stating that the hive user does not have use privilege.  This error was produced even though I had just created the policy specifically granting those privilege’s.

Upon further review I noticed that the plugin was downloading the policy, but not applying it.  

Read on to learn what the problem was and how Jon corrected it.

Comments closed

Querying AWS Athena via Powershell

Michael Bourgon needs to get some data out of S3:

I was running into issues with the Linked Server lopping off long JSON that I’m having to pull out from the raw files.  I can’t explain it – doesn’t appear to be SSMS.  See previous post

But I needed to automate this, rather than use SQL Workbench, save to “Excel” (it was XML), then opening it again and saving it so that instead of 250mb, it’s 30mb.  Runs against the previous month, one day at a time (walking the partitions), and then saves to a file.  You got your Athena, your ODBC, your Export-Excel…

Incidentally, that previous post was around trying to use a linked server to pull the data in via SQL Server.

Comments closed

Understanding SUMMARIZE in DAX

Alberto Ferrari dives into a DAX operator:

If you like to follow best practices, you can just read this paragraph out of the entire article. If you are using SUMMARIZE to calculate new columns, stop. Seriously, stop doing it. Right now. Open your existing DAX code, search for SUMMARIZE and if you find that you are using SUMMARIZE to compute new columns, add them instead by using ADDCOLUMNS.

At SQLBI we are so strong on this position that we deliberately omitted a part of the detailed description of the behavior of SUMMARIZE in our book. We understand how SUMMARIZE works but we do not want your code to return inaccurate results, just because you use a function without understanding when its result might be different from the result you expected.

Read on as Alberto explains why, as well as the details of SUMMARIZE and how easily you can find yourself in a mess with it.

Comments closed

Database Snapshot Creator in Azure Data Studio

Haroon Ashraf takes a look at an extension in Azure Data Studio:

This article talks about the steps required to add and use the DB Snapshot Creator extension in Azure Data Studio.

Additionally, the readers are going to get a conceptual understanding of database snapshots and their use in professional life scenarios. This article highlights the importance of preserving database structure for future reference.

Let us get familiar with the extension prior to its use.

Click through to learn more. The one thing I’d like to see clarified (if it’s not already and I just missed it) is that you really don’t want more than one database snapshot on a given database at any time. Having two or more database snapshots active on a database can cause fairly significant performance issues on non-trivial databases and I’d prefer to see the tool include that knowledge rather than remembering an eight-year-old article from Jonathan Kehayias. But hey, I guess that’s what I’m here for…

Comments closed

CHECKDB: Repairing with Data Loss

Chad Callihan explains what that CHECKDB option which should sound really scary means:

Just because REPAIR_ALLOW_DATA_LOSS might get your database up and running quickly doesn’t make it a good option. It could wrongly be considered a shortcut or an easy button to get handle corruption. Some may run REPAIR_ALLOW_DATA_LOSS, see their database back online, and call it a day. Let’s look at why some things are too good to be true.

Read the whole thing.

Comments closed

A Review of Tabular Editor 3

Matt Allington reviews a paid product:

Tabular Editor is a Power BI Tabular Modelling productivity tool developed by Daniel Otykier. I blogged about Version 2 of the Tabular Editor in this article here. The 3rd edition of Tabular Editor has just been released, and it is a major upgrade from version 2. TE 3 is not free, but in my view, the productivity benefits make it a must have piece of software for anyone that is regularly writing DAX in Power BI Desktop.

Read on for the review.

Comments closed

Empty Strings vs NULL

Erik Darling lays out a challenge:

Empty Strings Aren’t Better Than NULLs

Prove me wrong.

Click through for commenters’ perspectives.

I’m not at the C.J. Date level of “NULLs are a mockery of the entire relational system and should be burned to the ground and that ground be salted and that salt be burned, just in case” but I do have my sympathies there.

An empty string has meaning: the value of a given attribute in this tuple is an empty string. NULL has no value and therefore violates first normal form (which, unlike my joking histrionics above, is an argument Date lays out in far too much detail for me to include here).

So what if you don’t know the value of a thing? Then the answer is, don’t store it! If there are certain attributes which may be missing at insertion time and are nonetheless relevant to maintain, use 6th Normal Form for each of those attributes. Then, the existence of a record indicates that it has a value and the lack of existence indicates that there is no value as of this time. This is quite different from NULL, about which we can only say “I don’t know if there is a value at this time, but if you’d like, I can do weird things with some of your queries if you happen to forget about this non-information.”

3 Comments

Building a Powershell Script Template

Eitan Blumin has a template for us:

If at any point an automated script fails for some reason, or does not behave as expected, it would be invaluable to have it produce and retain historical logs that could later be investigated for clues as to what it did, and where and why it failed.

Powershell has a few useful cmdlets for this, capable of writing an Output to any sort of destination, such as a log file. For example, Out-File.

However, in my personal experience, nothing beats the level of verbosity offered by a special cmdlet called Start-Transcript.

Read on for an explanation for each part of the template, and then the template itself.

Comments closed

Integrating Power BI Deployment Pipelines with Azure DevOps

Marc Lelijveld shows how you can combine Power BI deployment pipelines with Azure DevOps:

Looking at the Power BI release plan, dataflow support for Deployment Pipelines is coming up shortly! Currently it is scheduled for June 2021 to reach the public preview state. Versioning and DevOps integration go hand-in-hand to our opinion. With Azure DevOps Git integration, we can overcome the versioning challenge while integrating with Azure DevOps at the same time, as described in the previous blog in 2019. Today, we release a new version of the DevOps implementation which uses native Power BI functionality. Stay tuned!

As we really like the metadata deployment and the ease of setup a pipeline in the Power BI Service, Ton and I decided to setup an Azure DevOps extension based on the recently released Power BI REST APIs for Deployment Pipelines. Although Microsoft promised to come-up with a native DevOps extension over time, we decided to go for it. Time to bridge the gap!

Read on for more details.

Comments closed