Press "Enter" to skip to content

Day: June 7, 2021

Empty Strings vs NULL

Erik Darling lays out a challenge:

Empty Strings Aren’t Better Than NULLs

Prove me wrong.

Click through for commenters’ perspectives.

I’m not at the C.J. Date level of “NULLs are a mockery of the entire relational system and should be burned to the ground and that ground be salted and that salt be burned, just in case” but I do have my sympathies there.

An empty string has meaning: the value of a given attribute in this tuple is an empty string. NULL has no value and therefore violates first normal form (which, unlike my joking histrionics above, is an argument Date lays out in far too much detail for me to include here).

So what if you don’t know the value of a thing? Then the answer is, don’t store it! If there are certain attributes which may be missing at insertion time and are nonetheless relevant to maintain, use 6th Normal Form for each of those attributes. Then, the existence of a record indicates that it has a value and the lack of existence indicates that there is no value as of this time. This is quite different from NULL, about which we can only say “I don’t know if there is a value at this time, but if you’d like, I can do weird things with some of your queries if you happen to forget about this non-information.”

3 Comments

Integrating Power BI Deployment Pipelines with Azure DevOps

Marc Lelijveld shows how you can combine Power BI deployment pipelines with Azure DevOps:

Looking at the Power BI release plan, dataflow support for Deployment Pipelines is coming up shortly! Currently it is scheduled for June 2021 to reach the public preview state. Versioning and DevOps integration go hand-in-hand to our opinion. With Azure DevOps Git integration, we can overcome the versioning challenge while integrating with Azure DevOps at the same time, as described in the previous blog in 2019. Today, we release a new version of the DevOps implementation which uses native Power BI functionality. Stay tuned!

As we really like the metadata deployment and the ease of setup a pipeline in the Power BI Service, Ton and I decided to setup an Azure DevOps extension based on the recently released Power BI REST APIs for Deployment Pipelines. Although Microsoft promised to come-up with a native DevOps extension over time, we decided to go for it. Time to bridge the gap!

Read on for more details.

Comments closed

Building a Powershell Script Template

Eitan Blumin has a template for us:

If at any point an automated script fails for some reason, or does not behave as expected, it would be invaluable to have it produce and retain historical logs that could later be investigated for clues as to what it did, and where and why it failed.

Powershell has a few useful cmdlets for this, capable of writing an Output to any sort of destination, such as a log file. For example, Out-File.

However, in my personal experience, nothing beats the level of verbosity offered by a special cmdlet called Start-Transcript.

Read on for an explanation for each part of the template, and then the template itself.

Comments closed

Handling Unused Indexes

Welcome to the Chad Callihan Home for Unloved Indexes:

What do you do if you see that an index is being updated but is low on seeks and scans? Should you drop it because the table is never queried or a better index is being chosen? If an index doesn’t have updates, seeks, or scans then is the next step to drop?

The answer to these questions is a resounding “not yet!” Getting rid of an index that isn’t getting much use sounds simple but there are careful considerations to make.

Click through for those considerations.

Comments closed

Speeding Up Power Query with Evaluation Container Memory

Chris Webb notes a new toggle in Power Query:

However if you have just read the docs you may be wondering what these two new registry key settings actually do. In this post I’m only going to talk about one, MaxEvaluationWorkingSetInMB; I’ll leave ForegroundEvaluationContainerCount for a future post.

At various times in the past I have blogged about how, when you run a Power Query query, the query itself is executed inside a separate process called an evaluation (or mashup) container and how this process has a limit on the amount of memory it can use. Some transformations such as sorting a table, doing a group by, pivoting and unpivoting require an entire table of data to be held in memory and if these operations require more memory than the evaluation container is able to use then it starts paging and query performance gets a lot worse. 

Read on to see where setting the max evaluation working set in memory can help, as well as the caveats that Chris lays out.

Comments closed