Curated SQL – Page 703 – A Fine Slice Of SQL Server

Recommending a Power BI Date Dimension

Published 2022-01-04 by Kevin Feasel

Matthew Roche gives us a date dimension:

A bunch of folks from the Power BI community shared their favorite Power Query date dimension queries – you can look at the Twitter thread for links if you’re interested. With their inspiration (and code!) I picked what I liked, discarded what I didn’t, and came up with this:

Click through for that code.


Building an SSMS Database Solution

Published 2022-01-04 by Kevin Feasel

Andy Leonard has a four-parter for us on database solutions in SQL Server Management Studio. Part one provides an introduction:

I like Microsoft Visual Studio a lot. I know some members of the team that developed Visual Studio, and they are scary-smart individuals who have forgotten more about developing software than I will ever know.
For some reason, I am not fond of SQL Server projects in Visual Studio. I believe the reason is that I am not familiar with the template. Please note I used the word fond intentionally. It’s an emotion. In this case, it’s all about me. I believe my emotion would change if I took the time to learn more about the Visual Studio SQL Server project template.
I continue to attempt to learn VS database projects. In the meantime, I prefer SQL Server Management Studio solutions.

Part two shows how to add a new query:

One solution is to add instrumentation to T-SQL scripts. I personally like to write T-SQL scripts that are idempotent (a fancy way to describe “re-executable with the same results”). One way to write idempotent T-SQL is:
1. First check for the current state
2. Provide feedback (instrumentation) on the status
3. Provide more feedback on actions driven by the status (yep, more instrumentation)
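
As a rough illustration of those three steps, here is a minimal sketch in Python using the standard-library sqlite3 module rather than T-SQL against SQL Server. The function name ensure_table, the customer table, and its DDL are all hypothetical and are not taken from Andy's posts.

```python
import sqlite3


def ensure_table(conn, table_name, create_sql):
    """Create a table only if it is missing, reporting each step.

    Hypothetical helper: returns the feedback messages so a caller can
    print or log them.
    """
    messages = []

    # Step 1: check the current state.
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    exists = row is not None

    # Step 2: provide feedback on the status.
    if exists:
        messages.append(f"{table_name} already exists.")
    else:
        messages.append(f"{table_name} does not exist.")
        # Step 3: provide more feedback on the action driven by the status.
        messages.append(f"Creating {table_name}...")
        conn.execute(create_sql)
        messages.append(f"{table_name} created.")

    return messages


conn = sqlite3.connect(":memory:")
ddl = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"

# Running the same step twice leaves the database in the same state.
first_run = ensure_table(conn, "customer", ddl)
second_run = ensure_table(conn, "customer", ddl)

for line in first_run + second_run:
    print(line)
```

The second call takes the "already exists" branch and changes nothing, which is the re-executable behavior the quote describes.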

Part three includes tables and views in the mix:

Click the “New Query” button in SSMS and add the following T-SQL:

Part four includes stored procedures:

Note the DDL to manage stored procedures is very similar to the DDL for managing views.
If all goes according to plan, the first execution of the s.i DDL T-SQL statement should generate the following messages:

Andy also shows how to use SQLCMD to create a proper deployment script.


Checking if a Spark DataFrame is Empty

Published 2021-12-31 by Kevin Feasel

The Hadoop in Real World team has a one-liner for us:

A quick answer that might come to your mind is to call the count() function on the dataframe and check if the count is greater than 0. count() on a dataframe with a lot of records is super inefficient.
count() will do a global count of records in the dataframe from all partitions and then add all the intermediate counts together to get the final count. You will find this approach very slow for big dataframes.

Click through for a much faster one-liner.
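
For a sense of the general idea, here is a minimal Python sketch that asks for at most one row instead of counting every row. It is not necessarily the one-liner from the linked post. It assumes a PySpark-style head(n) method that returns a list of up to n rows; FakeFrame is a hypothetical stand-in so the example runs without a Spark cluster.

```python
def is_empty(df):
    """Return True when a DataFrame-like object has no rows.

    Asking for a single row lets the engine stop as soon as it finds one,
    instead of counting records across every partition.
    """
    return len(df.head(1)) == 0


class FakeFrame:
    """Hypothetical stand-in for a Spark DataFrame (illustration only)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def head(self, n):
        # Mirrors the assumed behavior: a list of at most n rows.
        return self._rows[:n]


empty_result = is_empty(FakeFrame([]))
non_empty_result = is_empty(FakeFrame([("a", 1), ("b", 2)]))

print(empty_result, non_empty_result)
```

With a real PySpark DataFrame, the same is_empty function should work unchanged, since it only relies on head(1).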


Creating Fireworks with R

Published 2021-12-31 by Kevin Feasel

Tomaz Kastrun is ready for Silvester:

New Year’s Eve is almost here, and what better way to celebrate than with fireworks? Snap, pop, crack, boom. This is the most peaceful, animal friendly, harmless, eco, children friendly, no-fire-needed, educative and nifty fireworks.
To get the fireworks, fire up the following R function.

I mean, but I enjoy fire… Though you could launch these in R and save the good stuff for the 4th of July.


TokenLibrary and Data Exfiltration Protection

Published 2021-12-31 by Kevin Feasel

I troubleshoot an unfortunate error message:

Based on this error message, the command is timing out after one minute. My initial thought is that there must be a network restriction preventing me from communicating with Key Vault, and I know what my first answer is.

This story even has a red herring, which means it’s a set of junior sleuths and a talking dog away from being a ’90s cartoon.


Policy Authors in Azure Purview

Published 2021-12-31 by Kevin Feasel

Wolfgang Strasser sees a new role:

I’ve spotted a new Azure Purview permission level – Policy Authors. This new permission level is connected to the new Data Access Policy Management in Purview (as of today 2021-12-30 in preview; https://docs.microsoft.com/en-us/azure/purview/how-to-access-policies-storage)

Click through for a little bit more detail.


Cleaning SQL Express Databases

Published 2021-12-31 by Kevin Feasel

Kevin Hill knows the pain:

I was contacted by a lawyer that was using a 3rd party application to store emails, keep track of time, etc.
The backend of the application is SQL Server Express edition, which has a hard limit of 10GB for the data file.

One quick note for people with lots of LOB data: remember to reorganize with LOB_COMPACTION = ON, as that’s the only way to be sure. Also, if you are on an older version of SQL Server, note that there was a bug with LOB compaction which affected SQL Server 2014 and earlier. But, uh, hopefully you’re patched past that point…

Also, getting up to 2016 SP1 means that Express Edition gets data compression. It wouldn’t directly help in this case, but if you have a lot of non-LOB data on Express Edition, it can work wonders, for some definition of “wonders.” After all, if you’re using Express Edition, wonders are by definition pretty small.


Implementing NORM.INV in Power Query

Published 2021-12-31 by Kevin Feasel

Imke Feldmann has another function to implement:

The Excel NORM.INV function returns the inverse of the normal cumulative distribution for the specified mean and standard deviation. So unlike the NORM.DIST function, that returns the probability of a threshold value to occur under the normal distribution (in CDF mode), this function returns the threshold value that matches a given probability.

Click through for the function definition.
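
To make the relationship between the two functions concrete, here is a small Python sketch using the standard library's statistics.NormalDist. It is an illustration of what NORM.INV computes, not Imke's Power Query implementation; the wrapper name norm_inv and the sample inputs are assumptions chosen for the example.

```python
from statistics import NormalDist


def norm_inv(probability, mean, standard_dev):
    """Return the threshold whose cumulative probability equals `probability`.

    Hypothetical wrapper that mirrors the argument order of Excel's NORM.INV.
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    if standard_dev <= 0:
        raise ValueError("standard_dev must be positive")
    return NormalDist(mu=mean, sigma=standard_dev).inv_cdf(probability)


# A probability of 0.5 maps back to the mean of the distribution.
median = norm_inv(0.5, 40.0, 1.5)

# Going from a probability to a threshold, then back to a probability.
threshold = norm_inv(0.908789, 40.0, 1.5)
round_trip = NormalDist(mu=40.0, sigma=1.5).cdf(threshold)

print(median, threshold, round_trip)
```

Here cdf plays the role of NORM.DIST in cumulative mode, and inv_cdf undoes it, so the round trip should land back on the starting probability.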


Data Exfiltration Protection and Pip

Published 2021-12-30 by Kevin Feasel

I have a post born of frustration:

I have an Azure Synapse Analytics workspace which uses a managed virtual network and includes data exfiltration protection. I also have a Spark pool. My goal is to import a few packages and use them in a Spark notebook.
Doing so is pretty easy from the Synapse workspace. I navigate to the Manage hub and then choose Apache Spark pools from the Analytics pools menu. Select the ellipsis for my Spark pool and then choose Packages.
From there, because I plan to update Python packages, I can upload a requirements.txt file and have Pip do its job.

But then it doesn’t… Click through to learn why, as well as the workaround for this. It’s stuff like this which makes me say data exfiltration protection is a feature administrators will (mostly) like and developers will hate, especially because the error message itself gives no obvious indication of why this is happening.


Creating Boilerplate Pester Assertions

Published 2021-12-30 by Kevin Feasel

Jeffrey Hicks builds a useful snippet:

During this process, I decided I needed to help myself speed up the test writing phase. I have a standard set of tests that I like to use for functions in my module. But copying and pasting code snippets is tedious. I know I could create a set of VS Code snippets, but that feels limiting and I’d have to make sure the snippets are available on all systems where I might be running VS Code. Instead, I wrote a PowerShell function to accelerate developing Pester 5.x tests.
My function takes a module and extracts all of the public exported functions. For each function, it creates a set of standard Pester assertions. These are the baseline or boilerplate tests that I always want to run for each function. Each function is wrapped in a Describe block. Although, I can opt for a Context block instead. This command will also insert tags. Note that my code for the tag insertion relies on the ternary operator from PowerShell 7.

Click through for the code.
