Author: Kevin Feasel

Query Store Bugfixes

Published 2019-01-25 by Kevin Feasel

Erin Stellato has a roundup of Query Store bugfixes as of January 2019:

The latest CUs for SQL Server 2016 and 2017 contain some important Query Store fixes that I thought worth mentioning for those of you on either version or those of you looking to upgrade. As of this writing, the current CU for SQL Server 2016 SP2 is CU5, and for SQL Server 2017 it is CU13. Many times we see fixes that make it into a SQL Server 2017 CU ported back to a SQL Server 2016 build. Interestingly enough, there are some Query Store fixes in 2016 CUs that are not in 2017 CUs. I don’t know if that’s because the issues do not exist in 2017, or if it’s just that they have been fixed yet in 2017. I’m planning to update this post if the fixes are added down the read. So here we go, in descending CU order…

This post is a great reason to keep those SQL Server instances up to date.

Comments closed

Multi-File Power BI

Published 2019-01-25 by Kevin Feasel

Marc Lelijveld shows us how to break up a single Power BI desktop file into several:

Normally a Power BI Desktop file (PBIX) contains your queries, data model, and reports (visualization). Looking at a multi-file strategy, we split this up into two (or more) files.
The first file only contains the queries and data model. The second file is directly connected to the first one (direct query) and reads the data model. The file it selves includes all report content like visualizations, booksmarks and everything related to that. By working this way, you will be able to build multiple reports based on the same dataset.

Click through for a demonstration.

Comments closed

Pure Versus Impure Functions

Published 2019-01-24 by Kevin Feasel

On the Knoldus blog, Siddhant describes pure and impure functions:

As we can see, a value is pure, if it conforms very strictly to its type. In case of pureFunctionValue the declared type said that it was a function which takes an Int and returns a String, and that was indeed what it did. It could take an Int and it gave us back a reference to a String value and did nothing else.
In case of impureFunctionValue the declared type said that it was a function which takes an Int and returns a String. Indeed we could feed it an Int, but when we did so, it did something else apart from returning us a String. It printed stuff out to the console. This, friends, was not expressed in the type, thus it is a side-effect of the function and thus the value in question is impure, and not exactly a function in the mathematical sense.

Pure functions are great because they’re easy to reason about: you have an input, you have an output, and you can guarantee that nothing else changes in between. Impure functions are great because if we only had pure functions, our programs would add zero value. Impure functions drive I/O, including the ability to see what those pure functions did. The trick in functional programming is to push as much logic into the pure space as possible, making it easier to focus on the impure space and make sure you didn’t goof up there.

Comments closed

Genetic Algorithms In R

Published 2019-01-24 by Kevin Feasel

Pablo Casas touches on one of my favorite lost causes:

In machine learning, one of the uses of genetic algorithms is to pick up the right number of variables in order to create a predictive model.
To pick up the right subset of variables is a problem of combinatory logic and optimization.
The advantage of this technique over others is that it allows the best solution to emerge from the best of the prior solutions. An evolutionary algorithm which improves the selection over time.
The idea of GA is to combine the different solutions generation after generation to extract the best genes (variables) from each one. That way it creates new and more fit individuals.
We can find other uses of GA such as hyper-tunning parameters, finding the maximum (or minimum) of a function, or searching for the correct neural network architecture (neuroevolution), among others.

I’ve seen a few people use genetic algorithms in the past decade, but usually for hyperparameter tuning rather than as a primary algorithm. It was always the “algorithm of last resort” even before neural networks took over the industry, but if you want to spend way too much time on the topic, I have a series. If you have too much time on your hands and meet me in person, ask about my thesis.

Comments closed

A Functional Approach To PySpark

Published 2019-01-24 by Kevin Feasel

Tristan Robinson shows us how we can implement a transform function which makes Python code look a little bit more functional:

After a small bit of research I discovered the concept of monkey patching (modifying a program to extend its local execution) the DataFrame object to include a transform function. This function is missing from PySpark but does exist as part of the Scala language already.
The following code can be used to achieve this, and can be stored in a generic wrapper functions notebook to separate it out from your main code. This can then be called to import the functions whenever you need them.

Things which make Python more of a functional language are fine by me. Even though I’d rather use Scala.

Comments closed

Highlighting Words In Powershell

Published 2019-01-24 by Kevin Feasel

Roman Gelman has a function which lets you highlight words in Powershell text:

Despite of the Write-HostHighlight function intended to work with a text (it highlights a substring within a string), it can operate with objects too. Generally, it accepts any input from any cmdlet. The output is not always looks nice, but it works! Let see some examples.

This looks a bit like newer versions of grep which highlight matched patterns. I like it.

Comments closed

Querying Cosmos DB Execution Metrics

Published 2019-01-24 by Kevin Feasel

Hasan Savran shows us how to retrieve execution metrics for a Cosmos DB call:

When I speak about CosmosDB, I always get questions like “How can I retrieve information about the execution plans?” or “Isn’t there a tool like SSMS which can show me what’s happening in the background?” Usually, questions like that comes from DBAs. If you have questions like that, I have good and bad news for you. Good news is, Yes you can get retrieve metrics from CosmosDB about execution plans. Bad news is, you need to know some programming to be able to do that because you need to use CosmosDB SDK.

The only way to access this information is from CosmosDB SDK 2.x. I couldn’t retrieve execution metrics by using SDK 3.x for custom queries. Here is the available metrics you can retrieve from CosmosDB queries.

I wonder if this is a “this is still new” thing, a “you don’t need these where you’re going” thing, or a “this is exactly how we envisioned implementation” thing. Especially around getting per-query metrics after the fact.

Comments closed

Visualization Failures

Published 2019-01-24 by Kevin Feasel

Stephanie Evergreen talks about two specific instances of self-inflicted visualization failure:

There’s a solid argument to be made that the scales in these charts shouldn’tstart at zero because we wouldn’t see any difference between the two years; all the lines would look flat. But there’s also a solid reason why they should start at zero—maybe I’m exaggerating the change if I don’t. Only the people who work closely with this data would know what kind of scale would fit best given the context of this foundation.
However, people on social media took notice of what they thought was a failure of mine and one commenter tweeted that “there’s no way [a dataviz Godfather] would approve this visual.” So, I got up the guts and sent the whole thing to the Godfather himself.
The Godfather wrote back: “To be honest, almost everything about your redesign is deceitful.” Ouch. I may have actually shed tears over this one. I was devastated.

There’s a good reminder here that failure is a critical part of learning.

Comments closed

Changing Server Collations En Masse

Published 2019-01-24 by Kevin Feasel

Hugo Kornelis has a script to change server collations across a large number of objects:

Problem is: the new instance was set up with a default collation that was different from the default collation of the old instance. And hence different from the database default collation. And when that resulted in errors, they responded by changing the default collation. Now everything ran and they were happy.
Fast forward two years. The database now has lots of “old” tables that use the original default collation for all string columns. But also lots of “new” tables that have their string columns using the new default collation. And the upgrade failed because somewhere in the new code is a comparison between two character columns that now have incompatible collation.

Click through for the script, as well as the standard disclaimer never blindly to run things in production.

Comments closed

Rotating Expired TDE Certificates

Published 2019-01-24 by Kevin Feasel

Chris Bell shows how you can quickly rotate TDE certificates, hopefully before they expire:

We have expired or expiring SQL TDE certificates! What now?
Well, the first thing we do is not panic. Even if our TDE certificate expires it won’t cause any issues. The SQL Server will continue to work normally. Even if we restore the DB elsewhere using the expired certificate we will just get a warning that the certificate is expired.
A warning is nice, and the system still working let’s us breathe a little easier, but we know that an updated certificate is a much better thing to have. In fact, setting up a regular key rotation schedule is even better and a recommended practice.

Good information, and Chris shares scripts to make it easy.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31