Author: Kevin Feasel

A quick(er) post this week, in response to Greg Low’s blog post from a few weeks ago titled “Don’t start identity columns or sequences with large negative values.”
Greg writes that you shouldn’t use large negative values in a table, because… it’s hard to read them, I guess? And also they don’t compress well.
I disagree … to a degree. Dang, words are hard. Anyway, when I design a table I create what’s called a surrogate key as my primary key: a value intended to uniquely identify a row so that it can participate efficiently in relational activities like joins and foreign keys. In other words, the identity column is not for me, it’s for the database engine. I don’t need to worry about what value a row has. I choose the data type for that identity column based on the estimated number of rows, not whether I can memorize that a [StatusID] of 5 means something. Magic numbers are bad, mmmkay?

I don’t mind using negative values, especially for things like queue tables where the rows are ephemeral. The identity values may be harder to read, but as Randolph points out, in those types of cases, you aren’t really reading the values anyhow.
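To put rough numbers on the "choose the data type based on the estimated number of rows" point, here is a minimal Python sketch. It assumes an identity increment of 1 and uses the documented ranges of SQL Server's integer types; the function names (`value_range`, `capacity`, `smallest_type`) are hypothetical and only there for illustration.

```python
# SQL Server integer types and their storage sizes in bytes.
INT_TYPES = [("tinyint", 1), ("smallint", 2), ("int", 4), ("bigint", 8)]


def value_range(type_name):
    """Return the (lowest, highest) value the type can hold."""
    bits = dict(INT_TYPES)[type_name] * 8
    if type_name == "tinyint":
        # tinyint is the one unsigned type: 0 through 255.
        return 0, 2 ** bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def capacity(type_name, seed):
    """Count the identity values available when counting up from seed by 1."""
    low, high = value_range(type_name)
    if not low <= seed <= high:
        raise ValueError(f"seed {seed} does not fit in {type_name}")
    return high - seed + 1


def smallest_type(estimated_rows, start_at_minimum=False):
    """Pick the narrowest type that can hold estimated_rows identity values."""
    for name, _ in INT_TYPES:
        low, _ = value_range(name)
        seed = low if start_at_minimum else 1
        if capacity(name, seed) >= estimated_rows:
            return name
    raise ValueError("estimate exceeds the range of bigint")


if __name__ == "__main__":
    print(capacity("int", 1))
    print(capacity("int", value_range("int")[0]))
    print(smallest_type(3_000_000_000))
    print(smallest_type(3_000_000_000, start_at_minimum=True))
```

Seeding at the bottom of the range roughly doubles the values available in a signed type, which can be the difference between a 4-byte and an 8-byte key for a table expected to hold around three billion rows.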

XML and JSON Creation in SQL Server vs Elsewhere

Published 2020-05-07 by Kevin Feasel

Lukas Eder walks us through converting result sets to XML and JSON using different platforms:

SQL Server supports transforming flat tabular SQL result sets into hierarchical structures by convention using the convenient FOR XML or FOR JSON syntaxes. This is really convenient and less verbose than the standard SQL/XML or SQL/JSON APIs – although the standard ones are more powerful.
In this blog post, I’d like to show a few core features of the SQL Server syntax, and what they correspond to in standard SQL. jOOQ 3.14 will support both SQL Server’s syntax and the standard syntax, and will be able to translate from one to the other, such that you can use SQL Server syntax also on Db2, MariaDB, MySQL, Oracle, PostgreSQL. You can play around with the current state of development on our website here.

Click through for a series of examples.

Filtered Indexes and the Optimizer

Published 2020-05-07 by Kevin Feasel

Paul White covers a couple of issues around filtered indexes:

This is a many-to-many merge join, where the execution engine must keep track of duplicates from the outer input in a worktable, and rewind as necessary. Duplicates? We are scanning a unique index!
It turns out the optimizer does not know that a filtered unique index produces unique values. This is a one-to-one join, but the optimizer costs it as if it were many-to-many. The higher estimated cost of a many-to-many merge join explains why a hash join plan is chosen.

Read the whole thing.

WAITFOR TIME Instead of DELAY

Published 2020-05-07 by Kevin Feasel

Denis Gobo points out that there is a second option when using WAITFOR:

I was looking at some code I wrote the other day and noticed the WAITFOR command. This got me thinking: how many times have I used WAITFOR in code? Probably as much as I have used NTILE 🙂
I looked at the documentation for WAITFOR and noticed there is TIME in addition to DELAY. Oh, that is handy. I always rolled my own ghetto-style version by calculating how long it would be until a specific time and then using that in the WAITFOR DELAY command.

Read on to see why you might use WAITFOR and how to use each option. I use it a bit more frequently than Denis, though I still need to run a test each time to get the syntax right…
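For a sense of what the roll-your-own approach involves, here is a small Python sketch of the arithmetic. It is not Denis's code: the function names are hypothetical, and it assumes the target is the next occurrence of a given time of day. It works out the hh:mm:ss gap for WAITFOR DELAY and, for contrast, builds the equivalent WAITFOR TIME statement.

```python
from datetime import datetime, time, timedelta


def delay_until(target: time, now: datetime) -> str:
    """Return the hh:mm:ss gap between now and the next occurrence of target."""
    candidate = datetime.combine(now.date(), target)
    if candidate <= now:
        # The target time has already passed today, so aim for tomorrow.
        candidate += timedelta(days=1)
    total_seconds = int((candidate - now).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def waitfor_delay_statement(target: time, now: datetime) -> str:
    """Hand-rolled version: compute the gap and wait that long."""
    gap = delay_until(target, now)
    return f"WAITFOR DELAY '{gap}';"


def waitfor_time_statement(target: time) -> str:
    """Built-in version: let SQL Server wait until the time of day itself."""
    clock = target.strftime("%H:%M:%S")
    return f"WAITFOR TIME '{clock}';"


if __name__ == "__main__":
    now = datetime(2020, 5, 7, 21, 15, 30)
    print(waitfor_delay_statement(time(22, 20, 0), now))
    print(waitfor_time_statement(time(22, 20, 0)))
```

The TIME form drops the date arithmetic entirely, which is most of the appeal.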

Dynamic File Pruning on Delta Lake

Published 2020-05-06 by Kevin Feasel

Ali Afroozeh, et al., take us through Dynamic File Pruning in Databricks Runtime 6.1:

In addition to eliminating data at partition granularity, Delta Lake on Databricks dynamically skips unnecessary files when possible. This is possible because Delta Lake automatically collects metadata about the data files it manages, so data can be skipped without accessing the files themselves. Prior to Dynamic File Pruning, file pruning only took place when queries contained a literal value in the predicate, but now it works for both literal filters and join filters. This means that Dynamic File Pruning now allows star schema queries to take advantage of data skipping at file granularity.

There are some interesting performance results here. I’d also be curious to see how robust the results are as queries get more complicated.
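The general idea of skipping files from per-file metadata can be sketched in a few lines of Python. This is a toy model, not how Databricks implements it: it assumes each data file records the minimum and maximum join key it contains, and the names (`FileStats`, `prune_files`, the file paths) are made up for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file metadata: the key range stored in one data file."""
    path: str
    min_key: int
    max_key: int


def prune_files(files, join_keys):
    """Keep only files whose [min_key, max_key] range could contain a join key."""
    keys = sorted(set(join_keys))
    kept = []
    for f in files:
        # Find the smallest join key that is >= the file's minimum.
        i = bisect_left(keys, f.min_key)
        if i < len(keys) and keys[i] <= f.max_key:
            kept.append(f)
    return kept


if __name__ == "__main__":
    fact_files = [
        FileStats("part-0", 1, 100),
        FileStats("part-1", 101, 200),
        FileStats("part-2", 201, 300),
    ]
    # Keys that survive a filter on the dimension side of the join.
    dimension_keys = [150, 160]
    for f in prune_files(fact_files, dimension_keys):
        print(f.path)
```

With a literal predicate the key list is known up front; with a join filter it only becomes known at run time, once the dimension side has been filtered, which is where the "dynamic" part comes in.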

Running Postman Collections via Newman and Jenkins

Published 2020-05-06 by Kevin Feasel

Ankur Thakur takes us through configuring Postman automation:

We can write the automation test suite for any service which can be used for performing regression testing. But we can also integrate our automation test scripts using Newman in Jenkins CI/CD pipeline which can be triggered automatically whenever a pull request gets merged.
We need software installed on our system locally:
– Jenkins
– Node
– NPM
– Postman (For writing the collection)

We’ve done this in our environment and I’m reasonably happy with the results. If you have a REST API, it’s a fair sight easier to work with than something like Specflow.

Optimizing Slow Card Visuals in Power BI

Published 2020-05-06 by Kevin Feasel

Marco Russo helps us tune Power BI reports containing a large number of card visuals:

Every visual element in a Power BI report must complete a number of tasks to provide the expected result. Visuals showing data must generate one or more DAX queries to retrieve the required measures applying the correct filters. The execution of these queries increases the waiting time for the end user, and increases the workload on the server, especially when multiple users access a published report at the same time. In order to improve the performance and the scalability of a report, the best practice is reducing the number of visuals consuming data published in a page of a report.
The focus is on a single page of the report. Power BI only gets data and builds the visualizations required for the active page of a report. When the user switches the focus to a different page, the waiting time only depends on the visuals of the new page. The content of other pages of the same report is not relevant for the performance. The goal is reducing the number of visuals in a single page of a report. This could be challenging in order to obtain the same report layout, but we can look for the right visualization once we realize that the number of visuals in the same page is negatively affecting the user experience.

Less is more here.

Settings for Tuning Backup Performance

Published 2020-05-06 by Kevin Feasel

Stuart Moore takes us through backup tuning options in dbatools:

Backup-DbaDatabase produces a lot of output, but the one you’re really going to be interested in is Duration. Your job is going to be to minimise this as much as possible without impacting database performance.
To test changes in isolation you’ll want to remove the biggest variables in backup performance: network and storage speed. To do this, there’s a magical backup device that’s got near infinite speed and storage you can use. This is the black hole known as the bit bucket, /dev/null or NUL:. Everything thrown into this pit disappears as fast as it’s put in.

Read the whole thing. And if you’re interested in a systematic method of testing and understanding the impact of these settings on your system, I have a presentation on the topic.

Issues with Window Functions and Views / Derived Tables

Published 2020-05-06 by Kevin Feasel

Paul White takes us through some of the difficulties you are liable to see when using window functions in views and derived tables:

Our expectation is that the execution plan for this new query will be exactly the same as before we created the view. The query optimizer should be able to push the product filter specified in the WHERE clause down into the view, resulting in an index seek.
We need to stop and think a bit at this point, however. The query optimizer can only produce execution plans that are guaranteed to produce the same results as the logical query specification. Is it safe to push our WHERE clause into the view?

Read the whole thing.
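To see why the question matters, here is a toy Python model of a ROW_NUMBER-style window function. It says nothing about the optimizer's actual rules, and the column names (`product`, `txn`, `qty`) are hypothetical. It compares filtering before numbering (the pushed-down form) with filtering after numbering (the form the query text specifies).

```python
from itertools import groupby
from operator import itemgetter


def with_row_number(rows, partition_key, order_key):
    """Number rows within each partition, ordered by order_key, starting at 1."""
    ordered = sorted(rows, key=lambda r: (r[partition_key], r[order_key]))
    numbered = []
    for _, group in groupby(ordered, key=itemgetter(partition_key)):
        for n, row in enumerate(group, start=1):
            numbered.append({**row, "rn": n})
    return numbered


def filter_then_number(rows, predicate):
    """The pushed-down form: apply the filter below the window function."""
    return with_row_number([r for r in rows if predicate(r)], "product", "txn")


def number_then_filter(rows, predicate):
    """The logical form: number every row in the view, then filter."""
    return [r for r in with_row_number(rows, "product", "txn") if predicate(r)]


ROWS = [
    {"product": 1, "txn": 10, "qty": 5},
    {"product": 1, "txn": 11, "qty": 1},
    {"product": 2, "txn": 12, "qty": 7},
    {"product": 2, "txn": 13, "qty": 9},
]

if __name__ == "__main__":
    def on_partition(r):
        return r["product"] == 1

    def on_other_column(r):
        return r["qty"] < 5

    # A filter on the partitioning column gives the same result either way.
    print(filter_then_number(ROWS, on_partition) == number_then_filter(ROWS, on_partition))
    # A filter on another column changes the row numbers when pushed down.
    print(filter_then_number(ROWS, on_other_column))
    print(number_then_filter(ROWS, on_other_column))
```

A filter on the partitioning column removes whole partitions, so the numbering inside the surviving partitions is untouched. A filter on any other column removes rows from inside a partition and shifts the numbers, so pushing it down would change the results.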

Creating Currency Formatting Strings with Power BI

Published 2020-05-06 by Kevin Feasel

Gilbert Quevauvilliers walks us through formatting currencies via calculation groups in Power BI and Analysis Services:

When I first started looking at the calculation groups and changing the currency formats, I thought that my existing currency format was correct. Boy was I wrong and once I found that out and corrected it, my Currency Format Strings started working.
As per the Microsoft documentation, Dynamic format strings for currency conversion, I had to make sure that my currency format followed this pattern.

Read on for an example and demonstration.
