Press "Enter" to skip to content

Curated SQL Posts

Dimensional Design Tips

Koen Verbeeck provides some helpful hints when designing dimensions in SQL Server Analysis Services Multidimensional models:

Although traditional dimension modeling – as explained by Ralph Kimball – tries to avoid snowflaking, it might help the processing of larger dimensions. For example, suppose you have a large customer dimension with over 10 million members. One attribute is the customer country. Realistically, there should only be a bit over 200 countries, maximum. When SSAS processes the dimension, it sends SELECT DISTINCT commands to SQL Server. Such a query on top of a large dimension might take some time. However, if you would snowflake (aka normalize) the country attribute into another dimension, the SELECT DISTINCT will run much faster. Here, you need to trade-off performance against the simplicity of your design.

There are several good tips here.

Comments closed

Smarter Differential Backups

Dennes Torres shows us how we can use a new column in an old DMV to make our full vs differential backup processes smarter:

What are the possibilities with this new field ? We are now able to check how many extents have changed since last full backup and decide if a full backup is really needed or we can live with a differential backup, achieving smarter backup plans.

Change our full backup jobs to first check this field and decide if the backup will be full or differential can save space and maintenance time with databases that aren’t updated so often.

Read on to learn more about this new column, which will be available in SQL Server 2017.

Comments closed

Cross-Database Queries With Azure SQL DB

Dustin Ryan shows how to set up cross-database queries within Azure SQL Database:

2. Vertical queries (in preview): A vertical elastic query is a query that is executed across databases that contain different schemas and different data sets. An elastic query can be executed across any two Azure SQL Database instances. This is actually really easy to set up and that what this blog post is about! The diagram below represents a query being issued against tables that exist in separate Azure SQL Database instances that contain different schemas.

Read on to learn how to implement vertical elastic queries today.

Comments closed

Using Common Table Expressions To Drive Queries

Lukas Eder wants one result set which returns records using predicate B if and only if there were no records using predicate A:

We’ve seen that we can easily solve the original problem with SQL only: Select some data from a table using predicate A, and if we don’t find any data for predicate A, then try finding data using predicate B from the same table.

Oracle and PostgreSQL can both optimise away the unnecessary query 2 by inserting a “probe” in their execution plans that knows whether the query 2 needs to be executed or not. In Oracle, we’ve even seen a situation where the combined query outperforms two individual queries. SQL Server 2014 surprisingly does not have such an optimisation.

Interesting totally-not-a-comparison between the three database products.  There are some things I’d ideally like the SQL Server optimizer to do with common table expressions, but as Lukas notes, it doesn’t, so user beware.

Comments closed

Multi-Channel Attribution With R

Sergey Bryl walks through some of the difficulties of the multi-channel attribution solution he came up with before:

The main steps that we will review are the following:

  • splitting paths depending on purchases counts

  • replacing some channels/touch points

  • a unique channel/touchpoint case

  • consequent duplicated channels in the path and higher order Markov chains

  • paths that haven’t led to a conversion

  • customer journey duration

  • attributing revenue and costs comparisons

There’s a lot there, and I like the practical explanations of issues when dealing with a real business problem.

Comments closed

Kerberos Constrained Delegation On Power BI Report Server

Regis Baccaro shows how to set up constrained delegation when connecting Power BI Report Server to a SQL Server instance or Analysis Services cube:

In many demo cases, you will have an all-in-one server where you have installed Power BI Report Server, SSAS (tabular or multidimensional) and SQL Server. In those cases you don’t need any form for credentials delegation since the Report Server is on the same box than the data source.

But there are scenarios where you have a distributed environment like the one I have on my VMs demo domain and for jumping around servers and passing credentials around, you need to setup Kerberos Constrained Delegation. Furthermore you will need protocol transition for it to work in Power BI Report Server.

Read on for step-by-step instructions showing how to do this.

Comments closed

Troubleshooting Distributed Replay Errors

Jonathan Kehayias walks through some common Distributed Replay errors and how to solve them:

On the Distributed Replay Controller machine, permissions need to be set in Component Services to allow the Distributed Replay Client Service accounts Launch and Activate permissions remotely on the COM component. The service accounts also need to be in the Distributed COM Users group in Windows. So in Component Services, expand Computers > My Computer > DCOM Config > DReplayController and right-click and open the Properties for the COM Component.

A lot of this boils down to permissions, but the error messages are about as unhelpful as they get.

Comments closed

Building Newlines In SQL Scripts

Shane O’Neill engages in wacky newline misadventures:

GO is a special little guy. It’s not exactly T-SQL. It’s a way of telling the SQL Server Management Studio (SSMS) to send everything before it, from the beginning of the script or the preceding GO, to the SQL Server instance.

If you read the documents, the main point to take away is…

A Transact-SQL statement cannot occupy the same line as a GO command. However, the line can contain comments.

The problem turns out to be a little trickier than you’d first imagine.  Also, after reading this post, I think it’s lunchtime…

Comments closed

Memory Requirements For Columnstore Rebuild/Reorg

Niko Neugebauer looks at memory requirements for rebuilding and reorganizing columnstore indexes:

To spare all the Wows & how can’s, Microsoft was well aware of this problem and has delivered a solution with Cumulative Update 3 for SQL Server 2016 with Service Pack 1:
FIX: SQL Server 2016 consumes more memory when you reorganize a columnstore index, and here it is – a new trace flag 6404 (documented in the link and thus should be supported), that will allow you to lower the memory requirements for the ALTER INDEX … REORGANIZE command.
Let’s take it for the test, by once again running the setup workload for the FactOnlineSales_Reindex table and then executing the following command, enabling the Trace Flag 6404 and then reorganising our Clustered Columnstore Index:

This is a rather interesting post and once again makes me wish that clustered columnstore indexes could be rebuilt online.

Comments closed