Press "Enter" to skip to content

Curated SQL Posts

Backups on AWS RDS

Grant Fritchey shows how you can back up a database on Amazon’s RDS:

Which results in the following:

Msg 262, Level 14, State 1, Line 1
BACKUP DATABASE permission denied in database ‘HamShackRadio’.
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

Completion time: 2020-06-26T08:34:23.5511314-04:00

In short, by default, you can’t backup SQL Server databases on RDS. However, that’s by default. We can make some changes.

Read on to see the proper way of backing up a database hosted in RDS.

Leave a Comment

Another Way to Calculate Elapsed Business Hours with DAX

Matt Allington follows up on a previous post:

Then, sometimes (like this time) I discover that someone has a better way to solve the same problem that I shared on my blog. This is what happened last week after I shared my first article about how to calculate the total business hours between 2 date/time stamps. I shared the way I solved this problem last week, but one of my readers, Daniil Bogomazov, shared a brilliant alternative solution to the same problem. The solution is so good that I am sharing his solution with you here today.

Read on for a clever solution and a detailed comparison to Matt’s prior answer.

Leave a Comment

The Basics of Spark Streaming

Muskan Gupta gives us an introduction to Spark Streaming:

Spark Streaming is an extension of the core Spark API that enables scalable, high-throughput, fault-tolerant stream processing of live data streams. It was added to Apache Spark in 2013. We can get data from many sources such as Kafka, Flume etc. and process it using functions such as map, reduce etc. After processing we can push data to filesystem, databases and even to live dashboards.

In Spark Streaming we work on near real time data. It divides the received input stream into batches. The Spark Engine processes the batches and generate final output in batches.

Read on to understand the key mechanisms behind Spark Streaming.

Leave a Comment

Tips for Optimizing Columnstore Indexes

Ed Pollack continues a series on columnstore indexes:

This is worth a second mention: Avoid updates at all costs! Columnstore indexes do not treat updates efficiently. Sometimes they will perform well, especially against smaller tables, but against a large columnstore index, updates can be extremely expensive.

If data must be updated, structure it as a single delete operation followed by a single insert operation. This will take far less time to execute, cause less contention, and consume far fewer system resources.

Read on for several more tips along these lines.

Leave a Comment

Role-Playing Dimensions in Power BI

Martin Schoombee explains the concept of role-playing dimensions and then explains how that works in the Power Bi world:

In technical terms a role-playing dimension is when a dimension table has multiple (foreign key) relationships to the same fact table. In more non-technical terms, it is when you have the same attribute (“Date” for instance) that can relate to the same metrics in different ways.

If you look at the data model below as an example, you can see the Date entity could be used to reference either the Invoice Date or Delivery Date from the Sales entity, changing the perspective of the metrics we’re looking at.

It’s a little surprising to me that there isn’t an easier way to handle this concept. Role-playing dimensions are a core part of the Kimball model, and they’re common enough that you’d expect support to be a bit simpler.

Leave a Comment

Formatting TimeSpans in Powershell

Jeffrey Hicks shows how we can format TimeStamp objects:

This is pretty straight forward. Subtract the LastBootUpTime property from the current datetime to get a timespan object that shows how long this computer has been up and running. But…I want to get rid of the milliseconds value.  It’s irrelevant as far as I’m concerned and takes up space that I might want to use for another property.  I need to format that  value.

Click through for several methods.

Leave a Comment

Availability Group Bug when Removing and Adding the Same Database

Josh Darnell takes us through a tricky problem:

I came across a bug in SQL Server 2016 where the Availability Group (AG) health check can get stuck in an infinite loop after removing and re-adding a database from an AG.

Unfortunately, I don’t know exactly what version this bug was introduced. I first noticed the problem on SQL Server 2016 SP2 CU7 GDR (13.0.5366.0). It may have existed before then, but I never encountered it.

Read on for a workaround. And hopefully there will be a proper fix soon. Also, it’d be interesting to see if it can be reproduced in 2017 or 2019.

Leave a Comment

Delayed Prefetch and Hidden Reads

Hugo Kornelis looks at when worlds collide:

So let’s check. The picture above shows, side by side, the properties of the Index Seek and the Key Lookup operator. They show that the Index Seek did 3 logical reads only, while Key Lookup did 650 logical reads. A clear indication where the majority of the work is done.

But wait. Aren’t we missing something?

The SET STATISTICS IO ON output indicates a total of 722 logical reads. The two screenshots above add up to 653 logical reads. Where are the other 69 logical reads?

Read on for the answer.

Leave a Comment

Pattern-Matching and Text Extraction in Power Query

Imke Feldmann shows how we can match specific patterns in Power Query, which lacks regular expresssions:

I plan to approach this by

1. stepping through the string and check each character if it is valid.
2. If so, store it
3. and if not, forget it. And in that case, also forget the stored values so far so the collection starts from scratch.
4. Then if a series of matches builds up, I will have to check the count of the stored values to not exceed the length of my target pattern.
5. Once the length is reached, no further checks shall be performed and the found values be returned.

My aim is to find a generic way so that this solution can be adapted to many other similar use cases as well. Therefore the pattern must be described in a scalable way and the identification of the pattern elements should be easy to compute.

It’s good to be able to adapt, improvise, and overcome, though this is exactly what regular expressions are intended to do. It’s odd that there appears to be no built-in capability, where instead you’d have to do something like bring in external languages like JavaScript.

Leave a Comment