Press "Enter" to skip to content

Curated SQL Posts

Creating an Income Statement in Power BI

Joseph Yeates continues a series on financial statements in Power BI:

I had created a measure to populate the matrix visual. It uses =SWITCH() to return the YTD amount for the revenue and expense row headers and returns a subtotal for the net income header.

To finish building the income statement, I needed to add two more line items: retained income from the beginning and end of the year. I had already created the categories in the Power Query Editor, so I had to update the DAX statement to return logic for these lines.

Read on to see how to add these.

Comments closed

Powershell Runspaces

Mark Wilkinson has a new blog and an itch to explain the notion of runspaces in Powershell:

So I have a problem with some of the posts I’ve read about runspaces. It all comes down to a small detail that I think makes a big difference in your understanding of them.

$Runspace = ::Create()

This code looks innocent. What does it do? You’d probably think it’s creating a new runspace, but it’s not. This code is instead creating a fresh instance of PowerShell. If you run this code and run Get-Runspace you’ll see there is still just one listed, the one attached to your current session. So what is this instance we just created?

Click through for an example and a step-by-step breakout of that example.

Comments closed

Interleaved Execution with SQL Server

Milos Radivojevic takes us through improvements with interleaved execution in SQL Server:

As you might know, the Interleaved Execution is the member of the Intelligent Query Processing family of features. It has been introduced with SQL Server 2017 (as a part of the Adaptive Query Processing). It is designed to improve the performance of queries referencing multi-statement table-valued functions (MSTVF). Actually, it addresses currently only queries using MSTVF, but is hopefully designed for much more. The query optimizer usually has two issues with queries using MSTVF:

MSTVF is a black-box for the optimizer; it does not know what’s inside, it cannot perform cross-statement optimization (as it is a case with inline TVFs) and it assumes it is a cheap and fast operation
MSTVF has a fixed cardinality of 100 (prior to SQL Server 2014, it was 1)

Interleaved execution does not improve the first issue (MSTVF is still a black-box for the optimizer), but solves the cardinality issue.

Read on to understand how this second aspect has changed for the better.

Comments closed

Working with Multiple DataFrames in Pandas

Zehra Can shows how to concatenate and work with multiple data frames at once in Pandas:

Data can be selected from data frames by using loc and iloc options:

Loc is used for selecting rows and columns by index and value label, columns can be selected by column names,

Iloc is used for selecting rows and columns by their indexes.

This is a demo-heavy tutorial style of post, so it has plenty of code to look at.

Comments closed

Rebuilding Rowstore Indexes Online on Tables with Columnstore Indexes

Niko Neugebauer explains something about rebuilding rowstore indexes:

This blogpost will be about older SQL Server versions (2016, 2017) and some implications that I have found that people rather do not understand, until they hit the problems in productions – the ONLINE creation & rebuild operations for the Rowstore indexes when having Columnstore Indexes on their tables.
As you should know by now – SQL Server 2017 & SQL Server 2019 respectively brought the ONLINE features support for the Columnstore Indexes (Nonclustered(2017) & Clustered(2019)) and I have blogged about those news in Columnstore Indexes – part 96 (“Nonclustered Columnstore Index Online Rebuild”) in 2017 and in Columnstore Indexes – part 123 (“Clustered Columnstore Index Online Rebuild”) in 2018.

What I honestly did not expect is the amount of the same question I have faced as in the last 3 months regarding the rebuild operations for the the ONLINE creation & rebuild operations for the Rowstore indexes when having Columnstore Indexes.

Read on to see the demonstration for when you have a clustered columnstore index and a nonclustered columnstore index.

Comments closed

Row-Level Security in Power BI Reports

Drew Skwiers-Koballa shows how to use an embed token to implement row-level security with Power BI:

To present a PowerBI report user or consumer with a securely pre-filtered dataset, row level security must be used. In a PowerBI embedded architecture where “app owns data”, implementing row level security (RLS) requires a modification to the token generation request. By specifying a role and user in the token request, we can generate an embed token specific to the user’s data access.

Click through for the instructions.

Comments closed

Implementing Soft Deletes in SQL Server

Brent Ozar shows how to use soft deletes in SQL Server:

Normally when you run a DELETE statement in a database, the data’s gone.

With the soft delete design pattern, you add a bit column like IsDeleted, IsActive, or IsArchived to the table, and instead of deleting rows, you flip the bit column. This can buy you a few advantages:

– Easier/faster undeletes
– History tracking (keeping deleted rows around for auditing purposes, although the method I’m going to show here doesn’t do anything fancy like track who did the delete)
– Easier reconciliation during disaster recovery failovers (refer to the Senior DBA class module on recovering from failovers for more info)
– Lower workload for Availability Groups secondaries (neat scenario from Aaron Bertrand)

Read on to see how and what the costs of this are.

Comments closed

Delta Lake and ACID Properties

Kundan Kumarr notes that Spark’s Delta Lake allows for ACID transactions:

DeltaLog is the crux of Delta Lake which ensures atomicity, consistency, isolation, and durability of user-initiated transactions. DeltaLog is an ordered record of transactions. Every transaction performed since the inception of Delta Lake Table, has an entry in the DeltaLog (also known as the Delta Lake transaction log). It acts as a single source of truth, giving users access to the last version of a DeltaTable’s state. It provides serializability, the strongest level of isolation level. Let’s see how DeltaLog ensures ACID Transactions.

Click through for the explanation.

Comments closed

Check Those R Repos

John Mount has a public service announcement:

In a lot of our R writing we casually say “install from CRAN using install.packages('PKGNAME')” or “update your packages by using update.packages(ask = FALSE, checkBuilt = TRUE) (and answering ‘no’ to all questions about compiling).”

We recently became aware that for some users this isn’t complete advice.

The above depends on your R install pointing to a repository that is in fact up to date. To check what repositories you are using please use the command options('repos').

The specific example here is around the Microsoft R Archive Network (MRAN), which stays at fixed dates. This is for a good reason: because it helps companies standardize on a known set of versions of R packages by default. That way you don’t have version 1.8 of a package in dev and then get 1.9 in production and find out that something broke between the two versions.

Comments closed