Press "Enter" to skip to content

Curated SQL Posts

Explaining Totals in Power BI

Sheil Bakhshi performs a comparison:

The long-running debate around how Power BI calculates totals in tables and matrices has been part of the community conversation for years. Greg Deckler has kept the topic alive through his ongoing “broken totals” posts on social media, often suggesting that Power BI should include a simple toggle to make totals behave more like Excel. His continued campaign prompted a detailed reply from Daniel Otykier in his article No More Measure Totals Shenanigans, and earlier, Diego Scalioni explored how DAX evaluates totals internally in his post Cache me if you can: DAX Totals behind the scenes.

This blog brings all those perspectives together from a scientific and comparative angle. It looks at how totals are calculated in Power BI and compares that behaviour with Tableau, Excel, Paginated Reports, and even T-SQL. The goal is not to take sides, but to clear up the confusion around what is happening under the hood.

This is a very detailed and dispassionate explanation that helps make sense of the debate.

Leave a Comment

Creating a Python Package via Poetry

Osheen MacOscar builds a package:

In this blog series (this and the next blog) I am going to demonstrate how to use Poetry to create a Python package, set up testing infrastructure and install it. I am going to be creating a wrapper around the Fantasy Premier League API and creating a function which can create a weekly league table.

This is a straightforward example of how to create a new Python package and add a function call to it.

Leave a Comment

Static and Dynamic Bulk Insert into SQL Server

Rick Dobson inserts some data:

There are numerous use cases for multi-file imports of CSV files into a SQL Server table:

  • Dynamic SQL Server bulk insert loads are especially appropriate for tasks that extract content from multiple files to a SQL Server table where the source file names change between successive import jobs.
  • Static bulk insert loads target scenarios where the source file names do not change between successive import jobs.

Read on for examples of how to implement each. Admittedly, bulk insert has rarely worked all that well in my experience, whether due to permissions mishaps, poor data integrity, or sudden changes in data types between file runs. But it does tend to work a lot better if you have a specified data interchange format and a standardized process to prepare the data and make it available on disk for insertion.

Leave a Comment

Fun with SQL Firewall in Oracle

Brendan Tierney follows up on a SQL Firewall post:

In a previous post, we’ve explored some of the core functionality of SQL Firewall in Oracle 23ai, In this post I’ll explore some of the other functionality that I’ve had to use as we’ve deployed SQL Firewall over the past few weeks.

Sometimes, when querying the DBA_SQL_FIREWALL_VIOLATIONS view, you might not get the current up to-date violations, or if you are running it for the first time you might get now rows or violations being returned from the view. This is a slight timing issue, as the violations log/cacbe might not have been persisted to the data dictionary. If you end up in this kind of situation you might need to flush the logs to to data dictionary. To do this, run the following.

Click through for that command, as well as a few other scenarios and commands that may be of interest.

Leave a Comment

Locks in Microsoft Fabric Data Warehouse

Twinkle Cyril enumerates the lock types in Fabric Data Warehouse:

Fabric DW supports ACID-compliant transactions using standard T-SQL (BEGIN TRANSACTION, COMMIT, ROLLBACK) and enforces snapshot isolation across all operations. Locks in Fabric Data Warehouse are used to manage concurrent access to metadata and data, especially during DDL operations. Here’s how locking works:

Click through for a chart. The locking policy is a lot simpler than what we see in SQL Server and you can see a description of the pros and cons of that simpler approach.

Leave a Comment

The Intricacies of COUNT()

Louis Davidson can easily get to 20:

I was reading LinkedIn posts the other day when I saw this blog about what was apparently an interview question about some forms of a COUNT aggregate function

This was apparently asked in an interview. What will each of these constructs do in a SQL statement:

COUNT(*) = ?
COUNT(1) = ?
COUNT(-1) = ?
COUNT(column) = ?
COUNT(NULL) = ?
COUNT() = ?

There’s one tricky bit in this set. Louis then takes it a bit further with CASE expressions and variables, so check out the post for the answers as well as those additional examples in T-SQL.

Leave a Comment

Evading Data Access Auditing in SQL Server and How to Stop It

Andreas Wolter brainstorms:

The methods that I will share here allow an attacker to either conceal his identity or even evade auditing completely.

Most of these commands require sysadmin privileges. However, if your goal is to audit every access to sensitive data, this typically means “all users”  –  not with an exception for administrators. Because of this, it’s important to understand these methods so you can make an informed decision about whether to include them in your auditing scope.

Some of these are wildly impractical, but they do work and Andreas has mitigations for each.

Leave a Comment

Extending Index Information across Availability Group Replicas

Aaron Bertrand extends a prior article:

In my previous tip, Managing Unused Indexes in SQL Server Availability Groups – Part 1, I showed how I use dynamic SQL to gather index usage statistics for a given table from all replicas in an availability group. Knowing the usage from all workloads is definitely better than focusing on only the primary or a single secondary. But what if I want to make more informed decisions, incorporating row counts, size, and index columns into this output?

Read on to see how you can incorporate this additional information.

Leave a Comment