Press "Enter" to skip to content

Curated SQL Posts

Power BI Dataset CI/CD with Azure DevOps

Stephanie Bruno does a bit of continuous integration:

There’s a lot of information on how to get around the lack of an out-of-the box CI/CD solution for Power BI datasets, but for me it’s often complicated and I have to read too many pages before making much progress on my own. This post is here to strip it down and provide you with the easiest way we know to enable a bonafide CI/CD process for Power BI datasets with Azure DevOps. The post is still longer than we’d like, but it includes detailed step-by-step instructions to walk you through every part of the process. To save space, we used slideshows for the screenshots, but you can pause them as you follow along.

There are a lot of steps but the goal is a worthwhile one.

Comments closed

Tips for Enhancing Power BI User Experience

Mara Pereira provides some guidance:

Before we dive into the nitty-gritty, let’s address the elephant in the room – what exactly is user experience, and why is it such an indispensable factor in reporting and Power BI?

To put it simply, user experience (UX) is the overall impression and interaction your audience has with your report. It goes beyond aesthetics; UX focuses on the ease of understanding, navigation, and the ability to extract valuable insights from the data presented. In the world of reporting, having a top-notch user experience is crucial, as it can make or break the effectiveness and adoption of your reports.

The post stays mostly at a high level, providing motivational guidance rather than “here are the specific actions to take on a given report.” What it does provide is the reasoning behind why you would make those changes.

Comments closed

Which Power Query Operations Are Most Resource-Intensive?

Chris Webb answers a question:

Last year I wrote a post about a change in the Power BI Service that meant the CPU Time associated with Power Query transformations was added to the total shown for a dataset refresh operation in Profiler traces and Log Analytics:

https://blog.crossjoin.co.uk/2022/07/03/measuring-power-query-cpu-usage-during-power-bi-dataset-refresh/

This was useful, but it didn’t tell you directly how much CPU Time was used by Power Query and it didn’t tell you which tables or partitions in a refresh were using the most CPU. It also didn’t tell you anything about Power Query memory usage. The good news that recently there has been another change that solves these problems.

Click through for the solution.

Comments closed

ANSI SQL and Trailing Spaces

Chris Johnson finds a language quirk:

Recently I found a quirk of T-SQL, where a group by statement was treating strings as the same if the only difference was one or more trailing spaces. So, ‘aa’ would be grouped with ‘aa ‘. I did some digging, and this is what I found.

Yeah, this isn’t just Microsoft’s T-SQL variant—it’s a standard part of SQL, as Chris notes later in the post.

My “just-so” story is that this might have been implemented to deal with CHAR(x) comparisons, such as CHAR(2) to CHAR(3). There’s no way to make that comparison unless you treat trailing spaces as irrelevant. Because we almost always use VARCHAR(x) or NVARCHAR(x), it isn’t something top of mind to most database practitioners, but there is a method to the madness.

Comments closed

April Tools Day

Erin Stellato dispels some myths:

Myth #1 Azure Data Studio is the only standalone solution now that SSMS is deprecated.

SQL Server Management Studio (SSMS) is not deprecated.  We thought about writing that in ALL CAPS, but figured bold is sufficient.  SSMS has not been deprecated, and we are not planning on deprecating it.  You will see new functionality being added to Azure Data Studio, but we have a fair number of things lined up for SSMS, including migration to the Visual Studio 2022 shell, which brings 64-bit support.

Bold plus all caps might have been a bit too much, yeah.

Click through to see what’s happening in the world of SQL Server tooling from Microsoft.

Comments closed

Join Operations in BigQuery

Rathish Kumar joins a few tables together:

SQL joins are used to combine columns from multiple tables to get desired result set. In a typical Relational model we use normalized tables, each table represents an entity (example: employee, department, etc) and its relationships and when we need to get data from more than one tables, for example employee name and employee department, we use joins to combine employee name column from employee table, department name column from department table based on employee number key column, which is available on both the tables.

Similarly, typical data warehouse setup follows Star or Snowflake schema consisting of a primary fact table and satellite dimension tables. Fact tables represents events (example: orders table in a ecommerce business) and dimension table represents attributes and slowly changing information (example: customer, product tables).

The syntax is rather similar to most database engines, though there are a few physical join operators which differ from typical relational database management systems. Also, I’ll take this moment to say thank you to Rathish for not using Venn diagrams to show joins and instead using a proper technique.

Comments closed

Tracking Change Events in Snowflake

Kevin Wilkie shows off an interesting window function:

Notice that it has the OVER operator, you can order the data, and even partition the data as needed (Not seen in this example)!

But, as usual with Snowflake, there are even more functions we can work with! Sometimes, you just need to know when items are changed. Enter the CONDITIONAL_CHANGE_EVENT windowing function!

Click through for an example of how CONDITIONAL_CHANGE_EVENT() works.

Comments closed

Finding Columnstore Index Storage and Memory Allocations

Jose Manuel Jurado Diaz has a few scripts for us:

Today, we got a new question how much is the size used by a columnstore index table at storage level and memory usage.

TSQL to obtain the total number of rows, size per schema, table and index.

Using the view sys.column_store_row_groups (Transact-SQL) – SQL Server | Microsoft Learn we could see the total number of rows and space usage at storage level.

Click through for that script, as well as a few more to learn how much space and memory that columnstore index is taking.

Comments closed

Storing the Basis of Calculations, Not Results

Vladimir Khorikov hits on a topic of particular interest:

On the front-end, the duration is represented as 1h 47m. But how should we store it in our database? What data type should we use?

We could keep the duration as a 1h 47m string, just as we render it on the screen, but what if we later decide to change the string format? We might need to display it as 1:47 or 107 minutes instead. Do we then parse all the existing strings and convert them into a new format?

I like the thrust of this article. Two things I think people forget about in data modeling are ledgers and events, instead trying to store the results of these. Ledgers and events can be slower—though there are things we can do to pre-calculate more user-friendly results and speed up the process—but they provide you auditability, flexibility in how you want to calculate and display the information, and the ability to correct errors over time.

Comments closed

SQL Alerts on Potential Attacks

David Fowler keeps his head on a swivel:

But let’s be honest, the above attacks are just going to be opportunistic hackers probably using an automated tool to hunt out open servers and try to exploit them, a bit like the guy walking down the road and trying all the car doors in the hope that someone will have left their unlocked. It someone’s specifically targeting your servers, the chances are that they’ll be a bit more sophisticated.

This is where things get a little more troublesome and where SQL doesn’t really give us any tools to help.

But there are warning signs that we can look out for and these can be certain error messages in the logs. There are three in particular that I see as red flags that something amiss may be going on.

Click through for those three warning signs and how you an create SQL Server alerts upon logging these messages. The errors aren’t guaranteed to be an attacker, though in reading them, I’d expect a high signal to noise ratio on them.

Comments closed