Press "Enter" to skip to content

Month: July 2023

Power BI Projects and Git Review

Teo Lachev wants to wait:

But to digress from Fabric, let’s take a look at two developer-oriented and frequently requested features that fortunately doesn’t require Fabric: Power BI Desktop projects and workspace integration with Git. The video in the link is a good starting point to understand how these features work.

Basically, the first feature lets you save a Power BI Desktop file as a *.pbip file which generates a set of folders with human-readable text files, such as the model.bim file (has the model definition described in JSON). Next, you can put these files under source control both in Power BI Desktop using any source-code enabled tool, such as Visual Studio Code or Visual Studio. Currently in preview, Power BI projects is an experimental feature (in fact, the Publish button is disabled so you can’t publish a project to Power BI Service).

Click through for Teo’s thoughts. This is something that I wish had been done 8 years ago, and as Teo points out, there’s still room to grow. But I do consider it a step forward from only being able to save PBIX files.

Comments closed

Understanding String Concatenation with FOR XML PATH

Brent Ozar did some noodling:

The first time I saw FOR XML PATH being used to generate a comma-delimited list, I think I stared at it, shook my head to clear the cobwebs, stared at it some more, and then closed the code editor thinking it was complete witchcraft.

And that same thing probably happened the next several times, too.

But eventually, I took a deep breath and read the code more closely to understand what it was doing.

I do like to joke that this is cryptic code that gets handed down from generation to generation, with each generation saying “Don’t touch the code, for you do not understand it.” But as of SQL Server 2017, you don’t need to do this anymore and can use STRING_AGG().

Comments closed

The Tally Table Splitter

Steve Jones talks tally tables:

That being said, years ago I got an article from Jeff Moden on the tally table. I hadn’t used this, and was fascinated. I know Itzik had written about numbers tables early on, but it hadn’t caught my attention. However, in a follow-up, Jeff wrote about a splitter function, which would use the tally table to split strings efficiently. This is the function (credit to Jeff in his article):

Click through for that function. The CLR-based tally table function is still faster, but if you can’t have CLR in your environment, and you split strings so frequently that you need a splitter, and STRING_SPLIT() just doesn’t do it for you (or you’re on an older version of SQL Server), this is a good solution. It’s also the foundation for a principle I have with T-SQL query tuning: sometimes you need to think in terms of pivoted or unpivoted data and operate on it versus the raw data. Here, we’re unpivoting a string into one row per character to get what we need out of it much more efficiently than if we tried to work the string by itself.

Comments closed

Recommended Automatic Indexes and BACPACs

Jose Manuel Jurado Diaz shares a note:

During a remote session with a customer, a comparison between the source and target databases revealed a discrepancy in the indexes. Specifically, while the source database contained three types of indexes (PK_xxx, IX_xxx, and NCI_xxx), only the manually created PK_xxx and IX_xxx indexes were visible in the target database after restoration. The absence of NCI_xxx indexes prompted an investigation into the BACPAC export process.

This meant that the BACPAC file didn’t have the set of recommended automatic indexes. Click through for the explanation, though I’m not sure there’s a good way to fix it, short of manually scripting out those automated indexes yourself.

Comments closed

A Story of Bad Data Modeling

Kendra Little unravels a puzzle:

I double-checked my queries. Had I goofed in my sql? Nope. Next, I looked into if some of the data was in an inconsistent state.

What I found was worse than what I’d imagined. As a data person, it made me feel sad and icky.

That’s because it’s usually not too hard to clean up bad data. It’s almost always much harder to fix a badly designed data model which is already established in production.

Read on for a tale as old as time: the clarion call of expediency now causing pain later.

Comments closed

T-SQL Query Transformations and Performance

Erik Darling isn’t satisfied with “good enough”:

Query tuning is hard work, especially when queries run for a long time and you have to test various changes trying to get it down to a reasonable runtime.

There are, of course, things you pick up intuitively over time, and figuring out where problems in actual execution plans has gotten easier with operator runtimes.

Beyond basic query tuning intuition comes the really creative stuff. The stuff you’re amazed anyone ever thought of.

Click through for two really interesting examples.

Comments closed

Datatables and Calculations in KQL

Robert Cain follows up on a prior post:

In the conclusion of my last post, Fun With KQL – Datatable, I mentioned a datatable can also be useful when you need to do special calculations. I decided it really needed further explanation and created this post to dive in a bit deeper.

If you haven’t read my datatable post yet, you should as I’ll refer back to it. Go ahead, I’ll wait.

Click through to see what Robert has in mind.

Comments closed

Indexing in SQL Server Graph

Hugo Kornelis offers a mea culpa:

In my post, I pointed out that SQL Server automatically creates a unique nonclustered index on the (internal and hidden) column graph_id, that you can delete, but can’t modify. I added that users cannot specify indexes on that column, nor on the other internal and hidden columns that exist in edge tables (from_obj_id, from_id, to_obj_id, and to_id). These statements are factually correct, but misleading.

I then concluded from those facts that once you delete the automatically generated index, you can not re-create it. This is actually incorrect.

Read on for the correct answer, as well as more information on indexing practices and advice on the right kinds of indexes to create on graph tables.

Comments closed

Fabric Data Integration

Teo Lachev reviews the primary methods for data ingestion in Microsoft Fabric:

Fabric supports three options for automated data integration: Data Pipeline (Azure Data Factory pipeline), Dataflow Gen2 (Power BI dataflow), and Notebook (Spark). I summarize these three options in the following table, which loosely resembles the Microsoft comparison table but with my take on it.

Read on for Teo’s thoughts on the matter.

Comments closed

VBS Enclaves for Always Encrypted in Azure SQL DB Elastic Pools

Pieter Vanhove makes an announcement:

A few months ago we announced the support for virtualization-based security (VBS) enclaves in Azure SQL Database. This announcement brings numerous advantages, including robust confidential queries and seamless cryptographic operations, to all Azure SQL Database offerings, independent from the underlying hardware. You can use the feature with any compute tier (provisioned or serverless), purchasing model (vCore or DTU), compute size and region that aligns with your workload needs. And, since VBS enclaves are available in existing hardware offerings, there is no additional cost.

In addition to this preview, we are excited to announce the preview of VBS enclaves in Azure SQL Database elastic pools!

Read on to learn more about how to enable enclaves and add a database to an elastic pool.

Comments closed