Month: January 2025

Truncating All Tables while Preserving Foreign Keys in T-SQL

Ronald Kraijesteijn builds a script:

When testing a data warehouse, a common challenge is managing large datasets effectively. Often, you need to reset tables to a clean state, ensuring consistent testing environments. The most efficient way to clear a table is using the SQL command TRUNCATE TABLE. However, this command is not straightforward when foreign key constraints are present. In this article, we’ll explore a solution that temporarily disables constraints, allows truncation, and then restores the constraints—keeping your data model intact.

Click through for the script, which saves a record of all of the foreign key constraints, truncates each table, and then re-creates the foreign keys.
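
To make the order of operations concrete, here is a minimal Python sketch that only builds the T-SQL statements: drop each saved foreign key, truncate each table, then re-create the keys. This is not Ronald's script. The ForeignKey record, the build_truncate_script helper, and the sample table names are all hypothetical, and the sketch assumes single-column keys that you have already read from somewhere like sys.foreign_keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    """Hypothetical saved record of one single-column foreign key."""
    name: str
    child_table: str    # schema-qualified, e.g. "dbo.OrderLine"
    child_column: str
    parent_table: str   # schema-qualified, e.g. "dbo.Orders"
    parent_column: str


def quote_ident(name):
    """Bracket-quote one identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_table(name):
    """Bracket-quote a schema-qualified name: dbo.Orders -> [dbo].[Orders]."""
    return ".".join(quote_ident(part) for part in name.split("."))


def build_truncate_script(tables, foreign_keys):
    """Return T-SQL statements in order: drop keys, truncate, re-create keys."""
    drops = [
        f"ALTER TABLE {quote_table(fk.child_table)} "
        f"DROP CONSTRAINT {quote_ident(fk.name)};"
        for fk in foreign_keys
    ]
    truncates = [f"TRUNCATE TABLE {quote_table(t)};" for t in tables]
    creates = [
        f"ALTER TABLE {quote_table(fk.child_table)} WITH CHECK "
        f"ADD CONSTRAINT {quote_ident(fk.name)} "
        f"FOREIGN KEY ({quote_ident(fk.child_column)}) "
        f"REFERENCES {quote_table(fk.parent_table)} "
        f"({quote_ident(fk.parent_column)});"
        for fk in foreign_keys
    ]
    return drops + truncates + creates


if __name__ == "__main__":
    # Made-up sample metadata, purely for illustration.
    keys = [ForeignKey("FK_OrderLine_Orders", "dbo.OrderLine", "OrderID",
                       "dbo.Orders", "OrderID")]
    for statement in build_truncate_script(["dbo.OrderLine", "dbo.Orders"], keys):
        print(statement)
```

The key point is that the foreign key definitions have to be captured before anything is dropped, because once the constraints are gone there is nothing left in the database to rebuild them from.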

Automatically Refreshing a Power BI Semantic Model after Dataflow Loads

Reza Rad refreshes a model:

Although this seems to be a simple thing to do, it is not a function that you can turn on or off. If you have a Dataflow that does the ETL and transforms and prepares the data, then to get the most up-to-date data into the report, you will need to refresh the Power BI semantic model after that, only upon successful refresh of both dataflow and semantic model is when you will have the up-to-date data into the report. Fortunately, in Fabric, this is a straightforward setup. In this article and video, I’ll explain how this is possible.

Click through for the video and the blog post. Granted, this feature is in preview, but using it is pretty straightforward.

Calculating the Distance between Points via T-SQL

Sebastiao Pereira gets out the measuring tape:

How do you calculate the distance between two different points on a sphere using TSQL in SQL Server? In this article, we look at how this can be done to calculate different points on the globe.

Sebastiao first shows the raw calculation, then uses the GEOGRAPHY data type to simplify the job. Note that these are “as the crow flies” distances and not travel distances, as there’s no information on roads.

Also, these are calculations specific to Earth. Which sounds like the setup for a joke, but it’s really not. If you have a smaller sphere (or oblate spheroid, if you will) and you want to calculate the distance, use the GEOMETRY data type instead.
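
For a feel of what a raw great-circle calculation looks like, here is a minimal Python sketch using the haversine formula. It is one common way to do this and not necessarily the formula Sebastiao uses. The mean Earth radius constant and the radius_km parameter are my assumptions, and the sketch treats the planet as a perfect sphere, which the GEOGRAPHY data type does not.

```python
import math

# Assumed mean Earth radius in kilometers; swap in another value for another sphere.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two (latitude, longitude) points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)

    # Haversine of the central angle between the two points.
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)

    # Clamp to guard against floating-point drift just above 1.0.
    a = min(1.0, a)
    return 2 * radius_km * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Two points on the equator, a quarter of the way around the globe.
    print(haversine_km(0.0, 0.0, 0.0, 90.0))
```

Because the radius is just a parameter, the same function works for any sphere, but it is still an "as the crow flies" number and knows nothing about roads.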

Power BI Embedded and Direct Lake Mode

Hasan Abo Shally announces a new preview feature:

As we step into 2025, we’re excited to announce the preview of Power BI Embedded with Direct Lake Mode, a new feature designed to enhance how developers and Independent Software Vendors (ISVs) provide embedded analytics in their applications. This capability, available in preview starting Q1 2025, leverages the power of Direct Lake Mode to allow an even enhanced performance and experience for embedded analytics.

This probably affects a small audience, though the functionality is good to see.

Drawbacks of Version Store in SQL Server

Haripriya Naidu points out that nothing comes for free:

With every feature comes both advantages and disadvantages and it is important to be aware of both. In regards to version store, we had discussed how to understand version store better and how version store is enabled.

Today, we will talk about the trade-offs and implications that come along when version store is enabled.

Click through for the downsides. With storage prices and speeds where they are, the trade-offs of using the version store for RCSI or Snapshot isolation are almost always worth it. But it’s still good to know what those trade-offs are and how you can end up in a degenerate state.

An Overview of HyperLogLog

Bhala Ranganathan talks about a powerful algorithm:

Cardinality is the number of distinct items in a dataset. Whether it’s counting the number of unique users on a website or estimating the number of distinct search queries, estimating cardinality becomes challenging when dealing with massive datasets. That’s where the HyperLogLog algorithm comes into the picture. In this article, we will explore the key concepts behind HyperLogLog and its applications.

HyperLogLog is the algorithm that SQL Server uses in the APPROX_COUNT_DISTINCT() function to make it so much faster than a regular COUNT(DISTINCT) while still providing correctness guarantees within a fixed percentage error: the function guarantees a 2% or lower error rate with a 97% probability.
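
If you want to see the idea in miniature, here is a rough Python sketch of a HyperLogLog counter. It is a teaching toy and not how SQL Server implements APPROX_COUNT_DISTINCT(). The class name, the choice of 2^12 registers, and the use of BLAKE2b as the 64-bit hash are all my assumptions.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog counter; p must be at least 7 for the alpha formula used."""

    def __init__(self, p=12):
        self.p = p                      # number of hash bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 when p is 12)
        self.registers = [0] * self.m

    def add(self, item):
        # Hash the item to 64 bits (BLAKE2b is an assumed choice of hash).
        digest = hashlib.blake2b(str(item).encode("utf-8"), digest_size=8).digest()
        x = int.from_bytes(digest, "big")

        tail_bits = 64 - self.p
        index = x >> tail_bits                  # top p bits select the register
        tail = x & ((1 << tail_bits) - 1)       # remaining bits

        # Rank is the 1-based position of the leftmost 1 bit in the tail,
        # or tail_bits + 1 when the tail is all zeros.
        rank = tail_bits - tail.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)        # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)

        # Small-range correction: fall back to linear counting while
        # many registers are still empty.
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLog()
    for i in range(50000):
        hll.add(f"user-{i}")
    print(round(hll.count()))
```

The memory cost is the interesting part: the counter keeps 4,096 small integers no matter how many items you feed it, and adding the same item twice never changes the registers, which is why duplicates do not inflate the estimate.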

Partitioned Tables and Indexes in PostgreSQL

Hettie Dombrovskaya runs into an error:

Here is a story. When anyone gives a talk about partitions, they always bring up an example of archiving: there is a partitioned table, and you only keep “current” partitions, whatever “current” means in that context, and after two weeks or a month, or whatever interval works for you, you detach the oldest partition from the “current” table and attach it to the “archived” table, so that the data is still available when you need it, but it does not slow down your “current” queries.

So here is Hettie confidently suggesting that a customer implement this technique to avoid querying a terabyte-plus-size table. The customer happily agrees, and life is great until one day an archiving job reports a “name already exists” error for an index name.

Read on to learn why.

MDX vs DAX for Bulk Data Extraction from Power BI

Chris Webb performs a test but gives us a warning first:

This is a post I’ve avoided writing for many years, and before I carry on let me make one thing clear:

Doing bulk extracts of data from a Power BI semantic model is a **really** bad idea

My colleague Matthew Roche wrote a great post on this topic a couple of years ago that is still relevant: using Power BI (or Analysis Services) as a data source for other systems, including other Power BI Import mode semantic models, is an anti-pattern. Power BI is optimised for small, analytical queries that return the amount of data that can be visualised on a single page. It is not optimised for queries that return millions of rows.

After the warning, Chris still gives a performance breakdown for extracting data from a semantic model in Excel, using automated MDX and DAX.

SQL Server and File Access

Vlad Drumea troubleshoots an issue:

This is one of those things that’s obvious for anyone with a bit of SQL Server and Windows experience, but stumps a lot of newcomers.
Especially when it comes to students going through tasks like creating and restoring databases.

It’s also one of those things that I have to explain on a regular basis, so might as well have detailed explanation with examples I can direct people to in the future.

The latest example being this reddit post in r/SQL, where the wrongest answer has the highest upvotes.

Read on for a story about Operating system error 5 (Access is denied). And if I hear of anybody switching the virtual service account running SQL Server to Local System, I will personally tell them it’s an awful idea and to stop it.

Microsoft Fabric for the SMB

Eugene Meidinger looks out for the smaller fish in the pond:

If you are a small (or even medium) business, you may be wondering “What is Fabric and do we even need it?” If you are primarily on Power BI Pro licenses today, you may not find a compelling reason to switch to Fabric today, but the value add should improve over time as new features are added on the Fabric side and some features get deprecated on the Power BI side.

Read on for plenty of advice, metaphors, and even a few warnings.
