
Day: October 9, 2024

Exploring Semantic Model Relationships with Sempy

Prathy Kamasani builds a graph:

Understanding the relationships between datasets is crucial in data analytics, especially in the world of self-service BI. Sempy, a Python library unique to Microsoft Fabric, allows users to visualise these relationships seamlessly. This post explores using Sempy to visualise semantic model relationships and view them in a Power BI report. Viewing them in a notebook is easy and has been documented on MS Docs.

Click through for a notebook and explanation of the underlying code.
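To give a rough idea of what "relationships as a graph" means, here is a minimal Python sketch. It is not Prathy's code and it does not call Sempy: the rows below are hypothetical stand-ins for whatever relationship metadata a semantic model exposes, and the key names (from_table, to_table, and so on) are my assumptions.

```python
from collections import defaultdict

# Hypothetical relationship rows. In a Fabric notebook these would come from
# the semantic model; the key names here are assumptions for illustration.
relationships = [
    {"from_table": "Sales", "from_column": "CustomerKey",
     "to_table": "Customer", "to_column": "CustomerKey", "multiplicity": "m:1"},
    {"from_table": "Sales", "from_column": "DateKey",
     "to_table": "Date", "to_column": "DateKey", "multiplicity": "m:1"},
    {"from_table": "Customer", "from_column": "GeographyKey",
     "to_table": "Geography", "to_column": "GeographyKey", "multiplicity": "m:1"},
]


def build_graph(rows):
    """Group relationship rows into an adjacency list keyed by source table."""
    graph = defaultdict(list)
    for row in rows:
        label = f'{row["from_column"]} -> {row["to_column"]}'
        graph[row["from_table"]].append((row["to_table"], label, row["multiplicity"]))
    return dict(graph)


def describe(graph):
    """Render each edge as one text line, sorted by source table."""
    lines = []
    for source in sorted(graph):
        for target, label, multiplicity in graph[source]:
            lines.append(f"{source} --[{label}, {multiplicity}]--> {target}")
    return lines


graph = build_graph(relationships)
edge_lines = describe(graph)
for line in edge_lines:
    print(line)
```

An edge list like this is also a convenient shape to write out to a table if you want to chart the relationships in a report rather than in the notebook.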

Comments closed

Why Take Backups in SQL Server?

I’ve started a new video series:

In this video, I explain why it is so important to back up your databases. From there, we see one technique for taking these backups in SQL Server: using SQL Server Management Studio. We review some of the backup metadata you can collect and also see how to compress and encrypt backups.

The first few videos in the series have all gone a bit longer than I expected, and that’s even with me pointedly ignoring most everything to do with tape drives.


Data Masking in Snowflake

Kevin Wilkie puts on the mask:

The premise behind data masking is fairly simple – make it so that only certain people can see the data in a certain column. For example, social security numbers or credit card numbers probably shouldn’t be seen by everyone who has access to your database.

I like to set it up as follows: (Obviously, this can change depending on your business’ policies.)

  1. Admins have full access.
  2. Call Center Representatives have access to the last 4 digits of the social.
  3. All others see nothing useful.

Read on to see how. I’ll even save my normal “data masking isn’t really a security feature” rant. This time.
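The three-tier rule in the excerpt is easy to sketch outside of Snowflake. This Python version is only an illustration of the logic, not Kevin's policy: the role names are hypothetical, and in Snowflake itself the same branching would live in a masking policy's CASE expression rather than in application code.

```python
FULLY_MASKED = "***-**-****"


def mask_ssn(ssn: str, role: str) -> str:
    """Return the view of a social security number that a given role should get.

    Role names are assumptions for illustration:
      ADMIN        sees the full value
      CALL_CENTER  sees only the last four digits
      anyone else  sees nothing useful
    """
    if role == "ADMIN":
        return ssn
    if role == "CALL_CENTER":
        digits = [ch for ch in ssn if ch.isdigit()]
        return "***-**-" + "".join(digits[-4:])
    return FULLY_MASKED


for role in ("ADMIN", "CALL_CENTER", "ANALYST"):
    print(role, mask_ssn("123-45-6789", role))
```

Defaulting to the fully masked value means a new or misspelled role fails closed instead of leaking data.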


Data Types and Arithmetical Calculations in DAX

Marco Russo and Alberto Ferrari do some counting:

The VertiPaq engine is basically data type-independent. This means that it does not matter whether a column is a string, a floating point, or a date: because of the dictionary encoding happening inside VertiPaq, all these data types use around the same amount of memory and perform at nearly the same speed.

However, when mixing different data types in the same expression, DAX will likely need to perform conversions between data types. Some of these conversions are nearly free, whereas others require the intervention of the formula engine, with a related performance impact.

We have already written about possible errors occurring during data type conversion here: Understanding numeric data type conversions in DAX and here: Rounding errors with different data types in DAX. The issue with conversion errors is mostly due to the fact that the precisions of fixed decimals (also known as Currency) and decimals (also known as floating point) are different. This article starts with a focus on performance.

Read on to see what Marco and Alberto have for us this time.
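The precision difference the excerpt mentions is not specific to DAX, so a quick Python sketch can show the flavor of it. This is an analogy under stated assumptions rather than anything from the article: a Python float stands in for the floating-point Decimal type, and an integer scaled by 10,000 stands in for Fixed Decimal (Currency), which keeps four decimal places.

```python
from decimal import Decimal

SCALE = 10_000  # four decimal places, mimicking a fixed decimal type


def to_fixed(value: str) -> int:
    """Convert a decimal string to an integer count of ten-thousandths."""
    return int(Decimal(value) * SCALE)


# Adding 0.1 ten times as binary floating point accumulates representation error.
float_total = sum([0.1] * 10)

# The same sum over scaled integers is exact.
fixed_total = sum(to_fixed("0.1") for _ in range(10))

print("float total equals 1.0:", float_total == 1.0)
print("fixed total equals 1.0:", fixed_total == SCALE)
```

The trade-off runs the other way as well: the scaled integer cannot hold anything finer than four decimal places, which is why mixing the two types in one expression forces a conversion.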


The Most Important Tool for a Data Detective

Andy Yun wants you to use your earholes:

The All Powerful…

… Question. That is what I now believe is the most important tool for a Data Detective.

Asking Questions Effectively

This nuance involves HOW you ask a question. Some of this involves knowing your audience. Is this the right place or the right time? Sometimes there comes a point where asking questions is just counter-productive because your audience has no interest in answering. And it also means you need to make sure you’re asking the correct audience in the first place.

Asking questions is difficult, so instead, I just strawman my way to success.


Building a Data Detective Toolkit

Deb Melkin talks tools:

Happy T-SQL Tuesday! I wasn’t really sure I’d be able to crank something out for this one but somehow I managed to squeeze it in. Tim Mitchell is hosting and he has a great topic for us: What’s in our Data Detective toolkit?

I love this topic for so many reasons. Partly because I feel like I’m asked to look at so many projects where I’m dropped in and asked to figure things out, usually performance related but occasionally new functionality or features. But as I’m asked to do this fairly often, I may have to see if Data Detective can be my new title… hmm…

Being a Data Detective in a film noir. On the one hand, that sounds like a really neat idea. On the other hand, things usually don’t turn out so well for the detective.


The Power of Computed Columns

Andy Brownsword speeds up a query:

Bad code smells can run through a system, and one instance can quickly spread when code is recycled. Simon recently looked at a non-sargable example and was able to fix it by using an index and refactoring the query.

I wanted to consider an alternative approach if we saw the same issue repeated consistently. We don’t want to fix each instance; we want a single fix. We’ll solve this with an indexed computed column.

We can index computed columns to help resolve deterministic (i.e. won’t change, no GETDATE() allowed) clauses. Let’s get started.

Read on to learn more. This is a powerful approach to the problem of needing to perform some sort of data transformation before filtering your data.
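Here is a small way to watch the idea work without a SQL Server instance handy. This sketch swaps in SQLite (through Python's sqlite3 module) and an index on an expression, which is a close cousin of an indexed computed column rather than the same feature. The table, column, and index names are hypothetical, and SQL Server's syntax and plans will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_code TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO orders (order_code) VALUES (?)",
    [(f"{prefix}-{n:05d}",) for prefix in ("ABC", "XYZ", "QRS") for n in range(200)],
)

# The filter wraps the column in a function, so a plain index on order_code
# would not help: the engine has to evaluate substr() for every row.
QUERY = "SELECT id FROM orders WHERE substr(order_code, 1, 3) = 'ABC'"


def plan(sql: str) -> str:
    """Return the query plan's detail text (fourth column of each plan row)."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(QUERY)

# Index the deterministic expression itself, once, instead of rewriting
# every query that repeats the pattern.
conn.execute("CREATE INDEX ix_orders_prefix ON orders (substr(order_code, 1, 3))")

plan_after = plan(QUERY)
matches = conn.execute(QUERY).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
print("rows:  ", len(matches))
```

The query text never changes between the two plans, which is the appeal: existing code that repeats the same expression picks up the index without being touched.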
