Kevin Feasel – Page 168

COALESCE() in T-SQL

Published 2025-02-06 by Kevin Feasel

Rajendra Gupta has a backup plan in case of NULL:

NULL is a special marker that indicates a missing or undefined value in a column. It is different from zero or an empty string. Handling NULL values is essential for accurate data analysis, data integrity, and error avoidance. This tip explores how to handle NULL values in SQL Server with the COALESCE() function by running various queries and reviewing the results.

Click through for a primer on the COALESCE() function, a few use cases for COALESCE(), and how it differs from ISNULL().
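
As a rough illustration of the core idea, here is a minimal sketch of COALESCE() returning the first non-NULL argument. It runs against an in-memory SQLite database from Python as a stand-in for SQL Server, so it shows only the behavior the two engines share; the T-SQL-specific comparison with ISNULL() is not reproduced. The contacts table and its columns are hypothetical.

```python
import sqlite3


def first_non_null_demo():
    """Return (name, phone) pairs, falling back through columns with COALESCE."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: either phone column may be NULL.
    conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, office TEXT)")
    conn.executemany(
        "INSERT INTO contacts VALUES (?, ?, ?)",
        [
            ("Ana", "555-0100", "555-0199"),
            ("Ben", None, "555-0142"),
            ("Cy", None, None),
        ],
    )
    # COALESCE checks its arguments left to right and returns the first
    # one that is not NULL; the literal acts as the final fallback.
    rows = conn.execute(
        "SELECT name, COALESCE(mobile, office, 'no phone') "
        "FROM contacts ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, phone in first_non_null_demo():
        print(name, phone)
```

Ana keeps her mobile number, Ben falls back to the office number, and Cy gets the literal default.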

Controlling Execution Flow in Fabric Data Pipelines

Published 2025-02-06 by Kevin Feasel

Reza Rad has everything under control:

In Microsoft Fabric, the Data Factory is the workload for ETL and data integration, and the Data Pipeline is a component in that workload for orchestrating the execution flow. There are activities in the pipeline, and you can define in which order you want the activities to run. In this article and video, you will learn about the execution order and output states in Data Pipeline and how they can be used in real-world scenarios of data integration.

The mechanisms here are fundamentally similar to what we’ve had in Azure Data Factory (obviously) and SQL Server Integration Services.

Styling and Deploying an Observable Framework App

Published 2025-02-05 by Kevin Feasel

Tim Brock completes a series:

This post, Part 2 in a series of two, looks at styling and deploying the Observable Framework app we built in Part 1. Codeblocks with burgundy backgrounds refer to specific tagged commits in the accompanying GitHub repository.

Part 1 involved converting an existing R-Shiny app to Observable Framework. Read on for this part.

Preventing Skew in Teradata

Published 2025-02-05 by Kevin Feasel

Sudheer Kumar Lagisetty shares some performance tuning advice:

Teradata performance optimization and database tuning are crucial for modern enterprise data warehouses. Effective data distribution strategies and data placement mechanisms are key to maintaining fast query responses and system performance, especially when handling petabyte-scale data and real-time analytics.

Understanding data distribution mechanisms, workload management, and data warehouse management directly affects query optimization, system throughput, and database performance optimization. These database management techniques enable organizations to enhance their data processing capabilities and maintain competitive advantages in enterprise data analytics.

Click through for some tips around data distribution. This idea becomes important in an MPP architecture.
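
To make the skew idea concrete, here is a small sketch of hash-based row distribution. It is not Teradata's actual row-hash or hash-map logic: MD5 stands in for the hashing algorithm, the AMP count of 8 and the sample keys are hypothetical, and the skew figure uses one commonly quoted definition, 100 * (1 - average rows per AMP / rows on the busiest AMP).

```python
import hashlib


def rows_per_amp(keys, amp_count):
    """Hash each distribution-key value to an AMP and count rows per AMP."""
    counts = [0] * amp_count
    for key in keys:
        # MD5 is only a deterministic stand-in for the real row hash.
        digest = hashlib.md5(str(key).encode("utf-8")).digest()
        amp = int.from_bytes(digest[:4], "big") % amp_count
        counts[amp] += 1
    return counts


def skew_percent(counts):
    """Skew as 100 * (1 - average rows per AMP / rows on the busiest AMP)."""
    busiest = max(counts)
    if busiest == 0:
        return 0.0
    average = sum(counts) / len(counts)
    return 100.0 * (1.0 - average / busiest)


if __name__ == "__main__":
    amp_count = 8  # hypothetical system size

    # A unique key spreads rows close to evenly across AMPs.
    order_ids = range(10_000)
    print("unique key:", round(skew_percent(rows_per_amp(order_ids, amp_count)), 1))

    # A two-valued key can land on at most two AMPs, leaving the rest idle.
    statuses = ["OPEN" if i % 2 == 0 else "CLOSED" for i in range(10_000)]
    print("status key:", round(skew_percent(rows_per_amp(statuses, amp_count)), 1))
```

The low-cardinality key is the case to avoid: most AMPs hold no rows, so the busiest one dictates how long a query takes.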

A Primer on Key Performance Indicators

Published 2025-02-05 by Kevin Feasel

I’ve started a new video series:

In this video, I provide a primer on the concept of key performance indicators. Along the way, we’ll talk about common warehousing terminology such as facts, measures, grain, and additivity.

This is all based on a talk I don’t get to give very often, so instead I’m turning it into a five-part video series.

Retrieving Power BI Licenses in a Tenant

Published 2025-02-05 by Kevin Feasel

Gilbert Quevauvilliers wants to figure out who has licenses:

In this blog post I am going to show you how to get all the Power BI licenses in your tenant.

This can be very useful to understand how many licenses you have, what type of licenses are being paid for, and potentially how you can save by removing licenses due to inactive use or if the licenses are no longer required.

I’m going to be pulling on my previous blog post where I explained how to get the Entra ID users and groups using a Service Principal for access.

Click through for the demonstration.

Top 6 Things Microsoft Ever Did to SQL Server

Published 2025-02-05 by Kevin Feasel

Brent Ozar has a list:

This entire blog post is driven by the #1 feature in this list. I think about the #1 feature a lot, like at least once a week. I think about it so much that I had to stop and think about what other similar great things Microsoft has done over the years, and be thankful for what a nice platform this is to work with. Let’s go through 6 of my favorite Microsoft decisions.

I have to warn you: some of my takes are weird.

English Query was definitely an idea before its time. It was a great idea that I’m sure demoed well (though that was before I got into SQL Server, so can’t tell you from personal experience), but it was dog slow.

Read on for the rest of the list. Admittedly, sometimes I wish Microsoft had followed through on its deprecation notice around statements not ending in a semi-colon, just to watch the world burn.

Brownfield Data Modeling

Published 2025-02-05 by Kevin Feasel

Jared Westover discusses a common trade-off:

Some decisions in life are easy, like whether to drink that second cup of coffee. But when it comes to databases, things get complicated fast. Developers often seek my input on adding tables and columns. A common question arises: Should they create a new table or expand an existing one by adding columns? This decision can be tricky because it depends on several factors, including query performance, future growth, and the complexity of implementing either solution. While adding one or two columns to an existing table may seem the easiest option, is it the best long-term solution? In this article, we look at whether it is better to add new columns versus a new table in SQL Server.

As an architectural pro-tip, when you’re looking to add a new column to an existing table, ask yourself if the new attribute you want to add actually relates to the natural key of the existing table. In Jared’s example, the natural key for video game tracker is presumably video game ID (which itself ties back to, presumably, the video game title, developer, console, and release date) and start date. Does a book actually relate to a video game and start date? No, it does not. Therefore, this book attribute does not belong on the video game tracker table.

When you dig deeper into Boyce-Codd Normal Form, you figure out that “relates to” in the prior paragraph translates to “has a functional dependency upon,” but using non-technical language for people not familiar with normalization, you can still get to the same conclusion, because ultimately, 95% of database normalization is common sense that we strenuously apply to a business domain.

And most of the time, the developer knows that this feels weird, but doesn’t want to spend the extra time doing it the best way and instead tries to do it the expedient way. This is where the role of the architect as politician comes in, and we gently guide people to the right conclusion. Or just tell them to put on their big boy britches and do it right. Either way.
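
The “relates to the natural key” test above can be sketched as a quick check over sample data: does each key value map to exactly one value of the proposed attribute? The rows, column names, and values below are hypothetical, loosely modeled on the video game tracker example. A check like this can only refute a functional dependency; the business rules, not a sample, are what establish one.

```python
def functionally_determines(rows, key_columns, attribute):
    """Return True if every key combination maps to a single attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        value = row[attribute]
        # setdefault stores the first value seen for this key; any later
        # row with a different value breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical observations: what was true while playing a given game
# that was started on a given date.
observations = [
    {"game_id": 1, "start_date": "2025-01-03", "console": "PC", "book": "Dune"},
    {"game_id": 1, "start_date": "2025-01-03", "console": "PC", "book": "Emma"},
    {"game_id": 2, "start_date": "2025-01-20", "console": "Switch", "book": "Emma"},
]

natural_key = ["game_id", "start_date"]

if __name__ == "__main__":
    # Console is fixed by the key, so it plausibly belongs on the table.
    print(functionally_determines(observations, natural_key, "console"))
    # Two different books share one key, so book does not belong there.
    print(functionally_determines(observations, natural_key, "book"))
```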

The Logic behind RIGHT OUTER JOIN

Published 2025-02-05 by Kevin Feasel

Constantine Kokkinos provides an explanation:

I was talking to a friend of mine and they are learning some SQL and they said something that I have seen come up multiple times in learning SQL.

They said “Yeah, I need to study the join types more. They make sense to me but I want to be able to not reference my notes” and also “I don’t really get the point of a right join if you can do the same thing with a left join by just switching the table name.”

These are great points, and common questions that occur when first learning SQL.

I won’t steal CK’s thunder (too much) about how we express joins in set theory, though I think when he mentions “OUTER” as a type of join, perhaps that’s supposed to be FULL OUTER JOIN?

Regardless, my take: there is a good reason to use INNER JOIN. There is a good reason to use LEFT OUTER JOIN. There is a good reason to use CROSS JOIN. There is a good reason to use FULL OUTER JOIN. The frequency with which you should use each is in descending order, meaning that there are relatively few circumstances in which you should use a FULL OUTER JOIN, but they do exist.

There are no good circumstances for a RIGHT OUTER JOIN. The concept logically exists, but has no practical value to us.
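
The friend’s observation, that a right join is a left join with the tables switched, can be sketched in a few lines. This is a toy nested-loop join over lists of dictionaries, not how a database engine executes joins, and the customers and orders data are hypothetical.

```python
def left_outer_join(left, right, key):
    """Keep every row of `left`, paired with each matching `right` row or None."""
    result = []
    for left_row in left:
        matches = [r for r in right if r[key] == left_row[key]]
        if matches:
            result.extend((left_row, r) for r in matches)
        else:
            result.append((left_row, None))
    return result


def right_outer_join(left, right, key):
    """A right outer join is a left outer join with the inputs swapped."""
    swapped = left_outer_join(right, left, key)
    # Flip each pair back so output columns stay in (left, right) order.
    return [(left_row, right_row) for (right_row, left_row) in swapped]


customers = [{"id": 1, "name": "Ana"}, {"id": 2, "name": "Ben"}]
orders = [{"id": 2, "total": 40}, {"id": 3, "total": 15}]

if __name__ == "__main__":
    # Every order survives; the order with no customer pairs with None.
    for pair in right_outer_join(customers, orders, "id"):
        print(pair)
```

Since the second function needs nothing beyond the first, the same query can always be written with the preserved table on the left.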

Alternatives to Error Bars

Published 2025-02-04 by Kevin Feasel

Alex Velez admits to error:

During a client workshop, someone asked me if I was a fan of error bars and whether they should use them in their presentations. As I readied my standard “it depends” response, I realized that for once, it didn’t depend. I couldn’t think of a single time when error bars would be the ideal solution for communicating data. (For clarity, if they had asked whether they should articulate the margin of error around their data, my answer would have certainly been it depends. I just wouldn’t use error bars to do so.)

Before I discuss why I’m not a fan of error bars and an alternative solution, let’s explore what they are.

Click through for Alex’s thoughts, including a pair of interesting alternative displays.

Author: Kevin Feasel