
Day: February 5, 2025

Preventing Skew in Teradata

Sudheer Kumar Lagisetty shares some performance tuning advice:

Teradata performance optimization and database tuning are crucial for modern enterprise data warehouses. Effective data distribution strategies and data placement mechanisms are key to maintaining fast query responses and system performance, especially when handling petabyte-scale data and real-time analytics. 

Understanding data distribution mechanisms, workload management, and data warehouse management directly affects query optimization, system throughput, and database performance optimization. These database management techniques enable organizations to enhance their data processing capabilities and maintain competitive advantages in enterprise data analytics.

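Click through for some tips on data distribution. This matters a great deal in a massively parallel processing (MPP) architecture, where unevenly distributed rows leave some units overloaded while others sit idle.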
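To make the idea of skew concrete, here is a minimal sketch in Python, assuming rows are assigned to parallel units by hashing a distribution key. The unit count, the hash function, and the sample keys are all illustrative assumptions, not Teradata's actual hashing algorithm.

```python
import hashlib

NUM_UNITS = 4  # hypothetical number of parallel units


def distribute(keys, num_units=NUM_UNITS):
    """Count rows per unit when each row is placed by hashing its key."""
    counts = {unit: 0 for unit in range(num_units)}
    for key in keys:
        digest = hashlib.sha256(str(key).encode()).digest()
        unit = int.from_bytes(digest[:8], "big") % num_units
        counts[unit] += 1
    return counts


def skew_ratio(counts):
    """Busiest unit's row count divided by the average; 1.0 is perfectly even."""
    average = sum(counts.values()) / len(counts)
    return max(counts.values()) / average


# A near-unique key (an order ID) versus a low-cardinality key
# (a country code where most rows share a single value).
unique_keys = range(10_000)
skewed_keys = ["US"] * 9_000 + ["CA"] * 500 + ["MX"] * 500

even = distribute(unique_keys)
uneven = distribute(skewed_keys)

print("unique key skew ratio:", round(skew_ratio(even), 2))
print("low-cardinality key skew ratio:", round(skew_ratio(uneven), 2))
```

With the low-cardinality key, at least 9,000 of the 10,000 rows land on one unit, so that unit does most of the work for any full scan while the others wait.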


Top 6 Things Microsoft Ever Did to SQL Server

Brent Ozar has a list:

This entire blog post is driven by the #1 feature in this list. I think about the #1 feature a lot, like at least once a week. I think about it so much that I had to stop and think about what other similar great things Microsoft has done over the years, and be thankful for what a nice platform this is to work with. Let’s go through 6 of my favorite Microsoft decisions.

I have to warn you: some of my takes are weird.

English Query was definitely an idea before its time. It was a great idea that I’m sure demoed well (though that was before I got into SQL Server, so can’t tell you from personal experience), but it was dog slow.

Read on for the rest of the list. Admittedly, sometimes I wish Microsoft had followed through on its deprecation notice around statements not ending in a semicolon, just to watch the world burn.


Retrieving Power BI Licenses in a Tenant

Gilbert Quevauvilliers wants to figure out who has licenses:

In this blog post I am going to show you how to get all the Power BI licenses in your tenant.

This can be very useful to understand how many licenses you have, what type of licenses are being paid for, and potentially how you can save by removing licenses due to inactive use or if the licenses are no longer required.

I’m going to be pulling on my previous blog post where I explained how to get the Entra ID users and groups using a Service Principal for access.

Click through for the demonstration.


Brownfield Data Modeling

Jared Westover discusses a common trade-off:

Some decisions in life are easy, like whether to drink that second cup of coffee. But when it comes to databases, things get complicated fast. Developers often seek my input on adding tables and columns. A common question arises: Should they create a new table or expand an existing one by adding columns? This decision can be tricky because it depends on several factors, including query performance, future growth, and the complexity of implementing either solution. While adding one or two columns to an existing table may seem the easiest option, is it the best long-term solution? In this article, we look at whether it is better to add new columns versus a new table in SQL Server.

As an architectural pro-tip, when you’re looking to add a new column to an existing table, ask yourself if the new attribute you want to add actually relates to the natural key of the existing table. In Jared’s example, the natural key for the video game tracker table is presumably video game ID (which itself presumably ties back to the video game title, developer, console, and release date) and start date. Does a book actually relate to a video game and start date? No, it does not. Therefore, this book attribute does not belong on the video game tracker table.

When you dig deeper into Boyce-Codd Normal Form, you figure out that “relates to” in the prior paragraph translates to “has a functional dependency upon,” but using non-technical language for people not familiar with normalization, you can still get to the same conclusion, because ultimately, 95% of database normalization is common sense that we strenuously apply to a business domain.
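As a rough illustration of the "has a functional dependency upon" test, the sketch below checks sample rows to see whether a candidate key determines an attribute. The column names and data are hypothetical and are not taken from Jared's article. Note that sample data can only refute a dependency; whether one truly holds is a question about the business domain.

```python
def functionally_determines(rows, key_cols, attr):
    """Return True if each distinct key value maps to exactly one attr value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        if key in seen and seen[key] != row[attr]:
            return False  # same key, two different values: no dependency
        seen[key] = row[attr]
    return True


# Hypothetical observations: one play-through of a game, during which
# the same person read two different books.
rows = [
    {"game_id": 1, "start_date": "2025-01-03", "platform": "PC", "book_title": "Dune"},
    {"game_id": 1, "start_date": "2025-01-03", "platform": "PC", "book_title": "Emma"},
    {"game_id": 2, "start_date": "2025-01-20", "platform": "Switch", "book_title": "Dune"},
]

natural_key = ["game_id", "start_date"]
platform_fits = functionally_determines(rows, natural_key, "platform")
book_fits = functionally_determines(rows, natural_key, "book_title")

print("platform depends on the key:", platform_fits)
print("book_title depends on the key:", book_fits)
```

Here the platform is consistent for a given game and start date, while the book title is not, which is the data-level symptom of an attribute that belongs on a different table.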

And most of the time, the developer knows that this feels weird, but doesn’t want to spend the extra time doing it the best way and instead tries to do it the expedient way. This is where the role of the architect as politician comes in, and we gently guide people to the right conclusion. Or just tell them to put on their big boy britches and do it right. Either way.


The Logic behind RIGHT OUTER JOIN

Constantine Kokkinos provides an explanation:

I was talking to a friend of mine and they are learning some SQL and they said something that I have seen come up multiple times in learning SQL.

They said “Yeah, I need to study the join types more. They make sense to me but I want to be able to not reference my notes” and also “I don’t really get the point of a right join if you can do the same thing with a left join by just switching the table name.”

These are great points, and common questions that occur when first learning SQL.

I won’t steal CK’s thunder (too much) about how we express joins in set theory, though I think when he mentions “OUTER” as a type of join, perhaps that’s supposed to be FULL OUTER JOIN?

Regardless, my take: there is a good reason to use INNER JOIN. There is a good reason to use LEFT OUTER JOIN. There is a good reason to use CROSS JOIN. There is a good reason to use FULL OUTER JOIN. The frequency with which you should use each is in descending order, meaning that there are relatively few circumstances in which you should use a FULL OUTER JOIN, but they do exist.

There are no good circumstances for a RIGHT OUTER JOIN. The concept logically exists, but has no practical value to us.
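The claim that a right join is just a left join with the tables switched can be checked with a small sketch. This is a plain-Python model of the two join types over lists of dictionaries, not any database engine's implementation, and the table and column names are made up for illustration.

```python
def left_join(left, right, key):
    """Every left row, paired with each matching right row, or None if unmatched."""
    out = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            out.extend((l_row, r_row) for r_row in matches)
        else:
            out.append((l_row, None))
    return out


def right_join(left, right, key):
    """Every right row, paired with each matching left row, or None if unmatched."""
    out = []
    for r_row in right:
        matches = [l_row for l_row in left if l_row[key] == r_row[key]]
        if matches:
            out.extend((l_row, r_row) for l_row in matches)
        else:
            out.append((None, r_row))
    return out


customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 2, "item": "book"}, {"id": 3, "item": "lamp"}]

via_right = right_join(customers, orders, "id")

# Swap the table order, do a left join, then put the columns back in place.
via_swapped_left = [(c, o) for (o, c) in left_join(orders, customers, "id")]

print(via_right == via_swapped_left)
```

Both approaches return Bob's order paired with Bob, plus the unmatched lamp order paired with None, so the right join adds nothing that reordering the tables does not already provide.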
