Press "Enter" to skip to content

Month: April 2024

Implementing a Star Schema for a Power BI Semantic Model

Nikola Ilic reminds us to keep Ralph Kimball’s Data Warehouse Toolkit book at hand:

But, what is a star schema in the first place? I have good and bad news for you:)…The bad news is: I’m not covering it in this article, because this one focuses on explaining how to implement a star schema in Power BI (assuming that you already know what star schema is). The good news is: I’ve already written about it, so go and read this article first, if you’re not sure what star schema represents in the world of data modeling…

Now, let’s get our hands dirty and build a star schema!

Read on for the demo.

Comments closed

Searching for a String Position

Chad Callihan goes index hunting:

SQL Server has the CHARINDEX() and PATINDEX() functions that can both be used to find the position of a value in a string. They each have some differences that make them unique and more fitting for specific situations.

Let’s look at a few examples of how each can be used.

Read on to see how to use CHARINDEX() and PATINDEX() on two statements that are total falsehoods: Kevin Conroy always will be the number 1 and best Batman.

Comments closed

Building a Pareto Chart in Power BI

Riqo Chaar get more than 80% of the way there:

The Pareto principle, commonly referred to as the 80/20 rule, is a concept of prioritisation.

It states that for many outcomes, 80% of the outputs are derived from 20% of the inputs. Although this isn’t a universal truth, this pattern has been observed in many different cases. For example, a large proportion (80%) of the revenue a particular business generates may primarily be associated with only a small proportion (20%) of big-selling products. This concept is related to the law of diminishing returns and poses the following question: If, after reaching a certain level of output (80%), significantly more effort is required to achieve further increases in this output, is this additional effort worth it?

In this article, we demonstrate the process (using DAX expressions) of creating a Pareto chart in Power BI.

Read on to see how you can build a chart like this. It’s a very popular style of chart for manufacturing scenarios.

Comments closed

Branch Cleanup with git prune

Steve Jones breaks out the branch cutters:

As I’ve been working with SQL Saturday and managing changes to events, I’ve accumulated a lot of branches. Even though I’m a solo developer, I decided to use branches, as I expect others to share this load in the future. This post looks at how to start cleaning those up.

As a team gets larger, the necessity of regular branch maintenance increases, but so does the complexity of it: it’s really easy to have one person not pay attention to the e-mails and get burned when old branch deletion en masse does happen.

Comments closed

Database Subetting and Data Generation

Phil Factor tells us about two possibilities for loading a lower environment:

When dealing with the development, testing and releasing of new versions of an existing production database, developers like to use their existing production data. In doing so, the development team will be hit with the difficulties of managing and accommodating the large amount of storage used by a typical production database. It’s not a new problem because the practical storage capacity has grown over the years in line with our ingenuity in finding ways of using it.

To deal with using production data for testing, we generally want to reduce its size by extracting a subset of the entities from a ‘production’ database, anonymized and with referential integrity intact. We then deliver this subset to the various development environments.

Phil gets into some detail on the process behind subsetting and then covers data generation as an alternative.

Comments closed

Using Powershell to Set the SQL Server Port Static

Vlad Drumea doesn’t want a dynamic port number:

This post demos a script that I’ve put together to automate the configuration of the static TCP port for a SQL Server instance using PowerShell.

The script is derived from another PowerShell script that I’ve written to help spin up SQL Server test instances in my home lab.

This should be helpful you’re working in a restrictive environment where you can’t install additional PowerShell modules, and you couldn’t take advantage of dbatools’ Set-DbaTcpPort.

Click through for the script, but also use dbatools whenever you can because it’s a good product and I haven’t done any unpaid shilling for them in far too long.

Comments closed

The Equivalency of Views and Common Table Expressions

Erik Darling makes a comparison:

Perhaps one of the most exhausting parts of my job is disabusing developers of the notion that common table expressions hold some weight in gold over any other abstraction layer in SQL Server.

Think of it like this:

  • Views are like a permanent home
  • Common table expressions are like a mobile home

You can put equally terrible queries in either one and expect equally terrible results.

Read the whole thing for additional spicy analogies and similes.

I would say that I certainly do not disdain views, so much as I see them as a yellow flag. It’s really easy to go from helpful views to views nested in views nested in views like malevolent Matryoshka dolls. It’s harder to do that with common table expressions before that reptilian part of your brain kicks in and says that “hey, maybe this isn’t the greatest idea I’ve ever had.” Not that this will stop some people, admittedly…

Comments closed

Processes in PostgreSQL

Semab Tariq continues a series on internals in PostgreSQL:

PostgreSQL is a client/server type relational database management system. It has a multi-process architecture that runs on a single host. A collection of multiple processes that manage a database cluster is usually referred to as a PostgreSQL server. In PostgreSQL, every operation is treated as a process, and each action we undertake within PostgreSQL follows an append-only approach. This means that every time we execute an action such as an insert, update, or delete, a new tuple is created rather than modifying the existing data directly.

Consequently, PostgreSQL does not execute updates or deletes in place. Instead, it appends new data or marks existing data as obsolete. This append-only methodology ensures data integrity and allows for efficient management of database changes over time.

Read on to learn more about how these processes work.

Comments closed

Azure Regions and Pricing

Koen Verbeeck has a public service announcement:

Today I was having a nice discussion with some colleagues about Fabric and pricing/licensing came up. I mentioned an F2 is only around €250 a month, but a colleague said “no no, it’s over €300”.

There can be significant differences in prices for services based on region, not just for Microsoft Fabric, but also for a variety of services. This will depend on how new the hardware is, how much demand there is in the region, and a few other factors. Cloud Price does a good job of keeping track of VM pricing by region, and even tells you the cheapest region for each class of VM. For other services, you may have to trawl through Azure APIs and pricing pages to get the best deal.

Comments closed