Press "Enter" to skip to content

Curated SQL Posts

The State of CI/CD in Power BI

Paul Turley violates Betteridge’s Law of Headlines:

The answer – finally – is a resounding YES! It has been a long road and the path to implementing Continuous Integration & Delivery (CI/CD) in all of its forms for Power BI is still a journey, but the capabilities are now a reality. This broad category includes a range of features and capabilities related to managing project files with version control and the ability to share and collaborate with other development team members. Additionally, we also have the ability to assign individual components of a solution to different team members who then check-out files and develop features, check them back in and merge changes into a feature branch. In enterprise-class projects, having these capabilities enables true Development Operations or DevOps: a process for managing iterative development, testing and deployment with automated builds, using a central process management tool. Implementing full-fledged DevOps is a big lift for most Business Intelligence professionals but the core concept of code version control is essential for any durable project.

Read on for Paul’s thoughts on the importance of source control and how you can pull this off.

Comments closed

Extracting the Month from a Date with R

Steven Sanderson asks what month it is:

Greetings fellow R enthusiasts! Today, we’re diving into a fundamental task: extracting the month from a date in R. Whether you’re new to R or a seasoned pro, understanding how to manipulate dates is essential. We’ll explore two popular methods: using base R and the powerful lubridate package. So, let’s roll up our sleeves and get started!

Read on for several examples across two solution spaces.

Comments closed

Index Counts and Write Speeds

Erik Darling tells the truth:

Before I go on about how adding an index can fix your blocking and deadlocking problems, let me warn you about how indexes can cause you problems.

Without fail, whenever I suggest adding indexes to clients, someone gets real worried about write speeds.

Like, the server can be burning down, spurting blood everywhere, and cursing our mothers in ancient Aramaic, but no, let’s stop and worry about write speeds going up by a few milliseconds.

Fully agreed: the marginal cost of an index is rarely noticeable and the marginal benefit of a well-designed index serving an existing customer base is significantly higher than its marginal cost. However, when I click that plus sign next to the Indexes folder and SSMS starts spinning because it needs to enumerate the dozens and dozens of indexes on that table, it’s a bad sign. Especially when 90% of them have “_dta_” in the name.

Click through for more thoughts on indexing, scans, and transactions.

Comments closed

Fixing the Power BI Error “Something Is Wrong with One or More Fields”

Kurt Buhler broke it and now needs to buy it:

The “grey box of death” in Power BI is unfortunately a common sight for anyone creating reports. When you see this grey error box, it means that the visual can’t obtain a valid result. If that error says “Something is wrong with one or more fields”, then there’s several possible reasons for this:

Click through for those reasons, as well as how to fix the problem. Kurt shows this both in video form and written out, so check it out.

Comments closed

New Features in Azure SQL MI Instance Pools

Djordje Marinkovic shows off what’s new:

When migrating small SQL Server instances to Azure it is often the case that a single SQL Managed Instance turns out to be overkill in terms of size and, consequently, cost. The oversizing problem can happen whenever very small instances are required, for example when an ISV company builds a multi-tenant app requiring a small SQL MI instance for each customer. In such cases the smallest size (4-vCores) for a single SQL MI can still turn out to be too large and too expensive for the given use case. This is where SQL MI pools (“instance pools”) deliver great value.

Click through for more information on instance pools, as well as new features for instance pools.

Comments closed

UUIDs in Postgres

Ryan Lambert is not a fan:

This month’s PgSQLPhriday #015 topic is about UUIDs. Lætitia Avrot is this month’s host, see her post for the full challenge text. The topic is described as a debate between the Database People and Developers. I’m not sure there’s such a clean divide on people’s opinions on the topic, as I know plenty of Database People that have settled on using UUIDs as their default. Similarly, I know even more developer types that have followed the arguably more conventional choice of using an auto-incrementing ID.

Click through for the debate. Most of this is similar to the argument on the SQL Server side, though without the specter of clustered indexes added to the mix.

Comments closed

Basic Git for the DBA

Steve Jones provides a primer on Git:

One of the really interesting things over the last decade is both the rise of Git as the main VCS system for most projects, and the number of people who know nothing about it. Many of the latter seem intimidated, which is both interesting and strange to me. I usually think of technologies as ready to experiment, but I find more and more they only experiment in narrow boundaries.

This post outlines a quick way to get started with Git.

Click through to learn more.

Comments closed

Using Feature Flags with Data Projects

Ben Johnston builds out feature flags:

My motivation for writing this summary was an interaction with a project owner that didn’t understand why we couldn’t use feature flags directly in Power BI to control the user interface. This was different from our other deployments, so it took a few rounds of explanations to convince them that our use case didn’t support feature flags. It’s an oversimplification to say they can’t be used in data projects. They can be used in Power BI and other reporting tools, but the implementation is different from coding languages and their usage is limited in comparison. Feature flags can also be used in ETL tools, data engines, ETL tools, and other data tools, but with some caveats. Sometimes those caveats are severe enough that you will want to carefully consider how you use feature flags in your data projects.

Read the whole thing. The way Ben lays things out reminds me of why I historically haven’t been the biggest fan of feature flags, though they can be quite useful for application development purposes.

Comments closed

Preserving Non-Occurring Levels in R

Sebastian Sauer saves the levels:

The summary table does not show the level TRUE, as it is not occuring in the data. This can be problematic: If the data is unknown before summarizing and you would expect that both/all levels (TRUE, FALSE) occur. Just imagine that a subsequent function will count the level TRUE and the level FALSE. If one level is missing, your system may brake down.

Click through for a solution, where, even if your dataset is missing a particular level (value of a categorical variable), you will still see it in the final output. That way, if you train a model on this data and the new level shows up in your test dataset or in the wild, it won’t cause an error.

Comments closed