Press "Enter" to skip to content

Curated SQL Posts

Extracting the Month from a Date with R

Steven Sanderson asks what month it is:

Greetings fellow R enthusiasts! Today, we’re diving into a fundamental task: extracting the month from a date in R. Whether you’re new to R or a seasoned pro, understanding how to manipulate dates is essential. We’ll explore two popular methods: using base R and the powerful lubridate package. So, let’s roll up our sleeves and get started!

Read on for several examples across two solution spaces.

Comments closed

Index Counts and Write Speeds

Erik Darling tells the truth:

Before I go on about how adding an index can fix your blocking and deadlocking problems, let me warn you about how indexes can cause you problems.

Without fail, whenever I suggest adding indexes to clients, someone gets real worried about write speeds.

Like, the server can be burning down, spurting blood everywhere, and cursing our mothers in ancient Aramaic, but no, let’s stop and worry about write speeds going up by a few milliseconds.

Fully agreed: the marginal cost of an index is rarely noticeable and the marginal benefit of a well-designed index serving an existing customer base is significantly higher than its marginal cost. However, when I click that plus sign next to the Indexes folder and SSMS starts spinning because it needs to enumerate the dozens and dozens of indexes on that table, it’s a bad sign. Especially when 90% of them have “_dta_” in the name.

Click through for more thoughts on indexing, scans, and transactions.

Comments closed

Fixing the Power BI Error “Something Is Wrong with One or More Fields”

Kurt Buhler broke it and now needs to buy it:

The “grey box of death” in Power BI is unfortunately a common sight for anyone creating reports. When you see this grey error box, it means that the visual can’t obtain a valid result. If that error says “Something is wrong with one or more fields”, then there’s several possible reasons for this:

Click through for those reasons, as well as how to fix the problem. Kurt shows this both in video form and written out, so check it out.

Comments closed

New Features in Azure SQL MI Instance Pools

Djordje Marinkovic shows off what’s new:

When migrating small SQL Server instances to Azure it is often the case that a single SQL Managed Instance turns out to be overkill in terms of size and, consequently, cost. The oversizing problem can happen whenever very small instances are required, for example when an ISV company builds a multi-tenant app requiring a small SQL MI instance for each customer. In such cases the smallest size (4-vCores) for a single SQL MI can still turn out to be too large and too expensive for the given use case. This is where SQL MI pools (“instance pools”) deliver great value.

Click through for more information on instance pools, as well as new features for instance pools.

Comments closed

UUIDs in Postgres

Ryan Lambert is not a fan:

This month’s PgSQLPhriday #015 topic is about UUIDs. Lætitia Avrot is this month’s host, see her post for the full challenge text. The topic is described as a debate between the Database People and Developers. I’m not sure there’s such a clean divide on people’s opinions on the topic, as I know plenty of Database People that have settled on using UUIDs as their default. Similarly, I know even more developer types that have followed the arguably more conventional choice of using an auto-incrementing ID.

Click through for the debate. Most of this is similar to the argument on the SQL Server side, though without the specter of clustered indexes added to the mix.

Comments closed

Basic Git for the DBA

Steve Jones provides a primer on Git:

One of the really interesting things over the last decade is both the rise of Git as the main VCS system for most projects, and the number of people who know nothing about it. Many of the latter seem intimidated, which is both interesting and strange to me. I usually think of technologies as ready to experiment, but I find more and more they only experiment in narrow boundaries.

This post outlines a quick way to get started with Git.

Click through to learn more.

Comments closed

Using Feature Flags with Data Projects

Ben Johnston builds out feature flags:

My motivation for writing this summary was an interaction with a project owner that didn’t understand why we couldn’t use feature flags directly in Power BI to control the user interface. This was different from our other deployments, so it took a few rounds of explanations to convince them that our use case didn’t support feature flags. It’s an oversimplification to say they can’t be used in data projects. They can be used in Power BI and other reporting tools, but the implementation is different from coding languages and their usage is limited in comparison. Feature flags can also be used in ETL tools, data engines, ETL tools, and other data tools, but with some caveats. Sometimes those caveats are severe enough that you will want to carefully consider how you use feature flags in your data projects.

Read the whole thing. The way Ben lays things out reminds me of why I historically haven’t been the biggest fan of feature flags, though they can be quite useful for application development purposes.

Comments closed

Preserving Non-Occurring Levels in R

Sebastian Sauer saves the levels:

The summary table does not show the level TRUE, as it is not occuring in the data. This can be problematic: If the data is unknown before summarizing and you would expect that both/all levels (TRUE, FALSE) occur. Just imagine that a subsequent function will count the level TRUE and the level FALSE. If one level is missing, your system may brake down.

Click through for a solution, where, even if your dataset is missing a particular level (value of a categorical variable), you will still see it in the final output. That way, if you train a model on this data and the new level shows up in your test dataset or in the wild, it won’t cause an error.

Comments closed

The Art of the Code Review

Phil Booth shares some recommendations:

First, let’s establish what the point of code review is and also what it isn’t.

The number one, most important reason to review code is shared ownership. “Ownership” can be tricky to define in code terms, but mostly it’s a feeling. It means you understand the code, that you feel empowered to change it and the responsibility to maintain it.

Click through for Phil’s thoughts on what makes for a good code review. I’ve found that the over-the-shoulder code review isn’t nearly as effective as you’d hope, and a proper code review can take a considerable amount of time, up to hours or days for a large change.

Comments closed