Press "Enter" to skip to content

Curated SQL Posts

Postgres Synchronous Replication Guarantees

Kaarel Moppel has a public service announcement:

At last week’s local Postgres user group meetup here in Estonia, one of the topics was HA and recent Patroni (the most popular cluster manager for Postgres) improvements in supporting quorum commit, which by the way on its own has been possible to use for years. Things went deep quickly and we learned quite a bit of course. Including a good reminder that you shouldn’t build your bank on Patroni’s default synchronous mode 🙂

Anyways, during the hallway track (which sometimes are as valuable as the real ones) got an interesting question – with some 3+ quorum nodes, is Postgres then 100% bulletproof against all kinds failures? Excluding meteorites, rouge DBAs and such of course. One could think so, right? Nope.

Read on to learn what might cause failure in that scenario. Guaranteeing synchronous replication between machines over a network is a surprisingly difficult challenge.

Leave a Comment

Constraints on Polymorphic Associations in SQL Server

Jared Westover wants a foreign key, but one referencing multiple tables:

Do you like a challenge? If you answered yes, you’re my kind of person. Recently, a developer presented me with a problem: they needed a foreign key reference in one table to associate with multiple other tables. Over the years, I’ve often been asked how to make this situation work. However, achieving this relationship with foreign keys is technically impossible with SQL Server and most mainstream relational database platforms. Since SQL Server restricts foreign keys to referencing a single table, how can we solve this problem?

My immediate answer was “triggers,” which happens to be the solution Jared intentionally omits.

I’d rather go with the multiple association tables approach over multiple indicator types, as the latter requires (n-1) NULLs, where n is the number of indicator types (review types in Jared’s example) and I hate NULL because NULL is the void lying about being a value, sort of like how skim milk is water lying about being milk.

Leave a Comment

Refreshing a Power BI Semantic Model via Eventstreams

Chris Webb builds a Rube Goldberg device:

Following on from my last post where I showed how to send data from Power Automate to a Fabric Eventstream, in this post I’m going to show how to use it to solve one of my favourite problems: refreshing a Power BI semantic model that uses an Excel workbook stored in OneDrive as a source when that Excel workbook is modified.

Now before I go on I want to be clear that I know this is a ridiculously over-engineered and expensive (in terms of CUs) solution, and that you can do almost the same thing just using Power Automate or in several other different ways – see my colleague Mark Pryce-Maher’s recent videos on using Fabric Open Mirroring with Excel for example. I’m doing this to teach myself Fabric Eventstreams and Activator and see what’s possible with them. Please excuse any mistakes or bad practices.

Click through for the process.

Leave a Comment

Finding Columns in R with No Data

Steven Sanderson looks for the missing columns:

When working with real-world datasets in R, it’s common to encounter missing values, often represented as NA. These missing values can impact the quality and reliability of your analyses. One important step in data preprocessing is identifying columns that consist entirely of missing values. By detecting these columns, you can decide whether to remove them or take appropriate action based on your specific use case. In this article, we’ll explore how to find columns with all missing values using base R functions.

Click through to see how you can do this. It’s not quite as simple as missing rows (complete_cases()) but it’s also not too much of an ordeal, either.

Comments closed

Comparing Positron to RStudio

Theo Roe performs a product comparison:

Positron is the new beta Data Science IDE from Posit. Though Posit have stressed that maintenance and development of RStudio will continue, I want to use this blog to explore if Positron is worth the switch. I’m coming at this from the R development side but there will of course be some nuances from other languages in use within Positron that require some thought.

Read on for Theo’s perspective. Knowing that it’s using the same underlying framework as Visual Studio Code, I kind of wish this were an extension for VS Code rather than a separate app.

1 Comment

Deployment Parameters in Azure AI Foundry

Tomaz Kastrun continues a series on Azure AI:

Give the model instructions about how it should behave and any context it should reference when generating a response. You can describe the assistant’s personality, tell it what it should and shouldn’t answer, and tell it how to format responses. There’s no token limit for this section, but it will be included with every API call, so it counts against the overall token limit

Click through for a description of each part of the deployment parameters section.

Comments closed

Mounding ADF Instances in Microsoft Fabric

Koen Verbeeck has an existing Azure Data Factory:

We recently started using Microsoft Fabric for our cloud data platform. However, we already have quite an estate of Azure data services running in our company, including a huge number of Azure Data Factory (ADF) pipelines. It seems cumbersome to migrate all those pipelines to Microsoft Fabric, especially because some features are not supported yet and ADF is the mature choice at the moment. We like the concept of Microsoft Fabric’s centralization, where everything is managed in one platform. Is there an option to manage ADF in Fabric?

Read on for the answer, but make sure to check out its limitations as well.

Comments closed

The Power of Rounding

Denny Cherry makes a change:

So, I ran across a problem with QuickBooks that involves some of the most basic math that we were all taught in elementary school: how to round numbers properly. You’d think that a company that makes accounting and invoicing software for a living would understand how rounding of numbers works. But based on the last hour of having to edit the data that gets send to QuickBooks from our internal system you’d be wrong.

Denny’s example is $3.18497736 and rounding happens after four decimal spots, so it’s $3.1849 or $3.1850. Denny expects $3.1850 and QuickBooks gives $3.1849.

In this case, Denny’s right. The part that confuses people is banker’s rounding, which has you round to the even number if your last digit is a 5. For example, if it were $3.18495 and you round to four spots after the decimal, that would be $3.1850. With $3.18485, it would round to $3.1848.

.NET uses banker’s rounding by default, which can confuse people unfamiliar with the concept. SQL Server, meanwhile, rounds the way that Denny expects: 5 or higher rounds up, 0-4 rounds down.

Comments closed