Curated SQL – Page 170 – A Fine Slice Of SQL Server

Tips for Scaling Apache Kafka

Published 2025-02-11 by Kevin Feasel

Narendra Lakshmana Gowda tunes a Kafka cluster:

Apache Kafka is known for its ability to process a huge quantity of events in real time. However, to handle millions of events, we need to follow certain best practices while implementing both Kafka producer services and consumer services.

Before start using Kafka in your projects, let’s understand when to use Kafka:

Much of the advice is pretty standard for performance tuning in Kafka, like setting batch size and linger time on the producer or managing consumers in a consumer group.

Comments closed

Avoiding Inconsistent Automated Test Results

Published 2025-02-11 by Kevin Feasel

Tien Nguyen Anh shares some advice:

A flaky test is a test that has inconsistent results in multiple test run, even when the automation script has not been changed. Sometimes it passes, and sometimes it fails, which makes it unreliable and difficult to trust for identifying issues.

Read on to understand a variety of causes and what you can do about them.

Comments closed

REPLACENULL in SSIS versus DT_DBTIMESTAMP2 Columns

Published 2025-02-11 by Kevin Feasel

Slava Murygin notes an error:

Using the “REPLACENULL” functionality frequently in the “Derived Column” component, the “Conditional Split” component, and other places in SSIS where formulas can be applied is common.

However, I recently encountered an issue with the “DT_DBTIMESTAMP2” data type.

The following formula produced an error:

REPLACENULL(TestDt, (DT_DBTIMESTAMP2,7)”1900-01-01 00:00:00.0000000″)

Error: 0xC020902A at Test Transformation, Derived Column [2]: The “Derived Column” failed because truncation occurred, and the truncation row disposition on “Derived Column.Outputs[Derived Column Output].Columns[TestDt]” specifies failure on truncation. A truncation error occurred on the specified object of the specified component.

Read on for an explanation and two alternatives.

Comments closed

Azure Elastic Jobs to Run Powershell and T-SQL

Published 2025-02-11 by Kevin Feasel

Josephine Bush kicks off a job:

I’ve covered how to create Elastic Jobs in the portal (this one is important to read if you aren’t familiar with elastic jobs already), with Terraform, and with Bicep. Now, I’ll cover how to create them and their associated objects with PowerShell. Don’t do this in prod to start. Always test in a lower environment first.

Click through for the process, as well as the script.

Comments closed

Improving Power Query CSV File Performance with Data Columns

Published 2025-02-11 by Kevin Feasel

Chris Webb makes things go faster:

A few weeks ago I replied to a question on reddit where someone was experiencing extremely slow performance when importing data from a CSV file using Power Query. The original poster worked out the cause of the problem and the solution themselves: they saw that removing all date columns from their query made their Power Query query much faster and that using the Date.FromText function and specifying the date format solved the problem. While I couldn’t reproduce the extreme slowness that was reported I was able to reproduce a performance difference between the two approaches and Curt Hagenlocher of the Power Query team confirmed that this was expected behaviour.

Read on for the example and explanation.

Comments closed

Scaling with PostgreSQL

Published 2025-02-11 by Kevin Feasel

Shayon Mukherjee shares some tips about scaling in PostgreSQL:

“Postgres was great when we started but now that our service is being used heavily we are running into a lot of ‘weird’ issues”

This sentiment is frequently echoed by CTOs and senior engineers at high-growth startups when I speak with them.

Scaling PostgreSQL successfully doesn’t always require a full team of DBAs and experts. The beauty of PostgreSQL is that solutions often lie within the database itself – by rethinking your data access patterns from first principles, you can solve many business problems at scale.

In this post, I’ll address some common “weird” issues I’ve encountered and explore solutions that work at scale.

I like some of them, though I’m not a fan of eliminating or delaying foreign key constraints, as those are important for data quality. I’m not knowledgeable enough in PostgreSQL administration to have a strong opinion on these, however.

Comments closed

Inflation in Medieval China

Published 2025-02-10 by Kevin Feasel

Richard Vale digs into a dataset:

In this post, I would like to draw attention to a very interesting data set collected by Guan, Palma and Wu as part of the replication package for their paper The rise and fall of paper money in Yuan China, 1260-1368. The paper describes inflation, money and prices during the Yuan Dynasty era in China.

First, a little historical background.

Read on for the analysis. H/T R-Bloggers.

Comments closed

Experimenting with BIT_COUNT

Published 2025-02-10 by Kevin Feasel

Louis Davidson has an idea:

I was editing an article the other day that uses the BIT_COUNT function that was added to SQL Server 2022. The solution presented is excellent, and I will try to come back and link to it here when I remember. (It will be linked the other way.

Reading that did two things for me. First it cave me an idea of how the BIT_COUNT function might be actually be used in a useful way. The solution that was presented would only work in SQL Server 2022 (It will work in earlier compatibility levels, based on the tests I have done.)

Read on for what Louis tried out.

Comments closed

Using the Microsoft Fabric Capacity Metrics App

Published 2025-02-10 by Kevin Feasel

Reitse Eskens uses a tool:

In a number of previous blogs and in my session on loadtesting Microsoft Fabric, I’ve always questioned the metrics app and one specific point is the timepoint detail. When you click on a graph, you get the option to go to the timepoint detail and read more.

This is all fun and games but looking at the list of active processes at that specific point in time, you’ll quickly see processes that are way out of the selected point in time. For me, it rendered this thing useless because it messed up the things I wanted to see.

Read on to see the right way to handle this app.

Comments closed

Inlining Views in PostgreSQL

Published 2025-02-10 by Kevin Feasel

Radim Marek shows off how the PostgreSQL database engine can inline a view:

Database VIEWs are powerful tools that often don’t get the attention they deserve when building database-driven applications. They make our database work easier in several ways:

They let us reuse common query patterns instead of writing them over and over

They give us a place to define business rules once and use them everywhere

They help us write cleaner, more organized queries

Let’s see how this works with a practical example.

I’m not sure how well PostgreSQL manages nested views, as the biggest problem in SQL Server with inlining views comes when you have overly complicated views made up of views joined to views made up of views joined to–well, you get the idea. That said, Radim does point out a variety of scenarios in which inlining doesn’t happen in PostgreSQL.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Curated SQL Posts