Curated SQL – Page 430 – A Fine Slice Of SQL Server

Practical Results of a ZooKeeper-less Kafka

Published 2022-11-09 by Kevin Feasel

The Kafka cluster meta-data is now only stored in the Kafka cluster itself, making meta-data update operations faster and more scalable. The meta-data is also replicated to all the brokers, making failover from failure faster too. Finally, the active Kafka controller is now the Quorum Leader, using Raft for leader election.

The motivation for giving Kafka a “brain transplant” (replacing ZooKeeper with KRaft) was to fix scalability and performance issues, enable more topics and partitions, and eliminate the need to run an Apache ZooKeeper cluster alongside every Kafka cluster.

Read on for some initial testing of KRaft versus ZooKeeper.

Diagnosing Kafka Message Throughput Reductions

Published 2022-11-09 by Kevin Feasel

Danica Fine and Nikoleta Verbeck troubleshoot an issue:

One of the greatest advantages of Kafka is its ability to maintain high throughput of data. Unsurprisingly, high throughput starts with the producers. Prior to sending messages off to the brokers, individual records destined for the same topic-partition are batched together as a single compressed collection of bytes. These batches are then further aggregated before being sent to the destination broker.

Batching is a great thing, and we (generally) want it. But how do you know when it’s working well and when it’s not?

This first post covers message throughput, but the series will cover several other topics as well.
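As a rough illustration of the batching idea in the excerpt above, here is a minimal Python sketch that groups records by topic-partition and closes a batch once it would exceed a size cap. It is a toy model, not the Kafka producer's actual implementation: the function name `batch_records` and the parameter `max_batch_bytes` are hypothetical (the latter is loosely analogous to the producer's `batch.size` setting), and compression and time-based flushing (`linger.ms`) are left out entirely.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (topic, partition) pair; records for the same pair share a batch.
TopicPartition = Tuple[str, int]
Record = Tuple[str, int, bytes]  # (topic, partition, payload)


def batch_records(
    records: List[Record], max_batch_bytes: int
) -> List[Tuple[TopicPartition, List[bytes]]]:
    """Group records into per-partition batches (hypothetical helper).

    A batch is closed when adding the next payload would push its total
    payload size past max_batch_bytes. Remaining open batches are flushed
    at the end.
    """
    open_batches: Dict[TopicPartition, List[bytes]] = defaultdict(list)
    open_sizes: Dict[TopicPartition, int] = defaultdict(int)
    closed: List[Tuple[TopicPartition, List[bytes]]] = []

    for topic, partition, payload in records:
        key = (topic, partition)
        # Close the current batch if this payload would overflow it.
        if open_batches[key] and open_sizes[key] + len(payload) > max_batch_bytes:
            closed.append((key, open_batches.pop(key)))
            open_sizes[key] = 0
        open_batches[key].append(payload)
        open_sizes[key] += len(payload)

    # Flush whatever is still open.
    for key, batch in open_batches.items():
        if batch:
            closed.append((key, batch))
    return closed


if __name__ == "__main__":
    sample = [
        ("orders", 0, b"aaaa"),
        ("orders", 0, b"bbbb"),
        ("orders", 1, b"cc"),
        ("orders", 0, b"cccc"),
    ]
    for key, batch in batch_records(sample, max_batch_bytes=8):
        print(key, len(batch), "record(s)")
```

In this model, fewer and fuller batches per partition mean fewer requests for the same data, which is the intuition behind asking whether batching is working well.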

Defining Production-Grade Code

Published 2022-11-09 by Kevin Feasel

Chad Callihan takes a detour:

You understand what you’re thinking when you write code but will you remember it in a few weeks or months? What about the new associate that has to work with your code? Will they be able to decipher what you were thinking based on code alone?

If code isn’t documented, it can make work unnecessarily difficult. For production-grade code, I would expect it to be well documented so even someone with minimal knowledge can get an idea of what the code is doing. It’s not necessary to write a novel for documentation but having something is better than nothing.

Read on for more thoughts on what makes code production-grade.

Conditional Formatting from Text in Power BI

Published 2022-11-09 by Kevin Feasel

Mara Pereira shows us a trick:

Have you ever wondered if you can apply conditional formatting based on a text field/measure instead of a numeric field/measure?

If your answer is yes, then this trick is for you!

The other day I was working with a customer who asked something that I had no idea how to build.

They wanted to apply conditional formatting over some of their visuals, but they wanted the conditional formatting applied over a text field and not over a numeric field or a measure.

Read on to see how.

Qualities of Production-Grade Code

Published 2022-11-09 by Kevin Feasel

Aaron Bertrand pulls out the list:

In a lot of programming languages, efficiency is almost always the guidepost. Sometimes, minimizing character count or line count is a “fool’s gold” measure of the quality of the code itself. Other times, unfortunately, engineers are judged not by quality at all, but rather the sheer number of lines of code produced – where more is, somehow, obviously better. Over my career, “how much code there is” has never been a very meaningful measure in any language.

But I’m here to talk about T-SQL, where certainly efficiency is a good thing to measure – though there are some caveats to that:

Read on for those caveats and what Aaron considers to be the hallmarks of high-quality code.

Horizontal Fusion in DAX

Published 2022-11-09 by Kevin Feasel

Marco Russo and Alberto Ferrari put on their lab coats:

Fusion is a DAX optimization that reduces the number of storage engine queries when the engine detects that multiple calculations can be merged together in a single query. There are two types of fusions: vertical fusion and horizontal fusion.

Vertical fusion occurs when multiple measures – or calculations in general – need to be computed in the same filter context. For example, the following query requires the calculation of two measures: Sales Amount and Margin:

Read on to see how horizontal fusion differs and when it can be most useful.

PGSQL Phriday Roundup

Published 2022-11-09 by Kevin Feasel

Andreas Scherbaum gets answers:

Last week for “PGSQL Phriday” I posted the following task:

Describe how you do backups for your PostgreSQL databases.

And I added a bonus question: Is pg_dump a backup tool?

Click through to see what people came up with.

Using Shiny on Python

Published 2022-11-08 by Kevin Feasel

David Saipe crosses the streams:

As someone who has zero experience using Shiny in R, the recent announcement that the framework had been made available to Python users inspired an opportunity for me to learn a new concept from a different perspective to most of my colleagues. I have been tasked with writing a Python related blog post, and having spent the past few weeks carrying out an analysis of Jumping Rivers’ Twitter data (@jumping_uk), creating a dashboard to display some of my findings and then writing about it seemed like a nice way to cap off my 6-week summer placement at Jumping Rivers.

This post will take you through some of the source code for the dashboard I created, whilst I provide a bit of context for the Twitter project itself. For a more bare-bones tutorial on using Shiny for Python, you can check out another recent Jumping Rivers blog post here. I suggest reading this first.

Read on to see how you can get started with Shiny on Python and what David thinks about the experience.

Securing a Kafka Cluster

Published 2022-11-08 by Kevin Feasel

Dan Weston aims to secure an Apache Kafka cluster:

As part of our educational resources, Confluent Developer now offers a course designed to help you apply Confluent Cloud’s security features to meet the privacy and security needs of your organization. This blog post explores the need to implement security for your Apache Kafka® cluster, then briefly reviews the security features and advantages of using Confluent Cloud.

Click through for an overview. The course itself is free, as well.

Querying Multiple Indexes on One Table

Published 2022-11-08 by Kevin Feasel

Daniel Hutmacher answers a question:

Can SQL Server piece together two different indexes in a single-table query, rather than just giving up and scanning a suboptimal clustered index? The short answer is: yes, in a fairly narrow band of conditions.

The actual answer is a lot more restrictive than you might initially think.
