
Day: October 3, 2023

Apache Kafka Consumer Group Strategy

Lucia Cerchie gives us some advice:

Ever dealt with a misbehaving consumer group? Imbalanced broker load? This could be due to your consumer group and partitioning strategy! 

Once, on a dark and stormy night, I set myself up for this error. I was creating an application to demonstrate how you can use Apache Kafka® to decouple microservices. The function of my “microservices” was to create latte objects for a restaurant ordering service. It was set up a little like this:

I wanted to implement this in Kafka by using consumers, each reading from a common coffee topic, but with their own partition. Now this was a naive approach. Why? 

Click through to learn the reason, as well as some of the mechanics of how consumer groups work.
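As a rough illustration of the consumer group mechanics mentioned above, here is a minimal sketch in plain Python. It does not use any Kafka client. The function `assign_partitions` and the member names are hypothetical, and the logic is a simplified, range-style split rather than Kafka's actual assignor code. The point it tries to show: within one group, each partition goes to exactly one member, and who gets which partition depends on group membership rather than on what each consumer "wants".

```python
def assign_partitions(partitions, consumers):
    """Simplified range-style split of one topic's partitions across group members.

    Hypothetical helper for illustration only; not Kafka's client implementation.
    """
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    if not members:
        return assignment
    per_member, extra = divmod(len(partitions), len(members))
    start = 0
    for index, member in enumerate(members):
        # The first `extra` members each take one additional partition.
        count = per_member + (1 if index < extra else 0)
        assignment[member] = list(partitions[start:start + count])
        start += count
    return assignment


coffee_partitions = [0, 1, 2]  # assumed: a "coffee" topic with three partitions

three = assign_partitions(coffee_partitions, ["espresso", "milk", "foam"])
two = assign_partitions(coffee_partitions, ["espresso", "milk"])
four = assign_partitions(coffee_partitions, ["espresso", "milk", "foam", "syrup"])

print(three)
print(two)
print(four)
```

In this sketch, removing a member shifts its partition onto a surviving member, and a fourth member on a three-partition topic is left with nothing to read. A real group behaves along similar lines when it rebalances, though the exact outcome depends on the configured assignment strategy.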


Plotting Decision Trees in R

Steven Sanderson builds a tree:

Decision trees are a powerful machine learning algorithm that can be used for both classification and regression tasks. They are easy to understand and interpret, and they can be used to build complex models without the need for feature engineering.

Once you have trained a decision tree model, you can use it to make predictions on new data. However, it can also be helpful to plot the decision tree to better understand how it works and to identify any potential problems.

In this blog post, we will show you how to plot decision trees in R using the rpart and rpart.plot packages. We will also provide an extensive example using the iris data set and explain the code blocks in simple terms.

Read on to see an example of how to do this.


Data Center Staffing Disasters

Steve Jones reads an after-action report:

There was a failure recently at an Azure data center in Australia when a utility power sag caused equipment to trip offline. You can read about it here, but essentially the headline is that only three people were on site when the incident occurred, which left them unable to restart the equipment in time to prevent an outage.

Read on to learn more about why this failed and what Steve has seen in the wild.


Options for Running Jobs against Azure SQL DB

Anthony Norwood replaces on-prem SQL Agent jobs:

Both SQL Server on Azure VM and Azure SQL Managed Instance provide you with SQL Server Agent and therefore the capability to run scheduled tasks against your databases. So when we’re talking about being able to run jobs, we’re only considering Azure SQL Database as needing guidance. Some of the suggestions in the following paragraphs can also apply to the other SQL Server options, though they are perhaps less necessary there.

We’re going to provide you with four options for how you might still be able to run your favourite SQL Agent jobs against an Azure SQL Database, each of which comes with its own advantages and disadvantages. One option not mentioned is Data Factory, sometimes referred to as SSIS in the cloud; this is because we’re trying to focus on options that may be more comfortable for people who have never built an SSIS package before.

Read on for the four options Anthony has for us.


MERGE is (Kinda) Okay

Hugo Kornelis performs a survey:

The MERGE statement compares source and target data, and then inserts into, updates, and deletes from the target table, all in a single statement. This statement was introduced in SQL Server 2008. I liked it, because it allows you to replace a set of multiple queries with just one single query. And while a statement with that many options necessarily has a more complex syntax, I still believe that, in most cases, a single MERGE statement is easier to read, write, and maintain, than a combination of at least an INSERT and an UPDATE, often a DELETE, and sometimes first a SELECT into a temporary table if the source is complex.
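To make the "insert, update, and delete in one statement" idea concrete, here is a small in-memory analogy in Python. It is only a sketch of the semantics described above, not how SQL Server executes MERGE. The function `merge_rows` and the sample rows are hypothetical. The three branches loosely correspond to the T-SQL clauses `WHEN MATCHED`, `WHEN NOT MATCHED BY TARGET`, and `WHEN NOT MATCHED BY SOURCE`.

```python
def merge_rows(target, source):
    """Apply source rows to target, keyed by primary key, and report what changed.

    Hypothetical helper: dictionaries stand in for tables, keys for primary keys.
    """
    actions = {"inserted": [], "updated": [], "deleted": []}
    for key, row in source.items():
        if key in target:
            # Key exists on both sides: overwrite the target row (matched).
            target[key] = row
            actions["updated"].append(key)
        else:
            # Key exists only in the source: add it (not matched by target).
            target[key] = row
            actions["inserted"].append(key)
    # Keys that exist only in the target are removed (not matched by source).
    for key in [k for k in target if k not in source]:
        del target[key]
        actions["deleted"].append(key)
    return actions


target_table = {1: "latte", 2: "mocha", 3: "drip"}
source_table = {2: "flat white", 3: "drip", 4: "cortado"}

result = merge_rows(target_table, source_table)
print(result)
print(target_table)
```

Without a single merge step, the same work would typically be split across a separate update, insert, and delete, which is the combination the excerpt compares MERGE against.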

Click through for a review of a variety of problems people have had in the past. It surprised me a bit when I learned how few of these issues were still active problems caused by MERGE.
