
Author: Kevin Feasel

What’s New in Apache Kafka 4.1.0

Mickael Maison lays out some changes:

The Apache Kafka community is proud to announce the release of Apache Kafka® 4.1.0. This blog post highlights the many new features and improvements included in this release. For a full list of changes, be sure to check the release notes.

Queues for Kafka (KIP-932) is now in preview. It’s still not ready for production, but you can start evaluating and testing it. See the preview release notes for more details.

This release also introduces a new Streams Rebalance Protocol (KIP-1071) in early access. It is based on the new consumer group protocol (KIP-848).

Read on for another 15 or so completed items.


Defending Kubernetes

Joey D’Antoni defends the defensible:

I’ve seen a couple of posts (of course they were chock full of AI slop images) on LinkedIn in the last couple of weeks, talking about how challenging it is to implement Kubernetes. In the most recent post I saw, it stated that “it took 5 months for our CEO to implement Kubernetes for our app”, to which I would ask, why the hell is your CEO configuring your clusters? I designed and implemented the Kubernetes infrastructure on my current project, and I’ve worked on it for a while, so of course, I felt the need to share my opinions on the matter.

As far as Kubernetes on-premises goes, there are quite valid reasons to run it on-prem. Yeah, it’s easier to host in AKS or EKS, but that’s not always possible. But regardless of whether you’re hosting on-prem or in a cloud provider, Kubernetes requires solid knowledge across several areas, including networking, storage administration, systems administration, and CI/CD, not to mention the development skills needed for containerization.

I think Joey downplays the skill level required, but I don’t want to err in the opposite direction by overstating the challenge. Still, if you want anything beyond the bog-standard deployment of AKS/EKS, the “You must be this tall to ride the ride” sign sits significantly higher than it does for other containerized solutions like Azure App Service/Container Apps or Elastic Container Service.


A Primer on Markdown

Mike Robbins introduces Markdown:

Markdown is the standard for writing technical documentation at Microsoft and many other organizations. Its simplicity, readability, and compatibility with other tools make it an ideal choice for blogging, documenting software, procedures, APIs, and more. Whether you’re authoring a user guide, README, or knowledge base article, Markdown enables you to focus on content without getting bogged down in formatting.

As a technical writer, you’re expected to deliver clear, maintainable documentation that works across platforms. Markdown helps you do exactly that, with minimal friction.

The biggest challenge I experience with Markdown is figuring out what’s actually supported in some given implementation of Markdown. Most of the basics will be the same, but as soon as you get into things like nested lists, images, etc., support varies significantly.


Building Storage Tiers with Pure Storage in PowerShell

Anthony Nocentino creates a medallion storage layout:

In modern IT environments, not all workloads require the same level of storage performance, protection, or cost. Some applications need high performance with aggressive data protection, while others are perfectly fine with lower performance in exchange for cost savings. This tiered approach to storage service delivery is fundamental to efficient infrastructure management.

In my previous post on Fusion, I took an application-centric approach, showing how to deploy SQL Servers using Fusion. Let’s switch gears now and learn how to define a storage service catalog. In this post, I’ll demonstrate how to build a complete storage service catalog using Pure Storage Fusion Presets, offering Bronze, Silver, and Gold tiers with optional replication. We’ll see how to leverage different array types (FlashArray //X and FlashArray //C) to optimize both performance and cost across your fleet.

Read on for a link to the code, as well as more information on how it works.


What’s New in Microsoft Fabric Data Warehouse

Sowmya Sivaraman has an update:

Welcome to the August 2025 edition of What’s New in Fabric Warehouse. As summer winds down, despite August being a slower month, our team continued to deliver meaningful updates. We shipped several new features focused on enhancing data ingestion, improving data management, and streamlining security. At the same time, much of our energy is going into preparing exciting announcements for FabCon Vienna — stay tuned for what’s coming next. Whether you’re optimizing workloads, building with SQL, or exploring new integrations, this roundup highlights improvements we think you’ll find valuable.

Click through for a list of changes.


Join Strategies in Apache Spark

Ram Ghadiyaram looks at three join strategies in Apache Spark:

In this article, we are going to discuss three essential joins of Apache Spark.

The data frame or table join operation is most commonly used for data transformations in Apache Spark. With Apache Spark, a developer can use joins to merge two or more data frames according to specific (sortable) keys. Writing a join operation has a straightforward syntax, but occasionally the inner workings are obscured. Apache Spark internal API suggests several algorithms for joins and selects one. A basic join operation could become costly if you do not know what these core algorithms are or which one Spark uses.

This is not a comprehensive list, but it does cover three of the more common strategies when dealing with larger datasets.
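
To make the idea of join algorithms concrete, here is a minimal sketch in plain Python rather than Spark. The excerpt does not name the three strategies, so this assumes a hash-based join and a sort-merge join are representative; the function names and sample rows are made up for illustration and are not Spark APIs.

```python
from collections import defaultdict


def hash_join(small, large, key):
    """Build a hash table on the smaller input, then probe it with the larger one."""
    table = defaultdict(list)
    for row in small:
        table[row[key]].append(row)
    # Each row of the large side looks up its matches in constant time.
    return [{**s, **l} for l in large for s in table.get(l[key], [])]


def sort_merge_join(left, right, key):
    """Sort both inputs on the key, then walk them in lockstep."""
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right-side rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out


# Hypothetical sample data.
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
orders = [
    {"id": 2, "total": 5},
    {"id": 1, "total": 7},
    {"id": 1, "total": 9},
    {"id": 3, "total": 1},
]

hashed = hash_join(customers, orders, "id")
merged = sort_merge_join(customers, orders, "id")
```

Both functions return the same matched rows, just in a different order. The hash version only pays off when one side is small enough to hold in memory, while the sort-merge version needs keys that can be ordered, which is roughly the trade-off an engine weighs when it picks a strategy.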


Splitting to a Table via Regular Expression

Louis Davidson creates a table:

Continuing on with the REGEXP_ functions series, the next one I want to cover is the table valued function REGEXP_SPLIT_TO_TABLE. This function is definitely one of the ones you probably ought to know, especially if you are ever tasked to pull some data out of a data structure.

This function is a lot like the STRING_SPLIT function, and unlike things like the REGEXP_LIKE function, you can basically use the same main parameters as you used in STRING_SPLIT for simple cases, but from there the possibilities are a lot more endless because you can define almost any delimiters you want. It isn’t perfect, because of a few things, but we will discuss that more later on.

Read on to see how it works, including one major caveat.
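
As a rough analogy for why a regex-based splitter goes further than a fixed-delimiter one, here is a small Python sketch. It is not T-SQL and does not reproduce the behavior of REGEXP_SPLIT_TO_TABLE; the `split_to_table` helper, its (value, ordinal) row shape, and the sample string are assumptions made for illustration.

```python
import re


def split_to_table(text, pattern):
    """Return (value, ordinal) rows, one per piece found between delimiter matches."""
    pieces = re.split(pattern, text)
    return [(value, ordinal) for ordinal, value in enumerate(pieces, start=1)]


sample = "red, green;blue | teal"

# A fixed delimiter only handles one separator and leaves stray whitespace.
fixed = sample.split(",")

# A pattern can accept several separators and absorb surrounding whitespace.
rows = split_to_table(sample, r"\s*[,;|]\s*")
```

With the single comma delimiter, the second piece still contains the other separators, whereas the pattern-based split yields four clean values with their positions.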


Oracle’s LOGMINER and STREAMS Tools in the Modern Era

David Fitzjarrell looks at two classic tools:

Change is good, and occasionally Oracle changes utilities to make them easier to implement. Over the years a tool called LOGMINER has been available for various replication tasks, such as logical standby databases and an older product called STREAMS as well as updated tools such as GoldenGate. Let’s look into this topic again, with versions from 19c onward.

Click through for a bit of history on both tools, as well as where they’re at today.


Making XGBoost Run Faster

Ivan Palomares Carrascosa shares a few tips:

Extreme gradient boosting (XGBoost) is one of the most prominent machine learning techniques used not only for experimentation and analysis but also in deployed predictive solutions in industry. An XGBoost ensemble combines multiple models to address a predictive task like classification, regression, or forecasting. It trains a set of decision trees sequentially, gradually improving the quality of predictions by correcting the errors made by previous trees in the pipeline.

In a recent article, we explored the importance and ways to interpret predictions made by XGBoost models (note we use the term ‘model’ here for simplicity, even though XGBoost is an ensemble of models). This article takes another practical dive into XGBoost, this time by illustrating three strategies to speed up and improve its performance.

Read on for two tips to reduce operational load and one to offload it to faster hardware (when possible).
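
The "train trees sequentially, each correcting the errors of the previous ones" idea from the excerpt can be sketched in a few lines of plain Python. This is ordinary gradient boosting for squared error using one-split stumps, not XGBoost's actual algorithm or API; the function names, the toy data, and the `rounds` and `learning_rate` values are all assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (least squares)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean


def mean_squared_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit stumps one after another, each on the errors left by the ensemble so far."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(xs)
    history = [mean_squared_error(ys, preds)]
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        history.append(mean_squared_error(ys, preds))

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict, history


# Toy data: y = x^2 on eight points.
xs = list(range(1, 9))
ys = [x * x for x in xs]
predict, history = boost(xs, ys)
```

The `history` list records training error before the first stump and after each round, so it shows the ensemble improving step by step. Because every round depends on the residuals from the round before, the tree-building loop itself cannot simply be run in parallel, which is part of why speed-up advice tends to focus on doing less work per round or on faster hardware.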
