Category: Cloud

Kafka Topic Management in Amazon MSK

Swapna Bandla, et al., dig into a managed service:

If you manage Apache Kafka today, you know the effort required to manage topics. Whether you use infrastructure as code (IaC) solutions or perform operations with admin clients, setting up topic management takes valuable time that could be spent on building streaming applications.

Amazon Managed Streaming for Apache Kafka (Amazon MSK) now streamlines topic management by supporting new topic APIs and console integration. You can programmatically create, update, and delete Apache Kafka topics using familiar interfaces including AWS Command Line Interface (AWS CLI), AWS SDKs, and AWS CloudFormation. With these APIs, you can define topic properties such as replication factor and partition count and configuration settings like retention and cleanup policies. The Amazon MSK console integrates these APIs, bringing all topic operations to one place. You can now create or update topics with a few selections using guided defaults while gaining comprehensive visibility into topic configurations, partition-level information, and metrics. You can browse for topics within a cluster, review replication settings and partition counts, and go into individual topics to examine detailed configuration, partition-level information, and metrics. A unified dashboard consolidates partition topics and metrics in one view.

In this post, we show you how to use the new topic management capabilities of Amazon MSK to streamline your Apache Kafka operations. We demonstrate how to manage topics through the console, control access with AWS Identity and Access Management (IAM), and bring topic provisioning into your continuous integration and continuous delivery (CI/CD) pipelines.

Read on to see what the experience looks like using the MSK console.

Apache Airflow Jobs in Fabric Data Factory

Mark Kromer makes an announcement:

The world of data integration is rapidly evolving, and staying up to date with the latest technologies is crucial for organizations seeking to make the most of their data assets. Available now are the newest innovations in Fabric Data Factory pipelines and Apache Airflow job orchestration, designed to empower data engineers, architects, and analytics professionals with greater efficiency, flexibility, and scalability.

Read on to see what’s newly available, including some preview functionality.

Hyperthreading and SQL Server Licensing

Joe Obbish provides a warning:

Azure VMs with hyper-threading enabled are sized according to logical cores instead of physical cores. These logical cores can perform with 50% of the power of physical cores for high levels of activity but will always perform at 100% of the SQL Server licensing cost rate. As a result, moving from a busy on-premises SQL Server VM sized to an Azure VM with hyper-threading enabled can result in a surprise SQL Server licensing bill.

Joe couches this in terms of Azure, but the licensing effect is the same for on-premises hosts as well. Hyperthreading is better than not in most scenarios, though “busy with CPU-heavy SQL Server queries” is one of the exceptions. And Joe is absolutely right that SQL Server’s per-core licensing means you really want to bias toward physical cores versus hyperthreaded cores.

Azure SQL Managed Instances and CPU

Joe Obbish answers a question:

I’m going to open with a perhaps controversial statement: “when you buy 4 vCores on the Azure SQL Managed Instance platform, what you’re actually buying is 2 physical cores presented as 4 hyperthreaded cores to SQL Server”. That means that if you have 8 physical cores on your SQL Server machine today then your starting Managed Instance vCore equivalent count could be closer to 16 vCores instead of 8. Perhaps this is already well known to everyone else, but I couldn’t find any (accurate) writing on this topic so I gave it a shot.

Click through for a series of tests that do not look great for SQL Managed Instances. And it doesn’t even have to do with storage this time. Azure SQL Managed Instance has to be one of the most disappointing Azure products on hardware grounds alone.
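
To put numbers on Joe's claim, here is a minimal sketch of the arithmetic in Python. It assumes exactly two vCores per physical core, as the quote describes; the function names are mine and purely illustrative, not anything from Joe's post or from an Azure API.

```python
def vcores_for_physical_cores(physical_cores: int, threads_per_core: int = 2) -> int:
    """Starting vCore count for a workload that uses `physical_cores` today.

    Assumption (from the quote): each physical core is presented to
    SQL Server as `threads_per_core` hyperthreaded vCores.
    """
    if physical_cores < 1 or threads_per_core < 1:
        raise ValueError("core counts must be positive")
    return physical_cores * threads_per_core


def physical_cores_behind(vcores: int, threads_per_core: int = 2) -> float:
    """Physical cores backing a purchase of `vcores`, under the same assumption."""
    if vcores < 1 or threads_per_core < 1:
        raise ValueError("core counts must be positive")
    return vcores / threads_per_core


if __name__ == "__main__":
    # The two figures from the quote: 4 vCores and 8 physical cores.
    print(physical_cores_behind(4))
    print(vcores_for_physical_cores(8))
```

This is only a starting point for sizing. Joe's tests are what tell you how a real workload behaves, and a lightly loaded server may not need the full doubling.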

What’s New in SQL Database for Fabric

Idris Motiwala makes some announcements:

The new Migration Assistant for SQL databases simplifies moving SQL Server and Azure SQL workloads into Fabric. Designed for SQL developers, it imports schema via DACPACs, identifies compatibility issues, and provides clear, actionable guidance before migration. Built-in assessment and data copy workflows help teams move from evaluation to cutover with less manual effort, preserving existing SQL skills while accelerating time to value on Fabric’s unified analytics platform. Ready to simplify your SQL migration journey? We will begin rolling this out in the coming weeks, and it will soon be accessible through the Fabric portal.

Click through for more things that are currently in place, including several items that are now GA.

Partitioned Compute and Fabric Dataflow Performance

Chris Webb performs a test:

Partitioned Compute is a new feature in Fabric Dataflows that allows you to run certain operations inside a Dataflow query in parallel and therefore improve performance. While UI support is limited at the moment it can be used in any Dataflow by adding a single line of fairly simple M code and checking a box in the Options dialog. But as with a lot of performance optimisation features (and this is particularly true of Dataflows) it can sometimes result in worse performance rather than better performance – you need to know how and when to use it. And so, in order to understand when this feature should and shouldn’t be used, I decided to do some tests and share the results here.

Click through for the test, the result, and an open door for subsequent analysis.

Deploy Microsoft Fabric Items with fabric-cicd in Azure DevOps

Kevin Chant announces a new Azure DevOps extension:

This post covers how you can simplify Microsoft Fabric deployments with “Deploy Microsoft Fabric items with fabric-cicd”. Which is an Azure DevOps extension that I recently published.

To manage expectations, this post shows how to start working with the extension and its associated task within the GUI-based classic release pipelines in Azure DevOps. Like in the below screenshot.

Read on to see how the extension works.

Preview-Only Steps in Microsoft Fabric Dataflows

Chris Webb covers a new feature:

I have been spending a lot of time recently investigating the new performance-related features that have rolled out in Fabric Dataflows over the last few months, so expect a lot of blog posts on this subject in the near future. Probably my favourite of these features is Preview-Only steps: they make such a big difference to my quality of life as a Dataflows developer.

The basic idea (which you can read about in the very detailed docs here) is that you can add steps to a query inside a Dataflow that are only executed when you are editing the query and looking at data in the preview pane; when the Dataflow is refreshed these steps are ignored. This means you can do things like add filters, remove columns or summarise data while you’re editing the Dataflow in order to make the performance of the editor faster or debug data problems. It’s all very straightforward and works well.

First up, that feature is pretty interesting, though I could see things breaking if you only do your testing in the preview pane. Second, what Chris does with it is worth a look in its own right.

Troubleshooting Bad Request in ADF Pipelines

Koen Verbeeck said something bad:

A while ago I blogged about a use case where a pipeline fails during debugging with a BadRequest error, even though it validates successfully. If you’re wondering, this is the helpful error message that you get:

Click through for an image of the 400 Bad Request message, how Koen fixed it originally, and then a different scenario in which that 400 message popped up.

Ultimately, a 400 Bad Request comes down to “You sent me information that doesn’t make sense and I can’t fulfill your request, so fix it, dummy.” 400-series status codes are very rude and insulting. Especially 418: that thing has a mouth like a sailor’s.
