Category: Cloud

Customer-Managed Keys in Microsoft Fabric

Sumiran Tandon makes an announcement:

Customer managed keys were launched in preview, offering workspace administrators the ability to use keys in Azure Key Vault and Managed HSM, to protect data in certain Fabric items. Now, we are extending the encryption support to more Fabric workloads. You can now create Fabric Warehouses, Notebooks and utilize the SQL Analytics Endpoint in workspaces enabled with encryption using your keys. The changes are rolling out and should be available in all regions over the next few days.

Freddie Santos digs into what this means for Fabric Warehouse and the SQL analytics endpoint:

Fabric already ensures that your data is encrypted at rest using Microsoft-managed keys. But for many organizations—especially in regulated industries—encryption alone isn’t enough. They need the ability to control and manage the keys that protect their data, aligning with internal compliance requirements, regulatory standards, and governance best practices.

Plenty of companies treat this as a hard requirement for adopting a product, but I should point out that even if you don’t bring your own key, Microsoft still encrypts your data at rest using keys it generates and manages.

Updates to Microsoft Fabric Dataflows Gen2

Nikola Ilic digs into some announcements:

In the ocean of announcements from the recent FabCon Europe in Vienna, one that may have gone under the radar was about the enhancements in performance and cost optimization for Dataflows Gen2.

Before we delve into explaining how these enhancements impact your current Dataflows setup, let’s take a step back and provide a brief overview of Dataflows. For those of you who are new to Microsoft Fabric – a Dataflow Gen2 is the no-code/low-code Fabric item used to extract, transform, and load the data (ETL).

It sounds like these changes move Dataflows Gen2 from the “Never choose this” option to something that has become viable in at least some circumstances.

Cross-Cloud Data Replication with Confluent

Ahmed Saef Zamzam and Hannah Miao move some data:

Cross-cloud replication over private networks is powered by Cluster Linking, Confluent’s fully managed, offset-preserving replication service that mirrors topics across clusters. Cluster Linking already makes it simple to connect environments across regions, clouds, and hybrid deployments with near-zero data loss. Now, with private cross-cloud replication, the possibilities expand even further, enabling secure multicloud data sharing, disaster recovery, and compliance use cases that many organizations, particularly those in regulated industries, have struggled to solve for years.

Click through to see how it works and how it can beat the mechanisms that came before it.

Migrating from Apache Airflow 2 to 3 on Amazon MWAA

Anurag Srivastava, et al., perform a migration:

Apache Airflow 3.x on Amazon MWAA introduces architectural improvements such as API-based task execution that provides enhanced security and isolation. Other major updates include a redesigned UI for better user experience, scheduler-based backfills for improved performance, and support for Python 3.12. Unlike in-place minor Airflow version upgrades in Amazon MWAA, upgrading to Airflow 3 from Airflow 2 requires careful planning and execution through a migration approach due to fundamental breaking changes.

This migration presents an opportunity to embrace next-generation workflow orchestration capabilities while providing business continuity. However, it’s more than a simple upgrade. Organizations migrating to Airflow 3.x on Amazon MWAA must understand key breaking changes, including the removal of direct metadata database access from workers, deprecation of SubDAGs, changes to default scheduling behavior, and library dependency updates. This post provides best practices and a streamlined approach to successfully navigate this critical migration, providing minimal disruption to your mission-critical data pipelines while maximizing the enhanced capabilities of Airflow 3.

Read on to see what has changed between these two major versions of Airflow, recommendations on what to look out for, and a step-by-step migration guide.
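
Before a migration like this, it can help to get a rough list of DAG files that touch the areas called out above. As a minimal sketch, the following Python scans DAG source text for two of those areas: SubDAG usage and direct metadata database sessions. The pattern list and the function name `find_airflow3_risks` are my own assumptions for illustration, not something from the AWS post or an official Airflow tool, and a text match is only a hint that a file deserves a closer look.

```python
import re

# Hypothetical, non-exhaustive patterns chosen for illustration only.
PATTERNS = {
    "SubDAG usage": re.compile(r"\bSubDagOperator\b"),
    "direct metadata database session": re.compile(
        r"\b(create_session|provide_session|settings\.Session)\b"
    ),
}


def find_airflow3_risks(source: str) -> list[tuple[int, str]]:
    """Return (line number, label) pairs for lines matching a pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


if __name__ == "__main__":
    # A made-up DAG fragment, used only to exercise the scanner.
    sample = (
        "from airflow.operators.subdag import SubDagOperator\n"
        "from airflow.utils.session import provide_session\n"
    )
    for lineno, label in find_airflow3_risks(sample):
        print(f"line {lineno}: {label}")
```

A scan like this will not catch everything, such as changed scheduling defaults or library version conflicts, so treat it as a starting point next to the migration guide rather than a replacement for it.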

Updates to Fabric Data Factory

Abhishek Narain has a list of updates:

Workspace Private Link Support for Data Factory (Preview): Microsoft Fabric enables secure data integration through Private Link support in Dataflows Gen2, Pipelines, and Copy jobs. This ensures that inbound data access remains isolated and compliant within protected workspaces. By leveraging VNet data gateways, organizations can securely connect to data sources across Private Link-enabled environments—eliminating exposure to public networks and reinforcing enterprise-grade security for sensitive data operations.

Most of these are security-related updates, with a mixture of things now GA, things currently in preview, and a pair of items coming soon.

Set MAXDOP in Azure SQL DB

Brent Ozar has a public service announcement:

In Azure SQL DB, you set max degrees of parallelism at the database level. You right-click on the database, go into properties, and set the MAXDOP number.

I say “you” because it really is “you” – this is on you, bucko. Microsoft’s magical self-tuning database doesn’t do this for you.

And where this backfires, badly, is that Azure SQL DB has much, much lower caps on the maximum number of worker threads your database can consume before it gets cut off. 

Click through to see what kind of error message you get and just how low these limits are.
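
If you would rather script this than right-click, the same setting is available in T-SQL as a database-scoped configuration. Here is a minimal Python sketch that only builds the statement. The function name `build_maxdop_statement` is hypothetical, the value 4 is just an example and not a recommendation, and you would still need to run the result against the target database with whatever driver you normally use.

```python
def build_maxdop_statement(maxdop: int) -> str:
    """Return the T-SQL that sets MAXDOP for the current database."""
    # Reject bools, non-integers, and negative values before building the SQL.
    if isinstance(maxdop, bool) or not isinstance(maxdop, int) or maxdop < 0:
        raise ValueError("maxdop must be a non-negative integer")
    return f"ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = {maxdop};"


if __name__ == "__main__":
    # Example value only; pick a number that suits your workload and tier.
    print(build_maxdop_statement(4))
```

Validating the number before formatting it into the string keeps arbitrary text out of the statement, which matters if the value comes from a config file or a command-line argument.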

A Primer on GitHub Actions

Temidayo Omoniyi provides an introduction to GitHub Actions workflows:

In today’s fast-paced development cycles, the pressure to ship high-quality code quickly is greater than ever before. However, teams are frequently slowed down by the tedious, labor-intensive, and error-prone procedures that stand between producing code and releasing it to consumers.

Every developer faces these common issues:

  • Repetitive Checks: Before each push, unit tests, linters, and build scripts are manually executed.
  • Inconsistent Environments: Code that passes locally in one environment but fails in another is known as the “it works on my machine” dilemma.
  • High-Stakes Deployments: Deploying code by following a meticulous, manual checklist in which even one mistake could result in downtime.
  • Slow Comments Loops: The review process is prolonged when you wait for a coworker to pull your branch, run tests, and provide comments on a pull request.

I like GitHub Actions workflows a lot. Once you’ve put together a workflow or two, it’s pretty easy to see what’s going on. On top of that, there is a huge amount of functionality and an enormous number of third-party templates to extend it even further.

Contrasting Microsoft Fabric, Databricks, and Snowflake

Ron L’Esteve builds a comparison chart:

Databricks and Microsoft Fabric are two of the most innovative Unified Data and Analytics intelligence platforms available on the market today. While similar, each brings its own advantages and limitations. Snowflake joins these two powerhouses when data warehouse decisioning comes into play. Sometimes it is challenging to decide which one to pick for your organization’s needs. This tip will help with uncovering when to choose Databricks vs Fabric vs Snowflake.

When it comes to Spark performance, Databricks is always going to win—they keep most of their optimizations to themselves, so anyone starting from open-source Spark is at a disadvantage. Otherwise, it’s a bit of a slugfest between Fabric and Databricks. At the end, Ron also brings in Snowflake, focusing on the data warehousing side of things for that three-way comparison. I don’t think there’s a clear winner among the three, and on net, that’s probably a good thing, as it forces the groups to continue competing.

Installing SQL Server 2025 RC0 on an Azure VM

Koen Verbeeck performs an installation:

I already had a virtual machine in Azure, running SQL Server 2025 CTP 2.0 (which uses a pre-made image). I explain how to set that one up in the article Install SQL Server 2025 Demo Environment in Azure. But I wanted to use the latest preview, which is Release Candidate 0 at the time of writing. Unfortunately, there’s no image available (yet?), so I had to do it the old-school way: installing SQL Server manually.

Read on to see how to do it, as well as a few extra things necessary to make everything work well in Azure.

Architectural Guidance for IoT Deployments in Azure

Bhimraj Ghadge shares some tips:

Edge computing, a strategy for computing on location where data is collected or used, allows IoT data to be gathered and processed at the edge, rather than sending the data back to a data center or cloud. Together, IoT and edge computing are a powerful way to rapidly analyze data in real time.

In this tutorial, I am trying to lay out the components and considerations for designing IoT solutions based on Azure IoT and services.

Read on for an overview of IoT components in Azure, as well as several things to keep in mind during systems design and implementation.
