Category: Cloud

Goodbye, Azure ML SDK v1

I have a new video:

In this video, I cover some news from Microsoft around the deprecation of the Azure Machine Learning SDK v1. We’ll take a look at the upgrade guide and see what it will take to perform this upgrade.

Microsoft will still support the SDK v1 until September of 2026, so we have a year to get code sorted out. The CLI v1, however, will go away sooner, so be sure you’re keeping up on that.

Executing a Fabric Data Pipeline from Azure Data Factory

Koen Verbeeck leaves the confines of Microsoft Fabric:

In the blog post Call a Fabric REST API from Azure Data Factory I explained how you can call a Fabric REST API endpoint from Azure Data Factory (or Synapse if you will). Let’s go a step further and execute a Fabric Data Pipeline from an ADF pipeline, which is a common request. A Fabric capacity cannot auto-resume, so you typically have an ADF pipeline that starts the Fabric capacity. After the capacity is started, you want to kick off your ETL pipelines in Fabric, and now you can do this from ADF as well.

Click through for the process, but do check the warnings Koen offers: you either spend extra money by remaining in synchronous execution mode, or always get a positive result in asynchronous execution mode, regardless of whether the underlying Fabric Data Pipeline worked.

Running Cron Jobs in Azure Database for PostgreSQL Flexible Server

Josephine Bush schedules a task:

pg_cron is a simple cron-based job scheduler for PostgreSQL that runs inside the database as an extension. It allows you to schedule PostgreSQL commands directly from your database, similar to using cron jobs at the operating system level. pg_cron on PG Flex is pretty easy to use, which makes it simple to schedule regular database maintenance and processing tasks directly from within PostgreSQL.

Read on to see how to install the extension, and then how to manage cron jobs. Josephine also lays out some limitations when using pg_cron on Azure and how to track failed jobs.
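As a rough illustration of what scheduling a job looks like, here is a minimal Python sketch that only builds the SQL text for pg_cron's `cron.schedule(job_name, schedule, command)` function. It assumes a pg_cron version that supports named jobs. The helper names, the job name, the schedule, and the command are all hypothetical and not taken from Josephine's post, and actually running the statement needs a connection to a server with the extension enabled.

```python
def quote_literal(value: str) -> str:
    """Quote a string as a PostgreSQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_schedule_sql(job_name: str, schedule: str, command: str) -> str:
    """Return a SELECT cron.schedule(...) statement for pg_cron.

    This sketch only accepts the classic five-field cron form
    (minute, hour, day of month, month, day of week).
    """
    fields = schedule.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {schedule!r}")
    return "SELECT cron.schedule({}, {}, {});".format(
        quote_literal(job_name), quote_literal(schedule), quote_literal(command)
    )


if __name__ == "__main__":
    # Hypothetical job: run VACUUM ANALYZE every day at 03:00.
    print(build_schedule_sql("nightly-vacuum", "0 3 * * *", "VACUUM ANALYZE"))
```

The generated statement can then be run in the database where pg_cron is installed, for example from psql or any PostgreSQL client.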

Comparing Microsoft Fabric to Snowflake

Evanjalin Joseph lays out a comparison:

Take ShopSmart, a global retail chain that operates both online and offline. The company wants to combine its sales, inventory, and customer data in order to facilitate real-time reporting and predictive analytics. Two top platforms are being assessed by the IT team for this change.

Azure, Power BI, and Microsoft 365 are already widely used by ShopSmart, which is in line with Fabric’s integrated ecosystem. The alternative, however, provides more multi-cloud flexibility and strong performance on structured data. The group has to choose between selecting a more specialized warehousing solution with more deployment options or making use of its current Microsoft investments.

Let’s examine the differences between the two platforms.

Click through for an overview of each platform and how they stack up against one another.

Loading Excel from SQL Server via Power BI XMLA

Jared Westover doesn’t want to share:

Users want to pull data from tables in an Azure SQL database into Excel via Power Query. This situation sounds simple. However, I don’t want to provide direct access to the database for several reasons, including the potential governance and permissions nightmare. We have a Fabric workspace, and most of the data already exists in Power BI reports. How can we give users access to the data they need without providing direct access to the database for an easy SQL export to Excel?

Click through for the answer. This solution is a bit more roundabout than granting direct database access, but it also comes with a host of security benefits.

400 Bad Request when Debugging a Data Factory Pipeline

Koen Verbeeck runs into a problem:

I recently had a new pipeline fail. It was actually a copy of an old pipeline to which I had made some adjustments as part of a database migration. When triggered during an execution run, it failed saying some expression could not be parsed. When I went into the pipeline and triggered a debug, it immediately failed with the following helpful error message:

Click through for the error message and how Koen was able to fix the issue.

Calling a Microsoft Fabric REST API via Azure Data Factory

Koen Verbeeck makes the call:

Suppose you want to call a certain Microsoft Fabric REST API endpoint from Azure Data Factory (or Synapse Pipelines). This can be done using a Web Activity, and most Fabric APIs now support service principals or managed identities. Let’s illustrate with an example. I’m going to call the REST API endpoint to create a new lakehouse. 

Click through for the instructions.
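For a sense of what the Web Activity sends, here is a minimal Python sketch of the same HTTP call made outside ADF. It assumes the create-lakehouse endpoint is `POST /v1/workspaces/{workspaceId}/lakehouses` on `api.fabric.microsoft.com` with a `displayName` in the JSON body. The function name, workspace ID, lakehouse name, and token are placeholders, and the sketch only builds the request rather than sending it.

```python
import json
import urllib.request

# Assumed base URL of the Fabric REST API.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_create_lakehouse_request(
    workspace_id: str, display_name: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a lakehouse."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/lakehouses"
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The token would come from a service principal or managed identity.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; none of these are real identifiers.
    req = build_create_lakehouse_request("<workspace-id>", "demo_lakehouse", "<access-token>")
    print(req.get_method(), req.full_url)
    # Sending it would be urllib.request.urlopen(req), which needs a valid token.
```

In ADF, the Web Activity plays the role of this request builder: the URL, method, body, and authentication are configured on the activity instead of in code.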

Working around Errors Migrating to Azure SQL Managed Instance

Ben Johnston has an after-action report:

I was recently on a project to migrate a very transactional installation of SQL Server to Azure SQL Managed Instance (MI). SQL Managed Instance is a good stepping stone between a full, on-prem SQL instance / Azure VM and an Azure SQL Database. It has most of the functionality of a full, on-prem instance, with management of the SQL engine, backups, OS and underlying hardware done by Microsoft. It allows you to use cross database queries and run SQL Agent jobs, with fewer limitations than Azure SQL Database migrations.

The migration process isn’t completely seamless. During the migration of this system, we encountered several surprises. Hopefully, this will help you avoid, or at least be prepared for, these differences from the on-prem version. This also reinforces the importance of testing each aspect of your migration.

This is part one of a two-parter and focuses on issues during the deployment process. Ben promises a follow-up with post-deployment issues you could run into. I expect that’s where the “What is this performance?” issues will come into play.

Data Quality Management with Great Expectations and Databricks

Sairamakrishna BuchiReddy Karri and Srinivasarao Rayankula show off Great Expectations:

Data quality checks are critical for any production pipeline. While there are many ways to implement them, the Great Expectations library is a popular one. 

Great Expectations is a powerful tool for maintaining data quality by defining, managing, and validating expectations for your data. In this article, we will discuss how you can use it to ensure data quality in your data pipelines.

Click through to see how it all works.

Understanding Availability Zones in Azure

Mika Sutinen explains some of the nuance around Azure availability zones:


Azure Availability Zones help provide resiliency to your database services within an Azure Region. I simply love it how simple Microsoft has made building geographically dispersed database services. If you’ve ever designed and deployed multi-site, highly available database services in on-premises, you know what I am talking about.

However, with the Availability Zones in Azure, there are a couple of things to know. I’ve learned my lessons the hard way, so in this post I am providing some tools and guidance on how to avoid some pitfalls when building multi-zone database services.

Click through for that guidance.
