Press "Enter" to skip to content

Category: Cloud

400 Bad Request when Debugging a Data Factory Pipeline

Koen Verbeeck runs into a problem:

I recently had a new pipeline fail. It was actually a copy of an old pipeline that I had made some adjustments to as part of a database migration. When triggered during an execution run, it failed saying some expression could not be parsed. When I went into the pipeline and triggered a debug, it immediately failed with the following helpful error message:

Click through for the error message and how Koen was able to fix the issue.

Calling a Microsoft Fabric REST API via Azure Data Factory

Koen Verbeeck makes the call:

Suppose you want to call a certain Microsoft Fabric REST API endpoint from Azure Data Factory (or Synapse Pipelines). This can be done using a Web Activity, and most Fabric APIs now support service principals or managed identities. Let’s illustrate with an example. I’m going to call the REST API endpoint to create a new lakehouse. 

Click through for the instructions.
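
Koen's post covers the Web Activity configuration inside Data Factory itself. For comparison, here is a minimal Python sketch of the same REST call made directly, authenticating with a service principal; the tenant, app, workspace, and lakehouse values are all placeholders, and the service principal needs access to the target workspace.

from azure.identity import ClientSecretCredential
import requests

# Placeholder service principal details; grant it access to the workspace first.
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<app-client-id>",
    client_secret="<app-client-secret>",
)

# Acquire a token scoped to the Fabric API.
token = credential.get_token("https://api.fabric.microsoft.com/.default").token

workspace_id = "<workspace-id>"
response = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/lakehouses",
    headers={"Authorization": f"Bearer {token}"},
    json={"displayName": "my_new_lakehouse"},  # hypothetical lakehouse name
)
response.raise_for_status()
print(response.json())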

Working around Errors Migrating to Azure SQL Managed Instance

Ben Johnston has an after-action report:

I was recently on a project to migrate a very transactional installation of SQL Server to Azure SQL Managed Instance (MI). SQL Managed Instance is a good stepping stone between a full, on-prem SQL instance / Azure VM and an Azure SQL Database. It has most of the functionality of a full, on-prem instance, with management of the SQL engine, backups, OS, and underlying hardware done by Microsoft. It allows you to use cross-database queries and run SQL Agent jobs, with fewer limitations than a migration to Azure SQL Database.

The migration process isn't completely seamless. During the migration of this system, we encountered several surprises. Hopefully, this will help you avoid, or at least be prepared for, these differences from the on-prem version. This also reinforces the importance of testing each aspect of your migration.

This is part one of a two-parter and focuses on issues during the deployment process. Ben promises a follow-up with post-deployment issues you could run into. I expect that’s where the “What is this performance?” issues will come into play.

Data Quality Management with Great Expectations and Databricks

Sairamakrishna BuchiReddy Karri and Srinivasarao Rayankula show off Great Expectations:

Data quality checks are critical for any production pipeline. While there are many ways to implement them, the Great Expectations library is a popular one. 

Great Expectations is a powerful tool for maintaining data quality by defining, managing, and validating expectations for your data. In this article, we will discuss how you can use it to ensure data quality in your data pipelines.

Click through to see how it all works.
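
To give a taste of the library, here is a minimal sketch using the classic pandas-backed style; the Great Expectations API has changed considerably across versions, and the column names and thresholds below are made up for illustration. On Databricks, you could build the pandas frame from a sample of a Spark DataFrame via toPandas().

import great_expectations as ge
import pandas as pd

# Toy data standing in for a real pipeline output.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [10.0, 25.5, 7.25],
})

# Wrap the DataFrame so expectation methods become available on it.
gdf = ge.from_pandas(df)

# Declare expectations: no null keys, amounts within a plausible range.
gdf.expect_column_values_to_not_be_null("order_id")
gdf.expect_column_values_to_be_between("amount", min_value=0, max_value=10000)

# Validate the whole suite and inspect the overall result.
results = gdf.validate()
print(results["success"])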

Understanding Availability Zones in Azure

Mika Sutinen explains some of the nuance around Azure availability zones:


Azure Availability Zones help provide resiliency to your database services within an Azure Region. I simply love how simple Microsoft has made building geographically dispersed database services. If you've ever designed and deployed multi-site, highly available database services on-premises, you know what I am talking about.

However, with the Availability Zones in Azure, there are a couple of things to know. I’ve learned my lessons the hard way, so in this post I am providing some tools and guidance on how to avoid some pitfalls when building multi-zone database services.

Click through for that guidance.

Saving an Azure Database for PostgreSQL Backup to a Storage Account

Josephine Bush wants an extra copy of the backup:

This may or may not be helpful in the long term, but since I’m doing it to be super cautious, I figured I would blog about it. We migrated to Flex last week, and to be abundantly cautious, we’re putting the last single server backup into cold storage. You could also use this same process to offload Flex if you were going to delete a server and want to save a final backup or have some use case for saving backups to storage longer term.

Read on for the process. It’s not as simple as running a command or two, but Josephine does take us through the process.
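
As a rough illustration of the general shape of such a process (Josephine's actual steps may differ), one way is to dump the database with pg_dump and push the file to a storage account with the Azure SDK, archiving it for cold storage. The server, credentials, container, and connection string below are all placeholders.

import subprocess
from azure.storage.blob import BlobClient

# Dump the database to a local file (placeholder server and database names).
subprocess.run(
    [
        "pg_dump",
        "--host=myserver.postgres.database.azure.com",
        "--username=admin_user",
        "--dbname=appdb",
        "--format=custom",
        "--file=final_backup.dump",
    ],
    check=True,
)

# Upload the dump to a storage account container.
blob = BlobClient.from_connection_string(
    conn_str="<storage-connection-string>",
    container_name="pg-backups",
    blob_name="final_backup.dump",
)
with open("final_backup.dump", "rb") as data:
    blob.upload_blob(data, overwrite=True)

# Move the blob to the Archive tier for cheap long-term cold storage.
blob.set_standard_blob_tier("Archive")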

Serving Databricks Models via API Management Endpoints

Drew Furgiuele makes available a model:

When it comes to generative AI projects I’d argue that the hardest and most tedious part has moved into a new area: hosting and serving your models. Whether you’re working with CPU intensive models, or models that require GPU horsepower, sourcing the hardware, building out deployment pipelines, configuring monitoring, and then securing everything is real, serious work that requires everyone to lean in to get it right.

And then, there’s the real question of how you’re going to use those models: will you be setting up automation and doing batch processing using your models and infrastructure? Or do you want to get really serious and offer up real-time inference? If the latter, you can add one more thing to solve for: managing the front-end APIs you’ll have to build to support that use case.

Click through to see how you can use an API management tool (like Azure API Management) to assist in these things.
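
For a sense of what the client side might look like once a Databricks serving endpoint sits behind Azure API Management, here is a hedged sketch: the gateway URL, path, and subscription key are placeholders, and the payload follows the MLflow-style scoring format that Databricks model serving endpoints accept.

import requests

# Hypothetical APIM front-end routing to a Databricks serving endpoint.
APIM_URL = "https://my-apim.azure-api.net/models/churn/score"

response = requests.post(
    APIM_URL,
    headers={
        # APIM authenticates callers with a subscription key ...
        "Ocp-Apim-Subscription-Key": "<subscription-key>",
    },
    # ... and forwards an MLflow-style scoring payload to the model.
    json={"dataframe_records": [{"tenure_months": 12, "monthly_spend": 49.99}]},
)
response.raise_for_status()
print(response.json())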

Load Testing Azure SQL Databases

Reitse Eskens sets the stage:

Some time ago, I wrote a number of blog posts comparing the different Azure SQL options to give you some idea about performance, differences between tiers, and differences between the Stock Keeping Units (SKUs). This was done by creating data in the database itself and reviewing the metrics. This worked fine and gave a good overview of the different tiers and SKUs. For reference, you can find those blogs here.

For the new series, I’ve thought of a new process that aligns more with my regular line of work, data warehousing. This means ingesting a lot of data and modelling it.

Click through for the summary of method and initial notes.
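
Reitse's harness is far more elaborate, but as a minimal sketch of the pattern (push a batch of rows, time it, then review the metrics), here is a timed bulk insert with pyodbc. The server, credentials, row count, and the dbo.load_test table are all hypothetical.

import time
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myserver.database.windows.net;"
    "Database=loadtest;Uid=test_user;Pwd=<password>;"
)
cursor = conn.cursor()
cursor.fast_executemany = True  # send parameter batches efficiently

# Generate a batch of synthetic rows to ingest.
rows = [(i, f"payload-{i}") for i in range(100_000)]

start = time.perf_counter()
cursor.executemany("INSERT INTO dbo.load_test (id, payload) VALUES (?, ?)", rows)
conn.commit()
elapsed = time.perf_counter() - start

print(f"Inserted {len(rows)} rows in {elapsed:.1f}s")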

Cleaning up Azure Container Registries

Jess Pomfret does a bit of cleanup work:

Azure Container Registries can easily become cluttered with many versions of images. Did you know that each ACR SKU comes with a certain amount of storage included, and that when you go over it, you’ll pay overage charges? Let’s look at how to check your current storage usage, keep your registry nice and tidy with an ACR clean-up task, and monitor storage levels so you’ll never pay extra again!

It’s easy to run up the disk space usage with a container registry, especially if you have automated builds running.
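
If you want to script the storage check, here is a minimal sketch with the azure-mgmt-containerregistry management SDK; the subscription, resource group, and registry names are placeholders, and Jess's post covers the clean-up task side of things.

from azure.identity import DefaultAzureCredential
from azure.mgmt.containerregistry import ContainerRegistryManagementClient

client = ContainerRegistryManagementClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
)

# Report current storage consumption against the SKU's included quota.
usages = client.registries.list_usages("my-resource-group", "myregistry")
for usage in usages.value:
    print(f"{usage.name}: {usage.current_value} / {usage.limit} {usage.unit or ''}")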
