Category: Cloud

Running Cron Jobs in Azure Database for PostgreSQL Flexible Server

Josephine Bush schedules a task:

pg_cron is a simple cron-based job scheduler for PostgreSQL that runs inside the database as an extension. It allows you to schedule PostgreSQL commands directly from your database, similar to using cron jobs at the operating system level. pg_cron on PG Flex is straightforward to use, which makes it easy to schedule regular database maintenance and processing tasks directly from within PostgreSQL.

Read on to see how to install the extension, and then how to manage cron jobs. Josephine also lays out some limitations when using pg_cron on Azure and how to track failed jobs.
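
As a rough illustration of what managing pg_cron jobs can look like, here is a minimal sketch that only builds the SQL statements and does not connect to a database. The calls `cron.schedule`, `cron.unschedule`, and the `cron.job_run_details` table come from the pg_cron extension itself; the Python helper names, the job name, and the five-field check are assumptions made for this example and are not taken from Josephine's post.

```python
from typing import Tuple

# Query against pg_cron's run-history table to list failed runs.
FAILED_RUNS_QUERY = (
    "SELECT jobid, command, status, return_message, start_time "
    "FROM cron.job_run_details "
    "WHERE status = 'failed' "
    "ORDER BY start_time DESC;"
)


def build_schedule_call(job_name: str, schedule: str, command: str) -> Tuple[str, tuple]:
    """Return a parameterized call to cron.schedule(job_name, schedule, command).

    Assumption: only the classic five-field cron syntax is checked here.
    Some pg_cron versions accept other schedule forms, which this sketch rejects.
    """
    if len(schedule.split()) != 5:
        raise ValueError(f"expected five cron fields, got: {schedule!r}")
    return "SELECT cron.schedule(%s, %s, %s);", (job_name, schedule, command)


def build_unschedule_call(job_name: str) -> Tuple[str, tuple]:
    """Return a parameterized call to cron.unschedule(job_name)."""
    return "SELECT cron.unschedule(%s);", (job_name,)


if __name__ == "__main__":
    # Hypothetical job: run VACUUM every night at 03:00.
    sql, params = build_schedule_call("nightly-vacuum", "0 3 * * *", "VACUUM")
    print(sql, params)
    print(FAILED_RUNS_QUERY)
```

The statement and parameter tuple can be handed to any PostgreSQL driver that uses `%s` placeholders. Keeping the schedule and command as parameters rather than string-formatting them into the SQL avoids quoting problems when the command itself contains quotes.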

Comments closed

Comparing Microsoft Fabric to Snowflake

Evanjalin Joseph lays out a comparison:

Take ShopSmart, a global retail chain that operates both online and offline. The company wants to combine its sales, inventory, and customer data in order to facilitate real-time reporting and predictive analytics. Two top platforms are being assessed by the IT team for this change.

Azure, Power BI, and Microsoft 365 are already widely used by ShopSmart, which is in line with Fabric’s integrated ecosystem. The alternative, however, provides more multi-cloud flexibility and strong performance on structured data. The group has to choose between selecting a more specialized warehousing solution with more deployment options or making use of its current Microsoft investments.

Let’s examine the differences between the two platforms.

Click through for an overview of each platform and how they stack up against one another.

Loading Excel from SQL Server via Power BI XMLA

Jared Westover doesn’t want to share:

Users want to pull data from tables in an Azure SQL database into Excel via Power Query. This situation sounds simple. However, I don’t want to provide direct access to the database for several reasons, including the potential governance and permissions nightmare. We have a Fabric workspace, and most of the data already exists in Power BI reports. How can we give users access to the data they need without providing direct access to the database for an easy SQL export to Excel?

Click through for the answer. This solution is a bit more roundabout than granting direct database access, but also comes with a host of security benefits.

400 Bad Request when Debugging a Data Factory Pipeline

Koen Verbeeck runs into a problem:

I recently had a new pipeline fail. It was actually a copy of an old pipeline where I had made some adjustments as part of a database migration. When triggered during an execution run, it failed saying some expression could not be parsed. When I went into the pipeline and triggered a debug, it immediately failed with the following helpful error message:

Click through for the error message and how Koen was able to fix the issue.

Calling a Microsoft Fabric REST API via Azure Data Factory

Koen Verbeeck makes the call:

Suppose you want to call a certain Microsoft Fabric REST API endpoint from Azure Data Factory (or Synapse Pipelines). This can be done using a Web Activity, and most Fabric APIs now support service principals or managed identities. Let’s illustrate with an example. I’m going to call the REST API endpoint to create a new lakehouse. 

Click through for the instructions.
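
For readers who want to see the shape of the HTTP call outside of a Web Activity, here is a minimal sketch that builds (but does not send) the request to create a lakehouse. The endpoint path and the `displayName` body field reflect my understanding of the public Fabric REST API and should be checked against the current documentation; the function name, the workspace ID, and the token value are placeholders invented for this example, and acquiring a token for a service principal or managed identity is out of scope here.

```python
import json
import urllib.request

# Assumption: base URL of the public Fabric REST API.
FABRIC_API_BASE = "https://api.fabric.microsoft.com/v1"


def build_create_lakehouse_request(
    workspace_id: str, display_name: str, access_token: str
) -> urllib.request.Request:
    """Build a POST request that asks Fabric to create a lakehouse.

    The request is only constructed, not sent, so this runs without
    network access or credentials.
    """
    url = f"{FABRIC_API_BASE}/workspaces/{workspace_id}/lakehouses"
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; a real call needs a valid workspace ID and token.
    req = build_create_lakehouse_request(
        "00000000-0000-0000-0000-000000000000", "demo_lakehouse", "<token>"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

In a Data Factory Web Activity, the same three pieces (URL, method, JSON body) go into the activity settings, and the authentication option takes the place of the hand-built `Authorization` header.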

Working around Errors Migrating to Azure SQL Managed Instance

Ben Johnston has an after-action report:

I was recently on a project to migrate a very transactional installation of SQL Server to Azure SQL Managed Instance (MI). SQL Managed Instance is a good stepping stone between a full, on-prem SQL instance / Azure VM and an Azure SQL Database. It has most of the functionality of a full, on-prem instance, with management of the SQL engine, backups, OS and underlying hardware done by Microsoft. It allows you to use cross database queries and run SQL Agent jobs, with fewer limitations than Azure SQL Database migrations.

The migration process isn’t completely seamless. During the migration of this system, we encountered several surprises. Hopefully, this will help you avoid, or at least be prepared for, these differences from the on-prem version. This also reinforces the importance of testing each aspect of your migration.

This is part one of a two-parter and focuses on issues during the deployment process. Ben promises a follow-up with post-deployment issues you could run into. I expect that’s where the “What is this performance?” issues will come into play.

Data Quality Management with Great Expectations and Databricks

Sairamakrishna BuchiReddy Karri and Srinivasarao Rayankula show off Great Expectations:

Data quality checks are critical for any production pipeline. While there are many ways to implement them, the Great Expectations library is a popular one. 

Great Expectations is a powerful tool for maintaining data quality by defining, managing, and validating expectations for your data. In this article, we will discuss how you can use it to ensure data quality in your data pipelines.

Click through to see how it all works.

Understanding Availability Zones in Azure

Mika Sutinen explains some of the nuance around Azure availability zones:


Azure Availability Zones help provide resiliency to your database services within an Azure Region. I simply love how simple Microsoft has made building geographically dispersed database services. If you’ve ever designed and deployed multi-site, highly available database services on-premises, you know what I am talking about.

However, with the Availability Zones in Azure, there are a couple of things to know. I’ve learned my lessons the hard way, so in this post I am providing some tools and guidance on how to avoid some pitfalls when building multi-zone database services.

Click through for that guidance.

Saving an Azure Database for PostgreSQL Backup to a Storage Account

Josephine Bush wants an extra copy of the backup:

This may or may not be helpful in the long term, but since I’m doing it to be super cautious, I figured I would blog about it. We migrated to Flex last week, and to be abundantly cautious, we’re putting the last single server backup into cold storage. You could also use this same process to offload Flex if you were going to delete a server and want to save a final backup or have some use case for saving backups to storage longer term.

Read on for the process. It’s not as simple as running a command or two, but Josephine does take us through the process.

Serving Databricks Models via API Management Endpoints

Drew Furgiuele makes available a model:

When it comes to generative AI projects I’d argue that the hardest and most tedious part has moved into a new area: hosting and serving your models. Whether you’re working with CPU intensive models, or models that require GPU horsepower, sourcing the hardware, building out deployment pipelines, configuring monitoring, and then securing everything is real, serious work that requires everyone to lean in to get it right.

And then, there’s the real question of how you’re going to use those models: will you be setting up automation and doing batch processing using your models and infrastructure? Or do you want to get really serious and offer up real-time inference? If the latter, you can add one more thing to solve for: managing your front-end APIs that you will have to build to support that use case.

Click through to see how you can use an API management tool (like Azure API Management) to assist in these things.
