Press "Enter" to skip to content


Incremental Pipeline Development with Azure Data Factory

Andy Leonard shows how you can incrementally develop Azure Data Factory pipelines:

A friend pinged me recently to ask about rolling back Azure Data Factory (ADF) pipeline versions. My response was a question: Are you using source control with ADF? That did not help the current situation.

I thought of the way I often build ADF pipelines and shared my methodology, which is relatively simple (it has to be simple for me to understand it!):

Click through for Andy’s approach.


Methods to Run Scheduled Tasks in Azure

Joey D’Antoni has a roundup of several techniques you can use to run scheduled tasks against an Azure SQL Database:

If you’ve worked with Microsoft SQL Server for any period of time, you are familiar with the SQL Server Agent. The Agent, which remains mostly unchanged since I started working with it in 1999, is a fairly robust job scheduler that can also alert you in the event of job failures or system errors. I feel as though it’s a testament to the quality of the original architecture that the code hasn’t changed very much–it still meets the needs of about 90-95% of SQL Server workloads, based on an informal Twitter discussion I had a few months ago. There are some cases where an enterprise scheduling tool is needed, but for maintaining most SQL Servers and executing basic ETL, the agent works fine. There’s one problem–the agent is only available in SQL Server and Azure SQL Managed Instance.

Read on to learn about those options.
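
One option that often comes up in this space is a timer-triggered Azure Function. As a hedged sketch only (this is not from Joey's post, and the schedule, connection string setting, and procedure name are placeholders), here is what that can look like in Python:

```python
# Minimal sketch of an Azure Functions timer trigger (Python v2 programming model)
# that runs nightly maintenance against an Azure SQL Database. The NCRONTAB schedule,
# app setting name, and stored procedure are hypothetical placeholders.
import logging
import os

import azure.functions as func
import pyodbc

app = func.FunctionApp()

@app.timer_trigger(schedule="0 0 2 * * *", arg_name="timer")  # 02:00 UTC daily
def nightly_maintenance(timer: func.TimerRequest) -> None:
    # Connection string stored as an app setting on the Function App
    conn = pyodbc.connect(os.environ["SQLDB_CONNECTION_STRING"])
    cursor = conn.cursor()
    # Hypothetical procedure doing index/statistics maintenance, ETL, etc.
    cursor.execute("EXEC dbo.NightlyMaintenance;")
    conn.commit()
    conn.close()
    logging.info("Nightly maintenance completed.")
```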


Azure Elastic Jobs Now GA

Niko Neugebauer digs into Azure Elastic Jobs, now generally available:

The key feature that needs to be well understood and which points to the potential of the Elastic Job Agent is that you are in no way limited by your own Azure SQL Database, nor by the logical Azure SQL Server where this database is located (contrary to the MSDB Database on the SQL Server), nor will you be limited by the Azure Region, Azure Resource Group or even Azure Subscription – you can configure the Elastic Job that will be reaching out to potentially any Azure SQL Database (given the necessary settings & permissions are correctly configured).

Read the whole thing. Niko shares some interesting thoughts on how it works, how you can tie your one server to a whole host of SQL Databases, and a wish list on what should come next.
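
For a feel of what configuration looks like, the Elastic Job agent is driven by stored procedures in the job database. The sketch below, driven from Python via pyodbc, is an assumption-laden illustration rather than anything from Niko's post; the server, database, credential, and job names are placeholders, and the jobs.sp_* parameter lists are worth confirming against the documentation:

```python
# Rough sketch: defining an Elastic Job by calling the jobs.sp_* procedures
# in the job database. All names below are hypothetical placeholders.
import pyodbc

JOB_DB_CONNECTION_STRING = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=job-server.database.windows.net;Database=JobDatabase;"
    "UID=<user>;PWD=<password>"
)

conn = pyodbc.connect(JOB_DB_CONNECTION_STRING, autocommit=True)
cur = conn.cursor()

# Target group that can point at a database on a *different* logical server,
# resource group, or even subscription.
cur.execute("EXEC jobs.sp_add_target_group @target_group_name = N'ReportingDbs';")
cur.execute("""
    EXEC jobs.sp_add_target_group_member
         @target_group_name = N'ReportingDbs',
         @target_type       = N'SqlDatabase',
         @server_name       = N'other-server.database.windows.net',
         @database_name     = N'ReportingDb';
""")

# A job with one T-SQL step, run under a database-scoped credential that
# exists in the job database and maps to a login on the target databases.
cur.execute("EXEC jobs.sp_add_job @job_name = N'NightlyStatsUpdate';")
cur.execute("""
    EXEC jobs.sp_add_jobstep
         @job_name          = N'NightlyStatsUpdate',
         @command           = N'EXEC sp_updatestats;',
         @credential_name   = N'JobRunCredential',
         @target_group_name = N'ReportingDbs';
""")

cur.execute("EXEC jobs.sp_start_job @job_name = N'NightlyStatsUpdate';")
```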


Refreshing Power BI Datasets in Azure Data Factory

Meagan Longoria shows us how to refresh a Power BI dataset using Azure Data Factory:

I recently needed to ensure that a Power BI imported dataset would be refreshed after populating data in my data mart. I was already using Azure Data Factory to populate the data mart, so the most efficient thing to do was to call a pipeline at the end of my data load process to refresh the Power BI dataset.

Power BI offers REST APIs to programmatically refresh your data. For Data Factory to use them, you need to register an app (service principal) in AAD and give it the appropriate permissions in Power BI and to an Azure key vault.

Click through for the solution.
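
To give a sense of the underlying REST calls (Meagan orchestrates them with Data Factory Web activities and Key Vault rather than standalone code), here is a hedged Python sketch; the tenant, client, workspace, and dataset IDs are placeholders:

```python
# Hedged sketch of the Power BI dataset refresh call, using a service principal.
# All IDs and the secret below are placeholders.
import msal
import requests

TENANT_ID = "<tenant-guid>"
CLIENT_ID = "<service-principal-app-id>"
CLIENT_SECRET = "<secret-from-key-vault>"
GROUP_ID = "<workspace-guid>"
DATASET_ID = "<dataset-guid>"

# Acquire an app-only token for the Power BI REST API
app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = app.acquire_token_for_client(
    scopes=["https://analysis.windows.net/powerbi/api/.default"]
)["access_token"]

# Kick off an asynchronous dataset refresh
url = (
    f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}"
    f"/datasets/{DATASET_ID}/refreshes"
)
resp = requests.post(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()  # 202 Accepted means the refresh was queued

# Poll the refresh history to check the latest status
history = requests.get(url, headers={"Authorization": f"Bearer {token}"}).json()
print(history["value"][0]["status"])
```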


A Brief Overview of Azure Synapse Analytics

Ginger Grant gives us the nickel tour of why Azure Synapse Analytics is interesting:

In the past few months, I have been examining Azure Synapse and what it can do. When it was first released in November of 2019, the first functionality that was rolled out was an update of Azure SQL DW. For this reason, many people think that Synapse is just an improved version of a cloud data warehouse. Microsoft did improve SQL DW when it moved it to Synapse. The biggest architectural design change is the separation of the code from the compute, a theme with many web projects, which allows the compute power to be increased when need dictates and scaled down when computing needs change. Within Synapse, resources are allocated as pools, and you can define a SQL pool to run the data warehouse and later change the compute to a different resource. You will still need to partition your DW, as large datasets require partitioning to perform well. Subsequently, Microsoft released Azure Synapse Studio to be a container for a larger environment of tools and notebooks to interact with them.

But it’s more than that. Read on to see what else is available.


Migrating From Cosmos DB to SQL Server

Eitan Blumin builds an app:

The general idea is this:

The app executes a Cosmos DB query and collects a number of records into its “buffer”.

Once that “buffer” reaches a certain number of records (configurable), it’s time to “flush” it into the SQL Server. That could be either a database table receiving a Bulk Copy stream, or a stored procedure receiving a table valued parameter (again, configurable).

After the buffer is flushed, we have the option to execute a “merge” procedure. This is a stored procedure that would implement an “upsert” logic from the “staging” table and into the actual destination table.

Read on for more explanation and check out Eitan’s GitHub repo.
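
To make the buffer-flush-merge flow concrete, here is a rough Python sketch of the same pattern. It is not Eitan's code; the Cosmos DB container, staging table, and merge procedure names are all placeholders:

```python
# Buffer records from a Cosmos DB query, flush them to a SQL Server staging
# table in batches, then run a merge (upsert) procedure. Endpoint, key, query,
# and object names are hypothetical placeholders.
import pyodbc
from azure.cosmos import CosmosClient

BATCH_SIZE = 5_000  # "buffer" size before flushing to SQL Server

cosmos = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = cosmos.get_database_client("SourceDb").get_container_client("Orders")

sql = pyodbc.connect("<sql-server-connection-string>", autocommit=False)
cur = sql.cursor()
cur.fast_executemany = True  # bulk-style inserts into the staging table

def flush(buffer):
    cur.executemany(
        "INSERT INTO dbo.Orders_Staging (Id, CustomerId, Total) VALUES (?, ?, ?)",
        buffer,
    )
    # "Merge" step: upsert from the staging table into the destination table
    cur.execute("EXEC dbo.Merge_Orders_Staging;")
    sql.commit()
    buffer.clear()

buffer = []
for doc in container.query_items(
    "SELECT c.id, c.customerId, c.total FROM c", enable_cross_partition_query=True
):
    buffer.append((doc["id"], doc["customerId"], doc["total"]))
    if len(buffer) >= BATCH_SIZE:
        flush(buffer)

if buffer:  # flush whatever is left over
    flush(buffer)
```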


Moving a Semantic Model from SSAS to Power BI

Teo Lachev explains why it might make sense to move a semantic model from SSAS over to Power BI:

We decided to move the semantic model to Power BI so that Power BI owns the data. Besides potentially improving the report load time, this architecture has also other important advantages (to learn more, read my “Power BI Large Datasets: The Good, the Bad, and the Ugly” post). If you’re not on Power BI Premium, that “movement” might not be easy if you have opted to use Visual Studio or Tabular Editor for development. That’s because Power BI Pro doesn’t expose the XMLA endpoint, so your only option is to migrate the model to Power BI Desktop. But migrating an SSAS Tabular project to Power BI Desktop is not officially supported and there is no automatic migration path.

Click through for the full explanation of why, as well as notes on the process.


Deploying ADF via Azure DevOps

Kamil Nowinski has part two on a series about releasing Azure Data Factory code:

Struggling with #ADF deployment? adf_publish branch doesn’t suit your purposes? Don’t have skills with PowerShell? I have good news for you. There is a new tool on the market. It’s a task for Azure DevOps Release Pipeline to deploy a whole ADF from code (JSON files) to an ADF instance in Azure. Behind the scenes, it runs the PowerShell module which does all the work for you.
Sounds unbelievable? But it’s real! Check it out for yourself.

Click through for the video.


ADF.Procfwk Version 1.8

Paul Andrew has been busy:

Following more great feedback from the Data Platform community the primary goal of this release was to further improve the resilience of the framework processing. These improvements included its restart clean up capabilities and introducing better dependency chain handling between Worker pipelines when failures occur. The latter builds on the existing restart functionality first introduced in release v1.2 and supplements the logic using a new set of pipeline dependency metadata. I’ve created the below visual to conceptually show the new dependency chain behaviour, should you wish to populate and make use of the new metadata handling.

Read on for the full changelog.
