
Category: ETL / ELT

Using the Microsoft Fabric Copy Job with Data in Dataverse

Laura Graham-Brown loads some data:

Dataverse is the data store behind parts of Dynamics and many Power Platform projects, so it can contain vital business data that is needed for reporting. In this post we are going to look at one method: using a Copy job in Microsoft Fabric to copy the data across from Dataverse.

Click through to see how, including incremental data loads.


Orchestration Options in Microsoft Fabric

Reitse Eskens moves some data:

Well, unless you enjoy waking up every night to start your Extract-Transform-Load (ETL) process and manually run each step, it’s a smart move to automate this and to make sure everything always runs in the correct order. Additionally, there are situations where processes need to run in different configurations.

All these things can be done with what we call orchestration. It may sound a bit vague now, but we’ll get to the different moving parts of this, like parameterisation and pipelines.

Read on for a primer on the topic.


Tips for the Import Data Option in SQL Server

Andy Brownsword doesn’t trust wizards, with their pointy caps and long beards:

If you need to create a copy of a table in another database, the ‘Import Data’ option may seem convenient. If you’ve used this method to copy to your dev environment and found things break, this post is for you.

Click through for some solid advice on how to import that data. Another thing I would sometimes do is coerce all of the input columns to long strings and load them into a staging table. Then, I could use T-SQL to reshape the data however I needed it, rather than trying to get a finicky SSIS flow to translate this date and time combination (or whatever) appropriately.
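To make that staging pattern concrete, here is a minimal sketch in Python with pyodbc. The Python load merely stands in for whatever the wizard or SSIS would do; the table, column, and file names are hypothetical, the target dbo.Orders table is assumed to exist, and TRY_CONVERT stands in for whatever reshaping you actually need:

```python
import csv
import pyodbc

# Hypothetical connection; swap in your own server and database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=localhost;"
    "DATABASE=Staging;Trusted_Connection=yes;TrustServerCertificate=yes;"
)
cur = conn.cursor()

# 1. Land everything as wide strings so nothing fails on load.
cur.execute("""
    CREATE TABLE dbo.OrdersStage (
        OrderId   NVARCHAR(MAX),
        OrderDate NVARCHAR(MAX),
        Amount    NVARCHAR(MAX)
    );
""")

with open("orders.csv", newline="", encoding="utf-8") as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the header row
    rows = [tuple(r) for r in reader]

cur.fast_executemany = True
cur.executemany(
    "INSERT INTO dbo.OrdersStage (OrderId, OrderDate, Amount) VALUES (?, ?, ?);",
    rows,
)

# 2. Reshape in T-SQL; TRY_CONVERT returns NULL for bad values instead of
#    failing the whole load, so you can inspect the misfits afterwards.
cur.execute("""
    INSERT INTO dbo.Orders (OrderId, OrderDate, Amount)
    SELECT TRY_CONVERT(INT, OrderId),
           TRY_CONVERT(DATETIME2, OrderDate, 103),  -- 103 = dd/mm/yyyy
           TRY_CONVERT(DECIMAL(18, 2), Amount)
    FROM dbo.OrdersStage;
""")
conn.commit()
```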


Copy Job in Fabric Data Factory Pipelines now GA

Jianlei Shen makes an announcement:

Copy Job Activity allows you to run Copy jobs as native activities inside Data Factory pipelines.

Copy jobs are created and managed independently in Data Factory for quick data movement between supported sources and destinations. With Copy Job Activity, that same fast, lightweight experience is now embedded within pipelines, making it easier to automate, schedule, and chain Copy jobs as part of broader data workflows.

Read on for an overview of what’s in the activity and a few links on how to get started with it.


Cutting Costs of Azure Self-Hosted Integration Runtimes

Andy Brownsword saves some quid:

If you have a Self-Hosted Integration Runtime (SHIR, or IR for short here) on an Azure Virtual Machine (VM), there’s a cost to keep it online. When used intermittently – for example during batch processes – this is inefficient from a cost perspective, as you’re paying for compute you don’t need. One way to alleviate this is by controlling uptime of the environment manually, only bringing it online for as long as needed.

Read on to see how to do this.
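For a rough idea of what that control could look like in code, here is a sketch using Python and the azure-mgmt-compute package. The resource names are hypothetical and run_batch_process is a placeholder; the key detail is deallocating (not merely stopping) the VM, since a stopped-but-allocated VM still accrues compute charges:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
RESOURCE_GROUP = "rg-data"   # hypothetical
VM_NAME = "vm-shir-01"       # hypothetical


def run_batch_process() -> None:
    """Placeholder for the ETL work that actually needs the runtime."""


client = ComputeManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Bring the SHIR VM online and block until the operation completes.
client.virtual_machines.begin_start(RESOURCE_GROUP, VM_NAME).result()

try:
    run_batch_process()
finally:
    # Deallocate rather than stop: a stopped-but-allocated VM still
    # bills for compute, while a deallocated one does not.
    client.virtual_machines.begin_deallocate(RESOURCE_GROUP, VM_NAME).result()
```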


Calling Logic Apps via Data Factory Pipelines

Andy Brownsword flips the script:

Last week we looked at calling a Data Factory Pipeline from a Logic App. This week I thought we’d balance it out by taking a look at calling a Logic App from an Azure Data Factory (ADF) Pipeline.

When building the Logic App last week we had to create our own polling mechanism to check for completion of the pipeline. The process is much simpler in the opposite direction. I specifically want to highlight two approaches, and save some pennies whilst we’re at it.

I am all about saving pennies, so be sure to check out that section as well.
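For context, the usual mechanism for calling a Logic App from a pipeline is a Web activity POSTing to the Logic App’s HTTP trigger URL. Here is a sketch of that same call in Python, with a placeholder URL and a hypothetical payload:

```python
import requests

# The real URL (including its SAS signature) comes from the Logic App's
# "When a HTTP request is received" trigger; this one is a placeholder.
LOGIC_APP_URL = (
    "https://prod-00.uksouth.logic.azure.com/workflows/<id>"
    "/triggers/manual/paths/invoke?api-version=2016-10-01&sig=<sig>"
)

payload = {"fileName": "extract.csv", "runDate": "2025-01-01"}  # hypothetical

resp = requests.post(LOGIC_APP_URL, json=payload, timeout=120)
resp.raise_for_status()

# A Logic App that ends in a Response action replies synchronously with
# that action's body; one without it returns 202 Accepted and a Location
# header you can poll for the run status.
print(resp.status_code, resp.text)
```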


Executing Data Factory Pipelines from Logic Apps

Andy Brownsword automates a workflow:

When building Azure Logic Apps, we can use the Azure Data Factory connector to start a pipeline. However, that action simply triggers a pipeline and doesn’t wait for it to finish. If your downstream logic depends on the output – for example to collect a file – this can cause issues.

In this post I’ll demonstrate how to control the Logic App flow to wait for the pipeline to complete before proceeding.

Read on to see how, as well as some additional ideas of how to improve the pattern.
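The polling pattern Andy describes translates to any client. Here is a sketch of the same idea in Python with the azure-mgmt-datafactory package, using hypothetical names: kick off a run, then loop until the run reaches a terminal status.

```python
import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
RG, FACTORY, PIPELINE = "rg-data", "adf-demo", "pl-extract"  # hypothetical

client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Fire the pipeline: this returns immediately with a run id and does not
# wait for the pipeline to finish (the same behavior as the connector).
run = client.pipelines.create_run(RG, FACTORY, PIPELINE, parameters={})

# Poll until the run reaches a terminal status. In a Logic App, the
# equivalent is a Delay action inside an Until loop.
while True:
    status = client.pipeline_runs.get(RG, FACTORY, run.run_id).status
    if status in ("Succeeded", "Failed", "Cancelled"):
        break
    time.sleep(30)

print(f"Pipeline run {run.run_id} finished with status {status}")
```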


Copying Data across Tenants with Fabric Data Factory

Ye Xu makes use of the Copy job:

Copy job is the go-to solution in Microsoft Fabric Data Factory for simplified data movement, whether you’re moving data across clouds, from on-premises systems, or between services. With native support for multiple delivery styles, including bulk copy, incremental copy, and change data capture (CDC) replication, Copy job offers the flexibility to handle a wide range of data movement scenarios—all through an intuitive, easy-to-use experience. Learn more in What is Copy job in Data Factory – Microsoft Fabric | Microsoft Learn.

With Copy job, you can also perform cross-tenant data movement between Fabric and other clouds, such as Azure. It also enables cross-tenant data sharing within OneLake, allowing you to copy data across Fabric Lakehouse, Warehouse, and SQL DB in Fabric between tenants with SPN support. This blog provides step-by-step guidance on using Copy job to copy data across different tenants.

Click through for a demonstration, as well as the security permissions that are necessary for this to work.
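If you have not set up service principal authentication before, the core of it is a client credentials flow against the remote tenant. Here is a minimal sketch using the msal package; the IDs and secret are placeholders, and the Fabric scope is my assumption about which API is being called:

```python
import msal

TENANT_ID = "00000000-0000-0000-0000-000000000000"  # the other tenant's id
CLIENT_ID = "11111111-1111-1111-1111-111111111111"  # SPN application id
CLIENT_SECRET = "<secret>"                          # placeholder

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

# .default requests whatever application permissions the SPN was granted;
# the scope assumes the Fabric REST API is the resource being accessed.
token = app.acquire_token_for_client(
    scopes=["https://api.fabric.microsoft.com/.default"]
)

if "access_token" not in token:
    raise RuntimeError(token.get("error_description", "token request failed"))
```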


The Downside of Zero-Copy Integration between Kafka and Iceberg

Jack Vanlightly lays out an argument:

Over the past few months, I’ve seen a growing number of posts on social media promoting the idea of a “zero-copy” integration between Apache Kafka and Apache Iceberg. The idea is that Kafka topics could live directly as Iceberg tables. On the surface it sounds efficient: one copy of the data, unified access for both streaming and analytics. But from a systems point of view, I think this is the wrong direction for the Apache Kafka project. In this post, I’ll explain why. 

Read on for an explanation of what “zero-copy” means here, as well as Jack’s position on the matter. I think it’s a solid argument and worth the read.
