Press "Enter" to skip to content

Category: ETL / ELT

Near-Real-Time Reporting on SQL Server Data with Microsoft Fabric

Rebecca Lewis continues a series on Microsoft Fabric:

You already know the options. Run heavy reporting queries against production. Eewgh. Or stand up a reporting replica, build ETL to keep it current, maintain a refresh schedule, and hope nothing breaks on a holiday weekend. It works, but it’s expensive and has an awful lot of moving pieces.

Fabric gives you a third path: continuously replicate your SQL Server data into OneLake using Fabric Mirroring, and let Power BI read it using Direct Lake mode. Your SQL Server stays focused on OLTP and your reporting runs against a near real-time copy in Fabric. No pipelines. No refresh schedules. Nice.

Read on for the options available with Microsoft Fabric, as well as an endearing note that “real-time” isn’t.
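Mirroring itself is configured through the Fabric portal, but if you want to see how fresh the replicated copy actually is, the Fabric REST API exposes mirroring-status endpoints for mirrored databases. The sketch below is a minimal illustration rather than anything from Rebecca's post; the workspace and item IDs are placeholders, and the endpoint names are my reading of the Mirroring API group, so check them against the current docs.

# Minimal sketch: check the replication state of a mirrored database via the
# Fabric REST API. IDs are placeholders and the endpoint names are assumptions
# based on the Mirroring API group, not code from the linked post.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://api.fabric.microsoft.com/.default").token
headers = {"Authorization": f"Bearer {token}"}

workspace_id = "<workspace-guid>"
mirrored_db_id = "<mirrored-database-guid>"
base = (f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
        f"/mirroredDatabases/{mirrored_db_id}")

# Overall mirroring status (e.g. Running, Stopped)
status = requests.post(f"{base}/getMirroringStatus", headers=headers).json()
print(status)

# Per-table replication state, including last sync details
tables = requests.post(f"{base}/getTablesMirroringStatus", headers=headers).json()
for table in tables.get("data", []):
    print(table)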


Checking if a Microsoft Fabric Data Pipeline is Running

Jon Lunn checks the status of a data pipeline:

How do you check if a pipeline is running, not from the monitor, but from your Data Pipelines?

Maybe you’re like me and you have a Data Pipeline process that needs to check whether some other pipeline is running. In my case, I have to check because Delta tables prefer a single writer; otherwise, you can get concurrency issues when two items try to update the same Delta table metadata file.

Those tricky metadata items like the process to be exclusive. It’s not just a Delta table issue; this can happen with regular SQL database tables too. So you can use this for anything where you want to avoid a locking issue, need exclusive access to an object, or just don’t want a process to run while another is doing its thing.

Read on to see how you can check the current status of a data pipeline from within a different data pipeline.
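Jon does the check from inside a Data Pipeline; if you would rather do the same thing from code, the Fabric Job Scheduler API can list a pipeline's job instances and their statuses. A minimal sketch, with placeholder workspace and pipeline IDs, showing one way to do the check rather than the method from the post:

# Minimal sketch: ask the Fabric Job Scheduler API whether a given pipeline
# still has a job instance in progress. IDs are placeholders; this is an
# illustration, not the technique from the linked post.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://api.fabric.microsoft.com/.default").token
headers = {"Authorization": f"Bearer {token}"}

workspace_id = "<workspace-guid>"
pipeline_id = "<data-pipeline-item-guid>"
url = (f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
       f"/items/{pipeline_id}/jobs/instances")

instances = requests.get(url, headers=headers).json().get("value", [])
running = [i for i in instances if i.get("status") in ("NotStarted", "InProgress")]

if running:
    print(f"Pipeline is busy: {len(running)} instance(s) still running.")
else:
    print("Pipeline is idle; safe to start the Delta table write.")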


Using the Microsoft Fabric Copy Job with Data in Dataverse

Laura Graham-Brown loads some data:

Dataverse is the data store behind parts of Dynamics and lots of Power Platform projects, so it can contain vital business data that will be needed for reporting. In this post, we are going to look at one method: using a Copy job in Microsoft Fabric to copy data across from Dataverse.

Click through to see how, including incremental data loads.


Orchestration Options in Microsoft Fabric

Reitse Eskens moves some data:

Well, unless you enjoy waking up every night to start your Extract-Transform-Load (ETL) process and manually run each step, it’s a smart move to automate this and to make sure everything always runs in the correct order. There are also situations where processes need to run in different configurations.

All these things can be done with what we call orchestration. It may sound a bit vague now, but we’ll get to the different moving parts of this, like parameterisation and pipelines.

Read on for a primer on the topic.
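As one concrete flavour of the parameterisation piece, you can also kick off a parameterised Fabric Data Pipeline programmatically through the Job Scheduler API and pass parameter values in the request body. This is a rough sketch with placeholder IDs and a hypothetical LoadDate parameter; the exact shape of the executionData payload is my assumption, so verify it against the API docs and your pipeline's parameters.

# Rough sketch: trigger a Fabric Data Pipeline on demand with parameters.
# IDs and the LoadDate parameter are placeholders, and the executionData
# payload shape is an assumption used to illustrate parameterisation.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://api.fabric.microsoft.com/.default").token
headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}

workspace_id = "<workspace-guid>"
pipeline_id = "<data-pipeline-item-guid>"
url = (f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}"
       f"/items/{pipeline_id}/jobs/instances?jobType=Pipeline")

body = {"executionData": {"parameters": {"LoadDate": "2025-01-01"}}}

resp = requests.post(url, headers=headers, json=body)
resp.raise_for_status()
# The job is queued asynchronously; the Location header points at the job
# instance you can poll for status.
print(resp.headers.get("Location"))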


Tips for the Import Data Option in SQL Server

Andy Brownsword doesn’t trust wizards, with their pointy caps and long beards:

If you need to create a copy of a table in another database, the ‘Import Data’ option may seem convenient. If you’ve used this method to copy to your dev environment and found things break, this post is for you.

Click through for some solid advice on how to import that data. Another thing I would sometimes do is coerce all of the input columns to long strings and load them into a staging table. Then, I could use T-SQL to reshape the data however I needed it, rather than trying to get a finicky SSIS flow to translate this date and time combination (or whatever) appropriately.
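To make that staging-table pattern concrete, here is a rough sketch of the idea rather than a literal recipe: land every column as a wide string so nothing breaks on load, then let T-SQL do the typing. The table, column, and file names are made up for illustration, and it assumes pyodbc and a simple CSV source.

# Rough sketch of the staging-table pattern: land every column as a long
# string, then reshape with T-SQL. Names are illustrative, not from the post.
import csv
import pyodbc

conn = pyodbc.connect("DSN=TargetDb;Trusted_Connection=yes")
cur = conn.cursor()

# Stage: every column is just a big VARCHAR, so nothing fails on load.
cur.execute("""
    IF OBJECT_ID('dbo.Orders_Staging') IS NOT NULL DROP TABLE dbo.Orders_Staging;
    CREATE TABLE dbo.Orders_Staging (
        OrderId   VARCHAR(4000),
        OrderDate VARCHAR(4000),
        OrderTime VARCHAR(4000),
        Amount    VARCHAR(4000)
    );
""")

with open("orders.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))[1:]  # skip the header row
cur.executemany("INSERT INTO dbo.Orders_Staging VALUES (?, ?, ?, ?);", rows)

# Reshape: do the type conversions in T-SQL, where they are easy to debug.
cur.execute("""
    INSERT INTO dbo.Orders (OrderId, OrderDateTime, Amount)
    SELECT CAST(OrderId AS INT),
           TRY_CONVERT(DATETIME2, CONCAT(OrderDate, ' ', OrderTime)),
           TRY_CONVERT(DECIMAL(18, 2), Amount)
    FROM dbo.Orders_Staging;
""")
conn.commit()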


Copy Job in Fabric Data Factory Pipelines Now GA

Jianlei Shen makes an announcement:

Copy Job Activity allows you to run Copy jobs as native activities inside Data Factory pipelines.

Copy jobs are created and managed independently in Data Factory for quick data movement between supported sources and destinations. With Copy job Activity, that same fast, lightweight experience is now embedded within pipelines, making it easier to automate, schedule, and chain Copy jobs as part of broader data workflows.

Read on for an overview of what’s in the activity and a few links on how to get started with it.


Cutting Costs of Azure Self-Hosted Integration Runtimes

Andy Brownsword saves some quid:

If you have a Self-Hosted Integration Runtime (SHIR, or IR for short here) on an Azure Virtual Machine (VM), there’s a cost to keep it online. When used intermittently – for example during batch processes – this is inefficient for costs as you’re paying for the compute you don’t need. One way to alleviate this is by controlling uptime of the environment manually, only bringing it online for as long as needed.

Read on to see how to do this.
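Andy drives this from within the batch process itself; as a rough illustration of the general idea (not his exact method), the Azure SDK for Python can start the IR's VM before a run and deallocate it afterwards so compute charges stop. Subscription, resource group, and VM names below are placeholders.

# Rough sketch: start the SHIR VM before a batch run and deallocate it after,
# so you only pay for compute while the IR is needed. Names are placeholders;
# this illustrates the idea rather than reproducing the linked post.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

subscription_id = "<subscription-guid>"
resource_group = "rg-integration"
vm_name = "vm-shir-01"

compute = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

def bring_ir_online():
    # Start the VM and wait until it is running so the IR can register.
    compute.virtual_machines.begin_start(resource_group, vm_name).result()

def take_ir_offline():
    # Deallocate (not just power off) so compute billing stops.
    compute.virtual_machines.begin_deallocate(resource_group, vm_name).result()

if __name__ == "__main__":
    bring_ir_online()
    # ... run the batch / pipeline work that needs the Self-Hosted IR ...
    take_ir_offline()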
