Category: Microsoft Fabric

Speeding up Dataflow Validation and Publish Times

Chris Webb doesn’t want to wait:

If you’re working with slow data sources in Power BI/Fabric dataflows then you’re probably aware that validating (for Gen1 dataflows) or publishing (for Gen2 dataflows) them can sometimes take a long time. If you’re working with very slow data sources then you may run into the 10 minute timeout on validation/publishing that is documented here. For a Gen1 dataflow you’ll see the following error message if you try to save your dataflow and validation takes more than 10 minutes:

Click through for that common error message, as well as some tips to avoid this issue. An interesting approach in the comments section also circumvents the problem.

Virtualizing Hadoop Data into OneLake via Apache Ozone

James Morantus hits us with a blast from the past:

Microsoft Fabric OneLake shortcuts facilitate the virtualization of data from various cloud object stores and on-premises environments. For on-premises sources like Cloudera/Apache Ozone, the OneLake S3 Compatible Shortcut can be utilized to connect to these data sources. With OneLake Shortcuts, users can create a virtual reference to their Cloudera cluster data without moving or duplicating the data. To learn more about Fabric OneLake shortcuts, see the blog post OneLake with shortcuts.

If this were a decade ago, I’d be a lot more excited. But it is kind of wild how quickly the data landscape changed, with the adoption of Spark over classic Hadoop; cloud-based data lakes over HDFS; and more focused dataset sizes over “give me everything.”

Improving the Microsoft Fabric Copy Job

Krishnakumar Rukmangathan makes a copy:

Copy Job has been a go-to tool for simplified data ingestion in Microsoft Fabric, offering a seamless data movement experience from any source to any destination. Whether you need batch or incremental copying, it provides the flexibility to meet diverse data needs while maintaining a simple and intuitive workflow.

We continuously refine Copy Job based on customer feedback, enhancing both functionality and user experience. In this update, we’re introducing three key UX improvements designed to streamline your workflow and boost efficiency.

Read on for those three improvements.

Foreign Key Relationships in Microsoft Fabric Data Warehouses

Jared Westover looks at key constraints:

In late 2024, I noticed a comment on the Microsoft Learn site stating that foreign keys could improve query performance on tables in a Fabric warehouse. That claim immediately caught my attention. I wanted to answer a simple question: Do relationships help, hurt, or have no effect when added to tables in a Fabric warehouse?

Let’s get more specific—do foreign keys improve query performance when reading data (not loading)? In other words, do they make queries run faster?

Sadly, the answer is not as promising as with SQL Server. But this also makes sense considering the distributed nature of Fabric data warehouses.

Iceberg Data Support in OneLake

Matthew Hicks isn’t replicating data anymore:

Microsoft OneLake is the single, unified, logical data lake that allows your entire organization to store, manage, and analyze data in one place. It provides seamless integration with various data sources and engines, making it easier to derive insights and drive innovation.

At the most recent Microsoft Build conference, we announced the integration effort between Snowflake and OneLake, which aims to allow users of both Snowflake and Microsoft Fabric to work on the same Iceberg data in OneLake, with no data duplication/movement needed. More recently, we released the preview of OneLake’s Iceberg table format support, which included the ability for Snowflake to write Iceberg tables directly to OneLake.

Click through for more information about the current status of this feature, as well as what’s coming soon.

Spatial Queries in Fabric Data Warehouse

Jovan Popovic reads a map:

Spatial data has become increasingly important in various fields, from urban planning and environmental monitoring to transportation and logistics. Fabric Data Warehouse offers spatial functionalities that enable you to query and analyze spatial data efficiently.

In this blog post, we will delve into the spatial capabilities in the Fabric Data Warehouse and demonstrate how to use the spatial functions in your queries.

This looks a bit like the way we perform spatial operations in SQL Server. Jovan shows off some examples of functionality, so check that out.

SQL Database Default Checkbox in Microsoft Fabric Delayed

Amar Digamber Patil makes an announcement:

In our ongoing effort to enhance the visibility, accessibility, and efficiency of SQL database in Fabric, we are making a change that ensures organizations can make an informed decision before default enablement takes effect. We have changed the timeline for when SQL database will be enabled by default.

Initially, we planned to roll out the checkbox notification on February 8, 2025, and enable SQL Database in Fabric by default on March 8, 2025. However, based on the need for more flexibility, we have adjusted the timeline:

Click through for the new timeline. You can, of course, enable it on your own today if you are a Microsoft Fabric administrator with rights to change these settings.

Vacuuming Delta Tables in Microsoft Fabric

Kenneth Omorodion explains why you sometimes need to bust out the VACUUM:

Efficient data management in Microsoft Fabric is a necessity in maintaining large-scale partitioned Delta tables. In dynamic datasets with frequently generated new files, the need to ensure the removal of stale files becomes very important to prevent storage bloat. In settings with partitioned tables, where data is in a hierarchical structure (e.g., by year, month, day), this can be particularly challenging, and files must be cleaned without disrupting active data. Learn how the VACUUM operation can help optimize Delta tables.

Read on to learn more.
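
As a rough illustration of the VACUUM operation described above, here is a minimal sketch of a helper that builds a Delta Lake VACUUM statement. The function name build_vacuum_sql and the table name sales_by_day are hypothetical and do not come from Kenneth's post. The sketch assumes you would pass the resulting string to spark.sql in a notebook, and it refuses retention windows shorter than Delta Lake's default of 168 hours (7 days), which Delta's own safety check also rejects unless that check is disabled.

```python
import re

# Delta Lake's default retention threshold for VACUUM: 7 days.
DEFAULT_RETENTION_HOURS = 168

# Plain or dot-qualified identifiers only, so the name can be safely
# interpolated into a SQL string.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_vacuum_sql(table_name, retain_hours=DEFAULT_RETENTION_HOURS, dry_run=False):
    """Build a VACUUM statement for a Delta table (hypothetical helper)."""
    if not _IDENTIFIER.match(table_name):
        raise ValueError(f"Unexpected table name: {table_name!r}")
    if retain_hours < DEFAULT_RETENTION_HOURS:
        # Shorter windows can delete files that readers or time travel still need.
        raise ValueError("Retention below 168 hours is not allowed in this sketch")
    statement = f"VACUUM {table_name} RETAIN {retain_hours} HOURS"
    if dry_run:
        # DRY RUN lists the files that would be removed without deleting them.
        statement += " DRY RUN"
    return statement


if __name__ == "__main__":
    # In a notebook you might run: spark.sql(build_vacuum_sql("sales_by_day"))
    print(build_vacuum_sql("sales_by_day", dry_run=True))
```

Starting with a DRY RUN is a cautious way to see which stale files a real run would delete before committing to it.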

Deploying Assets via Azure DevOps and fabric-cicd into Microsoft Fabric

Kevin Chant pushes some code:

In this post I want to show how you can operationalize fabric-cicd to work with Microsoft Fabric and Azure DevOps, which I exclusively revealed at Power BI Gebruikersdag over the weekend.

Just so that everybody is aware, fabric-cicd is a Python library that allows you to perform CI/CD of various Microsoft Fabric items into Microsoft Fabric workspaces. At this moment in time there is a limited number of supported item types. However, that list is increasing.

Click through to see what Kevin did and how it worked out.

Writing Data into a Microsoft Fabric Lakehouse via Notebook

Stepan Resl writes some code:

Since Lakehouse is one of the key items within Microsoft Fabric, it is important to know how to write data into it in various formats and using different tools. One of the most common tools is notebooks, as they provide great flexibility and speed for development and testing with graphical outputs. In this article, I want to focus primarily on the following types of notebooks:

  • PySpark
  • Python

Click through to see how it works in both notebook types.
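
For a sense of what the PySpark case can look like, here is a minimal sketch rather than Stepan's actual code. The helper name write_delta_table and the table name demo_table are hypothetical. It assumes a Fabric PySpark notebook with a default Lakehouse attached, where a Spark DataFrame saved through the standard DataFrameWriter API lands as a Delta table.

```python
# Save modes accepted by Spark's DataFrameWriter.
VALID_MODES = {"append", "overwrite", "ignore", "error", "errorifexists"}


def write_delta_table(df, table_name, mode="overwrite"):
    """Save a Spark DataFrame as a Delta table (hypothetical helper).

    Assumes `df` is a pyspark.sql.DataFrame and that the notebook has a
    default Lakehouse attached, so saveAsTable resolves against it.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"Unsupported save mode: {mode!r}")
    # Standard PySpark chain: pick the format, set the save mode, then write.
    df.write.format("delta").mode(mode).saveAsTable(table_name)


# Example use inside a notebook session (needs the `spark` object, so it is
# left as a comment here):
#   df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
#   write_delta_table(df, "demo_table", mode="append")
```

Keeping the write behind a small function makes it easy to switch between overwrite and append while testing, which is the kind of quick iteration notebooks are good for.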
