Category: Cloud

Restarting Failed Control Flows in Azure Data Factory

Meagan Longoria doesn’t want to repeat good work:

I presented at SQL Saturday Pittsburgh this past weekend about populating your data warehouse with a metadata-driven, pattern-based approach. One of the benefits I mentioned is that it’s easy to employ this pattern for restartability.

For instance, let’s say I am loading data from 30 tables and 5 files into the staging area of my data mart or data warehouse, and one of the table loads fails. I don’t want to reload the other tables I just loaded. I want to load the ones that have not been recently loaded. Or let’s say I have 5 dimensions and 4 facts, and I had a failure loading a fact table. I don’t want to reload my dimensions, and I only want to reload the failed facts. How do we accomplish this?

Read on to learn how.
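
To give a rough sense of the pattern, here is a minimal sketch of the kind of control-table query an ADF Lookup activity could run on a restart. The etl.LoadControl table, its columns, and the status values are hypothetical stand-ins, not Meagan's actual implementation:

-- Hypothetical control table driving a metadata-based load.
-- A Lookup activity runs this query and feeds the results to a ForEach,
-- so on a restart only the loads that have not yet succeeded today re-run.
SELECT SchemaName, TableName
FROM etl.LoadControl
WHERE NOT (
        LoadStatus = 'Succeeded'
    AND LastLoadDate >= CAST(GETDATE() AS date)
);

The dimension-and-fact case works the same way: the filter simply excludes anything the control table already marks as loaded for the current run.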

Comments closed

Querying Audit Log (.xel) Files in Azure SQL DB

Tanayankar Chakraborty reads an audit log:

A recent issue was brought to our attention: customers could not query .xel log files in an Azure SQL DB using a T-SQL command. The customers complained that when they ran the command, they received column headers but no content, even though they knew there was content in the logs because they were able to open them with SSMS using Merge Extended Event Files. Here is the T-SQL command the customer used:

select * from sys.fn_get_audit_file ('https://mydbastorage.blob.core.windows.net/sqldbauditlogs/servername/dbname/SqlDbAuditing_Audit_NoRetention/*.xel', NULL, NULL);

Click through for the solution, which came down to two separate issues.
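
As a small aside, once the function does return data, selecting specific audit columns keeps the output easier to scan. This is just a narrowed version of the same call, reusing the placeholder path from above:

-- Same audit-file query, returning only the columns most useful for review
SELECT event_time,
       action_id,
       succeeded,
       server_principal_name,
       database_name,
       statement
FROM sys.fn_get_audit_file(
    'https://mydbastorage.blob.core.windows.net/sqldbauditlogs/servername/dbname/SqlDbAuditing_Audit_NoRetention/*.xel',
    NULL, NULL)
ORDER BY event_time DESC;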

Comments closed

Online DR from SQL Server 2022 and Azure SQL MI Now Available

Djordje Jeremic announces general availability of one of the key selling points from SQL Server 2022:

Today, we are announcing the general availability of the following two major capabilities of the Managed Instance link feature with SQL Server 2022:

  • Two-way failover between SQL Server 2022 and SQL Managed Instance through the link to unlock true disaster recovery (DR) with Azure
  • Creating a link from SQL Managed Instance to SQL Server 2022 to unlock off-PaaS data mobility for regulatory and dev/test scenarios 

Click through for more detail.

Comments closed

Client Diagnostics in Cosmos DB

Arthur Daniels digs into diagnostic data:

This topic keeps coming up with my customers so the purpose of this blog post is to keep all the useful information in one post. Think of this post as a consolidation of different pieces of information. If you’re stuck on what “pipelined”, “created”, or “transit time” means, you’re in the right spot.

We are not talking about Diagnostic Settings/Log Analytics in this post; I’d describe those as server-side rather than client-side diagnostics. They are useful in different ways. Our client diagnostics will help us understand the path that a request takes from the application to Cosmos DB and back, along with any latency on that path.

Note: before we get started, check to see if you’re using Direct or Gateway mode. Sending your requests in Direct mode in the .NET or Java SDKs will usually result in faster requests.

Read on to see how to retrieve and interpret this diagnostic data.

Comments closed

Permissions and BACPAC Files in Azure SQL DB

Roberto Yonekawa diagnoses an error:

We had a support request where the customer was getting an error when trying to export his Azure SQL Database to a bacpac file using the SqlPackage command-line utility.

Error message:

Microsoft.Data.Tools.Diagnostics.Tracer Error: 19 : 2024-08-21T16:10:56 : Microsoft.SqlServer.Dac.DacServicesException: One or more unsupported elements were found in the schema used as part of a data package.

Error SQL71627: The element Permission has property Permission set to a value that is not supported in Microsoft Azure SQL Database v12.

Click through for the specific issue Roberto found. I’d imagine that there are other permission sets that are incompatible with Azure SQL Database and would cause this error message to pop up as well.
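
If you want to get ahead of this, one generic approach is to list the explicit permissions in the source database before exporting and look for anything Azure SQL Database may not support. This is a standard catalog-view query, not the specific fix from Roberto's post:

-- List explicit permission grants/denies in the current database
SELECT pr.name            AS principal_name,
       pe.class_desc,
       pe.permission_name,
       pe.state_desc
FROM sys.database_permissions AS pe
JOIN sys.database_principals  AS pr
    ON pe.grantee_principal_id = pr.principal_id
ORDER BY pr.name, pe.permission_name;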

Comments closed

Reduced Auto-Pause Delay for Azure SQL DB Serverless

Morgan Oslake goes to sleep sooner:

Azure SQL Database serverless automatically scales compute based on workload demand and bills for compute used per second.  In the General Purpose tier, serverless also provides an option to automatically pause the database during idle usage periods when only storage related costs are billed.  When workload activity returns, the database is automatically resumed.

Customers choosing to enable auto-pausing can specify the auto-pause delay as part of the serverless configuration.  The auto-pause delay is the length of time the database must be idle before auto-pausing.  The lower the auto-pause delay and the more frequently auto-pausing occurs, the greater the potential compute cost savings. 

Read on for the update in minimum auto-pause time.

Comments closed

Reading Data from Azure Blob Storage in Snowflake

Arun Sirpal explains a common architectural pattern:

Let’s go back to data platforms today. I want to talk about a very common integration I see nowadays: Azure Blob Storage linked to Snowflake via a storage integration, through which we can access semi-structured files via external tables. It is a good combination of technology, I have to say.

Click through for an architecture diagram and example of the code you’d need.
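
For a rough sense of the moving parts, here is a minimal sketch of that setup in Snowflake SQL. The integration, stage, and table names, the storage account URL, and the tenant ID are all placeholders rather than anything from Arun's post:

-- Storage integration that authenticates Snowflake to the Azure storage account
CREATE STORAGE INTEGRATION azure_blob_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'AZURE'
  ENABLED = TRUE
  AZURE_TENANT_ID = '<tenant-id>'
  STORAGE_ALLOWED_LOCATIONS = ('azure://myaccount.blob.core.windows.net/landing/');

-- External stage pointing at the container, using the integration
CREATE STAGE landing_stage
  STORAGE_INTEGRATION = azure_blob_int
  URL = 'azure://myaccount.blob.core.windows.net/landing/'
  FILE_FORMAT = (TYPE = JSON);

-- External table over the semi-structured files in the stage
CREATE EXTERNAL TABLE ext_landing
  WITH LOCATION = @landing_stage
  FILE_FORMAT = (TYPE = JSON)
  AUTO_REFRESH = FALSE;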

Comments closed

Mounting Azure Data Factory in Fabric Data Factory

Andy Leonard takes up a factory job:

Thanks to the hard work of the Microsoft Fabric Data Factory Team, it’s now possible to mount an Azure Data Factory in Fabric Data Factory. This post describes one way to mount an existing Azure Data Factory in Fabric Data Factory. In this post, we will:

  • Mount an existing Azure Data Factory in Fabric Data Factory
  • Open the Azure Data Factory in Fabric Data Factory
  • Test-execute two ADF pipelines
  • Modify and publish an ADF pipeline

Read on to see how it all works. One of the odd things about Microsoft Fabric—and its predecessor, Azure Synapse Analytics—is the penchant for similar-but-not-quite-the-same services. Yes, we have Data Factory…but it’s not quite the same. Yes, we have Azure Data Explorer (and KQL)…but it’s not quite the same. I get that there are reasons for this (such as not having a resource group with a dozen separate services hanging around), but I’m sure it’s a bit frustrating working on several separate code bases and trying to keep them all approximately in sync.

Comments closed

Trying out the Databricks For-Each Task

Chen Hirsh goes in a loop:

Databricks recently added a for-each task to their workflow capability. Workflows are Databricks jobs; like Data Factory pipelines or SQL Server jobs, they are pipelines you can schedule that include a number of tasks which together complete some business logic.

Theoretically, the long-awaited for-each task should make running multiple processes easier. For example, one of the things I often do is run a list of notebooks, each processing a different table, with no dependencies between them. At the moment I use parallel notebooks: https://docs.databricks.com/en/notebooks/notebook-workflows.html#run-multiple-notebooks-concurrently

As you will see later, this use case is not supported yet. But before that, let’s see what we can do with the for-each task.

Read on to see what it currently can do, and what it cannot.

Comments closed