Synapse Analytics – Curated SQL

Apache Spark 3.5 Support in Azure Synapse Analytics

Published 2025-06-03 by Kevin Feasel

You can now create Azure Synapse Runtime for Apache Spark 3.5. The essential changes include features which come from upgrading Apache Spark to version 3.5 and Delta Lake 3.2. Please review the official release notes for Apache Spark 3.5 to check the complete list of fixes and features. In addition, review the migration guidelines between Spark 3.4 and 3.5 to assess potential changes to your applications, jobs and notebooks.

Credit where credit is due: I’ve made light of the utter lack of work on Azure Synapse Analytics since Microsoft Fabric’s release. But hey, they did a thing. Granted, the impetus behind this was to “prepare for migrating to Microsoft Fabric Spark.”

Troubleshooting a Slow Mapping Data Flow in Azure Synapse Analytics

Published 2025-04-21 by Kevin Feasel

Reitse Eskens has the need for speed:

The issue was quite straightforward. The client has a mapping data flow in Synapse that processes a few hundred to a few thousand rows but takes 15 minutes to complete. The low number of rows compared to the time necessary is a cause for concern.

The data extraction needs a staging storage account where the data is written into TXT files. The second step of the mapping data flow reads the TXT files and writes them out in Delta format, which stores the data as Parquet files.

The source is an SAP S/4HANA CDC table, and the target is a regular Azure storage account.

Read on for Reitse’s summarization of the troubleshooting and testing process, as well as what ended up working for this customer.

Common Data Model Connector for Synapse Spark 3.4

Published 2025-04-08 by Kevin Feasel

Richard Swinbank deals with a totally-not-deprecated platform:

The underlying problem here appears to be that the Spark connector is simply not supported in v3.4. There’s very little I can find to officially confirm or deny this, but an answer to this question on Microsoft Q&A backs this up. The answer also suggests a few options, including:

1. downgrade to Spark 3.3 – this isn’t an option because it’s end-of-life

2. migrate to Fabric – long term this is a good idea, but it’s not a quick fix for this problem.

3. use alternative data access methods, e.g. using the Azure Data Lake Storage Gen2 connector.

In this article, I take a look at option (3).

Click through for Richard’s workaround.

Reasons to Migrate from Synapse to Fabric

Published 2024-11-07 by Kevin Feasel

James Serra has a list:

Many customers ask me about the advantages of moving from Azure Synapse Analytics to Microsoft Fabric. Here’s a breakdown of the standout features that make Fabric an appealing choice:

Unified Environment for All Users
Fabric serves everyone—from report writers and citizen developers to IT engineers—unlike Synapse, which primarily targets IT professionals.

Hands-Free Optimization
Fabric is auto-optimized and fully integrated, allowing most features to perform well without requiring technical adjustments.

I suppose that James is too politic to give what I’d consider the top reason: because there have actually been meaningful updates to Microsoft Fabric in the past year. I’m not sure you can really say the same thing about Azure Synapse Analytics.

The tricky part about this, however, is that–to my knowledge, at least–there’s no clean way to migrate dedicated SQL pools.

Cosmos DB HTAP into Azure Synapse Analytics and Microsoft Fabric

Published 2024-08-13 by Kevin Feasel

Paul Hernandez doesn’t want to write ETL jobs:

In the ever-evolving landscape of data management and analytics, choosing the right tools and approaches is crucial for optimizing performance and achieving business goals. Two prominent solutions that have gained traction are Azure Synapse Link for Azure Cosmos DB and Mirroring in Microsoft Fabric. Both offer unique benefits and cater to different needs, making it essential to understand their differences and use cases.

Read on to see how each of these works, as well as a quick demonstration of efficacy.

GUID Conversion and the Serverless SQL Pool

Published 2024-02-22 by Kevin Feasel

Reitse Eskens hits a weird error:

One of the transformations is to change one primary key column from integer to GUID. This is something you can do with some trickery you’ll see in the code. But what I found was that, even though the primary key is unique, the GUIDs weren’t. And then the fun starts digging into the why…

Read on for the research Reitse performed. I don’t even have a good guess for this, it’s so weird. It feels like a bug but it’s weird regardless.
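
To show why this result is surprising, here is a minimal Python sketch of a deterministic integer-to-GUID mapping, which should preserve uniqueness by construction. This is not Reitse’s code (his transformation runs in the serverless SQL pool and is not reproduced here); `int_to_guid` is a hypothetical helper introduced purely for illustration.

```python
import uuid


def int_to_guid(key: int) -> uuid.UUID:
    """Hypothetical helper: place a non-negative integer key in the 128 bits of a GUID."""
    return uuid.UUID(int=key)


# Distinct integer keys map to distinct GUIDs, so duplicates should be impossible.
keys = range(1, 10_001)
guids = [int_to_guid(k) for k in keys]
assert len(set(guids)) == len(guids)

print(int_to_guid(42))  # 00000000-0000-0000-0000-00000000002a
```

If a conversion like this yields duplicate GUIDs from unique keys, the mapping is not being evaluated once per row in the way you would expect, which is where the research in the linked post picks up.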

The Death (and Life?) of Azure Synapse Analytics

Published 2024-02-07 by Kevin Feasel

Paul Andrew plays coroner:

I think it’s fair to say that Azure Synapse Analytics has had a hard life. It was announced in public preview as a surprise to most of the community, including Microsoft cloud solution architects, which ultimately meant that very little private preview testing and feedback on the product was done before showing it to the world. This resulted in a lot of frustration in the subsequent year before it could be classified as generally available and more frustration after that while we battled with the missing production features. Even now, the product is lacking in a lot of functionality. Anyway, this is all in the past. Microsoft Fabric is the new kid on the block, and we need to address the unpopular question about the future of Synapse. And considering I’ve been very unpopular with the product teams before, I’ll take this one for the team. Sorry, but it needs to be addressed.

Read on for Paul’s thoughts. I tend to agree in general with his take, but do read Bogdan Crivat’s response. Bogdan is on the Synapse product team and shares some thoughts as well.

Network Troubleshooting for Azure Synapse Analytics

Published 2023-11-17 by Kevin Feasel

Sergio Fonseca continues a series on Azure Synapse Analytics connectivity problems:

In this post I will speak about how to capture a network trace and how to do some basic troubleshooting using Wireshark to investigate connection and disconnection issues, including but not limited to the sample error messages below:

An existing connection was forcibly closed by the remote host, The specified network name is no longer available, The semaphore timeout period has expired.

Connection Timeout Expired. The timeout period elapsed while attempting to consume the pre-login handshake acknowledgement. This could be because the pre-login handshake failed or the server was unable to respond back in time. The duration spent while attempting to connect to this server was – [Pre-Login] initialization=5895; handshake=29;

A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 – The semaphore timeout period has expired.)

A connection was successfully established with the server, but then an error occurred during the login process

Failed to copy to SQL Data Warehouse from blob storage. A connection was successfully established with the server, but then an error occurred during the login process. (provider: SSL Provider, error: 0 – An existing connection was forcibly closed by the remote host.) An existing connection was forcibly closed by the remote host
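
Before capturing a full trace for errors like the ones above, a quick TCP reachability check can help separate network-level failures from problems later in the login process. The following is a minimal Python sketch and is not taken from Sergio’s post; `check_tcp` is a hypothetical helper, the host name in the usage comment is a placeholder, and port 1433 is assumed because it is the standard port for Synapse SQL endpoints.

```python
import socket
import time


def check_tcp(host: str, port: int = 1433, timeout: float = 5.0) -> tuple[bool, float, str]:
    """Hypothetical helper: try to open a TCP connection and report the outcome.

    Returns (succeeded, seconds elapsed, short description).
    """
    start = time.perf_counter()
    try:
        # create_connection resolves the host name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.perf_counter() - start, "connected"
    except OSError as exc:
        # Covers DNS failures, refused connections, and timeouts.
        return False, time.perf_counter() - start, f"{type(exc).__name__}: {exc}"


# Example usage (placeholder host name, replace with your own endpoint):
# ok, elapsed, detail = check_tcp("yourworkspace.sql.azuresynapse.net")
# print(f"reachable={ok} after {elapsed:.3f}s ({detail})")
```

A successful connection here only shows that the TCP handshake completes. Pre-login and login failures happen after that point, so those still call for the network trace the post describes.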

Power BI Authentication to Synapse via Shareable Cloud Connection

Published 2023-11-08 by Kevin Feasel

Dan English continues a series:

This is a bit overdue and a follow-up to a few other posts I have regarding using Service Principal authentication with Power BI reports: “Power BI using Service Principal with Synapse SQL Pool” and “Power BI using Service Principal with Synapse Data Explorer (Kusto) Pool.”

With the other two posts I did last year, I had to use the SQL Server ODBC driver to get that to work, and the big downside is that you need to use a gateway with it. Well, in this case we are going to take a look at the new Shareable Cloud Connections that were announced earlier this year in “Streamlining cloud connection management for datasets, paginated reports, and other artifacts” on the Microsoft Power BI Blog.

Click through to see what you need to get it working.

Exporting Dynamics 365 Data into Delta Lake via Synapse Link

Published 2023-09-27 by Kevin Feasel

Jose Mendes performs a data migration:

It’s fair to say there have been some considerable changes in the Azure landscape over recent years.

This blog will show you how to configure Synapse Link to export D365 data in the Delta Lake format – an open-source data and transaction storage file format used in Lakehouse implementations.

Before you start considering using this approach, you will need to ensure you meet the following prerequisites (Microsoft documentation).

Read on for those prerequisites as well as a step-by-step guide on how to do it.

Category: Synapse Analytics