Category: Synapse Analytics

CI/CD for Synapse Serverless SQL Pool with SqlPackage and Azure DevOps

Published 2023-09-15 by Kevin Feasel

Azure Synapse Analytics Serverless SQL is a query service mostly used over the data in your data lake, for data discovery, transformation, and exploration purposes. It is, therefore, normal to find in a Synapse Serverless SQL pool many objects referencing external locations, using disparate external data sources, authentication mechanisms, file formats, etc. In the context of CICD, where automated processes are responsible for propagating the database code across environments, one can take advantage of database-oriented tools like SSDT and SqlPackage CLI, ensuring that this code conforms to the targeted resources.

In this article I will demonstrate how you can take advantage of these tools when implementing the CICD for the Azure Synapse Serverless SQL engine. We will leverage SQL projects in SSDT to define our objects and implement deploy-time variables (SQLCMD variables). Through CICD pipelines, we will build the SQL project to a dacpac artifact, which enables us to deploy the database objects one or many times with automation.

Click through for the demonstration.
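
To make the deploy step in the excerpt a bit more concrete, here is a minimal sketch of how a pipeline might assemble a SqlPackage publish call with SQLCMD variables. This is not taken from the linked article: the helper name `build_publish_args`, the dacpac path, the workspace endpoint, the database name, and the `DataLakeUrl` variable are all hypothetical, and authentication arguments are left out on purpose.

```python
from typing import Dict, List


def build_publish_args(
    dacpac_path: str,
    server: str,
    database: str,
    sqlcmd_vars: Dict[str, str],
) -> List[str]:
    """Build a SqlPackage publish command line for one target environment.

    Authentication parameters are deliberately omitted from this sketch.
    """
    args = [
        "SqlPackage",
        "/Action:Publish",
        f"/SourceFile:{dacpac_path}",
        f"/TargetServerName:{server}",
        f"/TargetDatabaseName:{database}",
    ]
    # Each SQLCMD variable defined in the SQL project gets its
    # environment-specific value at deploy time via /v:Name=Value.
    for name, value in sorted(sqlcmd_vars.items()):
        args.append(f"/v:{name}={value}")
    return args


# Hypothetical values for a test environment (assumptions, not from the article).
test_args = build_publish_args(
    dacpac_path="drop/ServerlessDb.dacpac",
    server="myworkspace-ondemand.sql.azuresynapse.net",
    database="ServerlessDb",
    sqlcmd_vars={"DataLakeUrl": "https://teststorage.dfs.core.windows.net/raw"},
)
print(" ".join(test_args))
```

The same dacpac can then be published to each environment by swapping only the server, database, and variable values, which is the point of keeping environment-specific locations out of the project itself.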


Contrasting Azure Synapse Analytics and Microsoft Fabric

Published 2023-08-24 by Kevin Feasel

Warner Chaves explains the difference:

In the modern era of data-driven decision-making, businesses rely heavily on robust and efficient data platforms to process, analyze, and derive insights from their vast amounts of data. Since 2019, Azure Synapse Analytics has been Microsoft’s main contender in this space, offering powerful capabilities to handle complex data workloads.

Now, Microsoft has announced a new data platform called Microsoft Fabric, an evolution of the data platform built with a modified philosophy. It is a similar product, but with enough differences that the two are not interchangeable, so it’s very important to understand how they compare and contrast if you’re planning a new data platform deployment. Microsoft wanted a product that was even simpler to deploy and operate and could function well outside of an Azure cloud environment as a full standalone Software as a Service offering.

In this blog post, we’ll compare Synapse Analytics and Fabric, highlighting their features, strengths, and considerations to help you make an informed decision for your organization’s data needs.

Warner has seven main areas of comparison, so click through to see how the two products stack up.


Against Waiting for Microsoft Fabric

Published 2023-08-22 by Kevin Feasel

Paul Andrew follows Betteridge’s Law of Headlines:

But, let’s prepare for it in terms of the technical capabilities we line up in our existing data architecture.

The hype curve around Microsoft Fabric since its announcement earlier in the year has been huge. The problem is, we now face some difficult questions in terms of our technology estate. Especially if we have designs and a project already in flight using other Azure Resources.

Read on for Paul’s thoughts on the matter and why you shouldn’t wait until Microsoft Fabric is officially out—use what is available in the meantime and then decide whether you want to make a transition. Paul leaves one thing in the margins that I would want to make clear: if this is your plan, avoid the dedicated SQL pool unless you absolutely need it or plan to stay on Synapse once Fabric is GA.


A First-Pass Approach to Migrating Dedicated SQL Pool Schemas to Fabric

Published 2023-07-20 by Kevin Feasel

Kevin Chant gets a jump on a big problem:

To manage expectations, this post only covers database schema objects. Plus, I need to highlight the fact that this solution has some interesting quirks. Some of which I highlight in this post.

Even though there are some quirks, I still want to show this solution. So that others can see it working and I can highlight a few important facts. Plus, share a template you can use to test this yourself.

The current lack of a good migration strategy is a real challenge for anyone thinking of moving from Azure Synapse Analytics to Microsoft Fabric. Serverless SQL pools and Spark pools are an easy transition, but dedicated SQL pools are a tough nut to crack.


Using the Azure Data Factory Self-Hosted Integration Runtime

Published 2023-07-14 by Kevin Feasel

Chen Hirsh hosts a runtime:

In Azure Data Factory (ADF), an integration runtime is a compute resource to run your pipelines on. When you run an application on your computer, it uses the computer resources, such as CPU and memory, to run its tasks. When you run activities in a pipeline in ADF, they also need resources to do their job, like copying data or writing a file, and these are provided by the integration runtime.

When you create an instance of ADF, you get a default integration runtime, hosted in the same region that you created ADF in. If you need, you can add your own integration runtimes, either on Azure, or you can download and install a self-hosted integration runtime (SHIR) on your own server.

Read on to understand when you would want to use a self-hosted integration runtime and the process to do so. This SHIR also applies to Synapse pipelines and is one of the few ways to move data out of a Synapse workspace with data exfiltration protection enabled.


Migrating the Serverless SQL Pool to Fabric

Published 2023-07-06 by Kevin Feasel

Kevin Chant makes a move:

By the end of this post, you will know how to migrate serverless SQL Pool objects to a Microsoft Fabric Data Warehouse using Azure DevOps. Along the way I share plenty of links and advice.

Please note that Microsoft Fabric is currently in Public Preview and what you see in this post is subject to change.

This is the relatively easy one. The real challenge will be dedicated SQL pool migration.


Auto-Pausing Synapse Dedicated SQL Pools

Published 2023-06-19 by Kevin Feasel

Mark Broadbent saves some money via pool auto-pausing:

This capability is neither earth-shatteringly new nor unexpected, and something that Databricks has provided for some time. Of the two Data Exploration & Data Warehousing Pool types, Synapse Serverless Pool (otherwise known as the built-in Pool) by its very definition does not incur compute charges when it is not running.

Therefore this leaves us with only dedicated SQL Pool to worry about and this is where our problems begin.

Click through for the scripts to pause and resume a dedicated SQL pool, and Mark promises a part 2 in which we see the automation.
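
For a rough idea of what a pause or resume call can look like, here is a minimal sketch that assembles the Azure CLI command for a dedicated SQL pool. It is not Mark's script, which may well use a different tool. The helper name `build_pool_command` and the pool, workspace, and resource group names are hypothetical.

```python
from typing import List


def build_pool_command(
    action: str,
    pool: str,
    workspace: str,
    resource_group: str,
) -> List[str]:
    """Build an `az synapse sql pool pause|resume` command as an argument list."""
    if action not in ("pause", "resume"):
        raise ValueError(f"unsupported action: {action!r}")
    return [
        "az", "synapse", "sql", "pool", action,
        "--name", pool,
        "--workspace-name", workspace,
        "--resource-group", resource_group,
    ]


# Hypothetical resource names (assumptions for illustration only).
pause_cmd = build_pool_command("pause", "dedicatedpool01", "myworkspace", "rg-analytics")
print(" ".join(pause_cmd))
```

The argument list can be handed to `subprocess.run` from a scheduled job that already has an authenticated Azure CLI session. The scheduling and the decision about when a pool counts as idle are left out of this sketch.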


Microsoft Fabric vs Synapse

Published 2023-05-26 by Kevin Feasel

Nikola Ilic shares some thoughts:

I’ve already introduced Microsoft Fabric in the previous article, so if you’re still not sure what it is all about and why you can think of Fabric as your “data football team”, I strongly encourage you to check that article. Additionally, there are many great articles and videos, both from Microsoft and the community, where you can find out more about Fabric and its various scenarios and components.

In the above-mentioned article, I scratched the surface of the inevitable topic that now comes into focus: “What now for Azure Synapse Analytics?” Since I’ve been asked this exact question multiple times in the previous days, I’ve decided to put down my thoughts and share them in this article.

Read the whole thing. My thoughts, which are generally similar to Nikola’s:

There are no plans (at this time) to remove Synapse, and even if there were, prior history—like with Azure SQL DW—says that the deprecation timeframe is something we can measure in years rather than months
Fabric is intended to replace Synapse one of these days, and new customers should start with Fabric
Current Synapse customers should stay on Synapse for now, especially given that there is currently no easy migration plan. Give partners and Microsoft some time to sort that out, though, and I expect you’ll see tools and products for this by the time Fabric goes GA
PaaS and SaaS are quite different and that can be an influential factor. My personal preference is for SaaS, especially knowing how difficult it can be to secure Synapse while still enabling developer functionality
We’re on day 4 of Fabric being a thing (at least in public), and it’ll probably be in a public preview for a while, so there’s still plenty of baking left to do


Feature Branching and Hotfixes for Azure DevOps

Published 2023-05-12 by Kevin Feasel

Vytas Suopys covers a bit of source control strategy:

Have you ever deployed a release to production only to find out a bug has escaped your testing process and now users are being severely impacted? In this post, I’ll discuss how to deploy a fix from your development Synapse Workspace into a production Synapse Workspace without adversely affecting ongoing development projects.

This example uses Azure DevOps for CICD along with a Synapse extension for Azure DevOps: Synapse Workspace Deployment. In this example, I assume Synapse is already configured for source control with Azure DevOps Git and Build and Release pipelines are already defined in Azure DevOps. Instructions on how to apply this can be found in the Azure Synapse documentation for continuous integration and delivery.

The specific example covers Synapse, though the general principle applies no matter what you’re deploying.


An Overview of Azure Synapse Analytics

Published 2023-05-11 by Kevin Feasel

Kevin Chant offers a primer on Azure Synapse Analytics:

In reality, there are a lot of features within Azure Synapse Analytics where your SQL Server background can prove to be useful.

By the end of this post, you will have a good overview of Azure Synapse Analytics. In addition, you will see where your SQL Server background can prove to be useful. Plus, there are plenty of links included in this post.

This is not the slimmest of primers, which makes sense given how broad Synapse is.

