
Category: Synapse Analytics

Rolling Your Own Serverless SQL Pool Database Project

Kevin Chant doesn’t let the lack of support for a product limit him:

In this post I want to share how I created a homemade serverless SQL Pool database project.

Because I know people are keen to work this way right now. Mostly due to the comments I received when I covered how to deploy a dacpac to a serverless SQL pool.

By the end of this post you will know how I created a database project for it. Plus, how you can deploy the contents of the database project with Azure DevOps. I also share plenty of links along the way.

Though Kevin did run into some challenges trying to hack in a solution, so it’s not quite as useful as you’d first hope.
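Kevin's actual project layout and tooling are in his post; purely to make the idea concrete, a "homemade" database project can be as little as a folder of numbered .sql scripts that a pipeline replays in order against the serverless endpoint. The sketch below is a hedged illustration under that assumption (hypothetical scripts/ folder, SQL authentication, single-batch scripts), not necessarily how Kevin wired his up.

```python
# Minimal sketch: replay a folder of ordered .sql scripts against a
# serverless SQL pool. Server, database, credentials, and folder layout
# are placeholders, and each script is assumed to be a single batch
# (no GO separators, which pyodbc cannot split for you).
from pathlib import Path

import pyodbc  # pip install pyodbc; requires the Microsoft ODBC Driver for SQL Server

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace-name>-ondemand.sql.azuresynapse.net;"
    "DATABASE=<database-name>;"
    "UID=<sql-login>;PWD=<password>;",
    autocommit=True,
)
cursor = conn.cursor()

# Run scripts in lexical order, e.g. 001_schemas.sql, 002_views.sql, ...
for script in sorted(Path("scripts").glob("*.sql")):
    print(f"Deploying {script.name}")
    cursor.execute(script.read_text())

conn.close()
```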


Synapse and Azure ML Pipelines

Santosh Thomas integrates two Azure products:

As more customers standardize on the Synapse data platform, enabling machine learning workflows through Azure Machine Learning (Azure ML) becomes particularly interesting. This is especially true as more customers look to bring their data engineering and data science practices together and mature capabilities on both sides.

The goal of this blog post is to highlight how Synapse and Azure ML can work well together to deliver key insights. This is motivated by a scenario where a customer modernized their data platform on Azure Synapse but was looking to improve their data science practices through Azure ML. The focus of this blog is to expose existing functionality, and it is not a “hardened” solution with security or other cloud best practice implementations. The workflow steps also assume some level of comfort with Python and working with the Azure Python SDKs.

There was a time when Microsoft wanted us to remain in Synapse for machine learning tasks, but that time is gone: the emphasis is now squarely on doing machine learning in Azure ML, regardless of where the data lives…unless there's a Spark job involved, in which case things get all weird again.
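Santosh's walkthrough covers the full wiring between the two services; as a hedged taste of just the Azure ML half with the v2 Python SDK, submitting a training script as a command job looks roughly like this. The subscription, workspace, compute, environment, and data path below are all placeholders.

```python
# Sketch: submit a training script to Azure ML as a command job (SDK v2).
# Everything in angle brackets is a placeholder; the data path assumes your
# Synapse pipelines have landed curated data in an ADLS-backed datastore.
from azure.ai.ml import MLClient, Input, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<azure-ml-workspace>",
)

job = command(
    code="./src",  # local folder containing train.py
    command="python train.py --data ${{inputs.training_data}}",
    inputs={
        "training_data": Input(
            type="uri_folder",
            path="azureml://datastores/workspaceblobstore/paths/curated/",
        )
    },
    # A curated environment name; swap for one available in your workspace.
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    compute="cpu-cluster",
)

returned_job = ml_client.jobs.create_or_update(job)
print(returned_job.studio_url)
```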


Principles of Synapse Security

Liliam Leme provides an overview of security options in Azure Synapse Analytics:

This blog post will provide an overview of the Synapse security environment focused on Dedicated SQL Pool, Serverless SQL Pool, and Spark.

Security has many layers and frequently it will determine how you build your process. I start this post by reviewing several important security considerations which you can later apply to your Synapse environment. 

This is a fairly lengthy post and it still only covers a moderate amount of what you’d want to do for Azure Synapse Analytics. This is the downside to having a complex interplay of several products: there’s a lot to secure and a lot to think about along the way.


Getting Started with Azure Synapse Analytics

Shabnam Watson gets us started with Azure Synapse Analytics:

In this blog post, I show you how easy it is to start an Azure Synapse Analytics workspace (instance) and use its Serverless SQL Pool engine to analyze sample publicly available data. As you will read shortly, Azure Synapse Analytics provides many compute engines for different use cases. The easiest one to get started with is its Serverless SQL Pool since every Azure Synapse Analytics instance comes with one already created and ready to use. It also does not have any cost unless you use it, which makes it very attractive to those who have a limited Azure budget.

Click through to see how to create a workspace, load some data, and query it via the serverless SQL pool.
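Shabnam does all of this through Synapse Studio; just to give a flavor of the kind of query the serverless pool runs over files, here is a hedged sketch issuing an OPENROWSET query from Python instead. The workspace endpoint and storage path are placeholders.

```python
# Sketch: query a public Parquet dataset through the serverless SQL pool's
# endpoint using OPENROWSET. Endpoint name and file path are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace-name>-ondemand.sql.azuresynapse.net;"
    "DATABASE=master;"
    "Authentication=ActiveDirectoryInteractive;",
    autocommit=True,
)

query = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://<public-storage-account>.blob.core.windows.net/<container>/*.parquet',
    FORMAT = 'PARQUET'
) AS rows;
"""

cursor = conn.cursor()
for row in cursor.execute(query):
    print(row)
conn.close()
```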


Recommendations for Dedicated SQL Pool Data Modeling

Bhaskar Sharma has some advice:

In this article, I will discuss how to physically model an Azure Synapse Analytics data warehouse while migrating from an existing on-premises MPP (Massive Parallel Processing) data warehouse solution like Teradata and Netezza. The approach and methodologies discussed in this article are purely based on the knowledge and insight I have gained while migrating these data warehouses to Azure Synapse dedicated SQL pool. 

Dedicated SQL pools are close enough to regular SQL Server that we make a lot of assumptions about them, some of which may be wrong.
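To make "physically model" concrete: in a dedicated SQL pool, every table carries a distribution strategy (hash, round robin, or replicate) and an index choice that plain SQL Server never asks for, and those are exactly the sort of decisions a Teradata or Netezza migration forces you to revisit. A hedged sketch with illustrative names only:

```python
# Sketch: the kind of physical-design decision the article is about.
# Table, column, and connection details are illustrative placeholders.
import pyodbc

ddl = """
CREATE TABLE dbo.FactSales
(
    SaleKey     BIGINT         NOT NULL,
    CustomerKey INT            NOT NULL,
    SaleAmount  DECIMAL(18, 2) NOT NULL,
    SaleDate    DATE           NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),  -- co-locate rows that join on CustomerKey
    CLUSTERED COLUMNSTORE INDEX        -- the default, suited to large fact tables
);
"""

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<workspace-name>.sql.azuresynapse.net;"
    "DATABASE=<dedicated-pool-name>;"
    "UID=<user>;PWD=<password>;",
    autocommit=True,
)
conn.cursor().execute(ddl)
conn.close()
```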


Deploying a dacpac to the Serverless SQL Pool

Kevin Chant drops some dacpacs off at the (serverless) pool:

In this post I want to cover deploying a dacpac to a serverless SQL pool using Azure DevOps. Yes, you are reading that right. It is now possible thanks to a sqlpackage update.

To clarify, a dacpac file is a special file that you can use to deploy updates to SQL Server related databases using a state-based deployment. Plus, when I say serverless SQL pool I mean an Azure Synapse Analytics serverless SQL Pool.

Kevin includes examples for Azure DevOps as well as GitHub Actions.
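Kevin's post has the full Azure DevOps and GitHub Actions pipelines; as a rough sketch of the underlying step, the deployment boils down to a single SqlPackage publish call, shown here wrapped in Python only to keep the examples in one language. Paths, server name, and credentials are placeholders.

```python
# Sketch: the SqlPackage publish step a pipeline would run to deploy a
# dacpac to a serverless SQL pool. Assumes SqlPackage is on PATH; all
# paths and connection details are placeholders.
import subprocess

result = subprocess.run(
    [
        "SqlPackage",
        "/Action:Publish",
        "/SourceFile:bin/Release/MyServerlessDb.dacpac",
        "/TargetServerName:<workspace-name>-ondemand.sql.azuresynapse.net",
        "/TargetDatabaseName:MyServerlessDb",
        "/TargetUser:<sql-login>",
        "/TargetPassword:<password>",
    ],
    check=True,
    capture_output=True,
    text=True,
)
print(result.stdout)
```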


Trying out Azure Synapse Link for SQL Server 2022

Kevin Chant looks at Azure Synapse Link for SQL Server 2022:

My first topic is about a new feature that covers both SQL Server 2022 and Azure. Which is Azure Synapse Link, or to be more precise Azure Synapse Link for SQL Server 2022.

I have been doing various tests with this feature recently. Which has led to some interesting blog posts about Azure Synapse Link for SQL Server 2022.

Read on for a few more thoughts, as well as deployment scripts via Azure DevOps and GitHub Actions.


Automated Delta Lake Maintenance in Synapse

Shalu Ganotra Chadha, et al., explain how to keep your Delta Lake tidy:

The useful features of Delta Lake come at the cost of requiring regular maintenance. Delta Lake requires periodic cleanup as it accumulates files over time with each upsert and retains previous snapshots of the data. They can quickly convert a small dataset (in MBs) to several GBs of storage. This is because deleted data is not really removed but retained as an older snapshot of the Delta Lake dataset.

Click through for two operations you can perform on a Delta Lake, as well as some recommendations on when to do what via the Genie Delta Lake Auto Maintenance scripts they provide.
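The usual pair of maintenance operations here is file compaction (OPTIMIZE) and VACUUM. As a minimal sketch of what they look like from a Synapse Spark notebook, assuming a placeholder table path, the default seven-day retention, and a Delta Lake version that supports OPTIMIZE (the Genie scripts automate and parameterize this far more thoroughly):

```python
# Sketch: routine Delta Lake maintenance from a Synapse Spark notebook,
# where `spark` is the SparkSession the notebook provides. The table path
# and retention window are placeholders.
from delta.tables import DeltaTable

delta_path = "abfss://<container>@<storage-account>.dfs.core.windows.net/delta/sales"

# Compact many small files into fewer, larger ones.
spark.sql(f"OPTIMIZE delta.`{delta_path}`")

# Remove files no longer referenced by the current table version and older
# than the retention threshold (default 7 days = 168 hours).
DeltaTable.forPath(spark, delta_path).vacuum(retentionHours=168)
```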


Disabling Public Network Access in Synapse

Ryan Adams builds a private endpoint:

If you disable public access to your Azure Synapse Workspace you will get the following error message when attempting to open Synapse Studio. 

“Failed to load one or more resources due to forbidden issue, error code 403.” 

Click through for more information about routing for Synapse resources and what you’d need to do in order to disable public network access entirely.


Power BI Incremental Refresh with Non-Standard Dates in Parquet Files

Shabnam Watson hits on a specific but interesting use case:

The most common scenario for setting up the out of the box incremental refresh in Power BI is to base it off of a datetime column; however, there are cases when you may want to set up incremental refresh based off of a column with a data type other than datetime. Examples are when you are working with a smart date ID (01012023 for Jan 1, 2023) column or when you are working with a source system that has partitioned data using a column such as Year that has a numeric data type.

A use case for the latter scenario is when you are working with Parquet/Delta files via Azure Synapse Analytics Serverless SQL Pool. When working with larger datasets, it is typical to see the Parquet/Delta files partitioned by date ranges. Depending on how much data there is, the partitioning may be at the Year level instead of Day.

With that scenario in mind, read on to learn how you can minimize your Power BI processing time and costs when doing incremental refresh.
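Power BI still hands you datetime RangeStart and RangeEnd parameters even when the table's key is numeric, so the heart of the technique is converting those parameters into the numeric key before filtering. Shabnam does that conversion in Power Query; purely to illustrate the logic (and to keep the examples in one language), here it is sketched in Python.

```python
# Sketch of the conversion idea only: translate datetime RangeStart/RangeEnd
# values into the numeric keys the source is partitioned by. The real
# implementation lives in Power Query in the linked post.
from datetime import datetime

def to_smart_date_id(dt: datetime) -> str:
    """MMDDYYYY key such as '01012023' for Jan 1, 2023 (the post's example)."""
    return dt.strftime("%m%d%Y")

def year_key_filter(range_start: datetime, range_end: datetime) -> str:
    """Filter on a numeric Year partition column instead of a datetime column."""
    return f"Year >= {range_start.year} AND Year < {range_end.year}"

range_start = datetime(2023, 1, 1)
range_end = datetime(2024, 1, 1)
print(to_smart_date_id(range_start))            # 01012023
print(year_key_filter(range_start, range_end))  # Year >= 2023 AND Year < 2024
```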
