Press "Enter" to skip to content

Category: Synapse Analytics

Comparing the Dedicated and Serverless SQL Pools

Liliam Leme compares SQL pools in Azure Synapse Analytics:

Two questions I frequently get from customers are: “What is the difference between the Synapse dedicated SQL pool (formerly SQL DW) and the serverless SQL pool?” and “Which one should I choose for my business?”

This post explains the basic concepts of the dedicated SQL pool and the serverless SQL pool, and helps you understand how they work and how to use them based on your business needs.

Click through for the comparison.
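
If you want a feel for the serverless half of that comparison before clicking, here is a minimal sketch (the workspace name and storage path are placeholders) of the serverless model from Python: pay-per-query T-SQL run directly over files in the lake, with no pre-provisioned compute. A dedicated SQL pool would instead query tables loaded into its own distributed storage.

```python
# Minimal sketch of querying a serverless SQL pool from Python.
# The server name and storage path are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace-ondemand.sql.azuresynapse.net;"  # serverless endpoint
    "Database=master;"
    "Authentication=ActiveDirectoryInteractive;"
)

# OPENROWSET reads the parquet files in place; you pay per TB processed
# rather than for provisioned compute.
query = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://mystorage.dfs.core.windows.net/data/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS rows;
"""
for row in conn.cursor().execute(query):
    print(row)
```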


Spark Query Optimization in Synapse

Daniel Coelho lays out a few optimizations in Azure Synapse Analytics Spark pools:

The Azure Synapse Analytics team has prominent engineers enhancing and contributing back to the Apache Spark project. One of our focus areas is Spark query optimization techniques, where Microsoft has decades of experience and is making significant contributions to the Apache Spark open source engine.

The attachment at the bottom of this blog post will be presented at the 48th International Conference on Very Large Databases (#VLDB2022) and covers the latest developments in query optimization for Apache Spark 3. Those optimizations were developed by Microsoft engineers and are available today in the Azure Synapse runtime for Apache Spark versions 3.1 and 3.2.

Check out the high-level updates as well as a complete technical paper laying out the changes.
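
The paper’s optimizations live inside the engine rather than behind user-facing switches, but if you want to poke at Spark 3’s query optimizer yourself in a Synapse notebook, here is a quick, generic sketch:

```python
# Observe Spark 3's optimizer at work: adaptive query execution (AQE)
# re-plans at runtime based on observed statistics.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

df = spark.range(1_000_000).withColumnRenamed("id", "k")
joined = df.join(df.sample(0.01).hint("broadcast"), "k")

# 'formatted' mode prints the physical plan the optimizer chose,
# including any AQE adjustments.
joined.explain(mode="formatted")
```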


Building a Lakehouse with Azure Synapse Analytics

Arshad Ali does a bit of construction:

Data lakehouse architecture has become the de facto standard for designing and building data platforms for analytics, as it bridges the gap and breaks the silos created by the traditional/modern data warehouse and the data lake. This blog post introduces you to the world of the data lakehouse and goes into detail on how to implement it successfully in Azure with Azure Synapse Analytics.

Read the whole thing.
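
As a taste of the building blocks involved, here is a minimal sketch, assuming a Synapse Spark pool (Delta Lake ships in the Synapse runtime) and a placeholder abfss path: write a Delta table to the lake, then read it back with ACID guarantees.

```python
# Write and read a Delta table in the lake -- the basic lakehouse move.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
path = "abfss://lake@mystorage.dfs.core.windows.net/silver/customers"  # placeholder

df = spark.createDataFrame(
    [(1, "Adventure Works"), (2, "Contoso")], ["customer_id", "name"]
)
df.write.format("delta").mode("overwrite").save(path)

# Spark pools and serverless SQL pools can both read these same files,
# which is the silo-breaking point of the architecture.
spark.read.format("delta").load(path).show()
```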


Data Modification with Synapse Link for SQL Server 2022

Kevin Chant changes some data:

In this post I want to cover some things that happen internally when you do updates and deletes with Azure Synapse Link for SQL Server 2022 whilst it is running.

Recently somebody who had read a previous post of mine, where I covered my initial tests for Azure Synapse Link for SQL Server 2022, asked whether it captures updates and deletes.

Anyway, the short answer is that Azure Synapse Link for SQL Server 2022 does capture updates and deletes. In this post I will go into more detail about some of the things that appear to happen along the way.

Click through for Kevin’s tests and what the results look like.
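
If you want to reproduce something in the spirit of those tests, a rough harness might look like the following. This is not Kevin’s exact setup; the connection strings and table are placeholders.

```python
# Change rows on the SQL Server 2022 source, then poll the dedicated SQL
# pool target to watch the update flow through Synapse Link.
import time
import pyodbc

SOURCE_CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=sql2022box;Database=SourceDb;Trusted_Connection=yes;"  # placeholder
TARGET_CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=myworkspace.sql.azuresynapse.net;Database=SQLPool1;Authentication=ActiveDirectoryInteractive;"  # placeholder

source = pyodbc.connect(SOURCE_CONN_STR, autocommit=True)
target = pyodbc.connect(TARGET_CONN_STR, autocommit=True)

source.execute("UPDATE dbo.Orders SET Status = 'Shipped' WHERE OrderID = 42;")
source.execute("DELETE FROM dbo.Orders WHERE OrderID = 99;")

# Replication is asynchronous, so give the change a while to land.
for _ in range(30):
    row = target.execute(
        "SELECT Status FROM dbo.Orders WHERE OrderID = 42;"
    ).fetchone()
    if row and row.Status == "Shipped":
        print("update replicated")
        break
    time.sleep(10)
```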


Azure Synapse Analytics August 2022 Updates

Ryan Majidimehr has a changelog for us:

Full support for MLflow

MLflow is a platform for managing the machine learning lifecycle and streamlining machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. We are very happy to announce that SynapseML models now integrate with MLflow, with full support for saving, loading, deployment, and autologging!

To learn more, read the MLflow in SynapseML getting started guide and the SynapseML Autologging documentation.

There are quite a few changes on this list, so they’ve definitely been busy.
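
As a minimal sketch of the MLflow piece, assuming a Synapse Spark pool with SynapseML and MLflow available (the data, column names, and experiment name are invented here):

```python
# Train a SynapseML LightGBM model and log it to MLflow.
import mlflow
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from synapse.ml.lightgbm import LightGBMClassifier

spark = SparkSession.builder.getOrCreate()
raw = spark.createDataFrame(
    [(0.0, 1.2, 0), (1.0, 0.3, 1), (0.5, 0.8, 0), (1.5, 0.1, 1)],
    ["f1", "f2", "label"],
)
train_df = VectorAssembler(
    inputCols=["f1", "f2"], outputCol="features"
).transform(raw)

mlflow.set_experiment("synapseml-demo")
mlflow.pyspark.ml.autolog()  # per the announcement, autologging now covers SynapseML estimators

with mlflow.start_run():
    model = LightGBMClassifier(
        labelCol="label", featuresCol="features"
    ).fit(train_df)
    mlflow.spark.log_model(model, "model")  # save here; load/deploy through MLflow as usual
```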


Azure Synapse Link for SQL Server 2022 and File Analysis

Kevin Chant digs into Azure Synapse Link for SQL Server 2022:

In this post I want to cover some file tests for Azure Synapse Link for SQL Server 2022 that I performed.

A while back I spotted something interesting whilst doing some initial tests for Azure Synapse Link for SQL Server 2022.

When you add new data after the initial load, a new folder called ‘ChangeData’ appears in the storage account container. I noticed that the new file containing the insert was a comma-separated value (CSV) file, whereas the table used for the initial load was stored as a Parquet file.

Is there a method to this madness? Click through to see Kevin’s tell-all story.
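
To see it for yourself, here is a small sketch of inspecting both file formats from a Synapse notebook; the container path is a placeholder, and the exact landing-zone layout may differ from yours:

```python
# The initial load lands as parquet; later changes show up as CSV under
# the 'ChangeData' folder.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
base = "abfss://synapselink@mystorage.dfs.core.windows.net/dbo.Orders"  # placeholder

initial = spark.read.parquet(f"{base}/*.parquet")   # the snapshot
changes = spark.read.csv(f"{base}/ChangeData/")     # changes captured after it

initial.printSchema()
changes.show()
```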


What’s New in SynapseML

Nellie Gustafsson and Mark Hamilton share an update:

SynapseML is a massively scalable (feel free to spin up hundreds of machines!) machine learning library built on Apache Spark. SynapseML makes it easy to train production-ready models to solve problems from simple classification and regression to anomaly detection, translation, image analysis, speech-to-text, and just about any ML challenge you are facing. Under the hood, SynapseML integrates a wide array of ML technologies, such as LightGBM, Vowpal Wabbit, ONNX, and Cognitive Services, into a single easy-to-use API compatible with MLflow. We know, we know, everyone hates when developers invent new APIs, but you can rest easy because SynapseML integrates cleanly into existing Spark ML APIs, so you can embed models directly into existing pipelines. We strive to make SynapseML available to developers wherever they work, and the library is available in a variety of languages, including Python, Scala, Java, and R. As of this release, SynapseML is also usable from .NET languages such as C# and F#.

Saving the best language for last, I see. Click through for the list of updates.
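
One concrete reading of “integrates cleanly into existing Spark ML APIs” is a SynapseML estimator dropped into a standard Pipeline next to stock Spark stages. The data and column names below are invented for the sketch.

```python
# A SynapseML estimator behaves like any other Spark ML pipeline stage.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from synapse.ml.lightgbm import LightGBMRegressor

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(1.0, 2.0, 3.1), (2.0, 1.0, 4.2), (3.0, 0.5, 5.0), (4.0, 0.2, 6.1)],
    ["x1", "x2", "y"],
)

pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["x1", "x2"], outputCol="features"),
    LightGBMRegressor(labelCol="y", featuresCol="features"),
])
model = pipeline.fit(df)
model.transform(df).show()
```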


Database-Driven Parameterization for Synapse Pipelines

Paul Hernandez does some configuring:

Particularly in Synapse, there are not even global parameters like in Azure Data Factory.

When you want to move your development to another environment, CI/CD pipelines are typically used. These pipelines consume an ARM template together with its parameter file to create a workspace in a target environment. The parameters can be overridden in the CD pipeline, as explained here: https://techcommunity.microsoft.com/t5/data-architecture-blog/ci-cd-in-azure-synapse-analytics-part-4-the-release-pipeline/ba-p/2034434

Even so, I have not found a proper way to change the values of a pipeline parameter (and the same goes for data flow and dataset parameters). I have seen some custom parameter manipulation to set the default value of a parameter and then deploy it without any value, and even JSON manipulation with PowerShell (the dark side, for me).

Read on for an alternative solution which does the job well.
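
The general shape of database-driven configuration, though not necessarily Paul’s exact design, is a per-environment lookup table queried at runtime. A sketch with invented table and column names:

```python
# Fetch per-environment parameter values from a config table instead of
# baking them into the ARM template.
import pyodbc

CONFIG_CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=configdb;Database=Config;Trusted_Connection=yes;"  # placeholder
conn = pyodbc.connect(CONFIG_CONN_STR)

def get_parameter(environment: str, name: str) -> str:
    row = conn.execute(
        "SELECT ParameterValue FROM dbo.PipelineConfig "
        "WHERE Environment = ? AND ParameterName = ?;",
        environment, name,
    ).fetchone()
    if row is None:
        raise KeyError(f"{name} is not configured for {environment}")
    return row.ParameterValue

# In a Synapse pipeline, the same query would typically sit behind a
# Lookup activity feeding parameters to downstream activities.
print(get_parameter("prod", "SourceStoragePath"))
```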


Feeding Synapse Spark Info to On-Prem Kafka Clusters

Bhadreshkumar Shiyal finds a solution:

Microsoft’s official documentation for Azure Data Factory contains a tutorial which explains how to access an On-Premises SQL Server from Azure Data Factory which is inside a Managed Vnet. You can go through that article here: Access on-premises SQL Server from Data Factory Managed Vnet using Private Endpoint – Azure Data Fac….

Our approach is based upon the article’s solution, but to meet our requirements we substituted On-Prem Apache Kafka for On-Prem SQL Server and, instead of an Azure Data Factory inside a Managed Vnet, used a Synapse Workspace inside a Managed Vnet. The “Forwarding Vnet” concept explained in the above tutorial remains as-is in our approach.

As soon as you turn on Data Exfiltration Protection (DEP), the lockdown is real. Click through to see what the process of exfiltrating data through an approved mechanism looks like.
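
Once the private endpoint plumbing is in place, the Spark side is ordinary Structured Streaming. A minimal sketch, assuming the Kafka connector is available on the pool and using a placeholder broker address (reachable through the forwarding Vnet) and topic:

```python
# Stream events from a Synapse Spark pool to an on-prem Kafka topic.
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_json, struct

spark = SparkSession.builder.getOrCreate()

events = (
    spark.readStream.format("rate").load()   # stand-in source for the demo
    .select(to_json(struct("timestamp", "value")).alias("value"))
)

query = (
    events.writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "10.1.0.4:9092")  # placeholder on-prem broker
    .option("topic", "synapse-events")                   # placeholder topic
    .option("checkpointLocation", "/tmp/checkpoints/kafka-sink")
    .start()
)
query.awaitTermination(30)  # run briefly for the sketch
```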


Azure Synapse Analytics July 2022 Updates

Ryan Majidimehr notes that the Azure Synapse Analytics team has been busy:

Azure Synapse Link for SQL is an automated system for replicating data from your transactional databases into a dedicated SQL pool in Azure Synapse Analytics. Starting this month, you can make trade-offs between cost and latency in Synapse Link for SQL by selecting the continuous or batch mode to replicate your data.  

If you select “continuous mode,” the runtime runs continuously, so any changes applied to the SQL database or SQL Server are replicated to Synapse with low latency. Alternatively, if you select “batch mode” with a specified interval, changes applied to the SQL database or SQL Server are accumulated and replicated to Synapse in batches at that interval. This can save cost, as you are only charged for the time the runtime is required to replicate data. After each batch of data is replicated, the runtime is shut down automatically.

Click through for the complete list.
