Press "Enter" to skip to content

Category: Synapse Analytics

Query Performance Insights on the Serverless SQL Pool

Jovan Popovic shows how you can use the QPI library on an Azure Synapse Analytics serverless SQL pool:

You can find more of the best practices here. These best practices are very important because some issues might cause performance degradation. You might be surprised how applying some of these best practices might improve the performance of your workload.

The last item that is related to schema optimization is sometimes hard to check. You would need to look at your schema, inspect all columns and find what to optimize. If you have a large schema, this might not be an easy task. But you can make your life easier if you use the QPI helper library that can detect schema issues for you.

Read on to see what it can find.
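
To give a sense of the sort of thing the library automates, here is a hand-rolled metadata query. This is not QPI's own API, just a plain INFORMATION_SCHEMA check that flags one common serverless schema smell: string columns declared as (MAX) or at the full 8000-character width.

-- A minimal sketch, not part of the QPI library: flag string columns
-- declared as (MAX) or at the maximum 8000-character width, a common
-- serverless SQL pool schema anti-pattern.
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS
WHERE DATA_TYPE IN ('varchar', 'nvarchar')
  AND CHARACTER_MAXIMUM_LENGTH IN (-1, 8000);  -- -1 means (MAX)

QPI packages checks along these lines so you don't have to write and maintain them yourself.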

Comments closed

Building posexplode() in the Serverless SQL Pool

Jovan Popovic rides to the rescue with JSON:

The array cells are pivoted and returned as simple scalar columns. Now you can simply use WHERE or GROUP BY clauses to filter or summarize information by array element values. Another very useful piece of information might be the index of every element (generated as pos column).

Spark enables you to use the posexplode() function on every array cell. The posexplode() function will transform a single array element into a set of rows where each row represents one value in the array and the index of that array element. As a result, one row with the array containing three elements will be transformed into three rows containing scalar cells. This flattened/normalized representation is much easier for the analysis.

Once the array is flattened and normalized, you can easily analyze the data and find how many people know SQL or Java.

Read on to see how you can implement the equivalent of POSEXPLODE() using OPENJSON() in the Azure Synapse Analytics serverless SQL pool.
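
For context, the trick hinges on a documented behavior of OPENJSON(): run it over a JSON array without a WITH clause and the [key] column holds the zero-based index of each element, which maps directly to posexplode()'s pos output. A minimal sketch with made-up inline data (the real post works against files in the data lake):

-- Hypothetical sample data; id and skills are illustrative column names.
WITH people AS (
    SELECT * FROM (VALUES
        (1, N'["SQL","Java","Python"]'),
        (2, N'["SQL"]')
    ) AS v(id, skills)
)
SELECT p.id,
       s.[key]   AS pos,    -- zero-based position of the element in the array
       s.[value] AS skill   -- the element value itself
FROM people AS p
CROSS APPLY OPENJSON(p.skills) AS s;

From there, a GROUP BY on skill answers the "how many people know SQL or Java" question from the excerpt.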

Comments closed

Executing SQL Statements in Azure Data Factory

Abhishek Narain announces a pretty nice improvement to Azure Data Factory and Synapse Pipelines:

We are introducing a Script activity in pipelines that provides the ability to execute single or multiple SQL statements.

Using the script activity, you can execute common operations with Data Manipulation Language (DML), and Data Definition Language (DDL). DML statements like SELECT, UPDATE, and INSERT let users retrieve, store, modify, delete, insert and update data in the database. DDL statements like CREATE, ALTER, and DROP allow a database manager to create, modify, and remove database objects such as tables, indexes, and users.

Be sure to read the limitations at the bottom, however.
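
As an illustration of the DDL-plus-DML mix the announcement describes, here is the kind of batch you might paste into a Script activity pointed at a SQL database; the table, column, and pipeline names are made up:

-- Hypothetical audit table and insert, run as a single Script activity batch.
IF OBJECT_ID('dbo.LoadAudit') IS NULL
    CREATE TABLE dbo.LoadAudit (
        LoadId       INT IDENTITY(1,1) PRIMARY KEY,
        PipelineName VARCHAR(200) NOT NULL,
        LoadedAtUtc  DATETIME2    NOT NULL DEFAULT SYSUTCDATETIME()
    );

INSERT INTO dbo.LoadAudit (PipelineName)
VALUES ('CopySalesToStaging');

SELECT COUNT(*) AS TotalLoads FROM dbo.LoadAudit;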

Comments closed

Delta Lake Operability in Azure Synapse Analytics

James Serra lets us know when and where we can use Delta Lake within Azure Synapse Analytics:

Many companies are seeing the value in collecting data to help them make better business decisions. When building a solution in Azure to collect the data, nearly everyone is using a data lake. A majority of those are also using delta lake, which is basically a software layer over a data lake that gives additional features. I have yet to see anyone using competing technologies to delta lake in Azure, such as Apache Hudi or Apache Iceberg (see A Thorough Comparison of Delta Lake, Iceberg and Hudi and Open Source Data Lake Table Formats: Evaluating Current Interest and Rate of Adoption).

Read on for more information.
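
As one concrete example of the operability James covers, a serverless SQL pool can read a Delta table directly via OPENROWSET with FORMAT = 'DELTA'. The storage account, container, and folder below are placeholders:

-- A minimal sketch: query a Delta Lake folder from a serverless SQL pool.
SELECT TOP 10 *
FROM OPENROWSET(
        BULK 'https://<storageaccount>.dfs.core.windows.net/<container>/sales/',
        FORMAT = 'DELTA'
     ) AS rows;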

Comments closed

Scoring Azure ML Models in Azure Synapse Analytics

Alex Aleksandrov shows off the PREDICT operator:

We can use Synapse for many activities. We can use it not only for ingesting, querying, storing and visualising data, but for developing machine learning models as well. Of course, one can say that doing data science is another functionality of this platform and this is definitely true. However, in this article, I would like to show you that instead of using Python, one can use T-SQL for doing predictions.

Click through to see how.
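
For a rough idea of the shape of it, the T-SQL PREDICT syntax looks like the following; the model table, input table, and output column here are hypothetical, and the specifics depend on how your ONNX model was trained and stored:

-- A sketch of scoring rows with PREDICT against a model stored in a table.
SELECT d.*, p.Score
FROM PREDICT(
        MODEL   = (SELECT Model FROM dbo.Models WHERE ModelName = 'ChurnModel'),
        DATA    = dbo.CustomerFeatures AS d,
        RUNTIME = ONNX
     ) WITH (Score FLOAT) AS p;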

Comments closed

Creating an SSIS Integration Runtime in Synapse

Andy Leonard shows one way to create an Azure-SSIS (SQL Server Integration Services) integration runtime for Azure Synapse Analytics:

On 17 Feb 2022, I first saw the Microsoft announcement of the public preview of Azure-SSIS integration runtimes in Azure Synapse Analytics. I blogged about the announcement in a post titled Azure-SSIS Integration Runtime now available in Azure Synapse Analytics.

I am excited to share one way for you to provision an Azure-SSIS IR in Synapse Analytics, following these steps. To start provisioning a shiny new Azure Synapse Analytics Azure-SSIS integration runtime, open Synapse Studio:

Read on for the step-by-step guide.

Comments closed

Deploying an Azure Synapse Analytics Workspace

Rajendra Gupta builds out an Azure Synapse Analytics workspace:

In the article, An Overview of the Azure Synapse Analytics, we explored the Azure Synapse workspace and its features as an analytics service combining Big data analytics and enterprise data warehousing.

This article is a practical demonstration of deploying Azure Synapse Analytics workspace using the Azure portal.

Click through for step-by-step instructions on how to do it.

Comments closed

SSIS Integration Runtimes in Synapse

Andy Leonard heard it on the grapevine:

My first response was – and I quote – “WOO HOO!” It’s good to see SSIS getting some love.

A couple of years ago, someone claimed SSIS was dying. I first checked it out, then blogged about it in a post titled SSIS is Not Dead (Or Dying). It's been a couple of years and SSIS is not dead. One could say that SSIS functionality being added to Azure Synapse, arguably Azure's flagship offering, is the opposite of dying.

I’m not sure I’m as sanguine as Andy is about the future of SSIS but I will say at the very least I agree that it’s not going anywhere anytime soon.

Comments closed

Multivariate Anomaly Detection in SynapseML

Louise Han has an announcement:

Today, we are excited to announce a wonderful collaboration between Multivariate Anomaly Detector and SynapseML, which together provide a solution for developers and customers to do multivariate anomaly detection in Synapse. This new capability allows you to detect anomalies quickly and easily in very large datasets and databases, perfectly lighting up scenarios like equipment predictive maintenance. For those who are not familiar with predictive maintenance, it is a technique that uses data analysis tools and techniques to detect anomalies in the operation and possible defects in equipment and processes so customers can fix them before they result in failure. Therefore, this new capability will benefit customers who have a huge amount of sensor data from hundreds of pieces of equipment, helping them with equipment monitoring, anomaly detection, and even root cause analysis.

Click through for more details and a demonstration on how to use it.

Comments closed