Press "Enter" to skip to content

Category: Synapse Analytics

Folders in Azure Synapse Analytics


Wolfgang Strasser checks out a small but helpful addition to Azure Synapse Analytics
:

Good morning, day, afternoon or night – wherever and whenever you read this blog post! My day started with a nice surprise when I connected to one of our Azure Synapse workspaces …

Sometimes, it’s those little things that make (development) life easier – you can now add folders to structure the list of development artefacts in Synapse:

Read on to see how, including how you can bring order to the chaos of existing Synapse Analytics workspaces.

Comments closed

Querying Data Lake Files in Power BI through Synapse Analytics

Wolfgang Strasser shows us how to integrate Azure Synapse Analytics and Power BI:

Sometimes however, would not it be nice to access the data lake in Direct Query mode – to get the most up to date information for every report view? I would say: yes … but how can you achieve this? The options natively provided by ADLS Gen2 and Power BI are not sufficient to solve this requirement. But: there are options to achieve this and, in this post, I would like to show you the possibilities using Azure Synapse Analytics to build a query layer on top of a ADLS Gen2 storage account.

Click through for a step-by-step walkthrough.

Comments closed

Caching and Statistics in Synapse Dedicated SQL Pools

Tsuyoshi Matsuzaki takes us through statistics and caching in Azure Synapse Analytics Dedicated SQL Pools:

In Synapse Analytics, several database objects (such as, compiled procedure, plan, …) will be cached in some conditions.
For instance, CCI tables (see my previous post “Azure Synapse Analytics : Choose Right Index and Partition” for CCI) will locally cache the recently-used columnstore segments on distributed compute nodes, which is called columnar cache. The local disk-based cache is used on Gen2 caching.
You cannot manually control these caching activities. (These are automatically applied to improve performance in Synapse Analytics.) See team blog “Adaptive caching powers Azure SQL Data Warehouse performance gains” for underlying architecture which improves caching in Gen2.

Dedicated SQL Pool behavior is close enough to on-premises SQL Server that it’s easy to expect everything to be the same, but there are some nuances.

Comments closed

External Tables vs T-SQL Views in Synapse

James Serra explains the differences between external tables and T-SQL views in Azure Synapse Analytics when querying from Data Lake Storage:

A question that I have been hearing recently from customers using Azure Synapse Analytics (the public preview version) is what is the difference between using an external table versus a T-SQL view on a file in a data lake?

Note that a T-SQL view and an external table pointing to a file in a data lake can be created in both a SQL Provisioned pool as well as a SQL On-demand pool.

Here are the differences that I have found:

Click through for the differences.

Comments closed

Getting Started with Azure Synapse Analytics

John Macintyre shares some Azure Synapse Analytics samples:

To further accelerate time to insight in Microsoft Azure Synapse Analytics, we are introducing the Knowledge center to simplify access to pre-loaded sample data and to streamline the getting started process for data professionals. You can now create or use existing Spark and SQL pools, connect to and query Azure Open Datasets, load sample scripts and notebooks, access pipeline templates, and tour the Azure Synapse Studio—all from one place.

Click through for details on the samples.

Comments closed

Self-Service with Azure Synapse Analytics

Paul Andrew lays out an interesting idea:

I’ve been playing around with Azure Synapse Analytics for a while now exploring the preview features and trying to find a meaningful use case for the ‘single pane of glass’ capabilities. In this post I’m exploring one possible option/idea for creating a very simple self service approach to dataset ingestion and consumption. Full disclosure, the below is far from technical perfection for lots of reasons, I mainly wanted to put something out there as an idea and use it to maybe start a conversation.

Click through to see Paul’s take on the matter.

Comments closed

Choosing the Right Index and Partition in Dedicated SQL Pools

Tsuyoshi Matsuzaki gives us some advice on indexing and partitioning data in Azure Synapse Analytics dedicated SQL pools:

Designing index for a table is so primitive and important for better performance.
There’s no “one answer for any case”. You should choose right index for a table depending on the size, usage, query patterns, and cardinality.

In order to help you understand pros/cons in each indexes, I’ll show you each pictures illustrating intuitive structures of indexes available in Synapse Analytics.

Because dedicated SQL pools aren’t the same as the SQL Server box product, it’s important to go in with the understanding that indexing won’t be exactly the same as on-premises or in Azure SQL Database.

Comments closed

Azure Synapse Analytics Query Options

James Serra has a breakdown of what can query what in Azure Synapse Analytics:

The public preview version of Azure Synapse Analytics has three compute options and four types of storage that it can access (mentioned in my blog at SQL on-demand in Azure Synapse Analytics). This gives twelve possible combinations of querying data. Not all of these combinations currently are supported and some have a few quirks of which I list below.

Read on for a table which breaks down current functionality as well as expected GA functionality.

Comments closed