Press "Enter" to skip to content

Category: Cloud

Monitoring Azure Synapse Analytics SQL Pools with Power BI

Brett Powell has a pair of Power BI templates for monitoring Azure Synapse Analytics:

Upon clicking ‘Load’, you’ll either need to provide your credentials for this source (if you don’t have this data source saved from previous use), or the queries will execute and the following report pages will be available:

– Executions
– Waits
– Sessions
– Waits Detail
– Execution Detail
– Memory
– ExecutionDrillThrough (hidden)

Click through to see what the templates look like and how to obtain them.


Using Postman with Power BI’s REST API

David Eldersveld takes us through the Power BI REST API:

Postman is a valuable tool to work with APIs, especially when testing and making ad hoc requests outside of an automated production solution. In terms of where a Power BI developer may find Postman useful, it sits somewhere between the documentation’s “Try It” functionality and a more production-worthy solution incorporating tools like Azure DevOps, Logic Apps/Power Automate, a Power BI custom connector, etc.

The ideas in this post extend an original post from Carl de Souza. Carl shows how to obtain an OAuth2 access token but does so with hardcoded values. Additional API requests use the token from the original response, but he also manually provides this token to those subsequent API calls.

David has a clever technique for getting the bearer token, so check it out.
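If you want to script that same flow outside of Postman, the two calls are easy to reproduce. Here is a rough Python sketch of the client credentials grant against Azure AD, followed by a Power BI REST API call using the resulting bearer token; the tenant ID, client ID, and secret are placeholders, and your service principal needs to be granted API access in the Power BI tenant settings:

    import requests

    TENANT_ID = "<your-tenant-id>"      # placeholders -- substitute your own values
    CLIENT_ID = "<your-app-client-id>"
    CLIENT_SECRET = "<your-app-secret>"

    # Step 1: request an OAuth2 access token via the client credentials grant.
    token_url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
    token_resp = requests.post(token_url, data={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://analysis.windows.net/powerbi/api/.default",
    })
    token_resp.raise_for_status()
    access_token = token_resp.json()["access_token"]

    # Step 2: pass the bearer token to a subsequent Power BI REST API call.
    groups = requests.get(
        "https://api.powerbi.com/v1.0/myorg/groups",
        headers={"Authorization": f"Bearer {access_token}"},
    )
    print(groups.json())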


Project Metamorphosis: Elastic Kafka Clusters

Jay Kreps explains what Confluent has been up to lately:

What is Project Metamorphosis?

Let me try to explain. I think there are two big shifts happening in the world of data right now, and Project Metamorphosis is an attempt to bring those two things together.

The first one, and the one that Confluent is known for, is the move to event streaming.

Event streams are a real revolution in how we think about and use data, and we think they are going to be at the core of one of the most important data platforms in a modern company. Our goal at Confluent is to build the infrastructure that makes that possible and help the world take advantage of it. That’s why we exist.

But event streaming isn’t the only paradigm shift we’re in the midst of. The other change comes from the movement to the cloud.

Click through for the high-level. I can see this even more directly competing with Kinesis and Event Hubs.


Polygon-Based Spatial Searches with Cosmos DB

Hasan Savran continues a series on spatial data in Cosmos DB:

I want to continue to develop our new map application for Azure Cosmos DB. So far, we can run a custom spatial query in Cosmos DB and display the results on a map. I want my users to create a polygon on the map and search for data under this polygon. If you are familiar with Zillow, that is how their application lets you look for houses to buy or rent. You select an area, and Zillow displays all available houses or rentals in the area you draw. It is an extremely useful and user-friendly search.

Click through to see how Hasan does it, as well as how he gets around a coordinate ordering problem.
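For a sense of what the polygon search looks like from code, here is a minimal Python sketch using the azure-cosmos SDK and the built-in ST_WITHIN function; the account details, database, container, and location property names are invented for illustration. Note that GeoJSON lists coordinates as [longitude, latitude], which is exactly the ordering problem mentioned above:

    from azure.cosmos import CosmosClient

    # Account URI and key are placeholders; names below are invented.
    client = CosmosClient("<account-uri>", credential="<account-key>")
    container = client.get_database_client("MapDb").get_container_client("Places")

    # GeoJSON polygons list points as [longitude, latitude] -- not lat/long --
    # and the ring must close by repeating the first point at the end.
    polygon = {
        "type": "Polygon",
        "coordinates": [[
            [-84.50, 33.70], [-84.30, 33.70],
            [-84.30, 33.85], [-84.50, 33.85],
            [-84.50, 33.70],
        ]],
    }

    # ST_WITHIN returns documents whose geometry falls inside the polygon.
    items = container.query_items(
        query="SELECT * FROM c WHERE ST_WITHIN(c.location, @polygon)",
        parameters=[{"name": "@polygon", "value": polygon}],
        enable_cross_partition_query=True,
    )
    for item in items:
        print(item["id"])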


Accessing Azure Queue Storage from R

Hong Ooi announces a new package for R:

This post is to announce that the AzureQstor package is now on GitHub. AzureQstor provides an R interface to Azure queue storage, building on the facilities provided by AzureStor.

Queue Storage is a service for storing large numbers of messages, for example from automated sensors, that can be accessed remotely via authenticated calls using HTTP or HTTPS. A single queue message can be up to 64 KB in size, and a queue can contain millions of messages, up to the total capacity limit of a storage account. Queue storage is often used to create a backlog of work to process asynchronously.

Hong includes a couple of demos as well, so check it out.
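Hong's demos are in R, but the backlog-of-work pattern described above looks much the same in any SDK. As a point of comparison, here is a short sketch of the equivalent flow with the Python azure-storage-queue library (the connection string and queue name are placeholders):

    from azure.storage.queue import QueueClient

    # Connection string and queue name are placeholders.
    queue = QueueClient.from_connection_string("<connection-string>", queue_name="work-items")
    queue.create_queue()  # raises ResourceExistsError if it already exists; guard in real code

    # Producer side: enqueue a backlog of work items (up to 64 KB each).
    queue.send_message("process file 42")

    # Consumer side: pull messages, do the work, then delete them.
    for msg in queue.receive_messages():
        print("working on:", msg.content)
        queue.delete_message(msg)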


Displaying Cosmos DB Spatial Data with .NET Core

Hasan Savran builds up a quick .NET Core app to retrieve spatial data from Cosmos DB and display it:

Cosmos DB stores geospatial data in GeoJSON format. You cannot tell what raw GeoJSON represents because usually all it has is a type and a bunch of coordinates. Azure Cosmos DB does not have any UI to show you what GeoJSON data looks like on a map, either. The only options you have are a third-party tool which might display the data on a map, or Azure Cosmos DB Jupyter Notebooks.

I want to run a query in Azure Cosmos DB and see the results on a map. I decided to create a simple UI which displays spatial data on a map. I will show you how to do this step by step. I will use LeafletJS as the map. It is open source and free! Also, I need to create a .NET Core 3.1 web application and use the Azure Cosmos DB Emulator for data.

Hasan walks us through the demo and promises to put the code on GitHub later.
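One practical wrinkle in a UI like this: Cosmos DB documents hold bare GeoJSON geometries, while Leaflet's L.geoJSON() layer wants Feature objects. Here is a short Python sketch of that server-side reshaping, with the document shape and property names invented for illustration:

    import json

    # Suppose each Cosmos DB document carries a GeoJSON geometry in a
    # "location" property (the document shape here is invented).
    docs = [
        {"id": "1", "name": "Site A", "location": {"type": "Point", "coordinates": [-84.39, 33.75]}},
        {"id": "2", "name": "Site B", "location": {"type": "Point", "coordinates": [-84.41, 33.77]}},
    ]

    # Wrap the raw geometries in a FeatureCollection, which Leaflet's
    # L.geoJSON() layer can render directly on the client.
    feature_collection = {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "geometry": d["location"], "properties": {"name": d["name"]}}
            for d in docs
        ],
    }
    print(json.dumps(feature_collection, indent=2))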


Getting Started with MySQL in Azure

Chris Hyde tries out Azure’s MySQL Platform-as-a-Service offering:

I started out by setting up a dedicated resource group to use for my instance, and then used the Azure Portal GUI to create a new instance named mysql-20200505. I made sure to downgrade from the default General Purpose configuration to Basic, so it will only cost me about $67 a month if I leave it running instead of around $350. After the instance was created successfully I then added some connection security rules to ensure that only my IP was able to connect to it.

I then opened up MySQL Workbench to connect to the server as pictured below. Of course it took me two tries to connect, as I made my usual error of not including the machine name in the username field the first time around.

Click through for Chris’s early tests.
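Chris's username gotcha is worth spelling out: the single-server flavor of Azure Database for MySQL expects logins in user@servername form. A quick Python sketch with mysql-connector-python shows the shape of a working connection; the admin login, password, and certificate path are placeholders:

    import mysql.connector

    # Azure Database for MySQL (single server) expects the login in
    # user@servername form -- the exact detail that caused the second try.
    conn = mysql.connector.connect(
        host="mysql-20200505.mysql.database.azure.com",
        user="myadmin@mysql-20200505",             # admin login is a placeholder
        password="<your-password>",
        database="mysql",
        ssl_ca="BaltimoreCyberTrustRoot.crt.pem",  # Azure enforces SSL by default
    )
    cur = conn.cursor()
    cur.execute("SELECT VERSION()")
    print(cur.fetchone())
    conn.close()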


Query Acceleration for Blob Storage and Data Lake Gen2

James Serra takes us through Query Acceleration for Azure Blob Storage and Azure Data Lake Storage Gen2:

Just announced is Query Acceleration for Azure Data Lake Storage Gen2 (ADLS) as well as Blob Storage. This is a new capability for ADLS that enables applications and analytics frameworks to dramatically optimize data processing by retrieving only the data that they require to perform a given operation from storage. This reduces the time and processing power that is required to query stored data.

For example, if an application executes a SELECT statement that filters columns and rows from a CSV file, instead of pulling the entire CSV file over the network into the application and then filtering the data, it will do the filtering at the time the data is read from the disk, so that only the filtered data is transferred over the network to the application. So if you have a CSV file with 50 columns and 1 million rows, but the filters limit the data to 5 columns and 1000 rows, then only those 5 columns and 1000 rows will be retrieved from the disk and sent over the network to the application.

Click through to learn more, including current libraries which support this and information on the additional cost. I’d really like to see PolyBase support this, as it would alleviate one of the problems with using Blob Storage + PolyBase: the need to pull all of that data down to your SQL Server instance before doing any filtering.
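The Azure SDKs surface this capability as a "quick query" call. As a rough illustration, here is a hedged Python sketch using azure-storage-blob; the connection string, container, blob name, and column position are placeholders, and the exact SQL dialect (including the _N positional column references) is spelled out in the query acceleration documentation:

    from azure.storage.blob import BlobClient, DelimitedTextDialect

    # Connection string, container, and blob name are placeholders.
    blob = BlobClient.from_connection_string(
        "<connection-string>", container_name="data", blob_name="big.csv"
    )

    # The filter is pushed down to the storage service, so only matching
    # rows and columns travel over the network.
    reader = blob.query_blob(
        "SELECT _1, _3 FROM BlobStorage WHERE _3 > 100",  # _N = positional columns
        blob_format=DelimitedTextDialect(delimiter=",", has_header=False),
    )
    print(reader.readall().decode("utf-8"))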


Workload Classification with Resource Governor in Azure Synapse Analytics

Niko Neugebauer keys in on an interesting addition to Azure Synapse Analytics:

Given that we can specify 5 different parameters (USER MEMBERNAME, ROLE MEMBERNAME, WLM_LABEL, WLM_CONTEXT, START_TIME/END_TIME) – there must be a prioritisation mechanism in order to decide which condition gets selected. This mechanism is called Parameter Weighting in Azure Synapse and it assigns the following weight to each of those parameters:
USER = 64
ROLE = 32
WLM_LABEL = 16
WLM_CONTEXT = 8
START_TIME/END_TIME = 4
meaning that if a Workload Classifier matches on the START_TIME/END_TIME timeframe, WLM_LABEL, and ROLE, it will receive 52 points (4 + 16 + 32),
while a different Workload Classifier that matches on WLM_CONTEXT and USER will get 72 points (8 + 64) and thus will prevail, being selected over the first Workload Classifier.

Azure Synapse Analytics (including when it was known as SQL Data Warehouse) has had some resource governor-related things I’ve wanted in the box product for a while, including labels (which are better than using application name).
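To make the weighting arithmetic concrete, here is a tiny Python sketch of the scoring Niko describes. The interesting property is that matching on fewer, heavier parameters can beat matching on more, lighter ones:

    # Parameter weights as Niko lists them.
    WEIGHTS = {
        "USER": 64,
        "ROLE": 32,
        "WLM_LABEL": 16,
        "WLM_CONTEXT": 8,
        "START_TIME/END_TIME": 4,
    }

    def classifier_score(matched_params):
        """Sum the weights of the parameters a classifier matched on."""
        return sum(WEIGHTS[p] for p in matched_params)

    # Niko's two examples: the second classifier wins despite matching fewer parameters.
    a = classifier_score(["START_TIME/END_TIME", "WLM_LABEL", "ROLE"])  # 4 + 16 + 32 = 52
    b = classifier_score(["WLM_CONTEXT", "USER"])                       # 8 + 64 = 72
    print(a, b, "-> second classifier prevails" if b > a else "-> first classifier prevails")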


Replicating SQL’s IN Operator with Azure Data Factory

Rayis Imayev shows how we can find values in a group using Azure Data Factory:

However, only this use case for the OR function with two conditions could be possible: or(equals(variables(‘var1’), ‘A’), equals(variables(‘var1’), ‘B’)) – a limit of two conditions.

But what if we had the ability to check whether a particular value (a variable, parameter, or other ADF object) belongs to a range of values (an array of values), similar to what we can do with the IN operator in SQL? This would definitely solve our problem and remove the limitation on the number of logical conditions to check.

Click through for the answer.
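One generic workaround worth knowing before you click (and not necessarily Rayis's answer): or() only accepts two arguments, but it nests, so an IN over N values can be unrolled into N-1 nested or() calls. A small Python sketch that generates such an expression:

    def adf_in_expression(variable, values):
        """Emulate SQL's IN by nesting ADF's two-argument or() function.

        For ("var1", ["A", "B", "C"]) this returns:
        or(equals(variables('var1'), 'A'), or(equals(variables('var1'), 'B'), equals(variables('var1'), 'C')))
        """
        equals_exprs = [f"equals(variables('{variable}'), '{v}')" for v in values]
        expr = equals_exprs[-1]
        for e in reversed(equals_exprs[:-1]):
            expr = f"or({e}, {expr})"
        return expr

    print(adf_in_expression("var1", ["A", "B", "C"]))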
