Press "Enter" to skip to content

Category: Cloud

Querying Private Blob Storage Containers with Azure Synapse Analytics

Dennes Torres looks at some private information:

The queries from the previous article were made against the public container in the blob storage. However, if the container is private, you will need to authenticate with the container. In this article, you’ll learn how to query private blob storage with SQL.

NOTE: Be sure that the Azure Synapse Workspace and the storage account with the sample files are set up before following along with this article. You will also need to replace your storage account URL each time that a storage account URL is used in the article.

There are three possible authentication methods, and these methods may have some variation according to the type of storage account and the access configuration. I will not dig into details about storage here; I’ll leave that for a future article.

Read on for the three authorization methods and a lot of detail on using SAS tokens (the preferred method) to access this data.
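
As a rough illustration of the SAS-token approach, here is one common pattern against a Synapse serverless SQL pool. This is a sketch, not the exact steps from the article: the storage account, container, and file names are hypothetical, and the SAS token is a placeholder you would generate yourself.

```sql
-- Run in a user database on the serverless SQL pool (not master).
-- A master key is required once before creating credentials.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';

-- Store the SAS token (without the leading '?') as a credential.
CREATE DATABASE SCOPED CREDENTIAL SasCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = 'sv=2020-08-04&ss=b&srt=co&sp=rl&se=<expiry>&sig=<signature>';

-- Point an external data source at the private container.
-- 'mystorageaccount' and 'private-data' are hypothetical names.
CREATE EXTERNAL DATA SOURCE PrivateBlobStorage
WITH (
    LOCATION = 'https://mystorageaccount.blob.core.windows.net/private-data',
    CREDENTIAL = SasCredential
);

-- Query a CSV file in the private container.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'sales/2021.csv',
    DATA_SOURCE = 'PrivateBlobStorage',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',
    HEADER_ROW = TRUE
) AS rows;
```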

Combining Change Data Capture with Azure Data Factory

Reitse Eskens continues a series on learning Azure Data Factory:

In my last blog, I pulled all the data from my table to my datalake storage. But, when data changes, I don’t want to perform a full load every time. Because it’s a lot of data, it takes time, and somewhere down the line I’ll have to separate the changed rows from the identical ones. Instead of doing full loads every night or day or hour, I want to use a delta load. My pipeline should transfer only the new and changed rows. Very recently, Azure SQL DB finally added the option to enable Change Data Capture. This means that after a full load, I can get the changed records only. And by changed records, I mean the new ones, the updated ones, and the deleted ones.

Let’s find out how that works.

Read on for the article and demonstration.
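
For reference, the SQL-side pieces Reitse relies on look roughly like this; a minimal sketch against a hypothetical dbo.Orders table in Azure SQL DB (the ADF plumbing is what the post itself demonstrates):

```sql
-- Enable CDC at the database level, then for the table to track.
EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Orders',
    @role_name     = NULL;  -- no gating role, for simplicity

-- Later, pull only the rows that changed within an LSN window.
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();

-- __$operation: 1 = delete, 2 = insert, 4 = update (after image).
SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');
```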

Differences in Logging between Azure Analysis Services and Power BI PPU

Gilbert Quevauvilliers continues a series on migrating from Azure Analysis Services to Power BI Premium Per User:

Another important aspect when having datasets is being able to log and monitor performance. In this blog post I am going to compare the logging between Azure Analysis Services (AAS) and Power BI Premium Per User (PPU).

With the recent release of Log Analytics integration for PPU, it is a lot easier to compare the logging options between AAS and PPU.

This is an area where there’s still a bit of a gap. Click through to see what the differences look like today.

A Primer on Azure Kubernetes Service

Arun Sirpal gives us a brief introduction to Azure Kubernetes Service:

You have the ability to run these on-premises (complex) or in a cloud service, like AWS or Azure. Hence AKS, Azure Kubernetes Service, which helps reduce the complexity and operational overhead of managing Kubernetes by offloading much of that responsibility to Microsoft. You may be wondering how containers relate to this. It was something on my mind when I first entered into this technology. Remember that containers are the next step beyond traditional virtualisation; you can run SQL Server on Linux in containers, as an example. I then look at AKS as the “management” layer of the container solution, carrying out tasks such as scheduling, scaling, health, load balancing, and host management.

Click through for more information.

Managing Spatial Data in Azure

Rolf Tesmer takes us through the different Azure services which offer some ability to work with spatial data:

Every now and then you come across a use-case where you need to do something with spatial data, and you need to do it in the cloud (Azure, of course)! Up until that very point you maybe didn’t know, or perhaps even care, much about the intricacies of spatial data assets, let alone how the heck you were going to store it, process it, and query it, without making a mess of your current data stack.

Well, if you’re that person, then I say welcome to this blog post!

Click through for a fairly lengthy list, including Rolf’s comments on each. Also note the one big omission from the list as far as data platform products go.
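
As a small taste of what one of those services offers, Azure SQL DB ships the same geography type as on-premises SQL Server. A minimal sketch with made-up landmark data:

```sql
-- Store points with the geography type (SRID 4326 = WGS 84).
CREATE TABLE dbo.Landmarks (
    Id       int IDENTITY PRIMARY KEY,
    Name     nvarchar(100),
    Position geography
);

INSERT INTO dbo.Landmarks (Name, Position)
VALUES (N'Sydney Opera House',    geography::Point(-33.8568, 151.2153, 4326)),
       (N'Sydney Harbour Bridge', geography::Point(-33.8523, 151.2108, 4326));

-- STDistance on geography returns metres.
SELECT a.Name, b.Name,
       a.Position.STDistance(b.Position) AS DistanceInMetres
FROM dbo.Landmarks a
JOIN dbo.Landmarks b ON a.Id < b.Id;
```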

Renaming a YAML Pipeline in Azure DevOps

Hamish Watson figures out what’s in a name:

I had created a pipeline using YAML – which was called InfrastructureAsCode as the YAML file was in the root directory.

However, I wanted to move it into a folder .\InfrastructureAsCode\pipelines\… and run the YAML file from there, as I would have non-prod and PROD versions of it (the schedule was different for each).

Click through to see how Hamish was able to resolve this.

Case-Insensitive Collations in Redshift

Mengchu Cai, et al, show us how to change collation with Redshift:

Amazon Redshift is a fast, fully managed, cloud-native data warehouse. Tens of thousands of customers have successfully migrated their workloads to Amazon Redshift. We hear from customers that they need case-insensitive collation for strings in Amazon Redshift in order to maintain the same functionality and meet their performance goals when they migrate their existing workloads from legacy, on-premises data warehouses like Teradata, Oracle, or IBM. With that goal in mind, AWS provides an option to create case-insensitive and case-sensitive collation.

In this post, we discuss how to use case-insensitive collation and how to override the default collation. Also, we specifically explain the process to migrate your existing Teradata database using the native Amazon Redshift collation capability.

Specifically, it appears that they have two collations exposed: one which is case-sensitive and the other which is case-insensitive.
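
A quick sketch of how those two collations surface in practice, with hypothetical object names (the post itself goes much deeper, including the Teradata migration process):

```sql
-- Set the default collation when creating the database.
CREATE DATABASE sampledb COLLATE CASE_INSENSITIVE;

-- Columns inherit the database default unless overridden.
CREATE TABLE users (
    first_name varchar(50) COLLATE case_sensitive,  -- explicit override
    last_name  varchar(50)                          -- case-insensitive default
);

-- 'Smith' and 'SMITH' compare equal under the case-insensitive default.
SELECT COUNT(*) FROM users WHERE last_name = 'SMITH';

-- The COLLATE function overrides collation for a single expression.
SELECT COUNT(*) FROM users
WHERE COLLATE(last_name, 'case_sensitive') = 'SMITH';
```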

SQL Server on Azure Container Instances

Arun Sirpal has a series for us. Part 1 involves spinning up SQL Server on ACI:

This is Microsoft’s serverless technology, which allows us to deploy containers without having to worry about managing the underlying hardware. It’s a way to get access to SQL Server fast (faster than traditional methods like installing a virtual machine) to do things like test code fixes, etc.

There are a couple of ways of doing this: you can use the portal, PowerShell, or the Azure CLI. I actually like the Azure CLI.

Part 2 gives you an idea of what you get:

In the last post, we built an image of SQL Server 2019 on Linux hosted in an Azure Container Instance for fast access to SQL Server. So, your next question is probably: let’s see some database action?

When you connect via SSMS, it’s no different; the look and feel is SQL Server. Let’s have a tour.

The normal warning with Azure Container Instances is that they’re great for development and testing efforts (in part because of how inexpensive they are compared to alternatives on Azure) but they won’t have the same uptime or high availability guarantees that a service like Azure Kubernetes Service will have.
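
Once you’re connected from SSMS, a couple of quick queries confirm you really are talking to the Linux-hosted instance; nothing here is specific to ACI, just standard SQL Server 2017+ DMVs:

```sql
-- The version string names the OS the engine is running on.
SELECT @@VERSION AS VersionString;

-- host_platform returns 'Linux' for container images like this one.
SELECT host_platform, host_distribution, host_release
FROM sys.dm_os_host_info;
```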

Optimizing BERT Models on Google Colab

Kevin Jacobs fine-tunes some NLP processes:

BERT is a language model and can thus be used for predicting the next word in a sentence. Furthermore, BERT can be used for automatic summarization, text classification, and many more downstream tasks. Google Colab provides you with a cloud-based environment in which you can train your machine learning models on a GPU. The downside is that your data is uploaded to the Google cloud. Google Colab gives you the opportunity to fine-tune BERT.

Click through to see how.
