Category: Cloud

Creating Functions in Kusto Queries

Dennes Torres continues a series on working with the Kusto language:

In the previous blog post, I illustrated how to create sub-queries in Kusto.

However, sometimes we face even more complex situations where we need to create not just a sub-query, but a function.

Another way to think about a function inside a Kusto query is as a parameterized sub-query.

Read on for an example of creating a Kusto function.
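
If you want to experiment with the idea first, here's a minimal sketch (mine, not Dennes's) of a query-scoped function defined with let, executed through the azure-kusto-data Python SDK against Microsoft's public help cluster and its StormEvents sample table:

```python
# A query-scoped function ("parameterized sub-query") defined with let,
# run via the azure-kusto-data SDK against Microsoft's public demo cluster.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://help.kusto.windows.net"
kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(cluster)
client = KustoClient(kcsb)

query = """
// The function takes a state name and returns event counts by type.
let EventCountByType = (state: string) {
    StormEvents
    | where State == state
    | summarize EventCount = count() by EventType
};
EventCountByType('TEXAS') | top 5 by EventCount
"""

response = client.execute("Samples", query)
for row in response.primary_results[0]:
    print(row["EventType"], row["EventCount"])
```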

Reviewing Azure Purview Data Catalog Features

Angela Henry continues a series on Azure Purview:

The Data Catalog portion of Purview is where most people will spend their time. It provides information about your organization's data assets in a searchable format. Depending on which level of Data Catalog you choose, you can also access a business glossary, lineage visualization, catalog insights, and sensitive data identification insights. This article will focus on the three different levels available within Data Catalog and offers scenarios demonstrating when you would use each offering.

Read on for the three tiers, all of which are currently free, though that won't stay the case.

Moving Data from Confluent Cloud to Cosmos DB

Nathan Ham announces the Azure Cosmos DB sink connector in Confluent Cloud:

Today, Confluent is announcing the general availability (GA) of the fully managed Azure Cosmos DB Sink Connector within Confluent Cloud. Now, with just a few simple clicks, you can link the power of Apache Kafka® together with Azure Cosmos DB to set your data in motion.

Click through for a marketing-heavy look at how this works.
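
For a rough sense of what the setup involves, here's an illustrative sketch of submitting a sink connector config through the Confluent Cloud Connect REST API from Python. The config keys follow the general shape of the Cosmos DB sink connector documentation, but the key names, endpoint, and IDs here are assumptions; check the current docs before relying on any of them:

```python
# Illustrative only: connector config keys and the API path are assumptions
# based on Confluent's documented patterns, not verified against this release.
import requests

connector_config = {
    "connector.class": "CosmosDbSink",
    "kafka.auth.mode": "KAFKA_API_KEY",
    "kafka.api.key": "<kafka-api-key>",
    "kafka.api.secret": "<kafka-api-secret>",
    "topics": "orders",
    "input.data.format": "JSON",
    "connect.cosmos.connection.endpoint": "https://myaccount.documents.azure.com:443/",
    "connect.cosmos.master.key": "<cosmos-primary-key>",
    "connect.cosmos.databasename": "sales",
    "connect.cosmos.containers.topicmap": "orders#orders",  # topic#container
    "tasks.max": "1",
}

# Submit via the Confluent Cloud Connect API, authenticated with a cloud API key.
resp = requests.post(
    "https://api.confluent.cloud/connect/v1/environments/<env-id>/clusters/<cluster-id>/connectors",
    auth=("<cloud-api-key>", "<cloud-api-secret>"),
    json={"name": "cosmosdb-sink", "config": connector_config},
)
resp.raise_for_status()
```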

Tips for Azure Site Recovery

Joey D’Antoni shares a few experiences when using Azure Site Recovery:

I need to blog more. Stupid being busy. Anyway, last week, we were doing a small-scale test for a customer, and it didn't work the way we were expecting, for one of the dumbest reasons I've ever seen. If you aren't familiar with Azure Site Recovery, it provides disk-level replication for VMs and allows you to bring on-premises VMs online in Azure, or in another Azure region if your VMs are in Azure already. It's not an ideal solution for busy SQL Server VMs with extremely low recovery point objectives; however, if you need a simple DR solution for a group of VMs and can sustain around 30 minutes of data loss, it is cheap and easy. The other benefit that ASR provides, similar to VMware's Site Recovery Manager, is the ability to do a test recovery in a bubble environment.

Read on for notes from Joey.

Importing SQL Server Extended Properties into Azure Purview

Daniel Janik shows how you can use PyApacheAtlas to move specific SQL Server extended properties into Azure Purview:

This post is going to be restricted to only SQL Server table columns and only extended properties named MS_Description. Quite a few years ago, I worked on a data catalog project where we added descriptions for many of the tables, views, and columns to the database using extended properties named MS_Description. Let's assume you have some of these for this post, keeping in mind that the Purview APIs provide many functions beyond what this post covers and that the code here could be modified to do much more as well.

Starting out, I thought it would be great to import the sensitivity classifications that SSMS creates. Pre-SQL Server 2019, these were held in extended properties; they now have their very own DMV (sys.sensitivity_classifications). While this sounded great in theory, it wasn't as exciting when I wrote the code. This is because Azure Purview already has system classifications at a more granular scale for each of the ones you find in SSMS, and Purview also adds these as it executes a scan on the data source. It does a pretty good job, too. With that said, I shifted my focus to adding descriptions instead.

Read on to see how you can do this.
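
As a rough sketch of the approach (not Daniel's exact code), the following pulls MS_Description extended properties for table columns with pyodbc and then patches the matching column assets in Purview with PyApacheAtlas. The connection details, Purview account name, asset type name, and qualified-name format are all assumptions to verify against your own scanned assets:

```python
# Read MS_Description extended properties for columns, then update the
# corresponding Purview column descriptions. Names below are hypothetical.
import pyodbc
from pyapacheatlas.auth import ServicePrincipalAuthentication
from pyapacheatlas.core import PurviewClient

sql = """
SELECT s.name AS schema_name, t.name AS table_name,
       c.name AS column_name, CAST(ep.value AS nvarchar(max)) AS description
FROM sys.extended_properties ep
JOIN sys.columns c ON ep.major_id = c.object_id AND ep.minor_id = c.column_id
JOIN sys.tables t  ON c.object_id = t.object_id
JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE ep.class = 1 AND ep.name = 'MS_Description';
"""

conn = pyodbc.connect("DSN=MySqlServer;Database=AdventureWorks")  # hypothetical DSN
rows = conn.cursor().execute(sql).fetchall()

auth = ServicePrincipalAuthentication(
    tenant_id="<tenant-id>", client_id="<client-id>", client_secret="<secret>"
)
client = PurviewClient(account_name="my-purview-account", authentication=auth)

for schema, table, column, description in rows:
    # Qualified-name format for scanned SQL column assets; verify against
    # what your own Purview scan actually produced.
    qn = f"mssql://myserver.contoso.com/AdventureWorks/{schema}/{table}#{column}"
    client.partial_update_entity(
        typeName="azure_sql_column",  # assumed asset type; check your catalog
        qualifiedName=qn,
        attributes={"userDescription": description},
    )
```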

Scaling ADF and Synapse Analytics Pipelines

Paul Andrew has a process for us:

Back in May 2020 I wrote a blog post about ‘When You Should Use Multiple Azure Data Factory’s‘. Following on from that post, with a full year+ now passed and having implemented many more data platform solutions for some crazy massive (technical term) enterprise customers, I’ve been reflecting on these scenarios. Specifically considering:

– The use of multiple regional Data Factory instances and integration runtime services.

– The decoupling of wider orchestration processes from workers.

Furthermore, to supplement this understanding and for added context, in December 2020 I wrote about Data Factory Activity Concurrency Limits – What Happens Next? and Pipelines – Understanding Internal vs External Activities. Both now add to a much clearer picture regarding the ability to scale pipelines for large-scale extraction and transformation processes.

Read on for details about the scenario, as well as a design pattern to explain the process. This is a large solution for a large-scale problem.
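
To make the orchestrator/worker decoupling concrete, here's a minimal sketch (not Paul's implementation) of a control process kicking off a worker pipeline that lives in a separate, regional Data Factory via the azure-mgmt-datafactory SDK; all of the names are hypothetical:

```python
# An orchestrator triggering a worker pipeline in a different (regional)
# Data Factory, then polling for completion. Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Kick off the worker pipeline in the regional "worker" factory.
run = adf_client.pipelines.create_run(
    resource_group_name="rg-data-platform",
    factory_name="adf-worker-westeurope",  # a separate, regional factory
    pipeline_name="pl_ingest_sales",
    parameters={"LoadDate": "2021-07-01"},
)
print(f"Started worker run: {run.run_id}")

# The orchestrator can check status before moving to the next stage.
status = adf_client.pipeline_runs.get(
    "rg-data-platform", "adf-worker-westeurope", run.run_id
)
print(status.status)  # Queued / InProgress / Succeeded / Failed
```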

Deploying Custom Docker Images in Azure ML

Tsuyoshi Matsuzaki shows us how to deploy an Azure ML model via custom Docker image:

In an earlier post, I showed you how to bring your own custom Docker image to training with Azure Machine Learning. Here, by contrast, I'll show you how to bring a custom Docker image to model deployment.

In Azure Machine Learning, the base Docker image in deployment includes the inferencing assets, such as the Flask server. So you should use an AML-compliant image as your base image, even when you use your own custom Docker image. The list of these maintained AML images is available at https://github.com/Azure/AzureML-Containers .

Read on for an example.
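
As a quick sketch of what this can look like with the v1 azureml-core SDK (the registry, model, and entry script names are placeholders, and per the post the base image should derive from one of the AML-maintained inferencing images):

```python
# Deploying a model with a custom Docker base image via azureml-core (v1).
from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

env = Environment(name="custom-inference-env")
# Custom image built on top of an AML-maintained inferencing image
# (see the Azure/AzureML-Containers repo for the supported list).
env.docker.base_image = "myregistry.azurecr.io/my-custom-inference:latest"
env.python.user_managed_dependencies = True  # dependencies live in the image

inference_config = InferenceConfig(entry_script="score.py", environment=env)
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

model = Model(ws, name="my-model")
service = Model.deploy(ws, "my-endpoint", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
```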

Deploying Datasets in Azure Analysis Services and Power BI PPU

Gilbert Quevauvilliers continues a series on migrating from Azure Analysis Services to Power BI Premium Per User:

Welcome to part 8, where in this blog post I am going to compare deploying datasets.

For those who are not exactly sure what deployments are: when you are using Power BI Desktop and you click Publish, you are effectively deploying your changes to the Power BI Service (which you can think of as a server in the cloud).

In this blog post I will show the differences when completing a deployment from AAS and then PPU.

Read on to see several techniques for deploying for each technology.

An Overview of Amazon Athena

Aveek Das takes us through the basics of Amazon Athena:

Serverless. Since Amazon Athena is offered as a fully managed cloud service, customers do not need to go through the pain of installing and maintaining separate infrastructure for it. You can start by logging into the AWS web console and proceeding to Amazon Athena.

Pay Per Query. You only pay for queries you execute. This is very cost-effective, as you can easily figure out your monthly expenses based on your usage pattern. On average, users pay 5 USD for each terabyte of data scanned. This can be further optimized by creating partitions or compressing your dataset.

Interactive Performance. We do not need to worry about the resources that work behind the scenes. When a query is executed, Athena automatically runs it in parallel across multiple resources, bringing results back faster.

Read on to see an example of Athena in action.
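
To see the serverless, pay-per-query flow in code, here's a small sketch using boto3; the database, table, and S3 output location are assumptions:

```python
# Submit an Athena query, poll until it finishes, then read the results.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

execution = athena.start_query_execution(
    QueryString="SELECT page, COUNT(*) AS views FROM pageviews GROUP BY page LIMIT 10",
    QueryExecutionContext={"Database": "weblogs"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll for completion; Athena bills per terabyte of data scanned.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)[
        "QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    results = athena.get_query_results(QueryExecutionId=query_id)
    for row in results["ResultSet"]["Rows"][1:]:  # first row is the header
        print([col.get("VarCharValue") for col in row["Data"]])
```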
