
Category: Cloud

Approximate Percentiles in Azure SQL DB and MI

Balmukund Lakhani announces a feature has gone generally available:

Today, we are announcing General Availability (GA) of native implementation of APPROX_PERCENTILE in Azure SQL Database and Azure SQL Managed Instance. We announced preview of these functions in October 2022. Since then, many customers have adopted these for the applications where response time of percentile calculation was more important than the accuracy of the result.

I have extolled, and will continue to extol, the virtues of these two functions (APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC) wherever I go. They’re considerably faster than the exact PERCENTILE_CONT and PERCENTILE_DISC once you start getting into the hundreds of thousands or millions of rows. They’re also available in SQL Server 2022.
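To make the speed-versus-accuracy trade-off concrete, here is a minimal Python sketch. It computes an exact interpolated percentile (the same interpolation rule PERCENTILE_CONT uses) and compares it with an estimate taken from a random sample. The sampling approach is only a stand-in to illustrate the idea; it is not the algorithm the database engine uses, and the function names and `sample_size` parameter are made up for this example.

```python
import random


def percentile_cont(values, p):
    """Exact percentile with linear interpolation between neighbouring rows."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    ordered = sorted(values)            # the expensive step on large inputs
    pos = p * (len(ordered) - 1)        # fractional, zero-based row position
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def approx_percentile_cont(values, p, sample_size=10_000, seed=0):
    """Estimate the percentile from a random sample (illustrative only)."""
    if len(values) <= sample_size:
        return percentile_cont(values, p)
    rng = random.Random(seed)           # fixed seed keeps the sketch repeatable
    sample = rng.sample(values, sample_size)
    return percentile_cont(sample, p)


if __name__ == "__main__":
    data = list(range(200_000))
    exact = percentile_cont(data, 0.5)
    approx = approx_percentile_cont(data, 0.5)
    print(f"exact={exact}, approx={approx}, error={abs(exact - approx)}")
```

The approximate version sorts only the sample rather than the full input, which is where the time savings come from in this toy version; the cost is a small error in the answer.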


Peeking into Azure SQL DB via Extended Events

Grant Fritchey observes the observers:

Last week I posted the results from using Extended Events to snoop on what happens inside an AWS RDS database. This week, I’m taking a look at what happens on Azure SQL Database. I’m using the same toolset again, if for no other reason that I’m consistent in my approach. So it’s basically just rpc_completed & sql_batch_completed on the database in question. Let’s check out the results.

Here’s the prior post, in case you missed it like I did.


Combining On-Demand and Spot VMs in AKS

Prakash P covers a topic near and dear to my heart—saving money by using spot instances:

While it’s possible to run the Kubernetes nodes either in on-demand or spot node pools separately, we can optimize the application cost without compromising the reliability by placing the pods unevenly on spot and OnDemand VMs using the topology spread constraints. With baseline amount of pods deployed in OnDemand node pool offering reliability, we can scale on spot node pool based on the load at a lower cost.

I like this idea a lot, as spot instances trade reliability for large savings (up to 90%): you can lose a spot instance as soon as someone else comes in willing to pay more. This gives you the best of both worlds with AKS: emphasize spot instances for the money savings but include the ability to use on-demand pricing for VMs when spot isn’t available. If I’m understanding the post correctly, this also reduces the downside risk of service instability that you get when spot instances are bought out from under you, as Kubernetes will automatically reschedule the evicted pods onto the remaining nodes to keep a consistent number of instances available to users.
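As a rough illustration of the moving parts, the Python sketch below builds the pod-spec fragment that a topology spread constraint plus a spot toleration would need. This is not the configuration from the linked post. The `topologySpreadConstraints` fields and the `kubernetes.azure.com/scalesetpriority=spot:NoSchedule` taint are standard Kubernetes and AKS; the node label key `example.com/capacity-type`, the `app` label, and the function name are hypothetical, and you would have to label your own node pools accordingly.

```python
import json


def spread_over_capacity_types(app_label, max_skew=1,
                               topology_key="example.com/capacity-type"):
    """Return a pod-spec fragment that spreads pods across node groups.

    topology_key is a hypothetical node label whose values (for example
    "spot" and "on-demand") define the groups to spread across.
    """
    return {
        "topologySpreadConstraints": [
            {
                # Largest allowed difference in pod count between groups.
                "maxSkew": max_skew,
                "topologyKey": topology_key,
                # Prefer the spread but still schedule if it cannot be met,
                # for instance when spot capacity has been evicted.
                "whenUnsatisfiable": "ScheduleAnyway",
                "labelSelector": {"matchLabels": {"app": app_label}},
            }
        ],
        "tolerations": [
            {
                # AKS taints spot nodes; pods need this to land on them.
                "key": "kubernetes.azure.com/scalesetpriority",
                "operator": "Equal",
                "value": "spot",
                "effect": "NoSchedule",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(spread_over_capacity_types("web"), indent=2))
```

Choosing `ScheduleAnyway` rather than `DoNotSchedule` is a design choice in this sketch: it favors keeping pods running over keeping the split exact when one pool has no capacity.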


Synapse and Azure ML Pipelines

Santosh Thomas integrates two Azure products:

As more customers standardize on the Synapse data platform, enabling machine learning workflows through Azure Machine Learning (Azure ML) becomes particularly interesting. This is especially true as more customers look to bring their data engineering and data science practices together and mature capabilities on both sides.

The goal of this blog post is to highlight how Synapse and Azure ML can work well together to deliver key insights. This is motivated by a scenario where a customer modernized their data platform on Azure Synapse but was looking to improve their data science practices through Azure ML. The focus of this blog is to expose existing functionality, and it is not a “hardened” solution with security or other cloud best practice implementations. The workflow steps also assume some level of comfort with Python and working with the Azure Python SDKs.

There was a time when Microsoft wanted us to remain in Synapse for machine learning tasks, but that time is gone: the emphasis is definitely to do machine learning tasks in Azure ML, regardless of where the data lives…unless there’s a Spark job involved, in which case things get all weird again.


Kafka Control and Data Planes

Sanjay Garde explains how the architecture of Apache Kafka solutions has expanded over time:

With the advent of service mesh and containerized applications, the idea of the control and data plane has become popular. A part of your application infrastructure, such as a proxy or sidecar, is dedicated to aspects such as controlling traffic, access, governance, security, and monitoring and is referred to as the control plane. Another part of your application infrastructure that is used purely for processing your business transactions is referred to as the data plane.

Read on to see how the concept works at an architectural level.


ADX Dashboards Now Generally Available

Michal Bar provides an overview of Azure Data Explorer functionality now generally available:

Each ADX dashboard is a collection of tiles, optionally organized in pages, where each tile has an underlying query and a visual representation. Using the web UI, you can natively export Kusto Query Language (KQL) queries to a dashboard as visuals and later modify their underlying queries and visual formatting as needed. In addition to ease of data exploration, this fully integrated Azure Data Explorer dashboard experience provides improved query and visualization performance.

Read on to learn more.


Delta Lake Support in Azure Stream Analytics

Emma An makes an announcement:

Delta Lake has gained popularity in recent times due to its unique features and advantages over traditional data warehouse and other storage formats. For those already using traditional data storage format or moving to a lakehouse architecture, Delta Lake can offer several compelling benefits that can further enhance the performance and capabilities of their data pipelines. Many Azure services are integrated with Delta Lake, and now you can use Azure Stream Analytics to write in Delta format.

In this blog, we will explain the native support of Delta Lake in Azure Stream Analytics, that can help users take their workload to the next level, providing a seamless and scalable solution for large-scale data processing and storage. It is easy to start, taking only a few clicks to create an end-to-end pipeline, and write to either a new or existing Delta table stored in Azure Data Lake Storage Gen2.

This is a nice addition to Stream Analytics, and Emma shows two ways you can write out results in Delta Lake format.


First Thoughts on Azure Hyperscale Serverless

Reitse Eskens shares some thoughts:

As some of you know, I’ve written a series of blog posts on Azure SQL Databases and there’s an accompanying session that I had the honour of presenting a number of times.
Now Azure keeps developing new offers and one of these went in public preview February 15th. An offer I hadn’t seen coming. You can read the introductory post here.

It’s the Azure Hyperscale Serverless option.

Read on for Reitse’s impressions from the preview. This wasn’t a torture test but did provide an overview of how to create and load data into the database. Reitse also calculates the cutoff point when you should switch from Serverless to traditional Hyperscale, so check that out as well.
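The cutoff idea comes down to simple arithmetic: serverless bills for the compute you actually use, while provisioned compute bills a flat rate around the clock. The Python sketch below shows the shape of that calculation under a simplified model. The prices are placeholders rather than real Azure rates, the function names are made up for this example, and it does not reproduce Reitse’s numbers or method.

```python
HOURS_PER_MONTH = 730  # common billing approximation


def monthly_cost_provisioned(vcores, price_per_vcore_hour,
                             hours=HOURS_PER_MONTH):
    """Flat cost: every provisioned vCore is billed for every hour."""
    return vcores * price_per_vcore_hour * hours


def monthly_cost_serverless(avg_billed_vcores, price_per_vcore_hour,
                            hours=HOURS_PER_MONTH):
    """Usage-based cost, modelled as an average number of billed vCores."""
    return avg_billed_vcores * price_per_vcore_hour * hours


def break_even_avg_vcores(provisioned_vcores, provisioned_rate,
                          serverless_rate):
    """Average billed vCores at which both options cost the same."""
    if serverless_rate <= 0:
        raise ValueError("serverless_rate must be positive")
    return provisioned_vcores * provisioned_rate / serverless_rate


if __name__ == "__main__":
    # Placeholder prices per vCore-hour, NOT actual Azure pricing.
    provisioned_rate = 0.30
    serverless_rate = 0.50
    cutoff = break_even_avg_vcores(4, provisioned_rate, serverless_rate)
    print(f"Serverless is cheaper below an average of {cutoff:.2f} vCores")
```

If your workload averages more billed vCores than the cutoff, the provisioned tier is the cheaper option in this model; below it, serverless wins.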


Building an Internal Load Balancer in Azure

Vaibhav Kumar balances the scales:

The Internal load balancer manages load for a private network with any inbound access from the public platform. As in the diagram below, the primary load balancer managing load from the internet is a public-type load balancer. But, the VMs communication to storage or database is managed through a type-internal load balancer.

Click through for a walkthrough of the process.


Working with Postgres Extensions in Azure Cosmos DB

Sarah Dutkiewicz runs into an issue:

Problem: I installed PostGIS on my single-node cluster without issues. However, I scaled my cluster to 2 nodes afterwards. When I ran the query that uses ST_X and ST_Y from PostGIS, I got the following error:

ERROR: type "public.geometry" does not exist
CONTEXT: while executing command on private-w0.azure-cosmos-db-global-ug-demo.postgres.database.azure.com:5432

When I read the CONTEXT message, I realized by the w# reference that the worker nodes didn’t have PostGIS installed. When you scale the nodes – at least in this case, it doesn’t enable the extensions over there.

Read on to see how Sarah was able to resolve this issue.
