Press "Enter" to skip to content

Category: Cloud

Cache Recommendations for Azure Data Explorer

Guy Reginiano notes an update:

A new generation of cache recommendations for Azure Data Explorer is now available in the Azure portal! 
This update introduces significant improvements, including enhanced logic, additional statistics for end users, an improved user interface, and a streamlined process for reviewing and applying recommendations. In this blog post, we will explore the new features and benefits offered by this latest update. 

Read on to see where you can find these cache recommendations, as well as the types of recommendations you’re liable to receive.


A Complex Example of ADF Pipeline Return Value

Andy Leonard goes beyond the simple example:

In this post, I demonstrate one way to create a child pipeline that returns the SubscriptionId for a data factory. I then call the child pipeline from a parent pipeline.

To build this demonstration, please follow the instructions below.

This is definitely more complicated than Andy’s simple example, but there are plenty of screenshots to take you through the process.


Trying the Azure OpenAI Playground

Obaro Alordiah gives us a primer:

The Azure OpenAI Service has been a trending topic in the tech world this year as it combines the power of OpenAI’s advanced generative AI models with the comprehensive suite of services available on the Azure cloud. It has given developers the opportunity to create and embed high performing AI models into the Azure environment to deliver more efficient, insightful & innovative solutions. In this blog, we will take a high level look at some of the key features within the Azure OpenAI playground and how we can get the best out of it.

Generative AI via OpenAI is an area in which Microsoft is putting an inordinate amount of focus.
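
The playground can also show you starter code for whatever you configure in it. As a rough sketch (not from Obaro's post) of what the equivalent chat call looks like from Python with the openai package (v1+), assuming you already have an Azure OpenAI resource and a deployed model — the endpoint, API version, key, and deployment name below are all placeholders:

```python
# Minimal sketch: calling an Azure OpenAI chat deployment from Python.
# Assumes the openai package (v1+) and an existing Azure OpenAI resource;
# the endpoint, API version, key, and deployment name are placeholders.
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # use a version your resource supports
)

response = client.chat.completions.create(
    model="<your-deployment-name>",  # the deployment name, not the base model name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what Azure OpenAI Service is."},
    ],
    temperature=0.7,
    max_tokens=200,
)

print(response.choices[0].message.content)
```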


A Simple Example of ADF Pipeline Return Value

Andy Leonard starts easy:

I want to develop an Azure Data Factory (ADF) design pattern for calling focused, unit-of-work, function-y ADF pipelines that perform focused tasks. Some of these “worker pipelines” will need to return values to the calling pipeline.

In this example, I started by reading Mark Kromer's (excellent) article titled You can now customize the return value from your pipeline! I then crafted the simple example shown in this post to make sure I understood the principles involved before using pipeline return value (preview) functionality in more robust ADF patterns.

Follow the steps I outline below to build a simple example for an ADF pipeline that returns a value!

Click through to follow those steps.


Choosing a Load Balancing Option in Azure

Santosh Hari looks at the options:

Azure docs have a great page on the various load balancing options in Azure that even has an awesome flowchart summing up the choices. However, not being from a networking background, combined with Microsoft’s “special” naming, combined with some sort of memory issue recalling these names from memory meant that even if I had to rely on rote memory when in conversations with customers, I would often mix up the names. For instance, confuse traffic manager and load balancer. So, I decided to understand some of the basics behind cloud load balancers to help become a more interesting conversationalist in this topic: “well actually, you should be using an app gateway there, John”.

This often isn’t in the database administrator’s purview, but Santosh does a good job of explaining the concepts and, if you’re hosted in Azure, it is good to know what’s sitting in front of your database.


Read and Write Data with PySpark

Dustin Vannoy has two of the three R's down:

Every Spark pipeline involves reading data from a data source or table. As data engineers, we usually end the pipeline by writing the transformed data. In this tutorial we walk through some of the most common formats and cloud storage locations for reading and writing with Spark. We’ll save some of the advanced Delta Lake capabilities for another tutorial.

Click through to see how to read from and write to CSV, JSON, and Parquet formats. Dustin has examples of working with Azure Blob Storage, S3, and Google Cloud Storage, and even some database examples with JDBC.
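
As a quick taste of what Dustin covers, here is a minimal sketch (not his code) of reading and writing a few of those formats with PySpark; the paths and partition column are placeholders, and the cloud storage authentication he walks through is omitted:

```python
# Minimal sketch of reading and writing common formats with PySpark.
# Paths and column names are placeholders; storage credentials
# (Azure Blob Storage, S3, Google Cloud Storage) are omitted.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read-write-demo").getOrCreate()

# Read: CSV with a header, JSON, and Parquet
df_csv = spark.read.option("header", True).option("inferSchema", True).csv("/data/input/sales.csv")
df_json = spark.read.json("/data/input/events.json")
df_parquet = spark.read.parquet("/data/input/orders.parquet")

# Write: Parquet partitioned by a column, and CSV with a header
df_csv.write.mode("overwrite").partitionBy("sale_date").parquet("/data/output/sales_parquet")
df_json.write.mode("overwrite").option("header", True).csv("/data/output/events_csv")
```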


Running SqlBulkCopy in Parallel from Powershell

Jose Manuel Jurado Diaz has a script for us:

Today, we encountered an interesting service request: attempting to reduce the load times for 100,000 records from a table with 97 varchar(320) fields in an Azure SQL Hyperscale database. Below, I would like to share my lessons learned.

The idea is to split the execution of multiple SqlBulkCopy operations across concurrent processes. In this case, we are going to split the work into 5 processes running in parallel, each inserting 20,000 rows, and work out the total size.

Read on for the script, as well as a rough idea of how long it’ll take inserting into an Azure SQL DB Hyperscale instance.
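
Jose's script is PowerShell built around the .NET SqlBulkCopy class; as a rough Python analogue of the same divide-and-conquer idea (not his code), here is a sketch that splits the rows into five chunks and lets each worker process insert its chunk over its own connection using pyodbc's fast_executemany. The connection string, table, and sample rows are placeholders:

```python
# Rough Python analogue of the parallel bulk-load idea (not Jose's PowerShell):
# split the rows into chunks and let each worker process insert its own chunk.
# Connection string, table name, and generated rows are placeholders.
from concurrent.futures import ProcessPoolExecutor

import pyodbc

CONN_STR = "Driver={ODBC Driver 18 for SQL Server};Server=<server>;Database=<db>;UID=<user>;PWD=<password>"
INSERT_SQL = "INSERT INTO dbo.TargetTable (col1, col2) VALUES (?, ?)"

def insert_chunk(rows):
    # Each worker opens its own connection; connections can't be shared across processes.
    conn = pyodbc.connect(CONN_STR, autocommit=False)
    try:
        cursor = conn.cursor()
        cursor.fast_executemany = True  # send parameter batches instead of one row at a time
        cursor.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        conn.close()
    return len(rows)

if __name__ == "__main__":
    all_rows = [(i, f"value {i}") for i in range(100_000)]  # placeholder data
    chunks = [all_rows[i::5] for i in range(5)]             # 5 chunks of ~20,000 rows each

    with ProcessPoolExecutor(max_workers=5) as pool:
        inserted = sum(pool.map(insert_chunk, chunks))

    print(f"Inserted {inserted} rows")
```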


Tools for Optimizing Azure SQL MI Performance

Rie Merritt breaks out the toolbox:

Azure SQL Managed Instance provides options within and outside Azure portal for troubleshooting and optimizing performance.  Within the portal, you can leverage automatic tuning and Intelligent Insights. Outside of the Azure Portal, you can take advantage of the capabilities that are already in the database engine, such as query store and dynamic management views (DMV). In addition, Microsoft offers several monitoring options that are in preview: Azure SQL Insights inside Azure Monitor, which requires an agent on a VM you own, Azure SQL Analytics, and Azure diagnostic telemetry. 

Automatic tuning in SQL Managed Instance supports FORCE LAST GOOD PLAN, which identifies queries using an execution plan that is slower than the previous good plan. It forces queries to use the last known good execution plan. Since the system automatically monitors the workload performance, in case of changing workloads, the system dynamically adjusts to force the best performing query execution plan. 

Many of the things Rie describes are also available on-premises, though Azure SQL Analytics is only available in Azure SQL DB and Azure SQL MI, as of the time of this post.


Creating an Azure DevOps YAML Pipeline for SQL Server Deploys

Olivier Van Steenlandt updates to the new Azure DevOps model:

In one of my previous blog posts, I used the SQL Server database deploy task to deploy my DACPAC to SQL Server. Unfortunately, this task has been deprecated in Release Pipelines. In this blog post, I would like to share the alternative.

Additionally, we will be moving from a Classic Release pipeline to a YAML pipeline. The YAML pipeline will be responsible for building and deploying our Database Projects.

Click through for the walkthrough.
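
Olivier's alternative is specific to Azure DevOps pipeline tasks, but if you want a feel for what a DACPAC deployment boils down to on the build agent, here is a hedged sketch (not his approach) of invoking the SqlPackage CLI from Python. The DACPAC path, server, database, and credentials are placeholders, and it assumes sqlpackage is on the agent's PATH:

```python
# Hedged sketch (not Olivier's pipeline task): a DACPAC deployment ultimately
# comes down to a SqlPackage publish, which a pipeline step can run as a script.
# The DACPAC path, server name, database name, and credentials are placeholders.
import subprocess

result = subprocess.run(
    [
        "sqlpackage",
        "/Action:Publish",
        "/SourceFile:bin/Release/MyDatabase.dacpac",
        "/TargetServerName:myserver.database.windows.net",
        "/TargetDatabaseName:MyDatabase",
        "/TargetUser:deploy_user",
        "/TargetPassword:<secret-from-pipeline-variable>",
    ],
    capture_output=True,
    text=True,
)

print(result.stdout)
result.check_returncode()  # fail the pipeline step if the publish failed
```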
