Category: Cloud

ADF Data Flows and Joins Failing During Debugging

Mark Kromer clears up some issues around debugging in Azure Data Factory:

One of the important features built into ADF is the ability to quickly preview your data while designing your data flows and to execute the finished product against a sampling of data prior to finalizing and operationalizing your pipelines.

However, there are a few fundamentals about working with Joins that you should keep in mind, and the details below are important to understand at design time and while debugging and testing.

The answer makes sense, but it would not have been the first thing to come to mind for me.


Statistics Management with Azure SQL DB Serverless

Joey D’Antoni takes us through stats management with the serverless tier of Azure SQL Database:

One of the only things platform-as-a-service databases like Azure SQL Database do not do for you is actively manage column and index statistics. While backups, patches, and even integrity checks are built into the platform services, managing your metadata is not. Since Azure SQL Database lacks a SQL Server Agent for scheduling, you have to use an alternative for job scheduling.

Read on to learn about techniques as well as a few gotchas.
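
To make this concrete, here is a minimal sketch of one scheduling-free option: run the statistics maintenance yourself from whatever scheduler you have handy (Azure Automation, Azure Functions, cron). This is an illustration rather than Joey's exact method, and the server, database, and credentials are placeholders.

```python
# Minimal sketch: refresh statistics in an Azure SQL Database from an
# external scheduler. All connection details below are placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:yourserver.database.windows.net,1433;"
    "Database=yourdb;Uid=stats_job;Pwd=<password>;Encrypt=yes;",
    autocommit=True,
)
# sp_updatestats runs UPDATE STATISTICS against every table that needs it.
conn.cursor().execute("EXEC sp_updatestats;")
conn.close()
```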


Securing Application Secrets with Azure Key Vault

Rishit Mishra walks us through Azure Key Vault:

As the name suggests, Azure Key Vault is used to store and manage keys securely. Key Vault can be used to store the cryptographic secrets and keys such as authentication keys, storage account keys, data encryption keys, passwords and certificates.

Azure Key Vault enables developers to create the keys for development and testing in minutes, and they can further migrate this setup seamlessly onto the production environment.

The centralized key store/vault can be securely managed by the Key Vault owner who manages permissions to this key store and would be responsible for keeping the secrets secure.

Key Vault becomes quite useful in managing secrets in tools like Azure Databricks and Azure Data Factory without saving a bunch of keys in configuration files. And it’s a lot safer than that option, too.
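
For a sense of what this looks like in practice, here is a minimal sketch using the Azure SDK for Python; the vault URL and secret name are hypothetical placeholders.

```python
# Minimal sketch: store and retrieve a secret in Azure Key Vault.
# The vault URL and secret name are hypothetical.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()  # CLI login, managed identity, or env vars
client = SecretClient(
    vault_url="https://my-vault.vault.azure.net", credential=credential
)

client.set_secret("storage-account-key", "<secret value>")
print(client.get_secret("storage-account-key").value)
```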


Dropping Unused Indexes in Azure SQL DB

Monica Rathbun gives an important lesson around tracking index utilization in Azure SQL Database:

If the index has not shown any utilization, I investigate to determine if it is one that can be removed. However, this week something caught my attention. I was looking at a client’s indexes and noted the values for these were not as high as I would have expected. I know that these index statistics are reset upon every SQL Server service restart, but in this case I was working on an Azure SQL Database, which got me wondering exactly how that worked. With an Azure Virtual Machine or an on-prem SQL Server instance, this is easy to figure out. But with an Azure SQL Database we do not have control over when restarts are done, and what about the Serverless offering (which pauses unutilized databases to reduce costs)? How does that behave? I really want to make sure, before I remove any indexes from a database, that I am examining the best data possible to make that decision. So, I did some digging.

Read on to see what Monica discovered.
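
The view at the heart of this is sys.dm_db_index_usage_stats. As a minimal sketch (connection details are placeholders), here is how you might pull the read and write counters per index, keeping Monica's caveat in mind that the counters can be reset out from under you:

```python
# Minimal sketch: list per-index read/write counters in an Azure SQL
# Database. Connection details are placeholders, and remember that these
# counters can reset, so verify how long they have been accumulating
# before dropping anything.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:yourserver.database.windows.net,1433;"
    "Database=yourdb;Uid=<user>;Pwd=<password>;Encrypt=yes;"
)
sql = """
SELECT OBJECT_NAME(s.object_id) AS table_name,
       i.name AS index_name,
       s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
    ON s.object_id = i.object_id
   AND s.index_id = i.index_id
WHERE s.database_id = DB_ID();
"""
for row in conn.cursor().execute(sql):
    print(row.table_name, row.index_name, row.user_seeks, row.user_updates)
```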


So You Want to Fail Over a SQL Managed Instance

Danimir Ljepava takes us through user-initiated failover of SQL Managed Instances:

In August 2020, we released a new feature, user-initiated manual failover, which allows you to manually trigger a failover on SQL Managed Instance using PowerShell or CLI commands, or by invoking an API call.

A manually initiated failover on a managed instance is equivalent to the automated failovers the service itself initiates for high availability and software patches. Manually invoking a failover on MI helps test end-to-end application fault resiliency against automatic failovers from planned or unplanned events before deploying to production. In addition to testing how failover impacts existing database sessions, it can also help verify whether it changes end-to-end performance due to changes in network latency. In some cases, if performance issues are encountered on SQL MI, manually invoking a failover to a new node can help mitigate them.

Read on to see how you can perform failover and how you can confirm that it worked.
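
Since the post mentions PowerShell, the CLI, and a raw API call, here is a minimal sketch of the API route from Python. The subscription, resource group, and instance names are placeholders, and the api-version is an assumption to verify against current documentation.

```python
# Minimal sketch: user-initiated failover on a SQL Managed Instance via
# the ARM REST API. Names are placeholders; verify the api-version.
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default"
).token
url = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.Sql"
    "/managedInstances/<instance-name>/failover"
    "?api-version=2021-11-01"  # assumed; check the docs for the current version
)
resp = requests.post(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()  # 202 Accepted means the failover was initiated
```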


Stopping and Starting Virtual Machines in a Resource Group

Dennes Torres walks us through a script to stop or start all virtual machines in an Azure resource group:

Some tasks on Azure are easier if we automate them. The Azure Portal provides us with Cloud Shell, which we can use for this kind of automation.

I was running some experiments with SQL Server Always On, so I created three VMs inside a resource group. Every time I want to start an experiment, I need to start all three VMs and, at the end, stop all three again.

Read on to see how Dennes is able to accomplish this.
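
Dennes scripts this in Cloud Shell; as an alternative sketch of the same idea, the Azure SDK for Python can do it too. The subscription ID and resource group name are placeholders.

```python
# Minimal sketch: deallocate every VM in a resource group (the same loop
# with begin_start brings them back up). Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")
for vm in client.virtual_machines.list("my-resource-group"):
    # begin_deallocate stops the VM and releases its compute, halting billing.
    client.virtual_machines.begin_deallocate("my-resource-group", vm.name).result()
```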


SQL Serverless in Azure Synapse Analytics

James Serra talks to us about SQL serverless (presently known as SQL on-demand, but I’m getting ahead of the marketing curve this time):

Querying data in ADLS Gen2 storage using T-SQL is made easy because of the OPENROWSET function with additional capabilities (check out the T-SQL that is supported). The currently supported file types in ADLS Gen2 that SQL-on-demand can use are Parquet, CSV, and JSON. ParquetDirect and CSV 2.0 add performance improvements (see Benchmarking Azure Synapse Analytics – SQL Serverless, using .NET Interactive). You can also query folders and multiple files and use file metadata in queries.

Read on to learn a lot more about its use cases.
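
As a minimal sketch of the OPENROWSET pattern the post describes, here is a query against Parquet files through the serverless endpoint; the workspace endpoint, storage path, and credentials are placeholders.

```python
# Minimal sketch: query Parquet files in ADLS Gen2 through the SQL
# serverless (on-demand) endpoint. Endpoint, path, and credentials are
# placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=tcp:myworkspace-ondemand.sql.azuresynapse.net,1433;"
    "Database=master;Uid=<user>;Pwd=<password>;Encrypt=yes;"
)
sql = """
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://mystorage.dfs.core.windows.net/mycontainer/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS rows;
"""
for row in conn.cursor().execute(sql):
    print(row)
```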


Restoring SQL Server Backups from Azure Blob Storage

Niko Neugebauer walks us through special considerations when using Azure Blob Storage as your backup location:

If you are using Azure Blob Storage for SQL Server backups, there are a couple of important details you need to know before you start on any significant project (and in my head I keep hearing Grant Fritchey angrily declaring that there is no backup strategy if there is no restore strategy to be found in the plan).

The Restore From URL operation in SQL Server (any SQL Server right now, starting with SQL Server 2012 page blobs and including the blob storage support in SQL Server 2019 that started with SQL Server 2014) requires … [drumroll] … exclusive WRITE permissions on the underlying file(s).

Niko explains some of the pain around that requirement, as well as a few other bees in your bonnet.
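
For orientation, here is a minimal sketch of the restore itself; the server, storage account, container, and database names are placeholders, and it assumes a SQL Server credential for the container URL (for example, one built from a SAS token) already exists with the write access Niko describes.

```python
# Minimal sketch: restore a database from a backup file in Azure Blob
# Storage. Assumes a credential for the container URL already exists;
# per Niko, the underlying access must include WRITE, not just READ.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=myserver;Database=master;Trusted_Connection=yes;",
    autocommit=True,  # RESTORE cannot run inside a user transaction
)
conn.cursor().execute("""
RESTORE DATABASE MyDb
FROM URL = 'https://mystorage.blob.core.windows.net/backups/MyDb.bak';
""")
```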


Connecting to Cosmos DB via Linked Server

Frank Solomon takes us through communicating with Cosmos DB from SQL Server:

Every source table column becomes an expression in the SELECT clause. If needed, JSONLint, for example, can validate the output JSON format. In this query, the FOR XML PATH clause places each row into a formatted JSON row, with key/value pairs that match the column/value pairs of the original rows. To get the data ready, the empty ('') value in the FOR XML PATH() clause at line 10 separates each XML row with a default comma. At line 11, the STUFF function arguments format the result set as a string and remove the leading "," in the original data. Save the finished XML-format result set as a JSON file. This file will become the data we’ll import.

A Cosmos DB database has zero or more collections, which correspond to SQL Server tables. A collection has zero or more documents, which correspond to SQL Server table rows.

With SQL Server 2019, PolyBase also allows connections to Cosmos DB if (and only if) you are using the MongoDB API for Cosmos. But if that’s how your collection is set up, querying it becomes pretty easy.
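
As a simplified recreation of the string-building pattern Frank describes (table and column names here are hypothetical), FOR XML PATH('') concatenates one JSON-shaped string per row and STUFF strips the leading comma:

```python
# Simplified recreation of the pattern: FOR XML PATH('') concatenates one
# JSON-shaped string per row, and STUFF removes the leading separator.
# Table and column names are hypothetical; note that FOR XML escapes
# characters like & and <, so validate the output (e.g. with JSONLint).
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=myserver;Database=mydb;Trusted_Connection=yes;"
)
sql = """
SELECT STUFF((
    SELECT ',{"id":' + CAST(c.CustomerId AS varchar(10))
         + ',"name":"' + c.CustomerName + '"}'
    FROM dbo.Customers AS c
    FOR XML PATH('')
), 1, 1, '') AS json_rows;
"""
print('[' + conn.cursor().execute(sql).fetchval() + ']')  # wrap rows in a JSON array
```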


Retrieving Secrets from Azure DevOps Pipelines

Gavin Campbell shows how you can pull secrets out of an Azure DevOps Pipeline:

For secrets created in the Azure DevOps UI, whether pipeline-scoped or in a variable group, it is not so simple to retrieve the variables after creation. This might be required for a number of reasons, most often troubleshooting. The need to do this is often an indicator that the project should have been using an Azure Key Vault in the first place.

Previously it was necessary to jump through some hoops to access secret variables, but it turns out this is no longer required. It also appears the recommended approach of mapping secrets to environment variables is currently not working for secret variables from variable groups.

I second the notion of using Key Vault for secrets management.
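
For reference, the mapping Gavin mentions is explicit: a secret variable only shows up in a script step if the task maps it into the environment, after which it reads like any other environment variable. Here is a minimal sketch with hypothetical names (and note Gavin's finding that this mapping currently misbehaves for secrets from variable groups):

```python
# Minimal sketch: read a pipeline secret inside a script step. The secret
# must be mapped explicitly in the task definition, e.g.:
#   env:
#     MY_SECRET: $(MySecretVariable)
# Variable names here are hypothetical.
import os

secret = os.environ.get("MY_SECRET")
if secret is None:
    raise RuntimeError("MY_SECRET was not mapped into the environment")
print(f"retrieved a secret of {len(secret)} characters")  # never echo the value
```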
