Press "Enter" to skip to content

Category: Cloud

Moving a Virtual Machine with the Azure Resource Mover

Kathi Kellenberger tries out the Azure Resource Mover:

Another task I may want to perform is to move a VM to another region. I found this set of steps, which involves using an Azure Recovery Services vault, and it seems a bit complex. Fortunately, I recently heard about a new, much easier way to move VMs and other resources called Azure Resource Mover (in preview). It was announced today.

Read on to see how this works. I like the idea a lot, especially for those times when you accidentally create resources in different regions and only realize it when it’s time to tie everything together.

Azure Synapse Analytics Sample Datasets and Scripts

James Serra shows us where to find samples for Azure Synapse Analytics:

Datasets: A bunch of datasets that, when added, will show up under Data -> Linked -> Azure Blob Storage. You can then choose an action (via “…” next to any of the containers in the dataset) and choose New SQL script -> Select TOP 100 rows to examine the data, as well as choose “New notebook” to load the data into a Spark dataframe. Any dataset you add is a linked service to files in a blob storage container using SAS authentication. You can also create an external table in a SQL on-demand pool or SQL provisioned pool for each dataset via an action (via “…” next to “External tables” under the database, then New SQL script -> New external table) and then query it or insert the data into a SQL provisioned database.

Click through to learn more, as well as a few other things you can do with Synapse Analytics.
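
If you're curious what the "New notebook" option generates, it's roughly a PySpark cell like the sketch below: register the container's SAS token with the Spark session, then read the files into a dataframe. The storage account, container, path, and token here are placeholders, not the actual sample dataset names.

```python
# Minimal sketch of loading a linked blob-storage dataset into a Spark dataframe
# from a Synapse notebook (the notebook's built-in spark session is assumed).
# The storage account, container, folder, and SAS token below are placeholders.
blob_account_name = "sampledatasets"       # hypothetical storage account
blob_container_name = "samples"            # hypothetical container
blob_relative_path = "nyc_taxi/"           # hypothetical folder of Parquet files
blob_sas_token = "<sas-token>"             # supplied by the linked service

# Make the SAS token available to the Spark session for this container
spark.conf.set(
    f"fs.azure.sas.{blob_container_name}.{blob_account_name}.blob.core.windows.net",
    blob_sas_token,
)

wasbs_path = (
    f"wasbs://{blob_container_name}@{blob_account_name}"
    f".blob.core.windows.net/{blob_relative_path}"
)

df = spark.read.parquet(wasbs_path)        # load into a Spark dataframe
df.show(10)
```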

Azure SQL Managed Instance Updates

Borko Novakovic gives us a rundown of improvements to Azure SQL Managed Instances:

Azure SQL Managed Instance provides management operations that you can use to automatically deploy new managed instances, update instance properties, and delete instances when no longer needed. Most of the management operations in SQL Managed Instance are long-running, but until now it was not possible for customers to get detailed information about operation status and progress in an easy and transparent way.
Through the introduction of a new CRUD API, the SQL Managed Instance resource is now visible from the moment the create request is submitted. In addition, the new OPERATIONS API adds the ability to monitor management operations, see operation steps, and take dependent actions based on operation progress.
Check out this blog post to learn how to effectively utilize new APIs in real-world scenarios.
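
If you want to poke at the new operations API from code, the azure-mgmt-sql package exposes it through a managed instance operations group. Here's a rough sketch, with the subscription, resource group, and instance name as placeholders (attribute names on the returned models may vary by SDK version, hence the as_dict() call):

```python
# Hedged sketch: listing management operations for a SQL Managed Instance via the
# azure-mgmt-sql SDK. Subscription, resource group, and instance name are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient

client = SqlManagementClient(DefaultAzureCredential(), "<subscription-id>")

ops = client.managed_instance_operations.list_by_managed_instance(
    resource_group_name="my-resource-group",        # hypothetical
    managed_instance_name="my-managed-instance",    # hypothetical
)

for op in ops:
    details = op.as_dict()  # avoids depending on exact model attribute names
    print(details.get("operation_friendly_name"),
          details.get("state"),
          details.get("percent_complete"))
```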

If this product potentially fits your needs, also check out Vladimir Ivanovic’s post on performance improvements:

Previously, tempdb I/O operations were governed as part of the instance log rate cap (which used to be configured to 22 MB/s for General Purpose and 48 MB/s for Business Critical). With this set of improvements, tempdb I/O operations are no longer governed as part of the instance log rate cap, allowing for significantly higher tempdb I/O rates.

The improved tempdb performance will greatly improve the speed of tempdb-bound operations, such as running queries with large sorts/spills, or data loading through tempdb.

It looks like they’ve upped the caps on several storage-related limits for no extra charge.

From Kafka Into Azure Data Explorer

Anagha Khanolkar walks us through a data movement scenario:

Here is an end-to-end, hands-on lab showcasing the connector in action. You can see an overview of the lab below. In our lab example, we’re going to stream the Chicago crimes public dataset to Kafka on Confluent Cloud on Azure using Spark on Azure Databricks. Then, we will use the Kusto connector to stream the data from Kafka to Azure Data Explorer.

There’s also a lab to try this out, though the estimated spend is a bit high.
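
For a feel of the first leg of that pipeline, writing a Spark dataframe from Databricks to a Kafka topic on Confluent Cloud looks roughly like the sketch below. The file path, bootstrap servers, API key and secret, and topic name are all placeholders; the lab has the real configuration.

```python
# Rough sketch: batch-writing a dataframe from Spark on Databricks to a Kafka topic
# on Confluent Cloud. Paths, servers, credentials, and the topic are placeholders.
from pyspark.sql import functions as F

crimes_df = spark.read.csv("/mnt/data/chicago_crimes.csv", header=True)  # hypothetical path

# Databricks ships a shaded Kafka client, hence the kafkashaded prefix on the login module
jaas = (
    'kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required '
    'username="<api-key>" password="<api-secret>";'
)

(crimes_df
    .select(F.to_json(F.struct(*crimes_df.columns)).alias("value"))  # each row as JSON
    .write
    .format("kafka")
    .option("kafka.bootstrap.servers", "<cluster>.confluent.cloud:9092")
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config", jaas)
    .option("topic", "chicago-crimes")                               # hypothetical topic
    .save())
```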

Azure Synapse Analytics Query Options

James Serra has a breakdown of what can query what in Azure Synapse Analytics:

The public preview version of Azure Synapse Analytics has three compute options and four types of storage that it can access (mentioned in my blog post SQL on-demand in Azure Synapse Analytics). This gives twelve possible combinations for querying data. Not all of these combinations are currently supported, and some have a few quirks, which I list below.

Read on for a table which breaks down current functionality as well as expected GA functionality.

The Cost of Verifying Backups to Azure

Matt Robertshaw reminds us that TANSTAAFL:

Within two weeks of backups being written to Blob Storage, we observed a significant upward trend in cost associated with a Storage Account. When compared to the previous month, there was an increase of c. £270. After some further analysis, we were able to associate this increase with “bandwidth” charges. This didn’t feel right – you don’t pay anything to upload data to Azure (ingress); you only pay when downloading data from Azure (egress).

Using Azure Monitor, we profiled the ingress and egress rates for the affected Storage Account and noticed the following pattern:

Each day, c. 150GB of backups were being written to blob storage (in blue), but shortly after, the same amount was being downloaded (in red). Over this period, we calculated 4TB of data had been uploaded and then downloaded again. Based on Microsoft’s latest Bandwidth pricing, whilst the first 5GB of egress per month is free, the next 5GB – 10TB is charged at £0.065 per GB. Some simple maths confirms it to be the additional £270 we observed.

Read on for three possible solutions. My preference for an on-prem solution would be to verify locally and then push to Blob Storage / S3. Backups tend to be faster that way as well, as your disk is likely faster than your network.
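
As a quick sanity check on the quoted maths (treating the £0.065 figure as a per-GB egress rate and ignoring the 5GB monthly free allowance):

```python
# Back-of-envelope check of the egress charge described above
egress_gb = 4 * 1024      # ~4 TB downloaded again for verification
free_gb = 5               # first 5 GB of egress per month is free
price_per_gb = 0.065      # quoted rate in GBP per GB

print(f"~£{(egress_gb - free_gb) * price_per_gb:.0f}")  # roughly £266, close to the £270 observed
```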

Using an Azure VM’s D Drive for tempdb

William Assaf shows how you can use the temporary D drive on an Azure VM to host tempdb in SQL Server:

Moving your SQL Server instance’s TempDB files to the D: volume is recommended for performance, as long as the TempDB files fit in the D: volume that has been allocated, based on your VM size.
When the D: is lost due to deallocation, as expected, the subfolder you created for the TempDB files (if applicable) and the NTFS permissions granting SQL Server permission to the folder are no longer present. SQL Server will be unable to create the TempDB files in the subfolder and will not start. Even if you put the TempDB data and log files in the root of D:, after deallocation, that’s still not a solution, as the NTFS permissions to the root of D: won’t exist. In either case, SQL Server will be unable to create the TempDB files and will not start.

Read on for a few options and William’s thoughts on the relative merits of each.
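
One common shape of a fix (William covers the options properly) is a boot-time task that recreates the folder and re-grants the service account access before SQL Server starts. Below is a rough Python sketch of that idea, assuming a D:\TempDB folder and the default NT SERVICE\MSSQLSERVER service account; a PowerShell startup script is the more typical vehicle for this.

```python
# Rough sketch of a startup task that rebuilds the tempdb folder on the ephemeral
# D: drive and re-grants SQL Server access. The folder path and service account
# are assumptions for illustration.
import os
import subprocess

TEMPDB_DIR = r"D:\TempDB"                        # hypothetical tempdb folder
SERVICE_ACCOUNT = r"NT SERVICE\MSSQLSERVER"      # default instance service account

# Recreate the folder that disappeared along with the ephemeral drive
os.makedirs(TEMPDB_DIR, exist_ok=True)

# Re-grant full control on the folder to the SQL Server service account
subprocess.run(
    ["icacls", TEMPDB_DIR, "/grant", f"{SERVICE_ACCOUNT}:(OI)(CI)F"],
    check=True,
)

# With the folder and permissions back, the service can start
subprocess.run(["net", "start", "MSSQLSERVER"], check=True)
```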

Power BI (Lack of) Performance with Sharepoint

Matt Allington is not impressed:

On the face of it, it seems like a great idea to leverage SharePoint as a storage location for CSV and Excel files.

– Everyone has easy access to the files for editing and storage
– SharePoint manages version control, check in, check out etc
– SharePoint can facilitate shared editing of files
– You can build a Power BI report that will refresh online without the need to install a gateway.

Unfortunately, despite the benefits, the experience is not great. Power BI performance with SharePoint as a data source is simply terrible. Ultimately, the problems come down to performance in two areas.

Read on to learn more about these two issues and what you can do instead.

Passing an Array of Arrays as a Parameter in Azure Data Factory

Rayis Imayev has a list for us:

In my previous blog post – Setting default values for Array parameters/variables in Azure Data Factory, I had helped myself to remember that arrays could be passed as parameters to my Azure Data Factory (ADF) pipelines. This time I’m helping myself to remember that an array of other arrays can also exist as ADF pipeline parameters’ values.

Read on for the example.
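
Outside the portal, the same idea shows up when you trigger a pipeline run from code: an Array parameter's value can itself contain arrays. Here's a hedged sketch using the azure-mgmt-datafactory SDK, where the factory, pipeline, and parameter names are placeholders:

```python
# Hedged sketch: passing an array of arrays as a pipeline parameter when triggering
# an ADF pipeline run. All names and IDs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

run = client.pipelines.create_run(
    resource_group_name="my-resource-group",   # hypothetical
    factory_name="my-data-factory",            # hypothetical
    pipeline_name="pl_copy_regions",           # hypothetical
    parameters={
        # An Array parameter whose elements are themselves arrays
        "RegionBatches": [
            ["eastus", "eastus2"],
            ["westus", "westus2"],
        ]
    },
)

print(run.run_id)
```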

Indexing S3 Data with CDP Data Hub

Eva Nahari, et al., show how to perform indexing and serving of S3 data in Cloudera Data Platform:

This blog post will present a simple “hello world” kind of example on how to get data that is stored in S3 indexed and served by an Apache Solr service hosted in a Data Discovery and Exploration cluster in CDP. For the curious: DDE is a pre-templated, Solr-optimized cluster deployment option in CDP, recently released in tech preview. We will only cover AWS and S3 environments in this blog. Azure and ADLS deployment options are also available in tech preview, but will be covered in a future blog post.

We will depict the simplest scenario to make it easy to get started. There are of course more advanced data pipeline setups and richer schemas possible, but this is a good starting point for a beginner.

Read on for the instructions.
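
Once the DDE cluster and a collection exist, the Solr side is plain HTTP. As a rough illustration (not the lab's actual Spark-based indexing job), posting a document and querying it back through Solr's JSON API looks something like this, with the host, credentials, collection, and fields all placeholders:

```python
# Hedged sketch: adding and querying a document against a Solr collection over its
# JSON HTTP API. Endpoint, credentials, collection, and fields are placeholders.
import requests

SOLR = "https://dde-host.example.com:8983/solr"   # hypothetical DDE Solr endpoint
COLLECTION = "s3-docs"                            # hypothetical collection
AUTH = ("workload-user", "workload-password")     # placeholder workload credentials

doc = {"id": "doc-1",
       "title_txt": "hello world",
       "source_s": "s3://my-bucket/data/file.csv"}

# Add the document and commit in one request
requests.post(f"{SOLR}/{COLLECTION}/update?commit=true",
              json=[doc], auth=AUTH).raise_for_status()

# Query it back
resp = requests.get(f"{SOLR}/{COLLECTION}/select",
                    params={"q": "title_txt:hello"}, auth=AUTH)
print(resp.json()["response"]["numFound"])
```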
