Cloud – Page 101 – Curated SQL

Adding a row number to your dataset could a trivial task. Both ANSI and Spark SQL have the row_number() window function that can enrich your data with a unique number whole for your whole or partitioned data recordset.
Recently I had a case of creating a data flow in Azure Data Factory (ADF) where there was a need to add a row number.

Read on for a couple attempts which didn’t work, followed by two that do, including an assist from Joseph Edwards.

Comments closed

ADF and Self-Hosted Integration Runtime Config Errors

Published 2020-10-05 by Kevin Feasel

Teo Lachev points out a common issue with using the Azure Data Factory self-hosted integration runtime:

You’ve set up the Azure Data Factory self-hosted integration runtime to access on-prem data sources. You create a linked server, click Test Connection, and then get greeted with an error saying the security context can’t be passed. On the on-prem VM, you use the Integration Runtime Configuration Manager and get a similar error or something to the extent that JSON can’t be parsed. You spent a few hours in trying everything that comes to mind, such as checking firewalls, connectivity from SSMS, but nothing helps.

Read on for the solution.

Comments closed

Azure Arc Deployment Options for SQL Server

Published 2020-10-01 by Kevin Feasel

Sasha Nosov takes us through Azure Arc deployment options for SQL Server:

As you can see, both on-Azure and off-Azure options offer you a choice between IaaS and PaaS. The IaaS category targets the applications that cannot be changed because of the SQL version dependency, ISV certification or simply because the lack of in-house expertise to modernize. The PaaS category targets the applications that will benefit from modernization by leveraging the latest SQL features, gaining a better SLA and reducing the management complexity.

Click through for a graphic, as well as further clarification on each item.

Comments closed

Recursive Metadata Discovery in Azure Data Factory

Published 2020-09-30 by Kevin Feasel

Richard Swinbank gives us one method to perform recursive metadata discovery in Azure Data Factory:

Azure Data Factory’s Get Metadata activity returns metadata properties for a specified dataset. In the case of a blob storage or data lake folder, this can include childItems array – the list of files and folders contained in the required folder. If you want all the files contained at any level of a nested a folder subtree, Get Metadata won’t help you – it doesn’t support recursive tree traversal. In this post I try to build an alternative using just ADF.

But before you get too invested in this technique, please read Richard’s spoiler.

Comments closed

Public Preview of SQL Server on Azure Arc

Published 2020-09-28 by Kevin Feasel

Sasha Nosov gives us an update on Azure Arc:

The preview includes the following features:
– Use Azure Portal to register and track the global inventory of your SQL instances across different hosting infrastructures. You can register an individual SQL instance or register a set of servers at scale using the same auto-generated script.
– Use Azure Security Center to produce a comprehensive report of vulnerabilities in SQL servers and get advanced, real time security alerts for threats to SQL servers and the OS.
– Investigate threats in SQL Servers using Azure Sentinel
– Periodically check the health of the SQL Server configurations and provide comprehensive reports and remediation recommendations using the power of Azure Log analytics.

Click through for more information and documentation.

Comments closed

Diving into the Azure Resource Mover

Published 2020-09-25 by Kevin Feasel

Dennes Torres shows off what the Azure Resource Mover can do:

If you include the need to copy a resource or set of resources, instead of only moving, the list expands a lot.
Azure already offers the resources to do this: ARM templates, automated deployments, Data Sync, Recovery Services Vault, VM replication and so on. The problem is that sometimes, to move a set of objects together, you may need to use many of these services and understand how to use them.
The solution is a new free service, still in preview, called Azure Resource Mover. This service reduces the complexity of moving resources, minimizing the number of decisions needed on how the resources will be moved. More than that, the last step, deleting the source of the move, is optional, as you will see in detail later. You can use this feature, not only to move resources, but also to copy and distribute them across many regions. During the move process, only one side (source or destination) will be active, but once you finish the move, if you decide not to delete the source, you have in fact a new deployment of the solution.

This is a fairly detailed tutorial, so check it out.

Comments closed

Moving a Virtual Machine with the Azure Resource Mover

Published 2020-09-25 by Kevin Feasel

Kathi Kellenberger tries out the Azure Resource Mover:

Another task I may want to perform is to move a VM to another region. I found this set of steps that involves using Azure Recovery Services Vault that seems a bit complex. Fortunately, I recently heard about a new, much easier way to move VMs and other resources called Azure Resource Mover (in preview). It was announced today.

Read on to see how this works. I like the idea a lot, especially for those times when you accidentally create resources in different regions and only realize it when it’s time to tie everything together.

Comments closed

Azure Synapse Analytics Sample Datasets and Scripts

Published 2020-09-25 by Kevin Feasel

James Serra shows us where to find samples for Azure Synapse Analytics:

Datasets: A bunch of datasets that when added will show up under Data -> Linked -> Azure Blob Storage. You can then choose an action (via “…” next to any of the containers in the dataset) and choose New SQL script -> Select TOP 100 rows to examine the data as well as choose “New notebook” to load the data into a Spark dataframe. Any dataset you add is a linked service to files in a blob storage container using SAS authentication. You can also create an external table in a SQL on-demand pool or SQL provisioned pool to each dataset via an action (via “…” next to “External tables” under the database, then New SQL script -> New external table) and then query it or insert the data into a SQL provisioned database

Click through to learn more, as well as a few other things you can do with Synapse Analytics.

Comments closed

Azure SQL Managed Instance Updates

Published 2020-09-24 by Kevin Feasel

Borko Novakovic gives us a rundown of improvements to Azure SQL Managed Instances:

Azure SQL Managed Instance provide management operations that you can use to automatically deploy new managed instances, update instance properties, and delete instances when no longer needed. Most of the management operations in SQL Managed Instance are long-running but until now it was not possible for customers to get detailed information about operation status and progress in an easy and transparent way.
Through the introduction of a new CRUD API, the SQL Managed Instance resource is now visible from when the create request is submitted. In addition, the new OPERATIONS API adds the ability to monitor management operations, see operation steps, and take dependent actions based on operation progress.
Check out this blog post to learn how to effectively utilize new APIs in real-word scenarios.

If this product potentially fits your needs, also check out Vladimir Ivanovic’s post on performance improvements:

Previously, the tempdb I/O operations were governed as part of the instance log rate cap (which used to be configured to 22 MB/s for General Purpose and 48 MB/s for Business Critical). With this set of improvements, tempdb I/O operations are no longer governed as part of the instance log rate cap, allowing for a significantly higher tempdb I/O rates.
The improved tempdb performance will greatly improve the speed of tempdb-bound operations, such as running queries with large sorts/spills, or data loading through tempdb.

It looks like they’ve upped the caps on several storage-related limits for no extra charge.

Comments closed

From Kafka Into Azure Data Explorer

Published 2020-09-21 by Kevin Feasel

Anagha Khanolkar walks us through a data movement scenario:

Here is an end-to-end, hands-on lab showcasing the connector in action. You can see an overview of the lab below. In our lab example, we’re going to stream the Chicago crimes public dataset to Kafka on Confluent Cloud on Azure using Spark on Azure Databricks. Then, we will use the Kusto connector to stream the data from Kafka to Azure Data Explorer.

There’s also a lab to try this out, though the estimated spend is a bit high.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Category: Cloud

Adding Row Numbers to ADF Data Flows

ADF and Self-Hosted Integration Runtime Config Errors

Azure Arc Deployment Options for SQL Server

Recursive Metadata Discovery in Azure Data Factory

Public Preview of SQL Server on Azure Arc

Diving into the Azure Resource Mover

Moving a Virtual Machine with the Azure Resource Mover

Azure Synapse Analytics Sample Datasets and Scripts

Azure SQL Managed Instance Updates

From Kafka Into Azure Data Explorer