Press "Enter" to skip to content

Category: Cloud

Lookups And Conditionals In Azure Data Factory V2

Alex Whittles shows us how to perform lookups and operations with IF clauses in Azure Data Factory V2:

Azure Data Factory v2 (ADFv2) has some significant improvements over v1, and we now consider ADF as a viable platform for most of our cloud based projects. But things aren’t always as straightforward as they could be. I’m sure this will improve over time, but don’t let that stop you from getting started now.

This post provides a walk through of using the ‘Lookup’ and ‘If Condition’ activities to do some basic conditional logic depending on the results of a database query.

Assumptions: You already have an ADF pipeline created. If you want to hook into SSIS then you’ll also need the SSIS Integration Runtime set up – although this is not relevant just for the if condition.

Read on for an example.

Comments closed

Connecting To Azure SQL Database From On-Prem

Arun Sirpal shows how to set up a linked server instance between an on-prem SQL Server instance and Azure SQL Database:

You may (or may not) have a requirement to setup a linked server to Azure SQL Database from a locally installed SQL Server. One reason could be to pull down some reports from an Azure SQL Database to a local file share. Whatever your reason is hopefully you will find this blog post useful because I ran into some complications on the way.

This is what your linked server creation screens in SSMS (SQL Server Management Studio) should look like.

Take advantage of Arun’s hard-earned experience and read his post.

Comments closed

Resizing Azure Managed Instances

Jovan Popovic shows how to resize Azure SQL managed instances with Powershell:

Azure SQL Managed Instance is fully-managed SQL Server Database Engine hosted in Azure cloud. With Managed Instance you can easily add/remove cores associated to the instance and change the reserved size of the instance. You can use PowerShell to easily manage size of the instance and automate this process.

As a prerequisite, you need to have Azure SQL PowerShell libraries to configure Managed Instance. You would need to install Azure RM PowerShell and  AzureRm.Sql module that contains the commands for updating properties of Managed Instance.

Read on for a demo.

Comments closed

Azure Data Factory V2 Pricing

Chris Seferlis gives us the details on how Azure Data Factory V2 pricing works:

2. Volume of data moved – this is measured in DMUs (data movement units). This is one you should be aware of as this will default to auto, which is basically using all the DMUs it can use and this is paid for by the hour. Let’s say you specify and use 2 DMUs and it takes an hour to move that data. The other option is you could use 8 DMUs and it takes 15 minutes, this price is going to end up the same. You’re using 4X the DMUs but it’s happening in a quarter of the time.

This is good to look at and do some comparisons since how many DMUs you’re using is where the bulk of your spend if going to be.

There are a few moving parts here, so the calculation is not trivial.  But Chris makes good sense of it all.

Comments closed

Picking An Azure SQL Database Tier

Esat Erkec has various methods you can use to figure out your Azure SQL Database tier:

When we are beginning to think of migrating our on-premises databases to Azure SQL, we have to decide on a proper purchase model, a service tier, and a performance level. Before starting the Azure SQL migration process, we have to find logical and provable answers to the following questions:

  • Which purchase model is suitable for my apps and business requirements?
  • How much budget do I need?
  • Which performance level meets my requirements?
  • Can I achieve the acceptable performance of my apps?

The purchase model and service tiers will certainly affect the Azure bills. On the other hand, we have to achieve the maximum performance with the selected service tier.

It’s a good article with helpful tips for people thinking of moving to Azure SQL Database.

Comments closed

Gartner’s Cloud IaaS Magic Quadrant Changes

Bruno Aziza analyzes Gartner’s Magic Quadrant for Cloud Infrastructure as a Service offerings:

The first and most drastic change that occurred over the last year is the number of players that Gartner decided to highlight in its report: the number of vendors went from 14 to just 6 this year.  

Why is that?! Have the big become bigger and the small smaller?! Or has the space shrunk?   The latter is highly improbable.  All the contrary: earlier last year, forecasted that the highest growth in the cloud market would be coming from the sector this MQ covers: Gartner predicted that the cloud system infrastructure services would grow over 36% to reach $34B+ in 2017.

So, what gives?!

Read on to learn what gives.  As far as the rankings themselves go, I think it’s reasonable:  AWS and Azure can generally go head-to-head on features though Amazon does have the advantage.  Google is a distant third and the rest aren’t major players.

Comments closed

Don’t Forget Those Paused Indexes

Arun Sirpal tries to create a new index on his Azure SQL Database:

I was creating some demo non-clustered indexes in one of my Azure SQL Databases and received the following warning when I executed this code:

CREATE NONCLUSTERED INDEX [dbo.NCI_Time]
ON [dbo].[Audit] ([UserId])
INCLUDE ([DefID],[ShopID])

Msg 10637, Level 16, State 3, Line 7

Cannot perform this operation on ‘object’ with ID 1093578934 as one or more indexes are currently in resumable index rebuild state. Please refer to sys.index_resumable_operations for more details.

How intriguing!

Fortunately, the error message is clear and helpful, two terms which rarely go in conjunction with “error message.”

Comments closed

Bulk Loading Into Cosmos DB

Ben Jarvis has a performance test for the new Cosmos DB Bulk Executor Library:

One of the many exciting announcements made at MSBuild recently was the release of the new Cosmos DB Bulk Executor library that offers massive performance improvements when loading large amounts of data to Cosmos DB (see https://docs.microsoft.com/en-us/azure/cosmos-db/bulk-executor-overview for details on how it works). A project I’ve worked on involved copying large amounts of data to Cosmos DB using ADF and we observed that the current Cosmos DB connector doesn’t always make full use of the provisioned RU/s so I am keen to see what the new library can offer and look to see if our clients can take advantage of these improvements.

In this post I will be doing a comparison between the performance of the Cosmos DB connector in ADF V1, ADF V2 and an app written in C# using the Bulk Executor library. As mentioned in the Microsoft announcement, the new library has already been integrated into a new version of the Cosmos DB connector for ADF V2 so the tests using ADF V2 are also using the Bulk Executor library.

There are some significant performance improvements from using this bulk loader, as Ben shows.

Comments closed

Using AU Analyzer To Lower Data Lake Analytics Costs

Matthew Hicks shows off the Data Lake Analytics AU Analyzer:

The AU Analyzer looks at all the vertices (or nodes) in your job, analyzes how long they ran and their dependencies, then models how long the job might run if a certain number of vertices could run at the same time. Each vertex may have to wait for input or for its spot in line to run. The AU Analyzer isn’t 100% accurate, but it provides general guidance to help you choose the right number of AUs for your job.

You’ll notice that there are diminishing returns when assigning more AUs, mainly because of input dependencies and the running times of the vertices themselves. So, a job with 10,000 total vertices likely won’t be able to use 10,000 AUs at once, since some will have to wait for input or for dependent vertices to complete.

In the graph below, here’s what the modeler might produce, when considering the different options. Notice that when the job is assigned 1427 AUs, assigning more won’t reduce the running time. 1427 is the “peak” number of AUs that can be assigned.

I like this kind of tooling, as it provides a realistic assessment of tradeoffs.

Comments closed

Creating Azure SQL Database Managed Instances Via ARM Templates

Jovan Popovic shows how us how to build a Managed Instance of Azure SQL Database using Powershell and an ARM template:

Values that you need to change in this request are:

  • name – name of your Azure SQL Managed Instance (don’t include domain).

  • properties/administratorLogin – SQL login that will be used to connect to the instance.

  • properties/subnetId – Azure identifier of the subnet where Azure SQL Managed Instance should be placed. Make sure that you properlyconfigure network for Azure SQL Managed Instance

  • location – one of the valid location for Azure data centers, for example: “westcentralus”

  • sku/name: GP_Gen4 or GP_Gen5

  • properties/vCores: Number of cores that should be assigned to your instance. Values can be 8, 16, or 24 if you select GP_Gen4 sku name, or 8, 16, 24, 32, or 40 if you select GP_Gen5.

  • properties/storageSizeInGB: Maximum storage space for your instance. It should be multiple of 32GB.

  • properties/licenceType: Choose BasePrice if you don’t have SQL Server on-premises licence that you want to use, or LicenceIncluded if you can have discount for your on-premises licence.

  • tags(optional) – optionally put some key:value pairs that you would use to categorize instance.

Click through for the template and a quick Powershell script which shows how to use the template.

Comments closed