Press "Enter" to skip to content

Category: Cloud

Using Azure Functions to Tag Resources

Jess Pomfret shows off an interesting way of using Azure Functions to apply tags to resources:

In part one I discussed how useful Azure tags can be, specifically how adding a ‘dateCreated’ tag can help you keep track of your resources, and how to find resources with certain tags using PowerShell. Parts 2 and 3 are based on the fact that adding the ‘dateCreated’ tag is a great idea, but relying on a human to remember to add it is less than ideal. In part 2 we looked at using Azure Policy to automatically add the tag. Today’s post will cover another option using Azure Functions.

Azure Functions gives us a way of running serverless code, written in a number of different languages, triggered by specific events or timings. Looking through the documentation, there are many use cases, from processing files to analysing IoT workstreams. Our use case is to run a PowerShell script that tags any resources that are missing the ‘dateCreated’ tag.
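The heart of the approach is only a few lines of PowerShell. As a rough sketch of the idea (not Jess’s exact script), a timer-triggered function might look something like this, assuming the Function App runs under a managed identity that can read resources and write tags:

```powershell
# Sketch only -- Jess's post has the real implementation.
# run.ps1 for a timer-triggered PowerShell function; assumes the
# Function App's managed identity can read resources and update tags.
param($Timer)

$today = (Get-Date).ToString('yyyy-MM-dd')

# Find resources that don't have a dateCreated tag yet
$untagged = Get-AzResource |
    Where-Object { -not $_.Tags -or -not $_.Tags.ContainsKey('dateCreated') }

foreach ($resource in $untagged) {
    # Merge preserves existing tags; the run date is only an
    # approximation of the true creation date
    Update-AzTag -ResourceId $resource.ResourceId `
        -Tag @{ dateCreated = $today } `
        -Operation Merge
}
```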

Click through to see how.


Hyperparameter Tuning in Azure Machine Learning

Dinesh Asanka takes us through hyperparameter tuning with Azure Machine Learning’s designer:

In the above experiment, both the previous model and the TMH model are included so that we can compare the two. The Tune Model Hyperparameters control is inserted between the Split Data and Score Model controls as shown. The TMH control has three inputs: the first needs the relevant technique, which in this scenario is the Two-Class Logistic Regression technique; the second needs the training data set; and the last needs the evaluation data set, for which the test data set can be used.

The Tune Model Hyperparameters control provides the best combination, and it is connected to the Score Model control. After the test data stream is connected to the Score Model control, the output of the model is connected to the second input of the Evaluate Model control so that the previous model and the tuned model can be compared.

I’m not sure if there’s something handled internally in the Tune Model Hyperparameters component, but based on the pipeline images, I’d actually want two separate Split components so that I ended up with something more like 50-20-30 for training, hyperparameter testing, and validation. The first two pipelines appear to be 70-30-0 instead, and so can give you a false sense of confidence in model quality.


Cost versus Performance Optimization for SQL Server on VMs in Azure

Pam Lahoud takes a look at multi-constraint optimization:

So how do you get the best price-performance possible when configuring your SQL Server on Azure VM? In this blog, we’re going to cover three key aspects to right-sizing (and right-configuring) your Azure VM for SQL Server that are based on some common pitfalls customers face when migrating their on-premises workloads to Azure VM:

– Choosing the best VM series and size for your workload
– Configuring storage for maximum throughput and lower cost
– Leveraging features unique to Azure, such as host caching, to boost performance at no additional cost

One key point of the article is that there are several factors which can make a big difference in price and performance, but which you might not think about on-premises. It’s definitely worth taking the time to research this. It’s also a great example of how administrators are still important in a cloud-based world—having an admin who understands these settings and can get the most out of a given server can save a lot of money very quickly.
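As one concrete example of a no-cost win, host caching is just a property on the disk. A hypothetical snippet using the Az module (resource names are placeholders):

```powershell
# Hypothetical example: turn on read-only host caching for a data disk.
# ReadOnly generally suits SQL Server data files; log disks usually do
# better with caching set to None.
$vm = Get-AzVM -ResourceGroupName 'sql-rg' -Name 'sqlvm01'

Set-AzVMDataDisk -VM $vm -Name 'sqlvm01-data' -Caching ReadOnly

# Push the configuration change back to Azure
Update-AzVM -ResourceGroupName 'sql-rg' -VM $vm
```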


Limitations with Control Flows in Azure Data Factory

Meagan Longoria has a list:

If you’ve been using Azure Data Factory for a while, you might have hit some limitations that don’t exist in tools like SSIS or Databricks. Knowing these limitations up front can help you design better pipelines, so I’m listing a few here of which you’ll want to be aware.

1. You cannot nest For Each activities.
Within a pipeline, you cannot place a For Each activity inside of another For Each activity. If you need to iterate through two datasets, you have two main options. You can combine the two datasets before you iterate over them. Or you can use a parent/child pipeline design where you move the inner For Each activity into the child pipeline. Fun fact: currently the Data Factory UI won’t stop you from nesting For Each activities. You won’t find out until you try to execute the pipeline.

Click through for several other limitations and workarounds.


Drift Monitoring with Azure Machine Learning

I take a look at dataset drift monitoring in Azure Machine Learning:

One of the things I like to say about machine learning models is, “shift happens.” By that, I mean that models lose effectiveness over time due to changes in underlying circumstances. Relationships between variables that used to hold no longer do, and so our model quality degrades. This means that we sometimes need to retrain models.

But there’s a cost to retraining models—that work can be computationally expensive and time-consuming. This concern is particularly salient if you’re in a cloud, as you pay directly for everything there. This means that we don’t want to retrain models unless we need to. But how do we know when we should retrain the model? We can watch for model degradation, but there’s another method: drift detection in your datasets.

Read on for a demonstration of how the product works and a couple of things to keep in mind.


Tagging Azure Resources by Policy

Jess Pomfret makes tagging resources in Azure better:

Last week, in Part 1, we talked about how to easily keep track of our resources with tags. There are many strategies for tagging your resources, but I specifically focused on adding a ‘dateCreated’ tag so we could see when resources were created, since this isn’t available by default. During that post, we identified that the biggest issue was relying on a human to remember to add the ‘dateCreated’ tag for every resource they created. I’ve got two ideas on how to fix that – today we’ll look at the first option, using Azure Policy.

Azure Policy is a way of comparing your Azure estate to defined requirements. You can either use predefined definitions (of which there are many) or create your own specific rules.  These definitions can be assigned to certain scopes (subscriptions, resource groups). Azure Policy then reports on whether you’re in the expected state and in some cases can alter resources to ensure you are.
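To give a feel for the shape of the solution, here is a condensed sketch (not necessarily Jess’s exact definition) of a Modify-effect policy that stamps the tag. One quirk: utcNow() is only legal in a parameter’s defaultValue, which is why the date arrives via a parameter:

```powershell
# Condensed sketch; see Jess's post for the full, tested definition.
$rule = @'
{
  "if": { "field": "tags['dateCreated']", "exists": "false" },
  "then": {
    "effect": "modify",
    "details": {
      "roleDefinitionIds": [
        "/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
      ],
      "operations": [
        { "operation": "add",
          "field": "tags['dateCreated']",
          "value": "[parameters('currentDate')]" }
      ]
    }
  }
}
'@

# utcNow() may only appear in a parameter's defaultValue
$params = @'
{ "currentDate": { "type": "String", "defaultValue": "[utcNow()]" } }
'@

$definition = New-AzPolicyDefinition -Name 'add-dateCreated-tag' `
    -Policy $rule -Parameter $params -Mode Indexed

# Modify policies need a managed identity to do the tagging; older Az
# module versions use -AssignIdentity instead of -IdentityType.
New-AzPolicyAssignment -Name 'add-dateCreated-tag' `
    -PolicyDefinition $definition `
    -Scope '/subscriptions/00000000-0000-0000-0000-000000000000' `
    -IdentityType SystemAssigned -Location 'eastus'
```

Note that a Modify policy fires on resource creation and updates, so pre-existing resources would need a remediation task to get back-filled.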

Click through to see how to define a policy and then how to apply it to relevant resources.


Granular Permissions for Dynamic Data Masking

John Martin reviews a change:

All the way back in SQL Server 2016, Microsoft released the Dynamic Data Masking feature in the database engine. It seemed like a huge step forward and promised so much, but there were severe limitations around the way that we could control who sees what masked data. Either you saw masked data wherever it was configured or you saw clear data; there was no granularity. I wrote about this and a few other things to do with Dynamic Data Masking all the way back in August of 2016 when I was at SentryOne. You can check that post out here. Also, back then I created several Connect items (blast from the past there), one of which was pulled over to the UserVoice replacement, where I asked for the UNMASK securable to be made more granular. You can check that out here.

So, why am I writing this post? Well, it seems that our (my?) request has been granted. At least in Azure SQL Database. On March the 17th this year, a little announcement slipped out stating “General availability: Dynamic data masking granular permissions for Azure SQL and Azure Synapse Analytics”. So, has this delivered on what we wanted, to really help this feature live up to its promise?
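For a flavor of what the new granularity looks like, the UNMASK grant can now be scoped anywhere from the whole database down to a single column. A hypothetical example (made-up principals and objects; John’s post has proper demos):

```powershell
# Hypothetical grants; requires the SqlServer module for Invoke-Sqlcmd.
$grants = @'
-- Database-wide UNMASK: the only option before this change
GRANT UNMASK TO [ReportingUser];

-- The grant can now be scoped to a schema, a table, or a single column
GRANT UNMASK ON SCHEMA::Sales TO [SalesAnalyst];
GRANT UNMASK ON Sales.Customers TO [SalesManager];
GRANT UNMASK ON Sales.Customers(EmailAddress) TO [SupportAgent];
'@

$token = (Get-AzAccessToken -ResourceUrl 'https://database.windows.net/').Token
Invoke-Sqlcmd -ServerInstance 'myserver.database.windows.net' `
    -Database 'SalesDb' -AccessToken $token -Query $grants
```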

Read on to see how it works and what John thinks of the whole thing.


Azure ML and Azure SQL DB

I remembered that I had another blog and actually wrote something technical:

Not too long ago, I worked through an interesting issue with Azure Machine Learning. The question was, what’s the best way to read from Azure SQL Database, perform model processing, and then write results out to Azure SQL Database? Oh, by the way, I want to use a service principal rather than SQL authentication. Here’s what I’ve got.

This turned out to be a lot more work than I first expected.
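The post does the work from within Azure ML; as a hedged sketch of just the service-principal-to-SQL handshake, translated into PowerShell (every identifier below is a placeholder):

```powershell
# Placeholder credentials throughout -- sketch only.
$clientId = '<app-registration-client-id>'
$secret   = ConvertTo-SecureString '<client-secret>' -AsPlainText -Force
$cred     = New-Object System.Management.Automation.PSCredential($clientId, $secret)

# Sign in as the service principal, then mint a token scoped to Azure SQL
Connect-AzAccount -ServicePrincipal -Credential $cred -Tenant '<tenant-id>'
$token = (Get-AzAccessToken -ResourceUrl 'https://database.windows.net/').Token

# The service principal must exist in the database as a contained user:
#   CREATE USER [my-app-registration] FROM EXTERNAL PROVIDER;
Invoke-Sqlcmd -ServerInstance 'myserver.database.windows.net' `
    -Database 'mydb' -AccessToken $token `
    -Query 'SELECT TOP (5) * FROM dbo.InputData;'
```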


Building a Friendship Lamp

Drew Furgiuele is looking for mood lighting tips:

It did get me thinking, though: what if I could take this idea and change it up a bit to where people could send me messages WITHOUT the need for them to have a lamp (and thereby give them plausible deniability of being, in fact, my friend). How would that work? In absence of a lamp, would a web application work? And what if we could let people pick a color in lieu of an actual message? You could send a whole mood!

And just like that, my motivation was restored. Time to get to work.

Click through for the build process, which includes 3D printing components, wiring and soldering to circuit boards, writing software for the IoT device, building the front-end web app, and more. Also, I sent red but now I’m not sure if I regret that color choice based on re-reading the first paragraph above.


Loading Azure Synapse Analytics using PolyBase

Gauri Mahajan needs to load some data:

Azure Synapse Analytics is Microsoft’s data warehousing offering on Azure. It supports three types of runtimes: SQL Serverless Pools, SQL Dedicated Pools, and Spark Pools. Because there is such a variety of data sources on Azure, there can be varying types and volumes of data to load into Azure Synapse pools. There are three major approaches to ingesting data into Synapse. The COPY command is the most flexible and elaborate mechanism; you can execute it from a SQL pool to load data from supported data repositories, and it is convenient for ad hoc and small to medium-sized data loads. The second method is Bulk Insert, whose name describes its functionality. The third is PolyBase: for ingesting data from supported repositories into dedicated SQL pools, PolyBase is as efficient as, and at times more efficient than, the COPY command. This article will help you understand the process of ingesting data into Azure Synapse Analytics using PolyBase.
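As a compressed sketch of the PolyBase pattern the article walks through (placeholder names throughout; the credential needed for secured storage is omitted):

```powershell
# Sketch of a PolyBase load into a dedicated SQL pool. For storage that
# isn't publicly readable, the data source also needs a CREDENTIAL.
$polybaseLoad = @'
CREATE EXTERNAL DATA SOURCE SourceLake
WITH (TYPE = HADOOP,
      LOCATION = 'abfss://data@mystorageacct.dfs.core.windows.net');

CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2));

-- The external table is just metadata over the files in the lake
CREATE EXTERNAL TABLE ext.SalesStaging (
    SaleId   INT,
    Amount   DECIMAL(10, 2),
    SaleDate DATE
)
WITH (LOCATION = '/sales/', DATA_SOURCE = SourceLake, FILE_FORMAT = CsvFormat);

-- CTAS pulls the external data into a distributed internal table
CREATE TABLE dbo.Sales
WITH (DISTRIBUTION = HASH(SaleId))
AS SELECT * FROM ext.SalesStaging;
'@

# Authentication omitted for brevity; use -AccessToken or SQL auth
Invoke-Sqlcmd -ServerInstance 'mysynapse.sql.azuresynapse.net' `
    -Database 'DedicatedPool01' -Query $polybaseLoad
```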

Click through for the process.
