Press "Enter" to skip to content

Category: Machine Learning

Drift Monitoring with Azure Machine Learning

I take a look at dataset drift monitoring in Azure Machine Learning:

One of the things I like to say about machine learning model is, “shift happens.” By that, I mean that models lose effectiveness over time due to changes in underlying circumstances. Relationships between variables that used to hold no longer do, and so our model quality degrades. This means that we sometimes need to retrain models.

But there’s a cost to retraining models—that work can be computationally expensive and time-consuming. This concern is particularly salient if you’re in a cloud, as you pay directly for everything there. This means that we don’t want to retrain models unless we need to. But when do we know if we should retrain the model? We can watch for model degradation, but there’s another method: drift detection in your datasets.

Read on for a demonstration of how the product works and a couple of things to keep in mind.

Comments closed

Azure ML and Azure SQL DB

I remembered that I had another blog and actually wrote something technical:

Not too long ago, I worked through an interesting issue with Azure Machine Learning. The question was, what’s the best way to read from Azure SQL Database, perform model processing, and then write results out to Azure SQL Database? Oh, by the way, I want to use a service principal rather than SQL authentication. Here’s what I’ve got.

This turned out to be a lot more work than I first expected.

Comments closed

Applied ML Prototypes

Alex Bleakley and Santiago Giraldo announce Applied ML Prototypes:

To directly address these challenges, we’ve released Applied ML Prototypes (AMPs) — a revolutionary new way of developing and shipping enterprise ML use cases — which provide complete ML projects that can be deployed with one click directly from Cloudera Machine Learning. AMPs enable data scientists to go from an idea to a fully working ML use case in a fraction of the time, with an end-to-end framework for building, deploying, and monitoring business-ready ML applications instantly. 

AMPs move the starting line for any ML project by enabling data scientists to start with a full end-to-end project developed for a similar use case, including a trained and deployed ML model, as well as prebuilt predictive business applications, out of the box. This means that ML development teams can tackle their own ML business use cases more quickly, from those involving churn modeling, to sentiment analysis, to anomaly detection and beyond.

Getting past the marketing fluff, there are some interesting ideas here.

Comments closed

ML Services: PYTHONHOME and PATH

Niels Berglund troubleshoots some issues:

In the last post, which looks at using Python 3.9 in SQL Server Machine Learning Services, I wrote this at the very end:

It looks like all is good, but maybe not? In a future post we’ll look at an issue we have introduced – but for now, let us bask in the glory of having created a new Python language extension.

In the post, we wrote a new language extension to handle Python 3.9, and that just worked fine. However, when I was doing some other things, I noticed some side effects, and in this post, we look at those side effects and how to solve them.

Click through to learn more.

Comments closed

Running an mlflow Server on Azure

Paul Hernandez configures mlflow on Azure using platform-as-a-service offerings:

It is indisputable true that mlflow came to make life a lot easier not only for data scientists but also for data engineers, architects among others. There is a very helpful list of tutorials and example in the official mlflow docs. You can just download it, open a console and start using it locally on your computer. This is the fastest way to getting started. However, as soon as you progress and introduce mlflow in your team, or you want to use it extensively for yourself, some components should be deployed outside your laptop.

To exercise a deployment setup and since I own azure experience, I decided to provision a couple of resources in the cloud to deploy the model registry and store the data produced by the tracking server.

I concur on the power of mlflow.

Comments closed

Choosing an ML Algorithm

Hui Li developed a flow for determining appropriate machine learning algorithms:

Since the cheat sheet is designed for beginner data scientists and analysts, we will make some simplified assumptions when talking about the algorithms.

The algorithms recommended here result from compiled feedback and tips from several data scientists and machine learning experts and developers. There are several issues on which we have not reached an agreement and for these issues we try to highlight the commonality and reconcile the difference.

Additional algorithms will be added in later as our library grows to encompass a more complete set of available methods.

Read the whole thing.

Comments closed

Writing a Python Language Extension for ML Services

Niels Berglund shows how you can bring your own Python 3.9 runtime to SQL Server Machine Learning Services:

When I wrote we’d look at it in a future post I thought to myself; “how hard can it be?”. I had read the steps of how to build a Python language extension for Windows here, and it didn’t seem that hard: some Boost, CMake, compile, and Bob’s your uncle! Well, it turned out it was somewhat more complicated than what I anticipated. So, if you are interested – read on!

I was going to say that the steps seem a bit complicated but not overly terrible, though Niels’s conclusion leaves me wondering.

Comments closed

Cross-Validation in Azure ML Studio

Dinesh Asanka takes us through the cross-validation component in Azure ML Studio:

Let us look at implementing Cross-Validation in Azure Machine Learning. Let us use the sample Adventure Works database that we used for all the articles.

Then Cross Validate Model is dragged and dropped to the experiment. The Cross Validate model has two inputs and two outputs. Two inputs are data input and the relation to the Machine Learning technique. Let us use the Two-Class Decision Jungle as the Machine Learning Technique. Then the first output is connected to the Evaluate Model as shown in the following figure:

Click through for the process.

Comments closed

Using the Open Source R or Python Runtime with Machine Learning Services

Niels Berglund walks us through using the open source extensibility framework to install R or Python:

When Java became a supported language in SQL Server 2019, Microsoft mentioned that communication between ExternalHost and the language extension should be based on an API, regardless of the external language. The API is the Extensibility Framework API for SQL Server. Having an API ensures simplicity and ease of use for the extension developer.

From the paragraph above, one can assume that Microsoft would like to see 3rd party development of language extensions. That assumption turned out to be accurate as, mentioned above, Microsoft open-sourced the Java language extension, together with the include files for the extension API, in September 2020! This means that anyone interested can now create a language extension for their own favorite language!

However, open sourcing the Java extension was not the only thing Microsoft did. They also created and open-sourced language extensions for R and Python!

Click through for more detail and a walkthrough on installation of Python.

Comments closed