Press "Enter" to skip to content

Category: Machine Learning

Synapse and Azure ML Pipelines

Santosh Thomas integrates two Azure products:

As more customers standardize on the Synapse data platform, enabling machine learning workflows through Azure Machine Learning (Azure ML) becomes particularly interesting. This is especially true as more customers look to bring their data engineering and data science practices together and mature capabilities on both sides.

The goal of this blog post is to highlight how Synapse and Azure ML can work well together to deliver key insights. This is motivated by a scenario where a customer modernized their data platform on Azure Synapse but was looking to improve their data science practices through Azure ML. The focus of this blog is to expose existing functionality, and it is not a “hardened” solution with security or other cloud best practice implementations. The workflow steps also assume some level of comfort with Python and working with the Azure Python SDKs.

There was a time in which Microsoft wanted us to remain in Synapse for machine learning tasks, but that time is gone: the emphasis is definitely to do machine learning tasks in Azure ML, regardless of where the data lives…unless there’s a Spark job involved, in which case things get all weird again.

Comments closed

Azure ML Overview

Sanil Mhatre gets us started with Azure Machine Learning:

The five-part series is designed to jump-start any IT professional’s journey in the fascinating world of Data Science with Azure Machine Learning (Azure ML). Readers don’t need prior knowledge of Data Science, Machine Learning, Statistics, or Azure to begin this adventure.

All you will need is an Azure subscription and I will show you how to get a free one that you can use to explore some of Azure’s features before I show you how to set up the Azure ML environment.

Part 1 is available now, with the other parts coming up soon. Even so, Part 1 is a big article on its own.

Comments closed

Dealing with Imbalanced Class Data for Image Classification

Alexander Billington needs more beta carotene:

Image classification is a standard computer vision task and involves training a model to assign a label to a given image, such as a model to classify images of different root vegetables. A big problem with classification is bias, and the models favouring a particular image class above the others. A common cause of this can be dataset imbalance, and it is often hard to spot as a model trained on an imbalanced dataset can often still have good accuracy. E.g. if there are 1000 images in the test dataset, 950 potatoes and 50 carrots and the model predicted all 1000 images to be potatoes it would still have 95% accuracy. This is also an example of why more metrics than accuracy should be considered… but let’s leave that discussion for another day.

Click through for several techniques you can use to balance out classes, with a focus on image classification. Undersampling is almost always a no-go for me, though I am much fonder of the other techniques.

Comments closed

Multi-Class Classification in PyTorch

Adrian Tam does some iris categorizing:

Now you need to have a model that can take the input and predict the output, ideally in the form of one-hot vectors. There is no science behind the design of a perfect neural network model. But you know one thing, it has to take in a vector of 4 features and output a vector of 3 values. The 4 features corresponds to what you have in the dataset. The 3-value output is because we know the one-hot vector has 3 elements. Anything can be in between, and those are known as the “hidden layers” since they are neither input nor output.

Click through for the full tutorial.

Comments closed

Using the Softmax Classifier in PyTorch

Muhammad Asad Iqbal Khan takes us through one of the classifier options available to PyTorch:

While a logistic regression classifier is used for binary class classification, softmax classifier is a supervised learning algorithm which is mostly used when multiple classes are involved.

Softmax classifier works by assigning a probability distribution to each class. The probability distribution of the class with the highest probability is normalized to 1, and all other probabilities are scaled accordingly.

Read on to learn some of the properties of the Softmax classifier, as well as how you can use this for multi-class classification in PyTorch.

Comments closed

Running ML.NET in F#

Matt Eland builds a notebook:

In this article I’ll outline a simple pipeline that trains a regression machine learning model and saves it to a file for use later on. We’ll look at how to load the model using F# and use it to generate new predictions for new data points.

To round things out, I’ll be showing you how to do this all in a Polyglot Notebook, though you can skim over this aspect of the experiment as almost all of the code will work just fine in a normal .fs file outside of Polyglot Notebooks.

At the end, Matt mentions that the F# code looks a whole lot like C# code and that’s my biggest problem with the library: it forces you into writing C#-style code.

Comments closed

Trying out FLAML

Gavita Regunath provides an overview of FLAML:

FLAML is short for Fast and Lightweight Automated Machine Learning library. It is an open-source Python library created by Microsoft researchers in 2021 for automated machine learning (AutoML). It is designed to be fast, efficient, and user-friendly, making it ideal for a wide range of applications.

Click through to learn more and to give it a spin with a pair of notebooks.

Comments closed

Working with R in AML v2

Tomaz Kastrun ends the advent of Azure ML on a downer:

R language and Azure Machine Learning SDK for R was deprecated a year ago (end of 2021). But R can be still used for training and deployment by using Azure Machine learning CLI 2.0!

Furthermore, R language can be used in Machine Learning Designer, for data preparation, data wrangling and statistical analysis.

You can work with R but they make sure everything is more difficult.

Comments closed