Press "Enter" to skip to content

Category: Machine Learning

Trying out FLAML

Gavita Regunath provides an overview of FLAML:

FLAML is short for Fast and Lightweight Automated Machine Learning library. It is an open-source Python library created by Microsoft researchers in 2021 for automated machine learning (AutoML). It is designed to be fast, efficient, and user-friendly, making it ideal for a wide range of applications.

Click through to learn more and to give it a spin with a pair of notebooks.

Comments closed

Working with R in AML v2

Tomaz Kastrun ends the advent of Azure ML on a downer:

R language and Azure Machine Learning SDK for R was deprecated a year ago (end of 2021). But R can be still used for training and deployment by using Azure Machine learning CLI 2.0!

Furthermore, R language can be used in Machine Learning Designer, for data preparation, data wrangling and statistical analysis.

You can work with R but they make sure everything is more difficult.

Comments closed

Structuring Azure ML Projects and using the Terminal

Tomaz Kastrun nears the end of the Azure ML advent. Day 20 covers package requirements and other niceties:

When creating notebooks, it is always a good way to have the dependencies included. Whether it is a particular version of a package, a separate script file or an installation requirement.

Selecting an environment or kernel can be an issue if it is not correctly initiated with the code. And you can also check the kernels with a simple python code:

Day 21 looks at the Azure CLI and running code from within a compute instance terminal:

Using Azure CLI can help you progress faster, make repetitve tasks automated and even use the GIT integration, for faster and better collaboration.

So we have created a YAML file on Day20 and we can use it also with Azure CLI to create an environment.

Comments closed

Statistical Analysis in Azure ML

Tomaz Kastrun continues an advent of Azure ML. Day 18 takes us through feature exploration:

Azure Machine Learning is also a great tool to do ordinary statistical analysis, graph plotting and everything that goes along.

Let’s get an open dataset, that is available on UCI Machine Learning repository and import it in the pandas dataframe.

Day 19 picks up with feature engineering:

Yesterday we have shown, that statistical analysis and all bolts and whistles can be done super simple in Azure machine learning. Today we will continue with feature engineering and modelling.

So, what is feature engineering? Is a general process and can involve both feature construction: adding new features from the existing data, and feature selection: choosing only the most important features for improving model performance, reducing data dimensionality, doing log-transformation, removing outliers, to do scaling (normalisation, standardisation), imputations, general transformation (and others, as polynomial), variable creation, variable extraction and so on.

Comments closed

MLflow in Action and Responsible AI

Tomaz Kastrun continues an advent of Azure ML. Day 16 shows off MLflow:

Yesterday we have looked into how to start the MLflow configurations and today, let’s put this to the test.

We will create a new notebook and use Heart dataset (link to dataset) to toy around. We will also import xgboost classifier to asses the accuracy of the presence of heart disease in the patient. We will be using a categorical (integer) variable with values from 0 (no presence) to 4 (strong presence) and attempt to classify based on 15+ attributes (out of more than 70 attributes).

Day 17 pivots to using the responsible AI dashboard:

Azure ML has provided users with collection of model and data exploration with the Studio user interface. But it also provides compatible solutions with Azure ML and Python package responsibleai. With the help of widgets, we will create an sample of dashboard to explore the solution with assessing the responsible decisions and actions.

Comments closed

Becoming Familiar with MLflow

Tomaz Kastrun continues an advent series on Azure ML:

MLFlow is an open-source framework for registering, managing and tracking machine learning models. It is multiplatform, bringing consistent model training and model consumption across different platforms. This means, that training a model locally and uploading it to Azure or training a model on remote compute instances and downloading it, is a great feature for MLflow.

You can use MLflow with Azure CLI, Azure Python SDK or in the studio and it will deliver a consistent experience (note, some functionalities are limited to the language).

Click through for a quick overview of MLflow.

Comments closed

AutoML and Model Registration in AML

Tomaz Kastrun continues an advent of Azure Machine Learning. Day 13 covers the topic of Automated ML:

Automated ML is a no-code automated machine learning task. It iterates over many combinations of algorithms and hyperparameters in order to find the best model for your dataset and your prediction variable(s). The final solution is a model, that can be downloaded and later reused. So Automated ML is not just giving you the best model out of a family of algorithms, but lets you use the model, generate the scripts and create the artefacts.

Day 14 concerns model registration:

Important asset is the “Models” in navigation bar. This feature allows you to work with different model types -> custom, MLflow, and Triton. What you do here is, you register a model from different locations (e.g.: local file, AzureML Datastore, AzureML Job, MLflow Job, Model asset in AzureML workspace, and Model asset in AzureML Registry).

Once you open the Models asset, you will see, that you can do many things here. I have already model register from the running the notebook on day4.

Comments closed

Pipelines and Jobs in Azure ML

Tomaz Kastrun continues an advent on Azure ML. Day 11 covers pipelines:

A pipeline is set of instructions (or a workflow) for executing particular work of a machine learning task. The idea behind pipelines is that will help the team of data scientists and machine learning engineers standardize workflow and incorporate best practices of preparing data, producing training models, executing the models and deploying them. Pipelines will help improve and build workflow efficiently and in such a way that it can be reusable.

And the idea behind it, is to split a machine learning process into smaller tasks, a multistep workflow, where each step is a separate component than can be developed, upgraded, optimised, configured, automated, and deleted separately. And these steps, connected through interfaces, form a workflow.

Day 12 makes us get a job:

An Azure ML job executes a task against a specified compute target. This is also how the job is created. By configuring a new job, you can also scale out model training, since there are single node and distributed training available.

A simple job command would be to execute a command in a Docker container. And further parameter sweeping can be executed, by specifying it in the job itself. 

Comments closed