Press "Enter" to skip to content

Category: Machine Learning

Using Filter Based Feature Selection in Text Analytics

Dinesh Asanka takes us through a text analytics technique in Azure Machine Learning:

There are two parameters to define in the Feature Hashing control. Hashing bitsize defines the maximum number of vectors: a hashing bitsize of 10 means 1,024 vectors (2^10), which is more than enough even for large text files. Next, we need to choose the N-grams value, which is 2 here, as 2 is the optimal number for most situations. A detailed description of N-grams is given in the link in the reference section.

After the vectors are generated, we no longer need the other text columns. Apart from the vectors, we need only the dependent attribute, the category column in this example. Therefore, we can remove the unnecessary attributes with the Select Columns in Dataset control. However, this control will list all 1,024 vectors even though not all of them were produced in the previous Feature Hashing step, so you need to select only the attributes the Feature Hashing control actually generated. In the above example, only 93 vectors were generated.
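For a sense of what feature hashing does outside of Azure ML, here is a minimal sketch using scikit-learn's HashingVectorizer; the 2^10 feature count and the N-gram setting mirror the parameters above, while the sample documents are invented for illustration.

```python
from sklearn.feature_extraction.text import HashingVectorizer

# Invented sample documents, standing in for the text column
docs = [
    "the stock market rallied today",
    "the team won the championship game",
]

# Hashing bitsize of 10 -> 2^10 = 1,024 feature slots;
# ngram_range=(1, 2) hashes unigrams and bigrams (N-grams up to 2)
vectorizer = HashingVectorizer(n_features=2**10, ngram_range=(1, 2))
X = vectorizer.transform(docs)

print(X.shape)  # (2, 1024): every row has 1,024 slots, but few are non-zero
```

As with the Azure ML control, all 1,024 columns exist in the output, but only the slots some N-gram actually hashed into carry values, which is why the example above ends up with just 93 populated vectors.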

Click through to learn more.


Monitoring Power Virtual Agent Chatbots

Devin Knight has a video for us:

Power Virtual Agents empowers subject matter experts to build intelligent conversational bots using a guided, no-code graphical interface. In this video, you will learn how to monitor how successful your chatbots are at answering your users' questions. Using the monitoring capability, you will uncover areas of your chatbot that can be improved.

If I were familiar enough with Latin, I’d try a play on “Quis custodiet ipsos custodes?” with this.


Declarative MLOps with Ludwig

Jacqueline Cardoso announces a new version of Ludwig:

Ludwig abstracts away the complexity of combining all of these disparate systems through its declarative approach to structuring machine learning pipelines. Instead of writing code for your model, training loop, preprocessing, postprocessing, evaluation, and hyperparameter optimization, you only need to declare the schema of your data as a simple YAML configuration:
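The configuration example itself is in the original post; as a rough illustration, here is a minimal sketch of the same kind of declaration through Ludwig's Python API, with hypothetical column names (text_content and label) rather than the post's actual schema.

```python
from ludwig.api import LudwigModel

# Declarative pipeline: describe the data schema, not the training code.
# The column names here are hypothetical.
config = {
    "input_features": [
        {"name": "text_content", "type": "text"},
    ],
    "output_features": [
        {"name": "label", "type": "category"},
    ],
}

model = LudwigModel(config)

# Ludwig derives preprocessing, the model, the training loop, and
# evaluation from the declaration alone
train_stats, _, output_dir = model.train(dataset="training_data.csv")
```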

I’ve long been a fan of declarative approaches to problem-solving, so I’m going to need to dig into this a bit.


Using Lobe for Training ML Models

Chris Webb reviews a free tool from Microsoft:

The most impressive thing about it is not what it does but how it does it: a lot of tools claim to make machine learning easy for non-technical users, but Lobe really is easy to use. My AI/ML knowledge is very basic, but I got up and running with it extremely quickly.

To test it out I downloaded lots of pictures of English churches and trained a model to detect whether the church had a tower or a spire. After I labelled the pictures appropriately:

Click through for Chris’s findings. Looks like the only thing it does today is image classification, but more functionality is forthcoming.


AI versus ML versus Deep Learning

Holger von Jouanne-Diedrich asks the expert:

This is our 101st blog post here on Learning Machines and we have prepared something very special for you!

Oftentimes the different concepts of data science, namely artificial intelligence (AI), machine learning (ML), and deep learning (DL), are confused… so we asked the most advanced AI in the world, OpenAI GPT-3, to write a guest post for us to provide some clarification on their definitions and how they are related.

We are most delighted to present this very impressive (and only slightly redacted) essay to you – enjoy!

The machine has learned about itself. This is where I’m glad I only believe weak AI is possible…


Plotting XGBoost Trees with R

Andrew Treadway shows off a method to visualize the results of training an XGBoost model:

In this post, we’re going to cover how to plot XGBoost trees in R. XGBoost is a very popular machine learning algorithm, which is frequently used in Kaggle competitions and has many practical use cases.

Let’s start by loading the packages we’ll need. Note that plotting XGBoost trees requires the DiagrammeR package to be installed, so even if you have xgboost installed already, you’ll need to make sure you have DiagrammeR also.
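The post works in R; for comparison, Python's xgboost package exposes the same capability through plot_tree, which likewise leans on an extra dependency (graphviz and matplotlib instead of DiagrammeR). A minimal sketch on a toy dataset:

```python
import matplotlib.pyplot as plt
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

# Toy dataset, standing in for the data used in the post
X, y = load_breast_cancer(return_X_y=True)

model = xgb.XGBClassifier(n_estimators=10, max_depth=3)
model.fit(X, y)

# Render the first tree in the ensemble; this needs the graphviz
# system package in addition to the Python libraries above
xgb.plot_tree(model, num_trees=0)
plt.show()
```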

Click through for the process. H/T R-Bloggers.


Predicting Insurance Prices with ML.NET

Chandra Kudumula shows off ML.NET:

There are three ways to begin with ML.NET:

– API Model: You can start with ML.NET through a framework API and write code in C# or F#.
– GUI Model: Use the ML.NET Model Builder in Visual Studio.
– CLI Model: For cross-platform development on Mac and Linux, use the ML.NET CLI.

Let’s get started with the API Model for predicting the insurance premium using the ML.NET framework.

I’m using Microsoft (MS) Visual Studio 2019 and creating a Console Application. Be sure that you have the latest version of VS and that the .NET 5 SDK is installed.
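The demo itself is C# against the ML.NET API; purely as a sketch of the analogous load-train-predict flow, here is the same shape of workflow in Python with scikit-learn, using made-up insurance columns rather than the post's schema.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Hypothetical insurance data; the columns are illustrative only
df = pd.DataFrame({
    "age":     [25, 40, 33, 58, 47, 29],
    "bmi":     [22.1, 30.5, 27.3, 33.0, 25.8, 24.4],
    "smoker":  [0, 1, 0, 1, 0, 0],
    "premium": [2100.0, 6800.0, 3500.0, 9100.0, 3900.0, 2600.0],
})

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="premium"), df["premium"],
    test_size=0.33, random_state=42,
)

# Train a regressor and predict premiums, mirroring the
# Fit/Predict steps of an ML.NET pipeline
model = GradientBoostingRegressor().fit(X_train, y_train)
print(model.predict(X_test))
```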

Click through for the demo in Visual Studio using C#.


Hyperparameter Tuning in Azure Machine Learning

Dinesh Asanka takes us through hyperparameter tuning with Azure Machine Learning’s designer:

In the above experiment, both the previous model and the Tune Model Hyperparameters (TMH) version are included so that we can compare the two models. The Tune Model Hyperparameters control is inserted between the Split Data and Score Model controls as shown. The TMH control has three inputs: the first takes the relevant technique, in this scenario Two-Class Logistic Regression; the second takes the training data set; and the last takes the evaluation data set, for which the test data set can be used.

The Tune Model Hyperparameters control provides the best combination of hyperparameters, and its output is connected to the Score Model control. After the test data stream is connected to the Score Model control, the model's output is connected to the second input of the Evaluate Model control so that the previous model and the tuned model can be compared.

I’m not sure if there’s something handled internally in the Tune Model Hyperparameters component, but based on the pipeline images, I’d actually want two separate Split components so that I ended up with something more like 50-20-30 for training, hyperparameter testing, and validation. The first two pipelines appear to be 70-30-0 instead, and so can give you a false sense of confidence in model quality.
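As a rough Python sketch of that 50-20-30 idea (illustrative only, not how the designer wires things internally): two successive splits yield separate sets for training, hyperparameter testing, and final validation, so the quality estimate comes from data the tuning never saw.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)

# First split: hold 30% back for final validation
X_rest, X_val, y_rest, y_val = train_test_split(X, y, test_size=0.30, random_state=42)

# Second split: 2/7 of the remaining 70% (20% of the total) goes to
# hyperparameter testing, leaving 50% of the data for training
X_train, X_tune, y_train, y_tune = train_test_split(X_rest, y_rest, test_size=2 / 7, random_state=42)

# Pick the regularization strength that scores best on the tuning set
best_c, best_score = None, -1.0
for c in [0.01, 0.1, 1.0, 10.0]:
    score = LogisticRegression(C=c, max_iter=1000).fit(X_train, y_train).score(X_tune, y_tune)
    if score > best_score:
        best_c, best_score = c, score

# Only the untouched validation set gives an honest quality estimate
final_model = LogisticRegression(C=best_c, max_iter=1000).fit(X_train, y_train)
print(f"C={best_c}, validation accuracy={final_model.score(X_val, y_val):.3f}")
```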


Drift Monitoring with Azure Machine Learning

I take a look at dataset drift monitoring in Azure Machine Learning:

One of the things I like to say about machine learning models is, “shift happens.” By that, I mean that models lose effectiveness over time due to changes in underlying circumstances. Relationships between variables that used to hold no longer do, and so our model quality degrades. This means that we sometimes need to retrain models.

But there’s a cost to retraining models—that work can be computationally expensive and time-consuming. This concern is particularly salient if you’re in a cloud, as you pay directly for everything there. This means that we don’t want to retrain models unless we need to. But when do we know if we should retrain the model? We can watch for model degradation, but there’s another method: drift detection in your datasets.
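Azure ML's dataset monitors handle the measurement for you; as a bare-bones illustration of the underlying idea, here is a hypothetical per-feature drift check using a two-sample Kolmogorov-Smirnov test (one common approach, not necessarily the metric Azure ML applies).

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Baseline: synthetic stand-in for the data the model trained on
baseline = rng.normal(loc=0.0, scale=1.0, size=1000)

# Target: recent production data whose distribution has shifted
recent = rng.normal(loc=0.5, scale=1.2, size=1000)

# Two-sample KS test: a small p-value suggests the distributions differ
stat, p_value = ks_2samp(baseline, recent)
if p_value < 0.01:
    print(f"Drift detected (KS statistic = {stat:.3f}); consider retraining")
else:
    print("No significant drift; skip the retrain and save the compute")
```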

Read on for a demonstration of how the product works and a couple of things to keep in mind.
