Category: Machine Learning

The segmentation of an image into superpixels are an important step in generating explanations for image models. It is both important that the segmentation is correct and follows meaningful patterns in the picture, but also that the size/number of superpixels are appropriate. If the important features in the image are chopped into too many segments the permutations will probably damage the picture beyond recognition in almost all cases leading to a poor or failing explanation model. As the size of the object of interest is varying it is impossible to set up hard rules for the number of superpixels to segment into – the larger the object is relative to the size of the image, the fewer superpixels should be generated. Using plot_superpixels it is possible to evaluate the superpixel parameters before starting the time-consuming explanation function.

Fun stuff. I’m glad that there’s a lot of work going into explaining neural networks rather than hand-waving them off as magic.

Comments closed

Neural Topic Models On Amazon SageMaker

Published 2018-06-26 by Kevin Feasel

David Ping, et al, show off topic modeling on Amazon SageMaker:

Topic Modeling is used to organize a corpus of documents into “topics” which is a grouping based on a statistical distribution of words within the documents themselves. Amazon Comprehend, our fully managed text analytics service, provides a pre-configured topic modeling API that is best suited for the most popular use cases like organizing customer feedback, support incidents or workgroup documents. Amazon Comprehend is the suggested topic modeling choice for customers as it removes a lot of the most routine steps associated with topic modeling like tokenization, training a model and adjusting parameters. Amazon SageMaker’s Neural Topic Model (NTM) caters to the use cases where a finer control of the training, optimization, and/or hosting of a topic model is required, such as training models on text corpus of particular writing style or domain, or hosting topic models as part of a web application. While Amazon SageMaker NTM provides a starting point of state-of-the-art topic modeling, customers have the flexibility to modify the network architecture as well as hyperparameters to accommodate the idiosyncrasies of their data sets as well as to tune the trade-off between a multitude of metrics such as document modeling accuracy, human interpretability and granularity of the learned topics, based on their applications. In addition, Amazon SageMaker NTM leverages the full power of the Amazon SageMaker platform: easily configurable training and hosting infrastructure, automatic hyperparameter optimization, and fully-managed hosting with auto-scaling.

They walk through the entire topic modeling process, so check it out.

Comments closed

Solving A Problem In TensorFlow Using SoftMax

Published 2018-06-22 by Kevin Feasel

Kiran Gutha gives us a fairly simple solution to the MNIST digit data set using the SoftMax algorithm:

In this tutorial, we will train a machine learning model for predicting numbers in pictures. Our goal is not to design a world-class complex model (although we will give you the source code to implement first-rate predictive models later). Rather, this tutorial is to introduce how to use TensorFlow. So, we start here with a very simple mathematical model called Softmax Regression.

The implementation code for this tutorial is short, and the really interesting content is only contained in three lines of code. However, it is very important to understand the design ideas contained in these codes: the basic concepts of TensorFlow workflow and machine learning. Therefore, this tutorial will explain in detail the implementation of these codes.

This is about as easy as it gets with neural networks, but easy doesn’t mean wrong.

Comments closed

Neural Networks Are Polynomial Regression

Published 2018-06-21 by Kevin Feasel

Norman Matloff announces a new paper:

A summary of the paper is:

We present a very simple, informal mathematical argument that neural networks (NNs) are in essence polynomial regression (PR). We refer to this as NNAEPR.
NNAEPR implies that we can use our knowledge of the “old-fashioned” method of PR to gain insight into how NNs — widely viewed somewhat warily as a “black box” — work inside.
One such insight is that the outputs of an NN layer will be prone to multicollinearity, with the problem becoming worse with each successive layer. This in turn may explain why convergence issues often develop in NNs. It also suggests that NN users tend to use overly large networks.
NNAEPR suggests that one may abandon using NNs altogether, and simply use PR instead.
We investigated this on a wide variety of datasets, and found that in every case PR did as well as, and often better than, NNs.
We have developed a feature-rich R package, polyreg, to facilitate using PR in multivariate settings.

The paper and presentation slides are ungated, so check it out. H/T R-bloggers

Comments closed

Using DALEX To Explain Black-Box Models

Published 2018-06-20 by Kevin Feasel

Przemyslaw Biecek explains that there’s more than LIME for explaining black-box models:

I’ve heard about a number of consulting companies, that decided to use simple linear model instead of a black box model with higher performance, because ,,client wants to understand factors that drive the prediction’’.
And usually the discussion goes as following: ,,We have tried LIME for our black-box model, it is great, but it is not working in our case’’, ,,Have you tried other explainers?’’, ,,What other explainers’’?

So here you have a map of different visual explanations for black-box models.

Check out DALEX, which includes a Jupyter notebook example. H/T R-Bloggers

Comments closed

Comparing Keras In Python Versus R

Published 2018-06-20 by Kevin Feasel

Dmitry Kisler performs image classification using Keras in both Python and R:

From the plots above, one can see that:

the accuracy of your model doesn’t depend on the language you use to build and train it (the plot shows only train accuracy, but the model doesn’t have high variance and the bias accuracy is around 99% as well).
even though 10 measurements may be not convincing, but Python would reduce (by up to 15%) the time required to train your CNN model. This is somewhat expected because R uses Python under the hood when executes Keras functions.

This is just one example, but the results are about what I’d expect.

Comments closed

Auto-Encoders And KernelML

Published 2018-06-20 by Kevin Feasel

Rohan Kotwani gives us an example where KernelML might be better than TensorFlow or PyTorch:

So what’s the point of using KernelML?

1. The parameters in each layer can be non-linear
2. Each parameter can be sampled from a different random distribution
3. The parameters can be transformed to meet certain constraints
4. Network combinations are defined in terms of numpy operations
5. Parameters are probabilistically updated
6. Each parameter update samples the loss function around a local or global minima

KerneML Specs

KernelMLis brute force optimizer that can be used to train machine learning algorithms. The package uses a combination of a machine learning and monte carlo simulations to optimize a parameter vector with a user defined loss function. Using kernelml creates a high computational cost for large complex networks because it samples the loss function using a subspace for each parameter in the parameter vector which requires many random simulations. The computational cost was reduced by enabling parallel computations with the ipyparallel. The decision to use this package was made because it effectively utilizes the cores on a machine.

It’s an interesting use case, though I would have liked to have seen a direct comparison to other frameworks.

Comments closed

Probabilistic Debugging

Published 2018-06-20 by Kevin Feasel

Adrian Colyer summarizes a fascinating academic paper:

This program has a bug. When given an already encoded input, it encodes it again (replacing % with ‘%25’). For example, the input https://example.com/t%20c?x=1 results in the output https://example.com/t%2520c?x=1, whereas in fact the output should be the same as the input in this case.

Let’s put our probabilistic thinking caps on and try and to debug the program. We ‘know’ that the url printed on line 19 is wrong, so we can assign low probability (0.05) to this value being correct. Likewise we ‘know’ that the input url on line 1 is correct, so we can assign high probability (0.95). (In probabilistic inference, it is standard not to use 0.0 or 1.0, but values close to them instead). Initially we’ll set the probability of every other program variable being set to 0.5, since we don’t know any better yet. If we can find a point in the program where the inputs are correct with relatively high probability, and the outputs are incorrect with relatively high probability, then that’s an interesting place!

Since url on line 19 has a low probability of being correct, this suggests that url on line 18, and purl_str at line 12 are also likely to be faulty. PI Debugger actually assigns these probabilities of being correct 0.0441 and 0.0832 respectively. Line 18 is a simple assignment statement, so if the chances of a bug here are fairly low. Now we trace the data flow. If purl_str at line 12 is likely to be faulty then s at line 16 is also likely to be faulty (probability 0.1176).

I’m interested to see someone create a practical implementation someday.

Comments closed

Databricks MLflow

Published 2018-06-06 by Kevin Feasel

Matai Zaharia announces a new Databricks offering:

MLflow is inspired by existing ML platforms, but it is designed to be open in two senses:

Open interface: MLflow is designed to work with any ML library, algorithm, deployment tool or language. It’s built around REST APIs and simple data formats (e.g., a model can be viewed as a lambda function) that can be used from a variety of tools, instead of only providing a small set of built-in functionality. This also makes it easy to add MLflow to your existing ML code so you can benefit from it immediately, and to share code using any ML library that others in your organization can run.

Open source: We’re releasing MLflow as an open source project that users and library developers can extend. In addition, MLflow’s open format makes it very easy to share workflow steps and models across organizations if you wish to open source your code.

Mlflow is still currently in alpha, but we believe that it already offers a useful framework to work with ML code, and we would love to hear your feedback. In this post, we’ll introduce MLflow in detail and explain its components.

Even in alpha, it looks nice.

Comments closed

Permissions Error Executing R Scripts

Published 2018-06-05 by Kevin Feasel

Niels Berglund walks through a permissions error on a new installation of SQL Server 2017 CU 7 with Machine Learning Services:

Cool, all is “A-OK”! A couple of days go by, and I see that there is a Cumulative Update (CU) for SQL Server 2017 – CU7. I install it and does not think much about it. I mean: “what can go wrong, how hard can it be?”. A couple of days later and I am busy writing the follow-up post to sp_execute_external_script and SQL Compute Context – I when I try to execute sp_execute_external_script, and it falls over!

Niels has a couple false starts that he walks us through, but then lands on a solid answer.

Comments closed