Category: Machine Learning

Bayesian Neural Networks

Yoel Zeldes thinks about neural networks from a different perspective:

The term log P(w), which represents our prior, acts as a regularization term. Choosing a Gaussian distribution with mean 0 as the prior, you’ll get the mathematical equivalent of L2 regularization.

Now that we start thinking about neural networks as probabilistic creatures, we can let the fun begin. For a start, who says we have to output one set of weights at the end of the training process? What if instead of learning the model’s weights, we learn a distribution over the weights? This would allow us to estimate uncertainty over the weights. So how do we do that?

It’s an interesting approach to the problem.
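
To make that L2 connection concrete, here is the standard derivation, assuming a zero-mean Gaussian prior with variance σ²:

    \log p(w \mid D) = \log p(D \mid w) + \log p(w) + \text{const}
    p(w) = \mathcal{N}(0, \sigma^2 I) \implies \log p(w) = -\frac{\lVert w \rVert^2}{2\sigma^2} + \text{const}
    \hat{w}_{\mathrm{MAP}} = \arg\min_w \left[ -\log p(D \mid w) + \lambda \lVert w \rVert^2 \right], \quad \lambda = \frac{1}{2\sigma^2}

In other words, L2 weight decay is just MAP estimation under that prior, with the regularization strength set by the prior’s variance.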

Combining Apache Kafka With TensorFlow

Kai Waehner has an example of an application which uses Apache Kafka to stream car sensor data to TensorFlow on Google ML Engine:

A great benefit of Confluent MQTT Proxy is simplicity for realizing IoT scenarios without the need for an MQTT broker. You can forward messages directly from the MQTT devices to Kafka via the MQTT Proxy. This reduces efforts and costs significantly. This is a perfect solution if you “just” want to communicate between Kafka and MQTT devices.

If you want to see the other part of the story (integration with sink applications like Elasticsearch / Grafana), please take a look at the GitHub project “KSQL for streaming IoT data”. This realizes the integration with Elasticsearch and Grafana via Kafka Connect and the Elastic connector.
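
To give a feel for the Kafka-to-model handoff, here is a minimal sketch of a consumer that scores sensor readings in-process; the topic name, payload fields, and model file are hypothetical, and Kai’s setup scores against Google ML Engine rather than a local model:

    import json

    import numpy as np
    from kafka import KafkaConsumer  # pip install kafka-python
    from tensorflow import keras

    # Hypothetical trained model file.
    model = keras.models.load_model("car_anomaly_model.h5")

    consumer = KafkaConsumer(
        "car-sensor",  # hypothetical topic name
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    )

    for message in consumer:
        reading = message.value  # e.g. {"speed": ..., "rpm": ..., "engine_temp": ...}
        features = np.array([[reading["speed"], reading["rpm"], reading["engine_temp"]]])
        score = float(model.predict(features)[0][0])
        print(f"offset={message.offset} anomaly_score={score:.3f}")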

Check it out and then take a gander at Kai’s GitHub repo.

Plotting ML Results In R

Bernardo Lares shows off the plots he creates in R to compare ML models:

Split and compare quantiles

This parameter is the easiest to sell to the C-level guys. “Did you know that with this model, if we had chopped the worst 20% of leads, we would have avoided 60% of the frauds and lost only 8% of our sales?” That’s what this plot will give you.

The math behind the plot might be a bit foggy for some readers, so let me try to explain further: if you sort all of your observations / people / leads from the lowest to the highest score, you can literally select, for instance, the top 5% or the bottom 15%. What we do now is split all of those “ranked” rows into similar-sized buckets to get the best bucket, the second-best one, and so on. Then, if you split all the “Goods” and the “Bads” into two columns, keeping their buckets’ colours, everything is still sorted and separated, right? To conclude: if you were to take action on the worst 20% of cases (all from the same worst colour and bucket), how many of each label would that represent in your test set? There you go!
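
Here is a rough Python sketch of that bucketing math (this is not Bernardo’s R code; the label convention and the ten-bucket choice are assumptions):

    import pandas as pd

    def gains_table(scores, labels, buckets=10):
        # labels: 1 for a "Bad" (e.g. a fraud), 0 for a "Good".
        df = pd.DataFrame({"score": scores, "bad": labels})
        # Rank rows from highest (worst) to lowest score, then cut the ranking
        # into similar-sized buckets; bucket 0 holds the worst cases.
        ranks = df["score"].rank(ascending=False, method="first")
        df["bucket"] = pd.qcut(ranks, q=buckets, labels=False)
        g = df.groupby("bucket").agg(rows=("bad", "size"), bads=("bad", "sum"))
        g["cum_pct_rows"] = g["rows"].cumsum() / g["rows"].sum()
        g["cum_pct_bads"] = g["bads"].cumsum() / g["bads"].sum()
        return g

The row where cum_pct_rows reaches 0.20 tells you what share of the Bads you would have caught by acting on the worst 20%.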

Read on to see what else he uses and how you can build it yourself.

Building TensorFlow Neural Networks On Spark With Keras

Jules Damji has an example of using the PyCharm IDE to build Keras-based TensorFlow neural network models and track them with the Databricks MLflow library:

Our example in the video is a simple Keras network, modified from Keras Model Examples, that creates a simple multi-layer binary classification model with a couple of hidden and dropout layers and respective activation functions. Binary classification is a common machine learning task applied widely to classify images or text into two classes. For example, whether an image is of a cat or a dog, whether a tweet is positive or negative in sentiment, or whether an email is spam or not.

But the point here is not so much to demonstrate a complex neural network model as to show the ease with which you can develop with Keras and TensorFlow, log an MLflow run, and experiment—all within PyCharm on your laptop.
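
For reference, a model in that spirit takes only a few lines of Keras; the layer sizes and input dimension below are placeholders rather than the values from the video:

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        layers.Dense(64, activation="relu", input_shape=(20,)),  # 20 features (assumed)
        layers.Dropout(0.5),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),  # single sigmoid output for two classes
    ])
    model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])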

Click through for the video and explanation of the process.

When Image Classifiers Look At Unknown Objects

Pete Warden explains that image classifiers aren’t magic:

As people, we’re used to being able to classify anything we see in the world around us, and we naturally expect machines to have the same ability. Most models are only trained to recognize a very limited set of objects though, such as the 1,000 categories of the original ImageNet competition. Crucially, the training process makes the assumption that every example the model sees is one of those objects, and the prediction must be within that set. There’s no option for the model to say “I don’t know”, and there’s no training data to help it learn that response. This is a simplification that makes sense within a research setting, but causes problems when we try to use the resulting models in the real world.

Back when I was at Jetpac, we had a lot of trouble convincing people that the ground-breaking AlexNet model was a big leap forward because every time we handed over a demo phone running the network, they would point it at their faces and it would predict something like “Oxygen mask” or “Seat belt”. This was because the ImageNet competition categories didn’t include any labels for people, but most of the photos with mask and seatbelt labels included faces along with the objects. Another embarrassing mistake came when they would point it at a plate and it would predict “Toilet seat”! This was because there were no plates in the original categories, and the closest white circular object in appearance was a toilet.

Read the whole thing.

Building Recurrent Neural Networks Using TensorFlow

Ahmet Taspinar walks us through creating a recurrent neural network topology using TensorFlow:

As we have also seen in the previous blog posts, our Neural Network consists of a tf.Graph() and a tf.Session(). The tf.Graph() contains all of the computational steps required for the Neural Network, and the tf.Session is used to execute these steps.

The computational steps defined in the tf.Graph can be divided into four main parts:

  1. We initialize placeholders which are filled with batches of training data during the run.

  2. We define the RNN model and calculate the output values (logits).

  3. The logits are used to calculate a loss value, which then

  4. is used in an Optimizer to optimize the weights of the RNN.
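
Here is a compact sketch of those four steps in the TF1-style graph/session API the post uses; the shapes and sizes are illustrative rather than Ahmet’s exact values:

    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        # 1. Placeholders, filled with batches of training data at run time.
        inputs = tf.placeholder(tf.float32, [None, 28, 28])  # (batch, time steps, features)
        labels = tf.placeholder(tf.int64, [None])

        # 2. The RNN model produces the output values (logits).
        cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
        outputs, _ = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
        logits = tf.layers.dense(outputs[:, -1, :], 10)  # last time step -> 10 classes

        # 3. The logits are used to calculate a loss value...
        loss = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))

        # 4. ...which the optimizer uses to update the RNN's weights.
        train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
        init = tf.global_variables_initializer()

    with tf.Session(graph=graph) as sess:
        sess.run(init)
        # sess.run(train_op, feed_dict={inputs: batch_x, labels: batch_y})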

As a lazy casual, I’ll probably stick with letting Keras do most of the heavy lifting.

Machine Learning With F#

Diogo Souza gives us an introduction to using Accord.NET in F#:

F# is a scripting as well as a REPL language. REPL stands for Read-Eval-Print Loop, which means the language processes one step at a time: it reads the user’s input (usually an expression), evaluates it, and returns the result to the user. All of that happens in a loop until the session ends. Visual Studio provides a great F# Interactive view that runs scripts in REPL mode and shows the results. Consider the simple Hello World example from the post.

This code just creates a single variable (the let keyword) and assigns a string value to it. When you run this code (select all the code text and press Alt + Enter), you’ll see the result in the F# Interactive window (Figure 5 in the original post).

You can also use C# with Accord.NET, but there’s a strong bias toward F# among people in the .NET space who work with ML, for the same reason that there’s a bias toward Scala over Java for Spark developers:  the functional programming paradigm works extremely well with mathematical concepts.  Also, in addition to Accord.NET, you might also want to check out Math.NET.  My experience has been that this package tends to be a bit faster than Accord.

Using LIME To Explain Keras Models

Shirin Glander shows us how to use the LIME package to explain image recognition models built with Keras:

The segmentation of an image into superpixels is an important step in generating explanations for image models. It is important both that the segmentation is correct and follows meaningful patterns in the picture, and that the size/number of superpixels is appropriate. If the important features in the image are chopped into too many segments, the permutations will probably damage the picture beyond recognition in almost all cases, leading to a poor or failing explanation model. As the size of the object of interest varies, it is impossible to set up hard rules for the number of superpixels to segment into – the larger the object is relative to the size of the image, the fewer superpixels should be generated. Using plot_superpixels it is possible to evaluate the superpixel parameters before starting the time-consuming explanation function.
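
If you want the same sanity check in Python, scikit-image’s SLIC segmentation gives a rough equivalent of plot_superpixels (the file name and parameter values here are made up):

    import matplotlib.pyplot as plt
    from skimage import io
    from skimage.segmentation import mark_boundaries, slic

    image = io.imread("cat.jpg")  # hypothetical input image

    # Try a few superpixel counts and eyeball which one follows the object's
    # outline without chopping it into too many segments.
    for n_segments in (50, 100, 200):
        segments = slic(image, n_segments=n_segments, compactness=10)
        plt.figure()
        plt.title(f"n_segments={n_segments}")
        plt.imshow(mark_boundaries(image, segments))
    plt.show()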

Fun stuff.  I’m glad that there’s a lot of work going into explaining neural networks rather than hand-waving them off as magic.

Neural Topic Models On Amazon SageMaker

David Ping, et al, show off topic modeling on Amazon SageMaker:

Topic modeling is used to organize a corpus of documents into “topics”, which are groupings based on a statistical distribution of words within the documents themselves. Amazon Comprehend, our fully managed text analytics service, provides a pre-configured topic modeling API that is best suited for the most popular use cases, like organizing customer feedback, support incidents, or workgroup documents. Amazon Comprehend is the suggested topic modeling choice for customers, as it removes a lot of the most routine steps associated with topic modeling, like tokenization, training a model, and adjusting parameters.

Amazon SageMaker’s Neural Topic Model (NTM) caters to the use cases where finer control of the training, optimization, and/or hosting of a topic model is required, such as training models on a text corpus of a particular writing style or domain, or hosting topic models as part of a web application. While Amazon SageMaker NTM provides a starting point of state-of-the-art topic modeling, customers have the flexibility to modify the network architecture as well as hyperparameters to accommodate the idiosyncrasies of their data sets, and to tune the trade-off between a multitude of metrics such as document modeling accuracy, human interpretability, and granularity of the learned topics, based on their applications.

In addition, Amazon SageMaker NTM leverages the full power of the Amazon SageMaker platform: easily configurable training and hosting infrastructure, automatic hyperparameter optimization, and fully managed hosting with auto-scaling.
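
For a sense of scale, kicking off an NTM training job is only a few lines with the SageMaker Python SDK. This sketch uses the SDK’s v1-era argument names, and the role ARN, dimensions, and data are placeholders:

    import numpy as np
    import sagemaker

    session = sagemaker.Session()
    ntm = sagemaker.NTM(
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
        train_instance_count=1,
        train_instance_type="ml.c4.xlarge",
        num_topics=20,
        sagemaker_session=session,
    )

    # NTM consumes bag-of-words vectors: one row per document, one column per
    # vocabulary term. Random stand-in data here; you'd pass real token counts.
    train_vectors = np.random.rand(1000, 5000).astype("float32")
    ntm.fit(ntm.record_set(train_vectors))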

They walk through the entire topic modeling process, so check it out.

Solving A Problem In TensorFlow Using SoftMax

Kiran Gutha gives us a fairly simple solution to the MNIST digit data set using softmax regression:

In this tutorial, we will train a machine learning model for predicting numbers in pictures. Our goal is not to design a world-class complex model (although we will give you the source code to implement first-rate predictive models later). Rather, this tutorial is to introduce how to use TensorFlow. So, we start here with a very simple mathematical model called Softmax Regression.

The implementation code for this tutorial is short, and the really interesting content is only contained in three lines of code. However, it is very important to understand the design ideas contained in this code: the basic concepts of the TensorFlow workflow and machine learning. Therefore, this tutorial will explain in detail how this code is implemented.
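
For reference, here is what that three-line heart of the model typically looks like in TF1-style code (this follows the classic MNIST softmax tutorial rather than Kiran’s exact listing):

    import tensorflow as tf

    x = tf.placeholder(tf.float32, [None, 784])  # flattened 28x28 pixel images
    W = tf.Variable(tf.zeros([784, 10]))         # weights: 784 pixels -> 10 digit classes
    b = tf.Variable(tf.zeros([10]))              # one bias per digit class
    y = tf.nn.softmax(tf.matmul(x, W) + b)       # predicted class probabilities

    y_ = tf.placeholder(tf.float32, [None, 10])  # one-hot true labels
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)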

This is about as easy as it gets with neural networks, but easy doesn’t mean wrong.
