Category: Machine Learning

Online Learning Algorithms

Published 2017-10-19 by Kevin Feasel

Xin Hunt describes the benefits of online learning algorithms:

A few examples of classical online learning algorithms include recursive least squares, stochastic gradient descent and multi-armed bandit algorithms like Thompson sampling. Many online algorithms (including recursive least squares and stochastic gradient descent) have offline versions. These online algorithms are usually developed after the offline version, and are designed for better scaling with large datasets and streaming data. Algorithms like Thompson sampling on the other hand, do not have offline counterparts, because the problems they solve are inherently online.

Let’s look at interactive ad recommendation systems as an example. You’ll find ads powered by these systems when you browse popular publications, weather sites and social media networks. These recommendation systems build customer preference models by tracking your shopping and browsing activities (ad clicking, wish list updates and purchases, for example). Due to the transient nature of shopping behaviors, new recommendations must reflect the most recent activities. This makes online learning a natural choice for these systems.

My favorite online learning algorithm at the moment is Online Passive-Aggressive Algorithms. Not just because that name describes my Twitter feed.

1 Comment

Measuring Semantic Relatedness

Published 2017-10-16 by Kevin Feasel

Sandipan Dey re-works a university assignment on semantic relatedness in Python:

Let’s define the semantic relatedness of two WordNet nouns x and y as follows:

A = set of synsets in which x appears

B = set of synsets in which y appears

distance(x, y) = length of shortest ancestral path of subsets A and B

sca(x, y) = a shortest common ancestor of subsets A and B

This is the notion of distance that we need to use to implement the distance() and sca() methods in the WordNet data type.

It looks like a helpful assignment for understanding natural language processing a little better.

Comments closed

Errors Using Native Prediction In SQL Server

Published 2017-10-05 by Kevin Feasel

Sacha Tomey walks us through a few potential issues when converting code which uses SQL Server Machine Learning Services’s sp_execute_external_script procedure to native PREDICT calls:

Stumble One:
Error occurred during execution of the builtin function 'PREDICT' with HRESULT 0x80004001. 
Model type is unsupported.
Reason:

Not all models are supported. At the time of writing, only the following models are supported:

rxLinMod

rxLogit

rxBTrees

rxDtree

rxdForest

sp_rxPredict supports additional models including those available in the MicrosoftML package for R (I was using attempting to use rxFastTrees). I presume this limitation will reduce over time. The list of supported models is referenced in the PREDICT function (Documentation).

sp_rxPredict does require CLR, but it’s a viable alternative if you need to use a model not currently supported—like rxNeuralNet.

Comments closed

Using Kafka To Drive Machine Learning

Published 2017-10-02 by Kevin Feasel

Kai Waehner has a nice architectural post on using Kafka as the focal point for machine learning training and prediction:

The essence of this architecture is that it uses Kafka as an intermediary between the various data sources from which feature data is collected, the model building environment where the model is fit, and the production application that serves predictions.

Feature data is pulled into Kafka from the various apps and databases that host it. This data is used to build models. The environment for this will vary based on the skills and preferred toolset of the team. The model building could be a data warehouse, a big data environment like Spark or Hadoop, or a simple server running python scripts. The model can be published where the production app that gets the same model parameters can apply it to incoming examples (perhaps using Kafka Streams to help index the feature data for easy usage on demand). The production app can either receive data from Kafka as a pipeline or even be a Kafka Streams application itself.

This is approximately 80% of my interests wrapped up in one post, so of course I’m going to read it…

Comments closed

Recognizing Wood Knot Images

Published 2017-09-28 by Kevin Feasel

Bob Horton and Vanja Paunic walk through a lumber grading scenario with Microsoft R Server:

Here we use the rxFeaturize function from Microsoft R Server, which allows us to perform a number of transformations on the knot images in order to produce numerical features. We first resize the images to fit the dimensions required by the pre-trained deep neural model we will use, then extract the pixels to form a numerical data set, then run that data set through a DNN pre-trained model. The result of the image featurization is a numeric vector (“feature vector”) that represents key characteristics of that image.

Image featurization here is accomplished by using a deep neural network (DNN) model that has already been pre-trained by using millions of images. Currently, MRS supports four types of DNNs – three ResNet models (18, 50, 101)[1] and AlexNet [8].

This is a practical example of how to use image recognition to facilitate machine learning.

Comments closed

Getting Started With TensorFlow

Published 2017-09-26 by Kevin Feasel

Vivek Kalyanrangan shows us how to install TensorFlow:

Installing Tensorflow with GPU requires you to have NVIDIA GPU. AMD video cards are not supported with tensorflow. NVIDIA uses low level GPU computing system called CUDA. It is an NVIDIA proprietary software.

One can go the OpenCL way with AMD but as of now it won’t work with tensorflow.

Also, all NVIDIA devices are not supported. Here is a list from the NVIDIA documentation listing the supported GPUs.

By the end of it, Vivek also shows us a simple trained model.

Comments closed

Machine Learning Services Updates

Published 2017-09-26 by Kevin Feasel

Umachandar Jayachandran and team have been busy. First, they announced a preview of SQL Server ML Services in Azure SQL Database:

In-database Machine Learning support was added in SQL Server 2016 and we are now bringing the same functionality to Azure SQL Database. You can now train and score machine learning models in Azure SQL Database and the predictions can be exposed to any application using your database, easily and seamlessly.

The preview functionality allows you to train and score machine learning models using data that fits in memory (in R data frame). Please note that the amount of memory available for R scripts execution depends on the edition of the Azure SQL database and cannot be modified.

No Python support there yet, but it’s upcoming. Second, we can use the PREDICT function in Azure SQL Database:

Today we are announcing the general availability of the native PREDICT Transact-SQL function in Azure SQL Database. The PREDICT function allows you to perform scoring in real-time using certain RevoScaleR or revoscalepy models in a SQL query without invoking the R or Python runtime.

The PREDICT function support was added in SQL Server 2017. It is a table-valued function that takes a RevoScaleR or revoscalepy model & data (in the form of a table or view or query) as inputs and generates predictions based on the machine learning model. More details of the PREDICT function can be found here.

There are a limited number of models which support PREDICT—things like linear and logistic regression, RevoScaleR’s fast decision trees, etc. If you have this type of model, however, the predictions stay within SQL Server and end up being much faster than going out to R.

Comments closed

Understanding Neural Networks: Perceptrons

Published 2017-09-13 by Kevin Feasel

Akash Sethi explains what a perceptron is:

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
Linear classifier defined that the training data should be classified into corresponding categories i.e. if we are applying classification for the 2 categories then all the training data must be lie in these two categories.
Binary classifier defines that there should be only 2 categories for classification.
Hence, The basic Perceptron algorithm is used for binary classification and all the training example should lie in these categories. The basic unit in the Neuron is called the Perceptron.

Click through to learn more about perceptrons.

Comments closed

Uses Of kd-trees

Published 2017-09-13 by Kevin Feasel

Sandipan Dey explains what a kd-tree is and how it works:

The prime advantage of a 2d-tree over a BST is that it supports efficient implementation of range search and nearest-neighbor search. Each node corresponds to an axis-aligned rectangle, which encloses all of the points in its subtree. The root corresponds to the entire plane [(−∞, −∞), (+∞, +∞ )]; the left and right children of the root correspond to the two rectangles split by the x-coordinate of the point at the root; and so forth.

Range search: To find all points contained in a given query rectangle, start at the root and recursively search for points in both subtrees using the following pruning rule: if the query rectangle does not intersect the rectangle corresponding to a node, there is no need to explore that node (or its subtrees). That is, search a subtree only if it might contain a point contained in the query rectangle.
Nearest-neighbor search: To find a closest point to a given query point, start at the root and recursively search in both subtrees using the following pruning rule: if the closest point discovered so far is closer than the distance between the query point and the rectangle corresponding to a node, there is no need to explore that node (or its subtrees). That is, search a node only if it might contain a point that is closer than the best one found so far. The effectiveness of the pruning rule depends on quickly finding a nearby point. To do this, organize the recursive method so that when there are two possible subtrees to go down, you choose first the subtree that is on the same side of the splitting line as the query point; the closest point found while exploring the first subtree may enable pruning of the second subtree.
k-nearest neighbors search: This method returns the k points that are closest to the query point (in any order); return all n points in the data structure if n ≤ k. It must do this in an efficient manner, i.e. using the technique from kd-tree nearest neighbor search, not from brute force.

Sandipan implements a fairly classic problem in this space: the behavior of a group of flocking birds.

Comments closed

Overfitting On Decision Trees

Published 2017-09-08 by Kevin Feasel

Ramandeep Kaur explains overfitting as well as how to prevent overfitting on decision trees:

Causes of Overfitting

There are two major situations that could cause overfitting in DTrees:

Overfitting Due to Presence of Noise – Mislabeled instances may contradict the class labels of other similar records.

Overfitting Due to Lack of Representative Instances – Lack of representative instances in the training data can prevent refinement of the learning algorithm.

A good model must not only fit the training data well
but also accurately classify records it has never seen.

How to avoid overfitting?

There are 2 major approaches to avoid overfitting in DTrees.

approaches that stop growing the tree earlier, before it reaches the point where it perfectly classifies the training data.
approaches that allow the tree to overfit the data, and then post-prune the tree.

Click through for more details on these two approaches.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30