
Category: Machine Learning

Machine Learning Services Updates

Umachandar Jayachandran and team have been busy.  First, they announced a preview of SQL Server ML Services in Azure SQL Database:

In-database Machine Learning support was added in SQL Server 2016 and we are now bringing the same functionality to Azure SQL Database. You can now train and score machine learning models in Azure SQL Database and the predictions can be exposed to any application using your database, easily and seamlessly.

The preview functionality allows you to train and score machine learning models using data that fits in memory (as an R data frame). Please note that the amount of memory available for R script execution depends on the edition of the Azure SQL database and cannot be modified.
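Just to make the shape of this concrete, here is a rough sketch (mine, not from the announcement) of calling the in-database R runtime through sp_execute_external_script from a Python client with pyodbc; the connection details and the dbo.SampleData table are hypothetical:

```python
import pyodbc

# Hypothetical Azure SQL Database connection; replace the placeholders.
conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<server>.database.windows.net;Database=<db>;UID=<user>;PWD=<password>"
)

# The R script runs inside the database engine; only the results come back to the client.
sql = """
EXEC sp_execute_external_script
    @language = N'R',
    @script   = N'OutputDataSet <- data.frame(fitted = lm(y ~ x, data = InputDataSet)$fitted.values)',
    @input_data_1 = N'SELECT x, y FROM dbo.SampleData';
"""

for row in conn.execute(sql):
    print(row)
```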

No Python support there yet, but it’s upcoming.  Second, we can use the PREDICT function in Azure SQL Database:

Today we are announcing the general availability of the native PREDICT Transact-SQL function in Azure SQL Database. The PREDICT function allows you to perform scoring in real-time using certain RevoScaleR or revoscalepy models in a SQL query without invoking the R or Python runtime.

Support for the PREDICT function was added in SQL Server 2017. It is a table-valued function that takes a RevoScaleR or revoscalepy model and data (in the form of a table, view, or query) as inputs and generates predictions based on the machine learning model. More details of the PREDICT function can be found here.

There are a limited number of models which support PREDICT—things like linear and logistic regression, RevoScaleR’s fast decision trees, etc.  If you have this type of model, however, the predictions stay within SQL Server and end up being much faster than going out to R.
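To give a feel for the syntax (the model table, model name, and scoring table below are all made up), a native PREDICT call issued from Python over pyodbc might look something like this; the column declared in the WITH clause has to match what the serialized model actually outputs:

```python
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<server>.database.windows.net;Database=<db>;UID=<user>;PWD=<password>"
)

# Assumes dbo.Models stores serialized RevoScaleR/revoscalepy models and
# dbo.NewOrders has the feature columns the model was trained on.
sql = """
DECLARE @model VARBINARY(MAX) =
    (SELECT model FROM dbo.Models WHERE model_name = 'sales_regression');

SELECT d.*, p.predicted_amount
FROM PREDICT(MODEL = @model, DATA = dbo.NewOrders AS d)
WITH (predicted_amount FLOAT) AS p;
"""

for row in conn.execute(sql):
    print(row)
```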


Understanding Neural Networks: Perceptrons

Akash Sethi explains what a perceptron is:

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
A linear classifier means that the training data should be classified into the corresponding categories, i.e. if we are classifying into two categories, then all of the training data must lie in those two categories.
A binary classifier means that there are only two categories for classification.
Hence, the basic perceptron algorithm is used for binary classification, and all of the training examples should lie in these categories. The basic unit of a neural network is called the perceptron.

Click through to learn more about perceptrons.
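If you would like to see the idea in code, here is a minimal sketch of the classic perceptron learning rule on a tiny linearly separable dataset (my own toy example, not from Akash's post):

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Perceptron learning rule for labels y in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # A misclassified point nudges the weights toward its own side
            if yi * (np.dot(w, xi) + b) <= 0:
                w += lr * yi * xi
                b += lr * yi
    return w, b

# AND-style data: only (1, 1) is in the positive class, so it is linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])

w, b = train_perceptron(X, y)
print(w, b, np.sign(X @ w + b))   # predictions match y
```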


Uses Of kd-trees

Sandipan Dey explains what a kd-tree is and how it works:

The prime advantage of a 2d-tree over a BST is that it supports efficient implementation of range search and nearest-neighbor search. Each node corresponds to an axis-aligned rectangle, which encloses all of the points in its subtree. The root corresponds to the entire plane [(−∞, −∞), (+∞, +∞ )]; the left and right children of the root correspond to the two rectangles split by the x-coordinate of the point at the root; and so forth.

  • Range search: To find all points contained in a given query rectangle, start at the root and recursively search for points in both subtrees using the following pruning rule: if the query rectangle does not intersect the rectangle corresponding to a node, there is no need to explore that node (or its subtrees). That is, search a subtree only if it might contain a point contained in the query rectangle.

  • Nearest-neighbor search: To find a closest point to a given query point, start at the root and recursively search in both subtrees using the following pruning rule: if the closest point discovered so far is closer than the distance between the query point and the rectangle corresponding to a node, there is no need to explore that node (or its subtrees). That is, search a node only if it might contain a point that is closer than the best one found so far. The effectiveness of the pruning rule depends on quickly finding a nearby point. To do this, organize the recursive method so that when there are two possible subtrees to go down, you choose first the subtree that is on the same side of the splitting line as the query point; the closest point found while exploring the first subtree may enable pruning of the second subtree.

  • k-nearest neighbors search: This method returns the k points that are closest to the query point (in any order); return all n points in the data structure if n ≤ k. It must do this in an efficient manner, i.e. using the technique from kd-tree nearest neighbor search, not from brute force.

Sandipan implements a fairly classic problem in this space:  the behavior of a group of flocking birds.
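If you want to play with the same operations without writing a 2d-tree by hand, scipy's cKDTree (my choice here, not something the post uses) gives you k-nearest-neighbor and radius queries out of the box:

```python
import numpy as np
from scipy.spatial import cKDTree

# Random 2-D points standing in for, say, bird positions in a flocking simulation
rng = np.random.default_rng(0)
points = rng.random((1000, 2))
tree = cKDTree(points)

query = np.array([0.5, 0.5])

# k-nearest-neighbors search: the 5 points closest to the query
dist, idx = tree.query(query, k=5)
print(points[idx])

# Radius search, a close cousin of the rectangular range search described above
neighbors = tree.query_ball_point(query, r=0.05)
print(len(neighbors), "points within 0.05 of the query")
```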


Overfitting On Decision Trees

Ramandeep Kaur explains overfitting as well as how to prevent overfitting on decision trees:

Causes of Overfitting

There are two major situations that could cause overfitting in DTrees:

  1. Overfitting Due to Presence of Noise – Mislabeled instances may contradict the class labels of other similar records.
  2. Overfitting Due to Lack of Representative Instances – Lack of representative instances in the training data can prevent refinement of the learning algorithm.

A good model must not only fit the training data well but also accurately classify records it has never seen.

How to avoid overfitting?

There are two major approaches to avoiding overfitting in DTrees:

  1. Approaches that stop growing the tree earlier, before it reaches the point where it perfectly classifies the training data.

  2. Approaches that allow the tree to overfit the data and then post-prune the tree.

Click through for more details on these two approaches.
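As a quick illustration of those two approaches (using scikit-learn rather than anything from the post), pre-pruning caps the tree's growth up front, while cost-complexity pruning lets the tree grow and then trims it back:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Unrestricted tree: typically fits the training data perfectly and overfits
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# 1. Pre-pruning: stop growing early by limiting depth and leaf size
pre_pruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5,
                                    random_state=0).fit(X_train, y_train)

# 2. Post-pruning: grow the tree, then apply minimal cost-complexity pruning
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01,
                                     random_state=0).fit(X_train, y_train)

for name, model in [("full", full), ("pre-pruned", pre_pruned), ("post-pruned", post_pruned)]:
    print(f"{name:12s} train={model.score(X_train, y_train):.3f} "
          f"test={model.score(X_test, y_test):.3f}")
```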


Sentiment Analysis In Power BI

Chris Webb has a new Power BI custom data connector:

I’m pleased to announce that I’ve published my first Power BI custom data connector on GitHub here:

https://github.com/cwebbbi/PowerBITextAnalytics

Basically, it acts as a wrapper for the Microsoft Cognitive Services Text Analytics API and makes it extremely easy to do language detection and sentiment analysis, and to extract key phrases from text, when you are loading data into Power BI.

Read the whole thing, as Chris has a great demo of it.


Classifying Time Series Data With TensorFlow

Burak Himmetoglu applies a time series data set to two different types of neural networks using TensorFlow:

In this blog post, I will discuss the use of deep learning methods to classify time-series data, without the need to manually engineer features. The example I will consider is the classic Human Activity Recognition (HAR) dataset from the UCI repository. The dataset contains the raw time-series data, as well as a pre-processed one with 561 engineered features. I will compare the performance of typical machine learning algorithms which use engineered features with two deep learning methods (convolutional and recurrent neural networks) and show that deep learning can surpass the performance of the former.

I have used TensorFlow for the implementation and training of the models discussed in this post. In the discussion below, code snippets are provided to explain the implementation. For the complete code, please see my GitHub repository.

Click through for the samples, or check out the repo, linked above.
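Burak's post works at a lower level of TensorFlow; as a rough sketch of the convolutional approach, a small Keras 1-D CNN with the HAR shapes (128 timesteps, 9 channels, 6 classes) and synthetic placeholder data looks something like this:

```python
import numpy as np
import tensorflow as tf

# Random data standing in for the raw HAR signals: 128 timesteps x 9 channels, 6 classes
n_samples, n_steps, n_channels, n_classes = 1000, 128, 9, 6
X = np.random.randn(n_samples, n_steps, n_channels).astype("float32")
y = np.random.randint(0, n_classes, size=n_samples)

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu",
                           input_shape=(n_steps, n_channels)),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(128, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=64, validation_split=0.2)
```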


Using NLP To Find Similar Facebook Posts

The folks at Knoyd put together a word embedding example by scraping a Python Facebook group:

We are going to represent the content of a Facebook post using word embeddings and compare the transformed posts using word mover's distance. The combination of the two has shown lower k-nearest-neighbor document classification error rates compared to other state-of-the-art techniques.

The advantage of word embeddings is that the words which have similar meanings but don’t have any letters in common will still have similar vectors (be close) in the embedded space (e.g. lion and tiger).

There’s a good high-level discussion of techniques in this post.
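To get a feel for word mover's distance without scraping Facebook yourself, here is a small sketch using pretrained GloVe vectors through gensim (wmdistance also needs the pyemd/POT package installed); the example posts are made up:

```python
import gensim.downloader as api

# Pretrained embeddings stand in for whatever vectors the Knoyd team used
vectors = api.load("glove-wiki-gigaword-50")   # downloads on first use

post_a = "how do i read a csv file with pandas".split()
post_b = "loading data from csv into a dataframe".split()
post_c = "best restaurants near the conference venue".split()

# Word mover's distance: smaller means the posts are more semantically similar
print(vectors.wmdistance(post_a, post_b))   # should be relatively small
print(vectors.wmdistance(post_a, post_c))   # should be larger
```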


TensorFlow On The Pi

Pete Warden shows how to install TensorFlow on a Raspberry Pi:

It’s never been easy to get TensorFlow installed on a Pi though. I had created a makefile script that let you build the C++ part from scratch, but it took several hours to complete and didn’t support Python. Sam Abrahams, an external contributor, did an amazing job maintaining a Python pip wheel for major releases, but building it required you to add swap space on a USB device for your Pi, and took even longer to compile than the makefile approach. Snips managed to get TensorFlow cross-compiling for Rust, but it wasn’t clear how to apply this to other languages.

Plenty of people on the team are Pi enthusiasts, and happily Eugene Brevdo dived in to investigate how we could improve the situation. We knew we wanted to have something that could be run as part of TensorFlow’s Jenkins continuous integration system, which meant building a completely automatic solution that would run with no user intervention. Since having a Pi plugged into a machine to run something like the makefile build would be hard to maintain, we did try using a hosted server from Mythic Beasts. Eugene got the makefile build going after a few hiccups, but the Python version required more RAM than was available, and we couldn’t plug in a USB drive remotely!

Read the whole thing, if only for the science experiment aspect.


Understanding Decision Trees

Ramandeep Kaur explains how decision trees work:

Simply put, a decision tree is a tree in which each branch node represents a choice between a number of alternatives, and each leaf node represents a decision.

It is a type of supervised learning algorithm (having a pre-defined target variable) that is mostly used in classification problems and works for both categorical and continuous input and output variables. It is one of the most widely used and practical methods for Inductive Inference. (Inductive inference is the process of reaching a general conclusion from specific examples.)

Decision trees learn and train themselves from given examples and make predictions for unseen examples.

Click through for an example of implementing the ID3 algorithm and generating a decision tree from a data set.
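The heart of ID3 is picking, at each node, the attribute with the highest information gain; here is a minimal sketch of that calculation on a toy play-tennis-style example (mine, not taken from the post):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of an array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """Entropy reduction from splitting on one categorical feature (ID3's criterion)."""
    values, counts = np.unique(feature, return_counts=True)
    weighted = sum((c / len(labels)) * entropy(labels[feature == v])
                   for v, c in zip(values, counts))
    return entropy(labels) - weighted

outlook = np.array(["sunny", "sunny", "overcast", "rain", "rain", "overcast"])
play    = np.array(["no",    "no",    "yes",      "yes",  "no",   "yes"])
print(information_gain(outlook, play))   # about 0.67 bits
```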


Deep Learning Isn’t The End-All Be-All Solution

Pablo Cordero explains that deep learning solutions are not the best choice in all cases:

The second preconception I hear the most is the hype. Many yet-to-be practitioners expect deep nets to give them a mythical performance boost just because it worked in other fields. Others are inspired by impressive work in modeling and manipulating images, music, and language – three data types close to any human heart – and rush headfirst into the field by trying to train the latest GAN architecture. The hype is real in many ways. Deep learning has become an undeniable force in machine learning and an important tool in the arsenal of any data modeler. Its popularity has brought forth essential frameworks such as tensorflow and pytorch that are incredibly useful even outside deep learning. Its underdog to superstar origin story has inspired researchers to revisit other previously obscure methods like evolutionary strategies and reinforcement learning. But it’s not a panacea by any means. Aside from lunch considerations, deep learning models can be very nuanced and require careful and sometimes very expensive hyperparameter searches, tuning, and testing (much more on this later in the post). Besides, there are many cases where using deep learning just doesn’t make sense from a practical perspective and simpler models work much better.

It’s a very interesting article, pointing out that deep learning solutions work better than expected on smaller data sizes, but there are areas where it’s preferable to choose something else.
