
Category: Machine Learning

Machine Learning Data Preparation Tips

Jen Underwood has some good tips for preparing data for a machine learning operation:

Data preparation for machine learning requires business domain expertise, bias awareness and an experimental thought process. Before preparing your data, you’ll first define a business problem to solve. During that exercise, you’ll select an outcome metric and brainstorm potential input variables that influence it from many varied perspectives. From there you will begin identifying, collecting, cleaning, shaping and sampling data to run through automated machine learning model processes.

Note that it is also not unusual for relevant machine learning input data to occur outside of existing transactional processes. If that is the case, you can still start creating a first-generation machine learning model with existing data and continue to build new model versions over time as supplementary data is acquired.

Click through for the ten tips.


Installing SQL Server 2017 Machine Learning Services

Ginger Grant shows how to install SQL Server 2017 Machine Learning Services:

There are two installation options: In-Database or Standalone. If you are evaluating Machine Learning Services and you have no knowledge of what the load may be, start by selecting Machine Learning Services In-Database. There are several reasons why you want to select the In-Database option by default. One of the problems that Microsoft was looking to solve by incorporating advanced data analytics was to improve performance of the native code by greatly reducing data latency. If you are analyzing a lot of data which is stored within SQL Server, performance will improve if the data does not need to be moved around on a network. The licensing costs of installing R Server standalone also need to be evaluated with a Microsoft representative. An evaluation of the resource load on the network, as well as analysis of the code running on SQL Server, should be performed prior to the decision to install the Machine Learning Server Standalone.

Read the whole thing.


What Happens In Deep Neural Networks?

Adrian Colyer has a two-parter summarizing an interesting academic paper regarding deep neural networks.  Part one introduces the theory:

Section 2.4 contains a discussion on the crucial role of noise in making the analysis useful (which sounds kind of odd on first reading!). I don’t fully understand this part, but here’s the gist:

The learning complexity is related to the number of relevant bits required from the input patterns X for a good enough prediction of the output label Y, or the minimal I(X; \hat{X}) under a constraint on I(\hat{X}; Y) given by the IB.

Without some noise (introduced for example by the use of sigmoid activation functions) the mutual information is simply the entropy H(Y) independent of the actual function we’re trying to learn, and nothing in the structure of the points p(y|x) gives us any hint as to the learning complexity of the rule. With some noise, the function turns into a stochastic rule, and we can escape this problem. Anyone with a lay-person’s explanation of why this works, please do post in the comments!

Part two digs in deeper:

The different colours in the chart represent the different hidden layers (and there are multiple points of each colour because we’re looking at 50 different runs all plotted together). On the x-axis is I(X;T), so as we move to the right on the x-axis, the amount of mutual information between the hidden layer and the input X increases. On the y-axis is I(T;Y), so as we move up on the y-axis, the amount of mutual information between the hidden layer and the output Y increases.

I’m used to thinking of progressing through the network layers from left to right, so it took a few moments for it to sink in that it’s the lowest layer which appears in the top-right of this plot (maintains the most mutual information), and the top-most layer which appears in the bottom-left (has retained almost no mutual information before any training). So the information path being followed goes from the top-right corner to the bottom-left traveling down the slope.

This is worth a careful reading.
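
If the axes in that chart feel abstract, a tiny numeric example may help. Below is a minimal sketch of computing mutual information between two discrete variables from a joint probability table. The function name and the toy distributions are my own illustrative assumptions, not something taken from the paper or from Adrian's posts.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(A;B) in bits from a joint probability table p(a, b)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()            # normalize in case raw counts were passed
    pa = joint.sum(axis=1, keepdims=True)  # marginal p(a), shape (n, 1)
    pb = joint.sum(axis=0, keepdims=True)  # marginal p(b), shape (1, m)
    mask = joint > 0                       # skip zero cells: 0 * log 0 is taken as 0
    ratio = joint[mask] / (pa @ pb)[mask]
    return float(np.sum(joint[mask] * np.log2(ratio)))

# Noise-free relationship: B is a copy of A, so I(A;B) equals the entropy of A (1 bit).
deterministic = [[0.5, 0.0],
                 [0.0, 0.5]]

# Independent variables share no information.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

# A noisy channel that flips the bit 10% of the time sits in between.
noisy = [[0.45, 0.05],
         [0.05, 0.45]]

mi_deterministic = mutual_information(deterministic)
mi_independent = mutual_information(independent)
mi_noisy = mutual_information(noisy)
```

The deterministic case lines up with the remark in part one that, without noise, the mutual information collapses to an entropy; once some noise is added, the value depends on how strongly the two variables are actually coupled.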


TensorFlow Tutorial

Ashish Bakshi has a TensorFlow tutorial:

As shown in the image above, tensors are just multidimensional arrays that allow you to represent data having higher dimensions. In general, in Deep Learning you deal with high-dimensional data sets where dimensions refer to different features present in the data set. In fact, the name “TensorFlow” has been derived from the operations which neural networks perform on tensors. It’s literally a flow of tensors. Since you have understood what tensors are, let us move ahead in this TensorFlow tutorial and understand – what is TensorFlow?

The sample here is Python, though there is an R library as well.
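
To make the "tensors are just multidimensional arrays" point concrete, here is a minimal sketch. It uses NumPy as a stand-in so that it runs without TensorFlow installed; the shapes and the toy weight matrix are assumptions for illustration and do not come from Ashish's tutorial.

```python
import numpy as np

scalar = np.array(7.0)                  # rank 0: a single number
vector = np.array([1.0, 2.0, 3.0])      # rank 1: e.g. one row of features
matrix = np.ones((4, 3))                # rank 2: 4 samples x 3 features
images = np.zeros((10, 28, 28, 3))      # rank 4: 10 RGB images of 28x28 pixels

ranks = [t.ndim for t in (scalar, vector, matrix, images)]
shapes = [t.shape for t in (scalar, vector, matrix, images)]

# One step of a tensor "flowing" through a network: a dense layer is a matrix product.
weights = np.full((3, 2), 0.5)          # hypothetical layer mapping 3 features to 2 outputs
outputs = matrix @ weights              # resulting shape is (4, 2)
```

The number of dimensions (`ndim`) is the tensor's rank, and each operation takes tensors in and hands tensors on, which is roughly where the "flow" in the name comes from.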


Fooling Neural Networks

Rodrigo Agundez shows how to fool neural networks:

A comprehensive and complete summary can be found in the When DNNs go wrong blog, which I recommend you read.

All these amazing studies use state of the art deep learning techniques, which makes them (in my opinion) difficult to reproduce and to answer questions we might have as non-experts in this subject.

My intention in this blog is to bring the main concepts down to earth, to an easily reproducible setting where they are clear and actually visible. In addition, I hope this short blog can provide a better understanding of the limitations of discriminative models in general. The complete code used in this blog post can be found here.

This is a great article.


Sentiment Analysis With Python In SQL Server

Nellie Gustafsson has a quick example of sentiment analysis using SQL Server Machine Learning Services:

You don’t have to be a data scientist to use machine learning in SQL Server. You can use pre-trained models, available out of the box, to do your analysis. The following example shows how to quickly get started with text sentiment analysis.

Before starting to use this model, you need to install it. The installation is quick and instructions for installing the model can be found here: How to install the models on SQL Server

Once you have SQL Server installed with Machine Learning Services, enabled external script execution, and installed the pre-trained model, you can execute the following script to create a stored procedure that uses Python and the microsoftml function get_sentiment with the pre-trained model to determine the probability of positive sentiment of a text:

Click through to read the whole thing.


Position Differences And Convolutional Neural Networks

Pete Warden shares his knowledge of how convolutional neural networks deal with position differences in images:

If you’re trying to recognize all images with the sun shape in them, how do you make sure that the model works even if the sun can be at any position in the image? It’s an interesting problem because there are really three stages of enlightenment in how you perceive it:

  • If you haven’t tried to program computers, it looks simple to solve because our eyes and brain have no problem dealing with the differences in positioning.

  • If you have tried to solve similar problems with traditional programming, your heart will probably sink because you’ll know both how hard dealing with input differences will be, and how tough it can be to explain to your clients why it’s so tricky.

  • As a certified Deep Learning Guru, you’ll sagely stroke your beard and smile, safe in the knowledge that your networks will take such trivial issues in their stride.

It’s a good read.
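
For a hands-on feel of why position matters less to a convolutional network, here is a minimal NumPy sketch of one commonly cited ingredient: slide the same filter over the whole image, then take the maximum response. The toy "sun" pattern, the helper names, and the image size are my own assumptions, and a real network learns its filters rather than being handed the pattern.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and return the response map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def place_pattern(pattern, top, left, size=8):
    """Return a blank size x size image with the pattern pasted at (top, left)."""
    image = np.zeros((size, size))
    h, w = pattern.shape
    image[top:top + h, left:left + w] = pattern
    return image

# A toy 3x3 "sun": bright centre with rays on the four sides.
sun = np.array([[0.0, 1.0, 0.0],
                [1.0, 1.0, 1.0],
                [0.0, 1.0, 0.0]])

image_a = place_pattern(sun, top=0, left=0)
image_b = place_pattern(sun, top=4, left=5)

# Use the pattern itself as the filter; the same weights are applied at every position.
response_a = correlate2d_valid(image_a, sun)
response_b = correlate2d_valid(image_b, sun)

# The peak moves with the sun, but global max pooling reports the same score.
peak_a = np.unravel_index(np.argmax(response_a), response_a.shape)
peak_b = np.unravel_index(np.argmax(response_b), response_b.shape)
score_a = response_a.max()
score_b = response_b.max()
```

The location of the strongest response follows the sun around the image, while the pooled score stays the same, so whatever sits on top of the pooling step sees an identical signal in both cases.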


Columnstore Indexes And ML Services

Niko Neugebauer picks up on some changes that SQL Server 2017 Machine Learning Services can use with respect to columnstore indexes:

I expect not just a couple of rows to be sent over to the Machine Learning Services, but huge tables with millions of rows that also contain hundreds of columns, because these kinds of tables are the basis for Data Science and Machine Learning processes.

Of course, we are focusing here on a rather small part of the total process (just the IO between the SQL Server relational engine and the Machine Learning Services), and the analytical process itself can take hours, but the IO can still make a good difference in some cases.

I love this improvement, which is very under-the-hood, but it will help a couple of people decide to migrate to the freshly released SQL Server 2017 instead of SQL Server 2016.

I haven’t quite taken advantage of this yet (just moved to 2017 but still in 130 compatibility mode) but fingers crossed that I’ll see those improvements.


Dealing With Word Tensors

Chris Moody continues his series on natural language processing:

Counting and tensor decompositions are elegant and straightforward techniques. But these methods are grossly underrepresented in business contexts. In this post we factorized an example made up of word skipgrams occurring within documents to arrive at word and document vectors simultaneously. This kind of analysis is effective, simple, and yields powerful concepts.

Look to your own data, and before throwing black-box deep learning machines at them, try out tensor factorizations!

He has a set of animated GIFs to help with learning, though I do wish they were about 30% slower so you can take a moment to read each section before it jumps to the next bit.
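
As a rough illustration of the "count, then factorize" idea, here is a minimal sketch. Note that it swaps Chris's full tensor decomposition for a much simpler stand-in: a plain SVD of a word-by-word skipgram count matrix. The toy corpus, the window size, and the number of factors are all assumptions made up for the example.

```python
import numpy as np

docs = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat and a dog played",
]

# Build the vocabulary and an index for each word.
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}

# Count skipgrams: pairs of words that co-occur within a window of 2 positions.
window = 2
counts = np.zeros((len(vocab), len(vocab)))
for doc in docs:
    words = doc.split()
    for i, word in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if i != j:
                counts[index[word], index[words[j]]] += 1

# Factorize the count matrix; keep the top-k singular directions as word vectors.
k = 3
u, s, vt = np.linalg.svd(counts)
word_vectors = u[:, :k] * np.sqrt(s[:k])

# The rank-k reconstruction error shows how much structure the k factors capture.
approx = (u[:, :k] * s[:k]) @ vt[:k, :]
relative_error = np.linalg.norm(counts - approx) / np.linalg.norm(counts)
```

Adding a document axis to the counts and factorizing the resulting three-way array is the step that would yield word and document vectors together, which is where the post itself goes.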


Online Learning Algorithms

Xin Hunt describes the benefits of online learning algorithms:

A few examples of classical online learning algorithms include recursive least squares, stochastic gradient descent and multi-armed bandit algorithms like Thompson sampling. Many online algorithms (including recursive least squares and stochastic gradient descent) have offline versions. These online algorithms are usually developed after the offline version, and are designed for better scaling with large datasets and streaming data. Algorithms like Thompson sampling on the other hand, do not have offline counterparts, because the problems they solve are inherently online.

Let’s look at interactive ad recommendation systems as an example. You’ll find ads powered by these systems when you browse popular publications, weather sites and social media networks. These recommendation systems build customer preference models by tracking your shopping and browsing activities (ad clicking, wish list updates and purchases, for example). Due to the transient nature of shopping behaviors, new recommendations must reflect the most recent activities. This makes online learning a natural choice for these systems.
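
To show what "online" means in practice, here is a minimal sketch of stochastic gradient descent fitting a linear model one observation at a time. The class name, the learning rate, and the simulated data stream are assumptions for illustration and are not taken from Xin's article.

```python
import numpy as np

class OnlineLinearRegressor:
    """Linear model updated one observation at a time with stochastic gradient descent."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = np.zeros(n_features)
        self.learning_rate = learning_rate

    def predict(self, x):
        return float(self.weights @ x)

    def update(self, x, y):
        """Take one gradient step on the squared error for a single (x, y) pair."""
        error = self.predict(x) - y
        self.weights -= self.learning_rate * error * x
        return error

rng = np.random.default_rng(0)
true_weights = np.array([2.0, -1.0, 0.5])   # hypothetical ground truth for the simulation

model = OnlineLinearRegressor(n_features=3)
for _ in range(2000):
    x = rng.normal(size=3)                          # simulated streaming features
    y = true_weights @ x + rng.normal(scale=0.01)   # simulated noisy label
    model.update(x, y)                              # learn, then discard the sample
```

The model never holds more than one sample in memory, which is the property that makes this style of update a fit for streaming data.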

My favorite online learning algorithm at the moment is Online Passive-Aggressive Algorithms.  Not just because that name describes my Twitter feed.
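
For the curious, here is a minimal sketch of the basic Passive-Aggressive update for binary classification: do nothing when an example is already classified with enough margin, otherwise take the smallest step that fixes it. The simulated data, the margin used to generate it, and the variable names are my own assumptions.

```python
import numpy as np

def passive_aggressive_update(weights, x, y):
    """One step of the basic Passive-Aggressive classifier update; y is +1 or -1."""
    loss = max(0.0, 1.0 - y * float(weights @ x))   # hinge loss on this example
    if loss == 0.0:
        return weights                               # passive: margin already satisfied
    tau = loss / float(x @ x)                        # aggressive: smallest step that fixes it
    return weights + tau * y * x

rng = np.random.default_rng(1)
separator = np.array([1.0, -2.0])                    # hypothetical true decision boundary
direction = separator / np.linalg.norm(separator)
margin = 0.5                                         # assumed gap around the boundary

def sample_points(n):
    """Simulate n labeled points pushed at least `margin` away from the boundary."""
    raw = rng.normal(size=(n, 2))
    labels = np.where(raw @ direction >= 0, 1.0, -1.0)
    return raw + margin * labels[:, None] * direction, labels

train_x, train_y = sample_points(500)
weights = np.zeros(2)
for x, y in zip(train_x, train_y):
    weights = passive_aggressive_update(weights, x, y)

# Check accuracy on fresh simulated points.
test_x, test_y = sample_points(1000)
predictions = np.where(test_x @ weights >= 0, 1.0, -1.0)
accuracy = float(np.mean(predictions == test_y))
```

After an aggressive step the updated weights classify that example with a margin of exactly 1, which is what gives the algorithm its name. The published variants add a cap or penalty on the step size to cope with noisy labels.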
