William Vorhies discusses a new technical paper on Local Interpretable Model-Agnostic Explanations (LIME):

What the model actually used for classification were these: ‘posting’, ‘host’, ‘NNTP’, ‘EDU’, ‘have’, ‘there’. These are meaningless artifacts that appear in both the training and test sets and have nothing to do with the topic, except that, for example, the word “posting” (part of the email header) appears in 21.6% of the examples in the training set but only twice in the class “Christianity.”

Is this model going to generalize?  Absolutely not.
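To see concretely how LIME surfaces artifacts like these, here is a minimal sketch of my own (not code from the paper or the post), assuming the lime Python package and scikit-learn's 20 Newsgroups loader for the same Christianity-vs-atheism task:

from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Christianity vs. atheism, with message headers left in so artifacts
# like "NNTP" and "EDU" remain available as (spurious) features.
categories = ['alt.atheism', 'soc.religion.christian']
train = fetch_20newsgroups(subset='train', categories=categories)

pipeline = make_pipeline(TfidfVectorizer(lowercase=False), MultinomialNB())
pipeline.fit(train.data, train.target)

# Explain one prediction: LIME perturbs the document by dropping words
# and fits a local linear model to see which words drove the class.
explainer = LimeTextExplainer(class_names=train.target_names)
explanation = explainer.explain_instance(train.data[0],
                                         pipeline.predict_proba,
                                         num_features=6)
print(explanation.as_list())  # top words and their local weights

The list printed at the end is the per-document analogue of the six words quoted above: each word paired with the weight LIME's local surrogate model assigned to it.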

An Example from Image Processing

In this example, Google’s Inception NN was used to classify an arbitrary image of a tree frog. The classifier assigned about 54% probability to “tree frog,” but also interpreted the image as a pool table (7%) and a balloon (5%).
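For images, LIME follows the same recipe but perturbs superpixels rather than words. A minimal sketch, assuming the lime and scikit-image packages; the predict_fn below is a hypothetical stand-in for a real Inception network, and the random array stands in for the tree frog photo:

import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_fn(images):
    # Hypothetical stand-in: a real implementation would run the batch of
    # images (N, H, W, 3) through Inception and return class probabilities.
    return np.random.rand(len(images), 1000)

image = np.random.rand(224, 224, 3)  # stand-in for the tree frog photo

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(image, predict_fn,
                                         top_labels=3, hide_color=0,
                                         num_samples=1000)

# Highlight the superpixels that pushed the prediction toward the top class
# ("tree frog" in the example; the other top_labels cover "pool table"
# and "balloon").
temp, mask = explanation.get_image_and_mask(explanation.top_labels[0],
                                            positive_only=True,
                                            num_features=5, hide_rest=False)
highlighted = mark_boundaries(temp, mask)

Calling get_image_and_mask with the other entries of top_labels produces explanations for the competing classes as well, which is how you can see what part of the image looked like a pool table or a balloon.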

Looks like an interesting paper. Click through for a link to it.


