Approximation Or Classification?

A blog post on the Algolytics blog discusses different approximation and classification models and when to use each:

Even if your target variable is a numeric one, sometimes it’s better to use classification methods instead of approximation ones. For instance if you have mostly zero target values and just a few non-zero values. Change the latter to 1, in this case you’ll have two categories: 1 (positive value of your target variable) and 0. You can also split numerical variable into multiple subgroups: apartment prices for low, medium and high by equal subset width and predict them using classification algorithms. This process is called discretization.

Both types of models are common in machine learning, so a good understanding of when to use which is important.
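
As a rough illustration of the two ideas in the quoted passage, here is a minimal Python sketch. It is not taken from the Algolytics post: the function names, the sample numbers, and the "low / medium / high" labels are assumptions made up for this example. The first function turns a mostly-zero numeric target into a 0/1 class label; the second splits a numeric variable into bins of equal width.

```python
def binarize_target(values):
    """Map a numeric target to two classes: 1 for any non-zero value, else 0."""
    return [1 if v != 0 else 0 for v in values]


def equal_width_bins(values, labels=("low", "medium", "high")):
    """Assign each value to one of len(labels) bins of equal width."""
    lo, hi = min(values), max(values)
    n = len(labels)
    width = (hi - lo) / n
    if width == 0:
        # All values are identical, so everything falls in the first bin.
        return [labels[0] for _ in values]
    binned = []
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        idx = min(int((v - lo) / width), n - 1)
        binned.append(labels[idx])
    return binned


# Hypothetical data: a mostly-zero target and a handful of apartment prices.
claims = [0, 0, 0, 120.5, 0, 0, 42.0, 0]
prices = [100, 150, 200, 250, 300, 400]

print(binarize_target(claims))   # [0, 0, 0, 1, 0, 0, 1, 0]
print(equal_width_bins(prices))  # ['low', 'low', 'medium', 'medium', 'high', 'high']
```

Either output can then be used as the label column for a classification algorithm. Equal-width bins are only one option; with skewed data, most rows can end up in a single bin, so it is worth checking the class counts before training.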

