Approximation Or Classification?

A post on the Algolytics blog discusses approximation and classification models and when to use each:

Even if your target variable is a numeric one, sometimes it’s better to use classification methods instead of approximation ones. For instance, if you have mostly zero target values and just a few non-zero values, change the latter to 1; in this case you’ll have two categories: 1 (positive value of your target variable) and 0. You can also split a numerical variable into multiple subgroups: apartment prices into low, medium, and high by equal subset width, and predict them using classification algorithms. This process is called discretization.

Both types of models are common in machine learning, so a good understanding of when to use which is important.
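As a rough illustration of the two transformations the quote describes, here is a minimal sketch in Python using pandas. The price values and bin labels are made up for the example, not taken from the Algolytics post.

```python
import pandas as pd

# Hypothetical target: mostly zeros with a few non-zero apartment prices.
prices = pd.Series([0, 0, 0, 120_000, 0, 340_000, 0, 95_000, 0, 0])

# Binarization: collapse the non-zero values to 1, giving two classes (0 and 1).
binary_target = (prices > 0).astype(int)

# Equal-width discretization: split the non-zero prices into three bins
# of equal width and label them low / medium / high.
price_bins = pd.cut(prices[prices > 0], bins=3, labels=["low", "medium", "high"])

print(binary_target.tolist())
print(price_bins.tolist())
```

Either derived target can then be fed to an ordinary classification algorithm in place of the original numeric variable.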

