A Basic Explanation Of Associative Rule Learning

Akshansh Jain has some notes on associative rules:

Support tells us how frequent an item, or an itemset, is across all of the data. It basically tells us how popular an itemset is in the given dataset. For example, in the above-given dataset, if we look at Learning Spark, we can calculate its support by taking the number of transactions in which it occurs and dividing that by the total number of transactions.

Support{Learning Spark} = 4/5
Support{Programming in Scala} = 2/5
Support{Learning Spark, Programming in Scala} = 1/5
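
The arithmetic above can be sketched in a few lines of Python. The excerpt does not reproduce the original dataset, so the five transactions below are a hypothetical stand-in chosen only to match the three support values quoted; the titles "Book A" and "Book B" and the helper name `support` are likewise assumptions for illustration.

```python
from fractions import Fraction

# Hypothetical transactions (NOT the original article's dataset),
# built so that the support values match the ones quoted above.
transactions = [
    {"Learning Spark", "Programming in Scala"},
    {"Learning Spark", "Book A"},
    {"Learning Spark"},
    {"Learning Spark", "Book B"},
    {"Programming in Scala", "Book B"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    # A transaction counts when the itemset is a subset of it.
    matches = sum(1 for t in transactions if itemset <= t)
    return Fraction(matches, len(transactions))

print(support({"Learning Spark"}, transactions))                          # 4/5
print(support({"Programming in Scala"}, transactions))                    # 2/5
print(support({"Learning Spark", "Programming in Scala"}, transactions))  # 1/5
```

Using `Fraction` keeps the results exact, so they print in the same form as the values listed above.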

Support tells us how important or interesting an itemset is, based on its number of occurrences. This is an important measure because real data can contain millions or billions of records, and working on every itemset is pointless: if, among millions of purchases, a single user buys Programming in Scala and a cooking book, that itemset is of no interest to us.
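
One way to picture this filtering is a minimum-support threshold: itemsets that fall below it are dropped before any further analysis. The sketch below is a brute-force illustration on hypothetical data, not the method from the linked article and not the Apriori algorithm; the transactions, the 2/5 threshold, and the function name `frequent_itemsets` are all assumptions.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical transactions (NOT the original article's dataset).
transactions = [
    {"Learning Spark", "Programming in Scala"},
    {"Learning Spark", "Book A"},
    {"Learning Spark"},
    {"Learning Spark", "Book B"},
    {"Programming in Scala", "Book B"},
]

def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets meeting min_support.

    Brute force: enumerates every combination of up to max_size items,
    which is only practical for tiny examples like this one.
    """
    items = sorted(set().union(*transactions))
    total = len(transactions)
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            candidate_support = Fraction(count, total)
            if candidate_support >= min_support:
                result[candidate] = candidate_support
    return result

# Keep only itemsets that appear in at least 2 of the 5 transactions.
frequent = frequent_itemsets(transactions, min_support=Fraction(2, 5))
for itemset, value in frequent.items():
    print(itemset, value)
```

With this threshold, every pair in the hypothetical data is discarded because each appears in only one transaction, which is the kind of rare combination the paragraph above describes as uninteresting.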

Read the whole thing.
