Nested Resampling In R

Kevin Feasel

2017-09-07

R

Max Kuhn describes how nested resampling works:

A common method for tuning models is grid search where a candidate set of tuning parameters is created. The full set of models for every combination of the tuning parameter grid and the resamples is created. Each time, the assessment data are used to measure performance and the average value is determined for each tuning parameter.

The potential problem is, once we pick the tuning parameter associated with the best performance, this value is usually quoted as the performance of the model. There is serious potential for optimization bias since we uses the same data to tune the model and quote performance. This can result in an optimistic estimate of performance.

Nested resampling does an additional layer of resampling that separates the tuning activities from the process used to estimate the efficacy of the model. An outer resampling scheme is used and, for every split in the outer resample, another full set of resampling splits are created on the original analysis set. For example, if 10-fold cross-validation is used on the outside and 5-fold cross-validation on the inside, a total of 500 models will be fit. The parameter tuning will be conducted 10 times and the best parameters are determined from the average of the 5 assessment sets.

Definitely worth the read.  H/T R-Bloggers

Related Posts

Biases in Tree-Based Models

Nina Zumel looks at tree-based ensembling models like random forest and gradient boost and shows that they can be biased: In our previous article , we showed that generalized linear models are unbiased, or calibrated: they preserve the conditional expectations and rollups of the training data. A calibrated model is important in many applications, particularly when financial data […]

Read More

R 3.6.1 Available

David Smith notes a new version of R is available: On July 5, the R Core Group released the source code for the latest update to R, R 3.6.1, and binaries are now available to download for Windows, Linux and Mac from your local CRAN mirror. R 3.6.1 is a minor update to R that fixes a few bugs. […]

Read More

Categories

September 2017
MTWTFSS
« Aug Oct »
 123
45678910
11121314151617
18192021222324
252627282930