In this note we will answer “what is a good test set size?” three ways:
– The usual practical answer.
– A decision theory answer.
– A novel variational answer.
Each of these answers is a bit different, as each is derived under slightly different assumptions and optimizes a different objective. Knowing all three solutions gives us some perspective on the problem.
My rule of thumb is that I want the test set to be as small as possible while still being likely to hit every real-world scenario enough times to support a valid comparison. Keeping the test set small conversely maximizes the size of the training set, giving us the best chance of seeing the widest variety of scenarios we can during the formative phase.
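One standard way to make "small as possible but still valid" quantitative is the binomial sample-size formula for an accuracy estimate. This is my own illustrative sketch, not a method from John's post: it asks how many test examples you need for the measured accuracy to have roughly a given margin of error at 95% confidence, using the normal approximation.

```python
import math

def min_test_size(p_hat: float = 0.5, margin: float = 0.02, z: float = 1.96) -> int:
    """Minimum test-set size so a binomial accuracy estimate near p_hat
    has roughly a +/- margin confidence interval (normal approximation).
    z = 1.96 corresponds to 95% confidence; p_hat = 0.5 is the worst case."""
    return math.ceil(z**2 * p_hat * (1 - p_hat) / margin**2)

# Worst case (accuracy near 50%), +/-2% margin at 95% confidence:
print(min_test_size())  # → 2401
```

So under these assumptions, a couple of thousand examples already pins accuracy down to within a couple of percentage points, which is one concrete sense in which the test set can stay small.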
And as usual, John goes way deeper than my rules of thumb. I like this post a lot.