Here is a quick data-scientist / data-analyst question: what is the overall trend or shape in the following noisy data? For our specific example: How do we relate value as a noisy function (or relation) of m? This example arose in producing our tutorial “The Nature of Overfitting”.
Here’s a quick summary of my general philosophy: the data are more interesting than a smoothed line. I’m okay putting in a smoothed line to help a reader make sense of a trend, but I wouldn’t want to have a plot with just the smoothed line. Read the whole thing from John to get well beyond my rule of thumb.
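To make the rule of thumb concrete, here is a minimal sketch of a plot that shows the raw points with a smoothed line drawn on top of them. It is not John's method: the data are made up for illustration, the smoother is a plain running mean that I swapped in for simplicity, and the helper names (running_mean, make_synthetic, plot_points_with_trend) are hypothetical. Only the column names m and value come from the example above.

```python
import random


def running_mean(ys, half_width=5):
    """Centered moving average; the window shrinks near the two ends.

    Assumes ys is already ordered by the x variable (here, m).
    """
    smoothed = []
    for i in range(len(ys)):
        lo = max(0, i - half_width)
        hi = min(len(ys), i + half_width + 1)
        window = ys[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def make_synthetic(n=200, seed=0):
    """Made-up data, for illustration only: a curve in m plus Gaussian noise."""
    rng = random.Random(seed)
    m = [i / (n - 1) * 10 for i in range(n)]
    value = [(x - 5) ** 2 / 5 + rng.gauss(0, 1) for x in m]
    return m, value


def plot_points_with_trend(m, value, trend, path="trend.png"):
    """Draw the raw points, then the smoothed line over them (needs matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(m, value, s=10, alpha=0.5, label="data")
    ax.plot(m, trend, color="black", label="running mean")
    ax.set_xlabel("m")
    ax.set_ylabel("value")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


m, value = make_synthetic()
trend = running_mean(value, half_width=10)
```

Calling plot_points_with_trend(m, value, trend) writes the figure to trend.png. The points stay visible under the line, so the reader sees the scatter as well as the trend. A wider half_width gives a smoother line but flattens real curvature, which is one more reason to keep the data on the plot.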