Something to keep in mind when you're splitting your data up into
training, testing, and validation sets is that those subsets can get
a little bit small, and you need to avoid small sample sizes,
particularly when it comes to the size of the test set.
The reason why is this: suppose you were predicting a binary outcome. In my
case, a very common thing to try to do is to predict diseased versus healthy;
in general, it might be something like whether or not
people will click on an ad.
Then one possible classifier is just flipping a coin.
You could always flip a coin and say they'll be diseased if
the coin comes up heads, and not diseased if it comes up tails.
The probability of a perfect classification using this
really silly algorithm is one half raised to the power of the test set size.
In other words, on any single prediction you'll be
right half the time just by chance, and,
assuming the predictions are independent,
each additional coin flip multiplies the chance of getting them
all right by another factor of one half.
So if your test set has only one sample in
it, then you have a 50/50 chance of getting that sample right.
Even if you got 100% prediction accuracy on the test set,
a coin flip would have a 50% chance of doing the same.
With n equals 2, the coin flip still has
a 25% chance of getting 100% accuracy.
And with n equals 10 in your test set, there's
only about a 0.1% chance of getting 100% accuracy just by flipping coins.
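As a quick check on those numbers, here is a minimal sketch in Python (the lecture itself doesn't show any code for this) that just evaluates one half raised to the test set size for the sample sizes mentioned above:

```python
# Probability that a coin-flip classifier gets every prediction right
# on a test set of size n, assuming each prediction is an independent
# 50/50 guess: (1/2) ** n
for n in [1, 2, 10]:
    p = 0.5 ** n
    print(f"n = {n:2d}: P(100% accuracy by chance) = {p:.4f} ({p:.2%})")

# n =  1: P(100% accuracy by chance) = 0.5000 (50.00%)
# n =  2: P(100% accuracy by chance) = 0.2500 (25.00%)
# n = 10: P(100% accuracy by chance) = 0.0010 (0.10%)
```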
So if you do see 100% accuracy on a test set of that size, you can feel a little bit
more confident that it's real and not just something that happened by chance.
This suggests that we should make sure
that our test sets, in particular, are of relatively
large size, so we can be confident that
we're not just getting good prediction accuracy by chance.
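If you'd rather convince yourself by simulation than by arithmetic, here is a small sketch along the same lines; the function name, the labels, and the number of trials are made up for illustration, and it assumes a balanced binary outcome with independent coin-flip guesses:

```python
import random

def chance_of_perfect_score(test_size, trials=100_000, seed=1):
    """Estimate how often pure coin-flip guessing scores 100% on a
    test set of the given size, assuming each true label is equally
    likely to be 'diseased' or 'healthy' and guesses are independent."""
    rng = random.Random(seed)
    perfect = 0
    for _ in range(trials):
        truth = [rng.choice(["diseased", "healthy"]) for _ in range(test_size)]
        guess = [rng.choice(["diseased", "healthy"]) for _ in range(test_size)]
        if guess == truth:
            perfect += 1
    return perfect / trials

for n in (1, 2, 10):
    # Should come out close to 0.5, 0.25, and 0.001 respectively
    print(n, chance_of_perfect_score(n))
```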