When you have outliers you may face much frustration
if you include them in a model fitting operation.
But if your model's fit to a sample set of minimal size,
the probability of the set being outlier-free will rise.
Brute force tests of all sets will cause computational constipation.
N random samples
will provide an example
of a fitted model uninfluenced by outliers. No need to test all combinations!
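(If you'd like to sing along in code: here is a rough Python sketch of that random minimal-sample step for fitting a 2D line. The made-up point data, the fit_line helper and the noise levels are illustrative assumptions of mine, not anything prescribed by the song.)

    import numpy as np

    rng = np.random.default_rng(42)

    # Hypothetical data: 80 points near y = 2x + 1, plus 20 gross outliers.
    x = np.linspace(0.0, 10.0, 80)
    good = np.column_stack([x, 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 80)])
    bad = rng.uniform(-20.0, 40.0, size=(20, 2))
    points = np.vstack([good, bad])

    def fit_line(p, q):
        # Line through two points, returned as (slope, intercept) for y = m*x + c.
        m = (q[1] - p[1]) / (q[0] - p[0])
        return m, p[1] - m * p[0]

    # One random trial: fit to a minimal sample of two points on its own,
    # instead of brute-forcing all ~100*99/2 possible pairs.
    i, j = rng.choice(len(points), size=2, replace=False)
    while np.isclose(points[i, 0], points[j, 0]):   # re-draw degenerate (vertical) pairs
        i, j = rng.choice(len(points), size=2, replace=False)
    model = fit_line(points[i], points[j])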
Each random trial should have its own unique sample set
and make sure that the sets you choose are not degenerate.
N, the number of sets to choose, is based on the probability
of a point being an outlier, and of finding a set that's outlier-free.
Updating N as you go will minimise the time spent.
So if you gamble
that N samples are ample
to fit a model to your set of points, it's likely that you will win the bet.
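(The odds behind the bet, in a short Python sketch: this is the standard RANSAC sample-count formula, and the 30% outlier ratio, 99% confidence and 85-of-100 inlier count below are purely example numbers I've picked.)

    import math

    def samples_needed(outlier_ratio, sample_size, confidence=0.99):
        # N >= log(1 - p) / log(1 - (1 - e)^s): the number of random minimal
        # samples needed so that, with probability p (confidence), at least
        # one sample of size s is drawn entirely from the inliers.
        w = 1.0 - outlier_ratio          # chance that a single point is an inlier
        return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - w ** sample_size))

    # 30% outliers, minimal sample of 2 points, 99% confidence: 7 trials are ample.
    N = samples_needed(0.30, 2)

    # Updating N as you go: when a trial turns up more inliers than any before it,
    # re-estimate the outlier ratio from that count and recompute (shrink) N.
    best_inliers, total_points = 85, 100
    N = samples_needed(1.0 - best_inliers / total_points, 2)   # now only 4 trials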
Select the set that boasts
that its number of inliers is the most (you're almost there).
Fit a new model just to those inliers and discard the rest,
an estimated model for your data is now possessed!
This marks the end point of your model fitting quest.
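(And the whole quest in one hedged Python sketch, for anyone who'd like to check the lyrics against working code: the ransac_line function, its 0.5 inlier threshold and its vertical-distance residual are illustrative choices of mine rather than anything the song fixes.)

    import numpy as np

    def ransac_line(points, n_trials, threshold=0.5, rng=None):
        # Try n_trials random minimal samples, keep the hypothesis that boasts
        # the most inliers, then refit a line to just those inliers.
        if rng is None:
            rng = np.random.default_rng()
        best_mask, best_count = None, -1
        for _ in range(n_trials):
            i, j = rng.choice(len(points), size=2, replace=False)
            p, q = points[i], points[j]
            if np.isclose(p[0], q[0]):                  # skip degenerate samples
                continue
            m = (q[1] - p[1]) / (q[0] - p[0])
            c = p[1] - m * p[0]
            # Inliers: points within `threshold` vertical distance of the line.
            residuals = np.abs(points[:, 1] - (m * points[:, 0] + c))
            mask = residuals < threshold
            if mask.sum() > best_count:
                best_count, best_mask = mask.sum(), mask
        # Final model: least-squares fit to the winning inliers; the rest are discarded.
        m, c = np.polyfit(points[best_mask, 0], points[best_mask, 1], 1)
        return m, c, best_mask

    # e.g. m, c, inliers = ransac_line(points, N)  -- with points and N from the sketches above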
Email: fmatrix at danielwedge dot com
Please email me if you find any outliers in this song ;)
Feel free to play this in lectures etc.; you have my permission (though I'd be interested to hear from you if you do!)