A short quiz on vision system evaluation

Adrian F. Clark

What is a false negative?
- A false result from an algorithm that should be negative
- A false result from an algorithm that should have succeeded
- A false result from an algorithm
- A negative result that is false
A false negative arises when an algorithm reports failure (or crashes) when it should have succeeded.
Which corner of a ROC curve indicates the best performance?
- upper left
- upper right
- lower left
- lower right
We want the smallest number of false positives for the largest number of true positives, so the best performance is the upper left corner of the plot.
When using McNemar's test, what do we do if we want to see whether one algorithm's performance is better than another's?
- Look up the Z-score in two-tailed tables
- Look up the Z-score in Normal distribution tables
- Look up the Z-score in binomial tables
- Look up the Z-score in one-tailed tables
If we want to know whether one algorithm's performance is better than another's (rather than merely different), we must use one-tailed tables.
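As a rough illustration of the one-tailed lookup, the sketch below computes a McNemar Z-score with the usual continuity correction and converts it to a tail probability using the standard Normal distribution. The function names and the example counts are hypothetical; `n_sf` is assumed to be the number of test cases where the first algorithm succeeded and the second failed, and `n_fs` the reverse.

```python
import math

def mcnemar_z(n_sf, n_fs):
    """Z-score from the discordant counts (hypothetical helper).

    n_sf: cases where algorithm A succeeded and B failed
    n_fs: cases where algorithm A failed and B succeeded
    """
    n = n_sf + n_fs
    if n == 0:
        raise ValueError("no discordant cases, so the test is undefined")
    # Continuity-corrected statistic, clamped so it never goes negative.
    return max(abs(n_sf - n_fs) - 1, 0) / math.sqrt(n)

def one_tailed_p(z):
    """Upper-tail probability of a standard Normal variate exceeding z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_tailed_p(z):
    """Probability of a deviation at least as large as z in either direction."""
    return math.erfc(z / math.sqrt(2))

# Made-up counts for illustration only.
z = mcnemar_z(15, 5)
print(f"Z = {z:.3f}, one-tailed p = {one_tailed_p(z):.4f}")
```

The one-tailed value is half the two-tailed one, which is why the choice of table matters when asking "is A better than B?" rather than "do A and B differ?".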
You are developing software for the police to show mugshots of suspects to the witness of a crime. Which of the following is the best approach to take?
- maximize the number of true positives
- maximize the number of true positives, even if the false positive rate is high
- minimize the number of false positives
- minimize the number of false negatives
For this type of application, we want to show anyone who stands any chance of being the perpetrator; it doesn't particularly matter if we have a high false positive rate as long as the true positive rate is high.
Which test is most appropriate for comparing algorithms' performances?
- Gauss's test
- Canny's test
- McNemar's test
- Laplace's test
McNemar's test is the most appropriate test for comparing algorithms: it is a chi-squared test with one degree of freedom for paired data.
When evaluating vision systems, it is normal to:
- use the same training and test sets
- train on the training set and test using both training and test sets
- have different training and test sets
- train on all data but test on only the test set
We know that algorithms work better on the data they were trained on than on unseen data; hence, we use different training and test sets.
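A minimal sketch of keeping the two sets separate is shown below. The helper name `split_train_test`, the 25% test fraction, and the fixed seed are all assumptions made for illustration; real evaluations often use more careful schemes.

```python
import random

def split_train_test(items, test_fraction=0.25, seed=0):
    """Shuffle the items and split them into disjoint training and test lists."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test items are held out; the rest are used for training.
    return shuffled[n_test:], shuffled[:n_test]

train, test = split_train_test(range(20))
# No item appears in both sets.
assert not set(train) & set(test)
```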
What is 'ground truth'?
- values obtained by an algorithm that are known to be true
- the true values obtained by an algorithm
- images of the ground
- data known to be correct
Ground truth is data (usually images) for which the correct answer is known; it is used for training and testing algorithms.
You are developing an automatic passport system for use by immigration, where pictures of people are compared to those in their passports. Which of the following is the best approach to take?
- minimize the number of false positives
- maximize the number of true positives, even if the false positive rate is high
- minimize the number of false negatives
- maximize the number of true positives
For this type of application, we need to keep the number of false positives as low as possible; otherwise, we would admit lots of people who don't look like the picture on their passports.
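The trade-off between the passport and mugshot scenarios can be sketched as a choice of decision threshold on a match score. The scores, labels, thresholds, and the helper `rates_at_threshold` below are invented for illustration and do not come from any real system.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) when every
    score at or above the threshold is reported as a match."""
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and not l)
    pos = sum(1 for l in labels if l)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# Hypothetical match scores; True marks a genuine match.
scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.20]
labels = [True, True, False, True, False, False]

# Strict threshold (passport-style): no false positives, some matches missed.
strict = rates_at_threshold(scores, labels, 0.8)
# Lenient threshold (mugshot-style): every match found, more false positives.
lenient = rates_at_threshold(scores, labels, 0.3)
print("strict :", strict)
print("lenient:", lenient)
```

With these made-up numbers the strict threshold admits no impostors but rejects one genuine match, while the lenient one finds every genuine match at the cost of two false alarms out of three.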
What is a false positive?
- A true result from an algorithm that is incorrect
- A positive result that is false
- A false result from an algorithm
- A false result from an algorithm that should be correct
A false positive arises when an algorithm reports success but has actually found an incorrect result.
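To make the four outcome types concrete, the sketch below tallies them from paired lists of reported and correct results. The function name and the sample data are hypothetical; each entry is assumed to be `True` for "success" and `False` for "failure".

```python
def confusion_counts(predicted, actual):
    """Count true/false positives and negatives from paired booleans."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1      # reported success, and it was correct
        elif p and not a:
            fp += 1      # reported success, but the result was wrong
        elif not p and a:
            fn += 1      # reported failure, but it should have succeeded
        else:
            tn += 1      # reported failure, correctly
    return tp, fp, tn, fn

predicted = [True, True, False, False, True]
actual    = [True, False, True, False, True]
print(confusion_counts(predicted, actual))
```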
What are the axes of a ROC curve?
- FP and FN
- TP and FP
- TP and TN
- TP and FN
A ROC curve plots true positives (conventionally as a rate, on the vertical axis) against false positives (on the horizontal axis).
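As a sketch of how such a curve might be traced, the code below sweeps a decision threshold over a set of scores and records one point per threshold, with the false positive rate first (horizontal axis) and the true positive rate second (vertical axis). The function name and the sample scores are assumptions for illustration.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) pairs, one per distinct threshold,
    starting from the origin (nothing reported as positive)."""
    pos = sum(1 for l in labels if l)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step, so more cases are reported positive.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l)
        fp = sum(1 for s, l in zip(scores, labels) if s >= t and not l)
        points.append((fp / neg, tp / pos))
    return points

# Hypothetical scores; True marks a genuine positive.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [True, False, True, False]
for fpr, tpr in roc_points(scores, labels):
    print(f"FP rate {fpr:.2f}  TP rate {tpr:.2f}")
```

Points closer to (0, 1), the upper-left corner, correspond to better operating points.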