In a real-time implementation of the Sobel operator, it is important to keep the number of multiplications as small as possible. What is the smallest number of multiplications that are required to convolve each image region with one of the Sobel masks?
Multiplying any number by zero gives zero, and multiplying by unity just yields the number; the negative coefficients need only a subtraction. Multiplication by two can be done (for integer operands) by a bitwise left-shift, so convolution with the Sobel masks does not involve any multiplications at all.
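As an illustration of the point above, here is a minimal sketch of the horizontal Sobel response for one 3x3 region, written with additions, subtractions and one shift only. The function name `sobel_x` and the list-of-rows layout are assumptions made for this example, and integer pixel values are assumed.

```python
def sobel_x(region):
    """Horizontal Sobel response for a 3x3 region of integer pixels.

    Mask:  -1  0  +1
           -2  0  +2
           -1  0  +1
    The middle column has zero weight, so it is never read.
    """
    (a, _, c), (d, _, f), (g, _, i) = region
    # Weights of +/-1 become subtractions; the weight of 2 becomes a left-shift.
    return (c - a) + ((f - d) << 1) + (i - g)


region = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
print(sobel_x(region))  # 2 + 4 + 2 = 8
```

The vertical mask can be handled the same way by differencing rows instead of columns.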
A shape descriptor consists of the distance from the middle of a feature in the four compass directions. It is found that one feature produces a perfect match with another when the north of one is aligned with the east of the other. What does this tell you?
Moving the north to the east is a 90-degree clockwise rotation, so the two features have the same shape but one is rotated by 90 degrees relative to the other.
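A sketch of how such a match could be searched for, assuming the descriptor is stored as a list `[north, east, south, west]` and compared with a sum of absolute differences. Both choices, and the names `rotate_cw` and `best_rotation`, are hypothetical and not taken from the lectures.

```python
def rotate_cw(desc, steps):
    """Descriptor of the same feature after `steps` 90-degree clockwise turns.

    After one clockwise turn, the distance that used to point north points
    east, so every entry moves one place to the right (cyclically).
    """
    return [desc[(i - steps) % 4] for i in range(4)]


def best_rotation(desc_a, desc_b):
    """Return (steps, error) for the rotation of desc_a that best fits desc_b."""
    candidates = []
    for steps in range(4):
        rotated = rotate_cw(desc_a, steps)
        error = sum(abs(x - y) for x, y in zip(rotated, desc_b))
        candidates.append((error, steps))
    error, steps = min(candidates)
    return steps, error


a = [1, 2, 3, 4]            # north, east, south, west
b = [4, 1, 2, 3]            # the same feature turned 90 degrees clockwise
print(best_rotation(a, b))  # (1, 0): one clockwise step, perfect match
```

A zero error at a non-zero step count is the "perfect match with north aligned to east" situation described in the answer.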
An image of a room contains a computer with a display. What feature or features would help you detect the display?
The best approach would be to combine rectangularity (to detect the rectangular feature) and aspect ratio, as computer monitors have a 4:3 or 16:9 aspect ratio. SIFT would be poor and corners do not carry enough information in this context.
In the broken biscuit identifier explored in lectures, what is its most serious problem?
- it doesn't identify biscuits that aren't aligned with the image edges
- it doesn't identify overlapping biscuits
- inadequate testing
- non-uniform lighting
The most significant shortcoming is in thresholding the biscuits from the background because of non-uniform lighting, even though your lecturer went to some effort to make it as uniform as possible.
If a histogram has two peaks, where is the best place to put a threshold to separate foreground from background?
In general, the best place to put a threshold is at the bottom of the dip between the two peaks -- this is essentially what Otsu's method does.
Why is a recursive region labelling algorithm poor in practice?
Recursive implementations of any algorithm save state on the program's stack. Recursive region-labelling algorithms make one recursive call for each pixel in a region, so if a region contains many pixels, stack overflow is likely.
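One common way around the problem is to keep the pending pixels on an explicit stack held on the heap rather than on the call stack. The sketch below assumes a binary image stored as a list of rows and 4-connectivity; the name `label_regions` is hypothetical and this is not necessarily the algorithm presented in lectures.

```python
def label_regions(image):
    """Label 4-connected foreground regions without recursion.

    `image` is a list of rows; non-zero pixels are foreground.
    Returns (labels, count), where labels has the same shape as image.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            stack = [(r, c)]          # explicit stack replaces recursive calls
            while stack:
                y, x = stack.pop()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


image = [[1, 1, 0],
         [0, 0, 0],
         [0, 1, 1]]
labels, count = label_regions(image)
print(count)  # 2
```

Because the stack is an ordinary list, a region of many thousands of pixels only costs memory and does not hit the interpreter's recursion limit.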
What does Otsu's method do?
Otsu's method chooses the threshold that minimizes the within-class variance, as that is the best way of minimizing the number of misclassified background and object pixels; Otsu showed that this is the same as maximizing the between-class variance.
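The between-class form is convenient to compute from a histogram. The following is a minimal sketch assuming `hist[i]` holds the number of pixels with grey level `i` and that pixels with level at most the threshold form one class; the name `otsu_threshold` is chosen for this example.

```python
def otsu_threshold(hist):
    """Return the grey level t that maximizes the between-class variance.

    Class 0 is levels 0..t, class 1 is levels t+1 and above.
    """
    total = sum(hist)
    sum_all = sum(level * n for level, n in enumerate(hist))
    w0 = 0          # number of pixels in class 0 so far
    sum0 = 0        # sum of grey levels in class 0 so far
    best_t, best_between = 0, -1.0
    for t, n in enumerate(hist):
        w0 += n
        sum0 += t * n
        w1 = total - w0
        if w0 == 0:
            continue            # class 0 still empty
        if w1 == 0:
            break               # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_between:
            best_t, best_between = t, between
    return best_t


# Two peaks, around levels 2 and 7, with an empty dip between them.
hist = [0, 5, 10, 5, 0, 0, 5, 10, 5, 0]
t = otsu_threshold(hist)
```

For this histogram the chosen threshold falls in the dip between the peaks, consistent with the earlier answer about two-peaked histograms.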
Which is the easiest way to identify broken digestive biscuits on a production line?
Digestive biscuits are normally circular, so their circularity (the ratio of the square of the perimeter to the area) is $4\pi$. Every other shape has a circularity greater than this, so a broken biscuit can be identified by a circularity noticeably above $4\pi$.
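A small worked check of the circularity figures, using exact geometry rather than measured regions. The function names and the 10% tolerance in `looks_broken` are assumptions for illustration; perimeters measured on a pixel grid are only approximate, so a real system would need a tolerance tuned on test data.

```python
import math


def circularity(perimeter, area):
    """Perimeter squared divided by area; equals 4*pi for a perfect circle."""
    return perimeter ** 2 / area


def looks_broken(perimeter, area, tolerance=0.10):
    """Flag a region whose circularity exceeds 4*pi by more than `tolerance`."""
    return circularity(perimeter, area) > 4 * math.pi * (1 + tolerance)


r = 30.0
# Whole biscuit: a circle of radius r.
whole = circularity(2 * math.pi * r, math.pi * r ** 2)
# Biscuit snapped in half: semicircular arc plus the straight edge.
half = circularity(math.pi * r + 2 * r, math.pi * r ** 2 / 2)

print(whole, 4 * math.pi)   # both about 12.57
print(half)                 # about 16.8, clearly larger
```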
A system for identifying broken custard cream biscuits looks for rectangular regions that are aligned with the edges of images. Why might this be a bad idea?
If the biscuit is not aligned with the edges, its axis-aligned bounding box will be much larger than an oriented bounding box, and the biscuit might then be incorrectly identified as broken -- see figure 5.10 in the lecture notes.
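To see how large the effect can be, the sketch below computes the fraction of the axis-aligned bounding box that an ideal rectangle fills when it is rotated. The name `aabb_fill_ratio` and the biscuit dimensions are hypothetical; an intact biscuit is modelled as a perfect rectangle.

```python
import math


def aabb_fill_ratio(width, height, angle_deg):
    """Area of a rotated rectangle divided by the area of its
    axis-aligned bounding box (1.0 means the box is a perfect fit)."""
    t = math.radians(angle_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    box_w = width * c + height * s
    box_h = width * s + height * c
    return (width * height) / (box_w * box_h)


for angle in (0, 15, 30, 45):
    print(angle, round(aabb_fill_ratio(60, 40, angle), 3))
```

The ratio is 1 only when the rectangle is aligned with the axes and drops as the rotation grows, so a threshold on it would reject intact but rotated biscuits.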
How are GLCMs normally used to identify similar textures?
A number of quantities are computed from the actual GLCMs, with the most common ones being listed in the lecture notes. These are then matched, most often these days using machine learning.
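As a sketch of the first step, the code below builds a normalized GLCM for one pixel offset and derives two quantities from it. Contrast and energy are widely used GLCM features, but whether they match the list in the lecture notes is an assumption; the function names and the list-of-rows image layout are also chosen for this example.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized grey-level co-occurrence matrix for offset (dx, dy).

    `image` is a list of rows of integers in the range 0..levels-1.
    Entry [i][j] is the fraction of pixel pairs with value i at (x, y)
    and value j at (x + dx, y + dy). Assumes at least one valid pair.
    """
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Large when neighbouring pixels differ strongly in grey level."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Large when a few grey-level pairs dominate (a very regular texture)."""
    return sum(v * v for row in p for v in row)


flat = [[0, 0], [0, 0]]
checker = [[0, 1], [1, 0]]
print(contrast(glcm(flat, 2)), energy(glcm(flat, 2)))        # 0.0 1.0
print(contrast(glcm(checker, 2)), energy(glcm(checker, 2)))  # 1.0 0.5
```

The resulting feature values, usually gathered over several offsets, form the vector that is then matched or passed to a classifier.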