A short quiz on low-level vision

Adrian F. Clark

Which of the following is a sensible region descriptor?
- the value of the centre pixel of a region
- the location of a region within an image
- the circularity of a region
- the colour of the region's background
A region descriptor attempts to encapsulate some characteristic of a region in a number, so anything that does not do this is at best ineffective and at worst pointless. Circularity does exactly this, as it summarises the region's shape in a single value. The location of a region in an image is not a good way of describing its properties, nor is the colour of the pixels outside it. The value of the centre pixel might be a descriptor but in practice it is pretty useless.
In a real-time implementation of the Sobel operator, it is important to keep the number of multiplications as small as possible. What is the smallest number of multiplications required to convolve each image region with one of the Sobel masks?
- 1
- 0
- 6
- 9
Multiplication of any number by zero is obviously zero, and multiplication by unity just yields the number. Multiplication by two can be done (for integer operands) by a bitwise left-shift, and the negative weights need only subtraction, so convolution with the Sobel masks does not involve any multiplications at all.
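As a minimal sketch of this point, the function below evaluates the horizontal Sobel mask (rows `-1 0 1`, `-2 0 2`, `-1 0 1`) at one pixel using only subtraction, addition and a one-bit left shift. The name `sobel_x_at` and the list-of-lists image are assumptions made for illustration, and the mask is applied as written rather than flipped, which affects only the sign of the result.

```python
def sobel_x_at(img, r, c):
    """Horizontal Sobel response at (r, c) without any multiplications.

    img is assumed to be a list of rows of integers, and (r, c) is
    assumed not to lie on the image border.
    """
    # Right-hand column minus left-hand column, for each of the three rows.
    top = img[r - 1][c + 1] - img[r - 1][c - 1]
    mid = img[r][c + 1] - img[r][c - 1]
    bot = img[r + 1][c + 1] - img[r + 1][c - 1]
    # The weight of 2 on the middle row becomes a left shift by one bit.
    return top + (mid << 1) + bot


# A vertical step edge from 0 to 10 gives 10 + 20 + 10.
step = [[0, 0, 10],
        [0, 0, 10],
        [0, 0, 10]]
print(sobel_x_at(step, 1, 1))
```

The zero-weighted centre column is simply never read, so it costs nothing at all.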
An image of a room contains a computer with a display. What feature or features would help you detect the display?
- SIFT features
- a combination of rectangularity and aspect ratio
- rectangularity
- corners
The best approach would be to combine rectangularity (to detect the rectangular feature) and aspect ratio, as computer monitors typically have a 4:3 or 16:9 aspect ratio. SIFT would be poor and corners do not carry enough information in this context.
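The two measures can be sketched as follows for a binary region mask. This assumes rectangularity is defined as the region's area divided by the area of its axis-aligned bounding box, and aspect ratio as bounding-box width over height. The function name and the mask layout are illustrative only.

```python
def rectangularity_and_aspect(mask):
    """Return (rectangularity, aspect ratio) of the foreground in mask.

    mask is assumed to be a list of rows in which non-zero entries
    belong to the region, and to contain at least one such entry.
    """
    coords = [(r, c)
              for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = len(coords)
    # Assumed definitions: area / bounding-box area, and width / height.
    return area / (width * height), width / height


# A filled 16 x 9 block inside a 20 x 12 image, standing in for a display.
mask = [[0] * 20 for _ in range(12)]
for r in range(2, 11):
    for c in range(3, 19):
        mask[r][c] = 1
print(rectangularity_and_aspect(mask))
```

A candidate display would then be a region whose rectangularity is close to 1 and whose aspect ratio is close to one of the expected values.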
Which is the easiest way to identify broken digestive biscuits on a production line?
- their circularity is less than $4\pi$
- their rectangularity is greater than 1
- their rectangularity is less than 1
- their circularity is larger than $4\pi$
Digestive biscuits are normally circular, which means their circularity (the ratio of the square of the perimeter to the area) is $4\pi$. All other shapes have a circularity greater than this, which would be the case for a broken biscuit.
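A short numerical check of this, using the definition given above (perimeter squared over area). The half-disc is only an assumed stand-in for a biscuit snapped cleanly across its diameter.

```python
import math


def circularity(perimeter, area):
    """Circularity as the square of the perimeter divided by the area."""
    return perimeter ** 2 / area


r = 3.0  # arbitrary radius; circularity does not depend on scale

# Whole biscuit: a disc of radius r.
whole = circularity(2 * math.pi * r, math.pi * r ** 2)

# Broken biscuit (assumed shape): half a disc, whose perimeter is the
# semicircular arc plus the straight edge along the diameter.
half = circularity(math.pi * r + 2 * r, math.pi * r ** 2 / 2)

print(whole, 4 * math.pi)  # these agree up to rounding
print(half)                # larger than 4*pi
```

With measurements taken from pixels, the perimeter estimate is noisy, so in practice the test would compare against a threshold somewhat above $4\pi$ rather than $4\pi$ itself.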
A shape descriptor consists of the distance from the middle of a feature in the four compass directions. It is found that one feature produces a perfect match with another when the north of one is aligned with the east of the other. What does this tell you?
- the second feature is rotated 90 degrees anticlockwise relative to the first
- the second feature is rotated 90 degrees clockwise relative to the first
- the second feature is rotated 180 degrees clockwise relative to the first
- the second feature is a mirror image of the first
Moving the north to the east is a 90-degree clockwise rotation, so the second feature is rotated 90 degrees clockwise relative to the first.
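Matching under rotation can be sketched as a search over cyclic shifts of the descriptor. The sketch assumes the four distances are stored in clockwise order (north, east, south, west) and uses exact equality for brevity; real measurements would need a tolerance. The function name is hypothetical.

```python
def matching_shifts(first, second):
    """Return every shift k for which first[i] == second[(i + k) % n].

    With entries in clockwise order (N, E, S, W), a shift of k means
    the second feature is the first rotated by k * 90 degrees clockwise.
    """
    n = len(first)
    return [k for k in range(n)
            if all(first[i] == second[(i + k) % n] for i in range(n))]


first = [5, 2, 3, 1]   # distances to N, E, S, W
second = [1, 5, 2, 3]  # north of the first now appears at east
print(matching_shifts(first, second))  # [1], i.e. 90 degrees clockwise
```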
Why is a recursive region labelling algorithm poor in practice?
- it is too slow
- it tends to overflow the computer's stack
- it tends to overflow the computer's heap
- it is difficult to program
Recursive implementations of any algorithm save state on the program's stack. Recursive region-labelling algorithms make one recursive call for each pixel in a region, so if a region contains many pixels, stack overflow is likely.
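One common way round this, sketched below, is to keep the pending pixels on an explicit stack held on the heap instead of relying on the call stack. The sketch assumes 4-connectivity and a list-of-lists binary mask. It illustrates the idea and is not the algorithm from the lecture notes.

```python
def label_regions(mask):
    """Label 4-connected foreground regions without recursion.

    Returns (labels, count), where labels has the same shape as mask
    and holds 0 for background or a region number starting from 1.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or labels[r0][c0]:
                continue
            # Found an unlabelled foreground pixel: start a new region.
            count += 1
            labels[r0][c0] = count
            stack = [(r0, c0)]  # explicit stack replaces the call stack
            while stack:
                r, c = stack.pop()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = count
                        stack.append((nr, nc))
    return labels, count


# A single 200 x 200 region has 40000 pixels, far more than Python's
# default recursion limit would allow a one-call-per-pixel version.
big = [[1] * 200 for _ in range(200)]
print(label_regions(big)[1])
```

Pixels are labelled when they are pushed rather than when they are popped, so each pixel enters the stack at most once.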
Why is simple thresholding not especially effective at locating light features in an image?
- thresholding is a local operation
- thresholds are difficult to determine
- changes in illumination change the lightness of features
- thresholding is a global operation
The main problem with feature detection using thresholding is that changes in illumination can cause (say) a white object to appear grey.
A system for identifying broken custard cream biscuits looks for rectangular regions that are aligned with the edges of images. Why might this be a bad idea?
- the shape of the biscuit doesn't matter, only its appearance
- whole biscuits might not be aligned with the edges of images
- it's not really a bad idea
- broken biscuits might not be aligned with the edges of images
If the biscuit is not aligned with the edges, its axis-aligned bounding box will be much larger than an oriented bounding box and this might incorrectly identify the biscuit as being broken -- see figure 5.10 in the lecture notes.
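The size of the effect can be estimated with a little geometry. The sketch below assumes rectangularity means area divided by axis-aligned bounding-box area, and the biscuit dimensions are made-up values chosen only for illustration.

```python
import math


def axis_aligned_rectangularity(width, height, angle_degrees):
    """Area of a rotated rectangle divided by the area of its
    axis-aligned bounding box (an assumed definition of rectangularity)."""
    a = math.radians(angle_degrees)
    box_w = width * abs(math.cos(a)) + height * abs(math.sin(a))
    box_h = width * abs(math.sin(a)) + height * abs(math.cos(a))
    return (width * height) / (box_w * box_h)


# Hypothetical whole biscuit, 6 units by 4 units, at several orientations.
for angle in (0, 10, 30, 45):
    print(angle, round(axis_aligned_rectangularity(6, 4, angle), 3))
```

The value is 1 only when the biscuit is aligned with the image axes and falls to about a half at 30 degrees, so a whole but rotated biscuit can score as badly as a broken one.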
If a histogram has two peaks, where is the best place to put a threshold to separate foreground from background?
- at the lower peak
- at the higher peak
- half-way down the lower peak
- in the bottom of the dip between the two peaks
- half-way up the higher peak
In general, the best place to put a threshold is at the bottom of the dip between the two peaks -- for a clearly bimodal histogram, this is essentially where Otsu's method puts it.
What are grey-level co-occurrence matrices?
- how different grey levels are for two images of the same texture
- how similar grey levels are for a given shift in an image
- how similar grey levels are for two images of the same texture
- how different grey levels are for a given shift in an image
GLCMs are scattergrams (2D histograms) computed for two regions of a single image separated by a particular shift. High values close to the leading diagonal indicate that the grey levels are similar for that shift.
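A minimal sketch of building such a matrix is given below. It assumes the image is a list of rows of integer grey levels in the range 0 to `levels - 1`, and it counts each pixel pair once in the direction of the shift (an unsymmetrised GLCM). The function name is illustrative.

```python
def glcm(img, dr, dc, levels):
    """Count co-occurrences of grey levels for the shift (dr, dc).

    Entry [a][b] is the number of pixels with grey level a whose
    neighbour dr rows down and dc columns across has grey level b.
    """
    rows, cols = len(img), len(img[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[img[r][c]][img[r2][c2]] += 1
    return matrix


# Four flat 2 x 2 patches: most horizontal neighbours share a grey level.
img = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [2, 2, 3, 3],
       [2, 2, 3, 3]]
for row in glcm(img, 0, 1, 4):
    print(row)
```

Here 8 of the 12 pixel pairs fall on the diagonal, and the remaining 4 come from the boundaries between patches.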