The Computer Vision module consists of both lectures and laboratories. Roughly speaking, the lectures explain what techniques are used to process image or video data, how they work, and how they are combined into complete systems. The laboratories, which are what concern us here, are where you put those ideas and principles into practice.
There are two distinct parts to the programme of laboratories: the first half gets you to write and test routines that provide useful computer vision functionality, while the second half uses these functions in a set of vision applications. The laboratories are self-paced but the routine-writing ones are assessed in the first progress test mid-way through the term, while the applications part is assessed in the second progress test near the end of the term. These progress tests are open-book, meaning that you should make a record of what you do and why as you work through the laboratories and then use your records to answer the progress test questions.
There are two widely-used ways of developing computer vision techniques and applications, Matlab and OpenCV. We are going to work with the latter as it makes you more employable in the industry and interfaces better to modern machine learning. OpenCV is written in C++ and has 'wrappers' that allow it to be used from a variety of programming languages; but here the focus will be on the use of Python, again because it has become the norm for research and development. All the examples you are given are in Python, and the library or module that you are expected to extend in the first half of the laboratory programme is also written in Python.
OpenCV and its Python wrapper run on Windows, Linux and macOS. In previous iterations, the laboratory programme has had to be built around Linux because the Windows port in particular was rather unreliable. However, it now appears to be stable and so you should be able to work on any of the three operating systems. Do bear in mind that the laboratories and progress tests are set on the basis of the Linux and macOS versions of OpenCV so if you're working under Windows, you should ensure that you get the same results as on the machines in our Software Labs or Horizon server under Linux. If you haven't used Linux...then why not? It's the primary free operating system and any self-respecting graduate in the computing or electronics areas should have some experience of it. You will also see that most of the examples expect you to be able to run Python programs from the command line. I have seen first-hand that many students are unable to write complete programs, instead relying on button-clicking in Jupyter or another IDE to run them --- that makes you only half a programmer.
If you want to install OpenCV on your own machine, it is normally straightforward; there are guidelines in the lecture notes. You need to make sure you install the same version as in the Software Labs and the way to do that is to start a Python interpreter on one of those machines and type the incantations
>>> import cv2
>>> cv2.__version__
'4.5.3'
where >>> is the prompt from the Python interpreter. The version installed in the Software Lab will be much more up to date than the one printed out here.
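If you would rather check your own installation with a script than by eye, something along the following lines might do. It is only a sketch: LAB_VERSION is a placeholder that you must replace with whatever the Software Lab machine reports, and compare_versions is a hypothetical helper, not part of OpenCV or sxcv.

```python
# Placeholder: replace with the version string reported on a Software Lab machine.
LAB_VERSION = "4.5.3"


def compare_versions(installed, expected):
    """Return a one-line message saying whether two version strings match."""
    if installed is None:
        return "OpenCV is not installed for this Python interpreter"
    if installed == expected:
        return "OK: OpenCV %s matches the lab version" % installed
    return "Warning: OpenCV %s differs from the lab version %s" % (installed, expected)


# cv2 may be missing on your own machine, so guard the import.
try:
    import cv2
    installed = cv2.__version__
except ImportError:
    installed = None

print(compare_versions(installed, LAB_VERSION))
```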
The starting point for your laboratory work is a zip-file which contains:
sxcv.py (for Essex Computer Vision), the module you will extend as you work through these laboratories
review.py, an example program which uses sxcv.py
some example images for you to work with, such as sx.jpg used in this script
You should download the zip-file and unpack it. (Later scripts will have separate zip-files of imagery associated with them.) You will work on sxcv.py throughout all the laboratory sessions, so don't lose it or you will have to do the work again!
To run review.py, you should bring up a terminal window and type the command:
python review.py
You should see the output:
testimage is monochrome of size 13 rows x 10 columns with uint8 pixels.
You might have to use python3 rather than python depending on how the interpreter is installed on your system.
If you run the same program with a filename:
python review.py sx.jpg
you should see the output
sx.jpg has 3 channels of size 512 rows x 512 columns with uint8 pixels.
instead. If you normally use an IDE and run programs by clicking a button, this might seem a rather strange way to run a program; but you will see as the laboratories progress that running programs this way brings benefits so do make sure you can do it --- and there will be questions in the progress tests relating to this.
Let us now review the code in review.py:
#!/usr/bin/env python
"""review.py -- a "hello world" program for sxcv and cv2"""
import sys, sxcv, cv2

# If a filename was given on the command line, read it in. Otherwise, use
# the test image built into the sxcv module.
if len (sys.argv) > 1:
    im = cv2.imread (sys.argv[1])
    if im is None:
        print ("Couldn't read the image file '%s'!" % sys.argv[1],
               file=sys.stderr)
        sys.exit (1)
    name = sys.argv[1]
else:
    im = sxcv.testimage1 ()
    name = "testimage"

# Output a one-line summary of the image.
print (sxcv.describe (im, name))
There are a few interesting points worth bringing out.
The very first line is used to make the program able to run on Unix (Linux, macOS) systems. This will be familiar to anyone who did CE222 (Operating Systems) but speak to a demonstrator if you didn't and would like to understand what it means.
The import line pulls in the sys module, required to access the command line, our module sxcv and OpenCV. The latter is called cv2 because it was the second attempt at wrapping the OpenCV library for Python, not because it is for OpenCV version 2.
sys.argv is the command line, split up into space-separated words. If you have programmed in C, C++ or Java, this is equivalent to what your main routine receives from the operating system. The first word, sys.argv[0], is the command name, so sys.argv[1] is the first argument you gave to the program, the name of the file to be processed.
You read an image in using cv2.imread, which returns None if the read fails. If that should happen (and do try it), an error message is written on the standard error channel --- knowing the difference between standard output and standard error and using the correct one is one of the ways you graduate from being a newbie to an experienced programmer. The argument to sys.exit is returned to the invoking shell; as you'll see in the demonstrations in lectures, this can be reported to the user.
If no command-line argument is supplied, routine sxcv.testimage1 is used to generate an example image.
sxcv.describe is passed the image and a title string, and it generates the output that appears in the terminal window.
You can use review.py as the starting point for testing the routines that you add to sxcv.py...though you are encouraged to give it a different name.
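The points about sys.argv, standard error and the exit status can be seen in isolation in the sketch below, which uses only the standard library. The routine main and its messages are made up for illustration; they are not taken from review.py or sxcv.py.

```python
import sys


def main(argv):
    """Return an exit status: 0 for success, 1 if no filename was given."""
    # argv[0] is the command name; the real arguments start at argv[1].
    if len(argv) < 2:
        # Diagnostics go to standard error so they never mix with results.
        print("Usage: %s <filename>" % argv[0], file=sys.stderr)
        return 1
    # Normal results go to standard output.
    print("Would process", argv[1])
    return 0
```

In a complete program you would finish with sys.exit(main(sys.argv)) so that the status is handed back to the shell. On Linux and macOS, typing echo $? straight after running the program shows that status; in the Windows command prompt, echo %ERRORLEVEL% does the same job.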
sxcv Python module
A good way to see what is in the module is by typing one of the commands
pydoc .\sxcv.py # Windows
pydoc ./sxcv.py # everywhere else
(This may need to be pydoc3 on some systems.) This pulls the documentation strings from the code and presents them to you in a nicely-formatted way. If pydoc isn't installed on your system, the longer command
python -m pydoc ./sxcv.py
(.\sxcv.py on Windows) is equivalent. Do read through the module before you proceed.
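To get a feel for what "pulling the documentation strings from the code" means, the sketch below reads the docstring of a single routine with the standard inspect module. This is roughly the raw material that pydoc formats for you; the routine double is invented for the example and is not in sxcv.py.

```python
import inspect


def double(x):
    """
    Return twice the value passed in.

    Args:
        x (number): value to be doubled
    """
    return 2 * x


# getdoc returns the docstring with its indentation tidied up.
doc = inspect.getdoc(double)
print(doc)
```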
The routines present in the version of sxcv.py that you downloaded are a starting point for you to develop your own routines. They include ones to examine and visualize images and data extracted from them, generate test images and some simple debugging aids. As you work through these laboratory scripts, you should add the routines that are discussed below the comment that reads
#-------------------------------------------------------------------------------
# LIBRARY ROUTINES.
#-------------------------------------------------------------------------------
To show you what to do, let us add a routine to compute the mean of an image. Your starting point is the following text:
def mean (im):
    """
    Return the mean of the pixel values of an image.

    Args:
        im (image): image for which the mean value is to be found

    Returns:
        ave (float): the mean of the image

    Tests:
        >>> ave = mean (arrowhead ())
        >>> print ("OK") if abs (ave - 39.66666666) < 1.0e-5 else print ("bad")
        OK
    """
The first line is a definition of a Python routine and, as you can see, it takes an image as its parameter. The comment that follows the def line is its documentation (what pydoc will extract). This includes an example of how it is invoked after the Tests line: the >>> again represents the Python prompt. You should paste the above code lines unchanged into your copy of sxcv.py.
The routine definition does not, of course, include any code for computing the mean. You could compute this yourself, iterating over the rows and columns of the image as discussed in lectures; but finding the mean of a set of numbers is such a common thing to do that there is suitable functionality in numpy, the representation used for OpenCV images in Python. If you do a web-search for something like numpy calculate mean you'll find there is a method called mean, so all you need do is add the line
return numpy.mean (im)
at the end of your mean routine with the right indentation. The routines you have to write won't usually be one-liners like this, of course.
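For comparison, here is roughly what "iterating over the rows and columns" would look like for a monochrome image, alongside the numpy call. It is a sketch only: the routine name mean_by_loops and the tiny 2 x 3 array are invented for the example, and the loop assumes a two-dimensional image rather than a colour one.

```python
import numpy


def mean_by_loops(im):
    """Average a two-dimensional (monochrome) image one pixel at a time."""
    ny, nx = im.shape
    total = 0.0
    for y in range(ny):
        for x in range(nx):
            # Convert to float so uint8 arithmetic cannot overflow.
            total += float(im[y, x])
    return total / (ny * nx)


# A made-up 2 x 3 test image with uint8 pixels.
im = numpy.array([[10, 20, 30], [40, 50, 60]], dtype=numpy.uint8)
print(mean_by_loops(im), numpy.mean(im))
```

Both approaches should give the same answer; the numpy version is shorter and, on images of realistic size, very much faster.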
With your functionality added, you now need to test whether it works. Python provides functionality to pull tests out of these so-called docstring comments, execute them, and check that they produce the same output as is in the comment. In this case, it will pull out the two lines of code
ave = mean (arrowhead ())
print ("OK") if abs (ave - 39.66666666) < 1.0e-5 else print ("bad")
The first of these lines generates the 'arrowhead' image which appears in the software chapter of the lecture notes and computes its mean using your newly-written routine. The second checks that the result is within \(10^{-5}\) of 39.66666666; if it is, it prints OK and if it isn't, it prints bad. A correctly working routine will print OK.
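The test compares the mean with a tolerance rather than checking for exact equality because floating-point results are rarely exactly what you would write down by hand. The following sketch illustrates the point with ordinary Python floats; the numbers are chosen purely for illustration and have nothing to do with the arrowhead image.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because 0.1 and 0.2 cannot be stored exactly.
exact = (total == 0.3)

# Comparing within a tolerance, as the docstring test does, succeeds.
within = abs(total - 0.3) < 1.0e-5

# The standard library offers the same idea ready-made.
close = math.isclose(total, 0.3, abs_tol=1.0e-5)

print(exact, within, close)
```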
Rather than go to the effort of extracting and running the code manually, Python provides tooling for doing this. Simply type the command
python -m doctest sxcv.py
to run all the tests. If you see no output, they have all succeeded. To see all the tests being run, add the -v qualifier to the above command.
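If you are curious about what happens behind that command, the standard doctest module can also be driven from your own code. The sketch below parses the docstring of one routine, runs the example it finds and reports the counts. The routine square is invented for the example; for sxcv.py itself, the command line above is all you need.

```python
import doctest


def square(x):
    """
    Return the square of a number.

    Tests:
        >>> print (square (3))
        9
    """
    return x * x


# Pull the >>> examples out of the docstring...
parser = doctest.DocTestParser()
test = parser.get_doctest(square.__doc__, {"square": square}, "square", None, 0)

# ...then run them and compare the output with what the docstring says.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print("failed:", results.failed, "attempted:", results.attempted)
```

A failing example would be reported with the expected and actual output side by side, which is what you will see from python -m doctest when one of your routines misbehaves.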
We shall take the same approach for all the other functions you have to add to sxcv.py: you will be given the routine specification and docstring and it is your job to implement the functionality of the code.
Now that you have seen how to add a routine to sxcv.py, add three further ones as specified below. The first two are also numpy one-liners, while the last combines the values returned from the first two.
def highest (im):
    """
    Return the maximum of the pixel values of an image.

    Args:
        im (image): image for which the maximum value is to be found

    Returns:
        hi (of same type as image): highest value in the image

    Tests:
        >>> im = testimage1 ()
        >>> print (highest (im))
        15
    """

def lowest (im):
    """
    Return the minimum of the pixel values of an image.

    Args:
        im (image): image for which the minimum value is to be found

    Returns:
        lo (of same type as image): lowest value in the image

    Tests:
        >>> im = testimage1 ()
        >>> print (lowest (im))
        10
    """

def extremes (im):
    """
    Return the minimum and maximum of the pixel values of an image.

    Args:
        im (image): image for which the extreme values are to be found

    Returns:
        lo (of same type as image): lowest value in the image
        hi (of same type as image): highest value in the image

    Tests:
        >>> im = testimage1 ()
        >>> print (extremes (im))
        [10, 15]
    """
To wrap up this experiment, copy review.py to (say) display.py. Find the routine in sxcv.py called display and make the program invoke it to display the image specified on the command line. When you run
python display.py sx.jpg
you should see the following image pop up in a window on your screen.
Experiment with the parameters in the call to find out what they do.
If you have struggled to carry out this experiment, below is a screencast which shows me working through the important parts of it myself. Note that the image displays in my terminal window; yours won't do that. I apologize for my poor typing prowess: 40 years of using computers don't appear to have improved it much!
I shan't do this for subsequent experiments; but you shouldn't need as much hand-holding.
Web page maintained by Adrian F. Clark using Emacs, the One True Editor ;-)