CE316 and CE866: Computer Vision

About the module

This module explores the algorithms and software that lets a computer 'understand' the content of images and videos. This is far from being a solved problem and research into it is extraordinarily active, with the UK contributing some of the most important results globally. The development of the discipline is so rapid that about half the content of the module couldn't have been taught ten years ago. Given what it is trying to do, you might think that vision is an area of artificial intelligence, placing it squarely in the realm of computer science. While this is true it is far from the whole story, with researchers also having backgrounds in disciplines such as electronic engineering, mathematics, physics, psychology and medicine. The contributions of the various disciplines will become more apparent as you learn about vision techniques.

In fact, this is not one module but two, for it is delivered simultaneously to final-year undergraduate students (CE316) and to postgraduate students (CE866). Laboratories are also joint between CE316 and CE866 but the examinations for the two modules are different.

The last decade or so has seen three major technological influences on computer vision. Firstly, digital imaging has transformed the capture of image and video data from something that pushed at the boundaries of real-time hardware and software into an everyday process. The second influence is cheap data storage and processing, allowing large quantities of image and video data to be stored and manipulated in reasonable timescales. The final influence, a consequence of having more CPU cycles to burn, is an expansion in the use of machine learning and an improvement in the learning algorithms available.

Machine learning has actually been employed in real-world vision systems for a long time, though until recently this module stopped short of discussing it. However, the last few years have seen machine learning become much more prevalent in vision techniques, so there is an introduction to it in the second half of the module. It won't turn you into an expert in machine learning but it should give you an appreciation of what it is — and sometimes isn't — capable of doing.

About your lecturer

Your friendly neighbourhood lecturer, Adrian Clark, has been researching digital image analysis and computer vision all his professional life. His PhD was in processing digital imagery from electron microscopes and his postdoctoral work was some of the earliest to explore the use of parallel computers for image processing (Google "ICL DAP"). He worked in industry, developing robust, real-time vision systems before joining Essex. Although he also researches virtual and augmented reality (the underlying maths is essentially the same as for vision), he has three major research interests:

Using genetic programming, a form of machine learning, to build computer vision systems from components. We have had some spectacular successes with this approach and I'll show you some of them towards the end of the module.
Reconstructing 3D models from images, captured either by humans or robots. We have reconstructed archaeological finds from photographs taken shortly after they were uncovered, recording their appearance in fine detail before any conservation has taken place. This was done most significantly for the Fenwick Treasure, an important discovery made as Colchester's department store was being extended. We have also been very successful in building 3D models of coral reefs in Indonesia and the Caribbean from imagery acquired by Jon Chamberlain, another academic in CSEE, using a rapid capture system that he and I designed. The 3D models we reconstruct are good enough to print (I have a colour 3D printer in my research lab), and you will see this work when we consider vision in a 3D world in the second half of the module.
Inspired by the problems encountered when developing vision systems in industry, I am a fervent evangelist of statistical approaches for evaluating and comparing the performance of vision systems. The material in lectures and laboratories on this topic is right at the forefront of research, and you'll evaluate systems during many of the laboratory experiments.

Lectures and laboratories

Lectures Thursday 09:00–11:00, weeks 16–25 Ivor Crewe Hall A

Laboratories Monday 09:00–11:00 or 16:00–18:00, weeks 17–25 CSEE Lab 7

Lectures	Thursday 09:00–11:00, weeks 16–25	Ivor Crewe Hall A
Laboratories	Monday 09:00–11:00 or 16:00–18:00, weeks 17–25	CSEE Lab 7

The lectures are live events. In them, I explain important techniques used in computer vision: you'll see how they work, how they're programmed and what they're used for; and you'll be able to ask me questions if there are things I haven't made clear. Unlike many academics, I do not simply drone on while presenting a set of overheads; instead, I write and sketch to explain things so you can see how ideas are transformed into algorithms.

Each lecture has a comprehensive set of support material:

notes which cover everything relevant to a topic;
a set of overheads that approximate to what the lecture covers — I normally start with these but move into sketching and illustrating how things work;
a scanned copy of any sketches etc created during a lecture;
a multiple-choice quiz;
and in some cases, worksheets for you to have a go yourself and videos that explain specific topics.

As the discipline is moving so quickly, no textbook that you can buy is really able to keep up, so these are the principal source material for you to work from. The notes are available as a single PDF file rather than separate chunks to make it easier for you to see how some ideas and principles underlie several techniques.

Each time you do one of the quizzes, you'll have a random selection of ten questions from a question bank and the order of the choices will be different. They are a little easier than the questions you'll experience in the formal progress tests discussed below but are a good way of checking that you understand things.

Even though I provide the much of the formal content in the notes and lectures, that does not mean that you shouldn't make notes yourself. The notes deliberately have a wide margin to make it easy for you to add your own notes alongside the text.

The laboratories are where you put the theory explained during lectures into practice by working through a series of exercises. It is essential that you record what you are doing and what you find during them because there are two progress tests during the module and they are specifically on the labs. They are open-book tests, so you are welcome to use any books etc that you like — especially your records of the experiments. (I'll go over this process during the first lecture in case it isn't clear from this explanation.)

The laboratories expect you to write code in Python. This is by far the most widely-used language in computer vision research and development, and AI in general, mostly because its write–compile–run cycle is so short. All of the laboratories should be fine under Linux, Windows or macOS, so you should be able to work in an environment you are already familiar with — though you are expected to be able to run software from the command line rather than from within your editor or IDE.

If you don't know Python, or all knowledge of it has slipped from your mind since you were taught it, you might find my notes on Python helpful. These date from when I taught an MSc module on Python programming and go from nothing to writing reasonably sophisticated programs.

In the laboratory exercises, images are represented as numpy ("Numerical Python") arrays, making it possible for you to employ the power of both numpy and scipy ("Scientific Python") in your solutions as well as OpenCV, the widely-used, open-source computer vision package intended for real-time applications. In the later laboratories, you'll see results obtained from machine learning packages Scikit-learn and TensorFlow (the latter in conjunction with Keras). You don't need to install these to do the experiments. Of course, if your project involves machine learning, you'll probably have them installed anyway.

If all this sounds very frighting, don't worry: I'll explain it all during the lectures.

If something doesn't make sense to you, don't be afraid to ask! Drop me an email explaining what the problem is and I'll either reply by email or we can meet up (in person or by Zoom) to discuss it.

Lecture notes etc

If you have a specific learning difficulty and would like a version of the lecture notes and laboratory scripts using different fonts, contrast, background colour or whatever, please do contact your lecturer. He spent part of a summer vacation writing software to do this so please make use of it rather than struggle on!

topic teaching material test your understanding

The entire book of notes
Summary sheet you will be provided with in the exams
Programming in Python How much you know before the module starts?

Module overview overheads
quiz on the module's organization

Introduction to Computer Vision overheads quiz, Python code, C++ code
histogram worksheet, solutions

The Human Visual System
(not examinable) quiz

Convolution overheads quiz, convolution worksheets, solutions

Low-level vision overheads quiz

Evaluating vision systems overheads quiz

Intermediate level vision overheads quiz

Looking at humans overheads quiz

Vision in a 3D world
(section 9.6 onwards is not examinable) overheads quiz

High-level vision with machine learning
Deep learning and neural networks overheads quiz

Getting to grips with the Unix command line shell syntax
[print double-sided and fold concertina-style
along the lines between columns]
shell reference
Emacs reference

topic	teaching material	test your understanding
	The entire book of notes Summary sheet you will be provided with in the exams Programming in Python	How much you know before the module starts?
Module overview	overheads	quiz on the module's organization
Introduction to Computer Vision	overheads	quiz, Python code, C++ code histogram worksheet, solutions
The Human Visual System (not examinable)		quiz
Convolution	overheads	quiz, convolution worksheets, solutions
Low-level vision	overheads	quiz
Evaluating vision systems	overheads	quiz
Intermediate level vision	overheads	quiz
Looking at humans	overheads	quiz
Vision in a 3D world (section 9.6 onwards is not examinable)	overheads	quiz
High-level vision with machine learning Deep learning and neural networks	overheads	quiz
Getting to grips with the Unix command line	shell syntax	[print double-sided and fold concertina-style along the lines between columns] shell reference Emacs reference

Laboratory exercises

The labs have been re-written this year to make it easier for people working to do them. This means some problems may have crept in; if you find one, please do let the author know and he will fix it pronto! This applies even to seemingly trivial things like typos.

Note that the labs are self-paced rather than intended to be done one per week. When you finish one, just go on and do the next. As will be clear from the schedule below, labs 1–4 are assessed in the first progress test and 5–9 in the second.

script data quiz

1. Getting to grips with OpenCV sxcv.zip quiz

2. Histograms [also sxcv.zip] quiz

3. Colour images 03-colour.zip quiz

4. Processing regions 04-regions.zip quiz

First progress test

5. Counting stomata 05-stomata.zip quiz

6. Broken biscuits 06-biscuits.zip quiz

7. Counting cars 07-counting-cars.zip quiz

8. Stereo 08-stereo.zip quiz

9. Machine learning 09-ml.zip quiz

Second progress test

script	data	quiz
1. Getting to grips with OpenCV	`sxcv.zip`	quiz
2. Histograms	[also `sxcv.zip`]	quiz
3. Colour images	`03-colour.zip`	quiz
4. Processing regions	`04-regions.zip`	quiz
First progress test
5. Counting stomata	`05-stomata.zip`	quiz
6. Broken biscuits	`06-biscuits.zip`	quiz
7. Counting cars	`07-counting-cars.zip`	quiz
8. Stereo	`08-stereo.zip`	quiz
9. Machine learning	`09-ml.zip`	quiz
Second progress test

Progress tests

There are two progress tests, scheduled for:

CE316 week 21
week 25

CE866 week 21
week 25

CE316	week 21 week 25
CE866	week 21 week 25

(Yes, they are both at the same time.) As discussed at length above and in the first chapter of the notes, these are open-book tests that focus on the experiments. For both CE316 and CE866, each progress test is worth 20% of the overall module mark.

The remainder of your marks come from an examination held in the first part of the summer term. This is worth 60% of the overall module mark. Previous examinations are available via the Moodle site.

Installing OpenCV on your own machine

OpenCV and the Python environment described in lectures are part of the standard installation on the machines in CSEE's computer laboratories. Of course, you can also install OpenCV on your own computer, either under Unix (macOS, Linux, etc) or Windows. The way you do that for all three operating systems is described in the first chapter of the lecture notes.