Visual Recognition

Course: 2021-2022
Semester: 2
ECTS: 6
Type: Compulsory
University: UPorto and USC

Subject objectives

Visual recognition tasks range from object detection in images and videos, object classification and instance recognition to human action recognition. This course covers the former tasks, since action recognition is the main topic of the Human Action Recognition subject.

The objective is for students to acquire knowledge and skills that allow them to design systems for video motion detection, motion-based segmentation and tracking, classification and detection of objects in images and video, as well as visual tracking of objects.

Contents

PART 1 UPorto:
--------------

1-Introduction to video analysis.
Motion perception
Multidimensional images and image sequences
Spatiotemporal image sampling
Motion blur and spatiotemporal filtering

2-Motion detection and estimation.
Image alignment and registration
Hierarchical and dense motion estimation
Discrete coarse-to-fine image pyramids
Optical flow computation: intensity, energy and phase based

3-Motion segmentation and tracking.
Parametric motion, Spline-based motion, Layered motion
Traditional motion tracking: Kalman filters, particle filters

PART 2 USC:
-----------

4-Invariant feature extraction and matching.
Local Binary Descriptors
Spectra Descriptors
Basis Space Descriptors
Sparse Coding Methods

5-Classical methods of image classification and object detection.
Geometric Methods
Appearance-based Methods
Feature-based Approaches
Sliding Window Approaches
Bags of Visual Words
Part-based Models

6-Methods for object detection and tracking based on deep learning.
Convolutional Neural Networks for object detection in images.
Convolutional Neural Networks for object detection in videos.
Deep learning approaches to visual tracking.

The syllabus covers the main topics of current interest in visual recognition that cut across several application domains.

Basic and complementary bibliography

Basic:
In general, basic and complementary references are based on papers that will be provided by the lecturers.

Complementary:

Szeliski, Richard. “Computer Vision: Algorithms and Applications”. Springer. 1st Edition, 2010. ISBN 978-1-84882-934-3.
http://szeliski.org/Book/drafts/SzeliskiBook_20100903_draft.pdf.

Md. Atiqur Rahman Ahad. Computer Vision and Action Recognition: A Guide for Image Processing and Computer Vision Community for Action Understanding. Atlantis Press; 2011 edition. ISBN 978-9491216190.

Boguslaw Cyganek. Object Detection and Recognition in Digital Images: Theory and Practice. John Wiley & Sons, Ltd. 2013. ISBN: 978-1-118-61836-3.

Competencies

Special attention will be paid to the specific competencies (CE) listed below and, to a different extent depending on the characteristics of the subject, to the following competencies chosen from the general (CG), transversal (CT) and basic (CB) competencies of the degree:

Basics:
CB8: Students are expected to be able to integrate knowledge and face the complexity of making judgments based on information that, being incomplete or limited, includes reflections on social and ethical responsibilities linked to the application of their knowledge and judgments.

CB10: Students are expected to acquire the learning skills that allow them to continue studying in a way that will be largely self-driven or autonomous.

Transversal:
CT3: Development of the innovative and entrepreneurial spirit.

General:
CG1: Ability to analyze and synthesize knowledge.
CG7: Ability to learn independently for specialization in one or more fields of study.

Specific:
CE1: Students are expected to know and apply the concepts, methodologies and technologies of image processing.
CE2: Students are expected to know and apply machine learning and pattern recognition techniques applied to computer vision.
CE3: Students are expected to know and apply the concepts, methodologies and technologies of image and video analysis.
CE5: Students are expected to know how to analyze and apply state of the art methods in computer vision.
CE9: Students are expected to know and apply the concepts, methodologies and technologies for the recognition of visual patterns in real scenes.

Teaching methodology

The classes will be conducted so that the teaching methodologies are aligned with the fundamental objectives of the subject. The subject combines exposition and applications: students will learn not only why, but also how to implement, evaluate and make decisions about visual recognition systems.

The information and scientific knowledge foreseen in the objectives will be presented at the beginning of each topic, where its relation to topics covered in previous classes or in other subjects will be established. These sessions aim to develop students' competences and make them aware of the importance of the topics in the current real-world context, providing a better framework and making the intended objectives easier to grasp.

Given the practical nature of the topics, several exercises and practical cases drawn from research work will be presented and proposed. Students will learn by doing, reflecting and making decisions about the proposed problems and alternatives, improving their skills in the topics under analysis. The classes will try to stimulate a dialogue in which everyone participates, drawing on their own experience and knowledge. Knowledge, doubts and questions will thus be shared, which benefits students' learning and increases their motivation. Above all, efforts will be made to develop the ability to "apply in different contexts" the knowledge acquired, under the influence of different factors and variables.

The practical group work will contribute significantly to achieving the objectives of the subject: it supports the understanding and application of the topics under study and shows the benefits of computer vision projects for the efficiency of companies. It will allow students to identify the different resources and components of a project and its internal and external relations, and to use in a general and integrated way the concepts and methodologies addressed in this and other curricular units.

The practical work also has the advantages of sharing knowledge among the members of the group and of searching for external information, which brings students into contact with real practice. Its subsequent presentation and defense, together with the analysis of a project carried out by another group in the class, will contribute decisively to reinforcing the capacity for analysis that is considered essential for achieving the objectives of this subject.

Competences CE1, CE2, CE3 and CE5 have specific associated theoretical and practical contents in the subject and will be evaluated explicitly throughout the course.
Competences CG1, CG7, CB8 and CB10 are developed mainly through the analysis and group discussion of state-of-the-art papers.
Competence CT3 is developed especially in group projects.

The evaluation of the students will serve to gauge the effectiveness of teaching methodologies developed in compliance with the objectives of the subject.

See Contingency Plan for Alternative Scenarios (Distancing, Closure of Facilities).

Evaluation system

The evaluation of the subject consists of three parts:

60%: The content presented in the lectures will be evaluated by means of written tests and/or the continuous evaluation of laboratory practices, which will assess the adequacy of the proposed solutions to the problems, the quality of the results obtained and the understanding of the techniques used.
It is used to assess mainly the CE1, CE2, CE3 and CE5 competencies.

30%: Resolution of practical cases (research project). The adequacy of the proposed solutions to the problems, the quality of the results obtained and the understanding of the techniques used will be assessed.
It is used to assess CT3, in addition to the specific competencies.

10%: Analysis of a state-of-the-art article in visual recognition.
It is used to assess CG1, CG7, CB8 and CB10 competencies mainly.

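The final result (RF) will be obtained by the formula: RF = 0.6 A + 0.3 B + 0.1 C, where A, B and C are the marks obtained in the three parts above (60%, 30% and 10%, respectively).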

All assignment and test marks will be preserved until the Second Opportunity, in which students may repeat some of the assessment activities. The final grade will be computed by taking, for each activity, the maximum of the marks obtained in the two opportunities.
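As an illustration only, the grade computation described above can be sketched as follows. The sketch assumes that A, B and C are the marks of the 60%, 30% and 10% parts in that order, that all marks share one scale (the sample values use a 0 to 10 scale, which the text does not state), and that a part not repeated in the Second Opportunity keeps its first mark. The function names `final_result` and `best_marks` are hypothetical.

```python
def final_result(a, b, c):
    """Weighted final result: RF = 0.6 A + 0.3 B + 0.1 C."""
    return 0.6 * a + 0.3 * b + 0.1 * c


def best_marks(first, second):
    """Per-part maximum over both opportunities.

    `first` and `second` are (A, B, C) tuples; an entry of None in
    `second` means that part was not repeated (assumption).
    """
    return tuple(f if s is None else max(f, s) for f, s in zip(first, second))


# Hypothetical marks: part A is improved, part B is not repeated,
# part C is repeated with a lower mark, so the first mark is kept.
first_opportunity = (4.0, 8.0, 6.0)
second_opportunity = (6.5, None, 5.0)

a, b, c = best_marks(first_opportunity, second_opportunity)
print(round(final_result(a, b, c), 2))
```

Taking the maximum per part, rather than the maximum of the two overall results, follows the wording "between corresponding activities" in the text.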

A student will be marked Absent if they neither submit any assessment exercise nor take any test in either Opportunity.

For cases of fraudulent performance of exercises or tests, the provisions of the Regulations for the evaluation of students’ academic performance and the review of qualifications will apply.
In application of the ETSE Regulation on plagiarism (approved by the ETSE Board on 19/12/2019), the total or partial copying of any practical or theory exercise will result in a failure in both opportunities of the course, with a grade of 0.0 in both cases.

See Contingency Plan for Alternative Scenarios (Distancing, Closure of Facilities).

Studying time and personal work

The recommended study time for students is about 2 hours per week. Additionally, we estimate that they should spend about 6.5 hours per week working on assignments. All of these activities add up to around 120 hours per semester.

Subject study recommendations

Students are advised to keep up to date with the course material and to use tutoring sessions to clarify doubts and receive advice on the development of the activities.

Observations

The virtual campuses of UPorto and USC will be used for their respective parts of the subject.
The subject will be taught in English.

Contingency plan:

In the event that the health situation advises establishing a Scenario 2 (distancing):
1) all the expository classes will be taught online (synchronously by Microsoft Teams or asynchronously through the publication of videos recorded by the teaching staff),
2) interactive classes will be taught in person in a computer room,
3) the weight of the different parts of the subject and the requirements to pass it will remain unchanged,
4) the final test will be done in person.

In the event that the health situation advises establishing a Scenario 3 (closure of facilities):
1) all the expository classes will be taught online (synchronously by Microsoft Teams or asynchronously through the publication of videos recorded by the teaching staff),
2) all interactive classes will be taught online (synchronously by Microsoft Teams or asynchronously through the publication of videos recorded by teachers),
3) the weight of the different parts of the subject and the requirements to pass it will remain unchanged,
4) the final test will be done online, using Microsoft Teams and the tools of the Moodle virtual classroom.