Research Overview: Analysis of high dimensional image feature spaces

The prime focus of my recent research has been eigenanalysis (Principal Component Analysis, PCA) of images, for which core algorithms have been developed to incrementally build and learn eigenmodels. This has led to further research in the analysis of high dimensional image feature spaces: computational issues, manifold representation, and applications to articulated human motion analysis, human faces and facial dynamics, biology, and audio-visual tasks.

Building Eigenspace (PCA) Models:

Adding Eigenspaces
Incrementally building and learning eigenmodels: My initial research in this area developed core algorithms to incrementally build and learn eigenmodels. This is important because standard computer memory limits the size of eigenmodels that can be built in 'batch' mode, and because data is not always available at a single instant: for example, a company or university facial database may need to be updated - people added or deleted - as they join or leave the institution.

Key Paper: Merging and Splitting Eigenspace Models (PDF), P. M. Hall, A. D. Marshall, R. R. Martin IEEE PAMI 22 (9), pp 1042-1049, 2000 ISSN 0162-8828
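The core idea of merging two eigenmodels can be illustrated in a few lines of numpy. This is only a sketch of the principle: it reconstructs full covariance matrices and re-eigendecomposes, whereas the PAMI paper works efficiently within the combined subspace; the function names here are illustrative, not the paper's.

```python
import numpy as np

def fit_eigenmodel(X):
    """Fit a simple eigenmodel: mean, eigenvectors, eigenvalues, count."""
    mu = X.mean(axis=0)
    C = np.cov(X, rowvar=False, bias=True)
    vals, vecs = np.linalg.eigh(C)
    order = np.argsort(vals)[::-1]
    return mu, vecs[:, order], vals[order], len(X)

def merge_eigenmodels(m1, m2):
    """Merge two eigenmodels without revisiting the raw data.

    Reconstructs the combined scatter from the two models' statistics,
    including the correction term for the difference in means, and
    re-eigendecomposes -- the idea behind eigenspace merging, done here
    naively with full covariance matrices."""
    mu1, V1, l1, n1 = m1
    mu2, V2, l2, n2 = m2
    n = n1 + n2
    mu = (n1 * mu1 + n2 * mu2) / n
    # Scatter of each part about its own mean: S_i = n_i * V_i diag(l_i) V_i^T
    S1 = n1 * (V1 * l1) @ V1.T
    S2 = n2 * (V2 * l2) @ V2.T
    # Correction term for the shift between the two means:
    d = (mu1 - mu2)[:, None]
    S = S1 + S2 + (n1 * n2 / n) * (d @ d.T)
    vals, vecs = np.linalg.eigh(S / n)
    order = np.argsort(vals)[::-1]
    return mu, vecs[:, order], vals[order], n
```

Merging two halves of a data set this way reproduces the batch model's mean and eigenvalues exactly (eigenvectors agree up to sign).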

Hidden Markov Modelling

Hidden Markov Modelling: Hidden Markov Models (HMMs) are classic tools in Computer Vision (influenced by the speech and signal processing community) for modelling dynamics, with applications to modelling motion (human, facial etc.) and to tracking, for example. My early research in this area developed methods to add Hidden Markov Models, work since advanced by my colleagues. My recent interest in HMM modelling has focussed on improving the underlying modelling of motion by HMMs: we have looked at partitioning initial motion trajectories so that each partition is modelled by a single HMM. Recent work has also focussed on automatically building Hierarchical Hidden Markov Models (HHMMs).

Key Papers:
A method to add Hidden Markov Models with application to learning articulated motion (PDF), Y. Hicks, P. Hall and A.D. Marshall, British Machine Vision Conference, 489-498, 2003.
Generating Human Interactive Behaviours using the Windowed Viterbi Algorithm Y. Zheng, Y. Hicks, D. Cosker, D. Marshall Lecture Notes in Communications in Computer and Information Science, Springer-Verlag, Accepted, To Appear
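As a rough illustration of what 'adding' two HMMs might involve, the sketch below pools the state spaces of two trained models and couples them with a small escape probability. This is a hypothetical simplification, not the construction in the BMVC paper; the `eps` coupling scheme and all names are my own assumptions.

```python
import numpy as np

def add_hmms(pi1, A1, pi2, A2, w=0.5, eps=0.01):
    """Combine two HMMs (initial distributions pi*, transition
    matrices A*) into a single larger HMM.  States are pooled,
    within-model transitions are kept (scaled by 1 - eps), and a
    small escape probability eps lets the combined model switch
    between the two behaviours via the other model's initial
    distribution.  A hypothetical simplification for illustration."""
    n1, n2 = len(pi1), len(pi2)
    n = n1 + n2
    pi = np.concatenate([w * pi1, (1 - w) * pi2])
    A = np.zeros((n, n))
    A[:n1, :n1] = (1 - eps) * A1
    A[n1:, n1:] = (1 - eps) * A2
    # Escape mass re-enters the other model via its initial distribution:
    A[:n1, n1:] = eps * np.tile(pi2, (n1, 1))
    A[n1:, :n1] = eps * np.tile(pi1, (n2, 1))
    return pi, A
```

By construction the result is still a valid HMM: its initial distribution and every transition row sum to one.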

Groupwise Registration

Groupwise Registration: In order to build statistical models of appearance (eigenmodels) you need a training set of registered data. The traditional way to achieve this was to hand-label all images - a tedious and time-consuming task. Inspired by pioneering work at Manchester University on automatic non-rigid registration via a groupwise image registration approach, we have developed a novel alternative. Our solution implicitly reduces the dimensionality of the search space as the search progresses, by incrementally learning optimal deformations. We also introduce a novel application of stochastic optimisation algorithms whose performance does not significantly degrade as the dimensionality grows. Additionally, the efficient formulation of our approach makes it amenable to GPU implementation. Due to the robustness of our approach we are also able to perform inter-person groupwise registration: we take a corpus of individual face images and can successfully register them (see image opposite). This is the first time that the automatic non-rigid registration of data possessing such variety has been reported.

Key Paper: An Efficient Stochastic Approach to Groupwise Non-rigid Image Registration (PDF), Kirill Sidorov, Stephen Richmond and David Marshall, Proceedings of CVPR 2009, pp 2208-2213, Miami, USA, June 2009
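The flavour of stochastic groupwise optimisation can be shown on a toy problem: aligning a group of 1-D signals by integer shifts so as to minimise the total per-sample variance across the group. This is only an illustrative stand-in - the paper optimises non-rigid deformations of images with a far more sophisticated sampler - and every name below is hypothetical.

```python
import numpy as np

def groupwise_cost(signals, shifts):
    """Groupwise dissimilarity: total per-sample variance after
    applying an integer circular shift (np.roll) to each 1-D signal."""
    aligned = np.array([np.roll(s, k) for s, k in zip(signals, shifts)])
    return aligned.var(axis=0).sum()

def register(signals, iters=800, seed=1):
    """Stochastic search over shift parameters: propose a +/-1 change
    to one signal's shift, keep it if the groupwise cost improves.
    A toy stand-in for stochastic optimisation of deformations."""
    rng = np.random.default_rng(seed)
    shifts = np.zeros(len(signals), dtype=int)
    best = groupwise_cost(signals, shifts)
    for _ in range(iters):
        cand = shifts.copy()
        i = rng.integers(len(signals))
        cand[i] += rng.choice([-1, 1])
        c = groupwise_cost(signals, cand)
        if c < best:
            best, shifts = c, cand
    return shifts, best
```

On signals that are exact shifted copies of one template, the search drives the groupwise cost to zero.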

Manifold Representations:

Isomap optimal parameter value estimation: The Isometric mapping (Isomap) method has demonstrated promising results in finding low dimensional manifolds from data points in a high dimensional input space. Isomap has only one free parameter (the number of nearest neighbours K, or the neighbourhood radius ε), which has to be specified manually. We have developed a new method for selecting the optimal parameter value for Isomap automatically. Numerous experiments on synthetic and real data sets show the effectiveness of our method.

Key Paper: Selection of the optimal parameter value for the Isomap algorithm O. Samko, P. Rosin and D. Marshall Pattern Recognition Letters 27, pp 968-979, 2006
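A simplified version of automatic parameter selection can be sketched as follows: for each candidate K, run a minimal Isomap (kNN graph, geodesic distances, classical MDS) and score it by the residual variance between geodesic and embedded distances, keeping the K with the lowest score. The published method uses a more refined cost; this sketch, and all its names, are illustrative assumptions.

```python
import numpy as np

def isomap_residual_variance(X, k, d_out=2):
    """Embed X with a minimal Isomap and return the residual variance
    1 - r^2 between geodesic and embedded distances (lower = better)."""
    n = len(X)
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    # k-nearest-neighbour graph (symmetrised); inf where not adjacent.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:
            G[i, j] = G[j, i] = D[i, j]
    # Geodesic distances by Floyd-Warshall.
    for m in range(n):
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    if not np.isfinite(G).all():
        return np.inf          # graph disconnected: k too small
    # Classical MDS on the geodesic distances.
    J = np.eye(n) - 1.0 / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:d_out]
    Y = vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))
    DY = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    iu = np.triu_indices(n, 1)
    r = np.corrcoef(G[iu], DY[iu])[0, 1]
    return 1.0 - r ** 2

def best_k(X, ks):
    """Pick the neighbourhood size with the lowest residual variance --
    a simplified stand-in for automatic Isomap parameter selection."""
    scores = [isomap_residual_variance(X, k) for k in ks]
    return ks[int(np.argmin(scores))]
```

On a smooth curve sampled in 3D, any K that keeps the neighbourhood graph connected but free of shortcuts scores well under this criterion.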


Robust automatic data decomposition: In this work, we address the problem of automatically extracting a partial representation from real-world data whose structure is not known a priori. Such a representation is very useful for the subsequent construction of an automatic hierarchical data model. We devised a three-stage process: the first step uses data normalisation and estimation of the data's intrinsic dimensionality; the second stage uses a modified sparse Non-negative Matrix Factorisation (sparse NMF) algorithm to perform the initial segmentation; and the final stage applies a region-growing algorithm to construct a mask of the original data. Our algorithm has a very broad range of potential applications; we illustrate this versatility by applying it to several dissimilar data sets.

Key Paper: Robust automatic data decomposition using a modified sparse NMF (PDF), O. Samko, P. L. Rosin and A. D. Marshall Mirage 2007 - Computer Vision/Computer Graphics Collaboration Techniques and Applications INRIA Rocquencourt, France, March 28-30, LNCS vol. 4418, pp. 225-234, 2007.
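A minimal sparse NMF, loosely in the spirit of the second stage above, can be written with multiplicative updates and an L1 penalty on the activations. This is a textbook-style sketch under my own parameter choices, not the authors' modified algorithm.

```python
import numpy as np

def sparse_nmf(V, r, sparsity=0.1, iters=200, seed=0):
    """Minimal sparse NMF: multiplicative updates approximately
    minimising ||V - WH||^2 + sparsity * sum(H), with W, H kept
    non-negative throughout."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + 0.1
    H = rng.random((r, m)) + 0.1
    for _ in range(iters):
        # L1 sparsity on H enters the denominator of its update.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H
```

On data that genuinely is a non-negative low-rank product, the factors stay non-negative and the reconstruction error becomes small.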

Applications to Articulated Human Motion Analysis:

Tracking People in 3D: This work stemmed from my initial work on adding and subtracting eigenspaces; applying that theory to the tracking of 3D articulated human motion seemed an ideal application. It produced a novel hierarchical eigenmodel representation of data, which subsequently fed into my work on facial dynamics (see below) and permeates some of my research to this day.
The novel hierarchical model of human dynamics is capable of view-independent tracking of a human figure in monocular video sequences. The model is trained using real (3D) data from a collection of people. The top of the hierarchy contains information about the whole body; the lower levels contain more detailed information about possible poses of subparts of the body. Our experiments show that we can recover 3D human figures from 2D images in a view-independent manner, and also track people the system has not been trained on.

Key Paper: Tracking People in Three Dimensions Using a Hierarchical Model of Dynamics (PDF), Y Hicks, P.M. Hall and A.D. Marshall, Image and Vision Computing Volume 20, Issues 9-10, 1 August 2002, Pages 691-700 ISSN: 0262-8856
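The shape of such a hierarchy can be sketched as a coarse PCA over the whole pose vector plus a finer PCA per body part. The real model also encodes dynamics across several levels, so the structure below (and all its names) is illustrative only.

```python
import numpy as np

def pca_basis(X, d):
    """Mean and top-d principal directions of a data matrix."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:d]

def build_hierarchy(poses, parts, d=2):
    """Two-level eigenmodel: a coarse PCA over the whole pose vector
    at the root, plus a finer PCA per body part (a named slice of the
    pose vector).  Hypothetical structure for illustration."""
    root = pca_basis(poses, d)
    children = {name: pca_basis(poses[:, sl], d) for name, sl in parts.items()}
    return root, children
```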

Virtual Friend

Virtual Friend: This work developed from the 3D people-tracking research above and also draws on my interest in developing HMM models for accurate motion representation.

We have developed a new approach for generating interactive behaviours for virtual characters using the windowed Viterbi algorithm, which is capable of real-time performance. Our system tracks and analyses the behaviour of a real person in video input and produces a fully articulated three-dimensional (3D) character interacting with that person. The system is model-based: prior to tracking, we train a collection of dual-input Hidden Markov Models (HMMs) on 3D motion capture (MoCap) data representing a number of interactions between two people. Using a dual-input HMM, we then generate a moving virtual character reacting to (the motion of) a real person. We have evaluated the windowed Viterbi algorithm in detail within our system and shown that the approach is suitable for generating interactive behaviours in real time. Furthermore, to enhance the tracking capabilities of the algorithm, we developed a novel technique that automatically splits complex motion data, resulting in improved tracking of human motion.
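A minimal windowed Viterbi decoder for a discrete-emission HMM might look like the sketch below: each window is decoded from the running scores and only its first state is committed, so decoding latency stays bounded for a streaming input. This is my simplified reading of the idea, not the exact algorithm from the paper.

```python
import numpy as np

def windowed_viterbi(pi, A, B, obs, w=5):
    """Decode an observation stream in overlapping windows of length w,
    committing only the first state of each window (bounded latency).
    pi: initial distribution, A: transitions, B: emission matrix."""
    path = []
    start = np.log(pi + 1e-12)          # running scores entering time t0
    for t0 in range(len(obs)):
        window = obs[t0:t0 + w]
        # Standard Viterbi over the window, from the running scores.
        delta = start + np.log(B[:, window[0]] + 1e-12)
        psi = []
        for o in window[1:]:
            scores = delta[:, None] + np.log(A + 1e-12)
            psi.append(scores.argmax(axis=0))
            delta = scores.max(axis=0) + np.log(B[:, o] + 1e-12)
        # Backtrack to the best state at the window's first frame.
        s = int(delta.argmax())
        for back in reversed(psi):
            s = int(back[s])
        path.append(s)
        # Advance the running scores by one frame, conditioned on s.
        start = start + np.log(B[:, obs[t0]] + 1e-12)
        start = start[s] + np.log(A[s] + 1e-12)
    return path
```

With near-deterministic emissions the committed path follows the observations, as full Viterbi would.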

Key Paper: Generating Human Interactive Behaviours using the Windowed Viterbi Algorithm Y. Zheng, Y. Hicks, D. Cosker, D. Marshall Lecture Notes in Communications in Computer and Information Science, Springer-Verlag, Accepted, To Appear. Grapp 2008 conference version of paper (PDF).

Videos: Push, Pull, Handshake

Applications to Human Faces and Facial Dynamics:

Talking Head
Video Realistic Talking Head: We are working on various aspects of facial analysis/synthesis, both in 2D and 3D. Our early work was concerned with driving the facial appearance from audio input alone, i.e. "talking heads". We extended basic ideas from standard (eigenmodel-based) Active Appearance Models to work with audio features as well as shape and texture features. Taking ideas from Tracking People in 3D (above), the basic flat active appearance model was also modified to use a hierarchy, which enabled better control of the model.

Key Paper: Speech-driven facial animation using a hierarchical model (PDF), D.P. Cosker, A.D. Marshall, P.L. Rosin, Y.A. Hicks, IEE Proc. Vision, Image and Signal Processing, vol. 151, no. 4, pp. 314-321, 2004.

Videos: Example_1, Example_2 (both example videos generated using only speech), Tracking_Example

Facial Dynamics
Facial Dynamics: A major theme of my research in recent years has been the modelling of facial dynamics. Taking the hierarchical face decomposition (above) further, we can model the dynamics of each region by sampling the coefficients of the principal eigenvectors that span the region. Since the top principal components represent the greatest variation, we can represent the dynamics with a few (often a single) parameters. (See below for more examples.)

We have applied this technique to several applications, including animating a talking head (e.g. inserting artificial blinks, see above), behaviour transfer between two talking-head models, evaluating the psychological perception of smiles, developing a 3D facial biometric, and dental applications of 3D facial dynamics.
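The reduction of a region's dynamics to one or a few parameters can be sketched directly: project the frame sequence for a region onto its top principal component(s) and read off the resulting trajectory over time. The helper below is illustrative only.

```python
import numpy as np

def dynamics_trajectory(frames, n_params=1):
    """Project a sequence of (flattened) face-region frames onto the
    top principal component(s): the region's dynamics collapse to a
    small number of parameter trajectories over time."""
    mu = frames.mean(axis=0)
    _, _, Vt = np.linalg.svd(frames - mu, full_matrices=False)
    return (frames - mu) @ Vt[:n_params].T
```

For a region whose appearance varies along one dominant mode (a blink or smile onset, say), the single recovered trajectory tracks that mode's temporal profile.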

Behaviour Transfer

Facial Behaviour Transfer: We have developed a method for re-mapping animation parameters between multiple types of facial model for performance driven animation. A facial performance can be analysed in terms of a set of facial action parameter trajectories using a modified appearance model with modes of variation encoding specific facial actions which we can pre-define. These parameters can then be used to animate other modified appearance models or 3D morph-target based facial models. Thus, the animation parameters analysed from the video performance may be re-used to animate multiple types of facial model.

Key Paper: Towards Automatic Performance Driven Animation Between Multiple Types of Facial Model (PDF), D. Cosker, R. Borkett, D. Marshall and P. L. Rosin, IET Computer Vision, Vol. 2, No. 3, pp 129-141, 2008.

Videos: Overview, Simple_Morph_Target_Mapping_Example


Emotion and Expressive Facial Dynamics: Our face model has been applied to synthesise stimuli for psychological experiments. The images below show how a single parameter of the model (the current value given by the red dot in images opposite) can be used to vary the smile. We can then model the temporal onset, apex and offset of the smile and furthermore vary the amplitude/intensity of the smile by the strength of the principal components. This was used to determine how the temporal dynamics affected the perception of a smile as genuine or fake.

The work with our internationally renowned School of Psychology benefitted both Schools' research. We provided the most realistic stimulus models available for their experiments and their large experimental testbed not only provided results for the Psychology experiments but for our own (computer science) evaluations of our facial models.

Key Papers:
Effects of Dynamic Attributes of Smiles in Human and Synthetic Faces: A Simulated Job Interview Setting (PDF), E. Krumhuber, A. Manstead, D. Cosker, D. Marshall and P. L. Rosin, Journal of Non-Verbal Behaviour, vol. 33, no. 1, pp. 1-15, 2009.

Facial Dynamics as Indicators of Trustworthiness and Cooperative Behaviour (PDF), E. Krumhuber, A. Manstead, D. Cosker, P.L.Rosin and A.D. Marshall, Emotion, Vol 7, No 4, pp 730-735, 2007.

Videos: Overview. Smile_Analysis - this curve is then manipulated to create 'fake' or 'genuine' smiles.


3D Facial Biometrics: The human face has so far been seen mainly as a physiological biometric, and very little work has been done to exploit the idiosyncrasies of facial gestures for person identification. We proposed a markerless method to capture 3D facial motions and investigated a number of pattern-matching techniques which operate accurately on very short facial actions. Qualitative and quantitative evaluations are performed for both the face identification and the face verification problems, with the emphasis on designing a system which is not only accurate but also usable in a real-life scenario.
Suitable data-processing and feature-extraction methods are examined, and a number of pattern-matching techniques - including the Fréchet distance, correlation coefficients, Hidden Markov Models, Dynamic Time Warping (DTW) and its derived forms - are compared, in light of which an improved Weighted Hybrid Derivative Dynamic Time Warping (WDTW) algorithm is proposed.
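For reference, the classic DTW distance that the derivative and weighted variants build on can be written in a few lines; the WDTW algorithm itself is not reproduced here.

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using absolute difference as the local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of insertion, deletion, and match steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Unlike a Euclidean comparison, DTW scores a sequence and a time-stretched copy of it as identical, which is exactly what is needed when comparing facial actions performed at slightly different speeds.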

Key Paper: Assessing the Uniqueness and Permanence of Facial Actions for Use in Biometric Applications , L. Benedikt, D. Cosker, D. Marshall and P. L. Rosin, IEEE Transactions on Systems, Man and Cybernetics - Part A: Systems and Humans, 2009 (in press). BTAS 2008 conference version of paper (PDF)


Dental Applications: Craniofacial motion analysis: Craniofacial assessment is used for diagnosis, treatment planning and evaluating treatment outcome on the facial structure. The evaluation and quantification of facial movement is becoming particularly important, for example in children undergoing facial surgery (cleft lip and palate), in the assessment of patients with motor deficits (facial nerve palsies), and in the evaluation of psychomotor function associated with depression or pain. Until recently, the only tools available for evaluating facial movement were based on subjective scaling assessments or two-dimensional measurements. Subjective assessments have the drawback that they are based on scales that are discontinuous and ambiguous; two-dimensional measurements are objective, but studies have cast doubt on their validity. This study presents experimental work centred on a novel, non-invasive imaging system capable of three-dimensional, soft-tissue image capture during facial movement. The methodology behind image capture is outlined and facial movement is assessed in response to facial expression and spoken word.

Key Papers:
Three-dimensional motion analysis - an exploratory study. Part 1: Assessment of facial movement (PDF), H. Popat, S. Richmond, R. Playle, D. Marshall, P.L. Rosin and D. Cosker Orthodontics and Craniofacial Research, Vol. 11, pp. 216-223, 2008.

Three-dimensional motion analysis - an exploratory study. Part 2. Reproducibility of facial movement (PDF), H. Popat, S. Richmond, R. Playle, D. Marshall, P.L. Rosin and D. Cosker Orthodontics and Craniofacial Research, Vol. 11, pp. 224-228, 2008.

Applications to Biology:

DIADIST: Diatom and Desmid Identification by Shape and Texture: The DIADIST project was funded by the Biotechnology and Biological Sciences Research Council (BBSRC) under its Bioinformatics Programme (official title: Visual Indexing for Taxonomic Information Systems; grant number 754/BIO14262). It was a collaboration between researchers from Cardiff University, Wales, and the Royal Botanic Garden Edinburgh, Scotland. The aim of the project was to investigate methods for the visual indexing of images and drawings of biological specimens held in a database.

Biological specimens are frequently described in visual form for taxonomic and other purposes, and vast catalogues of specimen material have been accumulated over many years in the form of microscope slides, drawings and photographs. Recently, efforts have been made to digitise such data for electronic storage and transmission. This research programme provides a means of structuring taxonomic databases so that this data can be queried more effectively. Whilst textual keys may be appropriate in some areas of biological data retrieval, they are not adequate where visual data is of major importance, as in taxonomy: a great proportion of microscopic species are described and classified visually, and current biological databases do not exploit the full potential of the visual data already stored within them.

There has been some recent work on indexing between images in biological databases, but there is a clear need to extend such indexing capabilities, and the inclusion of biological drawings is a natural extension. The approach taken in the DIADIST project is novel in that it treats taxonomic drawings as a prime source of taxonomic data and develops methods for indexing between digital photographic images and drawings stored in a biological database.

Project Web Sites: Main Diadist Project Site - hosted at Royal Botanic Garden Edinburgh

Cardiff Diadist Project Site - Code and local publications available here.


Extended Depth of Focus Imaging: Not eigenanalysis, but this work was embryonic - it formed the basis for, and was used in, the DIADIST project above. The work was performed in collaboration with the Natural History Museum, Madrid, Spain, as part of an ongoing collaboration.

Microscopes offer a limited depth of focus, which precludes the observation of a complete image of a three-dimensional (3D) object in a single view. Investigations by a variety of researchers have led to the development of extended depth of focus algorithms for serial optical slices of microscopic 3D objects. However, to date, no quantitative comparison of the different algorithms had been performed, generally leaving the evaluation to the subjective qualitative appreciation of the observer. In this work we defined three different tests for extended depth of focus algorithm evaluation and tested 10 different algorithms, some of which we adapted for a series of optical slices. The main contribution of the paper, however, was a new improved algorithm for computing the extended depth of focus.
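A baseline extended depth-of-focus algorithm of the kind being compared can be sketched as: compute a per-pixel focus measure (here, local variance) for each optical slice, then for every pixel keep the value from the slice where that measure peaks. This is a generic baseline for illustration, not the paper's improved algorithm.

```python
import numpy as np

def extended_depth_of_focus(stack, win=3):
    """Fuse a focal stack (slices x height x width) into one image:
    for each pixel, pick the slice with the highest local variance
    (a simple focus measure) in a win x win neighbourhood."""
    k, h, w = stack.shape
    pad = win // 2
    padded = np.pad(stack.astype(float), ((0, 0), (pad, pad), (pad, pad)),
                    mode="edge")
    focus = np.empty((k, h, w))
    for z in range(k):
        # Local variance via accumulated sliding windows.
        acc = np.zeros((h, w))
        acc2 = np.zeros((h, w))
        for dy in range(win):
            for dx in range(win):
                patch = padded[z, dy:dy + h, dx:dx + w]
                acc += patch
                acc2 += patch ** 2
        n = win * win
        focus[z] = acc2 / n - (acc / n) ** 2
    best = focus.argmax(axis=0)                 # sharpest slice per pixel
    rows, cols = np.indices((h, w))
    return stack[best, rows, cols]
```

On a synthetic stack where each slice is sharp in a different region, the fused image recovers the sharp content of every region.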

Key Paper: Extended Depth-of-Focus Algorithms in Brightfield Microscopy (PDF), Antonio G. Valdecasas, David Marshall and Jose M. Becerra, Microscopy and Analysis, September 2002, pp 9-17, ISSN 0958-1952


A full list of my publications and some downloadable papers are available ONLINE.