What do those images have in common?

Monday, March 24, 2008 - 4:47pm


Narendra Ahuja
Donald Biggar Willett Professor, University of Illinois at Urbana-Champaign

DATE: Monday, April 7, 2008
TIME: 3:30 – 4:30 PM
PLACE: Computer Science Conference Room, Harold Frank Hall Rm. 1132

This talk is about a priori unknown themes that may characterize a given, arbitrary or strategically chosen, set of images. If objects from a certain category occur frequently in the set, that category is said to form a theme. No specific categories are defined by the user; indeed, they are not even known to the user a priori. Whether and where instances of any category appear in a specific image is also not known.

Such autonomous analysis of image content is becoming increasingly important with the fast-growing volume of video data. It requires answers to the following basic questions. What is an object category? Is human supervision necessary, and if so to what extent, to communicate the nature of categories to a computer vision system? What properties should be used to define a good category representation? In our work, an object category is defined as consisting of (2D) subimages that have similar photometric, geometric, and topological properties. In this talk, we present our methodology for achieving the following capabilities. (1) Discovering whether any categories occur in the image set. (2) Learning a compact model that captures the intrinsic, image-space nature of the categories. (3) Learning the relationships among the different categories, thus building a taxonomy of all discovered categories; the taxonomy makes explicit how a complex category may itself be defined as a configuration of other, simpler categories occurring in specific spatial relationships. (4) Using the learned taxonomy to recognize all occurrences of all categories in previously unseen images. (5) Segmenting each recognized category occurrence. (6) Explaining each recognition by articulating where and why the simpler, defining categories are detected.
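The taxonomy idea in capability (3) can be illustrated with a minimal sketch. This is a hypothetical toy structure, not the authors' actual model: each taxonomy node is either a primitive category or a configuration of simpler part categories with relative spatial offsets, so a complex category automatically sits above its parts in the hierarchy.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    """A node in a discovered taxonomy: either a primitive region
    category (no parts) or a configuration of simpler categories."""
    name: str
    # (part category, relative (dx, dy) offset) pairs defining the configuration
    parts: list = field(default_factory=list)

def depth(cat: Category) -> int:
    """A complex category is one level above its deepest part."""
    return 1 + max((depth(p) for p, _ in cat.parts), default=0)

# Toy taxonomy: a "face" is a configuration of two "eye" regions and a
# "mouth" in specific spatial relationships. (Illustrative names only.)
eye = Category("eye")
mouth = Category("mouth")
face = Category("face", parts=[(eye, (-1, -1)), (eye, (1, -1)), (mouth, (0, 1))])

print(depth(eye))   # 1: primitive category
print(depth(face))  # 2: defined in terms of simpler categories
```

Recognizing a `face` then reduces to detecting its parts and checking their spatial relations, which is also what makes the recognition explainable, as in capability (6).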

Our approach begins with a segmentation-tree representation of each image. Capabilities (1-5) involve matching, learning, and organization of the trees. These computations are general, involve machine learning of stochastic visual patterns, and are almost completely unsupervised. This generality makes the approach easy to extend to detecting recurring image themes of other kinds. We present one such example, that of identifying and extracting the stochastically repeating parts of visual textures, commonly called texture elements (e.g., water lilies forming a texture on the water surface, and individuals in a crowd).
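The segmentation-tree representation can be sketched in miniature. This is a simplified stand-in for the multiscale segmentation used in the talk: given a set of nested image regions, each region's parent is the smallest region that strictly contains it, yielding a tree rooted at the whole image.

```python
# Build a segmentation tree from toy nested regions. Each region is a set
# of pixel coordinates; a region's parent is the smallest region that
# strictly contains it. (A hypothetical minimal sketch, not the actual
# multiscale segmentation algorithm.)

def build_tree(regions):
    """regions: dict mapping region name -> set of (x, y) pixels.
    Returns dict mapping each region name to its parent's name (or None)."""
    parent = {}
    for name, pix in regions.items():
        # Candidate parents: regions strictly containing this one.
        containers = [(len(p), n) for n, p in regions.items()
                      if n != name and pix < p]
        # The smallest container is the parent; the root has none.
        parent[name] = min(containers)[1] if containers else None
    return parent

# Toy image: the whole image contains a "pad" region, which in turn
# contains a "lily" region (cf. the water-lily texture example).
img  = {(x, y) for x in range(8) for y in range(8)}
pad  = {(x, y) for x in range(2, 6) for y in range(2, 6)}
lily = {(3, 3), (3, 4), (4, 3), (4, 4)}

tree = build_tree({"image": img, "pad": pad, "lily": lily})
print(tree)  # {'image': None, 'pad': 'image', 'lily': 'pad'}
```

Matching and organizing such trees across images is then a tree-matching problem: recurring subtrees correspond to recurring image structures, which is what lets repeating texture elements be discovered without supervision.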

Narendra Ahuja received his Ph.D. degree in computer science from the University of Maryland, College Park, USA. Since 1979 he has been with the University of Illinois at Urbana-Champaign, where he is currently the Donald Biggar Willett Professor in the Department of Electrical and Computer Engineering, the Beckman Institute, and the Coordinated Science Laboratory. His current research is focused on extraction and representation of spatial structure in images and video; integrated use of multiple image-based sources for scene representation and recognition; versatile sensors for computer vision; and applications including visual communication, image manipulation, and information retrieval.

HOST: Matthew Turk