AISB event Bulletin Item

Programme: Workshop on the Active Vision of Humanoids, 29 November 2007, Pennsylvania, USA


                            THE ACTIVE VISION OF HUMANOID ROBOTS

                                                        Organized by

                                      Yiannis Aloimonos and Giulio Sandini

                                                     November 29, 2007

                                 Omni William Penn Hotel, 530 William Penn Pl.

                                                 Pittsburgh, PA 15219

                           Together with the IEEE-RAS 7th International Conference 

                                                on Humanoid Robots



            Practical computer vision systems are devoted to answering a set of practical questions, such as: Is there something moving independently in the video taken by a moving camera? What is it? Is there a human in the image? Who is it? Biological vision systems, on the other hand, are involved in an ongoing process of analyzing images. As Stuart Geman wrote, real-world images have essentially infinite detail, which can be perceived only by a process that is itself ongoing and essentially infinite. The more you look, the more you see. Considering a humanoid robot, how should we think about its vision? The way we think of a practical vision system, or the way we think of a biological vision system?

          The current state of the art does not have a definite answer. But even if some of you adopt the bio-inspired or bio-mimetic viewpoint, how should we proceed? Should we think of the humanoid as performing unconscious inference about the world? Should we think of the humanoid as developing a data structure that represents the components of the scene and their relationships, like building a complex molecule whose atoms and bonds represent scene primitives and their relationships? Or should we follow the conventional wisdom inherited from D. Marr, where vision amounts to a high-resolution buffer and the job is to annotate a scene (human here, dog there) through a complicated search involving attention?

           The goal of the workshop is to present various points of view on these problems while keeping some focus on the question of the visual architecture of the humanoid: How should its motion system be structured? Should it stabilize the images? Segment the scene into surfaces? Constantly check where it is with regard to its knowledge of the world? How should it build models of objects? How should it integrate cue information? How should it reach a decision? What is its perception of spatial layout?

Is there software available today that can provide humanoids with a basic visual front-end, and what would it be? Should we be developing visuo-motor representations? How could we build them and how could we use them?

            We will also address questions such as: What kind of information should the humanoid extract from images and video? Should that information be expressed in some language? Should vision produce one general-purpose description, leaving it to other processes to transform it to suit their needs, or should it produce many specialized descriptions? Could intermediate-level vision be learned? How?

           To shed light on these questions, we have invited a few prominent researchers from the field. We hope you will join us in Pittsburgh for an exciting day of presentations and panel discussions. The workshop program follows.

                                  Workshop Program (9:00 am to 7:00 pm; lunch 12:00-1:00)

9:00-9:05: Introduction and welcome (Y. Aloimonos and G. Sandini)

9:05-9:30: Y. Aloimonos: The compositionality of Vision and Action

9:30-10:00: Dana Ballard: The role of gaze control in embodied cognition

10:00-10:30: James Albus: A model of computation and representation in visual cortex

10:30-11:00: Coffee Break

11:00-11:30: Eric Schwartz: Space-variant active vision: hallmark of an unsolved challenge for human(oid) vision

11:30-12:00: Y. Sakagami: 3D eyes in the Honda Humanoid Robot

12:00-1:00: Lunch

1:00-1:30: Bert Shi: Building active vision systems with POPCORN: hardware simulating POPulations of CORtical Neuron models

1:30-2:00: Ming-Hsuan Yang: Incremental learning for robust visual tracking

2:00-2:30: Kostas Daniilidis: Challenges in navigation of humanoids

2:30-3:00: Michele Rucci: Eye movements in humans and humanoids

3:00-3:30: Coffee Break

3:30-4:00: Jan-Olof Eklundh: Attention and 3D cues in active vision

4:00-4:30: Randal Nelson: The curse of arithmetic

4:30-5:00: Abhijit Ogale: Patterns in human(oid) motion synergies

5:00-5:30: Ruzena Bajcsy: Representation and Recognition of Human Action

5:30-6:00: Giulio Sandini: Humanoid Vision

6:00-6:45: Panel Discussion (emphasis on learning middle-level vision, with short presentations and questions)

6:45 onward: Reception