Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations

Contextual Visual Object Recognition and Behavior Modeling for Human-Robot Interactivity


Modeling spatio-temporal contextual information is fundamental in computer vision, with particular relevance to robotic intelligence and autonomous driving. We develop several frameworks for context modeling in image, video, multi-modal, and multi-cue data with applications to human-robot interactivity, in particular to the domain of intelligent vehicles. With the goal of developing contextual systems for interactivity, several key contributions are proposed: (1) a contextual framework for robust image-level scene understanding, including detection and localization of vehicles, pedestrians, and parts of humans (e.g., hands) in on-road settings; (2) a spatio-temporal, multi-modal, and multi-cue model that reasons over the complex interplay among the human (hand, head, and foot coordination), vehicle (speed, yaw-rate, etc.), and surround spatio-temporal context (agents, scene information) cues for understanding behavior and predicting activities; and (3) a human-centric framework for object recognition and visual scene analysis, developed by studying a notion of object importance and relevance as measured in the spatio-temporal context of navigating a vehicle. The final contribution unifies the aforementioned components of the thesis, including spatio-temporal object recognition, human perception modeling, and behavior and intent prediction, into a single research task. Although the data and case studies in this work emphasize the safety-critical setting of navigating a vehicle, the contributions of this thesis are general and can therefore be applied to a wider array of applications involving human-machine interactivity.
