eScholarship
Open Access Publications from the University of California

UCLA

UCLA Electronic Theses and Dissertations

Hierarchical Modeling of Human-Object Interactions: from Concurrent Action Parsing to Physics-Based Grasping

Abstract

The study of human-object interaction (HOI) aims to model the geometric relationship between a human and an object during an interaction. Understanding HOI is an essential step towards holistic scene understanding and towards generating realistic scenarios that involve humans. Conventionally, the study of HOI focuses on detecting and classifying instance-level HOI in 2D images. Given an image, an example output would be a triplet relating a person, an interaction, and an object such as a chair or an apple, where the person, chair, and apple are all represented by bounding boxes. This dissertation aims to understand HOI in 3D.
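As a rough illustration of the conventional 2D output described above, the following is a minimal sketch of one way such a triplet could be stored. The class name `HOITriplet`, the field names, the `(x_min, y_min, x_max, y_max)` box layout, the `"sit_on"` label, and all coordinates are hypothetical choices made for this example and are not taken from the dissertation.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class HOITriplet:
    """One instance-level 2D HOI detection (hypothetical structure)."""
    human_box: Box      # bounding box of the person
    interaction: str    # interaction label; the vocabulary is an assumption
    object_label: str   # object category, e.g. "chair" or "apple"
    object_box: Box     # bounding box of the object


def box_area(box: Box) -> float:
    """Area of an axis-aligned 2D box; degenerate boxes have zero area."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


# Illustrative values only.
example = HOITriplet(
    human_box=(40.0, 20.0, 120.0, 200.0),
    interaction="sit_on",
    object_label="chair",
    object_box=(50.0, 110.0, 140.0, 210.0),
)
```

Note that nothing in this structure describes the 3D shape of the chair or where on it the person makes contact, which is the limitation that motivates the move to 3D.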

Extending HOI to 3D faces two significant challenges. The first is the difficulty of obtaining high-fidelity 3D annotations of HOI data: existing methods of collecting 3D datasets all suffer from heavy occlusion, poor resolution, and high annotation costs. The second lies in the representation of objects. Existing methods treat each object as a single unit, usually represented as an axis-aligned bounding box. Such methods ignore the complexity of object shapes and therefore fail to model complex geometric relationships in HOIs such as sitting. The root cause of the second challenge traces back to the first: without high-fidelity data, the details of object shapes cannot be captured.
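The limitation of a single axis-aligned bounding box can be seen in a small sketch. Here a chair is approximated by two hypothetical part boxes (a seat and a backrest), with made-up dimensions in metres and the z axis pointing up. A point where a seated person's hip would be falls inside the whole-object box even though it touches no part of the chair, so the coarse box cannot tell resting on the seat apart from penetrating the object. The helper names and all numbers are assumptions for illustration, not the dissertation's representation.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
AABB = Tuple[Point, Point]  # (min corner, max corner)


def contains(box: AABB, p: Point) -> bool:
    """True if point p lies inside or on the boundary of the box."""
    lo, hi = box
    return all(l <= v <= h for l, v, h in zip(lo, p, hi))


def enclosing_box(parts: Sequence[AABB]) -> AABB:
    """Smallest axis-aligned box that contains every part box."""
    lo = tuple(min(b[0][i] for b in parts) for i in range(3))
    hi = tuple(max(b[1][i] for b in parts) for i in range(3))
    return (lo, hi)


# Hypothetical chair made of two parts (metres, z up).
seat = ((0.0, 0.0, 0.40), (0.5, 0.5, 0.45))
back = ((0.0, 0.45, 0.45), (0.5, 0.5, 1.0))
chair_parts = [seat, back]
chair_box = enclosing_box(chair_parts)

# A point just above the seat and in front of the backrest.
hip = (0.25, 0.2, 0.55)

inside_whole_box = contains(chair_box, hip)
inside_any_part = any(contains(part, hip) for part in chair_parts)
```

With these values, `inside_whole_box` is true while `inside_any_part` is false, which is the kind of distinction a part-level representation makes available.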

This dissertation addresses both challenges by collecting a large-scale, high-fidelity 3D HOI dataset and by proposing hierarchical modeling of HOI. By using instance-level HOI annotation, our dataset improves scene reconstruction performance by a significant margin. The high-fidelity nature of the collected dataset enables part-level HOI modeling, which addresses the second challenge. This dissertation further addresses the second challenge by decomposing shape-level HOI into physics-level HOI, which significantly improves the quality and robustness of grasp synthesis.
