Robot Imitation by Action Understanding, Mirroring, and Interactions

Abstract

This dissertation rethinks the problem of robot imitation learning from human demonstrations and proposes a holistic framework that unifies three challenges: (i) understanding the actions being imitated, (ii) producing proper imitative behaviors, and (iii) interacting effectively with humans. Consider the complex manipulation task of opening medicine bottles: key actions such as pushing or squeezing are critical to unlocking the bottle but can hardly be recognized from visual observations alone. Therefore, a glove-based system is first presented, together with a demonstration collection pipeline, to understand actions from a functional perspective that incorporates hand movements and goal states, and from a physical perspective that focuses on the forces required to reach those states. This heterogeneous information is integrated by a Temporal And-Or Graph (T-AOG) grammar representation, which also captures the hierarchical structure of the task; sampling from the T-AOG generates a valid action sequence that accomplishes the task. To transfer this skill to a robot, a mirroring approach is then proposed by which the robot infers functionally equivalent actions that produce a similar force pattern in a physics-based simulation and achieve the same goal of changing object states, naturally bridging action perception and production in robot imitation. In addition, such a grammar representation is advantageous for tracking object states and accumulating robot knowledge from multiple views over long periods of time. Building on this, a joint inference algorithm is proposed to infer human (false-)beliefs and resolve ambiguities in visual detection. Finally, this dissertation studies how different forms of explanation generated from the representation promote human trust in a robotic system, and develops an Augmented Reality (AR) interface that allows users to interactively supervise a robot’s decision-making process and intervene by patching its knowledge, represented as a T-AOG. With the T-AOG representation at its core, this dissertation seeks to unify the perception, learning, planning, and interaction problems in robot imitation.
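
To make the grammar representation and the sampling step concrete, below is a minimal, illustrative Python sketch of a toy T-AOG for the bottle-opening task. The node names (e.g., grasp_lid, push_lid), the graph structure, and the uniform choice at Or-nodes are assumptions for illustration only, not the dissertation's actual grammar or sampler.

import random

# A toy Temporal And-Or Graph (T-AOG) for opening a medicine bottle.
# And-nodes expand all children in temporal order; Or-nodes choose one
# child (here, uniformly at random). Leaves are atomic actions.
# Node names and structure are illustrative assumptions only.
T_AOG = {
    "open_bottle":  ("AND", ["approach", "unlock", "twist_off"]),
    "approach":     ("AND", ["grasp_bottle", "grasp_lid"]),
    "unlock":       ("OR",  ["push_lid", "squeeze_lid", "no_lock"]),
    "grasp_bottle": ("LEAF", None),
    "grasp_lid":    ("LEAF", None),
    "push_lid":     ("LEAF", None),
    "squeeze_lid":  ("LEAF", None),
    "no_lock":      ("LEAF", None),
    "twist_off":    ("LEAF", None),
}

def sample(node):
    """Sample one valid action sequence (a parse) from the T-AOG."""
    kind, children = T_AOG[node]
    if kind == "LEAF":
        return [node]
    if kind == "AND":  # expand every child, left to right (temporal order)
        seq = []
        for child in children:
            seq.extend(sample(child))
        return seq
    return sample(random.choice(children))  # OR: pick one branch

print(sample("open_bottle"))
# e.g. ['grasp_bottle', 'grasp_lid', 'push_lid', 'twist_off']

Each call to sample yields one valid parse of the grammar, i.e., one executable action sequence; in a full system the Or-branch probabilities would be learned from demonstrations rather than uniform.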
