Reasoning for Representations for Learning-based Control
- Xu, Zhuo
- Advisor(s): Tomizuka, Masayoshi
Abstract
Data-driven machine learning approaches offer powerful capabilities and can help solve many challenging control problems. However, the opaque nature of the internal representations in learning-based control (LbC) policies makes them difficult to apply broadly. Since LbC policies are generally optimized end-to-end under specialized domain settings, they can fail, for unknown reasons, when those settings vary. Moreover, a lack of understanding of the underlying logic of LbC policies also limits the transfer of learned knowledge to different tasks.
This dissertation presents a series of works on reasoning for representations for LbC policies, including the decomposition of LbC policies and the design of interpretable and transferable representations. The dissertation is organized around the roles that representations play in LbC. Three major representation reasoning areas are covered: (I) inside the LbC policies, (II) at the interface of the LbC policies, and (III) outside the LbC policies.
First, within the parameterized LbC policies, reasoning for learned representations is developed to capture subtle features in complex scenarios. A sophisticated neural network structure is applied online to infer task context representations in a contact-aware way; this is followed by a comprehensive investigation of the design and selection of representations. The presented representations include the Gaussian mixture model, the graph neural network, and a novel history-encoding representation; the applications range from robotic manipulation to autonomous driving to human intention inference.
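As a rough illustration of the history-encoding idea, the sketch below compresses a window of past observation-action pairs into a latent context vector that then conditions the policy. This is a minimal, hypothetical example, assuming a GRU encoder; the class names, encoder choice, and dimensions are our assumptions, not the dissertation's actual architecture.

```python
import torch
import torch.nn as nn

class HistoryEncoder(nn.Module):
    """Encode a window of past observation-action pairs into a context vector."""
    def __init__(self, obs_dim, act_dim, ctx_dim, hidden=64):
        super().__init__()
        self.gru = nn.GRU(obs_dim + act_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, ctx_dim)

    def forward(self, obs_hist, act_hist):
        # obs_hist: (batch, T, obs_dim); act_hist: (batch, T, act_dim)
        x = torch.cat([obs_hist, act_hist], dim=-1)
        _, h = self.gru(x)                # h: (1, batch, hidden), final GRU state
        return self.head(h.squeeze(0))    # context: (batch, ctx_dim)

class ContextConditionedPolicy(nn.Module):
    """Policy conditioned on the current observation and the inferred context."""
    def __init__(self, obs_dim, ctx_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs, ctx):
        return self.net(torch.cat([obs, ctx], dim=-1))
```

The point of this pattern is that the encoder summarizes recent interaction history (e.g., contact events) into a compact representation, so the policy can adapt its behavior to the inferred task context without changing its weights.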
To address the challenge of knowledge transfer, we propose a policy decomposition approach that learns attribute-wise modules separately. We design two representation frameworks at the interface of the LbC policies for the decomposition and combination of attribute-wise modules. The proposed architectures, the cascade attribute networks (CANs) and the parallel attribute networks (PANs), can transfer learned knowledge between tasks and efficiently produce sophisticated LbC policies by fusing learned attribute-wise modules.
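The cascade idea can be pictured as a chain of attribute modules, each refining the action proposed by the stage before it. The sketch below is schematic and written under our own assumptions; the module interfaces, fusion rule, and class names are illustrative, not the exact CAN design from the dissertation.

```python
import torch
import torch.nn as nn

class AttributeModule(nn.Module):
    """One attribute-specific module: refines an upstream action given the state."""
    def __init__(self, state_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, state, upstream_action):
        return self.net(torch.cat([state, upstream_action], dim=-1))

class CascadePolicy(nn.Module):
    """Chain attribute modules: each stage handles one task attribute."""
    def __init__(self, base_policy, stages):
        super().__init__()
        self.base = base_policy
        self.stages = nn.ModuleList(stages)

    def forward(self, state):
        action = self.base(state)
        for stage in self.stages:
            action = stage(state, action)  # refine the action attribute by attribute
        return action

# hypothetical usage: a base policy plus two separately trained attribute modules
state_dim, act_dim = 8, 2
base = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
policy = CascadePolicy(base, [AttributeModule(state_dim, act_dim) for _ in range(2)])
action = policy(torch.randn(1, state_dim))
```

A parallel variant would instead run the attribute modules side by side and fuse their outputs, which is one way to read the CAN/PAN distinction described above.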
Variation between the training and deployment domains is another major cause of LbC policy failures. In the last part of the dissertation, we focus on representation reasoning for autonomous-driving policy transfer under vehicle dynamics variation and external dynamics disturbances. We leverage interpretable kinematic-level representations to bridge the two domains. We propose two external adaptation modules, based on model-agnostic meta-learning (MAML) and disturbance-observer-based (DOB) robust control, to achieve one-shot or zero-shot adaptation of the kinematic-level representations to the deployment domain.
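For the DOB component, the classical structure estimates a lumped disturbance from the residual between the measured state and a nominal-model prediction, low-pass filters the estimate, and cancels it through the input channel. Below is a minimal discrete-time sketch for a scalar first-order nominal model; the coefficients, filter, and class name are our assumptions, not the controller designed in the dissertation.

```python
class DisturbanceObserver:
    """Discrete-time DOB for the nominal model x[k+1] = a*x[k] + b*u[k] + d[k]."""
    def __init__(self, a: float, b: float, alpha: float = 0.9):
        self.a, self.b = a, b
        self.alpha = alpha   # first-order low-pass (Q-filter) coefficient
        self.d_hat = 0.0     # filtered disturbance estimate

    def update(self, x_prev: float, u_prev: float, x_now: float) -> float:
        # residual between measurement and nominal prediction = lumped disturbance
        residual = x_now - (self.a * x_prev + self.b * u_prev)
        self.d_hat = self.alpha * self.d_hat + (1.0 - self.alpha) * residual
        return self.d_hat

    def compensate(self, u_nominal: float) -> float:
        # cancel the estimated disturbance, mapped through the input gain
        return u_nominal - self.d_hat / self.b

# toy closed loop: nominal a=0.9, b=0.5, unknown constant disturbance d=0.3
a, b, d = 0.9, 0.5, 0.3
dob = DisturbanceObserver(a, b)
x = 0.0
for _ in range(200):
    u = dob.compensate(-1.0 * x)   # proportional law plus disturbance cancellation
    x_next = a * x + b * u + d     # true plant with the disturbance
    dob.update(x, u, x_next)
    x = x_next
print(round(x, 3))  # settles near 0 as d_hat converges to d
```

The same robustness logic is what allows a policy trained on kinematic-level representations to survive deployment-time dynamics mismatch: the observer absorbs the mismatch as a disturbance rather than requiring the policy to be retrained.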