UC San Diego
State-based Policy Representation for Deep Policy Learning
- Author(s): Liu, Fangchen
- Advisor(s): Su, Hao
- et al.
Reinforcement Learning has achieved noticeable success in many fields, such as video game playing, continuous control, and the game of Go. One the other hand, current approaches usually require large sample complexity, and also lack the transferability to similar tasks. Imitation learning, also known as ``learning from demonstrations'', is possible to mitigate the former problem by providing successful experiences. However, current methods usually assume the expert and imitator are the same, which lack flexibility and robustness when the dynamics change.
Generalizability is the core of artificial intelligence. An agent should be able to apply its knowledge for novel tasks after training in similar environments, or providing related demonstrations. Given current observation, it should have the ability to predict what can happen (modeling), and what needs to happen (planning). This brings out challenges on how to represent the knowledge and how to utilize the knowledge by learning from interactions or demonstrations.
In this thesis, we will systematically study two important problems, the universal goal-reaching problem and the cross-morphology imitation learning problem, which are representative challenges in the field of reinforcement learning and imitation learning. Laying out our research work that attends to these challenging tasks unfolds our roadmap towards the holy-grail goal: make the agent generalizable by learning from observations and model the world.