eScholarship
Open Access Publications from the University of California

UC Riverside

UC Riverside Electronic Theses and Dissertations

Exocentric to Egocentric Transfer for Action Recognition

Creative Commons 'BY' version 4.0 license
Abstract

Egocentric vision captures the scene from the point of view of the camera wearer, while exocentric vision captures the overall scene context. Jointly modelling ego and exo views is a crucial step towards developing next-generation AI agents. The community has regained interest in the field of egocentric vision. While third-person and first-person views have each been thoroughly investigated, very few works study both together. Exocentric videos contain many relevant signals that are transferable to egocentric videos. We propose a multimodal LLM-based model that leverages large-scale exocentric information for the task of egocentric action recognition. This thesis also provides a broad overview of works that combine egocentric and exocentric vision.
