Attention Models for Activity Detection
- Author(s): Ulutan, Oytun
- Advisor(s): Manjunath, Bangalore S
Video action detection is an important part of video understanding and analysis. It has many possible applications, such as smart home environments that recognize user actions and respond accordingly; robotics applications such as autonomous cars, robot assistants, and selfie drones that follow gesture commands; and automated security systems that analyze the environment and assess events. This thesis focuses on introducing novel machine learning algorithms for video action detection. A central contribution of this research is the development of a context-aware attention model for atomic actions. An atomic action is a simple action that can be described with 1-3 words or atomic body movements, such as walking, drinking, or holding an object. While observing actions and activities, humans infer from the entire context: our perception depends on the surrounding objects, actors, and scene. Inspired by this, our Actor Conditioned Attention Maps (ACAM) model utilizes the surrounding scene for each actor and uses context to improve action/interaction detection. The modularity of the ACAM model allows us to detect, track, and recognize actions over extended time periods. We further extend this framework to detect complex activities that are composed of sequences of atomic actions. We demonstrate the effectiveness of our proposed methods on aerial videos and videos from camera networks.
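The core idea of conditioning attention on each actor can be illustrated as dot-product attention over scene context features, keyed by an actor's feature vector. The sketch below is a deliberate simplification of that idea, not the exact ACAM architecture from the thesis; the function name, feature shapes, and scaling choice are all assumptions for illustration.

```python
import numpy as np

def actor_conditioned_attention(context, actor):
    """Weight scene context features by their similarity to one actor's feature.

    context: (N, d) array of features from N surrounding scene locations.
    actor:   (d,)   feature vector for a single detected actor.
    Returns the (N,) attention map and the (d,) attended context feature.
    """
    d = actor.shape[0]
    scores = context @ actor / np.sqrt(d)   # similarity of each location to the actor
    scores -= scores.max()                  # numerical stability before softmax
    attn = np.exp(scores) / np.exp(scores).sum()
    attended = attn @ context               # context summary conditioned on this actor
    return attn, attended

# Toy example: 4 scene locations with 8-dim features and one actor.
rng = np.random.default_rng(0)
context = rng.standard_normal((4, 8))
actor = rng.standard_normal(8)
attn, attended = actor_conditioned_attention(context, actor)
print(attn.sum())  # attention weights sum to 1
```

Because the attention map is computed per actor, the same scene features yield a different context summary for each detected person, which matches the abstract's point that each actor's actions are interpreted relative to the surrounding objects, actors, and scene.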