The Semantic Analysis of Motion by Nonlinear Estimation Methods
- Author(s): Schlenzig, Jennifer
- et al.
The objective of motion understanding is to provide computers with the ability to identify motions captured by video cameras. The definition of what constitutes a motion varies across applications. For example, in a dance application the motions could be the possible movements in ballet (plie, pirouette, etc.). One characteristic of motions that does extend across applications is that they occur in the spatiotemporal domain, where both the evolving shape and the trajectory of the object can be expressed. Motion understanding differs from the classical problem of object tracking: the output of a tracking system is a trajectory defining the position of the object in time, whereas the output of a motion understanding system is a sequence of semantic labels (typically verbs) describing the motions identified in the video sequence. Motivating the development of motion understanding is the possibility of applications such as computerized sports analysis, immersive entertainment, intelligent surveillance systems, and intuitive machine interfaces.
Previous attempts at motion understanding have typically concentrated on interpreting the trajectories of feature points extracted from each image. Common problems such as occlusion of the features and changes in lighting conditions can hinder the feature-finding algorithm. Then, once features have been found in successive images, a correspondence must be established between them. This requires assumptions such as smoothness of motion, which may not necessarily be relevant. Once the trajectory has been obtained, a measure of similarity must be found that allows for the expected temporal and spatial variations across instantiations of the motions while still providing the necessary discrimination.
The work presented here overcomes these difficulties. Although object trajectories may be important for a given application, and hence computed, motion understanding itself does not depend on them. In fact, we are able to interpret the motions without using any of the traditional motion analysis techniques for image sequences. Instead, the current motion is determined from a symbol stream extracted from the image sequence. The symbols are high-level descriptions of the object and are less susceptible to noise than low-level features such as edges and bright spots. Each incoming symbol is used to update an estimate of the probability of occurrence for each of the possible motions. The motion with the maximum probability is then identified as the current motion.
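The update-then-select loop described above can be illustrated with a minimal sketch. It is not the estimation method developed in this work, which the abstract does not specify; it is a plain recursive Bayesian filter over a discrete symbol stream, included only to make the idea concrete. The symbol names, the per-motion symbol likelihoods, and the `stay` probability are all hypothetical values chosen for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical likelihoods P(symbol | motion); not taken from the source.
Likelihoods = Dict[str, Dict[str, float]]


def update_posterior(prior: Dict[str, float], likelihoods: Likelihoods,
                     symbol: str, floor: float = 1e-6) -> Dict[str, float]:
    """One Bayes step: weight each motion by how well it explains the symbol."""
    # Unseen symbols get a small floor so no motion is ruled out permanently.
    unnorm = {m: p * likelihoods[m].get(symbol, floor) for m, p in prior.items()}
    total = sum(unnorm.values())
    return {m: v / total for m, v in unnorm.items()}


def label_stream(symbols: Sequence[str], likelihoods: Likelihoods,
                 stay: float = 0.9) -> List[str]:
    """Return the most probable motion after each observed symbol."""
    motions = list(likelihoods)
    n = len(motions)
    belief = {m: 1.0 / n for m in motions}  # uniform initial estimate
    labels = []
    for s in symbols:
        if n > 1:
            # Assumed prediction step: with probability (1 - stay) the motion
            # switches, uniformly, to one of the other motions.
            belief = {m: stay * belief[m] + (1 - stay) * (1 - belief[m]) / (n - 1)
                      for m in motions}
        belief = update_posterior(belief, likelihoods, s)
        labels.append(max(belief, key=belief.get))  # maximum-probability motion
    return labels


# Illustrative ballet example with made-up symbols and likelihoods.
likelihoods = {
    "plie": {"knees_bent": 0.7, "upright": 0.25, "turning": 0.05},
    "pirouette": {"knees_bent": 0.1, "upright": 0.2, "turning": 0.7},
}
stream = ["knees_bent", "knees_bent", "upright", "turning", "turning"]
labels = label_stream(stream, likelihoods)
print(labels)
```

With these assumed numbers the label stays at "plie" for the first three symbols and switches to "pirouette" once "turning" is observed. The `stay` parameter controls how quickly the estimate can move from one motion to another; it is a design choice of this sketch, not a quantity from the source.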