As networks of video cameras are deployed in a growing number of applications, modeling and inference strategies for video networks have attracted increasing interest. Many challenging problems arise, including (i) traditional computer vision challenges in tracking and recognition, such as robustness to pose, illumination, occlusion, and clutter, and the recognition of objects and activities; (ii) aggregating local information to obtain stable, long-term tracks of objects; (iii) cooperative camera control algorithms for multi-resolution target acquisition; (iv) distributed processing and scene analysis; and (v) communication in a distributed manner. The overall aim of this thesis is to study the core issues in network-centric processing, control, and communication in a multi-camera network. These include frameworks for tracking people in video through changes of activity, tracking in a non-overlapping camera network, decentralized control and tracking in a camera network, and distributed video compression.
We address the problem of tracking in camera networks by dividing it into two parts: tracking people through changes in activity, and tracking multiple targets in a network of non-overlapping cameras. For the first, we present a novel framework for tracking a long sequence of human activities, including the time instants at which one activity changes to the next, using a non-linear switching dynamical feedback system. For the second, we propose a multi-objective optimization framework that combines short-term feature correspondences across cameras with long-term feature dependency models.
We also address the problem of decentralized, cooperative control of a camera network and distributed multi-target tracking in such a network of self-configuring pan-tilt-zoom cameras. Control cannot be based on a separate analysis of the sensed video in each camera; the cameras must act collaboratively to acquire multiple targets at different resolutions. Our research focuses on developing accurate and efficient camera control algorithms for such scenarios using game theory. To track the targets as they move through the area covered by the cameras, we propose a special application of the distributed estimation algorithm known as the Kalman-Consensus filter, through which each camera comes to a consensus with its neighboring cameras about the actual state of the target.
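The consensus idea behind this filter can be illustrated with a minimal sketch. The example below is not the Kalman-Consensus filter itself: it omits the Kalman prediction and measurement-update steps and shows only a consensus-averaging iteration, in which each camera repeatedly nudges a scalar estimate toward those of its neighbors. The camera names, the network topology, the step size `epsilon`, and the initial values are all illustrative assumptions.

```python
def consensus_step(estimates, neighbors, epsilon):
    """Apply one synchronous consensus update to every camera's estimate."""
    updated = {}
    for cam, value in estimates.items():
        # Sum of differences between each neighbor's estimate and this camera's own.
        disagreement = sum(estimates[n] - value for n in neighbors[cam])
        # Move a small step toward the neighbors' estimates.
        updated[cam] = value + epsilon * disagreement
    return updated


def run_consensus(estimates, neighbors, epsilon=0.3, iterations=50):
    """Repeat the consensus update a fixed number of times."""
    for _ in range(iterations):
        estimates = consensus_step(estimates, neighbors, epsilon)
    return estimates


# Hypothetical three-camera line network: A - B - C (assumed topology).
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
# Assumed initial scalar estimates of one target coordinate at each camera.
initial = {"A": 10.0, "B": 12.0, "C": 17.0}

final = run_consensus(initial, neighbors)
print(final)
```

Because the assumed network is connected and undirected and `epsilon` is smaller than the reciprocal of the maximum node degree, every camera's estimate approaches the network average (13.0 for these values), even though no camera communicates with anything but its immediate neighbors.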
Finally, we present a framework for multi-terminal video compression (MTVC) that exploits the geometric constraints between cameras with overlapping fields of view and then applies distributed source coding to corresponding points in two or more views. The proposed method consists of two parts: a Distributed Motion Estimation (DME) algorithm and a Distributed Source Coding (DSC) algorithm.