In many intelligent systems, a network of agents collaboratively perceives
the environment for better and more efficient situation awareness. As these
agents often have limited resources, it can be greatly beneficial to identify
the content overlap among camera views from different agents and leverage it
to reduce the processing, transmission, and storage of redundant or
unimportant video frames. This paper presents a consensus-based
distributed multi-agent video fast-forwarding framework, named DMVF, which
fast-forwards multi-view video streams collaboratively and adaptively. In our
framework, each camera view is handled by a reinforcement learning-based
fast-forwarding agent, which periodically chooses from multiple strategies to
selectively process video frames and transmits the selected frames at
adjustable paces. During every adaptation period, each agent communicates with
a number of neighboring agents, evaluates the importance of the frames selected
by itself and by its neighbors, refines this evaluation together with other
agents via a system-wide consensus algorithm, and uses the refined evaluation to
decide its strategy for the next period. Compared with existing approaches on
the real-world surveillance video dataset VideoWeb, our method significantly
improves the coverage of important frames and reduces the number of frames
processed in the system.
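
As a rough illustration of the per-period consensus step, the sketch below shows a standard average-consensus update on per-view importance estimates. It is a minimal sketch, not the paper's implementation: the communication graph, the scalar importance values, the mixing weight, and the strategy thresholds are all assumptions introduced here for illustration.

import numpy as np

# Hypothetical setup (not from the paper): 4 camera agents; agent i keeps a
# local estimate of how important each view's selected frames are this period.
num_agents = 4
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}      # assumed communication graph
rng = np.random.default_rng(0)
true_importance = rng.random(num_agents)                 # placeholder "true" values
estimates = {i: true_importance + 0.2 * rng.standard_normal(num_agents)
             for i in range(num_agents)}                 # noisy local evaluations

def consensus_step(estimates, neighbors, alpha=0.5):
    """One synchronous average-consensus update: each agent moves its estimate
    toward the mean of its neighbors' estimates (alpha is an assumed mixing weight)."""
    return {
        i: (1 - alpha) * est
           + alpha * np.mean([estimates[j] for j in neighbors[i]], axis=0)
        for i, est in estimates.items()
    }

# A few rounds let the per-agent evaluations approach system-wide agreement.
for _ in range(15):
    estimates = consensus_step(estimates, neighbors)

# Each agent maps the agreed importance of its own view to a fast-forwarding
# pace for the next adaptation period; the thresholds below are placeholders.
for i in range(num_agents):
    importance = estimates[i][i]
    pace = "slow" if importance > 0.66 else "normal" if importance > 0.33 else "fast"
    print(f"agent {i}: importance {importance:.2f} -> '{pace}' fast-forwarding")

In this toy version, all agents converge to nearly identical importance vectors after a handful of rounds, which is the property the system-wide refinement step relies on; the actual framework couples this agreement with reinforcement learning-based frame selection.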