Distributed Camera Networks

Over the last few decades, video processing has become a convenient and widely used tool to assist, protect and simplify people's daily lives in areas such as surveillance, domotics, elderly care, traffic monitoring and video conferencing. Cameras are increasingly widespread in airports, cities and even indoor environments, where they provide visual information about a scene in order to monitor and analyze certain areas or to track individuals for specific purposes.

One major problem with these cameras is the difficulty of dealing with the tremendous amount of information contained in numerous video streams. Currently, the analysis often consists of interpretation by human operators. Moreover, large amounts of data are recorded automatically and stored for a certain period so that they can be processed or analyzed manually later on. However, it is often impractical, if not impossible, to quickly find the information of interest in large volumes of video recordings from multiple cameras.

All of the aforementioned applications can benefit greatly from combining the information from multiple cameras. In contrast to a single-camera approach, multi-camera networks are more robust and can handle difficult situations in which one camera is not sufficient, for instance when objects are occluded by corners, other objects or obstacles. Multi-camera systems have therefore become increasingly popular in recent years. However, processing multiple images within a camera network is not simple, given the vast amount of data.

Our multi-camera system is based on a decentralized tracking approach. In general, multi-camera tracking approaches can be categorized as centralized, decentralized or distributed. Centralized approaches transmit all video streams to one or more fusion centers and process the video there. The fusion centers must be very powerful computers and must sustain high communication bandwidths. Centralized processing of multiple video streams therefore creates not only a computing bottleneck but also a communication bottleneck. Decentralized and distributed tracking approaches group cameras into clusters, which communicate with a local fusion center (decentralized) or with each other (distributed). This allows the construction of very large smart camera networks without straining network and server resources. It also requires only simple processing in the smart cameras, leaving precious resources for other video processing algorithms if needed. We have built such a multi-camera system with one fusion center and six smart cameras.
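The communication argument above can be made concrete with a back-of-the-envelope calculation. The sketch below compares sending uncompressed video from six cameras with sending a compact per-object description instead. Only the number of cameras comes from the text; the resolution, frame rate, object count and description size are hypothetical values chosen purely for illustration.

```python
def stream_bandwidth(width, height, bytes_per_pixel, fps):
    """Bytes per second needed to send uncompressed video from one camera."""
    return width * height * bytes_per_pixel * fps


def description_bandwidth(objects, bytes_per_object, fps):
    """Bytes per second needed to send one compact description per object per frame."""
    return objects * bytes_per_object * fps


NUM_CAMERAS = 6  # as in the system described above

# Hypothetical figures: 640x480 RGB video at 25 frames per second.
raw_total = NUM_CAMERAS * stream_bandwidth(640, 480, 3, 25)

# Hypothetical figures: 5 tracked objects, 32 bytes per object description.
desc_total = NUM_CAMERAS * description_bandwidth(5, 32, 25)

print(f"centralized, raw video:     {raw_total} bytes/s")
print(f"smart cameras, descriptions: {desc_total} bytes/s")
```

In practice video would be compressed before transmission, which narrows the gap, but the load on a fusion center that receives only object descriptions still grows with the number of tracked objects and not with the image resolution.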

In our research we followed two different approaches: a multi-camera tracking approach based on occupancy maps and a distributed multi-camera tracking approach with a feedback loop. The former can be considered a centralized approach, although part of the processing is already carried out on the smart cameras. The latter is a decentralized processing architecture with a feedback loop, in which the most compute-intensive video processing is performed within the smart cameras. In fact, because the requirements on the fusion center are so low, it is even possible to run the fusion center algorithms on each camera and thus obtain a distributed architecture. We focus on the research and development of these two tracking approaches and explain them in detail.

Multi-camera tracking approach based on occupancy maps

We address the problem of tracking multiple individuals by using occupancy maps in a network of overlapping cameras. In contrast to other approaches, which compute, for example, optimal tracks from a Probabilistic Occupancy Map (POM) using a greedy search strategy based on dynamic programming, we combine Bayesian filtering strategies with occupancy maps. Furthermore, our approach obtains an estimate for each person on a frame-by-frame basis. We focus on low-level processing, resulting in a real-time system that could potentially run on smart cameras.
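To illustrate how Bayesian filtering can be combined with an occupancy map, the following is a minimal sketch and not the method used in our system. It assumes a single person, a discretized ground plane, a simple "stay or step to a neighbouring cell" motion model, and cameras whose evidence is conditionally independent. The function names `predict` and `update`, the grid size and all likelihood values are hypothetical.

```python
import numpy as np


def predict(belief, stay=0.6):
    """Prediction step: the person stays in a cell or moves to a 4-neighbour."""
    move = (1.0 - stay) / 4.0
    padded = np.pad(belief, 1, mode="edge")
    # Sum of the belief in the four neighbouring cells of every cell.
    spread = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
              padded[1:-1, :-2] + padded[1:-1, 2:])
    return stay * belief + move * spread


def update(predicted, camera_likelihoods):
    """Update step: multiply in each camera's per-cell likelihood and normalize."""
    posterior = predicted.copy()
    for likelihood in camera_likelihoods:
        posterior = posterior * likelihood
    total = posterior.sum()
    if total == 0.0:
        # No camera supports any cell: fall back to the normalized prediction.
        return predicted / predicted.sum()
    return posterior / total


# Hypothetical 5x5 ground-plane grid with a uniform initial belief.
belief = np.full((5, 5), 1.0 / 25.0)

# Camera 1 sees strong evidence at cell (2, 3).
cam1 = np.full((5, 5), 0.1)
cam1[2, 3] = 0.9

# Camera 2 cannot separate cells (2, 3) and (1, 3) along its viewing ray.
cam2 = np.full((5, 5), 0.2)
cam2[2, 3] = 0.8
cam2[1, 3] = 0.8

belief = update(predict(belief), [cam1, cam2])
estimate = np.unravel_index(np.argmax(belief), belief.shape)
print("most likely cell:", tuple(int(i) for i in estimate))
```

Running the two steps once per frame yields a per-frame estimate, and the ambiguity of one camera is resolved by the other because only the cell supported by both views keeps a high posterior.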


Distributed multi-camera tracking approach with a feedback loop

In this research, we focus on real-time, low-latency and scalable tracking of multiple people. In our system, all low-level video processing is performed on smart cameras. The smart cameras transmit a compact high-level description of moving objects to the fusion center, which fuses these data using a Bayesian approach. Moreover, a feedback loop ensures that each smart camera is kept up to date on the most recent locations and motion states of the tracked individuals. We evaluate the performance of the proposed system, in terms of precision and accuracy, in indoor and meeting scenarios where individuals are often occluded by other people or furniture. Furthermore, we compare our approach to state-of-the-art methods and show that our system performs at least as well as other methods. In addition, our system is capable of running in real time and therefore produces results immediately.
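As an illustration of what Bayesian fusion at the fusion center can look like, the sketch below fuses ground-plane position estimates from several cameras by multiplying Gaussian densities. This is an assumption made for the example, not a description of our actual fusion algorithm: each camera is assumed to report a 2D position with a covariance matrix, and the reports are assumed to be independent. The function name `fuse_gaussians` and all numeric values are hypothetical.

```python
import numpy as np


def fuse_gaussians(means, covariances):
    """Fuse independent Gaussian position estimates (product of Gaussians)."""
    information = np.zeros((2, 2))
    information_vector = np.zeros(2)
    for mean, cov in zip(means, covariances):
        cov_inv = np.linalg.inv(np.asarray(cov, dtype=float))
        information += cov_inv
        information_vector += cov_inv @ np.asarray(mean, dtype=float)
    fused_cov = np.linalg.inv(information)
    fused_mean = fused_cov @ information_vector
    return fused_mean, fused_cov


# Hypothetical reports from two smart cameras for the same person (in metres).
means = [np.array([0.0, 0.0]), np.array([2.0, 2.0])]
covariances = [np.eye(2), np.eye(2)]

fused_mean, fused_cov = fuse_gaussians(means, covariances)
print("fused position:", fused_mean)
print("fused covariance:\n", fused_cov)
```

A camera with a poor view reports a larger covariance and therefore contributes less to the fused position. In a feedback loop of the kind described above, the fused state would then be sent back to every camera so that each one starts the next frame from the same estimate.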