Embedded early vision techniques for efficient background modeling and midground detection
Valentine, Brian Evans
An automated vision system performs critical tasks in video surveillance while decreasing costs and increasing efficiency. It can provide high-quality scene monitoring without the limitations of human distraction and fatigue. Advances in embedded processors, wireless networks, and imager technology have enabled computer vision systems to be deployed pervasively in stationary surveillance monitors, hand-held devices, and vehicular sensors. However, the size, weight, power, and cost requirements of these platforms present a great challenge in developing real-time systems. This dissertation explores the development of background modeling algorithms for surveillance on embedded platforms. Our contributions are as follows:

- An efficient pixel-based adaptive background model, called multimodal mean, which produces results comparable to the widely used mixture of Gaussians multimodal approach at a much reduced computational cost and with greater control of occluded object persistence.
- A novel and efficient chromatic clustering-based background model for embedded vision platforms that leverages the color uniformity of large, permanent background objects to yield significant speedups in execution time.
- A multi-scale temporal model for midground analysis, which provides a means to "tune in" to changes in the scene beyond the standard background/foreground framework, based on user-defined temporal constraints.

Multimodal mean reduces instruction complexity through the use of fixed integer arithmetic and periodic long-term adaptation that occurs once every d frames. When combined with fixed thresholding, it runs 6.2 times faster than the mixture of Gaussians method while using 18% less storage. Furthermore, fixed thresholding compares favorably to standard deviation thresholding, with a difference in error of less than five percent on scenes with stable lighting conditions and modest multimodal activity.
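To make the ingredients named above concrete (integer arithmetic, fixed thresholding, and long-term adaptation once every d frames), the following is a minimal single-pixel, grayscale sketch. It is an illustration under stated assumptions, not the dissertation's implementation: the names `Cell` and `PixelModel`, the cell-replacement rule, the halving step, and the rule that any unmatched sample is foreground are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    total: int  # running integer sum of the intensities matched to this mode
    count: int  # number of samples matched to this mode


@dataclass
class PixelModel:
    max_cells: int = 4       # assumed number of modes kept per pixel
    threshold: int = 30      # fixed match threshold (assumed value)
    decay_period: int = 400  # "d": frames between long-term adaptation steps
    cells: List[Cell] = field(default_factory=list)
    frame: int = 0

    def update(self, value: int) -> bool:
        """Fold one sample into the model; return True if it looks like foreground."""
        self.frame += 1
        matched = None
        for cell in self.cells:
            # Division-free test of |value - mean| <= threshold,
            # where mean = total / count.
            if abs(value * cell.count - cell.total) <= self.threshold * cell.count:
                matched = cell
                break
        if matched is not None:
            matched.total += value
            matched.count += 1
            foreground = False
        else:
            if len(self.cells) >= self.max_cells:
                # Assumed policy: evict the mode with the least support.
                self.cells.remove(min(self.cells, key=lambda c: c.count))
            self.cells.append(Cell(total=value, count=1))
            foreground = True
        if self.frame % self.decay_period == 0:
            # Assumed long-term adaptation: halve sums and counts with shifts so
            # that old evidence fades. The mean is only approximately preserved
            # when the count is odd.
            for cell in self.cells:
                if cell.count > 1:
                    cell.total >>= 1
                    cell.count >>= 1
        return foreground
```

In this sketch a full-frame model would hold one `PixelModel` per pixel (and per color channel); the per-frame work is a handful of integer multiplies, adds, and compares, which is the kind of instruction mix the abstract contrasts with the mixture of Gaussians method.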
The chromatic clustering-based approach to optimized background modeling takes advantage of the color distributions in large, permanent background objects, such as a road, building, or sidewalk, to speed up execution. It abstracts their colors to a small color palette and suppresses their adaptation during processing. When run on a representative embedded platform, it reduces storage usage by 58% and improves execution speed by 45%.

Multiscale temporal modeling for midground analysis presents a unified approach to scene analysis that can be applied to several application domains. It extends scene analysis from the standard background/foreground framework to one that includes a user-defined temporal saliency window for midground objects. When applied to stationary object detection, the midground model provides accurate results at low sampling frame rates (~1 fps) while using only 18 Mbytes of storage and 15 Mops/sec of processing throughput.
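The idea of a user-defined temporal window can be illustrated with a small sketch. This shows only the general concept of labeling a scene change by how long it has persisted; the function name `classify_by_persistence`, its parameters, and the three-way rule are assumptions for illustration and are not taken from the dissertation's multiscale model.

```python
def classify_by_persistence(stationary_seconds: float,
                            window_start: float,
                            window_end: float) -> str:
    """Label a scene change by how long it has remained stationary.

    window_start and window_end are the hypothetical user-defined bounds
    (in seconds) of the midground saliency window.
    """
    if window_start > window_end:
        raise ValueError("window_start must not exceed window_end")
    if stationary_seconds < window_start:
        return "foreground"  # too recent to be of midground interest
    if stationary_seconds <= window_end:
        return "midground"   # persists within the user's window
    return "background"      # persisted long enough to be absorbed
```

For example, with a window of 30 to 600 seconds, a bag left unattended for two minutes would be labeled midground, while a passing pedestrian would remain foreground.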