Efficient Calculation of Frame Level Complex Predicates in Video Analytics
The field of video analytics focuses on extracting useful information from video. Consider a scenario in which we have a large amount of video from a traffic camera at a busy intersection and we are looking for a black sedan. State-of-the-art object detectors such as Faster R-CNN rely on computationally expensive methods such as convolutional neural networks (CNNs), which analyze a frame of video and estimate how many instances of the object of interest it contains and where each instance is located. The most basic approach to this problem is to run the object detector on every frame of the video and return to the user the frames that contain at least one black sedan. However, this approach is impractical on longer videos because CNN inference is too slow to apply to every frame. Instead, the number of frames evaluated by the object detector must be limited. Research in this area focuses on developing strategies for doing so, such as sampling, filtering, proxy models, and clustering.
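
To make the trade-off concrete, the following is a minimal sketch, not the method developed in this work, that contrasts the exhaustive baseline with the simplest of the strategies listed above, uniform sampling. Everything in it is an illustrative assumption: the function names `exhaustive_search` and `sampled_search`, the `stride` parameter, and the toy detector, which stands in for an expensive CNN by reading a precomputed object count from each "frame".

```python
from typing import Callable, List, Sequence, Tuple

# Toy stand-in (assumption): each "frame" is just the true number of
# matching objects it contains, so no real video decoding is needed.
Frame = int
Detector = Callable[[Frame], int]


def exhaustive_search(frames: Sequence[Frame], detector: Detector) -> Tuple[List[int], int]:
    """Baseline: run the detector on every frame.

    Returns the indices of frames with at least one detection and the
    number of detector calls made.
    """
    hits = [i for i, frame in enumerate(frames) if detector(frame) >= 1]
    return hits, len(frames)


def sampled_search(frames: Sequence[Frame], detector: Detector, stride: int) -> Tuple[List[int], int]:
    """Uniform sampling: run the detector on every `stride`-th frame only.

    Cheaper than the baseline, but frames between samples are never
    inspected, so some matching frames can be missed.
    """
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    hits: List[int] = []
    calls = 0
    for i in range(0, len(frames), stride):
        calls += 1
        if detector(frames[i]) >= 1:
            hits.append(i)
    return hits, calls


def toy_detector(frame: Frame) -> int:
    """Hypothetical detector: returns the object count stored in the frame."""
    return frame


# Twelve toy frames; nonzero entries are frames containing the target object.
video = [0, 0, 1, 1, 1, 0, 0, 0, 2, 2, 0, 0]
all_hits, all_calls = exhaustive_search(video, toy_detector)
some_hits, some_calls = sampled_search(video, toy_detector, stride=3)
```

On this toy input the sampled variant makes a third of the detector calls but reports only some of the matching frames, which is the kind of cost versus recall tension that filtering, proxy models, and clustering aim to manage more carefully.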