Chetan Bhole
A9.com (Amazon's search engine subsidiary), USA
Title: Automated Person Segmentation in Unconstrained Video
Biography
Biography: Chetan Bhole
Abstract
Segmenting people is an important problem in computer vision, with uses in image understanding, graphics, security applications, sports analysis, education, and other areas. In this talk, I will summarize prior work in this area and our contributions. We have focused on automatically segmenting a person from challenging video sequences. To keep the solution general, we place no constraints on camera viewpoint, camera motion, or the movements of the person in the scene. Our approach uses the most confident predictions of a pose (stick-figure) detector in key frames as anchors that help guide the segmentation of other, more challenging frames in the video. Because state-of-the-art pose detectors are unreliable on general frames, only the highest-confidence pose detections (the key frames) are used. Features such as color, position, and optical flow are extracted from the key frames, and multiple conditional random fields (CRFs) process the video in batches of frames: 2D CRFs produce detailed key-frame segmentations, and 3D CRFs propagate segmentations to the entire sequence of frames in each batch. Location information derived from the pose detector is also used to refine the results. Importantly, our method requires no hand-labeled segmentation training data. We discuss variants of the model and compare against prior work. We also contribute our evaluation data to the community to facilitate further experiments.
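To make the key-frame idea concrete, here is a minimal sketch of how frames might be chosen as anchors and how the remaining frames might be grouped into batches around them. It is an illustration under stated assumptions, not the method presented in the talk: the function names, the `top_fraction` parameter, and the nearest-key-frame batching rule are all hypothetical, and the abstract does not say how confidence is thresholded or how batches are formed. The CRF inference itself is not shown.

```python
from typing import List, Tuple


def select_key_frames(confidences: List[float], top_fraction: float = 0.1) -> List[int]:
    """Return the indices of the most confident frames, in temporal order.

    `confidences` holds one pose-detector score per frame (hypothetical input).
    `top_fraction` is an assumed knob; at least one frame is always kept.
    """
    if not confidences:
        return []
    k = max(1, round(len(confidences) * top_fraction))
    ranked = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    return sorted(ranked[:k])


def batches_around_key_frames(num_frames: int, key_frames: List[int]) -> List[Tuple[int, int, int]]:
    """Group frames into contiguous batches, one per key frame.

    Each frame joins its nearest key frame (ties go to the earlier one).
    Returns (start, end_exclusive, key_frame) triples covering all frames.
    This batching rule is an assumption made for illustration.
    """
    batches = []
    start = 0
    for j, kf in enumerate(key_frames):
        if j + 1 < len(key_frames):
            # Boundary at the midpoint between this key frame and the next.
            end = (kf + key_frames[j + 1]) // 2 + 1
        else:
            end = num_frames
        batches.append((start, end, kf))
        start = end
    return batches


# Toy per-frame confidences for a 10-frame clip (made-up values).
scores = [0.2, 0.9, 0.3, 0.1, 0.4, 0.95, 0.2, 0.3, 0.1, 0.5]
keys = select_key_frames(scores, top_fraction=0.2)
plan = batches_around_key_frames(len(scores), keys)
```

In a pipeline of the kind the abstract describes, each batch would then be handed to a segmentation step that starts from its key frame and propagates outward; keeping batches contiguous keeps that propagation local in time.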