Orientation-Aware Scene Understanding for Mobile Cameras
Abstract
We present an approach that lets anyone quickly teach their smartphone to understand the visual world around them. We achieve this visual scene understanding by leveraging a camera-phone's inertial sensors to produce faster and more accurate automatic labeling of image regions into semantic classes (e.g., sky, tree, building). We focus on letting a user train our system from scratch while out in the real world, annotating image regions in situ as training images are captured on a mobile device, which makes it possible to recognize new environments and new semantic classes on the fly. We show that our approach outperforms existing methods while performing data collection, annotation, feature extraction, and image segment classification all on the same mobile device.
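To make the core idea concrete, the following is a minimal sketch (not the paper's implementation) of augmenting per-region appearance features with orientation cues derived from the phone's inertial sensors before classifying each region. Names such as orientation_features, gravity_cam, and the toy class list are illustrative assumptions, not the authors' API, and the coordinate convention for the gravity vector is assumed.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def orientation_features(region_centroids, gravity_cam, focal_px, principal_pt):
    """Per-region orientation cues from the IMU gravity vector.

    region_centroids: (N, 2) pixel centroids of image segments.
    gravity_cam:      (3,) gravity direction rotated into the camera frame (assumed convention).
    Returns an (N, 2) array: camera pitch (shared by all regions) and each
    region's signed vertical offset from the IMU-estimated horizon line.
    """
    g = gravity_cam / np.linalg.norm(gravity_cam)
    pitch = np.arcsin(np.clip(-g[2], -1.0, 1.0))              # tilt of the optical axis
    horizon_y = principal_pt[1] + focal_px * np.tan(pitch)    # horizon row in the image
    offset = (region_centroids[:, 1] - horizon_y) / focal_px  # above/below horizon
    return np.column_stack([np.full(len(region_centroids), pitch), offset])

# Toy usage with synthetic data: appearance descriptors plus orientation cues
# feed a lightweight classifier, as could run on-device.
rng = np.random.default_rng(0)
appearance = rng.normal(size=(200, 16))            # e.g., color/texture descriptors per segment
centroids = rng.uniform(0, 480, size=(200, 2))     # segment centroids in pixels
gravity = np.array([0.1, -9.7, 1.0])               # accelerometer reading in the camera frame
labels = rng.integers(0, 3, size=200)              # 0 = sky, 1 = tree, 2 = building

X = np.hstack([appearance, orientation_features(centroids, gravity, 500.0, (320, 240))])
clf = RandomForestClassifier(n_estimators=50).fit(X, labels)
print(clf.predict(X[:5]))

The orientation cues are cheap to compute from readings the device already provides, which is consistent with the goal of keeping annotation, feature extraction, and classification on the mobile device.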