Vision-based autonomous navigation in medium level representation
Hwang, Jin Ha
Autonomous navigation requires a mobile robot to operate in cluttered, unstructured environments at high speed while gathering data efficiently. Given payload constraints and long-range sensing requirements, vision-based scene analysis is the preferred sensing modality for a modern navigation system. For outdoor navigation, stereo vision is favored because it improves detection range by adding views of the scene and provides indirect access to depth information.

However, state-of-the-art approaches use stereo camera observations by converting disparity images into a world representation, such as a 3D point cloud or a 2D occupancy grid, and they fail to deal with sensor noise. The computational burden of performing dense stereo matching and updating observations in the world representation, together with the difficulty of handling sensor error, forces modern approaches to update the world representation on a per-frame basis, which often reduces the degree of autonomy achievable in a navigation task. Moreover, updating observations in a 2D world representation often requires simplifying an object's geometry to a circle or a rectangle [4, 5] for obstacle expansion. The overly inflated regions produced by such simplified geometry force the navigation system to perform overly conservative path planning.

In this paper, we propose an alternative scene perception and planning approach for stereo cameras in a medium-level representation called the Stixel World. Instead of converting the local scene observation into a world representation, obstacle detection and path-planning computation remain in the perception space. We construct the medium-level representation by detecting every possible vertical obstacle in each image column, using a detection window for the Stixels that is reduced by ground-plane estimation.
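The per-column detection described above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name, thresholds, and the idea of scanning a single disparity column against an expected ground-plane disparity are illustrative assumptions. It finds the nearest obstacle base as the lowest pixel that sticks out of the ground model, then grows the stixel upward while disparity stays consistent:

```python
def column_stixel(disp_col, ground_disp, base_thresh=1.0, memb_thresh=1.0):
    """Find the nearest vertical obstacle (stixel) in one image column.

    disp_col    : disparity values for the column, index 0 = top image row
    ground_disp : expected ground-plane disparity per row (same length),
                  e.g. from a fitted ground-plane model
    Returns (top_row, base_row, disparity) or None if no obstacle.
    Illustrative sketch only; thresholds are hypothetical.
    """
    h = len(disp_col)
    # Scan from the bottom of the image upward for the first pixel whose
    # disparity exceeds the ground model -> obstacle base row.
    base = None
    for v in range(h - 1, -1, -1):
        if disp_col[v] - ground_disp[v] > base_thresh:
            base = v
            break
    if base is None:
        return None  # column follows the ground plane everywhere
    d_obj = disp_col[base]
    # Grow upward while disparity remains consistent with the obstacle,
    # i.e. while pixels plausibly belong to the same vertical surface.
    top = base
    while top > 0 and abs(disp_col[top - 1] - d_obj) < memb_thresh:
        top -= 1
    return top, base, d_obj
```

Because the bottom-up scan stops at the ground-plane boundary, each column only needs matching costs inside the reduced detection window rather than a full dense disparity map, which is the motivation for the windowed construction described above.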
Instead of computing the full disparity map, we utilize a cost-volume matrix approach. We also propose a method that directly projects the robot model into the perception space, so that the robot's 3D physical geometry does not need to be simplified, unlike modern perception-based local planning approaches that perform obstacle expansion in the image space [6, 7]. Furthermore, we propose a method to integrate this local/reactive obstacle-avoidance controller with a global planner so that the robot can navigate to a given destination both safely and efficiently. We demonstrate these capabilities on a simulated mobile robot across a variety of environments for quantitative evaluation, and then qualitatively explore the limitations and possibilities of vision-based autonomous navigation in the medium-level representation by implementing our approach on a real mobile robot. Experimental results show the robustness and effectiveness of our method in comparison with traditional 2D Cartesian navigation planners as well as a depth-image-based path planner.
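The idea of projecting the robot model into perception space can be sketched as follows. This is a toy illustration under strong assumptions, not the thesis's method: a box-shaped robot, a standard pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), the robot base at camera height, and a stereo baseline for converting depth to disparity. The point is that a candidate robot pose maps to an image region plus an expected disparity, so collision checking reduces to comparing detected obstacle disparities inside that region, with no geometric inflation in a 2D grid:

```python
def robot_mask_in_image(Z, X, robot_w, robot_h, fx, fy, cx, cy, baseline):
    """Project a box-shaped robot model at camera-frame position (X, Z)
    into the image plane (pinhole model; all parameters hypothetical).

    Returns (u_min, u_max, v_top, v_bottom, disparity).
    Any detected obstacle inside this image region with a LARGER
    disparity (i.e. closer than Z) would intersect the robot volume.
    """
    # Horizontal extent of the robot footprint at depth Z.
    u_min = int(fx * (X - robot_w / 2) / Z + cx)
    u_max = int(fx * (X + robot_w / 2) / Z + cx)
    # Vertical extent; robot base assumed at camera height (toy assumption).
    v_top = int(cy - fy * robot_h / Z)
    v_bottom = int(cy)
    # Disparity a stereo pair would observe for a surface at depth Z.
    disparity = fx * baseline / Z
    return u_min, u_max, v_top, v_bottom, disparity
```

Because the projected region shrinks with depth exactly as real obstacles do in the image, the robot's true 3D extent is respected at every candidate pose instead of being approximated by an inflated circle or rectangle.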