Toward Lifelong Visual Localization and Mapping

Hordur Johannsson, Ph.D., 2013
John Leonard, Advisor

Mobile robotic systems operating over long durations require algorithms that are robust and scale efficiently over time as sensor information is continually collected. For mobile robots one of the fundamental problems is navigation; which requires the robot to have a map of its environment, so it can plan its path and execute it. Having the robot use its perception sensors to do simultaneous localization and mapping (SLAM) is beneficial for a fully autonomous system. Extending the time horizon of operations poses problems to current SLAM algorithms, both in terms of robustness and temporal scalability. To address this problem we propose a reduced pose graph model that significantly reduces the complexity of the full pose graph model. Additionally we develop a SLAM system using two different sensor modalities: imaging sonars for underwater navigation and vision based SLAM for terrestrial applications.

Underwater navigation is one application domain that benefits from SLAM, where access to a global positioning system (GPS) is not possible. In this thesis we present SLAM systems for two underwater applications. First, we describe our implementation of real-time imaging-sonar aided navigation applied to in-situ autonomous ship hull inspection using the hovering autonomous underwater vehicle (HAUV). In addition we present an architecture that enables the fusion of information from both a sonar and a camera system. The system is evaluated using data collected during experiments on SS Curtiss and USCGC Seneca. Second, we develop a feature-based navigation system supporting multi-session mapping, and provide an algorithm for re-localizing the vehicle between missions. In addition we present a method for managing the complexity of the estimation problem as new information is received. The system is demonstrated using data collected with a REMUS vehicle equipped with a BlueView forward-looking sonar.

The model we use for mapping builds on the pose graph representation which has been shown to be an efficient and accurate approach to SLAM. One of the problems with the pose graph formulation is that the state space continuously grows as more information is acquired. To address this problem we propose the reduced pose graph (RPG) model which partitions the space to be mapped and uses the partitions to reduce the number of poses used for estimation. To evaluate our approach, we present results using an online binocular and RGB-Depth visual SLAM system that uses place recognition both for robustness and multi-session operation. Additionally, to enable large-scale indoor mapping, our system automatically detects elevator rides based on accelerometer data. We demonstrate long-term mapping using approximately nine hours of data collected in the MIT Stata Center over the course of six months. Ground truth, derived by aligning laser scans to existing floor plans, is used to evaluate the global accuracy of the system. Our results illustrate the capability of our visual SLAM system to map a large scale environment over an extended period of time.