Visual-Inertial Odometry: Efficiency and Accuracy
- Author(s): Zheng, Xing
- Advisor(s): Mourikis, Anastasios I
- et al.
Accurate localization is essential in many applications such as robotics, unmanned aerial vehicles, virtual reality, and augmented reality. In this work, we focus on localizing a platform in an unknown environment using an inertial measurement unit (IMU) and a monocular camera. This task is often termed visual-inertial odometry (VIO).
We aim to improve both the computational efficiency and the accuracy of state-of-the-art VIO algorithms. Specifically, to improve computational efficiency, we first propose the Decoupled Estimate-Error Parameterization (DEEP), which addresses the high dimensionality of the estimation problem. An extended Kalman filter (EKF) VIO algorithm is reformulated in the DEEP framework, using measurements from a rolling-shutter camera. The DEEP-EKF formulation is evaluated through Monte-Carlo simulations and real-world experiments, which show substantial computational gains at the cost of only a small loss of estimation performance.
To achieve improved estimation accuracy, we describe three key methods. First, we propose high-fidelity sensor modeling, along with online self-calibration. An additional contribution of this part of the work is a novel method for processing the measurements of the rolling-shutter camera, which employs an approximate representation of the estimation errors instead of the state itself. Both Monte-Carlo simulations and real-world experiments are conducted to demonstrate the improved estimation precision of the proposed approach compared to existing ones.
Second, we propose a direct VIO algorithm that uses image patches extracted around image features and formulates measurement residuals directly in the image intensity space. A detailed evaluation of the algorithm demonstrates that the use of photometric residuals increases pose estimation accuracy, with approximately 23% lower estimation errors on average in our testing.
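To make the notion of a photometric residual concrete, the following minimal sketch compares the intensities of a small patch around a feature in a reference image with the intensities at its predicted location in the current image. It is an illustration only, not the formulation used in this work: the function names (`bilinear`, `photometric_residuals`), the list-of-rows image format, the patch size, and the assumption that the patch moves by pure translation between the two images are all hypothetical simplifications.

```python
import math


def bilinear(img, x, y):
    """Sample a grayscale image (list of rows) at sub-pixel location (x, y).

    Assumes (x, y) lies at least one pixel inside the right and bottom borders.
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x0 + 1]
    bottom = (1 - ax) * img[y0 + 1][x0] + ax * img[y0 + 1][x0 + 1]
    return (1 - ay) * top + ay * bottom


def photometric_residuals(ref_img, cur_img, ref_center, cur_center, half=1):
    """Intensity differences over a (2*half+1) x (2*half+1) patch.

    Hypothetical simplification: the patch is assumed to translate rigidly
    between the two images, so no warping of the patch is applied.
    """
    (xr, yr), (xc, yc) = ref_center, cur_center
    residuals = []
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            predicted = bilinear(cur_img, xc + dx, yc + dy)
            observed = bilinear(ref_img, xr + dx, yr + dy)
            residuals.append(predicted - observed)
    return residuals
```

When the predicted feature location in the current image is correct, the residuals are close to zero up to image noise and interpolation error; an error in the predicted location produces nonzero intensity differences that an estimator can use to correct the pose.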
Finally, we extend the direct VIO formulation to a semi-dense framework, in which all informative areas of the images are used. This extension employs photometric triangulation and a novel noise model that accounts for both noise during image formation and interpolation errors. Through Monte-Carlo simulations and real-world experiments, we demonstrate that the proposed semi-dense VIO outperforms both the direct VIO and the point-feature-based method in terms of estimation accuracy.