Fast and Accurate Camera Motion Estimation For Static and Dynamic Scenes
- Author(s): Azartash, Haleh
- et al.
Visual Odometry (VO) is the process of finding a camera's relative pose at different time instants by analyzing the images taken by the camera. Visual odometry, also known as ego-motion estimation, has a variety of applications, including image stabilization, unmanned aerial vehicle (UAV) and robotic navigation, scene reconstruction, and augmented reality. VO has been studied extensively over the past three decades for stationary and dynamic scenes using monocular, stereo, and, more recently, RGB-D cameras. Camera motion estimation is application-specific, and the solution must be adjusted to the requirements of each application. In this thesis, we present different methods to estimate visual odometry accurately for camera stabilization and robotic navigation using monocular, stereo, and RGB-D cameras for both stationary and dynamic scenes. For image stabilization, we propose a fast and robust 2D affine ego-motion estimation algorithm based on phase correlation in the Fourier-Mellin domain using a single camera. The 2D motion parameters, rotation-scale-translation (RST), are estimated in a coarse-to-fine manner, which ensures convergence for large camera displacements. By using RANSAC-based robust least-squares model fitting in the refinement process, we obtain a final motion estimate that is accurate and robust to outliers such as moving objects or flat areas, which makes the method suitable for both static and dynamic scenes. Even though this method estimates the 2D camera motion accurately, it is only applicable to scenes with small depth variation. Consequently, a stereo camera is used to overcome this limitation. A stereo camera enables us to find the 3D (instead of 2D) camera motion of an arbitrarily moving rig in any static environment, with no limitation on depth variation. We propose a feature-based method that estimates large 3D translations and rotations of a moving rig.
The translational velocity, acceleration, and angular velocity of the rig are then estimated using a recursive method. In addition, we account for different motion types, such as pure rotation and pure translation in different directions. Although a stereo rig allows us to recover the arbitrary motion of a moving rig, the observed environment must be stationary. In addition, estimating the disparity between the stereo images increases the complexity of the proposed method. Therefore, we propose a robust method to estimate visual odometry using RGB-D cameras that is also applicable to dynamic scenes. RGB-D cameras provide a color image and a depth map of the scene simultaneously and therefore significantly reduce the complexity and computation time of visual odometry algorithms. To exclude the dynamic regions of the scene from the camera motion estimation process, we use image segmentation to separate the moving parts of the scene from the stationary parts. We use an enhanced depth-aware segmentation method that improves the segmentation output and also joins areas where the depth value is not available. Then, a dense 3D point cloud is constructed by finding the dense correspondence between the reference and current frames using optical flow. Motion parameters for each segment are calculated using the iterative closest point (ICP) technique (with six degrees of freedom). Finally, to find the true motion of the camera and exclude the dynamic regions' motion parameters, we perform motion optimization by finding a linear combination of motion parameters that minimizes the residual difference between the reference and current images.
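
As a rough illustration of the six-degree-of-freedom alignment that sits at the core of an ICP iteration, the sketch below computes the rigid rotation and translation that best map one 3D point set onto another in the least-squares sense. It is not the thesis's implementation: it assumes point correspondences are already known (for example, from optical flow), it uses the standard SVD-based closed-form solution, and the names `rigid_align`, `src`, and `dst` are hypothetical.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding 3D points (assumed matched).
    """
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)

    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so R is a proper rotation (det = +1),
    # not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    t = dst_centroid - R @ src_centroid
    return R, t
```

In a full ICP loop this step would alternate with re-estimating correspondences until the alignment error stops decreasing; here a single call suffices because the matches are assumed to be given.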