Structure from Motion Based on Monocular Image Sequences with a Virtual Stereo Camera
Stereo matching (SM) is a well-researched method for generating depth information from camera images. Efficient implementations of SM algorithms exist as part of widely used computer vision libraries such as OpenCV. Typically, SM is performed on pairs of images from a stereo camera whose intrinsic and extrinsic parameters are fixed and determined in advance by calibration. Furthermore, the images are usually taken at approximately the same time by triggering the shutters simultaneously. In this thesis a different approach is pursued: stereo pairs are selected from the video sequence of a monocular camera mounted on a moving vehicle. Two scenarios are covered: one where the camera faces sideways and one where it faces forwards relative to the driving direction. Extrinsic transformations between frames are computed by visual odometry. Each image of a series can be rectified against the same reference image; the resulting image pairs are therefore effectively taken by a virtual stereo camera with variable baseline. Stereo matching and three-dimensional reconstruction can be applied to these images in the same way as to those of a binocular camera with fixed extrinsic calibration. Apart from the development of the virtual stereo principle itself, this thesis makes two main contributions. Firstly, it is shown that fusing disparity images (following Hirschmüller) taken at varying baselines improves quality in terms of density and error rate. Secondly, a new rectification procedure is developed for the forward-facing camera scenario, where the standard procedure for conventional stereo cameras is not applicable.
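To illustrate the stereo matching principle the abstract refers to, the following is a minimal sketch of block matching on a rectified image pair, written in NumPy. It is a toy sum-of-squared-differences matcher, not the semi-global method of Hirschmüller used in the thesis (which is available in OpenCV as `StereoSGBM`); image data, window size, and disparity range are illustrative assumptions.

```python
import numpy as np

def block_match(left, right, max_disp=8, block=5):
    """Naive SSD block matching along horizontal lines of a rectified pair.

    Toy illustration only; production systems use e.g. OpenCV's
    semi-global matcher (Hirschmueller's SGM).
    """
    h, w = left.shape
    half = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float64)
            best_cost, best_d = np.inf, 0
            # A point at column x in the left image appears at x - d in the right.
            for d in range(max_disp):
                cand = right[y - half:y + half + 1,
                             x - d - half:x - d + half + 1].astype(np.float64)
                cost = np.sum((patch - cand) ** 2)
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic example: the right view is the left view shifted by 4 pixels,
# so the matcher should recover a disparity of 4 in the textured interior.
rng = np.random.default_rng(0)
left = rng.integers(0, 256, (20, 40)).astype(np.uint8)
right = np.zeros_like(left)
right[:, :-4] = left[:, 4:]
disp = block_match(left, right)
```

With a real virtual-stereo pair, the recovered disparity would vary with scene depth and with the baseline selected from the image sequence, rather than being constant as in this synthetic shift.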