|MuDeepNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose Using Multi-view Consistency Loss
Jun-Ning Zhang, Qun-Xing Su, Peng-Yuan Liu*, Hong-Yu Ge, and Ze-Feng Zhang
International Journal of Control, Automation, and Systems, vol. 17, no. 10, pp. 2586-2596, 2019
Abstract : We formulate structure from motion as a learning problem and propose an end-to-end framework to estimate image depth, optical flow, and camera motion. The framework is composed of multiple encoder-decoder networks; its key component is the FlowNet, which improves the accuracy of the estimated camera ego-motion and depth. As in recent studies, we use end-to-end learning with multi-view synthesis as a form of supervision, and we propose multi-view consistency losses that constrain both depth and camera ego-motion, requiring only monocular video sequences for training. Compared with recently popular networks that estimate depth from a single image, our network learns to use motion parallax to correct depth. Although MuDeepNet requires two adjacent frames during training to obtain motion parallax, it is tested on a single image; thus, MuDeepNet is a monocular system. Experiments on the KITTI dataset show that MuDeepNet outperforms other methods.
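The view-synthesis supervision described above typically works by back-projecting target-frame pixels with the predicted depth, transforming them by the predicted relative pose, and re-projecting them into the source frame to measure a photometric error. The sketch below illustrates this idea only; the function name, nearest-neighbour sampling, and the simple L1 error are our assumptions, not the paper's implementation.

```python
import numpy as np

def view_synthesis_loss(I_t, I_s, depth_t, K, R, t):
    """Photometric loss between target image I_t and source image I_s
    warped into the target view via predicted depth and relative pose.
    Illustrative sketch, not MuDeepNet's actual loss."""
    H, W = depth_t.shape
    K_inv = np.linalg.inv(K)
    # Pixel grid of the target view in homogeneous coordinates, shape (3, H*W)
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)])
    # Back-project pixels to 3-D points in the target camera frame
    cam = (K_inv @ pix) * depth_t.ravel()
    # Rigidly transform the points into the source camera frame
    cam_s = R @ cam + t[:, None]
    # Project into the source image plane
    proj = K @ cam_s
    us = proj[0] / proj[2]
    vs = proj[1] / proj[2]
    # Nearest-neighbour sampling where the projection lands in-bounds
    us_r = np.round(us).astype(int)
    vs_r = np.round(vs).astype(int)
    valid = (us_r >= 0) & (us_r < W) & (vs_r >= 0) & (vs_r < H)
    tgt_v = v.ravel()[valid]
    tgt_u = u.ravel()[valid]
    warped = np.zeros_like(I_t)
    warped[tgt_v, tgt_u] = I_s[vs_r[valid], us_r[valid]]
    # Mean absolute photometric error over valid pixels
    return np.abs(I_t - warped)[tgt_v, tgt_u].mean()
```

With an identity relative pose the warp maps every pixel to itself, so the loss between an image and itself is zero; in training, minimizing this error jointly constrains the depth and pose networks. A differentiable (bilinear) sampler would replace the nearest-neighbour step in practice.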
Keywords : Deep learning, depth consistency loss, depth estimation, optical flow, optical flow consistency loss, visual odometry (VO).