AUB ScholarWorks

A Unified Hybrid Formulation for Visual SLAM

Show simple item record

dc.contributor.advisor Asmar, Daniel
dc.contributor.advisor Zelek, John
dc.contributor.author Younes, Georges
dc.date.accessioned 2021-02-09T12:29:28Z
dc.date.available 2021-02-09T12:29:28Z
dc.date.issued 2/9/2021
dc.identifier.uri http://hdl.handle.net/10938/22253
dc.description Alexander Wong; Bryan Tripp; Elie Shammas; José Marı́a Martı́nez Montiel; Kamel Abu Ghali; Stephen Smith;
dc.description.abstract Visual Simultaneous Localization and Mapping (Visual SLAM (VSLAM)), is the process of estimating the six degrees of freedom ego-motion of a camera, from its video feed, while simultaneously constructing a 3D model of the observed environment. Extensive research in the field for the past two decades has yielded real-time and efficient algorithms for VSLAM, allowing various interesting applications in augmented reality, cultural heritage, robotics and the automotive industry, to name a few. The underlying formula behind VSLAM is a mixture of image processing, geometry, graph theory, optimization and machine learning; the theoretical and practical development of these building blocks led to a wide variety of algorithms, each leveraging different assumptions to achieve superiority under the presumed conditions of operation. An exhaustive survey on the topic outlined seven main components in a generic VSLAM pipeline, namely: the matching paradigm, visual initialization, data association, pose estimation, topological/metric map generation, optimization, and global localization. Before claiming VSLAM a solved problem, numerous challenging subjects pertaining to robustness in each of the aforementioned components have to be addressed; namely: resilience to a wide variety of scenes (poorly textured or self repeating scenarios), resilience to dynamic changes (moving objects), and scalability for long-term operation (computational resources awareness and management). Furthermore, current state-of-the art VSLAM pipelines are tailored towards static, basic point cloud reconstructions, an impediment to perception applications such as path planning, obstacle avoidance and object tracking. To address these limitations, this work proposes a hybrid scene representation, where different sources of information extracted solely from the video feed are fused in a hybrid VSLAM system. The proposed pipeline allows for seamless integration of data from pixel-based intensity measurements and geometric entities to produce and make use of a coherent scene representation. The goal is threefold: 1) Increase camera tracking accuracy under challenging motions, 2) improve robustness to challenging poorly textured environments and varying illumination conditions, and 3) ensure scalability and long-term operation by efficiently maintaining a global reusable map representation.
dc.language.iso en
dc.subject Monocular
dc.subject Visual Odometry
dc.subject SLAM
dc.title A Unified Hybrid Formulation for Visual SLAM
dc.type Dissertation
dc.contributor.department Department of Mechanical Engineering
dc.contributor.faculty Maroun Semaan Faculty of Engineering and Architecture
dc.contributor.institution American University of Beirut


Files in this item

This item appears in the following Collection(s)

Show simple item record

Search AUB ScholarWorks


Browse

My Account