3D Object Detection with Perceiver: Adapting the Perceiver Architecture for 3D Object Detection with Automotive Image Data
Type
Master's Thesis
Program
Systems, control and mechatronics (MPSYS), MSc
Published
2024
Authors
Sedin, Linnéa
Andersson, Lydia
Abstract
In the field of autonomous vehicles, accurately interpreting the surrounding environment is vital for safe navigation. A robust perception system is fundamental to achieving this, and computer vision techniques are central to such a system. This project investigates the adaptation of Perceiver [13] for 3D object detection using automotive image data. Although the Transformer-based Perceiver architecture is designed for perception tasks, it has neither been evaluated on large input sizes nor tested on 3D object detection. Using the ZOD dataset [2], which provides a comprehensive collection of automotive image data, this study assesses Perceiver’s capabilities in this context. Specifically, 3D object detection is performed on camera images only and targets only vehicle objects. The proposed model implements end-to-end object detection, with a loss computation inspired by that of DETR [4]. The results indicate that the Perceiver architecture does not work for 3D object detection without additional configuration. The challenges encountered in adapting the model highlight the need for further modifications and optimizations to improve its performance in this application. Despite these challenges, the study provides valuable insights into the potential and limitations of the Perceiver architecture for 3D object detection with large input data, as well as into the impact of different modifications to the basic architecture.
Subject/keywords
Keywords: 3D Object Detection, Artificial Intelligence, Automotive Sensing, Deep Machine Learning, DETR, Perceiver, Transformer