3D Object Detection with Perceiver: Adapting the Perceiver Architecture for 3D Object Detection with Automotive Image Data

Sedin, Linnéa; Andersson, Lydia

Download

Msc_linnea.pdf (7.75 MB)

Type

Master's Thesis

Program

Systems, control and mechatronics (MPSYS), MSc

Published

2024

Authors

Sedin, Linnéa

Andersson, Lydia

Abstract

In the field of autonomous vehicles, accurately interpreting the surrounding environment is vital for safe navigation. A robust perception system is fundamental to achieving this, and computer vision techniques are central to such a system. This project investigates the adaptation of Perceiver [13] for 3D object detection using automotive image data. Although the Perceiver architecture, which is based on Transformers, is designed for perception tasks, it has neither been evaluated on large input sizes nor tested on 3D object detection. Using the ZOD dataset [2], which provides a comprehensive collection of automotive image data, this study assesses Perceiver’s capabilities in this context. Specifically, 3D object detection is performed on camera images only and targets only vehicle objects. The proposed model implements end-to-end object detection, with a loss computation inspired by that of DETR [4]. The results indicate that the Perceiver architecture does not work for 3D object detection without additional configuration. The challenges encountered in adapting the model highlight the need for further modifications and optimizations to improve its performance in this application. Despite these challenges, the study provides insight into the potential and limitations of the Perceiver architecture for 3D object detection with large input data, and into the impact of various modifications to the basic architecture.

Subject/keywords

3D Object Detection, Artificial Intelligence, Automotive Sensing, Deep Machine Learning, DETR, Perceiver, Transformer

URI

http://hdl.handle.net/20.500.12380/307937

Collection

Master's Theses
