3D Pose Estimation of Football Players

Osterman, Joakim; Sjögren, Olof

Download

3D Pose Estimation of Football Players Final.pdf (42.22 MB)

Type

Master's Thesis

Program

Data science and AI (MPDSC), MSc

Authors

Osterman, Joakim

Sjögren, Olof

Abstract

In the context of football analytics, video recordings of matches play a crucial role in post-game analysis. However, videos are inherently limited because they only allow viewers to follow the match from the camera's perspective. This thesis is part of a larger project aimed at creating 3D representations of football matches from video, thus enabling users to view the game from anywhere inside the virtual 3D environment. The larger project consists of three parts. This thesis focuses on estimating the camera parameters, as well as the 3D poses and locations of the players in the video. The other two projects focus on player tracking and player texture generation.

A pipeline consisting of camera calibration and pose estimation is proposed, taking video recordings and bounding box annotations as input and predicting camera parameters as well as the players' 3D poses and locations. For camera calibration, a model specifically tailored for cameras viewing football fields is used. The results indicate accurately predicted positions and viewing angles for the estimated camera. Pose estimation is performed using a pre-trained model and results in visually accurate projections, although perspective ambiguities are present when the 3D poses are viewed from different angles. The main approach for positioning players was to detect when players touched the ground and interpolate the positions for ambiguous frames. The results are promising, but noise in the depth estimations occurs due to perspective ambiguities. Subsequently, an optional optimization of poses and positions using multi-view triangulation is also presented, showing possibilities for further refinement to ensure realistic and consistent human poses. Future work on pose and location optimization could yield a pseudo-truth dataset for further enhancements to improve overall poses and positions from strictly monocular video.
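As a rough illustration of the positioning idea described in the abstract (trusting frames where a player touches the ground and interpolating the frames in between), the following is a minimal sketch rather than the thesis's actual implementation. The function name `interpolate_positions`, the per-frame `grounded` flags, and the choice of linear interpolation are all assumptions made for this example; the abstract does not specify how the interpolation is carried out.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # hypothetical (x, y) position on the pitch, in metres


def interpolate_positions(positions: List[Point], grounded: List[bool]) -> List[Point]:
    """Replace positions in non-grounded frames by linear interpolation.

    Frames flagged as grounded are trusted and kept unchanged. Frames between
    two grounded frames are linearly interpolated; frames before the first or
    after the last grounded frame reuse the nearest grounded position.
    """
    if len(positions) != len(grounded):
        raise ValueError("positions and grounded must have the same length")
    anchors = [i for i, is_grounded in enumerate(grounded) if is_grounded]
    if not anchors:
        raise ValueError("at least one grounded frame is required")

    result = list(positions)
    # Hold the nearest trusted position at the start and end of the sequence.
    for i in range(anchors[0]):
        result[i] = positions[anchors[0]]
    for i in range(anchors[-1] + 1, len(positions)):
        result[i] = positions[anchors[-1]]
    # Interpolate linearly between each pair of consecutive grounded frames.
    for a, b in zip(anchors, anchors[1:]):
        (xa, ya), (xb, yb) = positions[a], positions[b]
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            result[i] = (xa + t * (xb - xa), ya + t * (yb - ya))
    return result


if __name__ == "__main__":
    # Frames 1-3 are treated as ambiguous (player not touching the ground).
    noisy = [(0.0, 0.0), (9.0, 9.0), (9.0, 9.0), (9.0, 9.0), (4.0, 2.0)]
    flags = [True, False, False, False, True]
    print(interpolate_positions(noisy, flags))
```

In this sketch the noisy estimates in the ambiguous frames are discarded entirely; a real pipeline might instead blend them with the interpolated values or use a smoother interpolation scheme.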

Subject/keywords

3D Human Pose Estimation, Pose estimation, visual transformers, deep machine learning, camera calibration, depth estimation, multi-view optimization

URI

http://hdl.handle.net/20.500.12380/307873

Collection

Master's Theses (Examensarbeten för masterexamen)