AI-guided detection of antibiotic-resistant bacteria using resistance genes

Type
Master's Thesis
Program
Biomedical engineering (MPBME), MSc
Published
2024
Author
Aerts, Erik
Abstract
Antibiotic resistance threatens the advances made in modern medicine. Understanding the genomics behind multi-resistant profiles can help in planning the correct treatment, which can reduce antibiotic usage and hamper the vicious cycle of resistance. Transformer-based AI models have shown state-of-the-art performance in understanding complex patterns in data. This thesis aimed to create a framework for using transformers to predict bacterial resistance profiles by training on genomic data. The framework consisted of a transformer-based encoder and parallel classification networks for predicting antibiotic susceptibility. Each model was trained on antibiotic resistance genes (ARGs) from Escherichia coli, where a subset of isolates had recorded resistance profiles. The results showed that high complexity in the encoder is key for the model to accurately predict resistance to antibiotics for which resistance is rare. This is relevant for any clinical setting, as models with fewer than 12 encoder blocks could not find these resistance profiles. The framework benefited from pre-training on unlabeled genomic data, as performance generally increased. However, which type of masked language model pre-training benefited the system more was situational, and no conclusion was drawn. Finally, the thesis also identified features in the data on which the models based their decisions. The number of ARGs in an isolate was deemed the most influential feature, which relates to how much information the transformer can process. In addition, relationships between the ARGs gyrA-D87N / parC-S80I and aph(3”)-Ib / aph(6’)-Id were shown to be an important basis for the models' decisions. Likewise, two point mutations of the pmrB gene stood out as important ARGs in the models' decision-making. The reasons why these ARGs are weighted highly by the models are currently unknown, but are of interest for further study to better understand the underlying factors of multi-resistance.
Subject/keywords
Artificial intelligence, antibiotic resistance, transformer, self-attention, embedding, encoder, pre-training, fine-tuning, masked language modelling.