Machine Learning for Structural Predictions of PROTACs

dc.contributor.author: Källberg, Anders
dc.contributor.department: Chalmers tekniska högskola / Institutionen för life sciences (sv)
dc.contributor.department: Chalmers University of Technology / Department of Life Sciences (en)
dc.contributor.examiner: Wittung-Stafshede, Pernilla
dc.contributor.supervisor: Mercado, Rocío
dc.contributor.supervisor: Nittinger, Eva
dc.contributor.supervisor: Tyrchan, Christian
dc.date.accessioned: 2024-06-18T12:15:38Z
dc.date.available: 2024-06-18T12:15:38Z
dc.date.issued: 2024
dc.date.submitted:
dc.description.abstract: PROteolysis TArgeting Chimeras (PROTACs) are molecules that induce the degradation of targeted proteins by hijacking the ubiquitin–proteasome system in the cell. A PROTAC binds simultaneously to an E3 ligase and a protein of interest (POI), forming a ternary complex. The ubiquitin–proteasome system then tags the POI with ubiquitin, marking it for degradation by the proteasome. The formation of a favorable ternary complex is essential for the ubiquitination and subsequent degradation of the POI. Being able to model ternary complexes accurately thus provides critical advantages in the development of PROTACs; however, data on PROTACs and their crystallized ternary complexes are limited. Accurate predictions of these structures are desirable, but current computational methods struggle to simulate the interactions between the PROTAC and both proteins simultaneously. AlphaFold, a machine learning tool, has been shown to predict protein complexes accurately, yet research on applying AlphaFold to ternary complexes is scarce. In the first part of this thesis, the ternary complex was modeled with AlphaFold using the sequences of both natural and artificially linked POIs and E3 ligases. AlphaFold was found unable to predict these complexes accurately, presumably because it could not take the PROTAC into account in its predictions. The second part of this thesis focused on generating data on PROTAC substructures, essential for the development of these molecules. Although such data are available, obtaining high-quality data on the substructures of specific PROTACs can be challenging and time-consuming. To address this, the PROTAC Splitter, a novel machine learning tool based on graph neural networks, was developed to predict these substructures. For PROTACs with known substructures, the PROTAC Splitter predicts 99.7% with an error of at most 6 atoms at the boundaries between the ligands and the linker.
It generalizes to PROTACs with three unknown substructures, where 23.1% of the predictions satisfy the same criterion. The code for the PROTAC Splitter is available at https://github.com/AndersKallberg/PROTAC_splitter. Although accurate prediction of ternary complexes remains challenging, the PROTAC Splitter makes the substructures easily accessible to anyone in this field of research. In summary, the work presented in this thesis addresses scientific questions in two complementary areas of PROTAC development: (1) ternary (protein) structure prediction, and (2) PROTAC component prediction. Such information is scarce and valuable, and accurate predictions of it could accelerate the discovery of effective PROTACs and help in the fight against disease.
dc.identifier.coursecode: BBTX60
dc.identifier.uri: http://hdl.handle.net/20.500.12380/307917
dc.language.iso: eng
dc.setspec.uppsok: LifeEarthScience
dc.subject: PROTAC
dc.subject: Ternary Structure
dc.subject: Substructures
dc.subject: AlphaFold
dc.subject: Protein Structure Prediction
dc.subject: Graph Neural Networks
dc.subject: Node Prediction
dc.subject: Link Prediction
dc.subject: Machine Learning
dc.subject: AI
dc.title: Machine Learning for Structural Predictions of PROTACs
dc.type.degree: Examensarbete för masterexamen (sv)
dc.type.degree: Master's Thesis (en)
dc.type.uppsok: H
local.programme: Biotechnology (MPBIO), MSc
Download
Original bundle
Name: Master_Thesis_Anders Källberg.pdf
Size: 19.98 MB
Format: Adobe Portable Document Format