GRC 2023 Global Essay Competition Top 5
By Rohan Kuruvila
For decades, scientists have been stuck on a single problem whose solution could lead to major advances in healthcare and the environment: the protein folding problem.
Proteins are large, complex molecules. They are required for the structure, function, and regulation of all living organisms. For example, antibodies are proteins that are produced to fight against foreign invaders, while enzymes are proteins that speed up the chemical reactions of metabolism. Each protein also has a distinct shape based on its amino acid sequence.
The protein folding problem is the challenge of working out a protein's three-dimensional structure from its amino acid sequence alone. One approach is to determine the structure of a folded protein experimentally in painstaking detail, which is time-consuming and expensive. The other is to predict the folded structure computationally. However, prediction required an enormous number of calculations and was highly inaccurate until the emergence of the AI model AlphaFold.
The Creation of AlphaFold
Emergence of AlphaFold
To solve the protein folding problem, scientists in the US decided to create a competition known as CASP (Critical Assessment of Structure Prediction), where groups compete to predict the structure of proteins based on their amino acid chains within a 3-week time limit. The group with the highest overall accuracy wins the competition.
AlphaFold first appeared in CASP13, where it scored 58.5/100, the highest of any competitor that year. Two years later, at CASP14, AlphaFold scored 87/100, a result widely regarded as close to a solution to the protein folding problem.
Figure 1 - Graph by DeepMind/Nature
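As a rough illustration of how a prediction can be scored out of 100, the sketch below computes a simplified version of GDT_TS, the accuracy measure used in CASP. It rests on two assumptions that the essay does not state: that the quoted scores are GDT-style values, and that the predicted and experimental structures have already been superimposed (the real measure also searches for the best superposition). The function name and the toy coordinates are hypothetical.

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed structures.

    For each distance cutoff (in angstroms), find the fraction of residues
    whose predicted position lies within that cutoff of the reference
    position, then average the fractions and scale to 0-100.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("structures must be non-empty and of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy four-residue example (made-up coordinates, one point per residue).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 9.0, 0.0)]

score = gdt_ts(predicted, reference)
print(score)  # 56.25
```

In this toy case the residues are off by 0.5, 1.5, 3.0, and 9.0 angstroms, so one, two, three, and three of the four residues fall within the 1, 2, 4, and 8 angstrom cutoffs, which averages to 56.25. A perfect prediction would score 100.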
How AlphaFold Works
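AlphaFold’s AI model works by learning the relationship between a protein's amino acid chain and the way it folds. It was trained on experimentally determined structures from the Protein Data Bank (PDB), alongside sequences from public databases such as UniProt. Its predictions for 200+ million proteins are now published in the AlphaFold Protein Structure Database.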
When AlphaFold is asked to predict a protein fold, it compares the amino acid sequence it is given to other sequences in the public databases. Then, it arranges the similar sequences in a Multiple Sequence Alignment (MSA) representation. AlphaFold also pairs the input sequence with itself and compares the pairing with similar proteins whose structures are already known to create an initial pair representation for the input sequence. This pair representation describes the relationship between every pair of amino acids in the protein.
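To make the two representations more concrete, here is a toy sketch of their shapes: the MSA as a grid with one row per sequence and one column per position, and the pair representation as a square grid with one cell per pair of positions. This is an illustration only, not AlphaFold's actual featurization; the sequences, function names, and the choice of a clipped position offset as the initial pair value are all assumptions made for the example.

```python
def build_msa_representation(aligned_sequences):
    """Turn equal-length aligned sequences into a (num_sequences x length) grid."""
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [list(seq) for seq in aligned_sequences]

def column_conservation(msa):
    """Fraction of sequences that agree with the query (row 0) at each position."""
    query = msa[0]
    return [
        sum(row[col] == query[col] for row in msa) / len(msa)
        for col in range(len(query))
    ]

def build_pair_representation(length, max_offset=2):
    """(length x length) grid holding the clipped sequence offset j - i."""
    return [
        [max(-max_offset, min(max_offset, j - i)) for j in range(length)]
        for i in range(length)
    ]

# Made-up alignment: the query first, then three similar sequences ("-" is a gap).
msa = build_msa_representation(["MKTAY", "MKSAY", "MRTAY", "MK-AY"])
conservation = column_conservation(msa)
pair = build_pair_representation(len(msa[0]))

print(conservation)  # [1.0, 0.75, 0.5, 1.0, 1.0]
print(pair[0])       # [0, 1, 2, 2, 2]
```

The conservation values hint at why the alignment is useful: positions that rarely change across related proteins are likely to matter for the fold. In the real model each cell holds a learned vector of numbers rather than a single value.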
AlphaFold then passes both the MSA and pair representations through its own neural network block, dubbed the Evoformer. In the Evoformer, the pair representation uses the MSA representation’s information about the importance of each amino acid to “triangulate” and record the relative location of two amino acids. The relative locations are then fed back into the MSA representation to compare the two amino acids again. This step is repeated 48 times and outputs the refined MSA and pair representations.
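The idea behind "triangulating" can be pictured with a hand-written analogy: if the distances between amino acids are to describe a real 3D shape, the distance between any two of them cannot exceed the sum of their distances to a third. The sketch below repeatedly tightens a toy distance grid so that it obeys this rule. It is not the learned Evoformer computation, which works on vectors with trained weights; the function names and the numbers are hypothetical, and only the count of 48 passes is taken from the text.

```python
def triangle_pass(dist):
    """One pass: tighten each pair (i, j) using every third residue k."""
    n = len(dist)
    updated = [row[:] for row in dist]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < updated[i][j]:
                    updated[i][j] = via_k
    return updated

def refine(dist, num_blocks=48):
    """Apply the pass a fixed number of times, mirroring the 48 repeats."""
    for _ in range(num_blocks):
        dist = triangle_pass(dist)
    return dist

# Made-up initial guesses: residues 0 and 2 are estimated to be 10 apart,
# but both are close to residue 1, so that estimate is inconsistent.
initial = [
    [0.0, 3.0, 10.0],
    [3.0, 0.0, 4.0],
    [10.0, 4.0, 0.0],
]
refined = refine(initial)
print(refined[0][2])  # 7.0
```

Here the estimate for residues 0 and 2 drops from 10 to 7 because the path through residue 1 shows they cannot be further apart than 3 + 4.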
Finally, the MSA and pair representations are fed into the structure module. The module applies rotations, translations, physical constraints, and chemical constraints to determine a predicted protein structure using the information in both representations. This process is repeated with the Evoformer three more times to reach the final predicted atomic structure for the protein.
Figure 2 - Graph by The AlphaFold Team
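The rotations and translations used by the structure module can be pictured as moving a small rigid group of atoms into place. The sketch below rotates one residue's backbone atoms about the z-axis and then shifts them, which changes where the atoms sit without changing the distances between them. It is a minimal geometric illustration, not AlphaFold's code: the atom coordinates, the angle, the shift, and the function names are all made up for the example.

```python
import math

def rotation_about_z(angle_rad):
    """3x3 rotation matrix for a turn of angle_rad around the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_frame(rotation, translation, point):
    """Rotate a point, then translate it: R * point + t."""
    return tuple(
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )

# Hypothetical backbone atoms of one residue in its own local coordinates.
local_atoms = {"N": (-0.5, 1.4, 0.0), "CA": (0.0, 0.0, 0.0), "C": (1.5, 0.0, 0.0)}

rotation = rotation_about_z(math.pi / 2)  # quarter turn
translation = (1.0, 2.0, 3.0)

placed = {
    name: apply_frame(rotation, translation, xyz)
    for name, xyz in local_atoms.items()
}
for name, xyz in placed.items():
    print(name, [round(value, 3) for value in xyz])
```

Because the whole group moves together, the CA-C distance is still 1.5 after the move. Applying one such rotation and translation per amino acid is enough to lay out a whole backbone.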
Benefits of Predicting Protein Folding
Predicting protein folding has a multitude of benefits in healthcare. For example, protein aggregation is closely linked with degenerative diseases such as Alzheimer's. However, even with current medical advancements, it can take several months to diagnose these diseases accurately. With the application of AlphaFold, this process could be reduced to under a month. Starting from mRNA sequences, which determine the amino acid sequence of a protein, AlphaFold's predicted structures can be checked for proteins that are likely to aggregate. A faster diagnosis could lead to an 8-year longer lifespan for an Alzheimer's patient and a lower cost of treatment.
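AlphaFold's input is an amino acid sequence, so an mRNA sequence has to be translated first. The sketch below shows that step: it reads the mRNA three letters (one codon) at a time, starting at the first AUG, and stops at a stop codon. The codon table is deliberately partial, containing only a handful of entries from the standard genetic code, and the function name and example sequence are hypothetical.

```python
# Partial codon table (standard genetic code); "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",
    "UUU": "F", "UUC": "F",
    "AAA": "K", "AAG": "K",
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "UGG": "W",
    "UAA": "*", "UAG": "*", "UGA": "*",
}

def translate(mrna):
    """Translate mRNA into one-letter amino acids, from the first AUG to a stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None:
            raise KeyError(f"codon {codon} is not in this partial table")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGAAAGGCUUUUGGUAA"))  # MKGFW
```

The resulting string of amino acids is the kind of sequence that would then be handed to a structure predictor.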
AlphaFold is also influential in the development of drugs. Drugs are made for specific proteins to change how they work. To do this, the drug has to bind to the protein. However, finding a drug that binds easily and tightly to the protein’s pockets is a meticulous and expensive task, especially since similar proteins can have different binding sites. Using AlphaFold's predicted structures, these drugs can be found much faster and at a lower cost, which can accelerate the discovery of cures in the medical field.
AlphaFold also has many applications for a better environment, such as breaking down plastic waste. The world generates 400 million tonnes of plastic each year, much of which ends up in landfills and oceans. The Centre for Enzyme Innovation has found enzymes that are capable of breaking down plastics. This work is greatly helped by AlphaFold’s ability to quickly predict the structures of enzymes that could efficiently break down plastics.
AlphaFold holds many different possibilities for the future of science in healthcare and the environment. This paper outlined the origins of AlphaFold, how AlphaFold works, and a few of the benefits that the creation of AlphaFold has made possible. AlphaFold has great potential to unlock prosperity for future society.
“AlphaFold Protein Structure Database.” DeepMind, 2023. https://alphafold.ebi.ac.uk/
Nussinov, Ruth. “AlphaFold, Allosteric, and Orthosteric Drug Discovery: Ways Forward.” Drug Discovery Today, March 11, 2023. https://www.sciencedirect.com/science/article/pii/S1359644623000673
Jumper, John. “Highly Accurate Protein Structure Prediction with AlphaFold.” Nature News, July 15, 2021. https://www.nature.com/articles/s41586-021-03819-2
Ravisetti, Monisha. “Google’s Deepmind AI Predicts 3D Structure of Nearly Every Protein Known to Science.” CNET, July 29, 2022. https://www.cnet.com/science/biology/googles-deepmind-ai-predicts-3d-structure-of-nearly-every-protein-known-to-science/
Pinheiro, Francisca. “AlphaFold and the Amyloid Landscape.” Journal of Molecular Biology, May 21, 2021.
Using AlphaFold in the fight against plastic pollution - Google DeepMind. Google DeepMind, YouTube, 2022. https://www.youtube.com/watch?v=QkYUGgnRbbE
What Is AlphaFold? | NEJM. NEJM Group, YouTube, 2023. https://www.youtube.com/watch?v=7q8Uw3rmXyE