- Forward Future AI
AlphaFold2 Is Finally Getting the Attention It Deserves: A Nobel Prize Long Overdue.
AlphaFold2, developed by Google DeepMind under the direction of Demis Hassabis and John Jumper, has reached a milestone in science. The AI model, which predicts the three-dimensional structure of proteins from their amino acid sequences, has solved a 50-year-old problem in chemistry. This led to the 2024 Nobel Prize in Chemistry being awarded to Hassabis and Jumper, together with biochemist David Baker, who was honored for his own groundbreaking work in the field of computational protein design.
What makes AlphaFold2 so extraordinary is its ability to make precise predictions about the structure of almost any known protein. It has become a tool used by millions of scientists worldwide to make strides in areas such as drug development or understanding global health issues like antibiotic resistance. The significance of this discovery quickly became clear when the model's predictions were made freely available through the AlphaFold Protein Structure Database. Since its debut in 2020, AlphaFold2 has not only transformed the way biologists work, but has also ushered in a new era of structural biology.
The Nobel Prize not only crowns years of research, but also demonstrates how machine learning and artificial intelligence are profoundly influencing our understanding of biological processes. It is therefore not surprising that AlphaFold2 has been considered groundbreaking and has received widespread recognition in the scientific community.
But what exactly makes AlphaFold so special, what significant impact will it have on our lives, and why has it taken so long to develop a model like this? In the following article, I will provide an overview of the AlphaFold2 model from Google DeepMind.
What exactly does AlphaFold do?
In 2021, the AlphaFold team published its method in the renowned journal Nature and summarized its purpose as follows:
“Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.”
Google DeepMind, in turn, writes about AlphaFold:
“AlphaFold is an AI system developed by Google DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment. (...) While the system still has some limitations, the CASP results suggest AlphaFold has immediate potential to help us understand the structure of proteins and advance biological research.”
AlphaFold can therefore predict 3D protein structures with high accuracy. But what is so special about this, and why is it necessary to predict protein structures?
Proteins are large, complex molecules that are responsible for almost all biological processes. They consist of chains of amino acids that are strung together in a specific order (their sequence). These chains fold into a three-dimensional structure that is crucial for the function of the protein. However, predicting this structure from the sequence is an extremely difficult task because there are almost innumerable possibilities for how a protein can fold. This problem is known as the “protein folding problem” and was long considered one of the major unsolved problems in biochemistry.
“The protein folding problem is the question of how a protein’s amino acid sequence dictates its three-dimensional atomic structure. The notion of a folding “problem” first emerged around 1960, with the appearance of the first atomic-resolution protein structures. (...) Since then, the protein folding problem has come to be regarded as three different problems: (a) the folding code: the thermodynamic question of what balance of interatomic forces dictates the structure of the protein, for a given amino acid sequence; (b) protein structure prediction: the computational problem of how to predict a protein’s native structure from its amino acid sequence; and (c) the folding process: the kinetics question of what routes or pathways some proteins use to fold so quickly.”
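To get a feeling for the sheer number of possible folds, here is a minimal back-of-the-envelope sketch in Python. It follows the classic thought experiment known as Levinthal's paradox. The three conformations per amino acid and the sampling rate are illustrative assumptions rather than measured values, and the function names are my own.

```python
# Back-of-the-envelope estimate in the spirit of Levinthal's paradox.
# All constants below are illustrative assumptions, not measurements.
CONFORMATIONS_PER_RESIDUE = 3   # assumed number of states per amino acid
SAMPLES_PER_SECOND = 1e13       # assumed (very optimistic) sampling rate
SECONDS_PER_YEAR = 3.15e7       # rounded length of a year in seconds


def conformation_count(n_residues: int,
                       per_residue: int = CONFORMATIONS_PER_RESIDUE) -> int:
    """Number of chain conformations if every residue varies independently."""
    return per_residue ** n_residues


def years_to_enumerate(n_residues: int,
                       rate: float = SAMPLES_PER_SECOND) -> float:
    """Years needed to try every conformation once at the assumed rate."""
    return conformation_count(n_residues) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    n = 100  # a small protein
    print(f"{conformation_count(n):.2e} conformations")
    print(f"{years_to_enumerate(n):.2e} years to try them all")
```

Under these assumptions, a chain of only 100 amino acids has about 5 × 10^47 conformations, and trying them one by one would take on the order of 10^27 years. Real proteins fold in fractions of a second, which is exactly why brute-force search is not an option and a learned predictor is so valuable.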
AlphaFold uses artificial intelligence to predict the folding of proteins based on their sequence. The predictions are so accurate that in many cases they are comparable to structures determined using expensive and time-consuming methods such as X-ray crystallography or cryo-electron microscopy.
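What "comparable" means can be made concrete with a distance measure between a predicted and an experimentally determined structure. The following is a minimal sketch of the root-mean-square deviation (RMSD), assuming both structures are already superimposed and their atoms are listed in the same order. The coordinates are made-up toy values, and this is only one simple measure; the CASP assessment itself relies on other scores such as GDT.

```python
import math


def rmsd(predicted, experimental):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed and that the i-th atom
    in one list corresponds to the i-th atom in the other.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (px, py, pz), (ex, ey, ez) in zip(predicted, experimental):
        squared += (px - ex) ** 2 + (py - ey) ** 2 + (pz - ez) ** 2
    return math.sqrt(squared / len(predicted))


if __name__ == "__main__":
    # Hypothetical toy coordinates in angstroms, not real protein data.
    predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    experimental = [(0.0, 0.0, 0.3), (1.5, 0.0, 0.3), (3.0, 0.0, 0.3)]
    print(f"RMSD: {rmsd(predicted, experimental):.2f} angstroms")
```

The smaller the value, the closer the prediction is to the experiment. An RMSD of zero would mean that every atom sits exactly where the experiment found it.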
Understanding protein structure is crucial to understanding how proteins function, how they interact with other molecules, and how misfolding can lead to disease. Some examples:
Many diseases, including Alzheimer's, Parkinson's and certain cancers, are associated with misfolded proteins. By predicting the structure of such proteins, scientists can better understand how misfolding can cause disease and target treatments. Knowing a protein's structure also facilitates the development of drugs that bind specifically to the protein and affect its function. AlphaFold can accelerate this process by providing structural data quickly and accurately.
Before AlphaFold, it was extremely time-consuming and expensive to determine the structure of proteins experimentally. The process could take months to years and was not always successful. AlphaFold provides an automated, more cost-effective, and faster way to obtain this information. This paves the way for new research and breakthroughs in biology.
“Made from long chains of amino acids, each has a unique complex 3D structure. But figuring out just one of these can take several years, and hundreds of thousands of dollars. In 2020, AlphaFold solved this problem, with the ability to predict protein structures in minutes, to a remarkable degree of accuracy.”
Nobel Prize for the second iteration: AlphaFold2
AlphaFold 1 and AlphaFold 2 differ significantly in their accuracy, methodology and underlying architecture. While AlphaFold 1 was already a significant improvement in predicting protein structures, it did not achieve the atomic accuracy that is crucial for many applications in the life sciences. AlphaFold 2 delivered the breakthrough in 2020, providing predictions that were nearly identical to experimentally determined structures in many cases.
The central advance of AlphaFold 2 lies in its newly developed attention-based architecture, which builds on the Transformer. The Transformer, introduced in the well-known 2017 paper “Attention Is All You Need” by researchers at Google, is also the architectural basis for virtually all large (and small) language models.
In contrast to AlphaFold 1, which worked in separate stages (a neural network first predicted distances between amino acids, and a separate optimization step then turned them into a structure), AlphaFold 2 combines evolutionary sequence information and structural information in an end-to-end model. This leads to a more efficient and accurate calculation of the three-dimensional structure. Particularly noteworthy is the integration of multiple sequence alignments (MSAs) and three-dimensional structural data, which allows AlphaFold 2 to work reliably even for proteins for which no similar structure is known. AlphaFold 2 also uses attention mechanisms, the core component of the Transformer architecture, which allow it to better relate amino acids that are far apart in the sequence, resulting in a more precise folding prediction.
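To illustrate how attention can relate positions that are far apart in a sequence, here is a minimal sketch of generic scaled dot-product attention in plain Python. It is not AlphaFold 2's actual network, which uses its own specialized attention variants. The function names and the tiny two-dimensional vectors are hypothetical toy values chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)                      # subtract max for stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Generic scaled dot-product attention over lists of vectors.

    Returns the attended outputs and the attention weights per query.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every position, near or far.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(w * v[c] for w, v in zip(weights, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    # Toy "sequence" of three positions; positions 0 and 2 are similar.
    keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    queries = [[4.0, 0.0]]  # a query that resembles positions 0 and 2
    _, weights = attention(queries, keys, values)
    print([round(w, 3) for w in weights[0]])
```

In this toy example, the query gives positions 0 and 2 the same high weight and the direct neighbor, position 1, a much lower one. Distance along the chain plays no role in the weighting, only similarity does, which is what allows attention to connect amino acids that lie far apart in the sequence but may end up close together in the folded protein.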
Another important difference is that AlphaFold 2 refines a protein's structure iteratively, feeding its intermediate predictions back into the network, which makes it faster and more robust. It can also predict more complex protein structures, such as multimeric proteins, much better than its predecessor. Multimeric proteins are structures that consist of several protein units or polypeptide chains, which are referred to as subunits.
A classic example of a multimeric protein is hemoglobin. It consists of four subunits that work together to efficiently transport and deliver oxygen throughout the body.
While AlphaFold 1 was still inconsistent in its predictions and often only predicted rough structural features, AlphaFold 2 can calculate atomic details with high accuracy, making it invaluable for biomedical research.
In summary, AlphaFold 2 has raised protein structure prediction to a new level: it is not only faster and more accurate, but also provides more reliable predictions for more complex proteins and makes them available to the scientific community.
Conclusion
The importance of AlphaFold for human health can hardly be overstated. On the one hand, predicting 3D protein structures with this technique saves a great deal of time and resources that would otherwise have to be spent on lengthy trial-and-error processes. On the other hand, the sheer number of predictions can make many important diseases understandable and thus also treatable, not least because this technique can also be used to develop drugs. AlphaFold is a breakthrough that unfortunately has not received the attention it deserves lately, overshadowed by the more everyday AI chatbots. The Nobel Prize, on the other hand, was absolutely justified and is now putting AlphaFold at the center of the attention it deserves. Demis Hassabis and his team have done an outstanding job and rendered a great service to humanity. AlphaFold shows directly and immediately how artificial intelligence is already enriching all our lives today and will make them better in the long term.
But it should not stop there. Google DeepMind wants to continue research and, with AlphaFold 3, even target new areas of biology, as Demis Hassabis recently announced in an interview with the Financial Times.
“There are several. Firstly, on the biology track — you can see where we are going with that with AlphaFold 3 — the idea is to understand [biological] interactions, and eventually to model a whole pathway. And, then, I want to maybe build a virtual cell at some point.
With Isomorphic [DeepMind’s drug development spin-off] we are trying to expand into drug discovery — designing chemical compounds, working out where they bind, predicting properties of those compounds, absorption, toxicity and so on. We have great partners [in] Eli Lilly and Novartis . . . working on projects with them, which are going really well. I want to solve some diseases, Madhu. I want us to help cure some diseases.”
If there is one way to sum up the future, it is this: the sky is the limit.
About the author
Kim Isenberg
Kim studied sociology and law at a university in Germany and has been impressed by technology in general for many years. Since the breakthrough of OpenAI's ChatGPT, Kim has been trying to scientifically examine the influence of artificial intelligence on our society.