DeepMind discovers the structure of 200 million proteins

Artificial intelligence has deciphered the structure of virtually every protein known to science, laying the groundwork for the development of new drugs or technologies to deal with global challenges such as famine or pollution.

Proteins are the building blocks of life. They are made up of chains of amino acids folded into complex shapes, and their 3D structure largely determines their function. Once you know how a protein is folded, you can begin to understand how it works and how to modify its behavior. Although DNA provides the instructions for making the chain of amino acids, predicting how those amino acids interact to take on a 3D shape is far more complicated, and until recently scientists had deciphered the structures of only a fraction of the roughly 200 million proteins known to science.

In November 2020, the artificial intelligence group DeepMind announced that it had developed a program called AlphaFold that could quickly predict this information using an algorithm. Since then, it has analyzed the genetic codes of every organism whose genome has been sequenced and predicted the structures of the hundreds of millions of proteins they collectively contain.

Last year, DeepMind published the protein structures of 20 species – including nearly all of the 20,000 existing human proteins – in a freely accessible database. Now it has completed the work and published the predicted structures of over 200 million proteins.

“Basically, it can be considered to cover the whole universe of proteins. It includes predicted structures for plants, bacteria, animals and many other organisms, creating huge new opportunities for AlphaFold to impact important issues such as sustainability, food insecurity and neglected diseases,” said Demis Hassabis, founder and CEO of DeepMind.

Scientists are already using some of AlphaFold’s early predictions to develop new drugs. In May, researchers led by Professor Matthew Higgins from the University of Oxford announced that they were using AlphaFold models to help determine the structure of a key malaria parasite protein and work out where antibodies might bind to block the transmission of the parasite.

“We had been using a technique called protein crystallography to figure out what this molecule looks like, but because it’s quite dynamic and moves around, we couldn’t really figure it out,” Higgins explained. “When we took the models from AlphaFold and combined them with this experimental evidence, it suddenly all made sense. This information will now be used to design better vaccines that induce the most potent transmission-blocking antibodies.”

AlphaFold models are also being used by scientists at the University of Portsmouth’s Centre for Enzyme Innovation to identify enzymes from the natural world that could be modified to digest and recycle plastics. “It took us a long time to sift through this huge database of structures, but it opened up this whole set of new three-dimensional shapes that we hadn’t seen before that could actually break down plastics,” said Professor John McGeehan, who leads the work. “This is an absolute paradigm shift. We can really accelerate down the road from here, and that helps us direct those precious resources to the things that matter.”

Professor Dame Janet Thornton, Group Leader and Senior Scientist at the European Molecular Biology Laboratory’s European Bioinformatics Institute, said: “AlphaFold’s protein structure predictions are already being used in many ways. I’m confident that this latest update will unleash a flood of exciting new discoveries in the months and years to come, all thanks to the fact that the data is publicly available and usable by everyone.”
