AlphaFold: Google’s artificial intelligence predicts the structure of all known proteins and opens up a new world for science

AlphaFold prediction of the structure of vitellogenin, an essential protein for all egg-laying animals. (DeepMind)

An artificial intelligence belonging to Google has predicted the structure of almost all known proteins: some 200 million molecules that are essential to understanding the biology of all living beings on the planet and the mechanisms of some of the most widespread diseases, from malaria to Alzheimer’s disease and cancer.

“This work ushers in a new era of computational biology,” celebrated Demis Hassabis, the 45-year-old programmer and neuroscientist who is the main creator of AlphaFold, the neural network system that has almost completely solved one of the biggest problems in biology.

The British-born Hassabis was a chess and video game prodigy who in 2010 founded DeepMind, a company focused on creating artificial intelligence that can learn as humans do. In 2013, its system proved better than any human at playing Atari video games. The following year, Google bought the company for around 500 million euros. In 2017, AlphaGo swept aside the top champions of Go, the highly complex Asian board game often compared to chess. Since then, Hassabis has turned his efforts to a much bigger challenge: predicting the three-dimensional shape a protein will have by reading only its genetic sequence, written in the two-dimensional letters of DNA.

Knowing the three-dimensional structure of these molecules from their genetic sequence is essential to understanding their function, but it is a problem of immense difficulty. It is like completing a puzzle with tens of thousands of pieces without knowing what image it represents.

Until this system appeared, elucidating the shape of a single protein made up of 100 base units (called amino acids) could take 13.7 billion years, the age of the universe. At best, scientists needed years of work using X-ray microscopy or huge particle accelerators such as the European synchrotron in Grenoble, France. Google’s algorithm, by contrast, predicts the structure of any protein in seconds.

“This universe of proteins” is “a gift for humanity”, Hassabis stressed as he presented the new database at a press conference held on Tuesday, alongside scientists from the European Molecular Biology Laboratory (EMBL), a public institution that collaborated in the development of AlphaFold.

Until the advent of this technology, the structure of some 200,000 proteins had been determined, a task that took 60 years and the involvement of millions of scientists. This database was the training material for Google’s artificial intelligence, which searched for valid models to predict the shape of proteins for which only the two-dimensional sequence is known. By 2021, the system had already predicted the structure of one million proteins, including all human ones. This year’s new release takes the tally to 200 million: virtually all the known proteins of all living things on the planet.

Access to this new database is open and free, and the computer code of the artificial intelligence is open and downloadable. This “Google of life” shows the two-dimensional sequence of any protein alongside a three-dimensional model that indicates how reliable the prediction is. Its margin of error is similar to, or even lower than, that of conventional methods.

It is important to note that AlphaFold does not determine reality, but rather predicts it. It reads the genetic sequence and estimates the most likely way the amino acids will be arranged. The prediction is highly reliable, which saves scientists a great deal of time and money: they can do theoretical work and turn to expensive equipment to determine the actual structure of a protein only when it is absolutely necessary.

The applications of this new tool are almost endless, since proteins are involved in every imaginable biological process, from the mass death of bees to the heat resistance of crops, as well as countless diseases.

Matt Higgins’ team at the University of Oxford (UK) used AlphaFold in their project to develop an antibody (a type of protein) capable of neutralizing one of the proteins the malaria pathogen needs in order to reproduce. Within a few years, this type of research could lead to the first highly protective vaccine against this disease, since it would prevent the transmission of the parasite from one person to another through mosquito bites.


Another milestone already achieved is the most detailed structure to date of the nuclear pore, a doughnut-shaped protein complex that is the entry and exit gate to the nucleus of human cells and is linked to a long list of diseases, including cancer and cardiovascular disease. This new tool allows unprecedented access to understanding “how the recipe for life [written in the genome] comes into play when it is translated into proteins”, Jan Kosinski, a researcher at EMBL and co-author of this discovery, told this newspaper.

Hassabis and the other officials from DeepMind and EMBL gave assurances that the possible risks of publishing this database and making it available to everyone had been analyzed. “The benefits clearly outweigh the threats,” stressed the creator of the system, who added that in the future, as this technology develops, it will be up to the international community to decide whether its use should be limited.

One of the most tangible applications is the design of tailor-made molecules capable of blocking harmful proteins or, better still, of modulating their activity, a much more desirable effect in the design of new drugs, explains Carlos Fernández, a scientist at the CSIC and head of the structural biology group of the Spanish Society of Molecular Biology. His team used AlphaFold to elucidate part of the structure of a complex made up of several proteins essential for the spread of the trypanosome responsible for sleeping sickness, which is found in sub-Saharan African countries.

Years of work now lie ahead to confirm whether the predictions are correct, explains biologist José Márquez, an expert in protein structure at the Grenoble synchrotron. “The next frontier will be for AlphaFold to contribute to the design of drugs that block or activate proteins, a problem they are already tackling,” he says. Another stumbling block: the system does not explain why a protein adopts its final shape, which can be essential in research into diseases such as Alzheimer’s or Parkinson’s, which are linked to incorrect protein folding.

Alfonso Valencia, Director of Life Sciences at the Barcelona Supercomputing Center, points to the shortcomings of the system. “Not everything is solved, because AlphaFold can only predict things that are in the realm of known things. For example, it cannot predict the structure of a type of protein that protects well against freezing, because such proteins are rare and there are not many examples in the databases, nor can it predict the consequences of mutations, which is a very negative point for medicine,” he says.

He also acknowledges one of its strengths: the code for the entire system is open, which means that other scientists can improve or modify it as they wish, even if Google decides to take the system offline. “It’s obvious that the folks at DeepMind are looking to win the Nobel Prize by acting in this transparent way,” Valencia says. “On the one hand, they gain a great deal in image and an advantage over their competitors, like Facebook. On the other hand, they have already hinted that they are reserving for private use specific data on health and drug design,” he adds.
