An AI system has been trained to predict the structure of almost every protein in the human body.

Over decades, scientists around the world have identified more than 200 million proteins across the living world, each with a unique makeup, but the three-dimensional structure of most of them (the shape each one folds into) remains unknown.

Researchers at DeepMind, Google’s UK-based AI company, have developed AlphaFold, an AI system that can predict the shape of proteins. They say it will give scientists a tool to understand diseases more quickly and to work out how to use proteins to our advantage.

The dataset of AlphaFold predictions gives a confident structural position for nearly 60 per cent of the amino acids in the human proteome, the full set of human proteins. The predictions will be made freely available to the research community through a public database hosted by the European Bioinformatics Institute (EMBL-EBI).

Determining the structure of proteins can provide valuable information for understanding biological processes and could inform drug development. 

Given the importance of understanding the human proteome for health and medicine, intensive efforts have been made to determine these protein structures. However, after decades of research, only 17 per cent of the human proteome’s amino acids (the subunits that are linked together to form proteins) have been included within an experimentally determined structure.

Experimental structure determination requires overcoming many time-consuming hurdles and, as such, obtaining more extensive coverage of the proteome remains a key challenge.

Researchers are now using AlphaFold to predict the structures of proteins covering almost the entire human proteome (98.5 per cent of all human proteins).

The authors found that AlphaFold was able to make a confident prediction of the structural position of 58 per cent of the amino acids in the human proteome. Within that, the position of 35.7 per cent of all amino acids was predicted with a very high degree of confidence, roughly double the 17 per cent covered by experimental structures.

At the protein level, AlphaFold produced a confident prediction covering at least three quarters of the amino acid sequence for 43.8 per cent of proteins.
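The amino-acid-level and protein-level figures measure different things, and a small sketch may make the distinction concrete. It assumes each amino acid carries a confidence score on a 0 to 100 scale, with 70 as the cutoff for "confident" and 90 for "very high" confidence. Those cutoffs, the function names and the toy data are illustrative assumptions, not details taken from the study.

```python
from typing import Dict, List

# Assumed cutoffs on a 0-100 per-amino-acid confidence scale (illustrative only).
CONFIDENT = 70.0
VERY_CONFIDENT = 90.0


def residue_coverage(scores: Dict[str, List[float]], cutoff: float) -> float:
    """Share of all amino acids, pooled across proteins, scoring at or above cutoff."""
    total = sum(len(s) for s in scores.values())
    hits = sum(1 for s in scores.values() for x in s if x >= cutoff)
    return hits / total


def protein_coverage(
    scores: Dict[str, List[float]], cutoff: float, min_fraction: float = 0.75
) -> float:
    """Share of proteins in which at least min_fraction of amino acids meet cutoff."""
    ok = sum(
        1
        for s in scores.values()
        if sum(1 for x in s if x >= cutoff) / len(s) >= min_fraction
    )
    return ok / len(scores)


# Hypothetical toy proteome: two proteins with four amino acids each.
toy = {
    "protein_a": [95.0, 92.0, 88.0, 75.0],  # every position is confident
    "protein_b": [91.0, 60.0, 40.0, 30.0],  # only one position is confident
}

print(residue_coverage(toy, CONFIDENT))       # 0.625 (5 of 8 amino acids)
print(residue_coverage(toy, VERY_CONFIDENT))  # 0.375 (3 of 8 amino acids)
print(protein_coverage(toy, CONFIDENT))       # 0.5   (1 of 2 proteins)
```

In this toy case most amino acids are confidently placed, yet only half the proteins clear the three-quarters bar, which is why the two percentages reported above need not match.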

The authors conclude that large-scale, accurate structure prediction will become an important tool, allowing new scientific questions to be addressed from a structural perspective. They add that AlphaFold's predictions will help to further illuminate the role of proteins.

The study is accessible here.
