New artificial intelligence software can compute protein structures in 10 minutes
Last year the artificial intelligence company DeepMind, a subsidiary of Google’s parent company, Alphabet Inc., won an international competition among scientists working on ways to predict the shapes of proteins. Knowing the shape a protein assumes after it is synthesized is important because a protein’s shape determines its function.
The DeepMind software, called AlphaFold, was seen as a major advance in the effort to find a way to reliably predict the shape of proteins from their amino acid sequence alone. But the software was proprietary and so unavailable to most scientists.
Now researchers at the Institute for Protein Design at the University of Washington School of Medicine in Seattle have created an artificial intelligence program that largely matches the performance achieved by DeepMind.
But, unlike DeepMind, the UW Medicine team has already made their method, dubbed RoseTTAFold, freely available.
These results will be published online today, July 15, by the journal Science.
Scientists from around the world are now using RoseTTAFold to build protein models to accelerate their own research. Soon after its recent upload, the program was downloaded from GitHub by over 140 independent research teams.
Proteins consist of strings of amino acids that fold up into intricate microscopic shapes. These unique shapes in turn give rise to nearly every chemical process inside living organisms. By better understanding protein shapes, scientists can speed up the development of new treatments for cancer, COVID-19, and thousands of other medical disorders.
“I am delighted that the scientific community is already using the RoseTTAFold server to solve outstanding biological problems,” said senior author David Baker, Howard Hughes Medical Institute Investigator, professor of biochemistry, and director of the Institute for Protein Design at UW Medicine.
In the new study, a team of computational biologists led by Baker developed a software tool called RoseTTAFold that uses deep learning to quickly and accurately predict protein structures based on limited information.
Without the aid of such software, it can take years of laboratory work to determine the structure of just one protein. RoseTTAFold, on the other hand, can reliably compute a protein structure in as little as 10 minutes on a single gaming computer.
The team used RoseTTAFold to compute hundreds of new protein structures, including many poorly understood proteins from the human genome. They also generated structures directly relevant to human health, including for proteins associated with problematic lipid metabolism, inflammation disorders, and cancer cell growth. The team further showed that RoseTTAFold can be used to build models of complex biological assemblies in a fraction of the time previously required.
RoseTTAFold is a “three-track” neural network, meaning it simultaneously considers patterns in protein sequences, how a protein’s amino acids interact with one another, and a protein’s possible three-dimensional structure.
In this architecture, one-, two-, and three-dimensional information flows back and forth, allowing the network to collectively reason about the relationship between a protein’s chemical parts and its folded structure.
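The idea of three tracks exchanging information can be illustrated with a toy sketch. The Python below is not RoseTTAFold's code and does not reproduce its network. The function names, update rules, and constants (such as the 0.1 distance penalty) are invented purely for illustration, and nothing here is learned from data. It only shows how per-residue features (one-dimensional), residue-pair scores (two-dimensional), and coordinates (three-dimensional) might each be updated using the other two.

```python
import math
import random


def softmax(row):
    """Turn a row of pair scores into weights that sum to 1."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def three_track_step(seq, pair, coords):
    """One toy round of information exchange between three tracks.

    seq:    L x D per-residue features   (1D track)
    pair:   L x L residue-pair scores    (2D track)
    coords: L x 3 residue positions      (3D track)
    All update rules are hypothetical, chosen only for illustration.
    """
    n, d = len(seq), len(seq[0])

    # 1D -> 2D and 3D -> 2D: similar features raise a pair score,
    # spatial distance lowers it.
    new_pair = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            similarity = sum(a * b for a, b in zip(seq[i], seq[j])) / d
            distance = math.dist(coords[i], coords[j])
            new_pair[i][j] = pair[i][j] + similarity - 0.1 * distance

    # 2D -> 1D and 2D -> 3D: each residue gathers features and positions
    # from all residues, weighted by the updated pair scores.
    new_seq, new_coords = [], []
    for i in range(n):
        w = softmax(new_pair[i])
        gathered = [sum(w[j] * seq[j][k] for j in range(n)) for k in range(d)]
        new_seq.append([0.5 * seq[i][k] + 0.5 * gathered[k] for k in range(d)])
        centroid = [sum(w[j] * coords[j][k] for j in range(n)) for k in range(3)]
        new_coords.append(
            [0.9 * coords[i][k] + 0.1 * centroid[k] for k in range(3)]
        )
    return new_seq, new_pair, new_coords


def make_toy_inputs(n=5, d=4, seed=0):
    """Random stand-in inputs; a real system would derive these from sequences."""
    rng = random.Random(seed)
    seq = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
    pair = [[0.0] * n for _ in range(n)]
    coords = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    return seq, pair, coords


if __name__ == "__main__":
    seq, pair, coords = make_toy_inputs()
    for _ in range(3):  # information flows back and forth over several rounds
        seq, pair, coords = three_track_step(seq, pair, coords)
    print(len(seq), len(pair), len(coords))
```

In this sketch every track is rewritten on each round using the other two, which is the "back and forth" flow in miniature. A real network would replace the fixed arithmetic with learned layers.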
“We hope this new tool will continue to benefit the entire research community,” said Minkyung Baek, a postdoctoral scholar who led the project in the Baker laboratory at UW Medicine.
This work was supported in part by Microsoft, Open Philanthropy Project, Schmidt Futures, Washington Research Foundation, National Science Foundation, Wellcome Trust, and the National Institutes of Health. A full list of supporters is available in the Science paper.
Adapted from a press release written by Ian Haydon, UW Medicine Institute for Protein Design