0954 GMT May 24, 2022
Set up in 1994, the “Critical Assessment of protein Structure Prediction” (CASP) contest had seen progress almost come to a halt. Many in the field had given up hope that they would live to see a solution. In 2018, however, DeepMind won the “protein Olympics” by some distance. This year its AlphaFold2 software lapped the opposition. While it was a great leap for CASP, it seemed a small step for humanity. DeepMind trained a neural network on protein-structure databases to learn what proteins look like, rapidly absorbing the evolutionary adaptations that had occurred over millennia and using those insights in its predictions.
Proteins are key building blocks of life, intimately involved in every biological process. Some cancers are traced to the overproduction of particular proteins. The body’s metabolism is regulated by a protein, insulin. Human proteins, often hundreds of amino acids long, can fold in an astonishing number of ways: about a googol cubed, or 10 to the power of 300. Shape determines whether a protein will catalyze a chemical reaction, become an organism’s scaffolding or transport molecules in and out of the cell. Misshapen proteins cause many deadly diseases and play a role in ageing.
It may not yet be possible to envisage the full benefits of DeepMind’s work. Drugs work by attaching to a protein in a particular place, thereby altering or disabling its function. It is the 3D structure of the coronavirus spike protein that allows it to bind tightly to receptors in our noses. Knowing a protein’s shape may allow scientists in the future to identify such binding sites and make it easier to synthesize therapeutics from scratch. But there is a long way to go, as many interactions in the body remain riddles without solutions. DeepMind admits its code cannot handle complexes of proteins that work together to carry out key functions. In short, the protein-folding problem is far from solved.
DeepMind’s achievement answers one big scientific question but raises more fundamental ones for society. As part of a profit-seeking company, DeepMind pays large salaries for scarce AI talent. Its groundbreaking news was announced in a company press release. It has yet to submit a paper describing this work to a peer-reviewed journal, though it did this year publish one on its 2018 CASP entry. A simple idea underpins science: results should always be open to challenge from experiment. Commercial firms may want to be trusted more than scrutinized.
If DeepMind represents a paradigm shift in biology, it is the growing impact of artificial intelligence on the discipline. In 2020 alone, an estimated 21,000 scientific papers in this branch of science are thought to involve AI methods – a figure growing at 50 percent a year. The field is also dominated by tech giants whose code is their intellectual property, making it particularly opaque. Only 25 percent of AI papers publish their code; DeepMind, say experts, regularly does not. This impairs accountability and reproducibility, and may ultimately hamper progress. There are ongoing attempts to share proprietary data while respecting its highly confidential nature, but it would be better if the industry adopted a more open-source attitude.
Science has traditionally progressed by freely distributing knowledge. The underlying concern is that DeepMind, like its rival OpenAI, may opt to commercialize its deep-learning model instead of making it freely available. Some argue that price is not a problem and that AlphaFold2 is cheap. Yet DeepMind’s advances rest in part on state-backed breakthroughs – from the evolutionary insight of the Spanish bioinformatician Alfonso Valencia to the computational work of scientists such as UCL’s David Jones. It would be strange if, in years to come, university researchers used government cash to pay DeepMind for a system built on government-funded insights. The universe is not short of mysteries. Their discovery should be celebrated and dedicated largely to the public good rather than wholly to the pursuit of profit.
This article is a Guardian editorial.