Protein Structure Initiative’s research spending is misguided
Understanding how proteins fold in cells is arguably one of the best-studied yet most elusive problems in biology. The difficulty stems from a fundamental inability to predict a protein’s structure from its instruction manual (DNA) and its parts (amino acids). The problem is so complicated that scientists are pursuing expensive, and sometimes fruitless, methods of studying proteins.
A protein begins as a string of amino acids, but it is useless until this string coils and folds into a particular shape. Proteins fold in beautiful and complex ways. There are untold ways a protein can fold, and the pattern varies from protein to protein. Knowledge of this intricate folding pattern would give researchers information about drug assembly processes and potential drug design.
For the most part, finding an intuitive model that predicts how a linear string of amino acids, specified by a particular DNA sequence, will fold into a three-dimensional protein shape has remained challenging. The number of possible shapes is so astronomical that sampling all of them would take longer than the age of the universe. Yet somehow, a protein can find its correct shape in as little as microseconds.
The picture of protein folding is baffling. Computational models cannot yet predict protein structures both accurately and routinely.
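The sampling argument above, often called Levinthal’s paradox, can be put into rough numbers. The sketch below is a back-of-the-envelope estimate only: the chain length, the number of shapes per amino acid, and the sampling rate are all illustrative assumptions, not figures from PSI researchers, and the function name is hypothetical.

```python
# Back-of-the-envelope estimate of an exhaustive search over protein shapes.
# Every constant below is an illustrative assumption, not a measured value.
RESIDUES = 100                 # assumed length of a small protein
STATES_PER_RESIDUE = 3         # assumed number of shapes each amino acid can adopt
SAMPLES_PER_SECOND = 1e13      # assumed rate at which shapes could be tried
SECONDS_PER_YEAR = 3.156e7
AGE_OF_UNIVERSE_YEARS = 1.38e10


def exhaustive_search_years(residues=RESIDUES,
                            states=STATES_PER_RESIDUE,
                            rate=SAMPLES_PER_SECOND):
    """Years needed to try every possible shape once, at the assumed rate."""
    conformations = states ** residues   # every combination of per-residue shapes
    return conformations / rate / SECONDS_PER_YEAR


years = exhaustive_search_years()
ratio = years / AGE_OF_UNIVERSE_YEARS
print(f"{years:.1e} years, roughly {ratio:.0e} times the age of the universe")
```

Under these assumptions the search would take on the order of 10^27 years. Changing the assumptions shifts the figure, but not the conclusion that a protein cannot be finding its shape by trying every possibility.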
Still, the lack of an intuitive explanation has not stopped scientists the world over from adopting costly, and sometimes controversial, approaches to amassing the world’s largest structural database — one protein at a time. Four research centers are taking part in the Protein Structure Initiative (PSI), a worldwide, brute-force approach to understanding proteins that many have hailed as a revolution.
However, by the end of 2010, the project will have cost more than three-quarters of a billion dollars.
Are proteins that important? Yes. But we need a more cost-effective way of researching them.
Appreciating the scope of this problem requires a quick recap of elementary biology.
To begin, it is important to grasp how DNA codes for proteins. A strand of DNA is made up of individual “letters” (nucleotides) represented by A, G, C, and T. In the cell, this string of letters can be thought of as a code that, once copied into messenger RNA, is read in groups of three by a molecular machine called the ribosome. As the code is read, little blocks called amino acids are pulled into the ribosome one at a time and linked together, and the growing chain is spit out the other end. Together, this string of ejected blocks forms a protein.
But there is a fly in the ointment. Given a particular sequence of DNA, it’s nearly impossible to predict how the resulting protein will fold. It was originally thought that the string of amino acid blocks was linear and remained that way, so differences among proteins were chiefly attributed to differences in amino acid sequences.
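The three-letters-at-a-time reading described above can be sketched in a few lines of Python. This is a simplified illustration, not a model of the cell: it reads the DNA letters directly, uses the standard genetic code with one-letter amino acid abbreviations, and stops at the first stop signal. The function name `translate` and the example sequence are made up for the illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
# Each character is a one-letter amino acid code; "*" marks a stop signal.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read a DNA string three letters at a time and return the amino acid string."""
    protein = []
    for i in range(0, len(dna) - 2, 3):       # step through complete triplets only
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":                  # a stop signal ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


# ATG GCC TGG TAA reads as M, A, W, stop
print(translate("ATGGCCTGGTAA"))  # prints "MAW"
```

The output is only the order of the blocks. Nothing in it says what shape the finished chain will take, which is exactly the gap the rest of this article is about.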
That is not the case.
Proteins fold, and they do it with magnificent complexity. Some proteins, like cadherins, look like Twizzlers. Others look like rubber band balls.
Researchers involved in the PSI are churning out protein structures at a furious rate. A structure that used to take two years to resolve now takes two days. Over the course of seven years, the PSI has contributed roughly 40 percent of the structures available in the Protein Data Bank.
This is indeed impressive. The PSI has accomplished it with elaborate and costly equipment, turning itself into a glistening, state-of-the-art molecule factory. So far, it’s done this to the tune of $80 million.
But is it necessary to understand all of these protein structures? As the PSI approaches the halfway mark of its second five-year phase, a growing number of scientists are calling it a mistake, with some reviewers even calling the approach “tragically flawed.” One key problem is that novel protein families are being discovered far faster than protein structures can be produced. The continual discovery of proteins that have never been seen before could make the protein structure problem an open-ended one. Moreover, protein structures are of little to no use outside of a biochemical context.
The PSI is doubtless doing a wonderful thing for molecular biology. The difficulties of analyzing protein structures are drawing responses from scientists on a national level. Researchers around the world are using the Protein Data Bank to help them design better drugs and vaccines. But when it comes to the PSI, and the burgeoning costs of sustaining it, perhaps less is more.