COMPUTATIONAL MEDICINE

Dickerson’s Formula: Biochemistry’s Equivalent to Moore’s Law

Twenty-four years ago, Richard Dickerson came up with a mathematical formula that optimistically predicted an accelerating pace of discovery in the burgeoning field of protein structure determination with X-ray crystallography (see story, page 8). Dickerson, then a professor of physical chemistry at Caltech, noted that the number of protein crystal structures had risen from one solved by the end of 1961 to 23 solved by the end of 1977. His formula predicted that by March 2001, scientists would have solved the 3-D structures of a grand total of more than 12,000 proteins. He was very close.

Dickerson’s Formula Predicts Rising Number of Solved 3-D Protein Structures

The first protein structure solved by scientists in 1961 was myoglobin (inset). In the equation developed by Richard Dickerson for the number of new protein structures solved per year, n is the number of new structures in a given year, e is the base of the natural logarithm (approximately 2.71828), and y is the year (for March 27, 2001, y = 2000.25).

Nobody knew how close until Arthur Arnone, a biochemistry professor at the University of Iowa, checked. Arnone found that the equation predicted that there would be 12,066 crystal structures solved by March 27, 2001. By that date, the Protein Data Bank (PDB), the international online repository of protein data, had posted 12,123 protein structures, only 57 more than Dickerson’s forecast. The Dickerson formula was accurate to within 0.5 percent.

Arnone notes that the PDB tally on February 5, 2002, showed 13,635 X-ray structures for proteins, peptides, viruses, and protein-nucleic acid complexes. However, including the non-protein structures (the 601 nucleic acid structures and 14 carbohydrate structures) the total number of experimental X-ray structures in the PDB was 14,250. Dickerson’s equation predicted 14,201.
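The two comparisons above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts quoted in this article. The helper name `percent_gap` is hypothetical, and expressing the gap as a share of the observed PDB count is an assumption about how the "within 0.5 percent" figure was computed.

```python
def percent_gap(predicted: int, observed: int) -> float:
    """Absolute gap between forecast and tally, as a percentage of the tally.

    Assumption: the gap is measured relative to the observed PDB count.
    """
    return abs(observed - predicted) / observed * 100


# (predicted by Dickerson's equation, observed in the PDB), as quoted in the article
checks = {
    "March 27, 2001": (12_066, 12_123),
    "February 5, 2002": (14_201, 14_250),
}

for date, (predicted, observed) in checks.items():
    gap = observed - predicted
    pct = percent_gap(predicted, observed)
    print(f"{date}: forecast low by {gap} structures ({pct:.2f} percent)")
```

Under that assumption, both dates come out below the 0.5 percent figure cited by Arnone.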

"I think that Dickerson’s equation is to protein structure what Moore’s Law is to semiconductor chips," said Arnone. "Even the time frames are the same. Dickerson’s equation and the current form of Moore’s Law were both derived in the late 1970s." Moore’s Law, created by Intel cofounder Gordon Moore, has forecast, with remarkable accuracy, that the number of transistors per square inch on integrated circuits would double about every 18 months.

Dickerson, now a professor of biochemistry at UCLA and a pioneer in X-ray crystallography, is astounded by his own predictions. "To move from the accidental to the ridiculous, if Dickerson’s Law holds, the year 2024 will see the production of 1 million new protein and nucleic acid structures, or 83,000 new structures per month," said Dickerson. "Somehow, I doubt that. Even the genomics and proteomics researchers who claim that their goal is to sequence all the DNA of an organism, and solve all the protein structures coded by that DNA, don’t go quite that far. But who knows? In short, I seem to have won the lottery. Pity there’s no jackpot."

Phil Bourne, co-director of the PDB and director of Integrative Biosciences at SDSC, describes Dickerson’s prediction as impressively insightful. "He put together the pieces and came up with this equation that basically predicted the exponential growth of the PDB. Back then, the numbers would have seemed staggering."

The details of just a single protein can easily fill a large book. Fortunately for crystallographers and other scientists, the growing encyclopedic catalog of proteins easily fits into the PDB, the world’s largest repository and distribution center of protein data (www.rcsb.org/pdb).

Dickerson’s equation predicts that a combined total of 24,667 structures will be found by 2004. "We’re just starting to see an explosion of growth as the field of proteomics comes into its own," he said. "When I started out, it took years to solve a structure; now they’re solving a couple thousand per year." –CF


Participants
Arthur Arnone
University of Iowa
Helen Berman, John Westbrook
Rutgers University
Phil Bourne
SDSC
Richard Dickerson
UCLA
Gary Gilliland
National Institute of Standards and Technology