Focus: Evolution Thins Out Distracting DNA
Regulatory proteins turn genes on or off by binding to DNA at specific target sequences. Such proteins may also bind less strongly at incorrect places on the genome, impairing the efficiency of biochemical functions. Now researchers suggest that evolution may have crafted many organisms' genomes to avoid DNA sequences that cause this problem. The team analyzed the genomes of 75 organisms and used data from previous experiments on DNA-protein binding strength to conclude that evolutionary pressures have mitigated the problem of mis-binding. Many evolution experts had doubted that mis-binding would lead to detectable effects on a genome.
A typical regulatory protein may bind correctly to anything from 1 to 100 target sites in an organism’s complete genome. While correct binding is always extremely strong (very high binding affinity), biologists have long wondered if evolution may have weeded out DNA sequences to which proteins would bind more weakly, potentially acting as "imposters" for their correct binding sites. Previous work, however, has found evidence of such selection only in bacteria and only for a few specific proteins for which mis-binding could be lethal.
Now physicists Long Qian and Edo Kussell of New York University present much more general evidence for a broad evolutionary shaping of genomes to avoid mis-binding. They made use of earlier research that yielded some data on the tendency of short DNA sequences, or motifs, to bind to regulatory proteins. These studies measured the average binding affinity of every possible eight-letter DNA motif for a sample of regulatory proteins from five organisms—human, mouse, fruit fly, worm, and yeast. Qian and Kussell were specifically concerned with the range of affinities for "incorrect" binding, not the very high affinity of a protein with its target sequence. A motif with low affinity for the proteins in these studies is unlikely to bind the wrong one, whereas those with higher affinities—though not nearly as high as binding with the correct protein—are more likely to do so.
Examining the five genomes in light of the data on binding affinity, Qian and Kussell found a strong negative correlation: the higher the affinity of a given DNA motif for the regulatory proteins, the less likely it was to be present in the genomes. Since this result applies only to a limited number of regulatory proteins, Qian and Kussell came up with a new analysis to determine whether it might apply more generally. If binding affinity were truly responsible for the negative correlation between affinity and presence of a motif in a genome, then motifs with similar affinities should appear in genomes with similar frequencies. The researchers analyzed genomic sequence data for some 75 different species covering all major categories of the evolutionary tree. They found that motifs differing at only a single DNA "letter"—which presumably have similar binding affinities—were present in similar numbers.
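For readers who want to see what such a comparison might look like in practice, here is a minimal Python sketch. It is not the researchers' code or their actual statistic. It simply counts every short motif in a DNA string and then asks whether a motif's count tracks the average count of the motifs that differ from it by one letter. The function names (count_motifs, single_letter_neighbors, neighbor_count_correlation) and the choice of a Pearson correlation are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def count_motifs(sequence, k=8):
    """Count every overlapping k-letter motif in a DNA string."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        motif = seq[i:i + k]
        # Skip windows containing ambiguous letters such as N.
        if set(motif) <= set(ALPHABET):
            counts[motif] += 1
    return counts


def single_letter_neighbors(motif):
    """Yield all motifs that differ from `motif` at exactly one position."""
    for pos, letter in enumerate(motif):
        for alt in ALPHABET:
            if alt != letter:
                yield motif[:pos] + alt + motif[pos + 1:]


def pearson(xs, ys):
    """Plain Pearson correlation; returns NaN if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def neighbor_count_correlation(counts, k):
    """Correlate each motif's count with the mean count of its neighbors.

    Illustrative statistic only (an assumption, not the published method).
    """
    motifs = ["".join(p) for p in product(ALPHABET, repeat=k)]
    own = [counts[m] for m in motifs]
    neighbor_mean = [
        sum(counts[n] for n in single_letter_neighbors(m)) / (3 * k)
        for m in motifs
    ]
    return pearson(own, neighbor_mean)
```

On a real genome one would call count_motifs with k=8 and pass the result to neighbor_count_correlation. A serious analysis would also need to handle both DNA strands and the uneven base composition of each genome, which this sketch ignores.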
“We found this signature in each and every genome that we tested,” says Kussell. “We were surprised that this pattern hadn’t been noticed before.” The implication of the pattern is that evolution has resulted in a systematic reduction in the frequency of DNA motifs that are likely to mis-bind regulatory proteins. To check this idea, the researchers used a mathematical model for DNA evolution to see if the observed pattern could be the result of so-called neutral evolution, caused by random mutations over time without any evolutionary constraints that would inhibit the appearance of mis-binding motifs. They found that none of the models they tested could explain the pattern.
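The logic of testing against a constraint-free baseline can be illustrated with a much cruder stand-in than the DNA-evolution models the researchers used. The Python sketch below shuffles the letters of a sequence, which preserves its base composition but destroys any motif-level structure, and then asks how often a shuffled sequence contains a given motif as rarely as the real one does. The shuffle baseline, the function names, and the default parameters are all assumptions chosen for illustration.

```python
import random


def count_occurrences(sequence, motif):
    """Count overlapping occurrences of `motif` in `sequence`."""
    k = len(motif)
    return sum(
        1 for i in range(len(sequence) - k + 1) if sequence[i:i + k] == motif
    )


def shuffled(sequence, rng):
    """Return a random permutation of the letters in `sequence`."""
    letters = list(sequence)
    rng.shuffle(letters)
    return "".join(letters)


def depletion_p_value(sequence, motif, n_shuffles=200, seed=0):
    """Empirical probability that a shuffled sequence has as few copies
    of `motif` as the observed sequence does.

    A small value hints that the motif is rarer than base composition
    alone would predict. The +1 terms keep the estimate above zero.
    """
    rng = random.Random(seed)
    observed = count_occurrences(sequence, motif)
    as_low = sum(
        1
        for _ in range(n_shuffles)
        if count_occurrences(shuffled(sequence, rng), motif) <= observed
    )
    return (as_low + 1) / (n_shuffles + 1)
```

A letter shuffle is far simpler than a model of mutation over time, so it shows only the shape of the argument: if no unconstrained baseline reproduces the observed pattern, something other than chance is needed to explain it.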
Taken together, Kussell says, these findings suggest that binding strength is a key determinant of motif frequency in all organisms and that all 75 genomes show a general turning away from motifs that are likely to mis-bind regulatory proteins. Another satisfying element of the work, Kussell adds, is that it explains why short DNA motifs tend to appear with similar frequencies in organisms that diverged very long ago. The common mouse and the fruit fly, for example, diverged some 600 million years ago and have evolved independently for billions of generations. Even so, the relative frequencies of DNA motifs in the two genomes show a similar pattern. That isn’t so surprising if both genomes continue to be shaped by pressure to avoid the mis-binding of regulatory proteins.
Before this study, says Eugene Koonin of the National Center for Biotechnology Information in Bethesda, Maryland, many biologists believed that the evolutionary consequences of protein mis-binding were probably insignificant and were almost certainly too small to be detected by today's genomic tools. “The results in this paper,” says Koonin, “strongly suggest that such selection is both detectable and biologically relevant.” For evolutionary biology, he adds, that’s an important conclusion.
This research is published in Physical Review X.
Mark Buchanan is a freelance science writer based in Wales and Normandy, France.