Consensus folding of aligned sequences as a new measure for the detection of functional RNAs by comparative genomics
Stefan Washietl, Ivo L. Hofacker
J Mol Biol 342: 19-30 (2004).
Facing the ever-growing list of newly discovered classes of functional RNAs, it can be expected that further types of functional RNAs are still hidden in recently completed genomes. The computational identification of such RNA genes is, therefore, of major importance. While most known functional RNAs have characteristic secondary structures, their free energies are generally not statistically significant enough to distinguish RNA genes from the genomic background. Additional information is required. Considering the wide availability of new genomic data from closely related species, comparative studies seem to be the most promising approach. Here we show that prediction of consensus structures of aligned sequences can be a significant measure to detect functional RNAs. We report a new method for testing multiple sequence alignments for the existence of an unusually structured and conserved fold. We show for alignments of six types of well-known functional RNA that an energy score consisting of free energy and a covariation term significantly improves sensitivity compared to single-sequence predictions. We further test our method on a number of non-coding RNAs from C. elegans/C. briggsae and seven Saccharomyces species. Most RNAs can be detected with high significance. We provide a Perl implementation that can be readily used to score single alignments, and we discuss how the methods described here can be extended to allow for efficient genome-wide screens.
Keywords: Minimum free energy folding, consensus secondary structure prediction, non-coding RNAs, comparative genomics, randomizing multiple sequence alignments
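
The idea of testing an alignment against randomized versions of itself can be illustrated with a small Python sketch. It is not the Perl implementation mentioned in the abstract and makes several assumptions: the names shuffle_columns, toy_score and z_score are hypothetical, the randomization is a plain column shuffle (a simplification that ignores local conservation and gap patterns), the significance measure is assumed to be a z-score, and toy_score is only a stand-in for a real consensus folding energy with a covariation term.

```python
import random
import statistics
from typing import Callable, List, Optional

# Watson-Crick and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def shuffle_columns(alignment: List[str], rng: random.Random) -> List[str]:
    """Return a copy of the alignment with its columns randomly permuted."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    order = list(range(length))
    rng.shuffle(order)
    return ["".join(row[i] for i in order) for row in alignment]


def toy_score(alignment: List[str]) -> float:
    """Placeholder score (assumption, not a folding energy): minus the number
    of mirrored column pairs (i, L-1-i) that can pair in every sequence."""
    length = len(alignment[0])
    paired = 0
    for i in range(length // 2):
        j = length - 1 - i
        if all((row[i], row[j]) in PAIRS for row in alignment):
            paired += 1
    return -float(paired)


def z_score(
    alignment: List[str],
    score: Callable[[List[str]], float],
    samples: int = 100,
    seed: Optional[int] = None,
) -> float:
    """Compare the native score with scores of column-shuffled alignments."""
    rng = random.Random(seed)
    native = score(alignment)
    background = [score(shuffle_columns(alignment, rng)) for _ in range(samples)]
    mean = statistics.mean(background)
    sd = statistics.stdev(background)
    if sd == 0:
        raise ValueError("background scores have zero variance")
    return (native - mean) / sd


if __name__ == "__main__":
    # Two aligned hairpin-like sequences with a compensatory G-C / C-G change.
    aln = ["GGGGAAAACCCC", "GGGCAAAAGCCC"]
    print(z_score(aln, toy_score, samples=200, seed=1))
```

With an energy-like score, lower is better, so a strongly negative z-score would indicate an alignment that is more structured than its shuffled counterparts. In practice the score argument would wrap a real consensus structure predictor instead of toy_score.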
Last modified: 2004-08-30 11:13:24 ivo