Speaker | Christina Witwer |
Title | Prediction of Conserved and Consensus RNA Structures |
Most functional RNA molecules have characteristic secondary structures that are highly conserved during evolution. These secondary structures are used successfully in the interpretation of RNA function and reactivity.
RNA secondary structures can be predicted well by algorithms developed
in the last two decades. The algorithm alidot, developed by
Hofacker et al., predicts conserved secondary structure elements
for a small set of related sequences, combining thermodynamic and
phylogenetic structure prediction.
One part of the work presented here is concerned with the
identification of functionally important structures in the genomes of
members of the family Picornaviridae, using alidot.
Picornaviruses are small RNA viruses that include important pathogens such as
Poliovirus, Hepatitis A virus and Foot-and-mouth disease
virus. This thesis provides a comprehensive computational survey of
conserved structural elements in the genomes of picornaviruses. Apart from
the recognition of known structural elements, among them the internal
ribosome entry site (IRES) in the 5'-non-translated region of the genome,
we detect a large number of secondary structure elements that have not been
described before, most importantly within the coding region.
Many important RNA molecules, however, contain pseudoknots. Pseudoknots are structural motifs that are explicitly excluded from the conventional definition of secondary structures, mainly for computational reasons. While fully developed algorithms exist for predicting consensus secondary structures when pseudoknots are forbidden, the prediction of secondary structures that include pseudoknots still relies on comparative sequence analysis, which requires a large set of related sequences.
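To make the notion of a pseudoknot concrete, here is a minimal sketch in Python. It assumes a structure is given as a list of base pairs (i, j) of sequence positions, and it treats two pairs (i, j) and (k, l) as forming a pseudoknot when they cross, that is, when i < k < j < l. The function name and the example structures are illustrative only and are not taken from the thesis.

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs cross (i < k < j < l)."""
    # Normalize each pair so the smaller index comes first, then sort by it.
    normalized = sorted((min(i, j), max(i, j)) for i, j in pairs)
    for a, (i, j) in enumerate(normalized):
        # Later pairs start at or after i, so only the crossing test remains.
        for k, l in normalized[a + 1:]:
            if i < k < j < l:
                return True
    return False


# Hypothetical toy structures: a nested hairpin and a crossing arrangement.
nested = [(0, 9), (1, 8), (2, 7)]
knotted = [(0, 5), (1, 4), (3, 8)]

print(has_pseudoknot(nested))
print(has_pseudoknot(knotted))
```

Conventional secondary structure prediction only admits structures for which such a check returns False, which is what makes the usual dynamic programming recursions applicable.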
In the second part of this thesis, we describe a computational method
that combines covariation and thermodynamic information to predict a
consensus secondary structure including pseudoknots from an alignment of a
smaller set of homologous sequences. This program, termed hxmatch,
is based on the Maximum Weighted Matching (MWM)
algorithm and is thus similar to the method of Tabaska et al., but it
uses an improved scoring function and more elaborate postprocessing.
Hxmatch is tested on three different types of RNA known to contain pseudoknots: SRP RNA, RNase P RNA and tmRNA. Comparison with the phylogenetic structures of the investigated RNAs shows that a reliability of 60-85% is achieved, while no false positive helices are predicted, even from datasets containing no more than six related sequences. However, at least with our scoring procedure, using the MWM algorithm does not improve the quality of the results compared to selecting the base pairs with a greedy algorithm.
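The greedy alternative mentioned above can be illustrated with a short Python sketch. This is not the hxmatch implementation. It only assumes that each candidate base pair (i, j) already carries a numeric score, and it picks pairs in order of decreasing score while allowing each position to pair at most once. The function name, the `min_score` threshold and the example scores are hypothetical.

```python
def greedy_pairs(scores, min_score=0.0):
    """Select base pairs greedily by descending score.

    scores: dict mapping (i, j) -> weight (assumed to be precomputed).
    Each sequence position may take part in at most one selected pair.
    """
    used = set()
    chosen = []
    # Highest score first; ties are broken by position for determinism.
    for (i, j), weight in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
        if weight <= min_score:
            break  # all remaining candidates score too low
        if i in used or j in used:
            continue  # one of the positions is already paired
        chosen.append((i, j))
        used.update((i, j))
    return sorted(chosen)


# Hypothetical scores for a toy alignment of length 10.
example_scores = {(0, 7): 3.0, (1, 7): 2.8, (1, 6): 2.5, (3, 9): 2.0, (2, 5): 1.0}
print(greedy_pairs(example_scores))
```

Because the only constraint is that each position pairs at most once, the selected set may contain crossing pairs such as (0, 7) and (3, 9), so pseudoknots are not excluded. An exact maximum weighted matching would instead maximize the total score over all such selections, which a greedy choice does not guarantee in general.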