Defensio Abstract

Speaker: Christina Witwer
Title: Prediction of Conserved and Consensus RNA Structures


Most functional RNA molecules have characteristic secondary structures that are highly conserved during evolution. These secondary structures are used successfully in the interpretation of RNA function and reactivity.

RNA secondary structures can be predicted well by algorithms developed over the last two decades. The alidot algorithm, developed by Hofacker et al., predicts conserved secondary structure elements for a small set of related sequences by combining thermodynamic and phylogenetic structure prediction.

One part of the work presented here uses alidot to identify functionally important structures in the genomes of members of the family Picornaviridae. Picornaviruses are small RNA viruses that include important pathogens such as Poliovirus, Hepatitis A virus and Foot-and-mouth disease virus. This thesis provides a comprehensive computational survey of conserved structural elements in the genomes of picornaviruses. Apart from recognizing known structural elements, among them the internal ribosome entry site (IRES) in the 5'-non-translated region of the genome, we detect a large number of secondary structure elements that have not been described before, most importantly within the coding region.

Many important RNA molecules, however, contain pseudoknots. Pseudoknots are structural motifs that are explicitly excluded from the conventional definition of secondary structures, mainly for computational reasons. While mature algorithms exist for the prediction of consensus secondary structures when pseudoknots are forbidden, the prediction of secondary structures including pseudoknots still relies on comparative sequence analysis, which requires a large set of related sequences.
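
The crossing condition that separates pseudoknots from conventional secondary structures can be illustrated with a minimal sketch. It assumes that a structure is given as a list of base pairs (i, j) of sequence positions; the function names crosses and has_pseudoknot are illustrative and do not belong to any program mentioned in this abstract.

```python
from itertools import combinations


def crosses(p, q):
    """Return True if base pairs p and q interleave, i.e. i < k < j < l."""
    # Normalise each pair to (smaller, larger) and order the two pairs
    # by their first position, so that i <= k holds below.
    (i, j), (k, l) = sorted((tuple(sorted(p)), tuple(sorted(q))))
    return i < k < j < l


def has_pseudoknot(pairs):
    """Return True if any two base pairs in the structure cross."""
    return any(crosses(p, q) for p, q in combinations(pairs, 2))


# A nested hairpin stem: a conventional secondary structure.
nested = [(0, 9), (1, 8), (2, 7)]
# Two interleaved pairs: the simplest pseudoknot-like arrangement.
knotted = [(0, 5), (3, 8)]

print(has_pseudoknot(nested))   # False
print(has_pseudoknot(knotted))  # True
```

Pairs that are nested or lie side by side never satisfy the interleaving test, which is why conventional secondary structures pass this check.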

In the second part of this thesis, we describe a computational method that combines covariation and thermodynamic information to predict a consensus secondary structure including pseudoknots from an alignment of a smaller set of homologous sequences. This program, termed hxmatch, is based on the Maximum Weighted Matching (MWM) algorithm and is thus similar to the method of Tabaska et al., but it uses an improved scoring function and more elaborate postprocessing.

Hxmatch is tested on three different types of RNA known to contain pseudoknots: SRP RNA, RNase P RNA and tmRNA. Comparison with phylogenetic structures of the investigated RNAs shows that a reliability of 60-85% is achieved, while no false positive helices are predicted, even from datasets containing no more than six related sequences. However, at least with our scoring procedure, the use of the MWM algorithm does not improve the quality of the results compared to selecting the base pairs by a greedy algorithm.
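
To illustrate the two selection strategies compared here, the following sketch contrasts greedy selection of base pairs with an exhaustive maximum weighted matching on a toy input. The weights are made-up numbers and not the hxmatch scoring function, and the function names are hypothetical. The exhaustive search merely stands in for the MWM algorithm and is feasible only for a handful of candidate pairs.

```python
def greedy_matching(weights):
    """Pick base pairs in order of decreasing weight.

    weights maps a candidate pair (i, j) to a score; a pair is accepted
    only if neither of its positions is already paired.
    """
    used = set()
    chosen = []
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    for (i, j), w in ranked:
        if w > 0 and i not in used and j not in used:
            chosen.append((i, j))
            used.update((i, j))
    return sorted(chosen)


def exact_matching(weights):
    """Exhaustive maximum weighted matching (toy inputs only).

    Returns (total_weight, pairs). Every candidate pair is either
    skipped or taken, so the run time grows exponentially.
    """
    edges = [(p, w) for p, w in sorted(weights.items()) if w > 0]

    def best(idx, used):
        if idx == len(edges):
            return 0.0, []
        # Option 1: skip the current candidate pair.
        score, pairs = best(idx + 1, used)
        # Option 2: take it, if both positions are still free.
        (i, j), w = edges[idx]
        if i not in used and j not in used:
            sub_score, sub_pairs = best(idx + 1, used | {i, j})
            if sub_score + w > score:
                score, pairs = sub_score + w, [(i, j)] + sub_pairs
        return score, pairs

    return best(0, frozenset())


# Hypothetical scores for three competing candidate pairs.
toy_weights = {(1, 7): 3.0, (1, 8): 2.5, (2, 7): 2.5}

print(greedy_matching(toy_weights))  # [(1, 7)]
print(exact_matching(toy_weights))   # (5.0, [(1, 8), (2, 7)])
```

On these contrived weights the greedy choice keeps only the single best pair, while the matching finds two pairs with a higher total. This shows only that the strategies can differ in principle; it says nothing about how often they differ on real alignments.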