Author(s):
U. Mückstein, I.L. Hofacker, P.F. Stadler
Appeared in:
Bioinformatics 18: S153-S160 (2002)
Abstract:
Motivation: The level of sequence conservation between related
nucleic acids or proteins often varies considerably along the
sequence. Both regions with high variability (mutational hot-spots)
and regions of almost perfect sequence identity may occur in the same
pair of molecules. The reliability of an alignment therefore strongly
depends on the level of local sequence similarity.
Results: The probability P_ij of a match
between position i in the first and position j in the
second sequence is computed using the partition function over all
canonical pairwise alignments. A probabilistic backtracking procedure
can then be used to generate ensembles of suboptimal alignments with
correct statistical weights.
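
The two steps described above can be illustrated with a minimal sketch. It is not the recursion or scoring scheme of the software described here. It assumes a simple global alignment model with a linear gap penalty, and the parameter names (match, mismatch, gap, T) and function names are hypothetical. It also does not enforce the restriction to canonical alignments, so adjacent insertions and deletions in different orders are counted as separate alignments. Forward and backward partition functions give the match probabilities P_ij, and stochastic backtracking through the forward table draws alignments with their Boltzmann weights.

```python
import math
import random

def partition_tables(a, b, match=1.0, mismatch=-1.0, gap=2.0, T=1.0):
    """Forward (Zf) and backward (Zb) partition functions over global alignments.

    Assumed toy scoring: 'match'/'mismatch' scores and a linear 'gap' penalty,
    all divided by a temperature-like parameter T.
    """
    n, m = len(a), len(b)
    wg = math.exp(-gap / T)  # Boltzmann weight of one gap column

    def w(i, j):  # weight of aligning a[i] with b[j], positions 1-based
        return math.exp((match if a[i - 1] == b[j - 1] else mismatch) / T)

    # Zf[i][j]: sum of weights of all alignments of a[:i] with b[:j]
    Zf = [[0.0] * (m + 1) for _ in range(n + 1)]
    Zf[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                Zf[i][j] += Zf[i - 1][j - 1] * w(i, j)
            if i > 0:
                Zf[i][j] += Zf[i - 1][j] * wg
            if j > 0:
                Zf[i][j] += Zf[i][j - 1] * wg

    # Zb[i][j]: sum of weights of all alignments of a[i:] with b[j:]
    Zb = [[0.0] * (m + 1) for _ in range(n + 1)]
    Zb[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n and j < m:
                Zb[i][j] += Zb[i + 1][j + 1] * w(i + 1, j + 1)
            if i < n:
                Zb[i][j] += Zb[i + 1][j] * wg
            if j < m:
                Zb[i][j] += Zb[i][j + 1] * wg
    return Zf, Zb, w, wg

def match_probabilities(a, b, **params):
    """P[i-1][j-1] = probability that position i of a is matched to position j of b."""
    Zf, Zb, w, _ = partition_tables(a, b, **params)
    n, m = len(a), len(b)
    Z = Zf[n][m]
    return [[Zf[i - 1][j - 1] * w(i, j) * Zb[i][j] / Z
             for j in range(1, m + 1)] for i in range(1, n + 1)]

def sample_alignment(a, b, rng, **params):
    """Stochastic backtracking: draw one alignment with probability weight / Z."""
    Zf, _, w, wg = partition_tables(a, b, **params)
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        r = rng.random() * Zf[i][j]
        diag = Zf[i - 1][j - 1] * w(i, j) if i > 0 and j > 0 else 0.0
        up = Zf[i - 1][j] * wg if i > 0 else 0.0
        if i > 0 and j > 0 and r < diag:        # match or mismatch column
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and (j == 0 or r < diag + up):  # gap in the second sequence
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:                                    # gap in the first sequence
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))

if __name__ == "__main__":
    rng = random.Random(0)
    P = match_probabilities("GATTACA", "GATCA")
    print([round(p, 3) for p in P[0]])
    for _ in range(3):
        print(*sample_alignment("GATTACA", "GATCA", rng), sep="\n", end="\n\n")
```

Lowering T in this sketch concentrates the ensemble on the optimal alignment, while raising it spreads probability over more suboptimal alignments. Averaging match indicators over many sampled alignments should approach the matrix returned by match_probabilities.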
A comparison between structure based alignments and large samples of
stochastic alignments shows that the ensemble contains correct
alignments with significant probabilities even though the optimal
alignment deviates significantly from the structural alignment.
Ensembles of suboptimal alignments obtained by stochastic
backtracking, or the match probability matrices themselves, are
therefore promising starting points for improved iterative multiple
alignment procedures. In particular, it should be possible to overcome
the problem of fixating an incorrect pairwise alignment in an early
iteration.
Availability: The software described in this contribution is available for download at http://www.tbi.univie.ac.at/~ulim/probA.
Keywords: Alignments, Partition Function, Stochastic
Backtracking