probA


Calculation of the Partition Function, Match Probabilities and Stochastic Backtracking




Along the length of biological sequences, the level of conservation between two sequences may vary considerably. The reliability of an alignment therefore depends on the level of local sequence conservation. In addition, the dynamic programming algorithms used to derive the "optimal" alignment have an inherent ambiguity that arises from the non-uniqueness of optimal solutions and from the particular scheme by which the search space is evaluated (Giegerich 2000). Several approaches to dealing with this effect have been reported, starting with the investigation of suboptimal alignments by Vingron (1990) and Saqi (1991). The use of the partition function of all alignments was pioneered by Miyazawa (1994).

We implemented a C program for global pairwise alignment that uses the partition function over all canonical alignments to calculate the probability of every match between a position in the first sequence and a position in the second. To calculate the partition function we used a modified alignment algorithm that avoids generating solutions that are represented differently but are semantically equivalent. Furthermore, we include a parameter governing the relative weight of alignment paths with different scores (Kschischo 2000), and we extend previous approaches to stochastic pairwise alignment with a stochastic backtracking procedure that can be used to obtain ensembles of suboptimal alignments with correct statistical weights.

For more information about the program, see the probA manpage. If you want to include our code in your own programs, documentation of the probA library is available.

The package is free software and can be downloaded as C source code. Go ahead and download the gzipped tar file of version 0.1.1.

And here is the defense talk:
ulli.pdf

Ulli Mückstein, <ulim@tbi.univie.ac.at>
Institut für theoretische Chemie, Währingerstr. 17,
A-1090 Wien, Austria

Last modified: 2005-11-17 10:44:40 ulim