

2.2 Partition Function Folding

Instead of only the minimum free energy structure, the partition function over all possible structures can be calculated, and from it the pairing probability of every possible base pair, using the dynamic programming algorithm described by McCaskill (1990). The following functions are provided:

— Function: float pf_fold (char* sequence, char* structure)

calculates the partition function Z of sequence and returns the free energy of the ensemble F in kcal/mol, where F = -kT ln(Z). If structure is not a NULL pointer on input, it contains on return a string consisting of the characters ". , | { } ( )", denoting bases that are essentially unpaired, weakly paired, strongly paired without preference, weakly upstream (downstream) paired, or strongly upstream (downstream) paired, respectively.

If fold_constrained (see Variables) is 1, the structure string is interpreted on input as a list of constraints for the folding. The character "x" marks bases that must be unpaired, matching brackets "( )" denote base pairs, and all other characters are ignored. Any pair conflicting with a constraint is forbidden, which is usually sufficient to ensure that the constraints are honored.

If do_backtrack (see Variables) has been set to 0, base pairing probabilities are not computed (saving CPU time); otherwise pr[iindx[i]-j] (see Variables) contains the probability that bases i and j pair.

— Function: void init_pf_fold (int length)

allocates memory for folding sequences not longer than length; sets up pairing matrix and energy parameters. Has to be called before the first call to pf_fold().

— Function: void free_pf_arrays (void)

frees the memory allocated by init_pf_fold().

— Function: void update_pf_params (int length)

Call this function to recalculate the pair matrix and energy parameters after a change in folding parameters like temperature (see Variables).

— Function: double mean_bp_dist (int length)

computes the mean base pair distance in the equilibrium ensemble as a measure of structural diversity. It is given by <d> = sum_a,b p_a * p_b * d(a,b), where the sum runs over all pairs of possible structures a,b, p_a is the Boltzmann weight of structure a, and d(a,b) is the base pair distance (see bp_distance() in Distances). The mean base pair distance can be computed efficiently from the pair probabilities p_ij as <d> = sum_ij p_ij * (1-p_ij). Uses the global pr array filled by a previous call to pf_fold().
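The equivalence of the two expressions for <d> can be checked on a small hand-made ensemble. The sketch below is an illustration under stated assumptions, not the library's implementation: all function names are hypothetical, structures are stored as 0/1 pair indicator matrices, the weights w[] are assumed to be normalized so that they sum to 1, and the sum over ij is read as running over ordered index pairs, which equals twice the sum over i < j.

```c
#include <assert.h>

/* Entry (i,j) of an n x n matrix stored row by row; only i < j is used. */
#define IDX(i, j, n) ((i) * (n) + (j))

/* Pair probabilities p_ij = sum_s w[s] * x_s[i][j].
 * x holds S indicator matrices of n*n entries each, one per structure. */
void toy_pair_probs(const unsigned char *x, const double *w,
                    int S, int n, double *p)
{
    assert(S > 0 && n > 0);
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++) {
            double sum = 0.0;
            for (int s = 0; s < S; s++)
                sum += w[s] * x[s * n * n + IDX(i, j, n)];
            p[IDX(i, j, n)] = sum;
        }
}

/* Base pair distance: number of pairs present in exactly one structure. */
int toy_bp_dist(const unsigned char *xa, const unsigned char *xb, int n)
{
    int d = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            d += (xa[IDX(i, j, n)] != xb[IDX(i, j, n)]);
    return d;
}

/* <d> = sum_a,b w_a * w_b * d(a,b), evaluated by brute force. */
double toy_mean_dist_direct(const unsigned char *x, const double *w,
                            int S, int n)
{
    double d = 0.0;
    for (int a = 0; a < S; a++)
        for (int b = 0; b < S; b++)
            d += w[a] * w[b] * toy_bp_dist(x + a * n * n, x + b * n * n, n);
    return d;
}

/* <d> from pair probabilities: 2 * sum over i < j of p_ij * (1 - p_ij). */
double toy_mean_dist_from_probs(const double *p, int n)
{
    double d = 0.0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            d += p[IDX(i, j, n)] * (1.0 - p[IDX(i, j, n)]);
    return 2.0 * d;
}
```

For an ensemble of two structures on four bases, the open chain with weight 0.25 and a single pair between the first and last base with weight 0.75, both routes give <d> = 2 * 0.75 * 0.25 = 0.375. The brute-force route costs a double loop over all structures, while the route via p_ij only needs the pair probabilities.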

— Function: char *centroid (int length, double *dist)

Computes the centroid structure, i.e. the structure with the lowest average base pair distance to all structures in the Boltzmann ensemble. It can be computed directly from the pair probabilities by choosing all base pairs with probability greater than 0.5. The distance of the centroid to the ensemble is returned in dist.
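The selection rule above can be sketched in a few lines. This is a hypothetical stand-alone helper, not the library's centroid(): the name centroid_from_probs, its signature, and the row-by-row storage of the pair probabilities (0-based indices, entry i * n + j for i < j) are assumptions made for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Builds the centroid in dot-bracket notation from pair probabilities.
 * p         : n*n array, p[i * n + j] is the probability of pair (i,j), i < j
 * structure : caller-provided buffer of at least n + 1 characters
 * returns   : expected base pair distance of the centroid to the ensemble */
double centroid_from_probs(const double *p, int n, char *structure)
{
    assert(n > 0);
    double dist = 0.0;

    memset(structure, '.', (size_t)n);   /* start with all bases unpaired */
    structure[n] = '\0';

    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++) {
            double pij = p[i * n + j];
            if (pij > 0.5) {
                /* pair is chosen: it is missing with probability 1 - pij */
                structure[i] = '(';
                structure[j] = ')';
                dist += 1.0 - pij;
            } else {
                /* pair is not chosen: it is present with probability pij */
                dist += pij;
            }
        }
    return dist;
}
```

Because the pair probabilities of any single base sum to at most 1, no base can take part in two pairs with probability above 0.5, so the chosen pairs never conflict. For five bases with p = 0.9 for the pair (0,4) and p = 0.3 for the pair (1,3), the sketch yields the structure "(...)" and a distance of 0.1 + 0.3 = 0.4.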

Prototypes for these functions are declared in part_func.h.