Partition function and equilibrium properties

Overview

Compute the partition function and various equilibrium properties derived from it.

// global functions

double vrna_mean_bp_distance_pr (
    int length,
    FLT_OR_DBL* pr
    )

double vrna_mean_bp_distance (vrna_fold_compound_t* vc)

vrna_ep_t* vrna_stack_prob (
    vrna_fold_compound_t* vc,
    double cutoff
    )

float vrna_pf (
    vrna_fold_compound_t* vc,
    char* structure
    )

float vrna_pf_fold (
    const char* sequence,
    char* structure,
    vrna_ep_t** pl
    )

float vrna_pf_circfold (
    const char* sequence,
    char* structure,
    vrna_ep_t** pl
    )

float pf_fold_par (
    const char* sequence,
    char* structure,
    vrna_exp_param_t* parameters,
    int calculate_bppm,
    int is_constrained,
    int is_circular
    )

float pf_fold (
    const char* sequence,
    char* structure
    )

float pf_circ_fold (
    const char* sequence,
    char* structure
    )

void free_pf_arrays (void)
void update_pf_params (int length)

void update_pf_params_par (
    int length,
    vrna_exp_param_t* parameters
    )

FLT_OR_DBL* export_bppm (void)

int get_pf_arrays (
    short** S_p,
    short** S1_p,
    char** ptype_p,
    FLT_OR_DBL** qb_p,
    FLT_OR_DBL** qm_p,
    FLT_OR_DBL** q1k_p,
    FLT_OR_DBL** qln_p
    )

double mean_bp_distance (int length)

double mean_bp_distance_pr (
    int length,
    FLT_OR_DBL* pr
    )

vrna_ep_t* vrna_plist_from_probs (
    vrna_fold_compound_t* vc,
    double cut_off
    )

void assign_plist_from_pr (
    vrna_ep_t** pl,
    FLT_OR_DBL* probs,
    int length,
    double cutoff
    )

Detailed Documentation

Compute the partition function and various equilibrium properties derived from it.

Global Functions

double vrna_mean_bp_distance_pr (
    int length,
    FLT_OR_DBL* pr
    )
Get the mean base pair distance in the thermodynamic ensemble from a probability matrix.

\(\langle d \rangle = \sum_{a,b} p_a p_b d(S_a,S_b)\)

This can be computed from the pair probabilities \(p_{ij}\) as

\(\langle d \rangle = \sum_{ij} p_{ij}(1-p_{ij})\)

Parameters:

length The length of the sequence
pr The matrix containing the base pair probabilities

Returns:

The mean pair distance of the structure ensemble
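The equivalence of the two formulas can be checked on a small hand-made ensemble. The sketch below does not use the library; the type `toy_structure_t`, the three-structure ensemble and all function names are hypothetical and exist only for illustration. It assumes that \(d(S_a,S_b)\) counts the base pairs present in exactly one of the two structures. Under that assumption the sum over \(ij\) has to run over ordered index pairs, so restricting it to \(i<j\), as the code does, introduces a factor of 2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N 5 /* toy sequence length (assumption for illustration) */

/* Hypothetical stand-in for one structure of the ensemble. */
typedef struct {
  double p;               /* Boltzmann probability of the structure */
  int    x[N + 1][N + 1]; /* x[i][j] == 1 (i < j, 1-based) if (i,j) is paired */
} toy_structure_t;

/* Fill a three-structure toy ensemble whose probabilities sum to 1. */
static void toy_ensemble(toy_structure_t s[3]) {
  memset(s, 0, 3 * sizeof *s);
  s[0].p = 0.5;                                   /* open chain       */
  s[1].p = 0.3; s[1].x[1][5] = 1;                 /* one pair         */
  s[2].p = 0.2; s[2].x[1][5] = 1; s[2].x[2][4] = 1; /* two nested pairs */
}

/* Base pair distance: number of pairs present in exactly one structure. */
static int bp_distance(const toy_structure_t *a, const toy_structure_t *b) {
  int d = 0;
  for (int i = 1; i <= N; i++)
    for (int j = i + 1; j <= N; j++)
      d += (a->x[i][j] != b->x[i][j]);
  return d;
}

/* <d> from the definition: sum over all ordered pairs of structures. */
static double mean_dist_by_definition(const toy_structure_t *s, size_t m) {
  double d = 0.;
  for (size_t a = 0; a < m; a++)
    for (size_t b = 0; b < m; b++)
      d += s[a].p * s[b].p * bp_distance(&s[a], &s[b]);
  return d;
}

/* <d> from pair probabilities, summing over i < j and doubling. */
static double mean_dist_from_probs(const toy_structure_t *s, size_t m) {
  double d = 0.;
  for (int i = 1; i <= N; i++)
    for (int j = i + 1; j <= N; j++) {
      double pij = 0.;
      for (size_t a = 0; a < m; a++)
        pij += s[a].p * s[a].x[i][j];
      d += pij * (1. - pij);
    }
  return 2. * d;
}
```

Calling `toy_ensemble()` and passing the result to both functions should yield the same value, since both evaluate the same expectation.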

double vrna_mean_bp_distance (vrna_fold_compound_t* vc)
Get the mean base pair distance in the thermodynamic ensemble.

\(\langle d \rangle = \sum_{a,b} p_a p_b d(S_a,S_b)\)

This can be computed from the pair probabilities \(p_{ij}\) as

\(\langle d \rangle = \sum_{ij} p_{ij}(1-p_{ij})\)

SWIG Wrapper Notes This function is attached as method mean_bp_distance() to objects of type fold_compound

Parameters:

vc The fold compound data structure

Returns:

The mean pair distance of the structure ensemble

vrna_ep_t* vrna_stack_prob (
    vrna_fold_compound_t* vc,
    double cutoff
    )
Compute stacking probabilities.

For each possible base pair \((i,j)\) , compute the probability of a stack \((i,j)\) , \((i+1, j-1)\) .

Parameters:

vc The fold compound data structure with precomputed base pair probabilities
cutoff A cutoff value that limits the output to stacks with \(p > \textrm{cutoff}\).

Returns:

A list of stacks with enclosing base pair \((i,j)\) and probability \(p\)
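Conceptually, the probability of a stack is the total probability of all structures that contain both \((i,j)\) and \((i+1,j-1)\). The sketch below illustrates this by brute-force enumeration over a hand-made ensemble. It does not show how vrna_stack_prob() works internally; the library presumably derives the value from its precomputed matrices instead. The type `toy_structure_t` and the function `stack_probability()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N 6 /* toy sequence length (assumption for illustration) */

/* Hypothetical stand-in for one structure of the ensemble. */
typedef struct {
  double p;              /* Boltzmann probability of the structure          */
  int    partner[N + 1]; /* partner[i] == j if i pairs with j, 0 if unpaired */
} toy_structure_t;

/* Probability that the pairs (i,j) and (i+1,j-1) are formed together. */
static double stack_probability(const toy_structure_t *s, size_t m,
                                int i, int j) {
  double p = 0.;
  assert(1 <= i && i + 1 < j - 1 && j <= N);
  for (size_t a = 0; a < m; a++)
    if (s[a].partner[i] == j && s[a].partner[i + 1] == j - 1)
      p += s[a].p; /* structure contains the complete stack */
  return p;
}
```

A cutoff, as taken by vrna_stack_prob(), would then simply drop every stack whose summed probability does not exceed the threshold.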

float vrna_pf (
    vrna_fold_compound_t* vc,
    char* structure
    )
Compute the partition function \(Q\) for a given RNA sequence, or sequence alignment.

If structure is not a NULL pointer on input, it contains on return a string consisting of the characters ". , | { } ( )" denoting bases that are essentially unpaired, weakly paired, strongly paired without preference, weakly upstream (downstream) paired, or strongly up- (down-)stream paired, respectively. If the model detail compute_bpp is set to 0, base pairing probabilities will not be computed (saving CPU time); otherwise, after the calculation the fold compound holds the probability that bases i and j pair.

SWIG Wrapper Notes This function is attached as method pf() to objects of type fold_compound

Parameters:

vc The fold compound data structure
structure A pointer to the character array where position-wise pairing propensity will be stored (may be NULL)

Returns:

The Gibbs free energy of the ensemble ( \(G = -RT \cdot \log(Q)\) ) in kcal/mol

Note

This function is polymorphic. It accepts vrna_fold_compound_t of type VRNA_FC_TYPE_SINGLE and VRNA_FC_TYPE_COMPARATIVE.
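To make the pairing propensity string more concrete, the following sketch maps three per-base probabilities to one of the symbols listed above. The function name `propensity_symbol()` and the threshold of 0.667 are assumptions chosen for illustration; this page does not state the cutoffs the library uses.

```c
#include <assert.h>

/* Illustrative threshold; an assumption, not a value confirmed by this page. */
#define STRONG 0.667

/*
 * p_unpaired: probability that the base is unpaired
 * p_open:     probability that the base would be drawn as an opening bracket
 * p_close:    probability that the base would be drawn as a closing bracket
 */
static char propensity_symbol(double p_unpaired, double p_open, double p_close) {
  double p_paired = p_open + p_close;
  assert(p_unpaired >= 0. && p_open >= 0. && p_close >= 0.);
  if (p_unpaired > STRONG) return '.'; /* essentially unpaired          */
  if (p_open > STRONG)     return '('; /* strongly paired, one direction */
  if (p_close > STRONG)    return ')';
  if (p_paired > p_unpaired) {
    if (p_open / p_paired > STRONG)  return '{'; /* weak directional preference */
    if (p_close / p_paired > STRONG) return '}';
    return '|';                        /* paired without preference     */
  }
  return ',';                          /* weakly paired                 */
}
```

Applying such a function to every sequence position yields a string of the same length as the sequence, which is the shape of the output written to structure.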

float vrna_pf_fold (
    const char* sequence,
    char* structure,
    vrna_ep_t** pl
    )
Compute the partition function \(Q\) (and base pair probabilities) for an RNA sequence.

This simplified interface to vrna_pf() computes the partition function and, if required, base pair probabilities for an RNA sequence using default options. Memory required for dynamic programming (DP) matrices will be allocated and freed on the fly. Hence, after this function returns, the recursively filled matrices are no longer available for any post-processing.

Parameters:

sequence RNA sequence
structure A pointer to the character array where position-wise pairing propensity will be stored (may be NULL)
pl A pointer to a list of vrna_ep_t to store pairing probabilities (may be NULL)

Returns:

The Gibbs free energy of the ensemble ( \(G = -RT \cdot \log(Q)\) ) in kcal/mol

Note

If you want to use the filled DP matrices for any subsequent post-processing step, or you require conditions other than those specified by the default model details, use vrna_pf() and the data structure vrna_fold_compound_t instead.
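For orientation, the value returned by vrna_pf() and its simplified interfaces relates to the partition function and to the probability of an individual structure through the standard Boltzmann relations. They are stated here under the assumption that \(E(s)\) denotes the free energy of structure \(s\) in kcal/mol, \(R\) the gas constant and \(T\) the absolute temperature; \(\log\) is the natural logarithm.

\[
Q = \sum_{s} e^{-E(s)/RT}, \qquad
G = -RT \cdot \log(Q), \qquad
p(s) = \frac{e^{-E(s)/RT}}{Q} = e^{-(E(s) - G)/RT}
\]

The last identity shows why the ensemble free energy is a convenient return value: together with the energy of any single structure it gives that structure's equilibrium probability.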

float vrna_pf_circfold (
    const char* sequence,
    char* structure,
    vrna_ep_t** pl
    )
Compute the partition function \(Q\) (and base pair probabilities) for a circular RNA sequence.

This simplified interface to vrna_pf() computes the partition function and, if required, base pair probabilities for a circular RNA sequence using default options. Memory required for dynamic programming (DP) matrices will be allocated and freed on the fly. Hence, after this function returns, the recursively filled matrices are no longer available for any post-processing.

Folding of circular RNA sequences is handled as a post-processing step of the forward recursions. See [7] for further details.

Parameters:

sequence A circular RNA sequence
structure A pointer to the character array where position-wise pairing propensity will be stored (may be NULL)
pl A pointer to a list of vrna_ep_t to store pairing probabilities (may be NULL)

Returns:

The Gibbs free energy of the ensemble ( \(G = -RT \cdot \log(Q)\) ) in kcal/mol

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use vrna_pf() , and the data structure vrna_fold_compound_t instead.

float pf_fold_par (
    const char* sequence,
    char* structure,
    vrna_exp_param_t* parameters,
    int calculate_bppm,
    int is_constrained,
    int is_circular
    )
Compute the partition function \(Q\) for a given RNA sequence.

If structure is not a NULL pointer on input, it contains on return a string consisting of the characters ". , | { } ( )" denoting bases that are essentially unpaired, weakly paired, strongly paired without preference, weakly upstream (downstream) paired, or strongly up- (down-)stream paired, respectively. If is_constrained is not 0, the structure string is interpreted on input as a list of constraints for the folding. The character "x" marks bases that must be unpaired, matching brackets "( )" denote base pairs, and all other characters are ignored. Any pairs conflicting with the constraint will be forbidden. This is usually sufficient to ensure the constraints are honored. If the parameter calculate_bppm is set to 0, base pairing probabilities will not be computed (saving CPU time); otherwise, after the calculation pr will contain the probability that bases i and j pair.

Deprecated Use vrna_pf() instead

Parameters:

sequence The RNA sequence input
structure A pointer to a char array where base pair probability information can be stored in a pseudo-dot-bracket notation (may be NULL)
parameters Data structure containing the precalculated Boltzmann factors
calculate_bppm Switch to turn base pair probability calculations on/off (0==off)
is_constrained Switch to indicate that a structure constraint is passed via the structure argument (0==off)
is_circular Switch to (de-)activate postprocessing steps in case the RNA sequence is circular (0==off)

Returns:

The Gibbs free energy of the ensemble ( \(G = -RT \cdot \log(Q)\) ) in kcal/mol

Note

The global array pr is deprecated and the user who wants the calculated base pair probabilities for further computations is advised to use the function export_bppm()

Post-Condition

After successful run the hidden folding matrices are filled with the appropriate Boltzmann factors. Depending on whether the global variable do_backtrack was set the base pair probabilities are already computed and may be accessed for further usage via the export_bppm() function. A call of free_pf_arrays() will free all memory allocated by this function. Successive calls will first free previously allocated memory before starting the computation.
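The constraint notation described for pf_fold_par() can be read with a simple stack-based parser. The sketch below is only an illustration of that notation, not the library's parser; the function name `parse_constraint()`, the marker `FORCED_UNPAIRED` and the output layout are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FORCED_UNPAIRED (-1) /* hypothetical marker for 'x' positions */

/*
 * Parse a constraint string into a 1-based partner table:
 *   partner[i] == j               position i must pair with j
 *   partner[i] == FORCED_UNPAIRED position i must stay unpaired ('x')
 *   partner[i] == 0               no constraint (any other character)
 * partner must provide room for strlen(c) + 1 ints.
 * Returns 1 on success and 0 if the brackets are unbalanced.
 */
static int parse_constraint(const char *c, int *partner) {
  size_t n     = strlen(c);
  int    *stack = malloc((n + 1) * sizeof *stack);
  int    top = 0, ok = 1;

  assert(stack != NULL);
  for (size_t k = 0; k <= n; k++)
    partner[k] = 0;

  for (size_t k = 0; k < n && ok; k++) {
    int pos = (int)k + 1;
    switch (c[k]) {
      case 'x':
        partner[pos] = FORCED_UNPAIRED;
        break;
      case '(':
        stack[top++] = pos; /* remember the opening position */
        break;
      case ')':
        if (top == 0) {     /* closing bracket without partner */
          ok = 0;
          break;
        }
        partner[pos]          = stack[--top];
        partner[partner[pos]] = pos;
        break;
      default:              /* all other characters are ignored */
        break;
    }
  }
  if (top != 0)             /* unmatched opening bracket */
    ok = 0;

  free(stack);
  return ok;
}
```

For the string "((x.))" this yields the pairs (1,6) and (2,5), a forced-unpaired position 3 and an unconstrained position 4.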

float pf_fold (
    const char* sequence,
    char* structure
    )
Compute the partition function \(Q\) of an RNA sequence.

If structure is not a NULL pointer on input, it contains on return a string consisting of the characters ". , | { } ( )" denoting bases that are essentially unpaired, weakly paired, strongly paired without preference, weakly upstream (downstream) paired, or strongly up- (down-)stream paired, respectively. If fold_constrained is not 0, the structure string is interpreted on input as a list of constraints for the folding. The character "x" marks bases that must be unpaired, matching brackets "( )" denote base pairs, and all other characters are ignored. Any pairs conflicting with the constraint will be forbidden. This is usually sufficient to ensure the constraints are honored. If do_backtrack has been set to 0, base pairing probabilities will not be computed (saving CPU time); otherwise pr will contain the probability that bases i and j pair.

Parameters:

sequence The RNA sequence input
structure A pointer to a char array where base pair probability information can be stored in a pseudo-dot-bracket notation (may be NULL)

Returns:

The Gibbs free energy of the ensemble ( \(G = -RT \cdot \log(Q)\) ) in kcal/mol

Note

The global array pr is deprecated. Users who need the calculated base pair probabilities for further computations are advised to use the function export_bppm().

OpenMP: This function is not entirely threadsafe. While the recursions work on their own copies of data, the model details for the recursions are determined from the global settings just before entering the recursions. Consider using pf_fold_par() for a really threadsafe implementation.

Pre-Condition

This function takes its model details from the global variables provided in RNAlib

Post-Condition

After successful run the hidden folding matrices are filled with the appropriate Boltzmann factors. Depending on whether the global variable do_backtrack was set the base pair probabilities are already computed and may be accessed for further usage via the export_bppm() function. A call of free_pf_arrays() will free all memory allocated by this function. Successive calls will first free previously allocated memory before starting the computation.

float pf_circ_fold (
    const char* sequence,
    char* structure
    )
Compute the partition function of a circular RNA sequence.

Deprecated Use vrna_pf() instead!

Parameters:

sequence The RNA sequence input
structure A pointer to a char array where base pair probability information can be stored in a pseudo-dot-bracket notation (may be NULL)

Returns:

The Gibbs free energy of the ensemble ( \(G = -RT \cdot \log(Q)\) ) in kcal/mol

Note

The global array pr is deprecated. Users who need the calculated base pair probabilities for further computations are advised to use the function export_bppm().

OpenMP: This function is not entirely threadsafe. While the recursions work on their own copies of data, the model details for the recursions are determined from the global settings just before entering the recursions. Consider using pf_fold_par() for a really threadsafe implementation.

Pre-Condition

This function takes its model details from the global variables provided in RNAlib

Post-Condition

After successful run the hidden folding matrices are filled with the appropriate Boltzmann factors. Depending on whether the global variable do_backtrack was set the base pair probabilities are already computed and may be accessed for further usage via the export_bppm() function. A call of free_pf_arrays() will free all memory allocated by this function. Successive calls will first free previously allocated memory before starting the computation.

See also:

vrna_pf()

void free_pf_arrays (void)
Free arrays for the partition function recursions.

Call this function if you want to free all allocated memory associated with the partition function forward recursion.

Deprecated See vrna_fold_compound_t and its related functions for how to free memory occupied by the dynamic programming matrices

Note

Successive calls of pf_fold() and pf_circ_fold() already check whether they should free any memory from a previous run.

OpenMP notice:

This function should be called before leaving a thread in order to avoid leaking memory

Post-Condition

All memory allocated by pf_fold_par(), pf_fold() or pf_circ_fold() will be freed

void update_pf_params (int length)
Recalculate energy parameters.

Call this function to recalculate the pair matrix and energy parameters after a change in folding parameters like temperature

Deprecated Use vrna_exp_params_subst() instead

void update_pf_params_par (
    int length,
    vrna_exp_param_t* parameters
    )
Recalculate energy parameters.

Deprecated Use vrna_exp_params_subst() instead

FLT_OR_DBL* export_bppm (void)
Get a pointer to the base pair probability array.

Accessing the base pair probability for a pair (i,j) is achieved by:

FLT_OR_DBL *pr    = export_bppm();
FLT_OR_DBL pr_ij  = pr[iindx[i] - j];

Returns:

A pointer to the base pair probability array

Pre-Condition

Call pf_fold_par() , pf_fold() or pf_circ_fold() first to fill the base pair probability array
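The access pattern `pr[iindx[i]-j]` implies that the upper triangle of the probability matrix is packed into a linear array, with one offset per row. The sketch below builds such a row-wise index and checks that every pair \((i,j)\) with \(i \le j\) maps to its own slot. The function names and the exact offset formula are assumptions made for illustration; the layout used by the library may differ in detail.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Build a 1-based row index so that idx[i] - j addresses the pair (i,j),
 * 1 <= i <= j <= n, inside a packed array with slots 1 .. n(n+1)/2.
 * The caller owns the returned array and must free() it.
 */
static int *toy_row_index(int n) {
  int *idx;
  assert(n > 0);
  idx = malloc((size_t)(n + 1) * sizeof *idx);
  assert(idx != NULL);
  idx[0] = 0; /* unused, positions are 1-based */
  for (int i = 1; i <= n; i++)
    idx[i] = ((n + 1 - i) * (n - i)) / 2 + n + 1;
  return idx;
}

/* Return 1 if every slot 1 .. n(n+1)/2 is hit by exactly one pair (i,j). */
static int toy_index_is_bijective(int n) {
  int size  = n * (n + 1) / 2;
  int *idx  = toy_row_index(n);
  int *hits = calloc((size_t)size + 1, sizeof *hits);
  int ok    = 1;

  assert(hits != NULL);
  for (int i = 1; i <= n; i++)
    for (int j = i; j <= n; j++) {
      int k = idx[i] - j;
      if (k < 1 || k > size || hits[k]++)
        ok = 0; /* out of range or slot used twice */
    }
  free(hits);
  free(idx);
  return ok;
}
```

With such a layout a full probability matrix for a sequence of length n needs only n(n+1)/2 entries plus the unused slot 0.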

int get_pf_arrays (
    short** S_p,
    short** S1_p,
    char** ptype_p,
    FLT_OR_DBL** qb_p,
    FLT_OR_DBL** qm_p,
    FLT_OR_DBL** q1k_p,
    FLT_OR_DBL** qln_p
    )
Get the pointers to (almost) all relevant computation arrays used in partition function computation.

Parameters:

S_p A pointer to the ‘S’ array (integer representation of nucleotides)
S1_p A pointer to the ‘S1’ array (2nd integer representation of nucleotides)
ptype_p A pointer to the pair type matrix
qb_p A pointer to the \(Q^B\) matrix
qm_p A pointer to the \(Q^M\) matrix
q1k_p A pointer to the 5’ slice of the Q matrix ( \(q1k(k) = Q(1, k)\) )
qln_p A pointer to the 3’ slice of the Q matrix ( \(qln(l) = Q(l, n)\) )

Returns:

Non-zero if everything went fine, 0 otherwise

Pre-Condition

In order to assign meaningful pointers, you have to call pf_fold_par() or pf_fold() first!

double mean_bp_distance (int length)
Get the mean base pair distance of the last partition function computation.

Deprecated Use vrna_mean_bp_distance() or vrna_mean_bp_distance_pr() instead!

Parameters:

length The length of the sequence

Returns:

mean base pair distance in thermodynamic ensemble

double mean_bp_distance_pr (
    int length,
    FLT_OR_DBL* pr
    )
Get the mean base pair distance in the thermodynamic ensemble.

This is a threadsafe implementation of mean_bp_dist()!

\(\langle d \rangle = \sum_{a,b} p_a p_b d(S_a,S_b)\)

This can be computed from the pair probabilities \(p_{ij}\) as

\(\langle d \rangle = \sum_{ij} p_{ij}(1-p_{ij})\)

Deprecated Use vrna_mean_bp_distance() or vrna_mean_bp_distance_pr() instead!

Parameters:

length The length of the sequence
pr The matrix containing the base pair probabilities

Returns:

The mean pair distance of the structure ensemble

vrna_ep_t* vrna_plist_from_probs (
    vrna_fold_compound_t* vc,
    double cut_off
    )
Create a vrna_ep_t list from the base pair probability matrix.

The probability matrix provided via the vrna_fold_compound_t is parsed, and every pair probability above the given threshold is used to create an entry in the plist.

The end of the plist is marked by sequence positions i as well as j equal to 0. This condition should be used to stop looping over its entries.

Parameters:

vc The fold compound
cut_off The cutoff value

Returns:

A pointer to the plist that is to be created
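The termination convention described above can be sketched as follows. The code is self-contained and does not call the library: `toy_ep_t` is a reduced, hypothetical stand-in for vrna_ep_t, and the flat \((n+1) \times (n+1)\) matrix layout is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for vrna_ep_t, reduced to the fields used here. */
typedef struct {
  int   i;
  int   j;
  float p;
} toy_ep_t;

/*
 * Collect all pairs (i,j), i < j, whose probability exceeds cutoff.
 * probs is a flat, 1-based (n+1) x (n+1) matrix (assumed layout).
 * The list ends with an entry where i == 0 and j == 0; the caller
 * must free() the returned list.
 */
static toy_ep_t *toy_plist_from_probs(const double *probs, int n,
                                      double cutoff) {
  size_t   count = 0;
  toy_ep_t *pl;

  assert(n >= 1);
  /* worst case: every pair qualifies, plus one terminating entry */
  pl = malloc(((size_t)n * (size_t)(n - 1) / 2 + 1) * sizeof *pl);
  assert(pl != NULL);

  for (int i = 1; i < n; i++)
    for (int j = i + 1; j <= n; j++) {
      double p = probs[i * (n + 1) + j];
      if (p > cutoff) {
        pl[count].i = i;
        pl[count].j = j;
        pl[count].p = (float)p;
        count++;
      }
    }

  pl[count].i = pl[count].j = 0; /* end-of-list marker */
  pl[count].p = 0.f;
  return pl;
}

/* Count entries by looping until the end-of-list marker is reached. */
static size_t toy_plist_length(const toy_ep_t *pl) {
  size_t len = 0;
  while (pl[len].i != 0 || pl[len].j != 0)
    len++;
  return len;
}
```

The loop in `toy_plist_length()` is the pattern to reuse when iterating over a list returned by the library: stop at the first entry whose positions are both 0.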

void assign_plist_from_pr (
    vrna_ep_t** pl,
    FLT_OR_DBL* probs,
    int length,
    double cutoff
    )
Create a vrna_ep_t list from a probability matrix.

The given probability matrix is parsed, and every pair probability above the given threshold is used to create an entry in the plist.

The end of the plist is marked by sequence positions i as well as j equal to 0. This condition should be used to stop looping over its entries.

Deprecated Use vrna_plist_from_probs() instead!

Parameters:

pl A pointer to the vrna_ep_t that is to be created
probs The probability matrix used for creating the plist
length The length of the RNA sequence
cutoff The cutoff value

Note

This function is threadsafe