Predicting various Thermodynamic Properties

Compute various thermodynamic properties using the partition function.

Many thermodynamic properties can be derived from the partition function

\[Z = \sum_{s \in \omega} e^{\frac{-E(s)}{kT}}.\]

In particular, for nucleic acids in equilibrium the probabilty \(p(F)\) of a particular structural feature \(F\) follows Boltzmanns law, i.e.:

\[p(F) \propto \sum_{s \mid F \in s} e^{\frac{-E(s)}{kT}}.\]

The actual probabilities can then be obtained from the ratio of those structures containing \(F\) and all structures, i.e.

\[p(F) = \frac{1}{Z} \sum_{s \mid F \in s} e^{\frac{-E(s)}{kT}}.\]

Consequently, a particular secondary structure \(s\) has equilibrium probability

\[p(s) = \frac{1}{Z} e^{\frac{-E(s)}{kT}}\]

which can be easily computed once \(Z\) and \(E(s)\) are known.

Efficient dynamic programming algorithms exist to compute the equilibrium probabilities

\[p_{ij} = \frac{1}{Z} \sum_{s \mid (i,j) \in s} e^{\frac{-E(s)}{kT}}\]

of base pairs \((i,j)\) without the need for exhaustive enumeration of \(s\).

This interface provides the functions for all thermodynamic property computations implemented in RNAlib.

Thermodynamic Properties API

Base pair probabilities and derived computations

int vrna_pairing_probs(vrna_fold_compound_t *fc, char *structure)
#include <ViennaRNA/equilibrium_probs.h>
double vrna_mean_bp_distance_pr(int length, FLT_OR_DBL *pr)
#include <ViennaRNA/equilibrium_probs.h>

Get the mean base pair distance in the thermodynamic ensemble from a probability matrix.

\[ <d> = \sum_{a,b} p_a p_b d(S_a,S_b) \]

this can be computed from the pair probs \( p_{ij} \) as

\[ <d> = \sum_{ij} p_{ij}(1-p_{ij}) \]

Parameters:
  • length – The length of the sequence

  • pr – The matrix containing the base pair probabilities

Returns:

The mean pair distance of the structure ensemble

double vrna_mean_bp_distance(vrna_fold_compound_t *fc)
#include <ViennaRNA/equilibrium_probs.h>

Get the mean base pair distance in the thermodynamic ensemble.

\[ <d> = \sum_{a,b} p_a p_b d(S_a,S_b) \]

this can be computed from the pair probs \(p_{ij}\) as

\[ <d> = \sum_{ij} p_{ij}(1-p_{ij}) \]

SWIG Wrapper Notes:

This function is attached as method mean_bp_distance() to objects of type fold_compound. See, e.g. RNA.fold_compound.mean_bp_distance() in the Python API.

Parameters:
  • fc – The fold compound data structure

Returns:

The mean pair distance of the structure ensemble

double vrna_ensemble_defect_pt(vrna_fold_compound_t *fc, const short *pt)
#include <ViennaRNA/equilibrium_probs.h>

Compute the Ensemble Defect for a given target structure provided as a vrna_ptable.

Given a target structure \(s\), compute the average dissimilarity of a randomly drawn structure from the ensemble, i.e.:

\[ ED(s) = 1 - \frac{1}{n} \sum_{ij, (i,j) \in s} p_{ij} - \frac{1}{n} \sum_{i}(1 - s_i)q_i \]

with sequence length \(n\), the probability \(p_{ij}\) of a base pair \((i,j)\), the probability \(q_i = 1 - \sum_j p_{ij}\) of nucleotide \(i\) being unpaired, and the indicator variable \(s_i = 1\) if \(\exists (i,j) \in s\), and \(s_i = 0\) otherwise.

SWIG Wrapper Notes:

This function is attached as overloaded method ensemble_defect() to objects of type fold_compound. See, e.g. RNA.fold_compound.ensemble_defect() in the Python API.

See also

vrna_pf(), vrna_pairing_probs(), vrna_ensemble_defect()

Parameters:
  • fc – A fold_compound with pre-computed base pair probabilities

  • pt – A pair table representing a target structure

Pre:

The vrna_fold_compound_t input parameter fc must contain a valid base pair probability matrix. This means that partition function and base pair probabilities must have been computed using fc before execution of this function!

Returns:

The ensemble defect with respect to the target structure, or -1. upon failure, e.g. pre-conditions are not met

double vrna_ensemble_defect(vrna_fold_compound_t *fc, const char *structure)
#include <ViennaRNA/equilibrium_probs.h>

Compute the Ensemble Defect for a given target structure.

This is a wrapper around vrna_ensemble_defect_pt(). Given a target structure \(s\), compute the average dissimilarity of a randomly drawn structure from the ensemble, i.e.:

\[ ED(s) = 1 - \frac{1}{n} \sum_{ij, (i,j) \in s} p_{ij} - \frac{1}{n} \sum_{i}(1 - s_i)q_i \]

with sequence length \(n\), the probability \(p_{ij}\) of a base pair \((i,j)\), the probability \(q_i = 1 - \sum_j p_{ij}\) of nucleotide \(i\) being unpaired, and the indicator variable \(s_i = 1\) if \(\exists (i,j) \in s\), and \(s_i = 0\) otherwise.

SWIG Wrapper Notes:

This function is attached as method ensemble_defect() to objects of type fold_compound. Note that the SWIG wrapper takes a structure in dot-bracket notation and converts it into a pair table using vrna_ptable_from_string(). The resulting pair table is then internally passed to vrna_ensemble_defect_pt(). To control which kind of matching brackets will be used during conversion, the optional argument options can be used. See also the description of vrna_ptable_from_string() for available options. (default: VRNA_BRACKETS_RND). See, e.g. RNA.fold_compound.ensemble_defect() in the Python API.

See also

vrna_pf(), vrna_pairing_probs(), vrna_ensemble_defect_pt()

Parameters:
  • fc – A fold_compound with pre-computed base pair probabilities

  • structure – A target structure in dot-bracket notation

Pre:

The vrna_fold_compound_t input parameter fc must contain a valid base pair probability matrix. This means that partition function and base pair probabilities must have been computed using fc before execution of this function!

Returns:

The ensemble defect with respect to the target structure, or -1. upon failure, e.g. pre-conditions are not met

double *vrna_positional_entropy(vrna_fold_compound_t *fc)
#include <ViennaRNA/equilibrium_probs.h>

Compute a vector of positional entropies.

This function computes the positional entropies from base pair probabilities as

\[ S(i) = - \sum_j p_{ij} \log(p_{ij}) - q_i \log(q_i) \]

with unpaired probabilities \( q_i = 1 - \sum_j p_{ij} \).

Low entropy regions have little structural flexibility and the reliability of the predicted structure is high. High entropy implies many structural alternatives. While these alternatives may be functionally important, they make structure prediction more difficult and thus less reliable.

SWIG Wrapper Notes:

This function is attached as method positional_entropy() to objects of type fold_compound. See, e.g. RNA.fold_compound.positional_entropy() in the Python API.

Parameters:
  • fc – A fold_compound with pre-computed base pair probabilities

Pre:

This function requires pre-computed base pair probabilities! Thus, vrna_pf() must be called beforehand.

Returns:

A 1-based vector of positional entropies \( S(i) \). (position 0 contains the sequence length)

vrna_ep_t *vrna_stack_prob(vrna_fold_compound_t *fc, double cutoff)
#include <ViennaRNA/equilibrium_probs.h>

Compute stacking probabilities.

For each possible base pair \((i,j)\), compute the probability of a stack \((i,j)\), \((i+1, j-1)\).

Parameters:
  • fc – The fold compound data structure with precomputed base pair probabilities

  • cutoff – A cutoff value that limits the output to stacks with \( p > \textrm{cutoff} \).

Returns:

A list of stacks with enclosing base pair \((i,j)\) and probabiltiy \( p \)

Multimer probabilities computations

void vrna_pf_dimer_probs(double FAB, double FA, double FB, vrna_ep_t *prAB, const vrna_ep_t *prA, const vrna_ep_t *prB, int Alength, const vrna_exp_param_t *exp_params)
#include <ViennaRNA/equilibrium_probs.h>

Compute Boltzmann probabilities of dimerization without homodimers.

Given the pair probabilities and free energies (in the null model) for a dimer AB and the two constituent monomers A and B, compute the conditional pair probabilities given that a dimer AB actually forms. Null model pair probabilities are given as a list as produced by vrna_plist_from_probs(), the dimer probabilities ‘prAB’ are modified in place.

Parameters:
  • FAB – free energy of dimer AB

  • FA – free energy of monomer A

  • FB – free energy of monomer B

  • prAB – pair probabilities for dimer

  • prA – pair probabilities monomer

  • prB – pair probabilities monomer

  • Alength – Length of molecule A

  • exp_params – The precomputed Boltzmann factors

Structure probability computations

double vrna_pr_structure(vrna_fold_compound_t *fc, const char *structure)
#include <ViennaRNA/equilibrium_probs.h>

Compute the equilibrium probability of a particular secondary structure.

The probability \(p(s)\) of a particular secondary structure \(s\) can be computed as

\[ p(s) = \frac{exp(-\beta E(s)}{Z} \]

from the structures free energy \(E(s)\) and the partition function

\[ Z = \sum_s exp(-\beta E(s)),\quad\mathrm{with}\quad\beta = \frac{1}{RT} \]

where \(R\) is the gas constant and \(T\) the thermodynamic temperature.

SWIG Wrapper Notes:

This function is attached as method pr_structure() to objects of type fold_compound. See, e.g. RNA.fold_compound.pr_structure() in the Python API.

Parameters:
  • fc – The fold compound data structure with precomputed partition function

  • structure – The secondary structure to compute the probability for in dot-bracket notation

Pre:

The fold compound fc must have went through a call to vrna_pf() to fill the dynamic programming matrices with the corresponding partition function.

Returns:

The probability of the input structure (range \([0:1]\))

double vrna_pr_energy(vrna_fold_compound_t *fc, double e)
#include <ViennaRNA/equilibrium_probs.h>

SWIG Wrapper Notes:

This function is attached as method pr_energy() to objects of type fold_compound. See, e.g. RNA.fold_compound.pr_energy() in the Python API.

Basic heat capacity function interface

vrna_heat_capacity_t *vrna_heat_capacity(vrna_fold_compound_t *fc, float T_min, float T_max, float T_increment, unsigned int mpoints)
#include <ViennaRNA/heat_capacity.h>

Compute the specific heat for an RNA.

This function computes an RNAs specific heat in a given temperature range from the partition function by numeric differentiation. The result is returned as a list of pairs of temperature in C and specific heat in Kcal/(Mol*K).

Users can specify the temperature range for the computation from T_min to T_max, as well as the increment step size T_increment. The latter also determines how many times the partition function is computed. Finally, the parameter mpoints determines how smooth the curve should be. The algorithm itself fits a parabola to \( 2 \cdot mpoints + 1 \) data points to calculate 2nd derivatives. Increasing this parameter produces a smoother curve.

SWIG Wrapper Notes:

This function is attached as overloaded method heat_capacity() to objects of type fold_compound. If the optional function arguments T_min, T_max, T_increment, and mpoints are omitted, they default to 0.0, 100.0, 1.0 and 2, respectively. See, e.g. RNA.fold_compound.heat_capacity() in the Python API.

Parameters:
  • fc – The vrna_fold_compound_t with the RNA sequence to analyze

  • T_min – Lowest temperature in C

  • T_max – Highest temperature in C

  • T_increment – Stepsize for temperature incrementation in C (a reasonable choice might be 1C)

  • mpoints – The number of interpolation points to calculate 2nd derivative (a reasonable choice might be 2, min: 1, max: 100)

Returns:

A list of pairs of temperatures and corresponding heat capacity or NULL upon any failure. The last entry of the list is indicated by a temperature field set to a value smaller than T_min

int vrna_heat_capacity_cb(vrna_fold_compound_t *fc, float T_min, float T_max, float T_increment, unsigned int mpoints, vrna_heat_capacity_f cb, void *data)
#include <ViennaRNA/heat_capacity.h>

Compute the specific heat for an RNA (callback variant)

Similar to vrna_heat_capacity(), this function computes an RNAs specific heat in a given temperature range from the partition function by numeric differentiation. Instead of returning a list of temperature/specific heat pairs, however, this function returns the individual results through a callback mechanism. The provided function will be called for each result and passed the corresponding temperature and specific heat values along with the arbitrary data as provided through the data pointer argument.

Users can specify the temperature range for the computation from T_min to T_max, as well as the increment step size T_increment. The latter also determines how many times the partition function is computed. Finally, the parameter mpoints determines how smooth the curve should be. The algorithm itself fits a parabola to \( 2 \cdot mpoints + 1 \) data points to calculate 2nd derivatives. Increasing this parameter produces a smoother curve.

SWIG Wrapper Notes:

This function is attached as method heat_capacity_cb() to objects of type fold_compound. See, e.g. RNA.fold_compound.heat_capacity_cb() in the Python API.

Parameters:
  • fc – The vrna_fold_compound_t with the RNA sequence to analyze

  • T_min – Lowest temperature in C

  • T_max – Highest temperature in C

  • T_increment – Stepsize for temperature incrementation in C (a reasonable choice might be 1C)

  • mpoints – The number of interpolation points to calculate 2nd derivative (a reasonable choice might be 2, min: 1, max: 100)

  • cb – The user-defined callback function that receives the individual results

  • data – An arbitrary data structure that will be passed to the callback in conjunction with the results

Returns:

Returns 0 upon failure, and non-zero otherwise

Simplified heat capacity computation

vrna_heat_capacity_t *vrna_heat_capacity_simple(const char *sequence, float T_min, float T_max, float T_increment, unsigned int mpoints)
#include <ViennaRNA/heat_capacity.h>

Compute the specific heat for an RNA (simplified variant)

Similar to vrna_heat_capacity(), this function computes an RNAs specific heat in a given temperature range from the partition function by numeric differentiation. This simplified version, however, only requires the RNA sequence as input instead of a vrna_fold_compound_t data structure. The result is returned as a list of pairs of temperature in C and specific heat in Kcal/(Mol*K).

Users can specify the temperature range for the computation from T_min to T_max, as well as the increment step size T_increment. The latter also determines how many times the partition function is computed. Finally, the parameter mpoints determines how smooth the curve should be. The algorithm itself fits a parabola to \( 2 \cdot mpoints + 1 \) data points to calculate 2nd derivatives. Increasing this parameter produces a smoother curve.

SWIG Wrapper Notes:

This function is available as overloaded function heat_capacity(). If the optional function arguments T_min, T_max, T_increment, and mpoints are omitted, they default to 0.0, 100.0, 1.0 and 2, respectively. See, e.g. RNA.head_capacity() in the Python API.

Parameters:
  • sequence – The RNA sequence input (must be uppercase)

  • T_min – Lowest temperature in C

  • T_max – Highest temperature in C

  • T_increment – Stepsize for temperature incrementation in C (a reasonable choice might be 1C)

  • mpoints – The number of interpolation points to calculate 2nd derivative (a reasonable choice might be 2, min: 1, max: 100)

Returns:

A list of pairs of temperatures and corresponding heat capacity or NULL upon any failure. The last entry of the list is indicated by a temperature field set to a value smaller than T_min

Typedefs

typedef void (*vrna_heat_capacity_f)(float temp, float heat_capacity, void *data)
#include <ViennaRNA/heat_capacity.h>

The callback for heat capacity predictions.

Notes on Callback Functions:

This function will be called for each evaluated temperature in the heat capacity prediction.

Param temp:

The current temperature this results corresponds to in C

Param heat_capacity:

The heat capacity in Kcal/(Mol * K)

Param data:

Some arbitrary data pointer passed through by the function executing the callback

void() vrna_heat_capacity_callback (float temp, float heat_capacity, void *data)
#include <ViennaRNA/heat_capacity.h>
typedef struct vrna_heat_capacity_s vrna_heat_capacity_t
#include <ViennaRNA/heat_capacity.h>

A single result from heat capacity computations.

This is a convenience typedef for vrna_heat_capacity_s, i.e. results as obtained from vrna_heat_capacity()

struct vrna_heat_capacity_s
#include <ViennaRNA/heat_capacity.h>

A single result from heat capacity computations.

Public Members

float temperature

The temperature in C.

float heat_capacity

The specific heat at this temperature in Kcal/(Mol * K)