RNAlib-2.2.0-RC3
Basic Data Structures for Structure Prediction and Evaluation

This module provides interfaces to the most basic data structures used by the structure prediction and energy evaluation functions of RNAlib.

Files

file  model.h
 The model details data structure and its corresponding modifiers.
 

Data Structures

struct  vrna_mx_mfe_t
 Minimum Free Energy (MFE) Dynamic Programming (DP) matrices data structure required within the vrna_fold_compound.

struct  vrna_mx_pf_t
 Partition function (PF) Dynamic Programming (DP) matrices data structure required within the vrna_fold_compound.

struct  vrna_fold_compound
 The most basic data structure required by many functions throughout the RNAlib.

struct  vrna_md_t
 The data structure that contains the complete model details used throughout the calculations.
 

Macros

#define VRNA_OPTION_MFE   1
 Option flag to specify requirement of Minimum Free Energy (MFE) DP matrices and corresponding set of energy parameters.

#define VRNA_OPTION_PF   2
 Option flag to specify requirement of Partition Function (PF) DP matrices and corresponding set of Boltzmann factors.

#define VRNA_OPTION_EVAL_ONLY   8
 Option flag to specify that neither MFE nor PF DP matrices are required.

#define VRNA_MODEL_DEFAULT_TEMPERATURE   37.0
 Default temperature for structure prediction and free energy evaluation in °C.

#define VRNA_MODEL_DEFAULT_PF_SCALE   -1
 Default scaling factor for partition function computations.

#define VRNA_MODEL_DEFAULT_BETA_SCALE   1.
 Default scaling factor for absolute thermodynamic temperature in Boltzmann factors.

#define VRNA_MODEL_DEFAULT_DANGLES   2
 Default dangling end model.

#define VRNA_MODEL_DEFAULT_SPECIAL_HP   1
 Default model behavior for lookup of special tri-, tetra-, and hexa-loops.

#define VRNA_MODEL_DEFAULT_NO_LP   0
 Default model behavior for so-called 'lonely pairs'.

#define VRNA_MODEL_DEFAULT_NO_GU   0
 Default model behavior for G-U base pairs.

#define VRNA_MODEL_DEFAULT_NO_GU_CLOSURE   0
 Default model behavior for G-U base pairs closing a loop.

#define VRNA_MODEL_DEFAULT_CIRC   0
 Default model behavior to treat a molecule as a circular RNA (DNA).

#define VRNA_MODEL_DEFAULT_GQUAD   0
 Default model behavior regarding the treatment of G-Quadruplexes.

#define VRNA_MODEL_DEFAULT_UNIQ_ML   0
 Default behavior of the model regarding unique multibranch loop decomposition.

#define VRNA_MODEL_DEFAULT_ENERGY_SET   0
 Default model behavior on which energy set to use.

#define VRNA_MODEL_DEFAULT_BACKTRACK   1
 Default model behavior with regards to backtracking of structures.

#define VRNA_MODEL_DEFAULT_BACKTRACK_TYPE   'F'
 Default model behavior on what type of backtracking to perform.

#define VRNA_MODEL_DEFAULT_COMPUTE_BPP   1
 Default model behavior with regards to computing base pair probabilities.

#define VRNA_MODEL_DEFAULT_MAX_BP_SPAN   -1
 Default model behavior for the allowed maximum base pair span.

#define VRNA_MODEL_DEFAULT_LOG_ML   0
 Default model behavior on how to evaluate the energy contribution of multibranch loops.

#define VRNA_MODEL_DEFAULT_ALI_OLD_EN   0
 Default model behavior for consensus structure energy evaluation.

#define VRNA_MODEL_DEFAULT_ALI_RIBO   0
 Default model behavior for consensus structure covariance contribution assessment.

#define VRNA_MODEL_DEFAULT_ALI_CV_FACT   1.
 Default model behavior for weighting the covariance score in consensus structure prediction.

#define VRNA_MODEL_DEFAULT_ALI_NC_FACT   1.
 Default model behavior for weighting the nucleotide conservation in consensus structure prediction.
 
#define MAXALPHA   20
 Maximal length of alphabet.
 

Enumerations

enum  vrna_mx_t { VRNA_MX_DEFAULT, VRNA_MX_LFOLD, VRNA_MX_2DFOLD }
 An enumerator that is used to specify the type of a polymorphic Dynamic Programming (DP) matrix data structure.

enum  vrna_vc_t { VRNA_VC_TYPE_SINGLE, VRNA_VC_TYPE_ALIGNMENT }
 An enumerator that is used to specify the type of a vrna_fold_compound.
 

Functions

vrna_fold_compound * vrna_get_fold_compound (const char *sequence, vrna_md_t *md_p, unsigned int options)
 Retrieve a vrna_fold_compound data structure for single sequences and hybridizing sequences.

vrna_fold_compound * vrna_get_fold_compound_ali (const char **sequences, vrna_md_t *md_p, unsigned int options)
 Retrieve a vrna_fold_compound data structure for sequence alignments.

void vrna_params_update (vrna_fold_compound *vc, vrna_param_t *par)
 Update/Reset energy parameters data structure within a vrna_fold_compound.

void vrna_free_fold_compound (vrna_fold_compound *vc)
 Free memory occupied by a vrna_fold_compound.

void vrna_free_mfe_matrices (vrna_fold_compound *vc)
 Free memory occupied by the Minimum Free Energy (MFE) Dynamic Programming (DP) matrices.

void vrna_free_pf_matrices (vrna_fold_compound *vc)
 Free memory occupied by the Partition Function (PF) Dynamic Programming (DP) matrices.

void vrna_md_set_default (vrna_md_t *md)
 Set default model details.

void vrna_md_update (vrna_md_t *md)
 Update the model details data structure.

void vrna_md_set_globals (vrna_md_t *md)
 Set default model details from the deprecated global variables.
 

Variables

double temperature
 Rescale energy parameters to a temperature in °C.

double pf_scale
 A scaling factor used by pf_fold() to avoid overflows.

int dangles
 Switch the energy model for dangling end contributions (0, 1, 2, 3).

int tetra_loop
 Include special stabilizing energies for some tri-, tetra- and hexa-loops.

int noLonelyPairs
 Global switch to avoid/allow helices of length 1.

int noGU
 Global switch to forbid/allow GU base pairs altogether.

int no_closingGU
 GU allowed only inside stacks if set to 1.

int circ
 Backward compatibility variable; this does not affect anything.

int gquad
 Allow G-quadruplex formation.

int canonicalBPonly

int uniq_ML
 Do ML decomposition uniquely (for subopt).

int energy_set
 0 = BP; 1 = any with GC; 2 = any with AU parameters.

int do_backtrack
 Do backtracking, i.e. compute secondary structures or base pair probabilities.

char backtrack_type
 A backtrack array marker for inverse_fold().

char * nonstandards
 Contains allowed non-standard base pairs.

int max_bp_span
 Maximum allowed base pair span.

int oldAliEn
 Use old alifold energies (with gaps).

int ribo
 Use ribosum matrices.

int logML
 If nonzero, use logarithmic ML energy in energy_of_struct.
 

Detailed Description

This module provides interfaces to the most basic data structures used by the structure prediction and energy evaluation functions of RNAlib.

Throughout the RNAlib, a data structure, the vrna_fold_compound, is used to group information and data that is required for structure prediction and energy evaluation. Here, you'll find interface functions to create, modify, and delete vrna_fold_compound data structures.

Macro Definition Documentation

#define VRNA_OPTION_MFE   1

Option flag to specify requirement of Minimum Free Energy (MFE) DP matrices and corresponding set of energy parameters.

See also
vrna_get_fold_compound(), vrna_get_fold_compound_ali(), VRNA_OPTION_EVAL_ONLY
#define VRNA_OPTION_PF   2

Option flag to specify requirement of Partition Function (PF) DP matrices and corresponding set of Boltzmann factors.

See also
vrna_get_fold_compound(), vrna_get_fold_compound_ali(), VRNA_OPTION_EVAL_ONLY
#define VRNA_OPTION_EVAL_ONLY   8

Option flag to specify that neither MFE nor PF DP matrices are required.

Use this flag in conjunction with VRNA_OPTION_MFE and VRNA_OPTION_PF to save memory for a vrna_fold_compound obtained from vrna_get_fold_compound() or vrna_get_fold_compound_ali() in cases where only energy evaluation but no structure prediction is required.

See also
vrna_get_fold_compound(), vrna_get_fold_compound_ali(), vrna_eval_structure()
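
The three option flags are distinct bits, so they can be combined with a bitwise OR and tested with a bitwise AND. The following is a minimal sketch of that idea only: the macro values are copied from this page, while the helper functions wants_mfe(), wants_pf() and needs_dp_matrices() are hypothetical names that are not part of RNAlib, and the rule in needs_dp_matrices() is one reading of the description above rather than the library's actual logic.

```c
/* Values copied from this page; with the real RNAlib headers these
 * macros already exist, hence the guards. */
#ifndef VRNA_OPTION_MFE
#define VRNA_OPTION_MFE        1
#endif
#ifndef VRNA_OPTION_PF
#define VRNA_OPTION_PF         2
#endif
#ifndef VRNA_OPTION_EVAL_ONLY
#define VRNA_OPTION_EVAL_ONLY  8
#endif

/* Hypothetical helper: is the MFE bit set in the option word? */
static int wants_mfe(unsigned int options)
{
  return (options & VRNA_OPTION_MFE) != 0;
}

/* Hypothetical helper: is the PF bit set in the option word? */
static int wants_pf(unsigned int options)
{
  return (options & VRNA_OPTION_PF) != 0;
}

/* Hypothetical helper: DP matrices are assumed to be needed when MFE or PF
 * is requested and the EVAL_ONLY bit is not set. */
static int needs_dp_matrices(unsigned int options)
{
  return (options & (VRNA_OPTION_MFE | VRNA_OPTION_PF)) != 0 &&
         (options & VRNA_OPTION_EVAL_ONLY) == 0;
}
```

For example, needs_dp_matrices(VRNA_OPTION_MFE | VRNA_OPTION_EVAL_ONLY) evaluates to 0 in this sketch, which mirrors the memory-saving case described above.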
#define VRNA_MODEL_DEFAULT_TEMPERATURE   37.0

Default temperature for structure prediction and free energy evaluation in °C

See also
vrna_md_t.temperature, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_PF_SCALE   -1

Default scaling factor for partition function computations.

See also
vrna_exp_param_t.pf_scale, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_BETA_SCALE   1.

Default scaling factor for absolute thermodynamic temperature in Boltzmann factors.

See also
vrna_exp_param_t.alpha, vrna_md_t.betaScale, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_DANGLES   2

Default dangling end model.

See also
vrna_md_t.dangles, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_SPECIAL_HP   1

Default model behavior for lookup of special tri-, tetra-, and hexa-loops.

See also
vrna_md_t.special_hp, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_NO_LP   0

Default model behavior for so-called 'lonely pairs'.

See also
vrna_md_t.noLP, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_NO_GU   0

Default model behavior for G-U base pairs.

See also
vrna_md_t.noGU, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_NO_GU_CLOSURE   0

Default model behavior for G-U base pairs closing a loop.

See also
vrna_md_t.noGUclosure, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_CIRC   0

Default model behavior to treat a molecule as a circular RNA (DNA)

See also
vrna_md_t.circ, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_GQUAD   0

Default model behavior regarding the treatment of G-Quadruplexes.

See also
vrna_md_t.gquad, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_UNIQ_ML   0

Default behavior of the model regarding unique multibranch loop decomposition.

See also
vrna_md_t.uniq_ML, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_ENERGY_SET   0

Default model behavior on which energy set to use.

See also
vrna_md_t.energy_set, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_BACKTRACK   1

Default model behavior with regards to backtracking of structures.

See also
vrna_md_t.backtrack, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_BACKTRACK_TYPE   'F'

Default model behavior on what type of backtracking to perform.

See also
vrna_md_t.backtrack_type, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_COMPUTE_BPP   1

Default model behavior with regards to computing base pair probabilities.

See also
vrna_md_t.compute_bpp, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_MAX_BP_SPAN   -1

Default model behavior for the allowed maximum base pair span.

See also
vrna_md_t.max_bp_span, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_LOG_ML   0

Default model behavior on how to evaluate the energy contribution of multibranch loops.

See also
vrna_md_t.logML, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_ALI_OLD_EN   0

Default model behavior for consensus structure energy evaluation.

See also
vrna_md_t.oldAliEn, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_ALI_RIBO   0

Default model behavior for consensus structure covariance contribution assessment.

See also
vrna_md_t.ribo, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_ALI_CV_FACT   1.

Default model behavior for weighting the covariance score in consensus structure prediction.

See also
vrna_md_t.cv_fact, vrna_md_set_default()
#define VRNA_MODEL_DEFAULT_ALI_NC_FACT   1.

Default model behavior for weighting the nucleotide conservation in consensus structure prediction.

See also
vrna_md_t.nc_fact, vrna_md_set_default()

Enumeration Type Documentation

enum vrna_mx_t

An enumerator that is used to specify the type of a polymorphic Dynamic Programming (DP) matrix data structure.

See also
vrna_mx_mfe_t, vrna_mx_pf_t
Enumerator
VRNA_MX_DEFAULT 

Default DP matrices.

VRNA_MX_LFOLD 

DP matrices suitable for local structure prediction.

See also
Lfold(), pfl_fold()
VRNA_MX_2DFOLD 

DP matrices suitable for distance class partitioned structure prediction.

See also
vrna_TwoD_fold(), vrna_TwoD_pf_fold()
enum vrna_vc_t

An enumerator that is used to specify the type of a vrna_fold_compound.

Enumerator
VRNA_VC_TYPE_SINGLE 

Type is suitable for single, and hybridizing sequences

VRNA_VC_TYPE_ALIGNMENT 

Type is suitable for sequence alignments (consensus structure prediction)

Function Documentation

vrna_fold_compound* vrna_get_fold_compound ( const char *sequence, vrna_md_t *md_p, unsigned int options )

Retrieve a vrna_fold_compound data structure for single sequences and hybridizing sequences.

This function provides an easy interface to obtain a prefilled vrna_fold_compound by passing a single sequence, or two concatenated sequences, as input. For the latter, the sequences need to be separated by an '&' character like this:

char *sequence = "GGGG&CCCC"; 

The optional parameter 'md_p' can be used to specify the model details for computations on the vrna_fold_compound's content. The third parameter 'options' is used to specify the DP matrix requirements and the corresponding set of energy parameters. Use the macros

VRNA_OPTION_MFE
VRNA_OPTION_PF
VRNA_OPTION_EVAL_ONLY

to specify the required type of computations that will be performed with the vrna_fold_compound.

Note
The sequence string must be uppercase, and should contain only RNA (resp. DNA) alphabet depending on what energy parameter set is used
See also
vrna_get_fold_compound_ali(), vrna_md_t, VRNA_OPTION_MFE, VRNA_OPTION_PF, VRNA_OPTION_EVAL_ONLY
Parameters
sequence  A single sequence, or two concatenated sequences separated by an '&' character
md_p  An optional set of model details
options  The options for DP matrices memory allocation
Returns
A prefilled vrna_fold_compound that can be readily used for computations
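
Because the input must be uppercase and may contain an '&' separator, it can be useful to check a string before handing it to the library. The sketch below is one way to do that without any RNAlib calls. The function find_cut_point() is a hypothetical helper, not part of RNAlib, and rejecting more than one '&' is an assumption based on the "two concatenated sequences" wording above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of RNAlib.
 * Returns the 0-based index of the '&' separator,
 *         -1 if the string contains no separator,
 *         -2 if the string is not acceptable (a character that is not an
 *            uppercase letter, or more than one '&'). */
static int find_cut_point(const char *sequence)
{
  int cut = -1;

  for (size_t i = 0; sequence[i] != '\0'; i++) {
    unsigned char c = (unsigned char)sequence[i];

    if (c == '&') {
      if (cut != -1)
        return -2;            /* assumption: at most one separator */
      cut = (int)i;
    } else if (!isupper(c)) {
      return -2;              /* lowercase letters, digits, gaps, ... */
    }
  }

  return cut;
}
```

This check only looks at case and the separator; whether the letters belong to the RNA or DNA alphabet still depends on the energy parameter set in use.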
vrna_fold_compound* vrna_get_fold_compound_ali ( const char **sequences, vrna_md_t *md_p, unsigned int options )

Retrieve a vrna_fold_compound data structure for sequence alignments.

This function provides an easy interface to obtain a prefilled vrna_fold_compound by passing an alignment of sequences.

The optional parameter 'md_p' can be used to specify the model details for computations on the vrna_fold_compound's content. The third parameter 'options' is used to specify the DP matrix requirements and the corresponding set of energy parameters. Use the macros

VRNA_OPTION_MFE
VRNA_OPTION_PF
VRNA_OPTION_EVAL_ONLY

to specify the required type of computations that will be performed with the vrna_fold_compound.

Note
The sequence strings must be uppercase, and should contain only RNA (resp. DNA) alphabet including gap characters depending on what energy parameter set is used.
See also
vrna_get_fold_compound(), vrna_md_t, VRNA_OPTION_MFE, VRNA_OPTION_PF, VRNA_OPTION_EVAL_ONLY, read_clustal()
Parameters
sequences  A sequence alignment including 'gap' characters
md_p  An optional set of model details
options  The options for DP matrices memory allocation
Returns
A prefilled vrna_fold_compound that can be readily used for computations
void vrna_params_update ( vrna_fold_compound *vc, vrna_param_t *par )

Update/Reset energy parameters data structure within a vrna_fold_compound.

Passing NULL as second argument leads to a reset of the energy parameters within vc to their default values. Otherwise, the energy parameters provided will be copied over into vc.

energy_parameters

Parameters
vc  The vrna_fold_compound that is about to receive updated energy parameters
par  The energy parameters used to substitute those within vc (may be NULL)
void vrna_free_fold_compound ( vrna_fold_compound *vc )

Free memory occupied by a vrna_fold_compound.

See also
vrna_get_fold_compound(), vrna_get_fold_compound_ali(), vrna_free_mfe_matrices(), vrna_free_pf_matrices()
Parameters
vc  The vrna_fold_compound that is to be erased from memory
void vrna_free_mfe_matrices ( vrna_fold_compound *vc )

Free memory occupied by the Minimum Free Energy (MFE) Dynamic Programming (DP) matrices.

See also
vrna_get_fold_compound(), vrna_get_fold_compound_ali(), vrna_free_fold_compound(), vrna_free_pf_matrices()
Parameters
vc  The vrna_fold_compound storing the MFE DP matrices that are to be erased from memory
void vrna_free_pf_matrices ( vrna_fold_compound *vc )

Free memory occupied by the Partition Function (PF) Dynamic Programming (DP) matrices.

See also
vrna_get_fold_compound(), vrna_get_fold_compound_ali(), vrna_free_fold_compound(), vrna_free_mfe_matrices()
Parameters
vc  The vrna_fold_compound storing the PF DP matrices that are to be erased from memory
void vrna_md_set_default ( vrna_md_t *md )

Set default model details.

Use this function if you wish to initialize a vrna_md_t data structure with its default values

Parameters
md  A pointer to the data structure that is about to be initialized
void vrna_md_update ( vrna_md_t *md )

Update the model details data structure.

This function should be called after changing the vrna_md_t.energy_set attribute since it re-initializes base pairing related arrays within the vrna_md_t data structure. In particular, vrna_md_t.pair, vrna_md_t.alias, and vrna_md_t.rtype are set to the values that correspond to the specified vrna_md_t.energy_set option.

See also
vrna_md_t, vrna_md_t.energy_set, vrna_md_t.pair, vrna_md_t.rtype, vrna_md_t.alias, vrna_md_set_default()
void vrna_md_set_globals ( vrna_md_t *md )

Set default model details.

Use this function if you wish to initialize a vrna_md_t data structure with its default values, i.e. the global model settings as provided by the deprecated global variables.

Deprecated:
This function will vanish as soon as backward compatibility of RNAlib is dropped (expected in version 3). Use vrna_md_set_default() instead!
Parameters
md  A pointer to the data structure that is about to be initialized

Variable Documentation

double temperature

Rescale energy parameters to a temperature in °C.

Default is 37 °C. You have to call the update_..._params() functions after changing this parameter.

double pf_scale

A scaling factor used by pf_fold() to avoid overflows.

Should be set to approximately $\exp((-F/kT)/\mathrm{length})$, where $F$ is an estimate for the ensemble free energy, for example the minimum free energy. You must call update_pf_params() after changing this parameter.
If pf_scale is -1 (the default), an estimate will be provided automatically when computing partition functions, e.g. with pf_fold(). The automatic estimate is usually insufficient for sequences more than a few hundred bases long.
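
As a worked example of the formula above, assume a sequence of length 100 with an estimated free energy of $F = -30$ kcal/mol at 37 °C. These input numbers are made up for illustration; the gas constant is taken as roughly 0.0019872 kcal/(mol K).

```latex
\begin{align*}
kT &\approx 0.0019872\ \mathrm{kcal\,mol^{-1}\,K^{-1}} \times 310.15\ \mathrm{K}
    \approx 0.6163\ \mathrm{kcal/mol} \\
\frac{-F}{kT \cdot \mathrm{length}} &= \frac{30}{0.6163 \times 100} \approx 0.4868 \\
\mathtt{pf\_scale} &\approx \exp(0.4868) \approx 1.63
\end{align*}
```

A scaling factor of this size is applied once per nucleotide, which is why even a small error in the estimate of $F$ matters for long sequences.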

int dangles

Switch the energy model for dangling end contributions (0, 1, 2, 3)

If set to 0, no stabilizing energies are assigned to bases adjacent to helices in free ends and multiloops (so-called dangling ends). Normally (dangles = 1), dangling end energies are assigned only to unpaired bases, and a base cannot participate simultaneously in two dangling ends. In the partition function algorithm pf_fold() these checks are neglected. If dangles is set to 2, all folding routines will follow this convention. This treatment of dangling ends gives more favorable energies to helices directly adjacent to one another, which can be beneficial since such helices often do engage in stabilizing interactions through co-axial stacking.
If dangles = 3, co-axial stacking is explicitly included for adjacent helices in multi-loops. The option affects only MFE folding and energy evaluation (fold() and energy_of_structure()), as well as suboptimal folding (subopt()) via re-evaluation of energies. Co-axial stacking with one intervening mismatch is not considered so far.

Default is 2 in most algorithms, partition function algorithms can only handle 0 and 2
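
The constraints in this description can be written down as two small predicates. This is only a sketch of the documented value ranges; dangles_is_valid() and dangles_ok_for_pf() are hypothetical helper names and do not exist in RNAlib.

```c
/* Hypothetical helper: the documented dangles models are 0, 1, 2 and 3. */
static int dangles_is_valid(int dangles)
{
  return dangles >= 0 && dangles <= 3;
}

/* Hypothetical helper: per the text above, partition function algorithms
 * can only handle dangles = 0 and dangles = 2. */
static int dangles_ok_for_pf(int dangles)
{
  return dangles == 0 || dangles == 2;
}
```

A caller could use the second predicate to warn when a setting chosen for MFE folding, such as 1 or 3, is about to be reused for a partition function computation.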

int tetra_loop

Include special stabilizing energies for some tri-, tetra- and hexa-loops.

Default is 1.

int noLonelyPairs

Global switch to avoid/allow helices of length 1.

Disallow all pairs which can only occur as lonely pairs (i.e. as helix of length 1). This avoids lonely base pairs in the predicted structures in most cases.

int canonicalBPonly

Do not use this variable; it will eventually be removed in one of the next versions.

int energy_set

0 = BP; 1 = any with GC; 2 = any with AU parameters

If set to 1 or 2: fold sequences from an artificial alphabet ABCD..., where A pairs B, C pairs D, etc. using either GC (1) or AU parameters (2); default is 0, you probably don't want to change it.
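
The pairing rule of the artificial alphabet (A pairs B, C pairs D, and so on) can be sketched as a small predicate. The function artificial_pair() is a hypothetical illustration, not an RNAlib function; it assumes that pairing is symmetric (B also pairs A) and that only the uppercase letters A to Z are considered.

```c
/* Hypothetical helper, not part of RNAlib: returns 1 if the two letters
 * form a pair in the artificial alphabet ABCD..., and 0 otherwise.
 * Letters are grouped as (A,B), (C,D), (E,F), ...; two different letters
 * pair exactly when they fall into the same group. */
static int artificial_pair(char x, char y)
{
  if (x < 'A' || x > 'Z' || y < 'A' || y > 'Z')
    return 0;                 /* only uppercase letters are considered */

  int ix = x - 'A';
  int iy = y - 'A';

  return ix != iy && ix / 2 == iy / 2;
}
```

In this sketch 'B' and 'C' do not pair, even though they are neighbours in the alphabet, because they belong to different groups.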

int do_backtrack

do backtracking, i.e. compute secondary structures or base pair probabilities

If 0, do not calculate pair probabilities in pf_fold(); this is about twice as fast. Default is 1.

char backtrack_type

A backtrack array marker for inverse_fold()

If set to 'C': force (1,N) to be paired, 'M' fold as if the sequence were inside a multi-loop. Otherwise ('F') the usual mfe structure is computed.

char* nonstandards

contains allowed non standard base pairs

Lists additional base pairs that will be allowed to form in addition to GC, CG, AU, UA, GU and UG. Nonstandard base pairs are given a stacking energy of 0.

int max_bp_span

Maximum allowed base pair span.

A value of -1 indicates no restriction for distant base pairs.
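
A minimal sketch of how such a span limit can be applied to a candidate base pair (i, j) with i < j follows. The helper span_allowed() is hypothetical and not part of RNAlib, and counting the span as j - i + 1 is an assumption made for this illustration; the page itself only documents that -1 means "no restriction".

```c
/* Hypothetical helper, not part of RNAlib.
 * Returns 1 if a base pair (i, j), i < j, is permitted under the given
 * maximum base pair span, and 0 otherwise.
 * Assumption: the span of a pair is counted as j - i + 1. */
static int span_allowed(int i, int j, int max_bp_span)
{
  if (max_bp_span == -1)
    return 1;                 /* -1: no restriction for distant pairs */

  return (j - i + 1) <= max_bp_span;
}
```

With a limit of 50, for instance, the pair (1, 50) passes this check while (1, 51) does not.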