RNAlib-2.2.0-RC3
Soft Constraints

Modules

 Generalized Soft Constraints
 

Data Structures

struct  vrna_sc_t
 The soft constraints data structure.
 

Macros

#define VRNA_CONSTRAINT_SOFT_MFE   8192U
 Soft constraints flag, apply constraints for MFE calculations.
 
#define VRNA_CONSTRAINT_SOFT_PF   16384U
 Soft constraints flag, apply constraints for partition function calculations.
 
#define VRNA_OBJECTIVE_FUNCTION_QUADRATIC   0
 Use the sum of squared aberrations as objective function.
 
#define VRNA_OBJECTIVE_FUNCTION_ABSOLUTE   1
 Use the sum of absolute aberrations as objective function.
 
#define VRNA_MINIMIZER_DEFAULT   0
 Use a custom implementation of the gradient descent algorithm to minimize the objective function.
 
#define VRNA_MINIMIZER_CONJUGATE_FR   1
 Use the GNU Scientific Library implementation of the Fletcher-Reeves conjugate gradient algorithm to minimize the objective function.
 
#define VRNA_MINIMIZER_CONJUGATE_PR   2
 Use the GNU Scientific Library implementation of the Polak-Ribiere conjugate gradient algorithm to minimize the objective function.
 
#define VRNA_MINIMIZER_VECTOR_BFGS   3
 Use the GNU Scientific Library implementation of the vector Broyden-Fletcher-Goldfarb-Shanno algorithm to minimize the objective function.
 
#define VRNA_MINIMIZER_VECTOR_BFGS2   4
 Use the GNU Scientific Library implementation of the vector Broyden-Fletcher-Goldfarb-Shanno algorithm (the more efficient BFGS2 variant) to minimize the objective function.
 
#define VRNA_MINIMIZER_STEEPEST_DESCENT   5
 Use the GNU Scientific Library implementation of the steepest descent algorithm to minimize the objective function.
 

Typedefs

typedef void(* progress_callback) (int iteration, double score, double *epsilon)
 Callback for following the progress of the minimization process.
 

Functions

void vrna_sc_init (vrna_fold_compound *vc)
 Initialize an empty soft constraints data structure within a vrna_fold_compound.
 
void vrna_sc_add_bp (vrna_fold_compound *vc, const double **constraints, unsigned int options)
 Add soft constraints for paired nucleotides.
 
void vrna_sc_add_up (vrna_fold_compound *vc, const double *constraints, unsigned int options)
 Add soft constraints for unpaired nucleotides.
 
void vrna_sc_remove (vrna_fold_compound *vc)
 Remove soft constraints from vrna_fold_compound.
 
void vrna_sc_free (vrna_sc_t *sc)
 Free memory occupied by a vrna_sc_t data structure.
 
int vrna_sc_SHAPE_add_deigan (vrna_fold_compound *vc, const double *reactivities, double m, double b, unsigned int options)
 Add SHAPE reactivity data as soft constraints (Deigan et al. method).
 
int vrna_sc_SHAPE_add_deigan_ali (vrna_fold_compound *vc, const char **shape_files, const int *shape_file_association, double m, double b, unsigned int options)
 Add SHAPE reactivity data from files as soft constraints for consensus structure prediction (Deigan et al. method).
 
int vrna_sc_SHAPE_add_zarringhalam (vrna_fold_compound *vc, const double *reactivities, double b, double default_value, const char *shape_conversion, unsigned int options)
 Add SHAPE reactivity data as soft constraints (Zarringhalam et al. method).
 
int vrna_sc_SHAPE_to_pr (const char *shape_conversion, double *values, int length, double default_value)
 Convert SHAPE reactivity values to probabilities for being unpaired.
 
void vrna_sc_minimize_pertubation (vrna_fold_compound *vc, const double *q_prob_unpaired, int objective_function, double sigma_squared, double tau_squared, int algorithm, int sample_size, double *epsilon, double initialStepSize, double minStepSize, double minImprovement, double minimizerTolerance, progress_callback callback)
 Find a vector of perturbation energies that minimizes the discrepancies between predicted and observed pairing probabilities and the amount of necessary adjustments.
 

Detailed Description

Macro Definition Documentation

#define VRNA_OBJECTIVE_FUNCTION_QUADRATIC   0

Use the sum of squared aberrations as objective function.

$ F(\vec\epsilon) = \sum_{i = 1}^n{ \frac{\epsilon_i^2}{\tau^2} } + \sum_{i = 1}^n{ \frac{(p_i(\vec\epsilon) - q_i)^2}{\sigma^2} } \to min $

#define VRNA_OBJECTIVE_FUNCTION_ABSOLUTE   1

Use the sum of absolute aberrations as objective function.

$ F(\vec\epsilon) = \sum_{i = 1}^n{ \frac{|\epsilon_i|}{\tau^2} } + \sum_{i = 1}^n{ \frac{|p_i(\vec\epsilon) - q_i|}{\sigma^2} } \to min $
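
As a minimal sketch of the two objective functions, the following C code evaluates both formulas for given arrays. The function names and the 0-based array layout are assumptions made for illustration and are not part of RNAlib; the predicted probabilities p_i are passed in already evaluated, whereas the library derives them from the perturbed energy model.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Quadratic objective: sum of eps_i^2 / tau^2 plus sum of (p_i - q_i)^2 / sigma^2.
 * Hypothetical helper, not an RNAlib function. */
double objective_quadratic(const double *eps, const double *p, const double *q,
                           size_t n, double tau_squared, double sigma_squared)
{
  assert(tau_squared > 0.0 && sigma_squared > 0.0);
  double f = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = p[i] - q[i];
    f += eps[i] * eps[i] / tau_squared + d * d / sigma_squared;
  }
  return f;
}

/* Absolute objective: sum of |eps_i| / tau^2 plus sum of |p_i - q_i| / sigma^2.
 * Hypothetical helper, not an RNAlib function. */
double objective_absolute(const double *eps, const double *p, const double *q,
                          size_t n, double tau_squared, double sigma_squared)
{
  assert(tau_squared > 0.0 && sigma_squared > 0.0);
  double f = 0.0;
  for (size_t i = 0; i < n; i++)
    f += fabs(eps[i]) / tau_squared + fabs(p[i] - q[i]) / sigma_squared;
  return f;
}
```

Both helpers take the same arguments, so switching between the quadratic and the absolute form only changes which function is called.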

#define VRNA_MINIMIZER_CONJUGATE_FR   1

Use the GNU Scientific Library implementation of the Fletcher-Reeves conjugate gradient algorithm to minimize the objective function.

This algorithm can only be used when the GNU Scientific Library is available on your system.

#define VRNA_MINIMIZER_CONJUGATE_PR   2

Use the GNU Scientific Library implementation of the Polak-Ribiere conjugate gradient algorithm to minimize the objective function.

This algorithm can only be used when the GNU Scientific Library is available on your system.

#define VRNA_MINIMIZER_VECTOR_BFGS   3

Use the GNU Scientific Library implementation of the vector Broyden-Fletcher-Goldfarb-Shanno algorithm to minimize the objective function.

This algorithm can only be used when the GNU Scientific Library is available on your system.

#define VRNA_MINIMIZER_VECTOR_BFGS2   4

Use the GNU Scientific Library implementation of the vector Broyden-Fletcher-Goldfarb-Shanno algorithm (the more efficient BFGS2 variant) to minimize the objective function.

This algorithm can only be used when the GNU Scientific Library is available on your system.

#define VRNA_MINIMIZER_STEEPEST_DESCENT   5

Use the GNU Scientific Library implementation of the steepest descent algorithm to minimize the objective function.

This algorithm can only be used when the GNU Scientific Library is available on your system.

Typedef Documentation

typedef void(* progress_callback) (int iteration, double score, double *epsilon)

Callback for following the progress of the minimization process.

Parameters
iteration  The number of the current iteration
score      The score of the objective function
epsilon    The perturbation vector yielding the reported score
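
A callback matching this typedef might look like the sketch below. The typedef is repeated so the example compiles on its own; the function name print_progress and the two static bookkeeping variables are hypothetical and only illustrate one way to follow the minimization.

```c
#include <assert.h>
#include <stdio.h>

/* Same signature as the progress_callback typedef documented above,
 * repeated here so the sketch is self-contained. */
typedef void (*progress_callback)(int iteration, double score, double *epsilon);

/* Hypothetical bookkeeping: last reported iteration and lowest score seen. */
int    last_iteration = -1;
double best_score     = 0.0;

/* Report each iteration on stderr and remember the best score so far. */
void print_progress(int iteration, double score, double *epsilon)
{
  assert(iteration >= 0);
  (void)epsilon; /* the perturbation vector is not inspected in this sketch */
  if (last_iteration < 0 || score < best_score)
    best_score = score;
  last_iteration = iteration;
  fprintf(stderr, "iteration %d: score %g (best so far %g)\n",
          iteration, score, best_score);
}
```

A pointer to such a function would be passed as the callback argument of vrna_sc_minimize_pertubation().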

Function Documentation

void vrna_sc_init ( vrna_fold_compound *  vc)

Initialize an empty soft constraints data structure within a vrna_fold_compound.

This function adds a proper soft constraints data structure to the vrna_fold_compound data structure. If soft constraints already exist within the fold compound, they are removed.

Note
Accepts a vrna_fold_compound of type VRNA_VC_TYPE_SINGLE or VRNA_VC_TYPE_ALIGNMENT.
See also
vrna_sc_add_bp(), vrna_sc_add_up(), vrna_sc_SHAPE_add_deigan(), vrna_sc_SHAPE_add_zarringhalam(), vrna_sc_remove(), vrna_sc_add_f(), vrna_sc_add_exp_f(), vrna_sc_add_pre(), vrna_sc_add_post()
Parameters
vc  The vrna_fold_compound to which an empty soft constraints feature is to be added
void vrna_sc_add_bp ( vrna_fold_compound *  vc,
const double **  constraints,
unsigned int  options 
)

Add soft constraints for paired nucleotides.

Parameters
vc           The vrna_fold_compound the soft constraints are associated with
constraints  A two-dimensional array of pseudo free energies in $ kcal / mol $
options      The options flag indicating how/where to store the soft constraints
void vrna_sc_add_up ( vrna_fold_compound *  vc,
const double *  constraints,
unsigned int  options 
)

Add soft constraints for unpaired nucleotides.

Parameters
vc           The vrna_fold_compound the soft constraints are associated with
constraints  A vector of pseudo free energies in $ kcal / mol $
options      The options flag indicating how/where to store the soft constraints
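
The two soft constraints flags listed under Macros occupy distinct bits (8192 and 16384), so an options value can plausibly be built by combining them with bitwise OR. The sketch below repeats the macro values from this page; the helper soft_constraint_options is hypothetical, and that the library accepts both flags at once is an assumption based only on the flag values.

```c
/* Flag values as listed in the Macros section of this page. */
#define VRNA_CONSTRAINT_SOFT_MFE 8192U
#define VRNA_CONSTRAINT_SOFT_PF  16384U

/* Hypothetical helper: build an options value from two yes/no choices. */
unsigned int soft_constraint_options(int for_mfe, int for_pf)
{
  unsigned int options = 0U;
  if (for_mfe)
    options |= VRNA_CONSTRAINT_SOFT_MFE; /* apply in MFE calculations */
  if (for_pf)
    options |= VRNA_CONSTRAINT_SOFT_PF;  /* apply in partition function calculations */
  return options;
}
```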
void vrna_sc_remove ( vrna_fold_compound *  vc)

Remove soft constraints from vrna_fold_compound.

Note
Accepts a vrna_fold_compound of type VRNA_VC_TYPE_SINGLE or VRNA_VC_TYPE_ALIGNMENT.
Parameters
vc  The vrna_fold_compound possibly containing soft constraints
void vrna_sc_free ( vrna_sc_t *  sc)

Free memory occupied by a vrna_sc_t data structure.

Parameters
sc  The data structure to free from memory
int vrna_sc_SHAPE_add_deigan ( vrna_fold_compound *  vc,
const double *  reactivities,
double  m,
double  b,
unsigned int  options 
)

Add SHAPE reactivity data as soft constraints (Deigan et al. method)

This approach of SHAPE directed RNA folding uses the simple linear ansatz

\[ \Delta G_{\text{SHAPE}}(i) = m \ln(\text{SHAPE reactivity}(i)+1)+ b \]

to convert SHAPE reactivity values to pseudo energies whenever a nucleotide $ i $ contributes to a stacked pair. A positive slope $ m $ penalizes high reactivities in paired regions, while a negative intercept $ b $ results in a confirmatory "bonus" free energy for correctly predicted base pairs. Since the energy evaluation of a base pair stack involves two pairs, the pseudo energies are added for all four contributing nucleotides. Consequently, the energy term is applied twice for pairs inside a helix and only once for pairs adjacent to other structures. For all other loop types the energy model remains unchanged, even when the experimental data strongly disagrees with a certain motif.

See also
For further details, we refer to [3].
vrna_sc_remove(), vrna_sc_SHAPE_add_zarringhalam(), vrna_sc_minimize_pertubation()
Parameters
vc            The vrna_fold_compound the soft constraints are associated with
reactivities  A vector of normalized SHAPE reactivities
m             The slope of the conversion function
b             The intercept of the conversion function
options       The options flag indicating how/where to store the soft constraints
Returns
1 on successful extraction of the method, 0 on errors
int vrna_sc_SHAPE_add_deigan_ali ( vrna_fold_compound *  vc,
const char **  shape_files,
const int *  shape_file_association,
double  m,
double  b,
unsigned int  options 
)

Add SHAPE reactivity data from files as soft constraints for consensus structure prediction (Deigan et al. method)

Parameters
vc                      The vrna_fold_compound the soft constraints are associated with
shape_files             A set of filenames that contain normalized SHAPE reactivity data
shape_file_association  An array of integers that associate the files with sequences in the alignment
m                       The slope of the conversion function
b                       The intercept of the conversion function
options                 The options flag indicating how/where to store the soft constraints
Returns
1 on successful extraction of the method, 0 on errors
int vrna_sc_SHAPE_add_zarringhalam ( vrna_fold_compound *  vc,
const double *  reactivities,
double  b,
double  default_value,
const char *  shape_conversion,
unsigned int  options 
)

Add SHAPE reactivity data as soft constraints (Zarringhalam et al. method)

This method first converts the observed SHAPE reactivity of nucleotide $ i $ into a probability $ q_i $ that position $ i $ is unpaired by means of a non-linear map. Then pseudo-energies of the form

\[ \Delta G_{\text{SHAPE}}(x,i) = \beta\ |x_i - q_i| \]

are computed, where $ x_i=0 $ if position $ i $ is unpaired and $ x_i=1 $ if $ i $ is paired in a given secondary structure. The parameter $ \beta $ serves as scaling factor. The magnitude of discrepancy between prediction and experimental observation is represented by $ |x_i - q_i| $.
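
The pseudo-energy term itself is a one-line computation. The sketch below follows the formula exactly as stated above, with $ x_i $ passed as 0 or 1; the function name is hypothetical, and the non-linear map from reactivities to $ q_i $ is not reproduced here.

```c
#include <assert.h>
#include <math.h>

/* Pseudo energy beta * |x - q| for one position, where x is the 0/1 state
 * variable from the formula above and q is the probability derived from the
 * SHAPE data. Hypothetical helper, not an RNAlib function. */
double zarringhalam_pseudo_energy(double beta, int x, double q)
{
  assert(x == 0 || x == 1);
  assert(q >= 0.0 && q <= 1.0);
  return beta * fabs((double)x - q);
}
```

For any fixed position the two possible states receive penalties that sum to beta, so a value of $ q_i $ near 0 or 1 strongly favors one state while $ q_i = 0.5 $ favors neither.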

See also
For further details, we refer to [16].
vrna_sc_remove(), vrna_sc_SHAPE_add_deigan(), vrna_sc_minimize_pertubation()
Parameters
vc            The vrna_fold_compound the soft constraints are associated with
reactivities  A vector of normalized SHAPE reactivities
b             The scaling factor $ \beta $ of the conversion function
options       The options flag indicating how/where to store the soft constraints
Returns
1 on successful extraction of the method, 0 on errors
int vrna_sc_SHAPE_to_pr ( const char *  shape_conversion,
double *  values,
int  length,
double  default_value 
)

Convert SHAPE reactivity values to probabilities for being unpaired.

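This function converts the SHAPE reactivities stored in the array values to probabilities for being unpaired, using the method selected by shape_conversion. The converted values replace the reactivities in the same array, and default_value is used for positions with invalid or missing reactivities.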

See also
vrna_read_SHAPE_file()
Parameters
shape_conversion  String defining the method used for the conversion process
values            Pointer to an array of SHAPE reactivities
length            Length of the array of SHAPE reactivities
default_value     Result used for positions with invalid/missing reactivity values
void vrna_sc_minimize_pertubation ( vrna_fold_compound *  vc,
const double *  q_prob_unpaired,
int  objective_function,
double  sigma_squared,
double  tau_squared,
int  algorithm,
int  sample_size,
double *  epsilon,
double  initialStepSize,
double  minStepSize,
double  minImprovement,
double  minimizerTolerance,
progress_callback  callback 
)

Find a vector of perturbation energies that minimizes the discrepancies between predicted and observed pairing probabilities and the amount of necessary adjustments.

Use an iterative minimization algorithm to find a vector of perturbation energies whose incorporation as soft constraints shifts the predicted pairing probabilities closer to the experimentally observed probabilities. The algorithm aims to minimize an objective function that penalizes both discrepancies between predicted and observed pairing probabilities and adjustments to the energy model, i.e. an appropriate vector of perturbation energies satisfies

\[ F(\vec\epsilon) = \sum_{\mu}{ \frac{\epsilon_{\mu}^2}{\tau^2} } + \sum_{i = 1}^n{ \frac{(p_i(\vec\epsilon) - q_i)^2}{\sigma^2} } \to \min. \]

An initialized fold compound and an array containing the observed probability for each nucleotide to be unpaired are required as input data. The parameters objective_function, sigma_squared and tau_squared adjust the aim of the objective function. Depending on which type of objective function is selected, either squared or absolute aberrations contribute to the objective function. The ratio of the parameters sigma_squared and tau_squared can be used to adjust the algorithm to find a solution either close to the thermodynamic prediction (sigma_squared >> tau_squared) or close to the experimental data (tau_squared >> sigma_squared). The minimization can be performed with a custom gradient descent implementation or with one of the minimizing algorithms provided by the GNU Scientific Library. All algorithms require the evaluation of the gradient of the objective function, which includes the evaluation of conditional pairing probabilities. Since an exact evaluation is expensive, the probabilities can also be estimated from sampling by setting an appropriate sample size. The resulting vector of perturbation energies is stored in the array epsilon. The progress of the minimization process can be tracked by implementing and passing a callback function.

See also
For further details, we refer to [15].
Parameters
vc                  Pointer to a fold compound
q_prob_unpaired     Pointer to an array containing the probability to be unpaired for each nucleotide
objective_function  The type of objective function to be used (VRNA_OBJECTIVE_FUNCTION_QUADRATIC / VRNA_OBJECTIVE_FUNCTION_ABSOLUTE)
sigma_squared       A factor used for weighting the objective function. More weight on this factor will lead to a solution close to the null vector.
tau_squared         A factor used for weighting the objective function. More weight on this factor will lead to a solution close to the data provided in q_prob_unpaired.
algorithm           The minimization algorithm (VRNA_MINIMIZER_*)
sample_size         The number of sampled sequences used for estimating the pairing probabilities. A value <= 0 will lead to an exact evaluation.
epsilon             A pointer to an array used for storing the calculated vector of perturbation energies
callback            A pointer to a callback function used for reporting the current minimization progress