Generate Soft Constraints from Data

Find a vector of perturbation energies that minimizes the discrepancies between predicted and observed pairing probabilities and the amount of necessary adjustments.

Defines

VRNA_OBJECTIVE_FUNCTION_QUADRATIC
#include <ViennaRNA/perturbation_fold.h>

Use the sum of squared aberrations as objective function.

\( F(\vec\epsilon) = \sum_{i = 1}^n{ \frac{\epsilon_i^2}{\tau^2} } + \sum_{i = 1}^n{ \frac{(p_i(\vec\epsilon) - q_i)^2}{\sigma^2} } \to \min \)

VRNA_OBJECTIVE_FUNCTION_ABSOLUTE
#include <ViennaRNA/perturbation_fold.h>

Use the sum of absolute aberrations as objective function.

\( F(\vec\epsilon) = \sum_{i = 1}^n{ \frac{|\epsilon_i|}{\tau^2} } + \sum_{i = 1}^n{ \frac{|p_i(\vec\epsilon) - q_i|}{\sigma^2} } \to \min \)
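
To make the difference between the quadratic and the absolute objective function concrete, the following is a minimal, self-contained sketch that evaluates both formulas for given arrays. It is not part of the ViennaRNA API: the function names, the helper `abs_value`, and the 0-based array layout are assumptions made for illustration, and the predicted probabilities `p` are passed in precomputed rather than derived from `eps` by a folding model.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value helper; avoids a dependency on <math.h>. */
static double abs_value(double x)
{
  return (x < 0.0) ? -x : x;
}

/*
 * Quadratic objective (hypothetical helper):
 *   sum_i eps_i^2 / tau^2 + sum_i (p_i - q_i)^2 / sigma^2
 * eps, p and q are 0-based arrays of length n.
 */
double objective_quadratic(const double *eps, const double *p, const double *q,
                           size_t n, double sigma_squared, double tau_squared)
{
  assert(sigma_squared > 0.0 && tau_squared > 0.0);
  double f = 0.0;
  for (size_t i = 0; i < n; i++) {
    double d = p[i] - q[i];
    f += eps[i] * eps[i] / tau_squared + d * d / sigma_squared;
  }
  return f;
}

/*
 * Absolute objective (hypothetical helper):
 *   sum_i |eps_i| / tau^2 + sum_i |p_i - q_i| / sigma^2
 */
double objective_absolute(const double *eps, const double *p, const double *q,
                          size_t n, double sigma_squared, double tau_squared)
{
  assert(sigma_squared > 0.0 && tau_squared > 0.0);
  double f = 0.0;
  for (size_t i = 0; i < n; i++)
    f += abs_value(eps[i]) / tau_squared
         + abs_value(p[i] - q[i]) / sigma_squared;
  return f;
}
```

Because the quadratic form squares each term, large individual deviations dominate its score, whereas the absolute form weights all deviations proportionally.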

VRNA_MINIMIZER_DEFAULT
#include <ViennaRNA/perturbation_fold.h>

Use a custom implementation of the gradient descent algorithm to minimize the objective function.

VRNA_MINIMIZER_CONJUGATE_FR
#include <ViennaRNA/perturbation_fold.h>

Use the GNU Scientific Library implementation of the Fletcher-Reeves conjugate gradient algorithm to minimize the objective function.

Please note that this algorithm can only be used when the GNU Scientific Library is available on your system.

VRNA_MINIMIZER_CONJUGATE_PR
#include <ViennaRNA/perturbation_fold.h>

Use the GNU Scientific Library implementation of the Polak-Ribiere conjugate gradient algorithm to minimize the objective function.

Please note that this algorithm can only be used when the GNU Scientific Library is available on your system.

VRNA_MINIMIZER_VECTOR_BFGS
#include <ViennaRNA/perturbation_fold.h>

Use the GNU Scientific Library implementation of the vector Broyden-Fletcher-Goldfarb-Shanno algorithm to minimize the objective function.

Please note that this algorithm can only be used when the GNU Scientific Library is available on your system.

VRNA_MINIMIZER_VECTOR_BFGS2
#include <ViennaRNA/perturbation_fold.h>

Use the GNU Scientific Library implementation of the vector Broyden-Fletcher-Goldfarb-Shanno algorithm in its second, more efficient variant (vector_bfgs2) to minimize the objective function.

Please note that this algorithm can only be used when the GNU Scientific Library is available on your system.

VRNA_MINIMIZER_STEEPEST_DESCENT
#include <ViennaRNA/perturbation_fold.h>

Use the GNU Scientific Library implementation of the steepest descent algorithm to minimize the objective function.

Please note that this algorithm can only be used when the GNU Scientific Library is available on your system.

Typedefs

typedef void (*progress_callback)(int iteration, double score, double *epsilon)
#include <ViennaRNA/perturbation_fold.h>

Callback for following the progress of the minimization process.

Param iteration:

The number of the current iteration

Param score:

The score of the objective function

Param epsilon:

The perturbation vector yielding the reported score
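
As an illustration of how such a callback might look, the sketch below logs every iteration and remembers the lowest score reported so far. The typedef is repeated locally only so that the example compiles without ViennaRNA; when the real header is included, that line should be dropped. The names `report_progress`, `best_iteration`, and `best_score` are hypothetical.

```c
#include <stdio.h>

/* Local copy of the documented typedef; omit when including the ViennaRNA header. */
typedef void (*progress_callback)(int iteration, double score, double *epsilon);

/* Hypothetical bookkeeping: the best (lowest) score seen so far. */
static int    best_iteration = -1;
static double best_score     = 0.0;

/* Example callback: log each iteration and track the lowest score. */
void report_progress(int iteration, double score, double *epsilon)
{
  (void)epsilon; /* the perturbation vector is not inspected in this sketch */

  if (best_iteration < 0 || score < best_score) {
    best_iteration = iteration;
    best_score     = score;
  }

  fprintf(stderr, "iteration %d: score %f\n", iteration, score);
}
```

A function with this signature can be passed as the last argument of the minimization function documented below.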

Functions

void vrna_sc_minimize_pertubation(vrna_fold_compound_t *fc, const double *q_prob_unpaired, int objective_function, double sigma_squared, double tau_squared, int algorithm, int sample_size, double *epsilon, double initialStepSize, double minStepSize, double minImprovement, double minimizerTolerance, progress_callback callback)
#include <ViennaRNA/perturbation_fold.h>

Find a vector of perturbation energies that minimizes the discrepancies between predicted and observed pairing probabilities and the amount of necessary adjustments.

Use an iterative minimization algorithm to find a vector of perturbation energies whose incorporation as soft constraints shifts the predicted pairing probabilities closer to the experimentally observed probabilities. The algorithm aims to minimize an objective function that penalizes discrepancies between predicted and observed pairing probabilities and energy model adjustments, i.e. an appropriate vector of perturbation energies satisfies

\[ F(\vec\epsilon) = \sum_{\mu}{ \frac{\epsilon_{\mu}^2}{\tau^2} } + \sum_{i = 1}^n{ \frac{(p_i(\vec\epsilon) - q_i)^2}{\sigma^2} } \to \min. \]

An initialized fold compound and an array containing the observed probability for each nucleotide to be unbound are required as input data. The parameters objective_function, sigma_squared and tau_squared adjust the aim of the objective function. Depending on which type of objective function is selected, either squared or absolute aberrations contribute to the objective function. The ratio of the parameters sigma_squared and tau_squared can be used to steer the algorithm towards a solution either close to the thermodynamic prediction (sigma_squared >> tau_squared) or close to the experimental data (tau_squared >> sigma_squared).

The minimization can be performed using a custom gradient descent implementation or one of the minimization algorithms provided by the GNU Scientific Library. All algorithms require the evaluation of the gradient of the objective function, which includes the evaluation of conditional pairing probabilities. Since an exact evaluation is expensive, the probabilities can also be estimated from sampling by setting an appropriate sample size.

The resulting vector of perturbation energies is stored in the array epsilon. The progress of the minimization process can be tracked by implementing and passing a callback function.
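
The effect of the ratio between sigma_squared and tau_squared can be illustrated with a small, self-contained toy example. It is not the ViennaRNA implementation: instead of a folding model it assumes, purely for illustration, that each predicted probability responds linearly to its perturbation energy, \( p_i(\epsilon_i) = p^0_i + s \cdot \epsilon_i \), and it runs a plain fixed-step gradient descent on the quadratic objective. The names `toy_probability`, `toy_minimize`, `slope`, and `step` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy stand-in for the folding model (an assumption, not ViennaRNA behaviour):
 * the predicted probability responds linearly to the perturbation energy.
 */
static double toy_probability(double p0, double slope, double eps)
{
  return p0 + slope * eps;
}

/*
 * Fixed-step gradient descent on the quadratic objective
 *   F(eps) = sum_i eps_i^2 / tau^2 + sum_i (p_i(eps_i) - q_i)^2 / sigma^2
 * under the toy model above. The result is written to eps (length n).
 */
void toy_minimize(const double *p0, const double *q, size_t n, double slope,
                  double sigma_squared, double tau_squared,
                  double step, int iterations, double *eps)
{
  assert(sigma_squared > 0.0 && tau_squared > 0.0);

  for (size_t i = 0; i < n; i++)
    eps[i] = 0.0; /* start from the unperturbed model */

  for (int it = 0; it < iterations; it++) {
    for (size_t i = 0; i < n; i++) {
      double diff = toy_probability(p0[i], slope, eps[i]) - q[i];
      /* dF/d eps_i = 2 eps_i / tau^2 + 2 slope (p_i - q_i) / sigma^2 */
      double grad = 2.0 * eps[i] / tau_squared
                    + 2.0 * slope * diff / sigma_squared;
      eps[i] -= step * grad;
    }
  }
}
```

In this toy setting, calling `toy_minimize` with sigma_squared much larger than tau_squared leaves the perturbation close to zero, so the prediction stays near its unperturbed value. Swapping the two weights yields a larger perturbation that moves the prediction towards the observed value.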

See also

For further details we refer to Washietl et al. [2012].

Parameters:
  • fc – Pointer to a fold compound

  • q_prob_unpaired – Pointer to an array containing the probability to be unpaired for each nucleotide

  • objective_function – The type of objective function to be used (VRNA_OBJECTIVE_FUNCTION_QUADRATIC / VRNA_OBJECTIVE_FUNCTION_ABSOLUTE)

  • sigma_squared – A factor used for weighting the objective function. More weight on this factor will lead to a solution close to the null vector.

  • tau_squared – A factor used for weighting the objective function. More weight on this factor will lead to a solution close to the data provided in q_prob_unpaired.

  • algorithm – The minimization algorithm (VRNA_MINIMIZER_*)

  • sample_size – The number of sampled structures used for estimating the pairing probabilities. A value <= 0 will lead to an exact evaluation.

  • epsilon – A pointer to an array used for storing the calculated vector of perturbation energies

  • callback – A pointer to a callback function used for reporting the current minimization progress