Unstructured domains

Overview

Add and modify unstructured domains to the RNA folding grammar.

// typedefs

typedef struct vrna_unstructured_domain_s vrna_ud_t

typedef int () vrna_callback_ud_energy (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    void *data
    )

typedef FLT_OR_DBL () vrna_callback_ud_exp_energy (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    void *data
    )

typedef void () vrna_callback_ud_production (
    vrna_fold_compound_t *vc,
    void *data
    )

typedef void () vrna_callback_ud_exp_production (
    vrna_fold_compound_t *vc,
    void *data
    )

typedef void () vrna_callback_ud_probs_add (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    FLT_OR_DBL exp_energy,
    void *data
    )

typedef FLT_OR_DBL () vrna_callback_ud_probs_get (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    int motif,
    void *data
    )

// structs

struct vrna_unstructured_domain_s

// global functions

void vrna_ud_add_motif (
    vrna_fold_compound_t* vc,
    const char* motif,
    double motif_en,
    unsigned int loop_type
    )

void vrna_ud_remove (vrna_fold_compound_t* vc)

void vrna_ud_set_data (
    vrna_fold_compound_t* vc,
    void* data,
    vrna_callback_free_auxdata* free_cb
    )

void vrna_ud_set_prod_rule_cb (
    vrna_fold_compound_t* vc,
    vrna_callback_ud_production* pre_cb,
    vrna_callback_ud_energy* e_cb
    )

void vrna_ud_set_exp_prod_rule_cb (
    vrna_fold_compound_t* vc,
    vrna_callback_ud_exp_production* pre_cb,
    vrna_callback_ud_exp_energy* exp_e_cb
    )

// macros

#define VRNA_UNSTRUCTURED_DOMAIN_ALL_LOOPS
#define VRNA_UNSTRUCTURED_DOMAIN_EXT_LOOP
#define VRNA_UNSTRUCTURED_DOMAIN_HP_LOOP
#define VRNA_UNSTRUCTURED_DOMAIN_INT_LOOP
#define VRNA_UNSTRUCTURED_DOMAIN_MB_LOOP
#define VRNA_UNSTRUCTURED_DOMAIN_MOTIF

Detailed Documentation

Add and modify unstructured domains to the RNA folding grammar.

This module provides the tools to add and modify unstructured domains to the production rules of the RNA folding grammar. Usually this functionality is utilized for incorporating ligand binding to unpaired stretches of an RNA.

Bug Although the additional production rule(s) for unstructured domains as described in domains_unstructured are always treated as ‘segments possibly bound to one or more ligands’, the current implementation requires that at least one ligand is bound. The default implementation already takes care of the required changes. However, when using callback functions other than the default ones, one has to account for this fact. Please also note that this behavior might change in one of the next releases, such that the decomposition schemes as shown above comply with the actual implementation.

A default implementation allows one to readily use this feature by simply adding sequence motifs and corresponding binding free energies with the function vrna_ud_add_motif() (see also Ligands binding to unstructured domains).

The grammar extension is realized using a callback function that

  • evaluates the binding free energy of a ligand to its target sequence segment (white boxes in the figures above), or
  • returns the free energy of an unpaired stretch possibly bound by a ligand, stored in the additional U DP matrix.

The callback is passed the segment positions, the loop context, and which of the two evaluations mentioned above is required. A second callback implements the pre-processing step that prepares the U DP matrix by evaluating all possible cases of the additional production rule. Both callbacks have a default implementation in RNAlib, but may be replaced by a user implementation, which makes the feature fully customizable.
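
The dispatch described above can be sketched as follows. This is a minimal illustration, not the default RNAlib implementation: the type `demo_ud_data`, the sentinel `DEMO_INF`, the flag value `DEMO_UD_MOTIF`, and the layout of the lookup table are all assumptions made for this example, and `vrna_fold_compound_t` is only forward-declared so the sketch compiles without the library headers.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins so the sketch compiles without RNAlib headers (assumptions). */
typedef struct vrna_fc_s vrna_fold_compound_t;   /* opaque in this sketch */

#define DEMO_UD_MOTIF 16U       /* hypothetical stand-in for VRNA_UNSTRUCTURED_DOMAIN_MOTIF */
#define DEMO_INF      10000000  /* hypothetical "not allowed" energy */

/* Hypothetical auxiliary data: a single motif plus a precomputed table. */
typedef struct {
  const char *sequence;  /* position i (1-based) is sequence[i - 1] */
  const char *motif;     /* sequence motif the ligand binds to */
  int         motif_en;  /* binding free energy in dcal/mol */
  const int  *u;         /* (n + 1) x (n + 1) table from the pre-processing step */
  int         n;         /* sequence length */
} demo_ud_data;

/* Same parameter list as vrna_callback_ud_energy. */
int
demo_ud_energy(vrna_fold_compound_t *vc,
               int                  i,
               int                  j,
               unsigned int         loop_type,
               void                 *data)
{
  const demo_ud_data *d = (const demo_ud_data *)data;

  (void)vc; /* unused: everything needed lives in the auxiliary data */

  if (i < 1 || j > d->n || i > j)
    return DEMO_INF;

  if (loop_type & DEMO_UD_MOTIF) {
    /* Motif-only request: the ligand must cover exactly [i, j]. */
    size_t len = strlen(d->motif);

    if ((size_t)(j - i + 1) == len &&
        strncmp(d->sequence + i - 1, d->motif, len) == 0)
      return d->motif_en;

    return DEMO_INF;
  }

  /* Loop-context request: look up the precomputed segment energy. */
  return d->u[i * (d->n + 1) + j];
}
```

A real implementation would typically keep one table per loop context and support several motifs; the single table here only keeps the example short.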

For equilibrium probability computations, two additional callbacks exist: one to store/add and one to retrieve the probability of unstructured domains at particular positions. Our implementation already takes care of computing the probabilities, but users of the unstructured domain feature are required to provide a mechanism to efficiently store/add the corresponding values into some external data structure.

Typedefs

typedef struct vrna_unstructured_domain_s vrna_ud_t
Typename for the ligand binding extension data structure vrna_unstructured_domain_s.

typedef int () vrna_callback_ud_energy (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    void *data
    )
Callback to retrieve binding free energy of a ligand bound to an unpaired sequence segment.

Notes on Callback Functions This function will be called to determine the additional energy contribution of a specific unstructured domain, e.g. the binding free energy of some ligand.

Parameters:

vc The current vrna_fold_compound_t
i The start of the unstructured domain (5’ end)
j The end of the unstructured domain (3’ end)
loop_type The loop context of the unstructured domain
data Auxiliary data

Returns:

The auxiliary energy contribution in dekacal/mol (dcal/mol, i.e. units of 10 cal/mol)

typedef FLT_OR_DBL () vrna_callback_ud_exp_energy (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    void *data
    )
Callback to retrieve Boltzmann factor of the binding free energy of a ligand bound to an unpaired sequence segment.

Notes on Callback Functions This function will be called to determine the additional energy contribution of a specific unstructured domain, e.g. the binding free energy of some ligand (Partition function variant, i.e. the Boltzmann factors instead of actual free energies).

Parameters:

vc The current vrna_fold_compound_t
i The start of the unstructured domain (5’ end)
j The end of the unstructured domain (3’ end)
loop_type The loop context of the unstructured domain
data Auxiliary data

Returns:

The auxiliary energy contribution as Boltzmann factor
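
For orientation, the relation between a free energy and its Boltzmann factor can be written as below. This is the textbook relation under stated assumptions, not a description of RNAlib internals: \(E(i,j)\) is taken to be the value a vrna_callback_ud_energy callback would return in dcal/mol, \(R\) is the gas constant in cal/(mol K), \(T\) is the absolute temperature in K, and the factor 10 converts dcal to cal. Any additional scaling the library may apply is not covered here.

\[
q(i,j) = \exp\left( -\frac{10 \cdot E(i,j)}{R \, T} \right)
\]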

typedef void () vrna_callback_ud_production (
    vrna_fold_compound_t *vc,
    void *data
    )
Callback for pre-processing the production rule of the ligand binding to unpaired stretches feature.

Notes on Callback Functions The production rule for the unstructured domain grammar extension.

typedef void () vrna_callback_ud_exp_production (
    vrna_fold_compound_t *vc,
    void *data
    )
Callback for pre-processing the production rule of the ligand binding to unpaired stretches feature (partition function variant).

Notes on Callback Functions The production rule for the unstructured domain grammar extension (partition function variant).

typedef void () vrna_callback_ud_probs_add (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    FLT_OR_DBL exp_energy,
    void *data
    )
Callback to store/add equilibrium probability for a ligand bound to an unpaired sequence segment.

Notes on Callback Functions A callback function to store equilibrium probabilities for the unstructured domain feature.

typedef FLT_OR_DBL () vrna_callback_ud_probs_get (
    vrna_fold_compound_t *vc,
    int i,
    int j,
    unsigned int loop_type,
    int motif,
    void *data
    )
Callback to retrieve equilibrium probability for a ligand bound to an unpaired sequence segment.

Notes on Callback Functions A callback function to retrieve equilibrium probabilities for the unstructured domain feature.
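
A store/retrieve pair with the two parameter lists above might look like the following sketch. Everything prefixed with `demo_` is hypothetical, `FLT_OR_DBL` is assumed to be `double`, and `vrna_fold_compound_t` is only forward-declared so the code compiles without the library headers. The linear scan keeps the example short and is not meant as the efficient storage mechanism the text asks for. The `motif` argument is ignored here, and the loop-type flags are assumed to be bit flags.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins so the sketch compiles without RNAlib headers (assumptions). */
typedef struct vrna_fc_s vrna_fold_compound_t;
typedef double FLT_OR_DBL;

typedef struct {
  int          i, j;
  unsigned int loop_type;
  FLT_OR_DBL   p;
} demo_prob_entry;

typedef struct {
  demo_prob_entry *entries;
  size_t           size, cap;
} demo_prob_store;

/* Same parameter list as vrna_callback_ud_probs_add: accumulate a value. */
void
demo_probs_add(vrna_fold_compound_t *vc, int i, int j,
               unsigned int loop_type, FLT_OR_DBL exp_energy, void *data)
{
  demo_prob_store *s = (demo_prob_store *)data;

  (void)vc;

  /* Add to an existing record for the same segment and loop context. */
  for (size_t k = 0; k < s->size; k++) {
    demo_prob_entry *e = &s->entries[k];
    if (e->i == i && e->j == j && e->loop_type == loop_type) {
      e->p += exp_energy;
      return;
    }
  }

  /* Otherwise append a new record, growing the array when it is full. */
  if (s->size == s->cap) {
    size_t           cap  = s->cap ? 2 * s->cap : 16;
    demo_prob_entry *tmp  = realloc(s->entries, cap * sizeof *tmp);
    if (!tmp)
      return; /* out of memory: this sketch silently drops the value */
    s->entries = tmp;
    s->cap     = cap;
  }

  s->entries[s->size++] = (demo_prob_entry){ i, j, loop_type, exp_energy };
}

/* Same parameter list as vrna_callback_ud_probs_get: sum matching records. */
FLT_OR_DBL
demo_probs_get(vrna_fold_compound_t *vc, int i, int j,
               unsigned int loop_type, int motif, void *data)
{
  const demo_prob_store *s = (const demo_prob_store *)data;
  FLT_OR_DBL             p = 0.;

  (void)vc;
  (void)motif; /* not distinguished in this sketch */

  for (size_t k = 0; k < s->size; k++) {
    const demo_prob_entry *e = &s->entries[k];
    if (e->i == i && e->j == j && (e->loop_type & loop_type))
      p += e->p;
  }

  return p;
}
```

The store starts out zero-initialized (`demo_prob_store s = { NULL, 0, 0 };`) and its `entries` array is released with `free()` once the probabilities are no longer needed.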

Global Functions

void vrna_ud_add_motif (
    vrna_fold_compound_t* vc,
    const char* motif,
    double motif_en,
    unsigned int loop_type
    )
Add an unstructured domain motif, e.g. for ligand binding.

This function adds a ligand binding motif and the associated binding free energy to the vrna_ud_t attribute of a vrna_fold_compound_t. The motif data will then be used in subsequent secondary structure predictions. Multiple calls to this function with different motifs append all additional data to a list of ligands, all of which will be evaluated. Ligand motif data can be removed from the vrna_fold_compound_t again using the vrna_ud_remove() function. The loop type parameter allows one to limit the ligand binding to a particular loop type, such as the exterior loop, hairpin loops, interior loops, or multibranch loops.

SWIG Wrapper Notes This function is attached as method ud_add_motif() to objects of type fold_compound

Parameters:

vc The vrna_fold_compound_t data structure the ligand motif should be bound to
motif The sequence motif the ligand binds to
motif_en The binding free energy of the ligand in kcal/mol
loop_type The loop type the ligand binds to
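
Note that motif_en is given in kcal/mol, whereas the energy callbacks in this module return dekacal/mol. A user-implemented callback that reuses the same numbers therefore has to convert between the two. The helper below is a hypothetical sketch of that arithmetic (1 kcal/mol = 100 dcal/mol) and is not part of RNAlib.

```c
/*
 * Convert a free energy from kcal/mol to dcal/mol, rounding to the
 * nearest integer (halves are rounded away from zero).
 * 1 kcal/mol = 1000 cal/mol = 100 dcal/mol, since 1 dcal = 10 cal.
 */
int
demo_kcal_to_dcal(double kcal)
{
  double scaled = kcal * 100.0;

  return (int)(scaled < 0 ? scaled - 0.5 : scaled + 0.5);
}
```
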
void vrna_ud_remove (vrna_fold_compound_t* vc)
Remove ligand binding to unpaired stretches.

This function removes all ligand motifs that were bound to a vrna_fold_compound_t using the vrna_ud_add_motif() function.

SWIG Wrapper Notes This function is attached as method ud_remove() to objects of type fold_compound

Parameters:

vc The vrna_fold_compound_t data structure the ligand motif data should be removed from
void vrna_ud_set_data (
    vrna_fold_compound_t* vc,
    void* data,
    vrna_callback_free_auxdata* free_cb
    )
Attach an auxiliary data structure.

This function binds an arbitrary, auxiliary data structure for user-implemented ligand binding. The optional callback free_cb will be passed the bound data structure whenever the vrna_fold_compound_t is removed from memory, to avoid memory leaks.

SWIG Wrapper Notes This function is attached as method ud_set_data() to objects of type fold_compound

Parameters:

vc The vrna_fold_compound_t data structure the auxiliary data structure should be bound to
data A pointer to the auxiliary data structure
free_cb A pointer to a callback function that frees memory occupied by data
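
An auxiliary data structure together with a matching release function could be sketched as follows. The struct `demo_ud_aux` and both functions are hypothetical, and the sketch assumes that vrna_callback_free_auxdata takes a single `void *` argument and returns nothing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical auxiliary data owned by the fold compound. */
typedef struct {
  char *motif;     /* private copy of the motif string */
  int   motif_en;  /* binding free energy in dcal/mol */
} demo_ud_aux;

/* Allocate the auxiliary data; returns NULL if memory runs out. */
demo_ud_aux *
demo_ud_aux_new(const char *motif, int motif_en)
{
  size_t       len = strlen(motif) + 1;
  demo_ud_aux *aux = malloc(sizeof *aux);

  if (!aux)
    return NULL;

  aux->motif = malloc(len);
  if (!aux->motif) {
    free(aux);
    return NULL;
  }

  memcpy(aux->motif, motif, len);
  aux->motif_en = motif_en;
  return aux;
}

/* Release function with the assumed free_cb shape: void (void *). */
void
demo_ud_aux_free(void *data)
{
  demo_ud_aux *aux = (demo_ud_aux *)data;

  if (!aux)
    return;

  free(aux->motif);
  free(aux);
}
```

With the library available, the pair would be registered along the lines of `vrna_ud_set_data(vc, aux, &demo_ud_aux_free);`, after which the fold compound is responsible for releasing `aux`.
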
void vrna_ud_set_prod_rule_cb (
    vrna_fold_compound_t* vc,
    vrna_callback_ud_production* pre_cb,
    vrna_callback_ud_energy* e_cb
    )
Attach production rule callbacks for free energy computations.

Use this function to bind a user-implemented grammar extension for unstructured domains.

The callback e_cb needs to evaluate the free energy contribution \(f(i,j)\) of the unpaired segment \([i,j]\). It will be executed in each of the regular secondary structure production rules. Whenever the callback is passed the VRNA_UNSTRUCTURED_DOMAIN_MOTIF flag via its loop_type parameter, the contribution of any ligand that consecutively binds from position \(i\) to \(j\) (the white box) is requested. Otherwise, the callback usually performs a lookup in the precomputed B matrices. Which B matrix is addressed is indicated by the flags VRNA_UNSTRUCTURED_DOMAIN_EXT_LOOP, VRNA_UNSTRUCTURED_DOMAIN_HP_LOOP, VRNA_UNSTRUCTURED_DOMAIN_INT_LOOP, and VRNA_UNSTRUCTURED_DOMAIN_MB_LOOP. As their names imply, they specify exterior loops (F production rule), hairpin loops and interior loops (C production rule), and multibranch loops (M and M1 production rules).

[Figure: ligands_up_callback]

The pre_cb callback will be executed as a pre-processing step right before the regular secondary structure rules. Usually one would use this callback to fill the dynamic programming matrices U and to prepare the auxiliary data structure vrna_unstructured_domain_s.data.

[Figure: B_prod_rule]

SWIG Wrapper Notes This function is attached as method ud_set_prod_rule_cb() to objects of type fold_compound

Parameters:

vc The vrna_fold_compound_t data structure the callback will be bound to
pre_cb A pointer to a callback function for the B production rule
e_cb A pointer to a callback function for free energy evaluation
void vrna_ud_set_exp_prod_rule_cb (
    vrna_fold_compound_t* vc,
    vrna_callback_ud_exp_production* pre_cb,
    vrna_callback_ud_exp_energy* exp_e_cb
    )
Attach production rule for partition function.

This function is the partition function companion of vrna_ud_set_prod_rule_cb().

Use it to bind callbacks to (i) fill the U production rule dynamic programming matrices and/or prepare the vrna_unstructured_domain_s.data, and (ii) provide a callback to retrieve partition functions for subsegments \([i,j]\).

[Figure: B_prod_rule]

[Figure: ligands_up_callback]

SWIG Wrapper Notes This function is attached as method ud_set_exp_prod_rule_cb() to objects of type fold_compound

Parameters:

vc The vrna_fold_compound_t data structure the callback will be bound to
pre_cb A pointer to a callback function for the B production rule
exp_e_cb A pointer to a callback function that retrieves the partition function for a segment \([i,j]\) that may be bound by one or more ligands.

Macros

#define VRNA_UNSTRUCTURED_DOMAIN_ALL_LOOPS
Flag to indicate ligand bound to unpaired stretch in any loop (convenience macro).
#define VRNA_UNSTRUCTURED_DOMAIN_EXT_LOOP
Flag to indicate ligand bound to unpaired stretch in the exterior loop.
#define VRNA_UNSTRUCTURED_DOMAIN_HP_LOOP
Flag to indicate ligand bound to unpaired stretch in a hairpin loop.
#define VRNA_UNSTRUCTURED_DOMAIN_INT_LOOP
Flag to indicate ligand bound to unpaired stretch in an interior loop.
#define VRNA_UNSTRUCTURED_DOMAIN_MB_LOOP
Flag to indicate ligand bound to unpaired stretch in a multibranch loop.
#define VRNA_UNSTRUCTURED_DOMAIN_MOTIF
Flag to indicate ligand binding without additional unbound nucleotides (motif-only)