Unstructured Domains

Add and modify unstructured domains to the RNA folding grammar.

This module provides the tools to add unstructured domains to the production rules of the RNA folding grammar and to modify them. This functionality is typically used to incorporate ligand binding to unpaired stretches of an RNA.

Warning

Although the additional production rule(s) for unstructured domains, as described in Unstructured Domains, are always treated as segments possibly bound to one or more ligands, the current implementation requires that at least one ligand is bound. The default implementation already takes care of the required changes. When using callback functions other than the default ones, however, one has to account for this fact. Please also note that this behavior might change in one of the next releases, such that the decomposition schemes as shown above comply with the actual implementation.

A default implementation allows one to readily use this feature by simply adding sequence motifs and corresponding binding free energies with the function vrna_ud_add_motif() (see also Ligands Binding to Unstructured Domains).

The grammar extension is realized using a callback function that

  • evaluates the binding free energy of a ligand to its target sequence segment (white boxes in the figures above), or

  • returns the free energy of an unpaired stretch possibly bound by a ligand, stored in the additional U DP matrix.

The callback is passed the segment positions, the loop context, and which of the two evaluations listed above is required. A second callback implements the pre-processing step that prepares the U DP matrix by evaluating all possible cases of the additional production rule. Both callbacks have a default implementation in RNAlib, but may be overwritten by a user implementation, which makes the feature fully customizable.

For equilibrium probability computations, two additional callbacks exist: one to store/add and one to retrieve the probability of unstructured domains at particular positions. Our implementation already takes care of computing the probabilities, but users of the unstructured domain feature are required to provide a mechanism to efficiently store/add the corresponding values into some external data structure.

Defines

VRNA_UNSTRUCTURED_DOMAIN_EXT_LOOP
#include <ViennaRNA/unstructured_domains.h>

Flag to indicate ligand bound to unpaired stretch in the exterior loop.

VRNA_UNSTRUCTURED_DOMAIN_HP_LOOP
#include <ViennaRNA/unstructured_domains.h>

Flag to indicate ligand bound to unpaired stretch in a hairpin loop.

VRNA_UNSTRUCTURED_DOMAIN_INT_LOOP
#include <ViennaRNA/unstructured_domains.h>

Flag to indicate ligand bound to unpaired stretch in an interior loop.

VRNA_UNSTRUCTURED_DOMAIN_MB_LOOP
#include <ViennaRNA/unstructured_domains.h>

Flag to indicate ligand bound to unpaired stretch in a multibranch loop.

VRNA_UNSTRUCTURED_DOMAIN_MOTIF
#include <ViennaRNA/unstructured_domains.h>

Flag to indicate ligand binding without additional unbound nucleotides (motif-only).

VRNA_UNSTRUCTURED_DOMAIN_ALL_LOOPS
#include <ViennaRNA/unstructured_domains.h>

Flag to indicate ligand bound to unpaired stretch in any loop (convenience macro).
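The loop context flags above are meant to be combined and tested as bit masks. The following is a minimal sketch of that pattern only. The `UD_*` macros and their numeric values are hypothetical stand-ins chosen for this example, and `ud_allowed_in()` is not part of RNAlib; real code should use the `VRNA_UNSTRUCTURED_DOMAIN_*` macros from the header.

```c
#include <assert.h>

/* Hypothetical stand-ins for the VRNA_UNSTRUCTURED_DOMAIN_* flags.
 * The values are assumptions made for this sketch only. */
#define UD_EXT_LOOP  1u
#define UD_HP_LOOP   2u
#define UD_INT_LOOP  4u
#define UD_MB_LOOP   8u
#define UD_ALL_LOOPS (UD_EXT_LOOP | UD_HP_LOOP | UD_INT_LOOP | UD_MB_LOOP)

/* Return 1 if a motif restricted to the loop types in `allowed`
 * may bind in the single loop context `context`, 0 otherwise. */
int ud_allowed_in(unsigned int allowed, unsigned int context)
{
  /* the sketch expects exactly one loop-context bit in `context` */
  assert(context == UD_EXT_LOOP || context == UD_HP_LOOP ||
         context == UD_INT_LOOP || context == UD_MB_LOOP);
  return (allowed & context) != 0u;
}
```

For example, a motif registered with `UD_HP_LOOP | UD_INT_LOOP` would be accepted in a hairpin loop context and rejected in a multibranch loop context.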

Typedefs

typedef struct vrna_unstructured_domain_s vrna_ud_t
#include <ViennaRNA/unstructured_domains.h>

Typename for the ligand binding extension data structure vrna_unstructured_domain_s.

typedef int (*vrna_ud_f)(vrna_fold_compound_t *fc, int i, int j, unsigned int loop_type, void *data)
#include <ViennaRNA/unstructured_domains.h>

Callback to retrieve binding free energy of a ligand bound to an unpaired sequence segment.

Notes on Callback Functions:

This function will be called to determine the additional energy contribution of a specific unstructured domain, e.g. the binding free energy of some ligand.

Param fc:

The current vrna_fold_compound_t

Param i:

The start of the unstructured domain (5’ end)

Param j:

The end of the unstructured domain (3’ end)

Param loop_type:

The loop context of the unstructured domain

Param data:

Auxiliary data

Return:

The auxiliary energy contribution in dekacal/mol (units of 10 cal/mol)
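Note the unit difference between this callback (integer dekacal/mol) and the motif_en argument of vrna_ud_add_motif() (kcal/mol). A user-implemented callback that starts from energies given in kcal/mol therefore has to convert them. The helper below is a hypothetical sketch of such a conversion, not a function provided by RNAlib; it assumes rounding to the nearest integer is acceptable.

```c
#include <assert.h>

/* Convert an energy in kcal/mol to integer dekacal/mol
 * (1 kcal = 1000 cal = 100 dekacal), rounding to the nearest integer. */
int kcal_to_dekacal(double kcal)
{
  double scaled = kcal * 100.0;

  /* guard against values that would not fit into an int */
  assert(scaled > -2.0e9 && scaled < 2.0e9);

  /* add/subtract 0.5 before truncation to round half away from zero */
  return (int)(scaled < 0.0 ? scaled - 0.5 : scaled + 0.5);
}
```

With this helper, a binding free energy of -9.22 kcal/mol corresponds to a callback return value of -922.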

typedef FLT_OR_DBL (*vrna_ud_exp_f)(vrna_fold_compound_t *fc, int i, int j, unsigned int loop_type, void *data)
#include <ViennaRNA/unstructured_domains.h>

Callback to retrieve Boltzmann factor of the binding free energy of a ligand bound to an unpaired sequence segment.

Notes on Callback Functions:

This function will be called to determine the additional energy contribution of a specific unstructured domain, e.g. the binding free energy of some ligand (partition function variant, i.e. it returns Boltzmann factors instead of actual free energies).

Param fc:

The current vrna_fold_compound_t

Param i:

The start of the unstructured domain (5’ end)

Param j:

The end of the unstructured domain (3’ end)

Param loop_type:

The loop context of the unstructured domain

Param data:

Auxiliary data

Return:

The auxiliary energy contribution as Boltzmann factor
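As a worked illustration of what such a return value looks like, the Boltzmann factor of a binding free energy \(\Delta G\) at temperature \(T\) can be written as follows. The numbers are illustrative assumptions (a ligand with \(\Delta G = -9.22\) kcal/mol at 37 °C), the symbol \(q_{\text{bind}}\) is notation introduced here, and any rescaling the library applies internally is not covered.

```latex
\[
  q_{\text{bind}} = \exp\!\left(-\frac{\Delta G}{R\,T}\right),
  \qquad
  R\,T \approx 1.987\times10^{-3}\ \tfrac{\text{kcal}}{\text{mol}\,\text{K}}
  \times 310.15\ \text{K} \approx 0.616\ \text{kcal/mol}
\]
\[
  q_{\text{bind}} \approx \exp\!\left(\frac{9.22}{0.616}\right)
  \approx \exp(14.96) \approx 3.1 \times 10^{6}
\]
```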

typedef void (*vrna_ud_production_f)(vrna_fold_compound_t *fc, void *data)
#include <ViennaRNA/unstructured_domains.h>

Callback for pre-processing the production rule of the ligand binding to unpaired stretches feature.

Notes on Callback Functions:

The production rule for the unstructured domain grammar extension

typedef void (*vrna_ud_exp_production_f)(vrna_fold_compound_t *fc, void *data)
#include <ViennaRNA/unstructured_domains.h>

Callback for pre-processing the production rule of the ligand binding to unpaired stretches feature (partition function variant).

Notes on Callback Functions:

The production rule for the unstructured domain grammar extension (Partition function variant)

typedef void (*vrna_ud_add_probs_f)(vrna_fold_compound_t *fc, int i, int j, unsigned int loop_type, FLT_OR_DBL exp_energy, void *data)
#include <ViennaRNA/unstructured_domains.h>

Callback to store/add equilibrium probability for a ligand bound to an unpaired sequence segment.

Notes on Callback Functions:

A callback function to store equilibrium probabilities for the unstructured domain feature

typedef FLT_OR_DBL (*vrna_ud_get_probs_f)(vrna_fold_compound_t *fc, int i, int j, unsigned int loop_type, int motif, void *data)
#include <ViennaRNA/unstructured_domains.h>

Callback to retrieve equilibrium probability for a ligand bound to an unpaired sequence segment.

Notes on Callback Functions:

A callback function to retrieve equilibrium probabilities for the unstructured domain feature
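The two probability callbacks rely on an external data structure supplied by the user. The following sketch shows one possible shape of such a store: a small fixed-size table keyed by segment and loop type. All names (`ud_prob_store`, `store_add`, `store_get`) are hypothetical, and the functions only mirror the general idea of the callback typedefs above; they omit the fold compound and motif arguments and use plain `double` instead of `FLT_OR_DBL`.

```c
#include <assert.h>
#include <stddef.h>

#define UD_STORE_MAX 64   /* capacity of this toy store */

typedef struct {
  int          i, j;       /* segment boundaries */
  unsigned int loop_type;  /* loop context flag */
  double       p;          /* accumulated probability */
} ud_prob_entry;

typedef struct {
  ud_prob_entry entries[UD_STORE_MAX];
  size_t        n;
} ud_prob_store;

/* Add probability p for segment [i,j] in the given loop context. */
void store_add(int i, int j, unsigned int loop_type, double p, void *data)
{
  ud_prob_store *s = (ud_prob_store *)data;
  size_t        k;

  for (k = 0; k < s->n; k++)
    if (s->entries[k].i == i && s->entries[k].j == j &&
        s->entries[k].loop_type == loop_type) {
      s->entries[k].p += p;   /* accumulate into the existing entry */
      return;
    }

  assert(s->n < UD_STORE_MAX);  /* the toy store has no growth strategy */
  s->entries[s->n].i         = i;
  s->entries[s->n].j         = j;
  s->entries[s->n].loop_type = loop_type;
  s->entries[s->n].p         = p;
  s->n++;
}

/* Retrieve the accumulated probability, or 0 if nothing was stored. */
double store_get(int i, int j, unsigned int loop_type, void *data)
{
  const ud_prob_store *s = (const ud_prob_store *)data;
  size_t              k;

  for (k = 0; k < s->n; k++)
    if (s->entries[k].i == i && s->entries[k].j == j &&
        s->entries[k].loop_type == loop_type)
      return s->entries[k].p;

  return 0.0;
}
```

A linear scan keeps the sketch short. For long sequences, an index by position would be the more efficient choice that the overview asks for.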

Functions

vrna_ud_motif_t *vrna_ud_motifs_centroid(vrna_fold_compound_t *fc, const char *structure)
#include <ViennaRNA/unstructured_domains.h>

Detect unstructured domains in centroid structure.

Given a centroid structure and a set of unstructured domains, compute the list of unstructured domain motifs present in the centroid. Since we do not explicitly annotate unstructured domain motifs in dot-bracket strings, this function can be used to check for the presence and location of unstructured domain motifs under the assumption that the dot-bracket string is the centroid structure of the equilibrium ensemble.

See also

vrna_centroid()

Parameters:
  • fc – The fold_compound data structure with pre-computed equilibrium probabilities and model settings

  • structure – The centroid structure in dot-bracket notation

Returns:

A list of unstructured domain motifs (possibly NULL). The last element terminates the list with start=0, number=-1
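The returned list follows a sentinel convention: it may be NULL, and otherwise its last element has start=0 and number=-1. The sketch below shows how such a list can be traversed. The struct `ud_motif_stub` is a hypothetical stand-in that models only the two fields named above, and `count_motifs()` is not an RNAlib function; with the real library one would iterate over `vrna_ud_motif_t` in the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for vrna_ud_motif_t; only the two fields
 * mentioned in the return value description are modeled. */
typedef struct {
  int start;   /* start position of the motif, 0 marks the terminator */
  int number;  /* motif number, -1 in the terminator */
} ud_motif_stub;

/* Count the motifs in a sentinel-terminated list (NULL counts as empty). */
size_t count_motifs(const ud_motif_stub *list)
{
  size_t n = 0;

  if (list == NULL)
    return 0;

  while (list[n].start != 0) {
    /* in this sketch only the terminator carries a negative number */
    assert(list[n].number >= 0);
    n++;
  }

  return n;
}
```

The same loop condition applies to the lists returned by the MEA and MFE variants of this function, since they use the same terminator.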

vrna_ud_motif_t *vrna_ud_motifs_MEA(vrna_fold_compound_t *fc, const char *structure, vrna_ep_t *probability_list)
#include <ViennaRNA/unstructured_domains.h>

Detect unstructured domains in MEA structure.

Given an MEA structure and a set of unstructured domains, compute the list of unstructured domain motifs present in the MEA structure. Since we do not explicitly annotate unstructured domain motifs in dot-bracket strings, this function can be used to check for the presence and location of unstructured domain motifs under the assumption that the dot-bracket string is the MEA structure of the equilibrium ensemble.

See also

MEA()

Parameters:
  • fc – The fold_compound data structure with pre-computed equilibrium probabilities and model settings

  • structure – The MEA structure in dot-bracket notation

  • probability_list – The list of probabilities to extract the MEA structure from

Returns:

A list of unstructured domain motifs (possibly NULL). The last element terminates the list with start=0, number=-1

vrna_ud_motif_t *vrna_ud_motifs_MFE(vrna_fold_compound_t *fc, const char *structure)
#include <ViennaRNA/unstructured_domains.h>

Detect unstructured domains in MFE structure.

Given an MFE structure and a set of unstructured domains, compute the list of unstructured domain motifs present in the MFE structure. Since we do not explicitly annotate unstructured domain motifs in dot-bracket strings, this function can be used to check for the presence and location of unstructured domain motifs under the assumption that the dot-bracket string is the MFE structure of the equilibrium ensemble.

See also

vrna_mfe()

Parameters:
  • fc – The fold_compound data structure with model settings

  • structure – The MFE structure in dot-bracket notation

Returns:

A list of unstructured domain motifs (possibly NULL). The last element terminates the list with start=0, number=-1

void vrna_ud_add_motif(vrna_fold_compound_t *fc, const char *motif, double motif_en, const char *motif_name, unsigned int loop_type)
#include <ViennaRNA/unstructured_domains.h>

Add an unstructured domain motif, e.g. for ligand binding.

This function adds a ligand binding motif and the associated binding free energy to the vrna_ud_t attribute of a vrna_fold_compound_t. The motif data will then be used in subsequent secondary structure predictions. Multiple calls to this function with different motifs append the additional data to a list of ligands, all of which will be evaluated. Ligand motif data can be removed from the vrna_fold_compound_t again using the vrna_ud_remove() function. The loop type parameter allows one to limit the ligand binding to a particular loop type, such as the exterior loop, hairpin loops, interior loops, or multibranch loops.

Parameters:
  • fc – The vrna_fold_compound_t data structure the ligand motif should be bound to

  • motif – The sequence motif the ligand binds to

  • motif_en – The binding free energy of the ligand in kcal/mol

  • motif_name – The name/id of the motif (may be NULL)

  • loop_type – The loop type the ligand binds to

void vrna_ud_remove(vrna_fold_compound_t *fc)
#include <ViennaRNA/unstructured_domains.h>

Remove ligand binding to unpaired stretches.

This function removes all ligand motifs that were bound to a vrna_fold_compound_t using the vrna_ud_add_motif() function.

SWIG Wrapper Notes:

This function is attached as method ud_remove() to objects of type fold_compound. See, e.g. RNA.fold_compound.ud_remove() in the Python API.

Parameters:
  • fc – The vrna_fold_compound_t data structure the ligand motifs should be removed from

void vrna_ud_set_data(vrna_fold_compound_t *fc, void *data, vrna_auxdata_free_f free_cb)
#include <ViennaRNA/unstructured_domains.h>

Attach an auxiliary data structure.

This function binds an arbitrary, auxiliary data structure for user-implemented ligand binding. The optional callback free_cb will be passed the bound data structure whenever the vrna_fold_compound_t is removed from memory to avoid memory leaks.

SWIG Wrapper Notes:

This function is attached as method ud_set_data() to objects of type fold_compound. See, e.g. RNA.fold_compound.ud_set_data() in the Python API.

Parameters:
  • fc – The vrna_fold_compound_t data structure the auxiliary data structure should be bound to

  • data – A pointer to the auxiliary data structure

  • free_cb – A pointer to a callback function that frees memory occupied by data
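The pairing of a data pointer with an optional destructor callback is a common ownership pattern. The sketch below illustrates it in isolation. The type `aux_slot` and the functions `aux_set()`, `aux_release()` and `counting_free()` are hypothetical and do not reflect how RNAlib implements this internally; in particular, releasing previously attached data on re-assignment is a design choice of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*free_fn)(void *);  /* stand-in for vrna_auxdata_free_f */

typedef struct {
  void    *data;     /* user-supplied auxiliary data */
  free_fn free_cb;   /* optional destructor for data, may be NULL */
} aux_slot;

/* Attach data and its destructor, releasing anything attached before. */
void aux_set(aux_slot *slot, void *data, free_fn free_cb)
{
  assert(slot != NULL);
  if (slot->data && slot->free_cb)
    slot->free_cb(slot->data);
  slot->data    = data;
  slot->free_cb = free_cb;
}

/* Called when the owning object is destroyed. */
void aux_release(aux_slot *slot)
{
  assert(slot != NULL);
  if (slot->data && slot->free_cb)
    slot->free_cb(slot->data);
  slot->data    = NULL;
  slot->free_cb = NULL;
}

/* Example destructor that counts how often it was invoked. */
int freed_count = 0;

void counting_free(void *p)
{
  freed_count++;
  free(p);
}
```

Passing a destructor along with the data means the owner can clean up without knowing the concrete type of the auxiliary structure.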

void vrna_ud_set_prod_rule_cb(vrna_fold_compound_t *fc, vrna_ud_production_f pre_cb, vrna_ud_f e_cb)
#include <ViennaRNA/unstructured_domains.h>

Attach production rule callbacks for free energy computations.

Use this function to bind a user-implemented grammar extension for unstructured domains.

The callback e_cb needs to evaluate the free energy contribution \(f(i,j)\) of the unpaired segment \([i,j]\). It will be executed in each of the regular secondary structure production rules. Whenever the callback is passed the VRNA_UNSTRUCTURED_DOMAIN_MOTIF flag via its loop_type parameter, the contribution of any ligand that consecutively binds from position \(i\) to \(j\) (the white box) is requested. Otherwise, the callback usually performs a lookup in the precomputed B matrices. Which B matrix is addressed will be indicated by the flags VRNA_UNSTRUCTURED_DOMAIN_EXT_LOOP, VRNA_UNSTRUCTURED_DOMAIN_HP_LOOP, VRNA_UNSTRUCTURED_DOMAIN_INT_LOOP, and VRNA_UNSTRUCTURED_DOMAIN_MB_LOOP. As their names already imply, they specify exterior loops (F production rule), hairpin loops and interior loops (C production rule), and multibranch loops (M and M1 production rule).

[Figure: ligands_up_callback.svg]

The pre_cb callback will be executed as a pre-processing step right before the regular secondary structure rules. Usually one would use this callback to fill the dynamic programming matrices U and to prepare the auxiliary data structure vrna_unstructured_domain_s.data.

[Figure: B_prod_rule.svg]

SWIG Wrapper Notes:

This function is attached as method ud_set_prod_rule_cb() to objects of type fold_compound. See, e.g. RNA.fold_compound.ud_set_prod_rule_cb() in the Python API.

Parameters:
  • fc – The vrna_fold_compound_t data structure the callback will be bound to

  • pre_cb – A pointer to a callback function for the B production rule

  • e_cb – A pointer to a callback function for free energy evaluation
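The two roles of e_cb described above (motif-only evaluation versus lookup of a precomputed value) can be sketched as a simple dispatch on the loop_type argument. Everything below is a hypothetical illustration: the `UD_*` macros and their values are stand-ins for the real flags, `ud_demo_data` models a single motif and a single lookup table instead of one table per loop context, the sequence itself is not checked, and the function omits the fold compound argument of vrna_ud_f.

```c
#include <assert.h>

/* Hypothetical stand-ins for the library flags (values assumed). */
#define UD_HP_LOOP   2u
#define UD_MOTIF     16u

#define UD_DEMO_MAXN 8          /* maximum sequence length of the demo */
#define UD_DEMO_INF  10000000   /* "not possible" energy of the demo */

typedef struct {
  int motif_len;   /* length of the single modeled motif */
  int motif_en;    /* its binding energy in dekacal/mol */
  /* values a pre-processing callback would have filled in beforehand */
  int lookup[UD_DEMO_MAXN + 1][UD_DEMO_MAXN + 1];
} ud_demo_data;

/* Energy contribution of the unpaired segment [i,j] (1-based). */
int demo_energy_cb(int i, int j, unsigned int loop_type, void *data)
{
  const ud_demo_data *d = (const ud_demo_data *)data;

  assert(1 <= i && i <= j && j <= UD_DEMO_MAXN);

  if (loop_type & UD_MOTIF) {
    /* motif-only request: the ligand must cover [i,j] exactly */
    return (j - i + 1 == d->motif_len) ? d->motif_en : UD_DEMO_INF;
  }

  /* otherwise: return the precomputed value for the segment */
  return d->lookup[i][j];
}
```

A pre-processing callback in the sense of pre_cb would be responsible for filling `lookup` before the first call to the energy callback.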

void vrna_ud_set_exp_prod_rule_cb(vrna_fold_compound_t *fc, vrna_ud_exp_production_f pre_cb, vrna_ud_exp_f exp_e_cb)
#include <ViennaRNA/unstructured_domains.h>

Attach production rule for partition function.

This function is the partition function companion of vrna_ud_set_prod_rule_cb().

Use it to bind callbacks to (i) fill the U production rule dynamic programming matrices and/or prepare the vrna_unstructured_domain_s.data, and (ii) provide a callback to retrieve partition functions for subsegments \( [i,j] \).

[Figure: B_prod_rule.svg]

[Figure: ligands_up_callback.svg]

SWIG Wrapper Notes:

This function is attached as method ud_set_exp_prod_rule_cb() to objects of type fold_compound. See, e.g. RNA.fold_compound.ud_set_exp_prod_rule_cb() in the Python API.

Parameters:
  • fc – The vrna_fold_compound_t data structure the callback will be bound to

  • pre_cb – A pointer to a callback function for the B production rule

  • exp_e_cb – A pointer to a callback function that retrieves the partition function for a segment \([i,j]\) that may be bound by one or more ligands.

struct vrna_unstructured_domain_s
#include <ViennaRNA/unstructured_domains.h>

Data structure to store all functionality for ligand binding.

Public Members

int uniq_motif_count

The number of distinct motif lengths.

unsigned int *uniq_motif_size

An array storing a unique list of motif lengths.

int motif_count

Total number of distinct motifs.

char **motif

Motif sequences.

char **motif_name

Motif identifier/name.

unsigned int *motif_size

Motif lengths.

double *motif_en

Ligand binding free energy contribution.

unsigned int *motif_type

Type of motif, i.e. loop type the ligand binds to.

vrna_ud_production_f prod_cb

Callback to ligand binding production rule, i.e. create/fill DP free energy matrices.

This callback will be executed right before the actual secondary structure decompositions, and, therefore, any implementation must not interleave with the regular DP matrices.

vrna_ud_exp_production_f exp_prod_cb

Callback to ligand binding production rule, i.e. create/fill DP partition function matrices.

vrna_ud_f energy_cb

Callback to evaluate free energy of ligand binding to a particular unpaired stretch.

vrna_ud_exp_f exp_energy_cb

Callback to evaluate Boltzmann factor of ligand binding to a particular unpaired stretch.

void *data

Auxiliary data structure passed to energy evaluation callbacks.

vrna_auxdata_free_f free_data

Callback to free auxiliary data structure.

vrna_ud_add_probs_f probs_add

Callback to store/add outside partition function.

vrna_ud_get_probs_f probs_get

Callback to retrieve outside partition function.