Python API

Almost all symbols of the API available in our RNAlib C-library are wrapped for use in Python using SWIG. This makes our fast and efficient algorithms and tools available to third-party Python programs and scripting languages.

Note

Our Python API is automatically generated and translated from our C-library documentation. If you find anything problematic or want to help us improve the documentation, do not hesitate to contact us or make a PR at our official GitHub repository.

Installation

The Python interface is usually part of the installation of the ViennaRNA Package, see also Installation and Scripting Language Interfaces.

If for any reason your installation does not provide our Python interface, or if you don’t want to install the full ViennaRNA Package but only the Python bindings to RNAlib, you may also install them via Python’s pip:

python -m pip install viennarna

Usage

To use our Python bindings simply import the RNA or ViennaRNA package like

import RNA

or

import ViennaRNA

The RNA module that provides access to our RNAlib C-library can also be imported directly using

from RNA import RNA

or

from ViennaRNA import RNA

Note

In previous releases of the ViennaRNA Package, only the RNA package/module was available. Since version 2.6.2 we maintain the ViennaRNA project at https://pypi.org. The former maintainer additionally introduced the ViennaRNA package, which we intend to keep and extend in future releases.

Global Variables

For the Python interface(s) SWIG places global variables of the C-library into an additional namespace cvar. For instance, changing the global temperature variable thus becomes

RNA.cvar.temperature = 25

Pythonic interface

Since our library is written in C, the functions we provide in our API might seem awkward to users more familiar with Python’s object-oriented style. Therefore, we spent some effort on creating a more pythonic interface here. In particular, we tried to group together data structures and the functions operating on them to derive classes and objects with corresponding methods attached.

If you browse through our reference manual, many C-functions have additional SWIG Wrapper Notes in their description. These descriptions should give an idea how the function is available in the Python interface. Usually, our C functions, data structures, typedefs, and enumerations use the vrna_ prefixes and _s, _t, _e suffixes. Those decorators are useful in C but of less use in the context of Python packages or modules. Therefore, these prefixes and suffixes are dropped from the Python interface.
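The naming rule above can be sketched in a few lines of Python. The helper python_name() is hypothetical and only illustrates how the decorators are dropped; it is not part of the RNA module, and it does not capture that many functions additionally become methods of a class.

```python
import re


def python_name(c_name: str) -> str:
    """Drop the vrna_ prefix and a trailing _s, _t, or _e suffix (illustrative helper)."""
    name = re.sub(r"^vrna_", "", c_name)
    return re.sub(r"_[ste]$", "", name)


# C identifiers whose Python counterparts appear in this reference
for c_name in ("vrna_fold_compound_t", "vrna_md_t", "vrna_param_t", "vrna_ep_t"):
    print(f"{c_name} -> RNA.{python_name(c_name)}")
```

For example, vrna_fold_compound_t maps to RNA.fold_compound and vrna_md_t maps to RNA.md.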

Object orientation

Consider the C-function vrna_fold_compound(). This creates a vrna_fold_compound_t data structure that is then passed around to various functions, e.g. to vrna_mfe() to compute the MFE structure. A corresponding C-code may look like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <ViennaRNA/utils/basic.h>
#include <ViennaRNA/fold_compound.h>
#include <ViennaRNA/mfe.h>

int
main(int  argc,
     char *argv[])
{
  char *seq, *ss;
  float mfe;
  vrna_fold_compound_t *fc;

  seq = "AGACGACAAGGUUGAAUCGCACCCACAGUCUAUGAGUCGGUG";
  ss  = vrna_alloc(sizeof(char) * (strlen(seq) + 1));
  fc  = vrna_fold_compound(seq, NULL, VRNA_OPTION_DEFAULT);
  mfe = vrna_mfe(fc, ss);

  printf("%s\n%s (%6.2f)\n", seq, ss, mfe);

  return EXIT_SUCCESS;
}

In our Python interface, the vrna_fold_compound_t data structure becomes the RNA.fold_compound class, the vrna_fold_compound() becomes one of its constructors and the vrna_mfe() function becomes the method RNA.fold_compound.mfe(). So, the Python code would probably translate to something like

import RNA

seq = "AGACGACAAGGUUGAAUCGCACCCACAGUCUAUGAGUCGGUG"
fc  = RNA.fold_compound(seq)
(ss, mfe) = fc.mfe()

print(f"{seq}\n{ss} ({mfe:6.2f})")

Note

The C-function vrna_mfe() actually returns two values, the MFE in units of \(\text{kcal} \cdot \text{mol}^{-1}\) and the corresponding MFE structure. The latter is written to the ss pointer. This is necessary since C functions can return at most one single value. In Python, functions and methods may return arbitrarily many values instead, and in addition, passing parameters to a function or method such that it changes their content is generally discouraged. Therefore, our functions that return values through function parameters usually return them regularly in the Python interface.

Lists and Tuples

C-functions in our API that return or receive list-like data usually utilize pointers. Since there are no such things in Python, they would be wrapped as a particular kind of object that would then be tedious to work with. For the Python interface, we therefore tried to wrap the majority of these instances to native Python types, such as list or tuple. Thus, one can usually pass a list to a function that uses pointers to arrays in C, and expect to receive a list or tuple from functions that return pointers to arrays.
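A minimal sketch of passing a native list, assuming the bindings are installed. It uses RNA.aln_consensus_sequence() and RNA.alifold() as documented further below; the three-sequence alignment is made up for illustration, and the import guard only keeps the snippet runnable where the bindings are missing.

```python
try:
    import RNA  # ViennaRNA Python bindings
except ImportError:
    RNA = None  # the sketch then skips the calls below

# A plain Python list of aligned sequences (illustrative data)
alignment = ["GGGAAACCC", "GGCAAAGCC", "GCGAAACGC"]

if RNA is not None:
    # The list takes the place of the C array of strings
    consensus = RNA.aln_consensus_sequence(alignment)
    # The structure is returned along with the MFE instead of via a pointer
    structure, mfe = RNA.alifold(alignment)
    print(consensus)
    print(f"{structure} ({mfe:6.2f})")
```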

Energy Parameters

Energy parameters are compiled into our library, so there is usually no necessity to load them from a file. All parameter files shipped with the ViennaRNA Package can be loaded by simply calling any of the dedicated functions:

Examples

A few more Python code examples can be found here.

The RNA Python module

A library for the prediction and comparison of RNA secondary structures.

Amongst other things, our implementations allow you to:

  • predict minimum free energy secondary structures

  • calculate the partition function for the ensemble of structures

  • compute various equilibrium probabilities

  • calculate suboptimal structures in a given energy range

  • compute local structures in long sequences

  • predict consensus secondary structures from a multiple sequence alignment

  • predict melting curves

  • search for sequences folding into a given structure

  • compare two secondary structures

  • predict interactions between multiple RNA molecules

class RNA.COORDINATE

Bases: object

This is a workaround for the SWIG Perl Wrapper RNA plot function that returns an array of type COORDINATE.

X
Type:

float

Y
Type:

float

property X
property Y
get(i)
property thisown

The membership flag

class RNA.ConstCharVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.CoordinateVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.DoubleDoubleVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.DoublePair(*args)

Bases: object

property first
property second
property thisown

The membership flag

class RNA.DoubleVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.DuplexVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

RNA.E_ExtLoop(type, si1, sj1, P)
RNA.E_GQuad_IntLoop_L(i, j, type, S, ggg, maxdist, P)
RNA.E_GQuad_IntLoop_L_comparative(i, j, tt, S_cons, S5, S3, a2s, ggg, n_seq, P)
RNA.E_Hairpin(size, type, si1, sj1, string, P)

Compute the Energy of a hairpin-loop.

To evaluate the free energy of a hairpin-loop, several parameters have to be known. A general hairpin-loop has this structure:

          a3 a4
        a2     a5
        a1     a6
          X - Y
          |   |
          5’  3’

where X-Y marks the closing pair [e.g. a (G,C) pair]. The length of this loop is 6 as there are six unpaired nucleotides (a1-a6) enclosed by (X,Y). The 5’ mismatching nucleotide is a1 while the 3’ mismatch is a6. The nucleotide sequence of this loop is “a1.a2.a3.a4.a5.a6”

Parameters:
  • size (int) – The size of the loop (number of unpaired nucleotides)

  • type (int) – The pair type of the base pair closing the hairpin

  • si1 (int) – The 5’-mismatching nucleotide

  • sj1 (int) – The 3’-mismatching nucleotide

  • string (string) – The sequence of the loop (may be NULL, otherwise must be at least \(size + 2\) long)

  • P (RNA.param() *) – The datastructure containing scaled energy parameters

Returns:

The Free energy of the Hairpin-loop in dcal/mol

Return type:

int

Warning

Not (really) thread safe! A threadsafe implementation will replace this function in a future release!

Energy evaluation may change due to updates in global variable “tetra_loop”

See also

scale_parameters, RNA.param

Note

The parameter sequence should contain the sequence of the loop in capital letters of the nucleic acid alphabet if the loop size is below 7. This is useful for unusually stable tri-, tetra- and hexa-loops which are treated differently (based on experimental data) if they are tabulated.
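How the loop size, the two mismatches, and the loop string relate to a closing pair can be sketched as follows. The helper hairpin_arguments() is hypothetical and uses 1-based positions i < j for the closing pair (X,Y); it returns nucleotide characters, whereas E_Hairpin() expects integer-encoded nucleotides and a pair type, which are not derived here.

```python
def hairpin_arguments(sequence: str, i: int, j: int):
    """Return (size, 5' mismatch, 3' mismatch, loop string) for a hairpin closed by (i, j).

    Positions are 1-based; this is an illustrative helper, not part of the RNA module.
    """
    if not (1 <= i and i + 1 < j <= len(sequence)):
        raise ValueError("need 1 <= i, i + 1 < j and j <= len(sequence)")
    size = j - i - 1               # number of unpaired nucleotides a1..a_size
    loop = sequence[i - 1:j]       # closing pair plus loop: size + 2 characters
    si1 = sequence[i]              # nucleotide at position i + 1 (a1)
    sj1 = sequence[j - 2]          # nucleotide at position j - 1 (last unpaired one)
    return size, si1, sj1, loop


print(hairpin_arguments("GGGGAAAACCCC", 4, 9))
```

The returned loop string has exactly size + 2 characters, which matches the minimum length required for the string argument.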

RNA.E_IntLoop(n1, n2, type, type_2, si1, sj1, sp1, sq1, P)

Compute the Energy of an internal-loop

This function computes the free energy \(\Delta G\) of an internal-loop with the following structure:

          3’  5’
          |   |
          U - V
      a_n       b_1
       .        .
       .        .
       .        .
      a_1       b_m
          X - Y
          |   |
          5’  3’

This general structure depicts an internal-loop that is closed by the base pair (X,Y). The enclosed base pair is (V,U), which leaves the unpaired bases a_1-a_n and b_1-b_m that constitute the loop. In this example, the length of the internal-loop is \((n+m)\), where n or m may be 0, resulting in a bulge-loop, or both may be 0, resulting in a base pair stack. The mismatching nucleotides for the closing pair (X,Y) are a_1 (5’-mismatch) and b_m (3’-mismatch); for the enclosed base pair (V,U) they are b_1 (5’-mismatch) and a_n (3’-mismatch).

Parameters:
  • n1 (int) – The size of the ‘left’-loop (number of unpaired nucleotides)

  • n2 (int) – The size of the ‘right’-loop (number of unpaired nucleotides)

  • type (int) – The pair type of the base pair closing the internal loop

  • type_2 (int) – The pair type of the enclosed base pair

  • si1 (int) – The 5’-mismatching nucleotide of the closing pair

  • sj1 (int) – The 3’-mismatching nucleotide of the closing pair

  • sp1 (int) – The 3’-mismatching nucleotide of the enclosed pair

  • sq1 (int) – The 5’-mismatching nucleotide of the enclosed pair

  • P (RNA.param() *) – The datastructure containing scaled energy parameters

Returns:

The Free energy of the Interior-loop in dcal/mol

Return type:

int

See also

scale_parameters, RNA.param

Note

Base pairs are always denoted in 5’->3’ direction. Thus, the enclosed base pair must be ‘turned around’ when evaluating the free energy of the internal-loop.

This function is threadsafe
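The relation between the two loop sizes and the loop type can be sketched as follows. The helper classify_loop() is hypothetical; it takes 1-based positions of the closing pair (i, j) and the enclosed pair (p, q) with i < p < q < j and derives n1 and n2 as the numbers of unpaired nucleotides on the 5’ and 3’ side.

```python
def classify_loop(i: int, j: int, p: int, q: int):
    """Return (n1, n2, kind) for closing pair (i, j) and enclosed pair (p, q).

    Illustrative helper, not part of the RNA module.
    """
    if not i < p < q < j:
        raise ValueError("need i < p < q < j")
    n1 = p - i - 1  # unpaired nucleotides between i and p
    n2 = j - q - 1  # unpaired nucleotides between q and j
    if n1 == 0 and n2 == 0:
        kind = "stack"
    elif n1 == 0 or n2 == 0:
        kind = "bulge"
    else:
        kind = "internal loop"
    return n1, n2, kind


for pairs in ((1, 20, 2, 19), (1, 20, 4, 19), (1, 20, 3, 17)):
    print(pairs, classify_loop(*pairs))
```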

RNA.E_IntLoop_Co(type, type_2, i, j, p, q, cutpoint, si1, sj1, sp1, sq1, dangles, P)
RNA.E_MLstem(type, si1, sj1, P)
RNA.E_Stem(type, si1, sj1, extLoop, P)

Compute the energy contribution of a stem branching off a loop-region.

This function computes the energy contribution of a stem that branches off a loop region. This can be the case in multiloops, when a stem branching off increases the degree of the loop but also immediately interior base pairs of an exterior loop contribute free energy. To switch the behavior of the function according to the evaluation of a multiloop- or exterior-loop-stem, you pass the flag ‘extLoop’. The returned energy contribution consists of a TerminalAU penalty if the pair type is greater than 2, dangling end contributions of mismatching nucleotides adjacent to the stem if only one of the si1, sj1 parameters is greater than 0 and mismatch energies if both mismatching nucleotides are positive values. Thus, to avoid incorporating dangling end or mismatch energies just pass a negative number, e.g. -1 to the mismatch argument.

This is an illustration of how the energy contribution is assembled:

          3’  5’
          |   |
          X - Y
    5’-si1     sj1-3’

Here, (X,Y) is the base pair that closes the stem that branches off a loop region. The nucleotides si1 and sj1 are the 5’- and 3’-mismatches, respectively. If the base pair type of (X,Y) is greater than 2 (i.e. an A-U or G-U pair), the TerminalAU penalty will be included in the energy contribution returned. If si1 and sj1 are both nonnegative numbers, mismatch energies will also be included. If one of si1 or sj1 is a negative value, only 5’ or 3’ dangling end contributions are taken into account. To prevent any of these mismatch contributions from being incorporated, just pass a negative number to both si1 and sj1. In case the argument extLoop is 0, the returned energy contribution also includes the internal-loop-penalty of a multiloop stem with closing pair type.

Deprecated since version 2.7.0: Please use one of the functions RNA.E_exterior_stem() and RNA.E_multibranch_stem() instead! Use the former for cases where extLoop != 0 and the latter otherwise.

See also

RNA.E_multibranch_stem, _ExtLoop

Note

This function is threadsafe

Parameters:
  • type (int) – The pair type of the first base pair in the stem

  • si1 (int) – The 5’-mismatching nucleotide

  • sj1 (int) – The 3’-mismatching nucleotide

  • extLoop (int) – A flag that indicates whether the contribution reflects the one of an exterior loop or not

  • P (RNA.param() *) – The data structure containing scaled energy parameters

Returns:

The Free energy of the branch off the loop in dcal/mol

Return type:

int
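The case distinction described above can be summarized in a short sketch. The helper stem_contributions() is hypothetical and only lists which terms would enter the sum; it computes no energies. It assumes that a nonnegative value marks a mismatch nucleotide as present, following the illustration rather than the "greater than 0" wording of the summary.

```python
def stem_contributions(pair_type: int, si1: int, sj1: int, ext_loop: int):
    """List the energy terms that apply to a stem (illustrative helper only)."""
    terms = []
    if si1 >= 0 and sj1 >= 0:
        terms.append("mismatch")
    elif si1 >= 0:
        terms.append("5' dangle")
    elif sj1 >= 0:
        terms.append("3' dangle")
    if pair_type > 2:
        terms.append("TerminalAU")
    if ext_loop == 0:
        terms.append("multiloop stem penalty")
    return terms


print(stem_contributions(1, -1, -1, 1))  # no mismatch terms, exterior loop
print(stem_contributions(3, 2, -1, 0))   # 5' dangle only, multiloop stem
```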

RNA.E_gquad(L, l, P)
RNA.E_gquad_ali_en(i, L, l, S, a2s, n_seq, P, en)
RNA.E_ml_rightmost_stem(i, j, fc)
class RNA.ElemProbVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.HeatCapacityVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.HelixVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.IntIntVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.IntVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

RNA.Lfold(sequence, window_size, nullfile=None)

Local MFE prediction using a sliding window approach (simplified interface)

This simplified interface to RNA.fold_compound.mfe_window() computes the MFE and locally optimal secondary structure using default options. Structures are predicted using a sliding window approach, where base pairs may not span outside the window. Memory required for dynamic programming (DP) matrices will be allocated and free’d on-the-fly. Hence, after return of this function, the recursively filled matrices are not available any more for any post-processing.

SWIG Wrapper Notes

This function is available as overloaded function Lfold() in the global namespace. The parameter file defaults to NULL and may be omitted. See e.g. RNA.Lfold() in the Python API.

Parameters:
  • string (string) – The nucleic acid sequence

  • window_size (int) – The window size for locally optimal structures

  • file (FILE *) – The output file handle where predictions are written to (if NULL, output is written to stdout)

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use RNA.fold_compound.mfe_window(), and the data structure RNA.fold_compound() instead.

RNA.Lfold_cb(char * string, int window_size, PyObject * PyFunc, PyObject * data) float
RNA.Lfoldz(sequence, window_size, min_z, nullfile=None)

Local MFE prediction using a sliding window approach with z-score cut-off (simplified interface)

This simplified interface to RNA.fold_compound.mfe_window_zscore() computes the MFE and locally optimal secondary structure using default options. Structures are predicted using a sliding window approach, where base pairs may not span outside the window. Memory required for dynamic programming (DP) matrices will be allocated and free’d on-the-fly. Hence, after return of this function, the recursively filled matrices are not available any more for any post-processing. This function is the z-score version of RNA.Lfold(), i.e. only predictions above a certain z-score cut-off value are printed.

Parameters:
  • string (string) – The nucleic acid sequence

  • window_size (int) – The window size for locally optimal structures

  • min_z (double) – The minimal z-score for a predicted structure to appear in the output

  • file (FILE *) – The output file handle where predictions are written to (if NULL, output is written to stdout)

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use RNA.fold_compound.mfe_window(), and the data structure RNA.fold_compound() instead.

RNA.Lfoldz_cb(char * string, int window_size, double min_z, PyObject * PyFunc, PyObject * data) float
RNA.MEA_from_plist(*args)

Compute a MEA (maximum expected accuracy) structure from a list of probabilities.

The algorithm maximizes the expected accuracy

\[A(S) = \sum_{(i,j) \in S} 2 \gamma p_{ij} + \sum_{i \notin S} p^u_{i}\]

Higher values of \(\gamma\) result in more base pairs of lower probability and thus higher sensitivity. Low values of \(\gamma\) result in structures containing only highly likely pairs (high specificity). The code of the MEA function also demonstrates the use of sparse dynamic programming scheme to reduce the time and memory complexity of folding.

SWIG Wrapper Notes

This function is available as overloaded function MEA_from_plist(gamma = 1., md = NULL). Note that it returns the MEA structure and MEA value as a tuple (MEA_structure, MEA). See, e.g. RNA.MEA_from_plist() in the Python API.

Parameters:
  • plist (RNA.ep() *) – A list of base pair probabilities the MEA structure is computed from

  • sequence (string) – The RNA sequence that corresponds to the list of probability values

  • gamma (double) – The weighting factor for base pairs vs. unpaired nucleotides

  • md (RNA.md() *) – A model details data structure (maybe NULL)

  • mea (list-like(double)) – A pointer to a variable where the MEA value will be written to

Returns:

An MEA structure (or NULL on any error)

Return type:

string

Note

The unpaired probabilities \(p^u_{i} = 1 - \sum_{j \neq i} p_{ij}\) are usually computed from the supplied pairing probabilities \(p_{ij}\) as stored in plist entries of type RNA.PLIST_TYPE_BASEPAIR. To overwrite individual \(p^u_{i}\) values, simply add entries with type RNA.PLIST_TYPE_UNPAIRED.

To include G-Quadruplex support, the corresponding field in md must be set.
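To make the objective \(A(S)\) concrete, the following sketch evaluates it for a given structure. It only scores a structure and does not perform the maximization or the sparse dynamic programming mentioned above. The helper expected_accuracy() and the dictionary representation of the pair probabilities are assumptions made for illustration.

```python
def expected_accuracy(pairs, probs, length, gamma=1.0):
    """Evaluate A(S) for structure S given as an iterable of 1-based pairs (i, j).

    probs maps (i, j) with i < j to the pair probability p_ij (illustrative layout).
    """
    # Probability of each position being paired: sum of p_ij over all partners
    paired_prob = [0.0] * (length + 1)
    for (i, j), p in probs.items():
        paired_prob[i] += p
        paired_prob[j] += p

    pairs = list(pairs)
    paired_positions = {k for pair in pairs for k in pair}

    score = sum(2.0 * gamma * probs.get(pair, 0.0) for pair in pairs)
    # Unpaired term: p^u_i = 1 - sum_j p_ij for every position not paired in S
    score += sum(1.0 - paired_prob[i]
                 for i in range(1, length + 1)
                 if i not in paired_positions)
    return score


probs = {(1, 5): 0.9, (2, 4): 0.8}
print(f"{expected_accuracy([(1, 5), (2, 4)], probs, 5):.2f}")
print(f"{expected_accuracy([], probs, 5):.2f}")
```

Increasing gamma raises the weight of the pair term only, which is why larger values favor structures with more base pairs.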

RNA.Make_bp_profile(length)

Deprecated since version 2.7.0: This function is deprecated and will be removed soon! See Make_bp_profile_bppm() for a replacement

Note

This function is NOT threadsafe

RNA.Make_bp_profile_bppm(bppm, length)

Condense a pair probability matrix into a vector containing, for each position, the probabilities of being unpaired, upstream paired, and downstream paired.

This resulting probability profile is used as input for profile_edit_distance

Parameters:
  • bppm (list-like(double)) – A pointer to the base pair probability matrix

  • length (int) – The length of the sequence

Returns:

The bp profile

Return type:

list-like(double)
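The condensation step can be sketched in plain Python. The function bp_profile() below is a hypothetical stand-in that takes the pair probabilities as a dictionary rather than the flat matrix the library uses. It separates pairs with a partner at a higher index from pairs with a partner at a lower index; which of the two the library labels upstream and which downstream is not asserted here.

```python
def bp_profile(bppm, length):
    """Condense pair probabilities into per-position triples.

    bppm maps 1-based (i, j) with i < j to p_ij (illustrative layout).
    Returns a list indexed 1..length (index 0 unused) of
    [unpaired, paired with higher index, paired with lower index].
    """
    profile = [[0.0, 0.0, 0.0] for _ in range(length + 1)]
    for (i, j), p in bppm.items():
        profile[i][1] += p  # position i pairs with a partner further 3'
        profile[j][2] += p  # position j pairs with a partner further 5'
    for k in range(1, length + 1):
        profile[k][0] = 1.0 - profile[k][1] - profile[k][2]
    return profile


profile = bp_profile({(1, 4): 0.7, (2, 3): 0.2}, 4)
for k in range(1, 5):
    print(k, [round(x, 2) for x in profile[k]])
```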

RNA.Make_swString(string)

Convert a structure into a format suitable for string_edit_distance().

Parameters:

string (string) –

Return type:

swString *

class RNA.MoveVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

RNA.PS_color_dot_plot(string, pi, filename)
RNA.PS_color_dot_plot_turn(seq, pi, filename, winSize)
RNA.PS_dot_plot(string, file)

Produce postscript dot-plot.

Wrapper to PS_dot_plot_list

Reads base pair probabilities produced by pf_fold() from the global array pr and the pair list base_pair produced by fold() and produces a postscript “dot plot” that is written to ‘filename’. The “dot plot” represents each base pairing probability by a square of corresponding area in an upper triangle matrix. The lower part of the matrix contains the minimum free energy structure.

Deprecated since version 2.7.0: This function is deprecated and will be removed soon! Use PS_dot_plot_list() instead!

Note

DO NOT USE THIS FUNCTION ANYMORE SINCE IT IS NOT THREADSAFE

RNA.PS_dot_plot_list(seq, filename, pl, mf, comment)

Produce a postscript dot-plot from two pair lists.

This function reads two plist structures (e.g. base pair probabilities and a secondary structure) as produced by assign_plist_from_pr() and assign_plist_from_db() and produces a postscript “dot plot” that is written to ‘filename’. Using base pair probabilities in the first and mfe structure in the second plist, the resulting “dot plot” represents each base pairing probability by a square of corresponding area in an upper triangle matrix. The lower part of the matrix contains the minimum free energy structure.

Parameters:
  • seq (string) – The RNA sequence

  • filename (string) – A filename for the postscript output

  • pl (RNA.ep() *) – The base pair probability pairlist

  • mf (RNA.ep() *) – The mfe secondary structure pairlist

  • comment (string) – A comment

Returns:

1 if postscript was successfully written, 0 otherwise

Return type:

int

See also

assign_plist_from_pr, assign_plist_from_db

RNA.PS_dot_plot_turn(seq, pl, filename, winSize)
RNA.PS_rna_plot(string, structure, file)

Produce a secondary structure graph in PostScript and write it to ‘filename’.

Deprecated since version 2.7.0: Use RNA.file_PS_rnaplot() instead!

RNA.PS_rna_plot_a(string, structure, file, pre, post)

Produce a secondary structure graph in PostScript including additional annotation macros and write it to ‘filename’.

Deprecated since version 2.7.0: Use RNA.file_PS_rnaplot_a() instead!

RNA.PS_rna_plot_a_gquad(string, structure, ssfile, pre, post)

Produce a secondary structure graph in PostScript including additional annotation macros and write it to ‘filename’ (detect and draw g-quadruplexes)

Deprecated since version 2.7.0: Use RNA.file_PS_rnaplot_a() instead!

class RNA.PathVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.SOLUTION

Bases: object

property energy
get(i)
size()
property structure
property thisown

The membership flag

class RNA.SOLUTIONVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.StringVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.SuboptVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.SwigPyIterator(*args, **kwargs)

Bases: object

advance(n)
copy()
decr(n=1)
distance(x)
equal(x)
incr(n=1)
next()
previous()
property thisown

The membership flag

value()
class RNA.UIntUIntVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

class RNA.UIntVector(*args)

Bases: object

append(x)
assign(n, x)
back()
begin()
capacity()
clear()
empty()
end()
erase(*args)
front()
get_allocator()
insert(*args)
iterator()
pop()
pop_back()
push_back(x)
rbegin()
rend()
reserve(n)
resize(*args)
size()
swap(v)
property thisown

The membership flag

RNA.abstract_shapes(std::string structure, unsigned int level=5) std::string
RNA.abstract_shapes(IntVector pt, unsigned int level=5) std::string
RNA.abstract_shapes(varArrayShort pt, unsigned int level=5) std::string

Convert a secondary structure in dot-bracket notation to its abstract shapes representation.

This function converts a secondary structure into its abstract shapes representation as presented by Giegerich et al. [2004] .

SWIG Wrapper Notes

This function is available as an overloaded function abstract_shapes() where the optional second parameter level defaults to 5. See, e.g. RNA.abstract_shapes() in the Python API.

Parameters:
  • structure (string) – A secondary structure in dot-bracket notation

  • level (unsigned int) – The abstraction level (integer in the range of 0 to 5)

Returns:

The secondary structure in abstract shapes notation

Return type:

string

See also

RNA.abstract_shapes_pt
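For intuition, the idea of shape abstraction can be sketched in pure Python for the coarsest case. The function shape_level5() below is a hypothetical re-implementation that assumes the most abstract level ignores all unpaired positions and merges helices that are interrupted only by bulges or internal loops. It is not the library code, and its output is not guaranteed to match RNA.abstract_shapes() in every case, e.g. for structures without any base pair.

```python
def shape_level5(structure: str) -> str:
    """Sketch of a coarse shape abstraction for a dot-bracket string (illustrative)."""
    # Build a tree of base pairs; every node is the list of its child nodes
    root = []
    stack = [root]
    for ch in structure:
        if ch == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure")
            stack.pop()
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure")

    def render(node):
        # A pair with exactly one enclosed pair continues the same helix
        while len(node) == 1:
            node = node[0]
        return "[" + "".join(render(child) for child in node) + "]"

    return "".join(render(child) for child in root)


print(shape_level5("((((...))))"))
print(shape_level5("((..((...))..((...))..))"))
```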

RNA.add_root(arg1)

Adds a root to an un-rooted tree in any except bracket notation.

Parameters:

structure (string) –

Return type:

string

RNA.aliLfold(alignment, window_size, nullfile=None)

SWIG Wrapper Notes

This function is available as overloaded function aliLfold() in the global namespace. The parameter fp defaults to NULL and may be omitted. See e.g. RNA.aliLfold() in the Python API.

RNA.aliLfold_cb(StringVector alignment, int window_size, PyObject * PyFunc, PyObject * data) float
RNA.aliduplex_subopt(StringVector alignment1, StringVector alignment2, int delta, int w) DuplexVector
RNA.aliduplexfold(StringVector alignment1, StringVector alignment2) duplex_list_t
RNA.alifold(*args)

Compute Minimum Free Energy (MFE), and a corresponding consensus secondary structure for an RNA sequence alignment using a comparative method.

This simplified interface to RNA.fold_compound.mfe() computes the MFE and, if required, a consensus secondary structure for an RNA sequence alignment using default options. Memory required for dynamic programming (DP) matrices will be allocated and free’d on-the-fly. Hence, after return of this function, the recursively filled matrices are not available any more for any post-processing, e.g. suboptimal backtracking, etc.

SWIG Wrapper Notes

This function is available as function alifold() in the global namespace. The parameter structure is returned along with the MFE and must not be provided. See e.g. RNA.alifold() in the Python API.

Parameters:
  • sequences (const char **) – RNA sequence alignment

  • structure (string) – A pointer to the character array where the secondary structure in dot-bracket notation will be written to

Returns:

the minimum free energy (MFE) in kcal/mol

Return type:

float

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use RNA.fold_compound.mfe(), and the data structure RNA.fold_compound() instead.

RNA.aln_consensus_mis(StringVector alignment, md md_p=None) std::string

Compute the Most Informative Sequence (MIS) for a given multiple sequence alignment.

The most informative sequence (MIS) [Freyhult et al., 2005] displays for each alignment column the nucleotides with frequency greater than the background frequency, projected into IUPAC notation. Columns where gaps are over-represented are in lower case.

SWIG Wrapper Notes

This function is available as overloaded function aln_consensus_mis() where the last parameter may be omitted, indicating md = NULL. See e.g. RNA.aln_consensus_mis() in the Python API.

Parameters:
  • alignment (const char **) – The input sequence alignment (last entry must be NULL terminated)

  • md_p (const RNA.md() *) – Model details that specify known nucleotides (Maybe NULL)

Returns:

The most informative sequence for the alignment

Return type:

string

RNA.aln_consensus_sequence(StringVector alignment, md md_p=None) std::string

Compute the consensus sequence for a given multiple sequence alignment.

SWIG Wrapper Notes

This function is available as overloaded function aln_consensus_sequence() where the last parameter may be omitted, indicating md = NULL. See e.g. RNA.aln_consensus_sequence() in the Python API.

Parameters:
  • alignment (const char **) – The input sequence alignment (last entry must be NULL terminated)

  • md_p (const RNA.md() *) – Model details that specify known nucleotides (Maybe NULL)

Returns:

The consensus sequence of the alignment, i.e. the most frequent nucleotide for each alignment column

Return type:

string
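What "the most frequent nucleotide for each alignment column" means can be sketched as follows. The function consensus_sequence() is a hypothetical pure-Python illustration; how the library treats gaps, ties, or the model details in md_p is not reproduced here, and ties are simply resolved in favor of the character seen first in the column.

```python
from collections import Counter


def consensus_sequence(alignment):
    """Return the most frequent character of every alignment column (illustrative)."""
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) iterates over the columns of the alignment
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*alignment))


print(consensus_sequence(["GGGAAACCC", "GGCAAAGCC", "GCGAAACGC"]))
```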

RNA.aln_conservation_col(StringVector alignment, md md=None, unsigned int options=RNA.MEASURE_SHANNON_ENTROPY) DoubleVector

Compute nucleotide conservation in an alignment.

This function computes the conservation of nucleotides in alignment columns. The simplest measure is Shannon Entropy and can be selected by passing the RNA.MEASURE_SHANNON_ENTROPY flag in the options parameter.

SWIG Wrapper Notes

This function is available as overloaded function aln_conservation_col() where the last two parameters may be omitted, indicating md = NULL, and options = RNA.MEASURE_SHANNON_ENTROPY, respectively. See e.g. RNA.aln_conservation_col() in the Python API.

Parameters:
  • alignment (const char **) – The input sequence alignment (last entry must be NULL terminated)

  • md – Model details that specify known nucleotides (Maybe NULL)

  • options (unsigned int) – A flag indicating which measure of conservation should be applied

Returns:

A 1-based vector of column conservations

Return type:

list-like(double)

See also

RNA.MEASURE_SHANNON_ENTROPY

Note

Currently, only RNA.MEASURE_SHANNON_ENTROPY is supported as conservation measure.
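As a reminder of what Shannon entropy measures per alignment column, here is a minimal pure-Python sketch. The function column_entropy() is hypothetical: it computes \(-\sum_x f_x \log_2 f_x\) over the character frequencies \(f_x\) of each column and returns a 1-based list. Whether the library counts gaps or reports the entropy in exactly this form is an assumption not verified here.

```python
import math
from collections import Counter


def column_entropy(alignment):
    """Shannon entropy (in bits) of every alignment column, 1-based (illustrative)."""
    n_seq = len(alignment)
    entropies = [0.0]  # index 0 is unused so that column k is stored at index k
    for column in zip(*alignment):
        counts = Counter(column)
        entropy = -sum((c / n_seq) * math.log2(c / n_seq) for c in counts.values())
        entropies.append(entropy)
    return entropies


print(column_entropy(["AC", "AG"]))
```

A fully conserved column yields an entropy of 0, and a column split evenly between two characters yields 1 bit.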

RNA.aln_conservation_struct(StringVector alignment, std::string structure, md md=None) DoubleVector

Compute base pair conservation of a consensus structure.

This function computes the base pair conservation (fraction of canonical base pairs) of a consensus structure given a multiple sequence alignment. The base pair types that are considered canonical may be specified using the RNA.md().pair array. Passing NULL as parameter md results in default pairing rules, i.e. canonical Watson-Crick and GU Wobble pairs.

SWIG Wrapper Notes

This function is available as overloaded function aln_conservation_struct() where the last parameter md may be omitted, indicating md = NULL. See, e.g. RNA.aln_conservation_struct() in the Python API.

Parameters:
  • alignment (const char **) – The input sequence alignment (last entry must be NULL terminated)

  • structure (string) – The consensus structure in dot-bracket notation

  • md (const RNA.md() *) – Model details that specify compatible base pairs (Maybe NULL)

Returns:

A 1-based vector of base pair conservations

Return type:

list-like(double)
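
The notion of base pair conservation (fraction of sequences that form a canonical pair at each paired position of the consensus structure) can be illustrated with a short pure-Python sketch. The helper `pair_conservation` and the `CANONICAL` set are hypothetical names, only nested `()` pairs are parsed, and assigning the same value to both positions of a pair is an assumption of this sketch rather than documented library behavior.

```python
# Default pairing rules named in the text: Watson-Crick and GU wobble pairs.
CANONICAL = {"AU", "UA", "GC", "CG", "GU", "UG"}


def pair_conservation(alignment, structure):
    """Fraction of sequences with a canonical pair, per paired position (1-based)."""
    conservation = [0.0] * (len(structure) + 1)  # index 0 unused
    stack = []
    for j, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(j)
        elif char == ")":
            i = stack.pop()
            hits = sum(
                1 for seq in alignment
                if (seq[i - 1] + seq[j - 1]).upper() in CANONICAL
            )
            conservation[i] = conservation[j] = hits / len(alignment)
    return conservation


alignment = ["GGGAAACCC", "GGAAAAUCC", "GCGAAACAC"]
print(pair_conservation(alignment, "(((...)))"))
```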

RNA.aln_mpi(StringVector alignment) int

Compute the mean pairwise identity of a multiple sequence alignment.

SWIG Wrapper Notes

This function is available as function aln_mpi(). See e.g. RNA.aln_mpi() in the Python API.

Parameters:

alignment (const char **) – Aligned sequences

Returns:

The mean pairwise identity

Return type:

int

RNA.aln_pscore(StringVector alignment, md md=None) IntIntVector

SWIG Wrapper Notes

This function is available as overloaded function aln_pscore() where the last parameter may be omitted, indicating md = NULL. See e.g. RNA.aln_pscore() in the Python API.

RNA.b2C(structure)

Converts the full structure from bracket notation to a coarse grained notation using the ‘H’ ‘B’ ‘I’ ‘M’ and ‘R’ identifiers.

Deprecated since version 2.7.0: See RNA.db_to_tree_string() and RNA.STRUCTURE_TREE_SHAPIRO_SHORT for a replacement

Parameters:

structure (string) –

Return type:

string

RNA.b2HIT(structure)

Converts the full structure from bracket notation to the HIT notation including root.

Deprecated since version 2.7.0: See RNA.db_to_tree_string() and RNA.STRUCTURE_TREE_HIT for a replacement

Parameters:

structure (string) –

Return type:

string

RNA.b2Shapiro(structure)

Converts the full structure from bracket notation to the weighted coarse grained notation using the ‘H’ ‘B’ ‘I’ ‘M’ ‘S’ ‘E’ and ‘R’ identifiers.

Deprecated since version 2.7.0: See RNA.db_to_tree_string() and RNA.STRUCTURE_TREE_SHAPIRO_WEIGHT for a replacement

Parameters:

structure (string) –

Return type:

string

RNA.backtrack_GQuad_IntLoop_L(c, i, j, type, S, ggg, maxdist, p, q, P)

Backtrack an internal-loop-like enclosed G-quadruplex with closing pair (i, j) using the underlying Lfold matrix.

Parameters:
  • c (int) – The total contribution the loop should resemble

  • i (int) – position i of enclosing pair

  • j (int) – position j of enclosing pair

  • type (int) – base pair type of enclosing pair (must be reverse type)

  • S (list-like(int)) – integer encoded sequence

  • ggg (int **) – triangular matrix containing g-quadruplex contributions

  • p (int *) – here the 5’ position of the gquad is stored

  • q (int *) – here the 3’ position of the gquad is stored

  • P (RNA.param() *) – the data structure containing the precalculated contributions

Returns:

1 on success, 0 if no gquad found

Return type:

int

RNA.backtrack_GQuad_IntLoop_L_comparative(c, i, j, type, S_cons, S5, S3, a2s, ggg, p, q, n_seq, P)
class RNA.basepair

Bases: object

Typename for base pair element.

Deprecated since version 2.7.0: Use RNA.bp() instead!

i
Type:

int

j
Type:

int

property i
property j
property thisown

The membership flag

RNA.boustrophedon(*args)

Generate a sequence of Boustrophedon distributed numbers.

This function generates a sequence of positive natural numbers within the interval \([start, end]\) in a Boustrophedon fashion. That is, the numbers \(start, \ldots, end\) in the resulting list are alternating between left and right ends of the interval while progressing to the inside, i.e. the list consists of a sequence of natural numbers of the form:

\[start, end, start + 1, end - 1, start + 2, end - 2, \ldots\]

The resulting list is 1-based and contains the length of the sequence of numbers at its 0-th position.

Upon failure, the function returns NULL.

SWIG Wrapper Notes

This function is available as overloaded global function boustrophedon(). See, e.g. RNA.boustrophedon() in the Python API.

Parameters:
  • start (size_t) – The first number of the list (left side of the interval)

  • end (size_t) – The last number of the list (right side of the interval)

Returns:

A list of alternating numbers from the interval \([start, end]\) (or NULL on error)

Return type:

list-like(unsigned int)

See also

RNA.boustrophedon_pos
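
The ordering described above can be reproduced with a few lines of plain Python. The sketch follows the C-level description (1-based data with the length stored at position 0); the name `boustrophedon_list` is hypothetical, and treating `end < start` as the failure case is an assumption, since the text does not spell out the failure conditions.

```python
def boustrophedon_list(start, end):
    """Return [length, start, end, start + 1, end - 1, ...] for start <= end."""
    if end < start:
        return None  # assumed failure case, mirrors the documented NULL
    numbers = []
    left, right = start, end
    while left <= right:
        numbers.append(left)
        if left != right:
            numbers.append(right)
        left += 1
        right -= 1
    # Position 0 holds the number of elements, the data itself is 1-based.
    return [len(numbers)] + numbers


print(boustrophedon_list(1, 6))  # [6, 1, 6, 2, 5, 3, 4]
print(boustrophedon_list(1, 5))  # [5, 1, 5, 2, 4, 3]
```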

RNA.bp_distance(std::string str1, std::string str2, unsigned int options=) int
RNA.bp_distance(IntVector pt1, IntVector pt2) int
RNA.bp_distance(varArrayShort pt1, varArrayShort pt2) int

Compute the “base pair” distance between two secondary structures str1 and str2.

This is a wrapper around RNA.bp_distance_pt(). The structures should have the same length. The distance is the number of base pairs present in one structure but not in the other, which equals the edit distance with base pair opening and closing as the move set.

SWIG Wrapper Notes

This function is available as an overloaded method bp_distance(). Note that the SWIG wrapper takes two structures in dot-bracket notation and converts them into pair tables using RNA.ptable_from_string(). The resulting pair tables are then internally passed to RNA.bp_distance_pt(). To control which kind of matching brackets will be used during conversion, the optional argument options can be used. See also the description of RNA.ptable_from_string() for available options. (default: RNA.BRACKETS_RND). See, e.g. RNA.bp_distance() in the Python API.

Parameters:
  • str1 (string) – First structure in dot-bracket notation

  • str2 (string) – Second structure in dot-bracket notation

Returns:

The base pair distance between str1 and str2

Return type:

int

See also

RNA.bp_distance_pt
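
The distance definition above (base pairs present in exactly one of the two structures) corresponds to the size of a symmetric set difference. The following pure-Python sketch illustrates this for nested structures written with round brackets; `base_pairs` and `base_pair_distance` are hypothetical helper names and do not reflect how the library computes the value internally.

```python
def base_pairs(structure):
    """Collect the set of (i, j) pairs of a nested dot-bracket string (1-based)."""
    pairs, stack = set(), []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def base_pair_distance(str1, str2):
    """Number of base pairs found in exactly one of the two structures."""
    return len(base_pairs(str1) ^ base_pairs(str2))


print(base_pair_distance("((..))", "(....)"))  # 1
print(base_pair_distance("((..))", "..(..)"))  # 3
```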

RNA.cdata(ptr, nelements=1)
RNA.centroid(length, dist)

Deprecated since version 2.7.0: This function is deprecated and should not be used anymore as it is not threadsafe!

RNA.circalifold(*args)

Compute the MFE and the corresponding consensus structure of an alignment of sequences, assuming the sequences are circular instead of linear.

Deprecated since version 2.7.0: Usage of this function is discouraged! Use RNA.alicircfold(), and RNA.fold_compound.mfe() instead!

Parameters:
  • strings (const char **) – A pointer to a NULL terminated array of character arrays

  • structure (string) – A pointer to a character array that may contain a constraining consensus structure (will be overwritten by a consensus structure that exhibits the MFE)

Returns:

The free energy score in kcal/mol

Return type:

float

See also

RNA.alicircfold, RNA.alifold, RNA.fold_compound.mfe

RNA.circfold(*args)

Compute Minimum Free Energy (MFE), and a corresponding secondary structure for a circular RNA sequence.

This simplified interface to RNA.fold_compound.mfe() computes the MFE and, if required, a secondary structure for a circular RNA sequence using default options. Memory required for dynamic programming (DP) matrices will be allocated and free’d on-the-fly. Hence, after return of this function, the recursively filled matrices are not available any more for any post-processing, e.g. suboptimal backtracking, etc.

Folding of circular RNA sequences is handled as a post-processing step of the forward recursions. See Hofacker and Stadler [2006] for further details.

SWIG Wrapper Notes

This function is available as function circfold() in the global namespace. The parameter structure is returned along with the MFE and must not be provided. See e.g. RNA.circfold() in the Python API.

Parameters:
  • sequence (string) – RNA sequence

  • structure (string) – A pointer to the character array where the secondary structure in dot-bracket notation will be written to

Returns:

the minimum free energy (MFE) in kcal/mol

Return type:

float

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use RNA.fold_compound.mfe(), and the data structure RNA.fold_compound() instead.
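
A minimal usage sketch for the simplified interface is shown below. It assumes that the wrapper returns the pair (structure, mfe) in that order, as stated in the wrapper notes that the structure is returned along with the MFE; the helper `fold_circular` and the example sequence are made up for illustration. The import is guarded so that the snippet also runs where the RNA module is not installed.

```python
try:
    import RNA  # ViennaRNA Python bindings; optional for this sketch
except ImportError:
    RNA = None


def fold_circular(sequence):
    """Return (structure, mfe) for a circular RNA, or None without the RNA module."""
    if RNA is None:
        return None
    # No output buffer is passed in: the structure comes back with the MFE.
    structure, mfe = RNA.circfold(sequence)
    return structure, mfe


sequence = "GGGGAAAACCCCUUUUGGGGAAAACCCC"  # arbitrary example sequence
result = fold_circular(sequence)
if result is None:
    print("RNA module not available")
else:
    print(result[0], f"{result[1]:.2f} kcal/mol")
```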

class RNA.cmd

Bases: object

property thisown

The membership flag

RNA.co_pf_fold(*args)
RNA.cofold(*args)

Compute Minimum Free Energy (MFE), and a corresponding secondary structure for two dimerized RNA sequences.

This simplified interface to RNA.fold_compound.mfe() computes the MFE and, if required, a secondary structure for two RNA sequences upon dimerization using default options. Memory required for dynamic programming (DP) matrices will be allocated and free’d on-the-fly. Hence, after return of this function, the recursively filled matrices are not available any more for any post-processing, e.g. suboptimal backtracking, etc.

Deprecated since version 2.7.0: This function is obsolete since RNA.mfe()/RNA.fold() can handle complexes of multiple sequences since v2.5.0. Use RNA.mfe()/RNA.fold() for connected component MFE instead and compute MFEs of unconnected states separately.

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use RNA.fold_compound.mfe(), and the data structure RNA.fold_compound() instead.

SWIG Wrapper Notes

This function is available as function cofold() in the global namespace. The parameter structure is returned along with the MFE and must not be provided. See e.g. RNA.cofold() in the Python API.

Parameters:
  • sequence (string) – two RNA sequences separated by the ‘&’ character

  • structure (string) – A pointer to the character array where the secondary structure in dot-bracket notation will be written to

Returns:

the minimum free energy (MFE) in kcal/mol

Return type:

float

RNA.compare_structure(std::string str1, std::string str2, int fuzzy=0, unsigned int options=) score
RNA.compare_structure(IntVector pt1, IntVector pt2, int fuzzy=0) score
RNA.compare_structure(varArrayShort pt1, varArrayShort pt2, int fuzzy=0) score
RNA.consens_mis(alignment, md_p=None)
RNA.db_flatten(*args)

Substitute pairs of brackets in a string with parentheses.

This function can be used to replace brackets of unusual types, such as angular brackets <>, to dot-bracket format. The options parameter is used to specify which types of brackets will be replaced by round parentheses ().

SWIG Wrapper Notes

This function flattens an input structure string in-place! The second parameter is optional and defaults to RNA.BRACKETS_DEFAULT.

An overloaded version of this function exists, where an additional second parameter can be passed to specify the target brackets, i.e. the type of matching pair characters all brackets will be flattened to. Therefore, in the scripting language interface this function is a replacement for RNA.db_flatten_to(). See, e.g. RNA.db_flatten() in the Python API.

Parameters:
  • structure (string) – The structure string where brackets are flattened in-place

  • options (unsigned int) – A bitmask to specify which types of brackets should be flattened out

See also

RNA.db_flatten_to, RNA.BRACKETS_RND, RNA.BRACKETS_ANG, RNA.BRACKETS_CLY, RNA.BRACKETS_SQR, RNA.BRACKETS_DEFAULT
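
The effect of flattening can be illustrated without the library by a simple character translation. In this sketch the hypothetical `pairs` argument stands in for the options bitmask and `target` for the target bracket type; the function name `flatten_brackets` is likewise made up, and a new string is returned instead of modifying the input.

```python
def flatten_brackets(structure, pairs=("<>", "{}", "[]"), target="()"):
    """Replace the bracket types listed in `pairs` by the `target` pair."""
    table = {}
    for opening, closing in pairs:
        table[ord(opening)] = target[0]
        table[ord(closing)] = target[1]
    return structure.translate(table)


print(flatten_brackets("<<..{{..}}..>>"))                              # ((..((..))..))
print(flatten_brackets("((..[[..))..]]", pairs=("[]",), target="<>"))  # ((..<<..))..>>
```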

RNA.db_from_WUSS(wuss)

Convert a WUSS annotation string to dot-bracket format.

Parameters:

wuss (string) – The input string in WUSS notation

Returns:

A dot-bracket notation of the input secondary structure

Return type:

string

Note

This function flattens all brackets, and treats pseudo-knots annotated by matching pairs of upper/lowercase letters as unpaired nucleotides

RNA.db_from_plist(ElemProbVector elem_probs, unsigned int length) std::string

Convert a list of base pairs into dot-bracket notation.

Parameters:
  • pairs (RNA.ep() *) – A RNA.ep() containing the pairs to be included in the dot-bracket string

  • n (unsigned int) – The length of the structure (number of nucleotides)

Returns:

The dot-bracket string containing the provided base pairs

Return type:

string

See also

RNA.plist

RNA.db_from_ptable(IntVector pt) char
RNA.db_from_ptable(varArrayShort pt) char *

Convert a pair table into dot-parenthesis notation.

This function also converts pair table formatted structures that contain pseudoknots. Non-nested base pairs result in additional pairs of parentheses and brackets within the resulting dot-parenthesis string. The following pairs are available: (), [], {}, <>, as well as pairs of matching upper-/lower-case characters from the alphabet A-Z.

Parameters:

pt (const short *) – The pair table to be copied

Returns:

A char pointer to the dot-bracket string

Return type:

string

Note

In cases where the level of non-nested base pairs exceeds the maximum number of 30 different base pair indicators (4 parenthesis/brackets, 26 matching characters), a warning is printed and the remaining base pairs are left out from the conversion.
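
For the pseudoknot-free case, the conversion can be sketched in a few lines. The sketch assumes the usual pair table layout (entry 0 holds the length, entry i holds the pairing partner of position i, or 0 if unpaired) and uses round brackets only; `ptable_to_db` is a hypothetical name, and crossing pairs are not given separate bracket types here.

```python
def ptable_to_db(pt):
    """Convert a 1-based pair table (pt[0] = length) into dot-bracket notation."""
    n = pt[0]
    chars = ["."] * n
    for i in range(1, n + 1):
        j = pt[i]
        if j > i:  # handle each pair once, at its 5' position
            chars[i - 1], chars[j - 1] = "(", ")"
    return "".join(chars)


pt = [9, 9, 8, 7, 0, 0, 0, 3, 2, 1]
print(ptable_to_db(pt))  # (((...)))
```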

RNA.db_pack(struc)

Pack secondary structure, 5:1 compression using base 3 encoding.

Returns a binary string encoding of the secondary structure using a 5:1 compression scheme. The string is NULL terminated and can therefore be used with standard string functions such as strcmp(). Useful for programs that need to keep many structures in memory.

Parameters:

struc (string) – The secondary structure in dot-bracket notation

Returns:

The binary encoded structure

Return type:

string

See also

RNA.db_unpack
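
The idea behind the 5:1 compression is that five characters from a three-letter alphabet fit into a single byte, since 3^5 = 243 ≤ 255. The round trip below is a sketch of that idea only: the digit assignment, the padding with `(` and the offset of 1 (to avoid NUL bytes) are assumptions of this example, so its output is not meant to be byte-compatible with RNA.db_pack().

```python
DIGITS = {"(": 0, ")": 1, ".": 2}
SYMBOLS = "()."


def pack(structure):
    """Pack 5 structure characters into one byte using base-3 digits."""
    packed = bytearray()
    for start in range(0, len(structure), 5):
        chunk = structure[start:start + 5].ljust(5, "(")  # pad with digit 0
        value = 0
        for char in chunk:
            value = value * 3 + DIGITS[char]
        packed.append(value + 1)  # shift by 1 so that no byte is NUL
    return bytes(packed)


def unpack(packed):
    """Inverse of pack(); trailing padding is stripped."""
    chars = []
    for byte in packed:
        value = byte - 1
        block = []
        for _ in range(5):
            value, digit = divmod(value, 3)
            block.append(SYMBOLS[digit])
        chars.extend(reversed(block))  # digits were extracted lowest first
    # A valid structure never ends with '(', so trailing '(' is padding.
    return "".join(chars).rstrip("(")


structure = "((((...))))"
print(len(structure), len(pack(structure)))   # 11 3
print(unpack(pack(structure)) == structure)   # True
```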

RNA.db_pk_remove(std::string structure, unsigned int options=) std::string

Remove pseudo-knots from an input structure.

This function removes pseudo-knots from an input structure by determining the minimum number of base pairs that need to be removed to make the structure pseudo-knot free.

To accomplish that, we use a dynamic programming algorithm similar to the Nussinov maximum matching approach.

The input structure must be in a dot-bracket string like form where crossing base pairs are denoted by the use of additional types of matching brackets, e.g. <>, {}, []. Furthermore, crossing pairs may be annotated by matching uppercase/lowercase letters from the alphabet A-Z. For the latter, the uppercase letter must be the 5’ and the lowercase letter the 3’ nucleotide of the base pair. The actual type of brackets to be recognized by this function must be specified through the options parameter.

SWIG Wrapper Notes

This function is available as an overloaded function db_pk_remove() where the optional second parameter options defaults to RNA.BRACKETS_ANY. See, e.g. RNA.db_pk_remove() in the Python API.

Parameters:
  • structure (string) – Input structure in dot-bracket format that may include pseudo-knots

  • options (unsigned int) – A bitmask to specify which types of brackets should be processed

Returns:

The input structure devoid of pseudo-knots in dot-bracket notation

Return type:

string

See also

RNA.pt_pk_remove, RNA.db_flatten, RNA.BRACKETS_RND, RNA.BRACKETS_ANG, RNA.BRACKETS_CLY, RNA.BRACKETS_SQR, RNA.BRACKETS_ALPHA, RNA.BRACKETS_DEFAULT, RNA.BRACKETS_ANY

Note

Brackets in the input structure string that are not covered by the options bitmask will be silently ignored!

RNA.db_to_element_string(structure)

Convert a secondary structure in dot-bracket notation to a nucleotide annotation of loop contexts.

Parameters:

structure (string) – The secondary structure in dot-bracket notation

Returns:

A string annotating each nucleotide according to its structural context

Return type:

string

RNA.db_to_tree_string(std::string structure, unsigned int type) std::string

Convert a Dot-Bracket structure string into tree string representation.

This function allows one to convert a secondary structure in dot-bracket notation into one of the various tree representations for secondary structures. The resulting tree is then represented as a string of parentheses and node symbols, similar to the Newick format.

Currently we support conversion into the following formats, denoted by the value of parameter type:

  • RNA.STRUCTURE_TREE_HIT - Homeomorphically Irreducible Tree (HIT) representation of a secondary structure. (See also Fontana et al. [1993] )

  • RNA.STRUCTURE_TREE_SHAPIRO_SHORT - (short) Coarse Grained representation of a secondary structure (same as Shapiro [1988] , but with root node R and without S nodes for the stems)

  • RNA.STRUCTURE_TREE_SHAPIRO - (full) Coarse Grained representation of a secondary structure (See also Shapiro [1988] )

  • RNA.STRUCTURE_TREE_SHAPIRO_EXT - (extended) Coarse Grained representation of a secondary structure (same as Shapiro [1988] , but external nodes denoted as E )

  • RNA.STRUCTURE_TREE_SHAPIRO_WEIGHT - (weighted) Coarse Grained representation of a secondary structure (same as RNA.STRUCTURE_TREE_SHAPIRO_EXT but with additional weights for number of unpaired nucleotides in loop, and number of pairs in stems)

  • RNA.STRUCTURE_TREE_EXPANDED - Expanded Tree representation of a secondary structure.

Parameters:
  • structure (string) – The null-terminated dot-bracket structure string

  • type (unsigned int) – A switch to determine the type of tree string representation

Returns:

A tree representation of the input structure

Return type:

string

See also

sec_structure_representations_tree

RNA.db_unpack(packed)

Unpack secondary structure previously packed with RNA.db_pack()

Translate a compressed binary string produced by RNA.db_pack() back into the familiar dot-bracket notation.

Parameters:

packed (string) – The binary encoded packed secondary structure

Returns:

The unpacked secondary structure in dot-bracket notation

Return type:

string

See also

RNA.db_pack

RNA.delete_doubleP(ary)
RNA.delete_floatP(ary)
RNA.delete_intP(ary)
RNA.delete_shortP(ary)
RNA.delete_ushortP(ary)
RNA.deref_any(ptr, index)
RNA.dist_mountain(str1, str2, p=1)
class RNA.doubleArray(nelements)

Bases: object

cast()
static frompointer(t)
property thisown

The membership flag

RNA.doubleArray_frompointer(t)
RNA.doubleP_getitem(ary, index)
RNA.doubleP_setitem(ary, index, value)
class RNA.duplexT(*args, **kwargs)

Bases: object

Data structure for RNAduplex.

i
Type:

int

j
Type:

int

end
Type:

int

structure
Type:

string

energy
Type:

double

energy_backtrack
Type:

double

opening_backtrack_x
Type:

double

opening_backtrack_y
Type:

double

offset
Type:

int

dG1
Type:

double

dG2
Type:

double

ddG
Type:

double

tb
Type:

int

te
Type:

int

qb
Type:

int

qe
Type:

int

property dG1
property dG2
property ddG
property end
property energy
property energy_backtrack
property i
property j
property offset
property opening_backtrack_x
property opening_backtrack_y
property qb
property qe
property structure
property tb
property te
property thisown

The membership flag

class RNA.duplex_list_t

Bases: object

property energy
property i
property j
property structure
property thisown

The membership flag

RNA.duplex_subopt(std::string s1, std::string s2, int delta, int w) DuplexVector
RNA.duplexfold(std::string s1, std::string s2) duplex_list_t
RNA.encode_seq(sequence)
RNA.energy_of_circ_struct(string, structure)

Calculate the free energy of an already folded circular RNA

Deprecated since version 2.7.0: This function is deprecated and should not be used in future programs! Use energy_of_circ_structure() instead!

Note

This function is not entirely threadsafe! Depending on the state of the global variable eos_debug it prints energy information to stdout or not…

Parameters:
  • string (string) – RNA sequence

  • structure (string) – secondary structure in dot-bracket notation

Returns:

the free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.energy_of_circ_structure(string, structure, verbosity_level)

Calculate the free energy of an already folded circular RNA.

If verbosity level is set to a value >0, energies of structure elements are printed to stdout

Note

OpenMP: This function relies on several global model settings variables and thus is not to be considered threadsafe. See energy_of_circ_struct_par() for a completely threadsafe implementation.

Deprecated since version 2.7.0: Use RNA.fold_compound.eval_structure() or RNA.fold_compound.eval_structure_verbose() instead!

Parameters:
  • string (string) – RNA sequence

  • structure (string) – Secondary structure in dot-bracket notation

  • verbosity_level (int) – A flag to turn verbose output on/off

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.energy_of_gquad_structure(string, structure, verbosity_level)
RNA.energy_of_move(string, structure, m1, m2)

Calculate energy of a move (closing or opening of a base pair)

If the parameters m1 and m2 are negative, the move is a deletion (opening) of a base pair, otherwise it is an insertion (closing).

Deprecated since version 2.7.0: Use RNA.fold_compound.eval_move() instead!

Parameters:
  • string (string) – RNA sequence

  • structure (string) – secondary structure in dot-bracket notation

  • m1 (int) – first coordinate of base pair

  • m2 (int) – second coordinate of base pair

Returns:

energy change of the move in kcal/mol

Return type:

float
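
The sign convention for moves (negative coordinates delete a pair, positive coordinates insert one) can be made explicit with a small helper that applies a move to a dot-bracket string. This is a hypothetical illustration: `apply_move` is not part of the API, it performs only basic sanity checks, and it does not compute any energies.

```python
def apply_move(structure, m1, m2):
    """Apply a base pair move to a dot-bracket string (1-based positions)."""
    chars = list(structure)
    if m1 < 0 and m2 < 0:
        # Negative coordinates: delete (open) the pair (-m1, -m2).
        i, j = -m1, -m2
        if chars[i - 1] != "(" or chars[j - 1] != ")":
            raise ValueError("no base pair to delete at these positions")
        chars[i - 1] = chars[j - 1] = "."
    elif m1 > 0 and m2 > 0:
        # Positive coordinates: insert (close) the pair (m1, m2).
        if chars[m1 - 1] != "." or chars[m2 - 1] != ".":
            raise ValueError("both positions must be unpaired")
        chars[m1 - 1], chars[m2 - 1] = "(", ")"
    else:
        raise ValueError("m1 and m2 must have the same sign")
    return "".join(chars)


print(apply_move("(((...)))", -3, -7))  # ((.....))
print(apply_move("((.....))", 3, 7))    # (((...)))
```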

RNA.energy_of_move_pt(pt, s, s1, m1, m2)

Calculate energy of a move (closing or opening of a base pair)

If the parameters m1 and m2 are negative, the move is a deletion (opening) of a base pair, otherwise it is an insertion (closing).

Deprecated since version 2.7.0: Use RNA.fold_compound.eval_move_pt() instead!

Parameters:
  • pt (list-like(int)) – the pair table of the secondary structure

  • s (list-like(int)) – encoded RNA sequence

  • s1 (list-like(int)) – encoded RNA sequence

  • m1 (int) – first coordinate of base pair

  • m2 (int) – second coordinate of base pair

Returns:

energy change of the move in 10cal/mol

Return type:

int
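
Functions operating on pair tables report integer energies in units of 10cal/mol, whereas the string-based variants return kcal/mol. Since 1 kcal/mol = 1000 cal/mol = 100 × 10cal/mol, the conversion is a division by 100. The helper name below is hypothetical.

```python
def dekacal_to_kcal(energy):
    """Convert an integer energy in 10cal/mol into kcal/mol."""
    return energy / 100.0


print(dekacal_to_kcal(-1230))  # -12.3
```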

RNA.energy_of_struct(string, structure)

Calculate the free energy of an already folded RNA

Deprecated since version 2.7.0: This function is deprecated and should not be used in future programs! Use energy_of_structure() instead!

Note

This function is not entirely threadsafe! Depending on the state of the global variable eos_debug it prints energy information to stdout or not…

Parameters:
  • string (string) – RNA sequence

  • structure (string) – secondary structure in dot-bracket notation

Returns:

the free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.energy_of_struct_pt(string, ptable, s, s1)

Calculate the free energy of an already folded RNA

Deprecated since version 2.7.0: This function is deprecated and should not be used in future programs! Use energy_of_structure_pt() instead!

Note

This function is not entirely threadsafe! Depending on the state of the global variable eos_debug it prints energy information to stdout or not…

Parameters:
  • string (string) – RNA sequence

  • ptable (list-like(int)) – the pair table of the secondary structure

  • s (list-like(int)) – encoded RNA sequence

  • s1 (list-like(int)) – encoded RNA sequence

Returns:

the free energy of the input structure given the input sequence in 10cal/mol

Return type:

int

See also

make_pair_table, energy_of_structure

RNA.energy_of_structure(string, structure, verbosity_level)

Calculate the free energy of an already folded RNA using global model detail settings.

If verbosity level is set to a value >0, energies of structure elements are printed to stdout

Deprecated since version 2.7.0: Use RNA.fold_compound.eval_structure() or RNA.fold_compound.eval_structure_verbose() instead!

Note

OpenMP: This function relies on several global model settings variables and thus is not to be considered threadsafe. See energy_of_struct_par() for a completely threadsafe implementation.

Parameters:
  • string (string) – RNA sequence

  • structure (string) – secondary structure in dot-bracket notation

  • verbosity_level (int) – a flag to turn verbose output on/off

Returns:

the free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.energy_of_structure_pt(string, ptable, s, s1, verbosity_level)

Calculate the free energy of an already folded RNA.

If verbosity level is set to a value >0, energies of structure elements are printed to stdout

Deprecated since version 2.7.0: Use RNA.fold_compound.eval_structure_pt() or RNA.fold_compound.eval_structure_pt_verbose() instead!

Note

OpenMP: This function relies on several global model settings variables and thus is not to be considered threadsafe. See energy_of_struct_pt_par() for a completely threadsafe implementation.

Parameters:
  • string (string) – RNA sequence

  • ptable (list-like(int)) – the pair table of the secondary structure

  • s (list-like(int)) – encoded RNA sequence

  • s1 (list-like(int)) – encoded RNA sequence

  • verbosity_level (int) – a flag to turn verbose output on/off

Returns:

the free energy of the input structure given the input sequence in 10cal/mol

Return type:

int

RNA.enumerate_necklaces(entity_counts)

Enumerate all necklaces with fixed content.

This function implements the algorithm described in "A fast algorithm to generate necklaces with fixed content" by Sawada [2003].

The function receives a list of counts (the elements on the necklace) for each type of object within a necklace. The list starts at index 0 and ends with an entry that has a count of 0. The algorithm then enumerates all non-cyclic permutations of the content, returned as a list of necklaces. This list, again, is zero-terminated, i.e. the last entry of the list is a NULL pointer.

SWIG Wrapper Notes

This function is available as global function enumerate_necklaces() which accepts list input and produces list-of-lists output. See, e.g. RNA.enumerate_necklaces() in the Python API.

Parameters:

type_counts (const unsigned int *) – A 0-terminated list of entity counts

Returns:

A list of all non-cyclic permutations of the entities

Return type:

list-like(list-like(unsigned int))
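
What "necklaces with fixed content" means can be shown with a deliberately naive enumeration: generate all arrangements of the beads and keep one representative per rotation class. This is not Sawada's algorithm and scales factorially, so it is only suitable for tiny inputs; the function name `naive_necklaces` and the use of 0-based type indices as bead labels are assumptions of this sketch.

```python
from itertools import permutations


def naive_necklaces(entity_counts):
    """Enumerate necklaces for the given number of beads per type (naively)."""
    beads = [kind for kind, count in enumerate(entity_counts) for _ in range(count)]
    if not beads:
        return []
    necklaces = set()
    for arrangement in permutations(beads):
        rotations = (
            arrangement[k:] + arrangement[:k] for k in range(len(arrangement))
        )
        # The lexicographically smallest rotation represents the whole class.
        necklaces.add(min(rotations))
    return [list(necklace) for necklace in sorted(necklaces)]


print(naive_necklaces([2, 2]))  # [[0, 0, 1, 1], [0, 1, 0, 1]]
```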

class RNA.ep(*args, **kwargs)

Bases: object

Data structure representing a single entry of an element probability list (e.g. list of pair probabilities)

See also

RNA.plist, RNA.fold_compound.plist_from_probs, RNA.db_from_plist, RNA.PLIST_TYPE_BASEPAIR, RNA.PLIST_TYPE_GQUAD, RNA.PLIST_TYPE_H_MOTIF, RNA.PLIST_TYPE_I_MOTIF, RNA.PLIST_TYPE_UD_MOTIF, RNA.PLIST_TYPE_STACK

i

Start position (usually 5’ nucleotide that starts the element, e.g. base pair)

Type:

int

j

End position (usually 3’ nucleotide that ends the element, e.g. base pair)

Type:

int

p

Probability of the element.

Type:

float

type

Type of the element.

Type:

int

property i
property j
property p
property thisown

The membership flag

property type
RNA.eval_circ_gquad_structure(*args)

Evaluate free energy of a sequence/structure pair, assume sequence to be circular, allow for G-Quadruplexes in the structure, and print contributions per loop.

This function is the same as RNA.eval_structure_simple_v() but assumes the input sequence to be circular and allows for annotated G-Quadruplexes in the dot-bracket structure input.

G-Quadruplexes are annotated as plus signs (‘+’) for each G involved in the motif. Linker sequences must be denoted by dots (‘.’) as they are considered unpaired. Below is an example of a 2-layer G-quadruplex:

SWIG Wrapper Notes

This function is available through an overloaded version of RNA.eval_circ_gquad_structure(). The last two arguments for this function are optional and default to RNA.VERBOSITY_QUIET and NULL, respectively. See, e.g. RNA.eval_circ_gquad_structure() in the Python API.

Parameters:
  • string (string) – RNA sequence in uppercase letters

  • structure (string) – Secondary structure in dot-bracket notation

  • verbosity_level (int) – The level of verbosity of this function

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.eval_circ_structure(*args)

Evaluate free energy of a sequence/structure pair, assume sequence to be circular and print contributions per loop.

This function is the same as RNA.eval_structure_simple_v() but assumes the input sequence to be circularized.

SWIG Wrapper Notes

This function is available through an overloaded version of RNA.eval_circ_structure(). The last two arguments for this function are optional and default to RNA.VERBOSITY_QUIET and NULL, respectively. See, e.g. RNA.eval_circ_structure() in the Python API.

Parameters:
  • string (string) – RNA sequence in uppercase letters

  • structure (string) – Secondary structure in dot-bracket notation

  • verbosity_level (int) – The level of verbosity of this function

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

See also

RNA.eval_structure_simple_v, RNA.eval_circ_structure, RNA.fold_compound.eval_structure_verbose

RNA.eval_gquad_structure(*args)

Evaluate free energy of a sequence/structure pair, allow for G-Quadruplexes in the structure and print contributions per loop.

This function is the same as RNA.eval_structure_simple_v() but allows for annotated G-Quadruplexes in the dot-bracket structure input.

G-Quadruplexes are annotated as plus signs (‘+’) for each G involved in the motif. Linker sequences must be denoted by dots (‘.’) as they are considered unpaired. Below is an example of a 2-layer G-quadruplex:
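
As a hedged illustration of this annotation convention, the sketch below checks a sequence/structure pair in which four runs of two G's are marked with plus signs and the linkers with dots. The pair was chosen here for illustration, and the helper `check_gquad_annotation` is hypothetical; it only verifies that every plus sign sits on a G and does not validate the G-quadruplex geometry.

```python
def check_gquad_annotation(sequence, structure):
    """Check that every '+' in the structure annotates a G in the sequence."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return all(
        base.upper() == "G"
        for base, char in zip(sequence, structure)
        if char == "+"
    )


sequence = "GGAAGGAAAGGAGG"
structure = "++..++...++.++"  # four G-runs of length 2, linkers as dots
print(check_gquad_annotation(sequence, structure))  # True
```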

SWIG Wrapper Notes

This function is available through an overloaded version of RNA.eval_gquad_structure(). The last two arguments for this function are optional and default to RNA.VERBOSITY_QUIET and NULL, respectively. See, e.g. RNA.eval_gquad_structure() in the Python API.

Parameters:
  • string (string) – RNA sequence in uppercase letters

  • structure (string) – Secondary structure in dot-bracket notation

  • verbosity_level (int) – The level of verbosity of this function

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.eval_structure_pt_simple(*args)

Calculate the free energy of an already folded RNA.

This function allows for energy evaluation of a given sequence/structure pair where the structure is provided in pair_table format as obtained from RNA.ptable(). In contrast to RNA.fold_compound.eval_structure_pt_verbose() this function assumes default model details and default energy parameters in order to evaluate the free energy of the secondary structure. Therefore, it serves as a simple interface function for energy evaluation for situations where no changes on the energy model are required.

Parameters:
  • string (string) – RNA sequence in uppercase letters

  • pt (const short *) – Secondary structure as pair_table

  • verbosity_level (int) – The level of verbosity of this function

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in 10cal/mol

Return type:

int

See also

RNA.ptable, RNA.eval_structure_pt_v, RNA.eval_structure_simple
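To make the pair_table input and the 10cal/mol return unit more concrete, here is a minimal pure-Python sketch. It assumes the usual 1-based pair table layout (entry 0 holds the length, entry i holds the pairing partner of position i or 0 if unpaired). The helper names `make_pair_table` and `dekacal_to_kcal` are hypothetical and do not belong to the RNA module; in practice RNA.ptable() provides the table.

```python
def make_pair_table(structure):
    """Build a 1-based pair table from a dot-bracket string.

    pt[0] holds the length; pt[i] is the partner of position i, or 0.
    """
    n = len(structure)
    pt = [0] * (n + 1)
    pt[0] = n
    stack = []
    for i, symbol in enumerate(structure, start=1):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            j = stack.pop()
            pt[i], pt[j] = j, i
    if stack:
        raise ValueError("unbalanced opening bracket")
    return pt

def dekacal_to_kcal(energy):
    """Convert an integer energy in 10cal/mol into kcal/mol."""
    return energy / 100.0

print(make_pair_table("((..))"))
print(dekacal_to_kcal(-330))
```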

RNA.eval_structure_simple(*args)

Calculate the free energy of an already folded RNA and print contributions per loop.

This function allows for detailed energy evaluation of a given sequence/structure pair. In contrast to RNA.fold_compound.eval_structure() this function prints detailed energy contributions based on individual loops to a file handle. If NULL is passed as file handle, this function defaults to print to stdout. Any positive verbosity_level activates potential warning messages of the energy evaluating functions, while values \(\ge 1\) allow for detailed control of what data is printed. A negative verbosity_level turns off printing altogether.

In contrast to RNA.fold_compound.eval_structure_verbose() this function assumes default model details and default energy parameters in order to evaluate the free energy of the secondary structure. Therefore, it serves as a simple interface function for energy evaluation in situations where no changes to the energy model are required.

SWIG Wrapper Notes

This function is available through an overloaded version of RNA.eval_structure_simple(). The last two arguments for this function are optional and default to RNA.VERBOSITY_QUIET and NULL, respectively. See, e.g. RNA.eval_structure_simple() in the Python API.

Parameters:
  • string (string) – RNA sequence in uppercase letters

  • structure (string) – Secondary structure in dot-bracket notation

  • verbosity_level (int) – The level of verbosity of this function

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

RNA.exp_E_ExtLoop(type, si1, sj1, P)

This is the partition function variant of E_ExtLoop()

Deprecated since version 2.7.0: Use RNA.fold_compound.exp_E_ext_stem() instead!

Returns:

The Boltzmann weighted energy contribution of the introduced exterior-loop stem

Return type:

double

See also

E_ExtLoop

RNA.exp_E_Hairpin(u, type, si1, sj1, string, P)

Compute Boltzmann weight \(e^{-\Delta G/kT}\) of a hairpin loop.

Parameters:
  • u (int) – The size of the loop (number of unpaired nucleotides)

  • type (int) – The pair type of the base pair closing the hairpin

  • si1 (short) – The 5’-mismatching nucleotide

  • sj1 (short) – The 3’-mismatching nucleotide

  • string (string) – The sequence of the loop (may be NULL, otherwise must be at least \(size + 2\) long)

  • P (RNA.exp_param() *) – The datastructure containing scaled Boltzmann weights of the energy parameters

Returns:

The Boltzmann weight of the Hairpin-loop

Return type:

double

Warning

Not (really) thread safe! A threadsafe implementation will replace this function in a future release!

Energy evaluation may change due to updates in global variable “tetra_loop”

See also

get_scaled_pf_parameters, RNA.exp_param, E_Hairpin

Note

multiply by scale[u+2]
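The scaling note above can be sketched as follows. This is only an illustration under the assumption that the scaling array holds one factor of 1/pf_scale per nucleotide, i.e. scale[i] = pf_scale^(-i); the helper names are hypothetical and not part of the RNA module. A hairpin with u unpaired nucleotides spans u + 2 positions including its closing pair, hence the index u + 2.

```python
def scale_factors(max_len, pf_scale):
    """Assumed layout of the scaling array: scale[i] = pf_scale ** -i."""
    return [pf_scale ** -i for i in range(max_len + 1)]

def scaled_hairpin_weight(weight, u, scale):
    """Rescale an unscaled hairpin Boltzmann weight for a loop of size u."""
    return weight * scale[u + 2]

scale = scale_factors(3, 2.0)
print(scale)
print(scaled_hairpin_weight(8.0, 1, scale))
```

Keeping the per-nucleotide factor explicit helps products of many Boltzmann weights stay within floating point range for long sequences.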

RNA.exp_E_IntLoop(u1, u2, type, type2, si1, sj1, sp1, sq1, P)

Compute Boltzmann weight \(e^{-\Delta G/kT}\) of internal loop

multiply by scale[u1+u2+2] for scaling

Parameters:
  • u1 (int) – The size of the ‘left’-loop (number of unpaired nucleotides)

  • u2 (int) – The size of the ‘right’-loop (number of unpaired nucleotides)

  • type (int) – The pair type of the base pair closing the internal loop

  • type2 (int) – The pair type of the enclosed base pair

  • si1 (short) – The 5’-mismatching nucleotide of the closing pair

  • sj1 (short) – The 3’-mismatching nucleotide of the closing pair

  • sp1 (short) – The 3’-mismatching nucleotide of the enclosed pair

  • sq1 (short) – The 5’-mismatching nucleotide of the enclosed pair

  • P (RNA.exp_param() *) – The datastructure containing scaled Boltzmann weights of the energy parameters

Returns:

The Boltzmann weight of the Interior-loop

Return type:

double

See also

get_scaled_pf_parameters, RNA.exp_param, E_IntLoop

Note

This function is threadsafe

RNA.exp_E_MLstem(type, si1, sj1, P)
RNA.exp_E_Stem(type, si1, sj1, extLoop, P)

Compute the Boltzmann weighted energy contribution of a stem branching off a loop-region

This is the partition function variant of E_Stem()

Returns:

The Boltzmann weighted energy contribution of the branch off the loop

Return type:

double

See also

E_Stem

Note

This function is threadsafe

RNA.exp_E_gquad(L, l, pf)
RNA.exp_E_gquad_ali(i, L, l, S, a2s, n_seq, pf)
class RNA.exp_param(model_details=None)

Bases: object

The data structure that contains temperature scaled Boltzmann weights of the energy parameters.

id

An identifier for the data structure.

Deprecated since version 2.7.0: This attribute will be removed in version 3

Type:

int

expstack
Type:

double

exphairpin
Type:

double

expbulge
Type:

double

expinternal
Type:

double

expmismatchExt
Type:

double

expmismatchI
Type:

double

expmismatch23I
Type:

double

expmismatch1nI
Type:

double

expmismatchH
Type:

double

expmismatchM
Type:

double

expdangle5
Type:

double

expdangle3
Type:

double

expint11
Type:

double

expint21
Type:

double

expint22
Type:

double

expninio
Type:

double

lxc
Type:

double

expMLbase
Type:

double

expMLintern
Type:

double

expMLclosing
Type:

double

expTermAU
Type:

double

expDuplexInit
Type:

double

exptetra
Type:

double

exptri
Type:

double

exphex
Type:

double

Tetraloops
Type:

char

expTriloop
Type:

double

Triloops
Type:

char

Hexaloops
Type:

char

expTripleC
Type:

double

expMultipleCA
Type:

double

expMultipleCB
Type:

double

expgquad
Type:

double

expgquadLayerMismatch
Type:

double

gquadLayerMismatchMax
Type:

unsigned int

kT
Type:

double

pf_scale

Scaling factor to avoid over-/underflows.

Type:

double

temperature

Temperature used for loop contribution scaling.

Type:

double

alpha

Scaling factor for the thermodynamic temperature.

This allows for temperature scaling in Boltzmann factors independently from the energy contributions. The resulting Boltzmann factors are then computed by \(e^{-E/(\alpha \cdot K \cdot T)}\)

Type:

double
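The effect of alpha on a Boltzmann factor can be illustrated with a few lines of plain Python. The gas constant value and the helper name `boltzmann_factor` are assumptions made for this sketch; energies are taken in kcal/mol and temperatures in degrees Celsius.

```python
import math

GAS_CONSTANT = 1.98717e-3  # kcal/(mol*K); value assumed for this sketch
ZERO_CELSIUS = 273.15      # offset between Celsius and Kelvin

def boltzmann_factor(energy_kcal, temperature_c=37.0, alpha=1.0):
    """Compute exp(-E / (alpha * k * T)) for an energy in kcal/mol."""
    kT = GAS_CONSTANT * (temperature_c + ZERO_CELSIUS)
    return math.exp(-energy_kcal / (alpha * kT))

plain = boltzmann_factor(-2.0)
damped = boltzmann_factor(-2.0, alpha=2.0)
print(plain, damped)
```

With alpha = 2 the factor becomes the square root of the unscaled one, so the ensemble is flattened without touching the energy contributions themselves.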

model_details

Model details to be used in the recursions.

Type:

vrna_md_t

param_file

The filename the parameters were derived from, or empty string if they represent the default.

Type:

char

expSaltStack
Type:

double

expSaltLoop
Type:

double

SaltLoopDbl
Type:

double

SaltMLbase
Type:

int

SaltMLintern
Type:

int

SaltMLclosing
Type:

int

SaltDPXInit
Type:

int

property Hexaloops
property SaltDPXInit
property SaltLoopDbl
property SaltMLbase
property SaltMLclosing
property SaltMLintern
property Tetraloops
property Triloops
property alpha
property expDuplexInit
property expMLbase
property expMLclosing
property expMLintern
property expMultipleCA
property expMultipleCB
property expSaltLoop
property expSaltStack
property expTermAU
property expTriloop
property expTripleC
property expbulge
property expdangle3
property expdangle5
property expgquad
property expgquadLayerMismatch
property exphairpin
property exphex
property expint11
property expint21
property expint22
property expinternal
property expmismatch1nI
property expmismatch23I
property expmismatchExt
property expmismatchH
property expmismatchI
property expmismatchM
property expninio
property expstack
property exptetra
property exptri
property gquadLayerMismatchMax
property id
property kT
property lxc
property model_details
property param_file
property pf_scale
property temperature
property thisown

The membership flag

RNA.expand_Full(structure)

Convert the full structure from bracket notation to the expanded notation including root.

Parameters:

structure (string) –

Return type:

string

RNA.expand_Shapiro(coarse)

Inserts missing ‘S’ identifiers in unweighted coarse grained structures as obtained from b2C().

Parameters:

coarse (string) –

Return type:

string

RNA.extract_record_rest_structure(lines, length, option)
RNA.fc_add_pycallback(vc, PyFunc)
RNA.fc_add_pydata(vc, data, PyFuncOrNone)
RNA.file_PS_aln(std::string filename, StringVector alignment, StringVector identifiers, std::string structure, unsigned int start=0, unsigned int end=0, int offset=0, unsigned int columns=60) int

Create an annotated PostScript alignment plot.

Similar to RNA.file_PS_aln() but allows the user to print a particular slice of the alignment by specifying a start and end position. The additional offset parameter allows for adjusting the alignment position ruler value.

SWIG Wrapper Notes

This function is available as overloaded function file_PS_aln() where the last four parameters may be omitted, indicating start = 0, end = 0, offset = 0, and columns = 60. See, e.g. RNA.file_PS_aln() in the Python API.

Parameters:
  • filename (string) – The output file name

  • seqs (const char **) – The aligned sequences

  • names (const char **) – The names of the sequences

  • structure (string) – The consensus structure in dot-bracket notation

  • start (unsigned int) – The start of the alignment slice (a value of 0 indicates the first position of the alignment, i.e. no slicing at 5’ side)

  • end (unsigned int) – The end of the alignment slice (a value of 0 indicates the last position of the alignment, i.e. no slicing at 3’ side)

  • offset (int) – The alignment coordinate offset for the position ruler.

  • columns (unsigned int) – The number of columns before the alignment is wrapped as a new block (a value of 0 indicates no wrapping)

See also

RNA.file_PS_aln_slice

RNA.file_PS_rnaplot(*args)
RNA.file_PS_rnaplot_a(*args)
RNA.file_RNAstrand_db_read_record(fp, options=0)
RNA.file_SHAPE_read(file_name, length, default_value)

Read data from a given SHAPE reactivity input file.

This function parses the information from a given file and stores the result in the preallocated string sequence and the double array values.

Parameters:
  • file_name (string) – Path to the constraints file

  • length (int) – Length of the sequence (file entries exceeding this limit will cause an error)

  • default_value (double) – Value for missing indices

  • sequence (string) – Pointer to an array used for storing the sequence obtained from the SHAPE reactivity file

  • values (list-like(double)) – Pointer to an array used for storing the values obtained from the SHAPE reactivity file

RNA.file_commands_read(std::string filename, unsigned int options=) cmd

Extract a list of commands from a command file.

Read a list of commands specified in the input file and return them as list of abstract commands

SWIG Wrapper Notes

This function is available as global function file_commands_read(). See, e.g. RNA.file_commands_read() in the Python API.

Parameters:
  • filename (string) – The filename

  • options (unsigned int) – Options to limit the type of commands read from the file

Returns:

A list of abstract commands

Return type:

RNA.cmd()

See also

RNA.fold_compound.commands_apply, RNA.file_commands_apply, RNA.commands_free

RNA.file_connect_read_record(fp, remainder, options=0)
RNA.file_fasta_read(FILE * file, unsigned int options=0) int

Get a (fasta) data set from a file or stdin.

This function may be used to obtain complete datasets from a filehandle or stdin. A dataset is always defined to contain at least a sequence. If data starts with a fasta header, i.e. a line like

>some header info

then RNA.file_fasta_read_record() will assume that the sequence that follows the header may span over several lines. To disable this behavior and to assign a single line to the argument ‘sequence’ one can pass RNA.INPUT_NO_SPAN in the ‘options’ argument. If no fasta header is read in the beginning of a data block, a sequence must not span over multiple lines!

Unless the options RNA.INPUT_NOSKIP_COMMENTS or RNA.INPUT_NOSKIP_BLANK_LINES are passed, a sequence may be interrupted by lines starting with a comment character or empty lines.

A sequence is regarded as completely read if it was either assumed to not span over multiple lines, a secondary structure or structure constraint follows the sequence on the next line, or a new header marks the beginning of a new sequence…

All lines following the sequence (this includes comments) that do not initiate a new dataset according to the above definition are available through the line-array ‘rest’. Here one can usually find the structure constraint or other information belonging to the current dataset. Filling of ‘rest’ may be prevented by passing RNA.INPUT_NO_REST to the options argument.

The main purpose of this function is to be able to easily parse blocks of data in the header of a loop where all calculations for the appropriate data is done inside the loop. The loop may be then left on certain return values, e.g.:

In the example above, the while loop will be terminated when RNA.file_fasta_read_record() returns either an error, EOF, or a user initiated quit request.

As long as data is read from stdin (we are passing NULL as the file pointer), the id is printed if it is available for the current block of data. The sequence will be printed in any case and if some more lines belong to the current block of data each line will be printed as well.

Parameters:
  • header (char **) – A pointer which will be set such that it points to the header of the record

  • sequence (char **) – A pointer which will be set such that it points to the sequence of the record

  • rest (char ***) – A pointer which will be set such that it points to an array of lines which also belong to the record

  • file (FILE *) – A file handle to read from (if NULL, this function reads from stdin)

  • options (unsigned int) – Some options which may be passed to alter the behavior of the function, use 0 for no options

Returns:

A flag with information about what the function actually did read

Return type:

unsigned int

Note

This function will exit any program with an error message if no sequence could be read!

This function is NOT threadsafe! It uses a global variable to store information about the next data block. Do not forget to free the memory occupied by header, sequence and rest!
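The record semantics described above (a header line, a sequence that may span several lines, and remaining lines collected as ‘rest’) can be sketched in plain Python. This is a simplified stand-in, not the library's parser: it assumes every record starts with a header, treats purely alphabetic lines as sequence, and puts everything else into rest. The name `read_fasta_records` is hypothetical.

```python
import io

def read_fasta_records(handle):
    """Yield (header, sequence, rest) tuples from a FASTA-like text stream."""
    header, seq_parts, rest = None, [], []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # blank lines do not terminate a record in this sketch
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq_parts), rest
            header, seq_parts, rest = line[1:].strip(), [], []
        elif header is None:
            raise ValueError("this sketch expects a header before any data")
        elif line.isalpha() and not rest:
            seq_parts.append(line)  # sequence may span multiple lines
        else:
            rest.append(line)       # e.g. a structure constraint
    if header is not None:
        yield header, "".join(seq_parts), rest

data = io.StringIO(">seq1\nGGGA\nAACC\n((....))\n>seq2\nACGU\n")
records = list(read_fasta_records(data))
for header, sequence, rest in records:
    print(header, sequence, rest)
```

Writing the reader as a generator keeps the per-record calculations inside a single loop, which is the usage pattern the description has in mind.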

RNA.file_msa_detect_format(std::string filename, unsigned int options=) unsigned int

Detect the format of a multiple sequence alignment file.

This function attempts to determine the format of a file that supposedly contains a multiple sequence alignment (MSA). This is useful in cases where a MSA file contains more than a single record and therefore RNA.file_msa_read() can not be applied, since it only retrieves the first. Here, one can try to guess the correct file format using this function and then loop over the file, record by record using one of the low-level record retrieval functions for the corresponding MSA file format.

SWIG Wrapper Notes

This function exists as an overloaded version where the options parameter may be omitted! In that case, the options parameter defaults to RNA.FILE_FORMAT_MSA_DEFAULT. See, e.g. RNA.file_msa_detect_format() in the Python API.

Parameters:
  • filename (string) – The name of input file that contains the alignment

  • options (unsigned int) – Options to manipulate the behavior of this function

Returns:

The MSA file format, or RNA.FILE_FORMAT_MSA_UNKNOWN

Return type:

unsigned int

See also

RNA.file_msa_read, RNA.file_stockholm_read_record, RNA.file_clustal_read_record, RNA.file_fasta_read_record

Note

This function parses the entire first record within the specified file. As a result, it returns RNA.FILE_FORMAT_MSA_UNKNOWN not only if it can’t detect the file’s format, but also in cases where the file doesn’t contain sequences!
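For intuition, a much cruder guess can be made from the first non-empty line alone. The sketch below is not the library's detection logic (which parses the entire first record); it relies only on well-known leading markers of the four formats, and the helper name and the returned strings are hypothetical.

```python
def guess_msa_format(text):
    """Guess an MSA format from the first non-empty line of `text`."""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("# STOCKHOLM"):
            return "stockholm"
        if stripped.startswith("CLUSTAL"):
            return "clustal"
        if stripped.startswith(">"):
            return "fasta"
        if stripped.startswith("##maf") or stripped.startswith("a "):
            return "maf"
        return "unknown"  # first content line matched no known marker
    return "unknown"      # empty input

print(guess_msa_format("# STOCKHOLM 1.0\nseq1 ACGU\n//\n"))
```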

RNA.file_msa_read(std::string filename, unsigned int options=) int

Read a multiple sequence alignment from file.

This function reads the (first) multiple sequence alignment from an input file. The read alignment is split into the sequence id/name part and the actual sequence information and stored in memory as arrays of ids/names and sequences. If the alignment file format allows for additional information, such as an ID of the entire alignment or consensus structure information, this data is retrieved as well and made available. The options parameter allows to specify the set of alignment file formats that should be used to retrieve the data. If 0 is passed as option, the list of alignment file formats defaults to RNA.FILE_FORMAT_MSA_DEFAULT.

Currently, the list of parsable multiple sequence alignment file formats consists of:

  • msa-formats-clustal

  • msa-formats-stockholm

  • msa-formats-fasta

  • msa-formats-maf

SWIG Wrapper Notes

In the target scripting language, only the first and last argument, filename and options, are passed to the corresponding function. The other arguments, which serve as output in the C-library, are available as additional return values. This function exists as an overloaded version where the options parameter may be omitted! In that case, the options parameter defaults to RNA.FILE_FORMAT_MSA_STOCKHOLM. See, e.g. RNA.file_msa_read() in the Python API and Parsing Alignments in the Python examples.

Parameters:
  • filename (string) – The name of input file that contains the alignment

  • names (char ***) – An address to the pointer where sequence identifiers should be written to

  • aln (char ***) – An address to the pointer where aligned sequences should be written to

  • id (char **) – An address to the pointer where the alignment ID should be written to (may be NULL)

  • structure (char **) – An address to the pointer where consensus structure information should be written to (may be NULL)

  • options (unsigned int) – Options to manipulate the behavior of this function

Returns:

The number of sequences in the alignment, or -1 if no alignment record could be found

Return type:

int

See also

RNA.file_msa_read_record, RNA.FILE_FORMAT_MSA_CLUSTAL, RNA.FILE_FORMAT_MSA_STOCKHOLM, RNA.FILE_FORMAT_MSA_FASTA, RNA.FILE_FORMAT_MSA_MAF, RNA.FILE_FORMAT_MSA_DEFAULT, RNA.FILE_FORMAT_MSA_NOCHECK

Note

After successfully reading an alignment, this function performs a validation of the data that includes uniqueness of the sequence identifiers, and equal sequence lengths. This check can be deactivated by passing RNA.FILE_FORMAT_MSA_NOCHECK in the options parameter.

It is the user’s responsibility to free any memory occupied by the output arguments names, aln, id, and structure after calling this function. The function automatically sets the latter two arguments to NULL in case no corresponding data could be retrieved from the input alignment.
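The validation mentioned in the note above (unique sequence identifiers and equal sequence lengths) is easy to mirror in plain Python, for instance to check data before writing it out. The helper `validate_msa` is a hypothetical sketch and not the library's implementation.

```python
def validate_msa(names, aln):
    """Return True if identifiers are unique and all rows have equal length."""
    if len(names) != len(aln):
        return False
    if len(set(names)) != len(names):
        return False
    return len({len(row) for row in aln}) <= 1

names = ["seq1", "seq2"]
aln = ["GGGAAACCC", "GG-AAA-CC"]
print(validate_msa(names, aln))
```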

RNA.file_msa_read_record(FILE * filehandle, unsigned int options=) int

Read a multiple sequence alignment from file handle.

Similar to RNA.file_msa_read(), this function reads a multiple sequence alignment from an input file handle. Since using a file handle, this function is not limited to the first alignment record, but allows for looping over all alignments within the input.

The read alignment is split into the sequence id/name part and the actual sequence information and stored in memory as arrays of ids/names and sequences. If the alignment file format allows for additional information, such as an ID of the entire alignment or consensus structure information, this data is retrieved as well and made available. The options parameter allows to specify the alignment file format used to retrieve the data. A single format must be specified here, see RNA.file_msa_detect_format() for helping to determine the correct MSA file format.

Currently, the list of parsable multiple sequence alignment file formats consists of:

  • msa-formats-clustal

  • msa-formats-stockholm

  • msa-formats-fasta

  • msa-formats-maf

SWIG Wrapper Notes

In the target scripting language, only the first and last argument, fp and options, are passed to the corresponding function. The other arguments, which serve as output in the C-library, are available as additional return values. This function exists as an overloaded version where the options parameter may be omitted! In that case, the options parameter defaults to RNA.FILE_FORMAT_MSA_STOCKHOLM. See, e.g. RNA.file_msa_read_record() in the Python API and Parsing Alignments in the Python examples.

Parameters:
  • fp (FILE *) – The file pointer the data will be retrieved from

  • names (char ***) – An address to the pointer where sequence identifiers should be written to

  • aln (char ***) – An address to the pointer where aligned sequences should be written to

  • id (char **) – An address to the pointer where the alignment ID should be written to (may be NULL)

  • structure (char **) – An address to the pointer where consensus structure information should be written to (may be NULL)

  • options (unsigned int) – Options to manipulate the behavior of this function

Returns:

The number of sequences in the alignment, or -1 if no alignment record could be found

Return type:

int

See also

RNA.file_msa_read, RNA.file_msa_detect_format, RNA.FILE_FORMAT_MSA_CLUSTAL, RNA.FILE_FORMAT_MSA_STOCKHOLM, RNA.FILE_FORMAT_MSA_FASTA, RNA.FILE_FORMAT_MSA_MAF, RNA.FILE_FORMAT_MSA_DEFAULT, RNA.FILE_FORMAT_MSA_NOCHECK

Note

After successfully reading an alignment, this function performs a validation of the data that includes uniqueness of the sequence identifiers, and equal sequence lengths. This check can be deactivated by passing RNA.FILE_FORMAT_MSA_NOCHECK in the options parameter.

It is the user’s responsibility to free any memory occupied by the output arguments names, aln, id, and structure after calling this function. The function automatically sets the latter two arguments to NULL in case no corresponding data could be retrieved from the input alignment.
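Looping over all alignment records of one input, as opposed to reading only the first, can be pictured with a small Stockholm reader in plain Python. This is a simplified sketch under the assumption of well-formed input; it only understands sequence lines, `#=GF ID`, `#=GC SS_cons`, and the `//` terminator. The function name is hypothetical and the code is not the library's parser.

```python
import io

def read_stockholm_records(handle):
    """Yield (names, aln, id, structure) for each Stockholm record."""
    seqs, aln_id, structure, in_record = {}, None, "", False
    for raw in handle:
        line = raw.rstrip("\n")
        if line.startswith("# STOCKHOLM"):
            seqs, aln_id, structure, in_record = {}, None, "", True
        elif not in_record or not line.strip():
            continue
        elif line.startswith("//"):
            yield list(seqs), list(seqs.values()), aln_id, structure
            in_record = False
        elif line.startswith("#=GF ID"):
            aln_id = line.split(None, 2)[2].strip()
        elif line.startswith("#=GC SS_cons"):
            structure += line.split(None, 2)[2].strip()
        elif not line.startswith("#"):
            # interleaved blocks repeat the name, so chunks are concatenated
            name, chunk = line.split(None, 1)
            seqs[name] = seqs.get(name, "") + chunk.strip()

text = (
    "# STOCKHOLM 1.0\n#=GF ID first\n"
    "s1 GGGAAACCC\ns2 GGCAAAGCC\n#=GC SS_cons (((...)))\n//\n"
    "# STOCKHOLM 1.0\ns3 ACGU\n//\n"
)
records = list(read_stockholm_records(io.StringIO(text)))
for names, aln, aln_id, structure in records:
    print(len(names), aln_id, structure)
```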

RNA.file_msa_write(std::string filename, StringVector names, StringVector alignment, std::string id="", std::string structure="", std::string source="", unsigned int options=VRNA_FILE_FORMAT_MSA_STOCKHOLM|VRNA_FILE_FORMAT_MSA_APPEND) int

Write multiple sequence alignment file.

SWIG Wrapper Notes

In the target scripting language, this function exists as a set of overloaded versions, where the last four parameters may be omitted. If the options parameter is missing the options default to (RNA.FILE_FORMAT_MSA_STOCKHOLM | RNA.FILE_FORMAT_MSA_APPEND). See, e.g. RNA.file_msa_write() in the Python API.

Parameters:
  • filename (string) – The output filename

  • names (const char **) – The array of sequence names / identifiers

  • aln (const char **) – The array of aligned sequences

  • id (string) – An optional ID for the alignment

  • structure (string) – An optional consensus structure

  • source (string) – A string describing the source of the alignment

  • options (unsigned int) – Options to manipulate the behavior of this function

Returns:

Non-null upon successfully writing the alignment to file

Return type:

int

See also

RNA.FILE_FORMAT_MSA_STOCKHOLM, RNA.FILE_FORMAT_MSA_APPEND, RNA.FILE_FORMAT_MSA_MIS

Note

Currently, we only support msa-formats-stockholm output

RNA.filename_sanitize(*args)

Sanitize a file name.

Returns a new file name where all invalid characters are substituted by a replacement character. If no replacement character is supplied, invalid characters are simply removed from the filename. File names may also never exceed a length of 255 characters. Longer file names will undergo a ‘smart’ truncation process, where the filename’s suffix, i.e. everything after the last dot ‘.’, is kept intact where possible. Hence, only the filename part before the suffix is reduced in such a way that the total filename complies with the length restriction of 255 characters. If no suffix is present or the suffix itself already exceeds the maximum length, the filename is simply truncated from the back of the string.

For now we consider the following characters invalid:

  • backslash ‘\’

  • slash ‘/’

  • question mark ‘?’

  • percent sign ‘%’

  • asterisk ‘*’

  • colon ‘:’

  • pipe symbol ‘|’

  • double quote ‘”’

  • triangular brackets ‘<’ and ‘>’

Furthermore, the (resulting) file name must not be a reserved file name, such as:

  • ‘.’

  • ‘..’

Parameters:
  • name (string) – The input file name

  • replacement (string) – The replacement character, or NULL

Returns:

The sanitized file name, or NULL

Return type:

string

Note

This function allocates a new block of memory for the sanitized string. It also may return (a) NULL if the input is pointing to NULL, or (b) an empty string if the input only consists of invalid characters which are simply removed!
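The rules above translate into a short pure-Python sketch. It is an approximation written from the description, not the library's code: the helper name `sanitize_filename` is hypothetical, and returning None for the reserved names ‘.’ and ‘..’ is an assumption, since the text does not say how such names are handled.

```python
INVALID_CHARS = set('\\/?%*:|"<>')
MAX_LENGTH = 255

def sanitize_filename(name, replacement=None):
    """Replace or drop invalid characters and enforce the length limit."""
    if name is None:
        return None
    repl = replacement or ""
    cleaned = "".join(repl if c in INVALID_CHARS else c for c in name)
    if len(cleaned) > MAX_LENGTH:
        stem, dot, suffix = cleaned.rpartition(".")
        if dot and len(suffix) + 1 < MAX_LENGTH:
            # keep the suffix, shorten only the part in front of it
            cleaned = stem[:MAX_LENGTH - len(suffix) - 1] + "." + suffix
        else:
            cleaned = cleaned[:MAX_LENGTH]
    if cleaned in (".", ".."):
        return None  # assumption: reserved names are rejected
    return cleaned

print(sanitize_filename("a/b:c.txt", "_"))
print(len(sanitize_filename("a" * 300 + ".txt")))
```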

RNA.find_saddle(seq, s1, s2, width)

Find energy of a saddle point between 2 structures (search only direct path)

Deprecated since version 2.7.0: Use RNA.path_findpath_saddle() instead!

Parameters:
  • seq (string) – RNA sequence

  • s1 (string) – A pointer to the character array where the first secondary structure in dot-bracket notation will be written to

  • s2 (string) – A pointer to the character array where the second secondary structure in dot-bracket notation will be written to

  • width (int) – The number of structures kept during the search

Returns:

the saddle energy in 10cal/mol

Return type:

int

class RNA.floatArray(nelements)

Bases: object

cast()
static frompointer(t)
property thisown

The membership flag

RNA.floatArray_frompointer(t)
RNA.floatP_getitem(ary, index)
RNA.floatP_setitem(ary, index, value)
RNA.fold(string) -> (structure, mfe)

Compute Minimum Free Energy (MFE), and a corresponding secondary structure for an RNA sequence.

This simplified interface to RNA.fold_compound.mfe() computes the MFE and, if required, a secondary structure for an RNA sequence using default options. Memory required for dynamic programming (DP) matrices will be allocated and free’d on-the-fly. Hence, after return of this function, the recursively filled matrices are not available any more for any post-processing, e.g. suboptimal backtracking, etc.

SWIG Wrapper Notes

This function is available as function fold() in the global namespace. The parameter structure is returned along with the MFE and must not be provided. See e.g. RNA.fold() in the Python API.

Parameters:
  • sequence (string) – RNA sequence

  • structure (string) – A pointer to the character array where the secondary structure in dot-bracket notation will be written to

Returns:

the minimum free energy (MFE) in kcal/mol

Return type:

float

Note

In case you want to use the filled DP matrices for any subsequent post-processing step, or you require other conditions than specified by the default model details, use RNA.fold_compound.mfe(), and the data structure RNA.fold_compound() instead.
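A minimal usage sketch of the simplified interface follows. The example sequence and the wrapper name `fold_default` are made up for illustration; the import is guarded only so that the snippet can still be loaded on systems where the bindings are not installed.

```python
try:
    import RNA  # ViennaRNA Python bindings (python -m pip install viennarna)
except ImportError:
    RNA = None  # bindings not installed; the call below is skipped

def fold_default(sequence):
    """Return (structure, mfe) computed with default model settings."""
    structure, mfe = RNA.fold(sequence)
    return structure, mfe

if RNA is not None:
    structure, mfe = fold_default("GGGGAAAACCCC")
    print(f"{structure} ({mfe:6.2f} kcal/mol)")
```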

class RNA.fold_compound(fold_compound self, char const * sequence, md md=None, unsigned int options=)
class RNA.fold_compound(fold_compound self, StringVector alignment, md md=None, unsigned int options=) fold_compound
class RNA.fold_compound(fold_compound self, char const * sequence, char * s1, char * s2, md md=None, unsigned int options=) fold_compound

Bases: object

The most basic data structure required by many functions throughout the RNAlib.

Note

Please read the documentation of this data structure carefully! Some attributes are only available for specific types this data structure can adopt.

Warning

Reading/Writing from/to attributes that are not within the scope of the current type usually result in undefined behavior!

See also

RNA.fold_compound, RNA.fold_compound_comparative, RNA.fold_compound_free, RNA.FC_TYPE_SINGLE, RNA.FC_TYPE_COMPARATIVE

This data structure is wrapped as class fold_compound with several related functions attached as methods.

A new fold_compound can be obtained by calling one of its constructors:

  • fold_compound(seq) - Initialize with a single sequence, or two concatenated sequences separated by an ampersand character & (for cofolding)

  • fold_compound(aln) - Initialize with a sequence alignment aln stored as a list of sequences (with gap characters).

The resulting object has a list of attached methods which in most cases directly correspond to functions that mainly operate on the corresponding C data structure:

  • type() - Get the type of the fold_compound (See RNA.fc_type)

  • length() - Get the length of the sequence(s) or alignment stored within the fold_compound.

See, e.g. RNA.fold_compound in the Python API.
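The constructor and method pattern described above might be used as in the following sketch, which folds a single sequence with a non-default temperature. The sequence, the temperature, and the wrapper name `mfe_at_temperature` are illustrative choices; the import is guarded so the snippet can be loaded without the bindings.

```python
try:
    import RNA  # ViennaRNA Python bindings
except ImportError:
    RNA = None  # bindings not installed; the call below is skipped

def mfe_at_temperature(sequence, temperature_c):
    """Fold `sequence` with a fold_compound using custom model details."""
    md = RNA.md()                  # model details with default settings
    md.temperature = temperature_c
    fc = RNA.fold_compound(sequence, md)
    structure, mfe = fc.mfe()      # method attached to the fold_compound
    return structure, mfe

if RNA is not None:
    structure, mfe = mfe_at_temperature("GGGGAAAACCCC", 25.0)
    print(f"{structure} ({mfe:6.2f} kcal/mol)")
```

Unlike the global RNA.fold() function, the fold_compound object stays alive after the call, so further methods can reuse the same model settings.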

type

The type of the RNA.fold_compound().

Currently possible values are RNA.FC_TYPE_SINGLE, and RNA.FC_TYPE_COMPARATIVE

Warning

Do not edit this attribute, it will be automagically set by the corresponding get() methods for the RNA.fold_compound(). The value specified in this attribute dictates the set of other attributes to use within this data structure.

Type:

const vrna_fc_type_e

length

The length of the sequence (or sequence alignment)

Type:

unsigned int

cutpoint

The position of the (cofold) cutpoint within the provided sequence. If there is no cutpoint, this field will be set to -1.

Type:

int

strand_number

The strand number a particular nucleotide is associated with.

Type:

list-like(unsigned int)

strand_order

The strand order, i.e. permutation of current concatenated sequence.

Type:

list-like(unsigned int)

strand_order_uniq

The strand order array where identical sequences have the same ID.

Type:

list-like(unsigned int)

strand_start

The start position of a particular strand within the current concatenated sequence.

Type:

list-like(unsigned int)

strand_end

The end (last) position of a particular strand within the current concatenated sequence.

Type:

list-like(unsigned int)

strands

Number of interacting strands.

Type:

unsigned int
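The strand-related attributes above can be pictured with a small pure-Python sketch that does not call the library. The helper name strand_layout is hypothetical, and the conventions it uses (1-based positions, strands numbered from 0, cutpoint marking the first nucleotide of the second strand) are assumptions made for illustration, not statements about what RNAlib stores.

```python
def strand_layout(sequence):
    """Describe how an '&'-separated input maps onto strand bookkeeping.

    Assumed conventions (not confirmed by the text above): positions are
    1-based and strands are numbered from 0.
    """
    strands = sequence.split("&")
    starts, ends, strand_number = [], [], []
    position = 1
    for number, strand in enumerate(strands):
        starts.append(position)
        position += len(strand)
        ends.append(position - 1)
        # one strand id per nucleotide of the concatenated sequence
        strand_number.extend([number] * len(strand))
    return {
        "concatenated": "".join(strands),
        "strands": len(strands),
        "strand_start": starts,
        "strand_end": ends,
        "strand_number": strand_number,
        # assumed: cutpoint marks the first nucleotide of the second strand
        "cutpoint": starts[1] if len(strands) > 1 else -1,
    }


layout = strand_layout("GGGG&CCCC")
```

For a single sequence without an ampersand, the sketch reports one strand and a cutpoint of -1, matching the description of the cutpoint attribute.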

nucleotides

Set of nucleotide sequences.

Type:

vrna_seq_t *

alignment

Set of alignments.

Type:

vrna_msa_t *

hc

The hard constraints data structure used for structure prediction.

Type:

vrna_hc_t *

matrices

The MFE DP matrices.

Type:

vrna_mx_mfe_t *

exp_matrices

The PF DP matrices

Type:

vrna_mx_pf_t *

params

The precomputed free energy contributions for each type of loop.

Type:

param

exp_params

The precomputed free energy contributions as Boltzmann factors

Type:

exp_param

iindx

DP matrix accessor

Type:

int *

jindx

DP matrix accessor

Type:

int *

stat_cb

Recursion status callback (usually called just before, and after, recursive computations in the library).

See also

RNA.recursion_status, RNA.fold_compound.add_callback

Type:

vrna_recursion_status_f

auxdata

A pointer to auxiliary, user-defined data.

Type:

void *

free_auxdata

A callback to free auxiliary user data whenever the fold_compound itself is free’d.

See also

RNA.fold_compound, RNA.auxdata_free

Type:

vrna_auxdata_free_f

domains_struc

Additional structured domains.

Type:

vrna_sd_t *

domains_up

Additional unstructured domains.

Type:

vrna_ud_t *

aux_grammar

Additional decomposition grammar rules.

Type:

vrna_gr_aux_t

sequence

The input sequence string.

Warning

Only available if

type==RNA.FC_TYPE_SINGLE

Type:

string

sequence_encoding

Numerical encoding of the input sequence.

See also

RNA.sequence_encode

Warning

Only available if

type==RNA.FC_TYPE_SINGLE

Type:

list-like(int)

encoding5
Type:

list-like(int)

encoding3
Type:

list-like(int)

sequence_encoding2
Type:

list-like(int)

ptype

Pair type array.

Contains the numerical encoding of the pair type for each pair (i,j) used in MFE, Partition function and Evaluation computations.

Note

This array is always indexed via jindx, in contrast to previously different indexing between mfe and pf variants!

Warning

Only available if

type==RNA.FC_TYPE_SINGLE

See also

RNA.idx_col_wise, RNA.ptypes

Type:

string

ptype_pf_compat

ptype array indexed via iindx

Deprecated since version 2.7.0: This attribute will vanish in the future! It’s meant for backward compatibility only!

Warning

Only available if

type==RNA.FC_TYPE_SINGLE

Type:

string

sc

The soft constraints for usage in structure prediction and evaluation.

Warning

Only available if

type==RNA.FC_TYPE_SINGLE

Type:

vrna_sc_t *

sequences

The aligned sequences.

Note

The end of the alignment is indicated by a NULL pointer in the second dimension

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

char **

n_seq

The number of sequences in the alignment.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

unsigned int

cons_seq

The consensus sequence of the aligned sequences.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

string

S_cons

Numerical encoding of the consensus sequence.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

list-like(int)

S

Numerical encoding of the sequences in the alignment.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

short **

S5

S5[s][i] holds next base 5’ of i in sequence s.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

short **

S3

S3[s][i] holds next base 3’ of i in sequence s.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

short **

Ss
Type:

char **

a2s
Type:

list-like(list-like(unsigned int))

pscore

Precomputed array of pair types expressed as pairing scores.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

int *

pscore_local

Precomputed array of pair types expressed as pairing scores.

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

int **

pscore_pf_compat

Precomputed array of pair types expressed as pairing scores indexed via iindx.

Deprecated since version 2.7.0: This attribute will vanish in the future!

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

list-like(int)

scs

A set of soft constraints (for each sequence in the alignment)

Warning

Only available if

type==RNA.FC_TYPE_COMPARATIVE

Type:

vrna_sc_t **

oldAliEn
Type:

int

maxD1

Maximum allowed base pair distance to first reference.

Type:

unsigned int

maxD2

Maximum allowed base pair distance to second reference.

Type:

unsigned int

reference_pt1

A pairtable of the first reference structure.

Type:

list-like(int)

reference_pt2

A pairtable of the second reference structure.

Type:

list-like(int)

referenceBPs1

Matrix containing number of basepairs of reference structure1 in interval [i,j].

Type:

list-like(unsigned int)

referenceBPs2

Matrix containing number of basepairs of reference structure2 in interval [i,j].

Type:

list-like(unsigned int)

bpdist

Matrix containing base pair distance of reference structure 1 and 2 on interval [i,j].

Type:

list-like(unsigned int)

mm1

Maximum matching matrix, reference struct 1 disallowed.

Type:

list-like(unsigned int)

mm2

Maximum matching matrix, reference struct 2 disallowed.

Type:

list-like(unsigned int)

window_size

window size for local folding sliding window approach

Type:

int

ptype_local

Pair type array (for local folding)

Type:

char **

zscore_data

Data structure with settings for z-score computations.

Type:

vrna_zsc_dat_t


E_ext_hp_loop(i, j)
E_ext_int_loop(i, j)
E_hp_loop(i, j)
E_int_loop(i, j)
E_stack(i, j)
MEA(fold_compound self) char
MEA(fold_compound self, double gamma) char *

Compute a MEA (maximum expected accuracy) structure.

The algorithm maximizes the expected accuracy

\[A(S) = \sum_{(i,j) \in S} 2 \gamma p_{ij} + \sum_{i \notin S} p^u_{i}\]

Higher values of \(\gamma\) result in more base pairs of lower probability and thus higher sensitivity. Low values of \(\gamma\) result in structures containing only highly likely pairs (high specificity). The code of the MEA function also demonstrates the use of a sparse dynamic programming scheme to reduce the time and memory complexity of folding.

Precondition

RNA.fold_compound.pf() must be executed on input parameter fc

SWIG Wrapper Notes

This function is attached as overloaded method MEA(gamma = 1.) to objects of type fold_compound. Note that it returns the MEA structure and MEA value as a tuple (MEA_structure, MEA). See, e.g. RNA.fold_compound.MEA() in the Python API.

Parameters:
  • gamma (double) – The weighting factor for base pairs vs. unpaired nucleotides

  • mea (list-like(double)) – A pointer to a variable where the MEA value will be written to

Returns:

An MEA structure (or NULL on any error)

Return type:

string
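To make the role of \(\gamma\) concrete, the following pure-Python sketch evaluates the expected accuracy \(A(S)\) from the formula above for a given structure. It does not call the library; the helper names and the toy probabilities are made up for illustration, and \(p^u_i\) is assumed to be one minus the summed pairing probability of nucleotide \(i\).

```python
def pairs_from_dot_bracket(structure):
    """Return the set of 1-based base pairs (i, j), i < j, of a dot-bracket string."""
    stack, pairs = [], set()
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            pairs.add((stack.pop(), pos))
    return pairs


def expected_accuracy(structure, pair_probs, gamma=1.0):
    """Evaluate A(S) for a structure and a dict {(i, j): p_ij} with i < j."""
    n = len(structure)
    pairs = pairs_from_dot_bracket(structure)
    # summed pairing probability per nucleotide (index 0 unused)
    paired = [0.0] * (n + 1)
    for (i, j), p in pair_probs.items():
        paired[i] += p
        paired[j] += p
    in_structure = {k for pair in pairs for k in pair}
    pair_term = sum(2.0 * gamma * pair_probs.get(pair, 0.0) for pair in pairs)
    # assumed: p^u_i = 1 - sum_j p_ij
    unpaired_term = sum(1.0 - paired[i]
                        for i in range(1, n + 1) if i not in in_structure)
    return pair_term + unpaired_term


# Toy pair probabilities for a 6-nt sequence (invented values).
toy_probs = {(1, 6): 0.9, (2, 5): 0.6}
score_full = expected_accuracy("((..))", toy_probs, gamma=1.0)
score_outer = expected_accuracy("(....)", toy_probs, gamma=1.0)
```

With these toy values, gamma = 1.0 favours the structure with both pairs, whereas a small value such as gamma = 0.1 scores the open chain higher than either paired structure, which is the specificity effect described above.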

add_auxdata(fold_compound self, PyObject * data, PyObject * PyFuncOrNone=Py_None) PyObject *

Add auxiliary data to the RNA.fold_compound().

This function allows one to bind arbitrary data to a RNA.fold_compound() which may later on be used by one of the callback functions, e.g. RNA.recursion_status(). To allow for proper cleanup of the memory occupied by this auxiliary data, the user may also provide a pointer to a cleanup function that frees the corresponding memory. This function will be called automatically when the RNA.fold_compound() is freed with RNA.fold_compound_free().

Parameters:
  • data (void *) – A pointer to an arbitrary data structure

  • f (RNA.auxdata_free) – A pointer to a function that frees memory occupied by the arbitrary data (May be NULL)

See also

RNA.auxdata_free

Note

Before attaching the arbitrary data pointer, this function will call the RNA.auxdata_free() on any pre-existing data that is already attached.

add_callback(fold_compound self, PyObject * PyFunc) PyObject *

Add a recursion status callback to the RNA.fold_compound().

Binding a recursion status callback function to a RNA.fold_compound() allows one to perform arbitrary operations just before, or after, an actual recursive computation, e.g. MFE prediction, is performed by the RNAlib. The callback function will be provided with a pointer to its RNA.fold_compound(), and a status message. Hence, it has complete access to all variables that influence the recursive computations.

Parameters:

f (RNA.recursion_status) – The pointer to the recursion status callback function

See also

RNA.recursion_status, RNA.fold_compound, RNA.STATUS_MFE_PRE, RNA.STATUS_MFE_POST, RNA.STATUS_PF_PRE, RNA.STATUS_PF_POST

backtrack(fold_compound self, unsigned int length) char
backtrack(fold_compound self) char *

Backtrack an MFE (sub)structure.

This function allows one to backtrack the MFE structure for a (sub)sequence

Precondition

Requires pre-filled MFE dynamic programming matrices, i.e. one has to call RNA.fold_compound.mfe() prior to calling this function

SWIG Wrapper Notes

This function is attached as overloaded method backtrack() to objects of type fold_compound. The parameter length defaults to the total length of the RNA sequence and may be omitted. The parameter structure is returned along with the MFE and must not be provided. See, e.g. RNA.fold_compound.backtrack() in the Python API.

Parameters:
  • length (unsigned int) – The length of the subsequence, starting from the 5’ end

  • structure (string) – A pointer to the character array where the secondary structure in dot-bracket notation will be written to. (Must have size of at least length + 1)

Returns:

The minimum free energy (MFE) for the specified length in kcal/mol and a corresponding secondary structure in dot-bracket notation (stored in structure)

Return type:

float

Note

On error, the function returns INF / 100. and stores the empty string in structure.

benchmark(fold_compound self, std::string gold, int fuzzy=0, unsigned int options=8) score
bpp()
centroid(fold_compound self) char *

Get the centroid structure of the ensemble.

The centroid is the structure with the minimal average distance to all other structures: \(<d(S)> = \sum_{(i,j) \in S} (1-p_{ij}) + \sum_{(i,j) \notin S} p_{ij}\). Thus, the centroid is simply the structure containing all pairs with \(p_{ij} > 0.5\). The distance of the centroid to the ensemble is written to the memory addressed by dist.

Parameters:

dist (list-like(double)) – A pointer to the distance variable where the centroid distance will be written to

Returns:

The centroid structure of the ensemble in dot-bracket notation (NULL on error)

Return type:

string
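The centroid rule can be sketched in a few lines of pure Python, independent of the library. The helper name centroid_from_probs and the toy probabilities are hypothetical; the sketch only applies the threshold of 0.5 and the distance formula given above.

```python
def centroid_from_probs(n, pair_probs):
    """Build the centroid (all pairs with p_ij > 0.5) and its ensemble distance.

    pair_probs maps 1-based pairs (i, j) with i < j to their probability.
    """
    structure = ["."] * n
    for (i, j), p in pair_probs.items():
        if p > 0.5:
            structure[i - 1], structure[j - 1] = "(", ")"
    # <d(S)>: (1 - p_ij) for pairs in S, p_ij for all pairs not in S
    distance = sum((1.0 - p) if p > 0.5 else p for p in pair_probs.values())
    return "".join(structure), distance


# Toy pair probabilities for a 6-nt sequence (invented values).
toy_probs = {(1, 6): 0.9, (2, 5): 0.6, (2, 4): 0.3}
centroid, dist = centroid_from_probs(6, toy_probs)
```

Pairs with a probability above 0.5 cannot conflict with each other, since the pairing probabilities of one nucleotide sum to at most 1, so no further consistency check is needed in this sketch.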

commands_apply(fold_compound self, cmd commands, unsigned int options=) int

Apply a list of commands to a RNA.fold_compound().

SWIG Wrapper Notes

This function is attached as method commands_apply() to objects of type fold_compound. See, e.g. RNA.fold_compound.commands_apply() in the Python API .

Parameters:
  • commands (RNA.cmd()) – The commands to apply

  • options (unsigned int) – Options to limit the type of commands read from the file

Returns:

The number of commands successfully applied

Return type:

int

constraints_add(fold_compound self, char const * constraint, unsigned int options=)

Add constraints to a RNA.fold_compound() data structure.

Use this function to add/update the hard/soft constraints. The function allows for passing a string ‘constraint’ that can either be a filename that points to a constraints definition file or it may be a pseudo dot-bracket notation indicating hard constraints. For the latter, the user has to pass the RNA.CONSTRAINT_DB option. Also, the user has to specify which characters are allowed to be interpreted as constraints by passing the corresponding options via the third parameter.

The following is an example for adding hard constraints given in pseudo dot-bracket notation. Here, fc is the RNA.fold_compound() object, structure is a char array with the hard constraint in dot-bracket notation, and enforceConstraints is a flag indicating whether or not constraints for base pairs should be enforced instead of just doing a removal of base pair that conflict with the constraint.
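A minimal Python sketch of such a call might look as follows. The sequence and constraint strings are invented, the helper constraint_matches_sequence is a hypothetical sanity check, and the flag RNA.CONSTRAINT_DB_ENFORCE_BP is not listed in this section and is assumed to be available in the installed version of the bindings. The import is guarded so that the sketch still runs where the RNA module is not installed.

```python
try:
    import RNA  # ViennaRNA Python bindings; optional for this sketch
except ImportError:
    RNA = None


def constraint_matches_sequence(sequence, constraint):
    """Cheap sanity check: same length and balanced round brackets."""
    depth = 0
    for char in constraint:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if depth < 0:
            return False
    return depth == 0 and len(constraint) == len(sequence)


sequence = "GGGGAAAACCCCAAAA"
structure = "((((....))))xxxx"   # hard constraint in pseudo dot-bracket notation
enforce_constraints = True

if RNA is not None and constraint_matches_sequence(sequence, structure):
    fc = RNA.fold_compound(sequence)
    # RNA.CONSTRAINT_DB marks the string as dot-bracket input,
    # RNA.CONSTRAINT_DB_DEFAULT allows all known constraint characters
    options = RNA.CONSTRAINT_DB | RNA.CONSTRAINT_DB_DEFAULT
    if enforce_constraints:
        # assumed flag name, see the note above
        options |= RNA.CONSTRAINT_DB_ENFORCE_BP
    fc.constraints_add(structure, options)
    constrained_structure, constrained_mfe = fc.mfe()
```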

In contrast to the above, constraints may also be read from file:

Parameters:
  • constraint (string) – A string with either the filename of the constraint definitions or a pseudo dot-bracket notation of the hard constraint. May be NULL.

  • options (unsigned int) – The option flags

See also

RNA.fold_compound.hc_add_from_db, RNA.fold_compound.hc_add_up, RNA.hc_add_up_batch, RNA.hc_add_bp_unspecific, RNA.fold_compound.hc_add_bp, RNA.fold_compound.hc_init, RNA.fold_compound.sc_set_up, RNA.fold_compound.sc_set_bp, RNA.fold_compound.sc_add_SHAPE_deigan, RNA.fold_compound.sc_add_SHAPE_zarringhalam, RNA.hc_free, RNA.sc_free, RNA.CONSTRAINT_DB, RNA.CONSTRAINT_DB_DEFAULT, RNA.CONSTRAINT_DB_PIPE, RNA.CONSTRAINT_DB_DOT, RNA.CONSTRAINT_DB_X, RNA.CONSTRAINT_DB_ANG_BRACK, RNA.CONSTRAINT_DB_RND_BRACK, RNA.CONSTRAINT_DB_INTRAMOL, RNA.CONSTRAINT_DB_INTERMOL, RNA.CONSTRAINT_DB_GQUAD

db_from_probs()
ensemble_defect(*args)

Compute the Ensemble Defect for a given target structure.

This is a wrapper around RNA.ensemble_defect_pt(). Given a target structure \(s\), compute the average dissimilarity of a randomly drawn structure from the ensemble, i.e.:

\[ED(s) = 1 - \frac{1}{n} \sum_{ij, (i,j) \in s} p_{ij} - \frac{1}{n} \sum_{i}(1 - s_{i})q_{i}\]

with sequence length \(n\), the probability \(p_{ij}\) of a base pair \((i,j)\), the probability \(q_{i} = 1 - \sum_{j} p_{ij}\) of nucleotide \(i\) being unpaired, and the indicator variable \(s_{i} = 1\) if \(\exists (i,j) \in s\), and \(s_{i} = 0\) otherwise.

Precondition

The RNA.fold_compound() input parameter fc must contain a valid base pair probability matrix. This means that partition function and base pair probabilities must have been computed using fc before execution of this function!

SWIG Wrapper Notes

This function is attached as method ensemble_defect() to objects of type fold_compound. Note that the SWIG wrapper takes a structure in dot-bracket notation and converts it into a pair table using RNA.ptable_from_string(). The resulting pair table is then internally passed to RNA.ensemble_defect_pt(). To control which kind of matching brackets will be used during conversion, the optional argument options can be used. See also the description of RNA.ptable_from_string() for available options. (default: RNA.BRACKETS_RND). See, e.g. RNA.fold_compound.ensemble_defect() in the Python API.

Parameters:

structure (string) – A target structure in dot-bracket notation

Returns:

The ensemble defect with respect to the target structure, or -1. upon failure, e.g. if preconditions are not met

Return type:

double

See also

RNA.fold_compound.pf, RNA.pairing_probs, RNA.ensemble_defect_pt
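The formula for \(ED(s)\) can be checked by hand with a small pure-Python sketch that does not call the library. The function below is a hypothetical re-implementation for illustration only. It assumes that the first sum runs over both nucleotides of every pair in \(s\), so that each pair contributes \(2 p_{ij}\) and an ensemble consisting only of \(s\) has a defect of 0.

```python
def ensemble_defect(structure, pair_probs):
    """Evaluate ED(s) from a dict {(i, j): p_ij} with 1-based i < j."""
    n = len(structure)
    stack, partner = [], {}
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            i = stack.pop()
            partner[i], partner[pos] = pos, i
    # summed pairing probability per nucleotide (index 0 unused)
    paired_prob = [0.0] * (n + 1)
    for (i, j), p in pair_probs.items():
        paired_prob[i] += p
        paired_prob[j] += p
    agreement = 0.0
    for i in range(1, n + 1):
        if i in partner:
            # assumed: every paired nucleotide contributes p_ij once
            pair = (min(i, partner[i]), max(i, partner[i]))
            agreement += pair_probs.get(pair, 0.0)
        else:
            # q_i = 1 - sum_j p_ij for positions unpaired in s
            agreement += 1.0 - paired_prob[i]
    return 1.0 - agreement / n


# Toy pair probabilities for a 6-nt sequence (invented values).
toy_probs = {(1, 6): 0.9, (2, 5): 0.6}
defect = ensemble_defect("((..))", toy_probs)
```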

eval_covar_structure(structure)

Calculate the pseudo energy derived by the covariance scores of a set of aligned sequences.

Consensus structure prediction is driven by covariance scores of base pairs in rows of the provided alignment. This function allows one to retrieve the total amount of these covariance pseudo energy scores. The RNA.fold_compound() does not need to contain any DP matrices, but requires all the most basic init values as one would get from a call like this:

SWIG Wrapper Notes

This function is attached as method eval_covar_structure() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_covar_structure() in the Python API .

Parameters:

structure (string) – Secondary (consensus) structure in dot-bracket notation

Returns:

The covariance pseudo energy score of the input structure given the input sequence alignment in kcal/mol

Return type:

float

See also

RNA.fold_compound_comparative, RNA.fold_compound.eval_structure

Note

Accepts RNA.fold_compound() of type RNA.FC_TYPE_COMPARATIVE only!

eval_ext_hp_loop(i, j)
eval_ext_stem(i, j)
eval_hp_loop(fold_compound self, int i, int j) int

SWIG Wrapper Notes

This function is attached as method eval_hp_loop() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_hp_loop() in the Python API .

eval_int_loop(fold_compound self, int i, int j, int k, int l) int

SWIG Wrapper Notes

This function is attached as method eval_int_loop() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_int_loop() in the Python API .

eval_loop_pt(*args)

Calculate energy of a loop.

SWIG Wrapper Notes

This function is attached as method eval_loop_pt() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_loop_pt() in the Python API .

Parameters:
  • i (int) – position of covering base pair

  • pt (const short *) – the pair table of the secondary structure

Returns:

free energy of the loop in 10cal/mol

Return type:

int

eval_move(structure, m1, m2)

Calculate energy of a move (closing or opening of a base pair)

If the parameters m1 and m2 are negative, the move is a deletion (opening) of a base pair, otherwise it is an insertion (closing).

SWIG Wrapper Notes

This function is attached as method eval_move() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_move() in the Python API .

Parameters:
  • structure (string) – secondary structure in dot-bracket notation

  • m1 (int) – first coordinate of base pair

  • m2 (int) – second coordinate of base pair

Returns:

energy change of the move in kcal/mol (INF / 100. upon any error)

Return type:

float
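The sign convention for m1 and m2 can be illustrated with a pure-Python sketch that applies a move to a dot-bracket string. The helper apply_move is hypothetical and not part of the API. It only edits the string and checks neither that the two positions are actual pairing partners nor that the resulting structure is free of crossing pairs.

```python
def apply_move(structure, m1, m2):
    """Return the dot-bracket string after a base pair move (1-based positions).

    Negative m1 and m2 delete (open) the pair (-m1, -m2),
    positive values insert (close) the pair (m1, m2).
    """
    chars = list(structure)
    if m1 < 0 and m2 < 0:
        i, j = -m1, -m2
        if chars[i - 1] != "(" or chars[j - 1] != ")":
            raise ValueError("no base pair to delete at these positions")
        chars[i - 1] = chars[j - 1] = "."
    elif m1 > 0 and m2 > 0:
        if chars[m1 - 1] != "." or chars[m2 - 1] != ".":
            raise ValueError("positions must be unpaired to insert a pair")
        chars[m1 - 1], chars[m2 - 1] = "(", ")"
    else:
        raise ValueError("m1 and m2 must have the same sign and be non-zero")
    return "".join(chars)


closed = apply_move("(....)", 2, 5)      # insertion of pair (2, 5)
opened = apply_move("((..))", -2, -5)    # deletion of pair (2, 5)
```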

eval_move_pt(*args)

Calculate energy of a move (closing or opening of a base pair)

If the parameters m1 and m2 are negative, the move is a deletion (opening) of a base pair, otherwise it is an insertion (closing).

SWIG Wrapper Notes

This function is attached as method eval_move_pt() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_move_pt() in the Python API .

Parameters:
  • pt (list-like(int)) – the pair table of the secondary structure

  • m1 (int) – first coordinate of base pair

  • m2 (int) – second coordinate of base pair

Returns:

energy change of the move in 10cal/mol

Return type:

int

eval_structure(structure)

Calculate the free energy of an already folded RNA.

This function allows for energy evaluation of a given pair of structure and sequence (alignment). Model details, energy parameters, and possibly soft constraints are used as provided via the parameter ‘fc’. The RNA.fold_compound() does not need to contain any DP matrices, but requires all the most basic init values as one would get from a call like this:
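In Python, such a call might be sketched as shown here. The example assumes that the constructor accepts a model details argument (None for defaults) followed by an options argument, and that the constant RNA.OPTION_EVAL_ONLY is available; neither is stated in this section. The helper name evaluate and the input strings are made up, and the import is guarded so the sketch also runs without the bindings installed.

```python
try:
    import RNA  # ViennaRNA Python bindings; optional for this sketch
except ImportError:
    RNA = None


def evaluate(sequence, structure):
    """Return the free energy in kcal/mol, or None if the bindings are missing."""
    if RNA is None:
        return None
    # assumed: OPTION_EVAL_ONLY prepares the fold_compound for energy
    # evaluation without allocating DP matrices
    fc = RNA.fold_compound(sequence, None, RNA.OPTION_EVAL_ONLY)
    return fc.eval_structure(structure)


energy = evaluate("GGGGAAAACCCC", "((((....))))")
```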

SWIG Wrapper Notes

This function is attached as method eval_structure() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_structure() in the Python API .

Parameters:

structure (string) – Secondary structure in dot-bracket notation

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

Note

Accepts RNA.fold_compound() of type RNA.FC_TYPE_SINGLE and RNA.FC_TYPE_COMPARATIVE

eval_structure_pt(*args)

Calculate the free energy of an already folded RNA.

This function allows for energy evaluation of a given sequence/structure pair where the structure is provided in pair_table format as obtained from RNA.ptable(). Model details, energy parameters, and possibly soft constraints are used as provided via the parameter ‘fc’. The fold_compound does not need to contain any DP matrices, but requires all the most basic init values as one would get from a call like this:

SWIG Wrapper Notes

This function is attached as method eval_structure_pt() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_structure_pt() in the Python API .

Parameters:

pt (const short *) – Secondary structure as pair_table

Returns:

The free energy of the input structure given the input sequence in 10cal/mol

Return type:

int

eval_structure_pt_verbose(*args)

Calculate the free energy of an already folded RNA.

This function is a simplified version of RNA.eval_structure_simple_v() that uses the default verbosity level.

SWIG Wrapper Notes

This function is attached as method eval_structure_pt_verbose() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_structure_pt_verbose() in the Python API .

Parameters:
  • pt (const short *) – Secondary structure as pair_table

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in 10cal/mol

Return type:

int

eval_structure_verbose(structure, nullfile=None)

Calculate the free energy of an already folded RNA and print contributions on a per-loop basis.

This function is a simplified version of RNA.eval_structure_v() that uses the default verbosity level.

SWIG Wrapper Notes

This function is attached as method eval_structure_verbose() to objects of type fold_compound. See, e.g. RNA.fold_compound.eval_structure_verbose() in the Python API .

Parameters:
  • structure (string) – Secondary structure in dot-bracket notation

  • file (FILE *) – A file handle where this function should print to (may be NULL).

Returns:

The free energy of the input structure given the input sequence in kcal/mol

Return type:

float

exp_E_ext_stem(i, j)
exp_E_hp_loop(i, j)
exp_E_int_loop(i, j)
exp_E_interior_loop(i, j, k, l)
property exp_matrices
property exp_params
exp_params_rescale(*args)

Rescale Boltzmann factors for partition function computations.

This function may be used to (automatically) rescale the Boltzmann factors used in partition function computations. Since partition functions over subsequences can easily become extremely large, the RNAlib internally rescales them to avoid numerical over- and/or underflow. Therefore, a proper scaling factor \(s\) needs to be chosen that in turn is then used to normalize the corresponding partition functions \(\hat{q}[i,j] = q[i,j] / s^{(j-i+1)}\).
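The effect of the normalization \(\hat{q}[i,j] = q[i,j] / s^{(j-i+1)}\) can be seen in a short pure-Python sketch that works in log space. The helper log_rescaled and the toy numbers are hypothetical and only illustrate why a well-chosen \(s\) keeps the rescaled values representable.

```python
import math
import sys


def log_rescaled(log_q, s, i, j):
    """Return log(q_hat[i, j]) for q_hat[i, j] = q[i, j] / s**(j - i + 1)."""
    return log_q - (j - i + 1) * math.log(s)


n = 2000
# Toy value: q[1, n] of about 1.8**n, which exceeds the largest double
log_q = n * math.log(1.8)
overflows_unscaled = log_q > math.log(sys.float_info.max)

# With a scaling factor close to the per-nucleotide growth the value fits
q_hat = math.exp(log_rescaled(log_q, s=1.75, i=1, j=n))
```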

This function provides two ways to automatically adjust the scaling factor.

  1. Automatic guess

  2. Automatic adjustment according to MFE

Passing NULL as second parameter activates the automatic guess mode. Here, the scaling factor is recomputed according to a mean free energy of 184.3*length cal for random sequences. On the other hand, if the MFE for a sequence is known, it can be used to recompute a more robust scaling factor, since it represents the lowest free energy of the entire ensemble of structures, i.e. the highest Boltzmann factor. To activate this second mode of automatic adjustment according to MFE, a pointer to the MFE value needs to be passed as second argument. This value is then taken to compute the scaling factor as \(s = exp((sfact * MFE) / kT / length )\), where sfact is an additional scaling weight located in the RNA.md() data structure of exp_params in fc.

Note

This recomputation only takes place if the pf_scale attribute of the exp_params data structure contained in fc has a value below 1.0.

The computed scaling factor \(s\) will be stored as pf_scale attribute of the exp_params data structure in fc.

SWIG Wrapper Notes

This function is attached to RNA.fc() objects as overloaded exp_params_rescale() method.

When no parameter is passed to this method, the resulting action is the same as passing NULL as second parameter to RNA.fold_compound.exp_params_rescale(), i.e. default scaling of the partition function. Passing an energy in kcal/mol, e.g. as retrieved by a previous call to the mfe() method, instructs all subsequent calls to scale the partition function accordingly. See, e.g. RNA.fold_compound.exp_params_rescale() in the Python API.

Parameters:

mfe (list-like(double)) – A pointer to the MFE (in kcal/mol) or NULL

exp_params_reset(md=None)

Reset Boltzmann factors for partition function computations within a RNA.fold_compound() according to provided, or default model details.

This function allows one to rescale Boltzmann factors for subsequent partition function computations according to a set of model details, e.g. temperature values. To do so, the caller provides either a pointer to a set of model details to be used for rescaling, or NULL if global default setting should be used.

SWIG Wrapper Notes

This function is attached to RNA.fc() objects as overloaded exp_params_reset() method.

When no parameter is passed to this method, the resulting action is the same as passing NULL as second parameter to RNA.fold_compound.exp_params_reset(), i.e. global default model settings are used. Passing an object of type RNA.md() resets the fold compound according to the specifications stored within the RNA.md() object. See, e.g. RNA.fold_compound.exp_params_reset() in the Python API.

Parameters:

md (RNA.md() *) – A pointer to the new model details (or NULL for reset to defaults)

exp_params_subst(par)

Update the energy parameters for subsequent partition function computations.

This function can be used to properly assign new energy parameters for partition function computations to a RNA.fold_compound(). For this purpose, the data of the provided pointer params will be copied into fc and a recomputation of the partition function scaling factor is issued, if the pf_scale attribute of params is less than 1.0.

Passing NULL as second argument leads to a reset of the energy parameters within fc to their default values

SWIG Wrapper Notes

This function is attached to RNA.fc() objects as overloaded exp_params_subst() method.

When no parameter is passed, the resulting action is the same as passing NULL as second parameter to RNA.fold_compound.exp_params_subst(), i.e. resetting the parameters to the global defaults. See, e.g. RNA.fold_compound.exp_params_subst() in the Python API.

Parameters:

params (RNA.exp_param() *) – A pointer to the new energy parameters

file_commands_apply(fold_compound self, std::string filename, unsigned int options=) int
property hc
hc_add_bp(fold_compound self, unsigned int i, unsigned int j, unsigned int option=VRNA_CONSTRAINT_CONTEXT_ALL_LOOPS)

Favor/Enforce a certain base pair (i,j)

Parameters:
  • i (unsigned int) – The 5’ located nucleotide position of the base pair (1-based)

  • j (unsigned int) – The 3’ located nucleotide position of the base pair (1-based)

  • option (unsigned char) – The options flag indicating how/where to store the hard constraints

See also

RNA.fold_compound.hc_add_bp_nonspecific, RNA.fold_compound.hc_add_up, RNA.fold_compound.hc_init, RNA.CONSTRAINT_CONTEXT_EXT_LOOP, RNA.CONSTRAINT_CONTEXT_HP_LOOP, RNA.CONSTRAINT_CONTEXT_INT_LOOP, RNA.CONSTRAINT_CONTEXT_INT_LOOP_ENC, RNA.CONSTRAINT_CONTEXT_MB_LOOP, RNA.CONSTRAINT_CONTEXT_MB_LOOP_ENC, RNA.CONSTRAINT_CONTEXT_ENFORCE, RNA.CONSTRAINT_CONTEXT_ALL_LOOPS

hc_add_bp_nonspecific(fold_compound self, unsigned int i, int d, unsigned int option=VRNA_CONSTRAINT_CONTEXT_ALL_LOOPS)

Enforce a nucleotide to be paired (upstream/downstream)

Parameters:
  • i (unsigned int) – The position that needs to be paired (1-based)

  • d (int) – The direction of base pairing ( \(d < 0\): pairs upstream, \(d > 0\): pairs downstream, \(d == 0\): no direction)

  • option (unsigned char) – The options flag indicating in which loop type context the pairs may appear

See also

RNA.fold_compound.hc_add_bp, RNA.fold_compound.hc_add_up, RNA.fold_compound.hc_init, RNA.CONSTRAINT_CONTEXT_EXT_LOOP, RNA.CONSTRAINT_CONTEXT_HP_LOOP, RNA.CONSTRAINT_CONTEXT_INT_LOOP, RNA.CONSTRAINT_CONTEXT_INT_LOOP_ENC, RNA.CONSTRAINT_CONTEXT_MB_LOOP, RNA.CONSTRAINT_CONTEXT_MB_LOOP_ENC, RNA.CONSTRAINT_CONTEXT_ALL_LOOPS

hc_add_bp_strand(*args, **kwargs)
hc_add_from_db(fold_compound self, char const * constraint, unsigned int options=) int

Add hard constraints from pseudo dot-bracket notation.

This function allows one to apply hard constraints from a pseudo dot-bracket notation. The options parameter controls which characters are recognized by the parser. Use the RNA.CONSTRAINT_DB_DEFAULT convenience macro if you want to allow all known characters.

SWIG Wrapper Notes

This function is attached as method hc_add_from_db() to objects of type fold_compound. See, e.g. RNA.fold_compound.hc_add_from_db() in the Python API .

Parameters:
  • constraint (string) – A pseudo dot-bracket notation of the hard constraint.

  • options (unsigned int) – The option flags

See also

RNA.CONSTRAINT_DB_PIPE, RNA.CONSTRAINT_DB_DOT, RNA.CONSTRAINT_DB_X, RNA.CONSTRAINT_DB_ANG_BRACK, RNA.CONSTRAINT_DB_RND_BRACK, RNA.CONSTRAINT_DB_INTRAMOL, RNA.CONSTRAINT_DB_INTERMOL, RNA.CONSTRAINT_DB_GQUAD

hc_add_up(fold_compound self, unsigned int i, unsigned int option=VRNA_CONSTRAINT_CONTEXT_ALL_LOOPS)

Make a certain nucleotide unpaired.

Parameters:
  • i (unsigned int) – The position that needs to stay unpaired (1-based)

  • option (unsigned char) – The options flag indicating how/where to store the hard constraints

See also

RNA.fold_compound.hc_add_bp, RNA.fold_compound.hc_add_bp_nonspecific, RNA.fold_compound.hc_init, RNA.CONSTRAINT_CONTEXT_EXT_LOOP, RNA.CONSTRAINT_CONTEXT_HP_LOOP, RNA.CONSTRAINT_CONTEXT_INT_LOOP, RNA.CONSTRAINT_CONTEXT_MB_LOOP, RNA.CONSTRAINT_CONTEXT_ALL_LOOPS

hc_add_up_strand(*args, **kwargs)
hc_init()

Initialize/Reset hard constraints to default values.

This function resets the hard constraints to their default values, i.e. all positions may be unpaired in all contexts, and base pairs are allowed in all contexts, if they resemble canonical pairs. Previously set hard constraints will be removed before initialization.

SWIG Wrapper Notes

This function is attached as method hc_init() to objects of type fold_compound. See, e.g. RNA.fold_compound.hc_init() in the Python API .
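The hard constraint methods hc_init(), hc_add_up() and hc_add_bp() documented above can be combined in a small helper. The function apply_hard_constraints is a hypothetical convenience wrapper, not part of the API, and the sequence and positions are invented. The import is guarded so the sketch also runs without the bindings installed.

```python
def apply_hard_constraints(fc, unpaired=(), pairs=(), reset=True):
    """Apply hard constraints to a fold_compound (1-based positions)."""
    if reset:
        fc.hc_init()              # drop previously set hard constraints
    for i in unpaired:
        fc.hc_add_up(i)           # position i must stay unpaired
    for i, j in pairs:
        fc.hc_add_bp(i, j)        # base pair (i, j), default loop contexts


try:
    import RNA  # ViennaRNA Python bindings; optional for this sketch
except ImportError:
    RNA = None

if RNA is not None:
    fc = RNA.fold_compound("GGGGAAAACCCCAAAA")
    apply_hard_constraints(fc, unpaired=[13, 14, 15, 16], pairs=[(1, 12)])
    structure, mfe = fc.mfe()
```

Resetting first keeps repeated calls independent of each other; pass reset=False to add constraints on top of existing ones.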

heat_capacity(fold_compound self, float T_min=0., float T_max=100., float T_increment=1., unsigned int mpoints=2) HeatCapacityVector

Compute the specific heat for an RNA.

This function computes an RNA's specific heat in a given temperature range from the partition function by numeric differentiation. The result is returned as a list of pairs of temperature in C and specific heat in kcal/(mol*K).

Users can specify the temperature range for the computation from T_min to T_max, as well as the increment step size T_increment. The latter also determines how many times the partition function is computed. Finally, the parameter mpoints determines how smooth the curve should be. The algorithm itself fits a parabola to \(2 \cdot mpoints + 1\) data points to calculate 2nd derivatives. Increasing this parameter produces a smoother curve.

SWIG Wrapper Notes

This function is attached as overloaded method heat_capacity() to objects of type fold_compound. If the optional function arguments T_min, T_max, T_increment, and mpoints are omitted, they default to 0.0, 100.0, 1.0 and 2, respectively. See, e.g. RNA.fold_compound.heat_capacity() in the Python API.

Parameters:
  • T_min (float) – Lowest temperature in C

  • T_max (float) – Highest temperature in C

  • T_increment (float) – Stepsize for temperature incrementation in C (a reasonable choice might be 1C)

  • mpoints (unsigned int) – The number of interpolation points to calculate 2nd derivative (a reasonable choice might be 2, min: 1, max: 100)

Returns:

A list of pairs of temperatures and corresponding heat capacity or NULL upon any failure. The last entry of the list is indicated by a temperature field set to a value smaller than T_min

Return type:

RNA.heat_capacity() *
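The role of mpoints can be illustrated with a pure-Python sketch of a least-squares parabola fit through \(2 \cdot mpoints + 1\) equally spaced samples. This is not the library's implementation. The function names are hypothetical, and the sketch assumes the standard thermodynamic relation \(C = -T \, \partial^2 G / \partial T^2\) between heat capacity and the ensemble free energy \(G\).

```python
def second_derivative(values, step):
    """Second derivative at the centre of 2*m + 1 equally spaced samples,
    taken from a least-squares parabola fit y = a + b*x + c*x**2."""
    if len(values) < 3 or len(values) % 2 == 0:
        raise ValueError("need an odd number of at least 3 samples")
    m = len(values) // 2
    xs = range(-m, m + 1)                     # sample offsets in units of step
    s0 = len(values)
    s2 = sum(x * x for x in xs)
    s4 = sum(x ** 4 for x in xs)
    sy = sum(values)
    sx2y = sum(x * x * y for x, y in zip(xs, values))
    # normal equations of the fit; odd moments vanish on a symmetric grid
    c = (s0 * sx2y - s2 * sy) / (s0 * s4 - s2 * s2)
    return 2.0 * c / (step * step)


def heat_capacity_at(temperature_kelvin, free_energies, step):
    """Assumed relation: C = -T * d^2G/dT^2, evaluated at the central sample."""
    return -temperature_kelvin * second_derivative(free_energies, step)


# Toy free energy curve G(T) = -0.5 * T**2 sampled around T = 300 K (mpoints = 2)
samples = [-0.5 * t * t for t in (298.0, 299.0, 300.0, 301.0, 302.0)]
toy_heat_capacity = heat_capacity_at(300.0, samples, step=1.0)
```

Using more samples per fit averages out numerical noise in the free energies, which is why a larger mpoints yields a smoother curve.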

heat_capacity_cb(fold_compound self, float T_min, float T_max, float T_increment, unsigned int mpoints, PyObject * PyFunc, PyObject * data=Py_None) PyObject *

Compute the specific heat for an RNA (callback variant)

Similar to RNA.fold_compound.heat_capacity(), this function computes an RNA's specific heat in a given temperature range from the partition function by numeric differentiation. Instead of returning a list of temperature/specific heat pairs, however, this function returns the individual results through a callback mechanism. The provided function will be called for each result and passed the corresponding temperature and specific heat values along with the arbitrary data as provided through the data pointer argument.

Users can specify the temperature range for the computation from T_min to T_max, as well as the increment step size T_increment. The latter also determines how many times the partition function is computed. Finally, the parameter mpoints determines how smooth the curve should be. The algorithm itself fits a parabola to \(2 \cdot mpoints + 1\) data points to calculate 2nd derivatives. Increasing this parameter produces a smoother curve.

SWIG Wrapper Notes

This function is attached as method heat_capacity_cb() to objects of type fold_compound. See, e.g. RNA.fold_compound.heat_capacity_cb() in the Python API.

Parameters:
  • T_min (float) – Lowest temperature in °C

  • T_max (float) – Highest temperature in °C

  • T_increment (float) – Step size for temperature incrementation in °C (a reasonable choice might be 1 °C)

  • mpoints (unsigned int) – The number of interpolation points to calculate 2nd derivative (a reasonable choice might be 2, min: 1, max: 100)

  • cb (RNA.heat_capacity) – The user-defined callback function that receives the individual results

  • data (void *) – An arbitrary data structure that will be passed to the callback in conjunction with the results

Returns:

Returns 0 upon failure, and non-zero otherwise

Return type:

int
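A minimal usage sketch of the callback variant follows. It assumes that the Python callback receives its arguments in the order (temperature, heat_capacity, data), as suggested by the description above; the helper names store_heat_capacity and heat_capacity_curve are hypothetical.

```python
def store_heat_capacity(temperature, heat_capacity, data):
    """Callback: append one (temperature, specific heat) pair to the list in data."""
    data.append((temperature, heat_capacity))


def heat_capacity_curve(sequence, t_min=0.0, t_max=100.0, t_step=1.0, mpoints=2):
    """Collect the heat capacity curve of sequence through heat_capacity_cb()."""
    import RNA  # ViennaRNA Python bindings, imported lazily

    fc = RNA.fold_compound(sequence)
    results = []
    # The list is handed through unchanged as the data argument of the callback.
    fc.heat_capacity_cb(t_min, t_max, t_step, mpoints, store_heat_capacity, results)
    return results
```

Passing a mutable container as data is one simple way to get results out of the callback without resorting to global variables.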

property iindx
property jindx
property length
property matrices
maxmimum_matching()
mean_bp_distance()

Get the mean base pair distance in the thermodynamic ensemble.

\[\langle d \rangle = \sum_{a,b} p_{a} p_{b} d(S_{a},S_{b})\]

This can be computed from the pair probabilities \(p_{ij}\) as

\[\langle d \rangle = \sum_{ij} p_{ij}(1-p_{ij})\]

SWIG Wrapper Notes

This function is attached as method mean_bp_distance() to objects of type fold_compound. See, e.g. RNA.fold_compound.mean_bp_distance() in the Python API.

Returns:

The mean pair distance of the structure ensemble

Return type:

double
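The equivalence of the two formulas can be checked on a toy example. The sketch below uses a hypothetical three-structure ensemble with made-up probabilities and does not call RNAlib. It assumes that the sum over ij runs over both orderings of each pair, which is the same as twice the sum over i < j.

```python
from itertools import product

# Hypothetical toy ensemble: each structure is a set of base pairs (i, j)
# together with its probability (the probabilities sum to 1).
ensemble = {
    frozenset(): 0.2,
    frozenset({(1, 9)}): 0.3,
    frozenset({(1, 9), (2, 8)}): 0.5,
}


def mean_distance_from_ensemble(ensemble):
    """Sum p_a * p_b * d(S_a, S_b) over all ordered pairs of structures."""
    return sum(
        pa * pb * len(a ^ b)  # d = number of base pairs found in only one structure
        for (a, pa), (b, pb) in product(ensemble.items(), repeat=2)
    )


def mean_distance_from_pair_probs(ensemble):
    """Evaluate the pair-probability form of the same quantity."""
    pair_probs = {}
    for structure, prob in ensemble.items():
        for pair in structure:
            pair_probs[pair] = pair_probs.get(pair, 0.0) + prob
    # Each pair (i, j) with i < j is counted twice, once per ordering.
    return 2.0 * sum(p * (1.0 - p) for p in pair_probs.values())


direct = mean_distance_from_ensemble(ensemble)
via_pair_probs = mean_distance_from_pair_probs(ensemble)
```

Both functions return the same value for this ensemble, but only the second one scales to real sequences, where enumerating all structure pairs is infeasible.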

mfe()

Compute minimum free energy and an appropriate secondary structure of an RNA sequence, or RNA sequence alignment.

Depending on the type of the provided RNA.fold_compound(), this function predicts the MFE for a single sequence (or connected component of multiple sequences), or an averaged MFE for a sequence alignment. If backtracking is activated, it also constructs the corresponding secondary structure, or consensus structure. Therefore, the second parameter, structure, has to point to an allocated block of memory with a size of at least \(\mathrm{strlen}(\mathrm{sequence})+1\) to store the backtracked MFE structure. (For consensus structures, this is the length of the alignment + 1.) If NULL is passed, no backtracking will be performed.

SWIG Wrapper Notes

This function is attached as method mfe() to objects of type fold_compound. The parameter structure is returned along with the MFE and must not be provided. See e.g. RNA.fold_compound.mfe() in the Python API.

Parameters:

structure (string) – A pointer to the character array where the secondary structure in dot-bracket notation will be written to (may be NULL)

Returns:

the minimum free energy (MFE) in kcal/mol

Return type:

float

Note

This function is polymorphic. It accepts RNA.fold_compound() of type RNA.FC_TYPE_SINGLE, and RNA.FC_TYPE_COMPARATIVE.
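As stated in the SWIG wrapper notes, the Python method takes no structure argument and returns the structure together with the energy. A minimal sketch for a single sequence follows; the helper name predict_mfe is hypothetical.

```python
def predict_mfe(sequence):
    """Return (structure, energy) for a single RNA sequence."""
    import RNA  # ViennaRNA Python bindings, imported lazily

    fc = RNA.fold_compound(sequence)
    # No structure buffer is passed in Python; both results are returned.
    structure, energy = fc.mfe()
    return structure, energy
```

The returned structure is a dot-bracket string of the same length as the input sequence, and the energy is given in kcal/mol.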

mfe_dimer(fold_compound self) char *

Compute the minimum free energy of two interacting RNA molecules.

The code is analogous to the RNA.fold_compound.mfe() function.

Deprecated since version 2.7.0: This function is obsolete since RNA.fold_compound.mfe() can handle complexes of multiple sequences since v2.5.0. Use RNA.fold_compound.mfe() for connected component MFE instead and compute MFEs of unconnected states separately.

SWIG Wrapper Notes

This function is attached as method mfe_dimer() to objects of type fold_compound. The parameter structure is returned along with the MFE and must not be provided. See e.g. RNA.fold_compound.mfe_dimer() in the Python API.

Parameters:

structure (string) – Will hold the dot-bracket structure of the dimer molecule

Returns:

minimum free energy of the structure

Return type:

float

mfe_window(nullfile=None)

Local MFE prediction using a sliding window approach.

Computes minimum free energy structures using a sliding window approach, where base pairs may not span outside the window. In contrast to RNA.fold_compound.mfe(), where a maximum base pair span may be set using the RNA.md().max_bp_span attribute and one globally optimal structure is predicted, this function uses a sliding window to retrieve all locally optimal structures within each window. The size of the sliding window is set in the RNA.md().window_size attribute before the RNA.fold_compound() is created with the option RNA.OPTION_WINDOW.

The predicted structures are written on-the-fly, either to stdout, if a NULL pointer is passed as file parameter, or to the corresponding filehandle.

SWIG Wrapper Notes

This function is attached as overloaded method mfe_window() to objects of type fold_compound. The parameter FILE has default value of NULL and can be omitted. See e.g. RNA.fold_compound.mfe_window() in the Python API.

Parameters:

file (FILE *) – The output file handle where predictions are written to (may be NULL)

mfe_window_cb(fold_compound self, PyObject * PyFunc, PyObject * data=Py_None) float

SWIG Wrapper Notes

This function is attached as overloaded method mfe_window_cb() to objects of type fold_compound. The parameter data has default value of NULL and can be omitted. See e.g. RNA.fold_compound.mfe_window_cb() in the Python API.
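A minimal sketch of collecting local structures through the callback variant follows. The callback argument order (start, end, structure, energy, data) is an assumption that is not stated in this section, and the helper names store_window_hit and local_structures are hypothetical.

```python
def store_window_hit(start, end, structure, energy, data):
    """Callback: record one locally optimal structure reported by mfe_window_cb()."""
    data.append({"start": start, "end": end, "structure": structure, "energy": energy})


def local_structures(sequence, window_size=30):
    """Collect the local MFE structures of sequence for the given window size."""
    import RNA  # ViennaRNA Python bindings, imported lazily

    md = RNA.md()
    md.window_size = window_size  # set before the fold_compound is created
    fc = RNA.fold_compound(sequence, md, RNA.OPTION_WINDOW)
    hits = []
    fc.mfe_window_cb(store_window_hit, hits)
    return hits
```

Collecting the hits in a list avoids parsing the text output that mfe_window() writes to a file handle.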

mfe_window_zscore(min_z, nullfile=None)

Local MFE prediction using a sliding window approach (with z-score cut-off)

Computes minimum free energy structures using a sliding window approach, where base pairs may not span outside the window. This function is the z-score version of RNA.fold_compound.mfe_window(), i.e. only predictions above a certain z-score cut-off value are printed. As for RNA.fold_compound.mfe_window(), the size of the sliding window is set in the RNA.md().window_size attribute, prior to the retrieval of the RNA.fold_compound() using RNA.fold_compound() with option RNA.OPTION_WINDOW.

The predicted structures are written on-the-fly, either to stdout, if a NULL pointer is passed as file parameter, or to the corresponding filehandle.

SWIG Wrapper Notes

This function is attached as overloaded method mfe_window_zscore() to objects of type fold_compound. The parameter FILE has default value of NULL and can be omitted. See e.g. RNA.fold_compound.mfe_window_zscore() in the Python API.

Parameters:
  • min_z (double) – The minimal z-score for a predicted structure to appear in the output

  • file (FILE *) – The output file handle where predictions are written to (may be NULL)

mfe_window_zscore_cb(fold_compound self, double min_z, PyObject * PyFunc, PyObject * data=Py_None) float
move_neighbor_diff(self, pt, move, options=4 | 8) varArrayMove
move_neighbor_diff(fold_compound self, varArrayShort pt, move move, PyObject * PyFunc, PyObject * data=Py_None, unsigned int options=(4|8)) int

Apply a move to a secondary structure and indicate which neighbors have changed consequentially.

Similar to RNA.move_neighbor_diff_cb(), this function applies a move to a secondary structure and reports back the neighbors of the current structure that become affected by this move. Instead of executing a callback for each of the affected neighbors, this function compiles two lists of neighbor moves: one that is returned and consists of all moves that are novel or may have changed in energy, and a second, invalid_moves, that consists of all the neighbor moves that become invalid.

Parameters:
  • ptable (list-like(int)) – The current structure as pair table

  • move (RNA.move()) – The move to apply

  • invalid_moves (RNA.move() **) – The address of a move list where the function stores those moves that become invalid

  • options (unsigned int) – Options to modify the behavior of this function, e.g. available move set

Returns:

A list of moves that might have changed in energy or are novel compared to the structure before application of the move

Return type:

RNA.move() *

neighbors(fold_compound self, varArrayShort pt, unsigned int options=(4|8)) varArrayMove

Generate neighbors of a secondary structure.

This function allows one to generate all structural neighbors (according to a particular move set) of an RNA secondary structure. The neighborhood is then returned as a list of transitions / moves required to transform the current structure into the actual neighbor.

SWIG Wrapper Notes

This function is attached as an overloaded method neighbors() to objects of type fold_compound. The optional parameter options defaults to RNA.MOVESET_DEFAULT if it is omitted. See, e.g. RNA.fold_compound.neighbors() in the Python API.

Parameters:
  • pt (const short *) – The pair table representation of the structure

  • options (unsigned int) – Options to modify the behavior of this function, e.g. available move set

Returns:

Neighbors as a list of moves / transitions (the last element in the list has both of its fields set to 0)

Return type:

RNA.move() *

See also

RNA.neighbors_successive, RNA.move_apply, RNA.MOVESET_INSERTION, RNA.MOVESET_DELETION, RNA.MOVESET_SHIFT, RNA.MOVESET_DEFAULT
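The pt argument is a pair table. The pure-Python sketch below illustrates this representation, assuming the convention used by RNA.ptable: index 0 stores the sequence length, and index i stores the pairing partner of position i, or 0 if the position is unpaired. The function name pair_table is hypothetical; in practice the table would be obtained from the bindings.

```python
def pair_table(structure):
    """Build a 1-based pair table from a dot-bracket string.

    table[0] holds the length; table[i] is the partner of position i, or 0.
    """
    table = [0] * (len(structure) + 1)
    table[0] = len(structure)
    stack = []  # positions of currently unmatched opening brackets
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            partner = stack.pop()
            table[pos] = partner
            table[partner] = pos
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return table


example = pair_table("((..))")
```

For the structure "((..))" this yields [6, 6, 5, 0, 0, 2, 1]: positions 1 and 6 pair with each other, as do positions 2 and 5.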

property params
params_reset(md=None)

Reset free energy parameters within a RNA.fold_compound() according to provided, or default model details.

This function allows one to rescale free energy parameters for subsequent structure prediction or evaluation according to a set of model details, e.g. temperature values. To do so, the caller provides either a pointer to a set of model details to be used for rescaling, or NULL if the global default settings should be used.

SWIG Wrapper Notes

This function is attached to RNA.fc() objects as overloaded params_reset() method.

When no parameter is passed to this method, the resulting action is the same as passing NULL as second parameter to RNA.fold_compound.params_reset(), i.e. global default model settings are used. Passing an object of type RNA.md() resets the fold compound according to the specifications stored within the RNA.md() object. See, e.g. RNA.fold_compound.params_reset() in the Python API.

Parameters:

md (RNA.md() *) – A pointer to the new model details (or NULL for reset to defaults)

See also

RNA.fold_compound.exp_params_reset, RNA.params_subs
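A minimal sketch of rescaling an existing fold_compound to another temperature follows. The helper name refold_at is hypothetical. Note that the fresh RNA.md() object carries default values for all settings other than the temperature set here.

```python
def refold_at(fc, temperature):
    """Rescale the energy parameters of fc to temperature (in °C) and return fc.mfe()."""
    import RNA  # ViennaRNA Python bindings, imported lazily

    md = RNA.md()
    md.temperature = temperature
    # Passing an RNA.md() object rescales the parameters stored in fc accordingly.
    fc.params_reset(md)
    return fc.mfe()
```

Calling fc.params_reset() without an argument would instead revert to the global default model settings, as described in the wrapper notes above.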

params_subst(par=None)

Update/Reset energy parameters data structure within a RNA.fold_compound().

Passing NULL as second argument leads to a reset of the energy parameters within fc to their default values. Otherwise, the energy parameters provided will be copied over into fc.

SWIG Wrapper Notes

This function is attached to RNA.fc() objects as overloaded params_subst() method.

When no parameter is passed, the resulting action is the same as passing NULL as second parameter to RNA.fold_compound.params_subst(), i.e. resetting the parameters to the global defaults. See, e.g. RNA.fold_compound.params_subst() in the Python API.

Parameters:

par (RNA.param() *) – The energy parameters used to substitute those within fc (may be NULL)

path(fold_compound self, IntVector pt, unsigned int steps, unsigned int options=) MoveVector
path(fold_compound self, varArrayShort pt, unsigned int steps, unsigned int options=) MoveVector

Compute a path, store the final structure, and return a list of transition moves from the start to the final structure.

Given a start structure in pair table format, this function computes a transition path and updates the pair table to the final structure of the path. Unless suppressed with the RNA.PATH_NO_TRANSITION_OUTPUT flag in the options field, it also returns a list of the individual transitions that lead from the start to the final structure.

The currently available transition paths are

  • Steepest Descent / Gradient walk (flag: RNA.PATH_STEEPEST_DESCENT)

  • Random walk (flag: RNA.PATH_RANDOM)

The type of transitions must be set through the options parameter.

SWIG Wrapper Notes

This function is attached as an overloaded method path() to objects of type fold_compound. The optional parameter options defaults to RNA.PATH_DEFAULT if it is omitted. See, e.g. RNA.fold_compound.path() in the Python API.

Parameters:
  • pt (list-like(int)) – The pair table containing the start structure. Used to update to the final structure after execution of this function

  • options (unsigned int) – Options to modify the behavior of this function

Returns:

A list of transition moves (default), or NULL (if options & RNA.PATH_NO_TRANSITION_OUTPUT)

Return type:

RNA.move() *

See also

RNA.fold_compound.path_gradient, RNA.fold_compound.path_random, RNA.ptable, RNA.ptable_copy, RNA.fold_compound, RNA.PATH_RANDOM, RNA.MOVESET_DEFAULT, RNA.MOVESET_SHIFT, RNA.PATH_NO_TRANSITION_OUTPUT

Note

Since the result is written to the