RNAdesign.pl − Flexible design of multi−stable RNA molecules
RNAdesign.pl [−o, −−options] < constraints.in
Conformational design of RNA (or DNA) molecules. An initially random sequence is iteratively mutated and evaluated according to an objective function (see Option: −−optfun). Whenever a mutation yields a better-scoring sequence, it is accepted; the algorithm terminates once a local minimum is found.
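The accept-if-better loop described above can be sketched as follows. This is a minimal illustration, not the code of RNAdesign.pl: the function name `adaptive_walk`, the toy objective, and the stopping rule (stop once no single-point mutation improves the score) are assumptions, and base-pair dependencies between positions are ignored.

```python
import random

ALPHABET = "ACGU"


def adaptive_walk(length, score, max_iter=10000, seed=0):
    """Minimize score(seq) by accepting only improving point mutations.

    Hypothetical sketch: 'score' stands in for the objective function,
    lower values are better.
    """
    rng = random.Random(seed)
    seq = [rng.choice(ALPHABET) for _ in range(length)]
    best = score("".join(seq))
    for _ in range(max_iter):
        # Enumerate every single-point mutation of the current sequence.
        moves = [(pos, nt) for pos in range(length)
                 for nt in ALPHABET if nt != seq[pos]]
        rng.shuffle(moves)
        improved = False
        for pos, nt in moves:
            old = seq[pos]
            seq[pos] = nt
            trial = score("".join(seq))
            if trial < best:
                best = trial          # accept the better-scoring mutation
                improved = True
                break
            seq[pos] = old            # reject and restore
        if not improved:
            break                     # no neighbor is better: local minimum
    return "".join(seq), best


def toy_score(seq):
    """Toy objective (assumption): count positions that are not G."""
    return sum(1 for nt in seq if nt != "G")


designed, value = adaptive_walk(8, toy_score)
```

With a real objective, `score` would be built from the terms listed under −−optfun instead of the toy function used here.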
For algorithmic details as well as for choosing an objective function see Stefan Badelt, "Control of RNA function by conformational design", PhD Thesis (2016).
The input file contains (one or more) secondary structures followed by one (optional) sequence constraint in IUPAC code (i.e. A, C, G, U/T, N, R, Y, S, M, W, K, V, H, D, B). Both sequence and structure constraints are strictly enforced during the design process. However, RNAdesign.pl avoids difficulties of multi-stable designs where a single nucleotide has more than two dependencies. In that case, base-pair constraints are not strictly enforced, but still evaluated in the objective function. A warning will be printed to *STDERR*.
Input example:

    ((((....((((...))))...))))
    ((((....))))...((((...))))
    NNNNGNRANNNNNNNNNNNNNNAUGN
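To illustrate what the IUPAC sequence constraint means, the following sketch checks whether a candidate sequence is compatible with a constraint string such as the one in the input example. The helper name `satisfies` and the candidate sequence are hypothetical; only the IUPAC code table itself is standard.

```python
# Standard IUPAC nucleotide codes, written for RNA (T is treated as U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U", "T": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def satisfies(sequence, constraint):
    """Return True if every nucleotide is allowed by the constraint code."""
    if len(sequence) != len(constraint):
        return False
    sequence = sequence.upper().replace("T", "U")
    return all(nt in IUPAC[code]
               for nt, code in zip(sequence, constraint.upper()))


constraint = "NNNNGNRANNNNNNNNNNNNNNAUGN"
# Hypothetical candidate, assembled piecewise to match the constraint.
candidate = "ACGU" + "G" + "C" + "A" + "A" + "ACGUACGUACGUAC" + "AUG" + "C"
ok = satisfies(candidate, constraint)
```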
Secondary structures must be specified as well-balanced dot-bracket strings. They may contain the following special characters: ’&’ connects two sequences to design a pair of interacting RNAs; ’x’ marks strictly unpaired positions and is intended for structures that are used in the objective function to compute the accessibility of nucleotides (i.e. the probability of being unpaired).
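A well-balanced dot-bracket string can be checked with a simple stack. The sketch below is an assumption about how such a check might look, not the parser used by RNA::Design: the function name `pair_table` is hypothetical, ’&’ is simply removed before indexing, and ’x’ is treated like an unpaired position.

```python
def pair_table(structure):
    """Return, for each position, the index of its pairing partner.

    Unpaired positions ('.' or 'x') map to None. The strand separator
    '&' is dropped before positions are numbered (an assumption).
    """
    s = structure.replace("&", "")
    table = [None] * len(s)
    stack = []
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            table[i], table[j] = j, i
        elif ch not in ".x":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return table


pairs = pair_table("((..))")
```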
This script is part of the ViennaRNA package. It requires the RNA::Design Perl library, which was introduced in ViennaRNA package−v2.2.
A web interface to call this script is available at
For details on the algorithms see:
Badelt S., "Control of RNA function by conformational design", PhD Thesis (2016)
Flamm C., et al. "Design of Multi−Stable RNA Molecules", RNA 7:254−265 (2001)
−o, −−optfun <string>
The objective function is a simplified interface to access functions of the ViennaRNA package. Every input secondary structure *can* serve as full target conformation or structure constraint. The objective function can include terms to compute the free energy of a target structure, the (constrained) ensemble free energy, the (conditional) probabilities of secondary structure elements, the accessibility of subsequences and the direct-path barriers between two structures. All of these terms exist for linear, circular, and cofolded molecules, as well as for user-specified temperatures. In the following examples, the indices i, j correspond to the secondary structures specified in the input file; the optional argument t specifies the temperature in degrees Celsius. By default, computations use the standard temperature of 37 °C.
eos(i,t): Free energy of structure i at temperature t. [circular: eos_circ(i,t)]

efe(i,t): Free energy of a constrained secondary structure ensemble. Omitting i, or specifying i=0, computes the unconstrained ensemble free energy. [circular: efe_circ(i,t)]

prob(i,j,t): Probability of structure i given structure j. The probability is computed from the equilibrium partition functions: Pr(i|j)=Z_i/Z_j. Hence, the constraint i *must* include the constraint of j (Pr(i|j)=Z_i+j/Z_j). Omitting j, or specifying j=0, computes the probability of i in the unconstrained ensemble: Pr(i)=Z_i/Z. [circular: prob_circ(i,j,t)]

acc(i,j,t): Accessibility of an RNA/DNA motif. This function is exactly the same as prob(i,j,t); however, it is meant to be used with constraints that use the character 'x' to specify strictly unpaired regions. [circular: acc_circ(i,j,t)]

barr(i,j,t): Direct-path energy barrier from i to j, computed using findpath. [circular: not implemented]
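The relation Pr(i|j)=Z_i/Z_j can be rewritten in terms of ensemble free energies, since G = −RT ln Z implies Pr(i|j) = exp(−(G_i − G_j)/RT). The sketch below shows that arithmetic only. The function name and the energy values are hypothetical; in practice the free energies would come from ViennaRNA, and the gas constant is given here in kcal/(mol K).

```python
import math

GAS_CONSTANT = 1.98717e-3  # kcal / (mol * K)


def conditional_probability(efe_i, efe_j, temperature_c=37.0):
    """Pr(i|j) = Z_i / Z_j, computed from ensemble free energies.

    efe_i: free energy (kcal/mol) of the ensemble constrained by i and j.
    efe_j: free energy (kcal/mol) of the ensemble constrained by j only.
    """
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    return math.exp(-(efe_i - efe_j) / rt)


# Hypothetical energies: the stricter constraint costs 1 kcal/mol.
p = conditional_probability(-10.0, -11.0)
```

Because constraint i includes constraint j, its ensemble is a subset of the ensemble of j, so efe_i is never lower than efe_j and the result stays between 0 and 1.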
By default, two independent penalties are added to the objective function, see options −−avoid and −−bprobs.
−a, −−avoid <string>
A set of sequence motifs that receive an extra penalty. Whenever one of these motifs is found in a sequence, its penalty is added to the score of the objective function. Specify pairs of motif:penalty as a comma-separated string. It is allowed to give certain motifs a negative contribution in order to favor them during the optimization.
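As an illustration of the motif:penalty format, the sketch below parses such a string and sums the penalties of all motifs present in a sequence. The function names and the example motifs are hypothetical, and counting each motif once, regardless of how often it occurs, is an assumption; the text above does not say how repeated occurrences are scored.

```python
def parse_motifs(spec):
    """Parse 'MOTIF:penalty,MOTIF:penalty' into a dict of floats."""
    penalties = {}
    for item in spec.split(","):
        motif, value = item.split(":")
        penalties[motif.strip().upper()] = float(value)
    return penalties


def motif_penalty(sequence, penalties):
    """Sum the penalty of every motif found at least once (assumption)."""
    return sum(value for motif, value in penalties.items()
               if motif in sequence)


# Hypothetical setting: penalize homopolymer runs, favor the motif 'CC'.
avoid = parse_motifs("AAAA:5,GGGG:5,CC:-1")
extra = motif_penalty("GCCAAAAU", avoid)
```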
−p, −−bprobs <string>
Set a target distribution of nucleotides in the designed sequence. Whenever the observed nucleotide content differs from the target values, a penalty is added to the objective function (see −−optfun). Let p be a vector of specified base probabilities and q a vector of observed nucleotide frequencies; the similarity of these vectors is then computed as s=sum_n(sqrt(p_n*q_n)), where n is the index for A,C,G,U. The penalty is calculated as (1−s)*k. Specify base probabilities as a comma-separated string of <base>:<prob> tuples.
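The penalty formula above can be worked through in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, q is taken as fractions that sum to 1, and k is treated as a free scaling factor because its value is not given here.

```python
import math


def parse_bprobs(spec):
    """Parse '<base>:<prob>,...' into a dict, e.g. 'A:0.25,C:0.25'."""
    probs = {}
    for item in spec.split(","):
        base, value = item.split(":")
        probs[base.strip().upper()] = float(value)
    return probs


def base_penalty(sequence, target, k=1.0):
    """Compute (1 - s) * k with s = sum_n sqrt(p_n * q_n) over A,C,G,U."""
    length = len(sequence)
    s = 0.0
    for base in "ACGU":
        p_n = target.get(base, 0.0)           # specified probability
        q_n = sequence.count(base) / length   # observed fraction
        s += math.sqrt(p_n * q_n)
    return (1.0 - s) * k


uniform = parse_bprobs("A:0.25,C:0.25,G:0.25,U:0.25")
no_penalty = base_penalty("ACGU", uniform)
skewed = base_penalty("AAAA", uniform, k=2.0)
```

A sequence that matches the target distribution exactly gives s = 1 and therefore no penalty; the further the composition drifts from the target, the smaller s becomes.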
−n, −−number <int>
Specify the number of independent sequence designs. Default: 1
−m, −−maxiter <int>
Maximum number of trial/error improvements during the adaptive walk. The default is 1e10; however, this threshold is only important for large sequence designs, because the value is calculated automatically from the number of possible sequence mutations.
−s, −−start <string>
Specify a starting sequence for adaptive optimization.
Specify an upper bound for the ’findpath’ direct path search used in the objective function: barr(i,j,t). Default: 10
−P, −−params <path−to−file>
Specify a parameter file other than rna_turner2004. Alternative DNA and RNA parameter files are distributed with the ViennaRNA package.
−4, −−tetloops <flag>
Turn off parameters for particular tetra-loop hairpin motifs.
−d, −−dangle <int [0,1,2,3]>
ViennaRNA dangling energy model. Default: 2
Turn off energies for GU base-pairs.
Turn off energies for GU base-pairs at the end of helices.
−v, −−verbose <int>
Print settings and report every sequence accepted during the adaptive walk.
Print short help
Show the manual page
Stefan Badelt <firstname.lastname@example.org>
1.00 −− initial release (March 8th, 2016)