RNAlib-2.2.0-RC2
Functions dealing with file formats for RNA sequences, structures, and alignments.
Functions

void vrna_structure_print_hx (const char *seq, const char *db, float energy, FILE *file)
    Print a secondary structure as helix list.

void vrna_structure_print_ct (const char *seq, const char *db, float energy, const char *identifier, FILE *file)
    Print a secondary structure as connect table.

void vrna_structure_print_bpseq (const char *seq, const char *db, FILE *file)
    Print a secondary structure in bpseq format.

unsigned int vrna_read_fasta_record (char **header, char **sequence, char ***rest, FILE *file, unsigned int options)
    Get a (fasta) data set from a file or stdin.

void vrna_extract_record_rest_constraint (char **cstruc, const char **lines, unsigned int option)
    Extract a hard constraint encoded as pseudo dot-bracket string.

int vrna_read_SHAPE_file (const char *file_name, int length, double default_value, char *sequence, double *values)
    Read data from a given SHAPE reactivity input file.

plist * vrna_read_constraints_file (const char *filename, unsigned int length, unsigned int options)
    Read constraints from an input file.

unsigned int read_record (char **header, char **sequence, char ***rest, unsigned int options)
    Get a data record from stdin.
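To illustrate the kind of output a bpseq writer produces, the following is a minimal sketch, not the RNAlib implementation. It assumes the common bpseq layout of one line per nucleotide holding the 1-based position, the nucleotide, and the pairing partner (0 if unpaired). The helper names pair_table_from_db and print_bpseq_sketch are hypothetical and use only the C standard library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fill pt[1..n] with pairing partners (0 = unpaired)
 * from a dot-bracket string. pt must hold strlen(db) + 1 entries.
 * Returns 0 on success, -1 on unbalanced brackets or allocation failure. */
static int pair_table_from_db(const char *db, int *pt)
{
  size_t n     = strlen(db);
  int    *stack = malloc((n + 1) * sizeof *stack);
  int    top   = 0;

  if (!stack)
    return -1;

  for (size_t i = 1; i <= n; i++) {
    pt[i] = 0;
    if (db[i - 1] == '(') {
      stack[top++] = (int)i;          /* remember the opening position */
    } else if (db[i - 1] == ')') {
      if (top == 0) {                 /* closing bracket without partner */
        free(stack);
        return -1;
      }
      int j = stack[--top];
      pt[i] = j;
      pt[j] = (int)i;
    }
  }

  free(stack);
  return top == 0 ? 0 : -1;           /* leftover '(' means unbalanced */
}

/* Hypothetical helper: write "position nucleotide partner" lines. */
static int print_bpseq_sketch(const char *seq, const char *db, FILE *out)
{
  size_t n = strlen(seq);

  if (n != strlen(db))
    return -1;

  int *pt = malloc((n + 1) * sizeof *pt);
  if (!pt)
    return -1;

  int rc = pair_table_from_db(db, pt);
  if (rc == 0)
    for (size_t i = 1; i <= n; i++)
      fprintf(out, "%zu %c %d\n", i, seq[i - 1], pt[i]);

  free(pt);
  return rc;
}
```

A call such as print_bpseq_sketch("GGGAAACCC", "(((...)))", stdout) would write nine lines, one per nucleotide. In a program linked against RNAlib, vrna_structure_print_bpseq (seq, db, file) takes the same three arguments.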
The RNAlib can parse and apply data from constraint definition text files, where each constraint is given as a line of whitespace-delimited commands. The syntax we use extends the one used in mfold / UNAfold, where each line begins with a command character followed by a set of positions.
Additionally, we introduce several new commands and allow for an optional loop type context specifier in the form of a sequence of characters, and an orientation flag that makes it possible to force a nucleotide to pair upstream or downstream.
The following set of commands is recognized: F, P, W, A, E.
The optional loop type context specifier [WHERE] may be a combination of the following: E, H, I, i, M, m, A.
If no [WHERE] flags are set, all contexts are considered (equivalent to A).
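As an illustration of how a [WHERE] specifier could be interpreted, the sketch below maps each context character to a bit in a mask. It treats an empty or missing specifier like A, as stated above. The CTX_* constants and the function parse_where are hypothetical names chosen for this example and are not part of the RNAlib API.

```c
#include <stddef.h>

/* Hypothetical bit flags, one per loop type context character. */
enum {
  CTX_E   = 1 << 0,
  CTX_H   = 1 << 1,
  CTX_I   = 1 << 2,
  CTX_i   = 1 << 3,
  CTX_M   = 1 << 4,
  CTX_m   = 1 << 5,
  CTX_ALL = (1 << 6) - 1   /* all six context bits set, i.e. 'A' */
};

/* Translate a [WHERE] string into a bitmask.
 * NULL or "" means no flags were given, so all contexts are considered.
 * Returns -1 if an unknown character is encountered. */
static int parse_where(const char *where)
{
  int mask = 0;

  if (where == NULL || *where == '\0')
    return CTX_ALL;

  for (; *where != '\0'; where++) {
    switch (*where) {
      case 'E': mask |= CTX_E;   break;
      case 'H': mask |= CTX_H;   break;
      case 'I': mask |= CTX_I;   break;
      case 'i': mask |= CTX_i;   break;
      case 'M': mask |= CTX_M;   break;
      case 'm': mask |= CTX_m;   break;
      case 'A': mask |= CTX_ALL; break;
      default:  return -1;       /* not a recognized context character */
    }
  }

  return mask;
}
```

A bitmask keeps combinations such as "Hi" cheap to store and test, which is one plausible design choice; other representations would work equally well.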
For particular nucleotides that are forced to pair, the following [ORIENTATION] flags may be used: U (upstream), D (downstream).
If no [ORIENTATION] flag is set, both directions are considered.
Sequence positions of nucleotides/base pairs are 1-based and consist of three positions i, j, and k. Alternatively, four positions may be provided as a pair of two position ranges, using the '-' sign as delimiter within each range, i.e. i-j and k-l.
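The two position forms can be told apart with a few lines of standard C. The following is a minimal sketch, assuming positions are positive integers (with 0 permitted for j in the three-position form) and that the command character comes first on the line. The type constraint_pos and the function parse_constraint_positions are hypothetical, the input lines in the usage note are made up for illustration, and this is not how vrna_read_constraints_file is implemented.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the leading part of a constraint line. */
typedef struct {
  char command;    /* one of F, P, W, A, E */
  int  is_range;   /* 1 for the "i-j k-l" form, 0 for "i j k" */
  int  i, j, k, l; /* l is only meaningful for the range form */
} constraint_pos;

/* Parse the command character and positions at the start of a line.
 * Returns the number of characters consumed (so a caller can continue
 * with optional [WHERE] / [ORIENTATION] flags), or -1 on error. */
static int parse_constraint_positions(const char *line, constraint_pos *c)
{
  int consumed = 0;

  if (sscanf(line, " %c %d-%d %d-%d%n",
             &c->command, &c->i, &c->j, &c->k, &c->l, &consumed) == 5) {
    c->is_range = 1;
    /* Assumption: all four range bounds are valid 1-based positions. */
    if (c->i < 1 || c->j < 1 || c->k < 1 || c->l < 1)
      return -1;
  } else if (sscanf(line, " %c %d %d %d%n",
                    &c->command, &c->i, &c->j, &c->k, &consumed) == 4) {
    c->is_range = 0;
    c->l        = 0;
    /* Assumption: j may be 0, i must be a position, k is non-negative. */
    if (c->i < 1 || c->j < 0 || c->k < 0)
      return -1;
  } else {
    return -1;
  }

  /* Only the documented command characters are accepted. */
  if (strchr("FPWAE", c->command) == NULL)
    return -1;

  return consumed;
}
```

For a made-up line such as "P 3-8 20-25" the sketch reports the range form with i=3, j=8, k=20, l=25, while "F 3 20 4 H" yields the three-position form and leaves the trailing " H" for a separate flag parser.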
Below are the resulting general cases that are considered valid constraints:

For nucleotides that are forced to pair, the optional loop type context specifier [WHERE] allows forcing them to appear as closing/enclosed pairs of certain types of loops.

Syntax: F i j k [WHERE]
Description: The optional loop type context specifier [WHERE] specifies in which loop context the base pair must appear.

Syntax: P i 0 k [WHERE]
Description: The optional loop type context specifier [WHERE] allows forcing the nucleotides to appear within loops of specific types.

Syntax: P i j k [WHERE]
Description: The optional loop type context specifier [WHERE] specifies the type of loop of which they are disallowed to be the closing or an enclosed pair.

Syntax: P i-j k-l [WHERE]
Description: The optional loop type context specifier [WHERE] specifies the type of loop of which they are disallowed to be the closing or an enclosed pair.

Syntax: W i 0 k [WHERE]
Description: The optional loop type context specifier [WHERE] may be used to prohibit pairing in specific contexts.

Syntax: W i j k

The A command: In contrast to the F and W commands, which remove conflicting base pairs, the A command does not. Therefore, it may be used to allow non-canonical base pair interactions. Since the RNAlib does not contain free energy contributions [...] The optional loop type context specifier [WHERE] specifies in which loop context the base pair may appear.