alifoldz.pl - Manual page
Name
alifoldz.pl - Assesses a multiple sequence alignment for the existence
of an unusually stable and conserved RNA secondary structure.
Syntax
alifoldz.pl [options] < alignment.aln
Description
The program reads a multiple sequence alignment (aligned FASTA or
ClustalW format) from STDIN and estimates whether it contains a
conserved RNA secondary structure that is more stable than one would
expect by chance.
It uses the program RNAalifold to calculate a consensus minimum free
energy (MFE) structure, which takes into account not only thermodynamic
stability but also phylogenetic information such as compensatory
mutations.
alifoldz.pl compares the consensus MFE of the native alignment to that
of randomized alignments and expresses the significance as a z-score
(the number of standard deviations from the mean).
Negative z-scores indicate that the native alignment has a more stable
consensus structure than the random alignments. The significance of
z-scores depends on various factors (quality of the alignment, number
of random samples, ...). Tests show that for ClustalW alignments with
mean pairwise identities above 60% and a sample size of 100, z-scores
below -3.5 can be regarded as significant (false positive rate below 1%).
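The z-score described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code of alifoldz.pl: the function names are hypothetical, the random MFE values are made up, and the use of the population standard deviation is an assumption (the script may use the sample standard deviation instead).

```python
from statistics import mean, pstdev


def z_score(native_mfe, random_mfes):
    """Return (native MFE - mean of random MFEs) / standard deviation."""
    mu = mean(random_mfes)
    sigma = pstdev(random_mfes)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("random sample has zero variance")
    return (native_mfe - mu) / sigma


def is_significant(z, cutoff=-3.5):
    """Apply the -3.5 cutoff quoted in the text for a sample size of 100."""
    return z < cutoff


# Made-up MFE values (kcal/mol) for five randomized alignments.
random_mfes = [-12.0, -14.0, -10.0, -16.0, -13.0]
z = z_score(-24.0, random_mfes)
print(round(z, 2), is_significant(z))
```

With only five made-up samples the result is of course not meaningful; the cutoff of -3.5 applies to the conditions stated above.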
Options
--n, -n
Number of random samples for the z-score calculation. Default:
100.
--single
Score a single sequence (given as a FASTA file or the first in
an alignment) using RNAfold.
--forward, --reverse, --both
Score the forward strand, the reverse strand or both strands.
Default is both.
--foldpars
The parameters passed to RNAalifold or RNAfold. Refer to the
documentation of RNAalifold and RNAfold for details. Default is none
(i.e. the defaults of RNAalifold and RNAfold are used). IMPORTANT:
quote the parameter string, for example --foldpars "-T 25 -nc 5".
--window (-w), --slide (-x)
Score the alignment using a sliding window of the specified window
size and step size. Default: score the complete alignment.
--threshold, -t
Score only windows which have a native MFE below this value.
Default: -3.0
--help, -h, ?
Display help message.
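As an illustration of the --window and --slide options, the following Python sketch generates 1-based window coordinates. The function name is hypothetical, and the treatment of the last window (truncated at the end of the alignment) is an assumption inferred from the example output in the Examples section, not taken from the script itself.

```python
def windows(length, window, step):
    """Yield 1-based, inclusive (start, end) coordinates of sliding windows.

    The last window is cut off at the alignment end (assumed behaviour).
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    start = 1
    while True:
        end = min(start + window - 1, length)
        yield start, end
        if end == length:
            break
        start += step


# An alignment of 402 columns, window size 150, step size 20.
for start, end in windows(402, 150, 20):
    print(start, end)
```

For these values the sketch yields 14 windows, from 1-150 to 261-402.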
Dependencies
alifoldz.pl depends on the programs RNAalifold (and RNAfold if --single
is used). Both programs are part of the Vienna RNA package which can be
downloaded from http://www.tbi.univie.ac.at/RNA/. The executables must
be in your PATH or some variables in alifoldz.pl must be edited to
point to your custom locations.
In addition, the script needs the Perl module Math::NumberCruncher,
available from www.cpan.org.
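Whether the required executables can be found in your PATH can be checked before running the script, for instance with this small Python sketch. The helper name is hypothetical; the check for the Perl module is not covered here.

```python
import shutil


def missing_programs(names=("RNAalifold", "RNAfold")):
    """Return the program names that cannot be found in PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_programs()
if missing:
    print("Not found in PATH:", ", ".join(missing))
else:
    print("RNAalifold and RNAfold found in PATH")
```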
Examples
Score an alignment of yeast SRP RNAs using a window size of 150 and a
step size of 20. Score only the forward strand:
alifoldz.pl -w 150 -x 20 --forward <SRP-yeast.aln
The program gives the following results: a list showing the coordinates
of each window, the strand, the RNAalifold MFE of the native alignment
in that window, the mean MFE of the corresponding random samples and
their standard deviation. The last column shows the calculated
z-score: (Native MFE - Mean MFE)/STDV. In this example all z-scores are
negative. In the region of approximately 120-330 the z-scores are about
-4 or lower (down to -5.7), which indicates an unusually stable local
secondary structure in this region.
Subsequently you can run "RNAalifold < SRP-yeast.aln" to get a
consensus secondary structure prediction.
From  To    Strand  Native MFE  Mean MFE  STDV      Z
-----------------------------------------------------
1     150   +           -18.21    -13.41  2.58   -1.9
21    170   +           -11.44    -10.39  2.78   -0.4
41    190   +           -18.60    -10.32  2.52   -3.3
61    210   +           -21.70    -12.12  2.88   -3.3
81    230   +           -25.84    -16.39  2.76   -3.4
101   250   +           -22.48    -16.01  3.05   -2.1
121   270   +           -24.49    -12.22  2.73   -4.5
141   290   +           -30.20    -13.12  3.01   -5.7
161   310   +           -29.29    -17.64  3.02   -3.9
181   330   +           -32.24    -19.34  3.07   -4.2
201   350   +           -34.27    -23.96  3.15   -3.3
221   370   +           -30.61    -21.98  3.45   -2.5
241   390   +           -27.17    -24.28  3.46   -0.8
261   402   +           -26.05    -24.37  3.56   -0.5
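A report like the one above can be post-processed to pick out the windows below the -3.5 cutoff mentioned in the Description. The following Python sketch is one possible way to do this; the parsing assumes seven whitespace-separated columns per data row, as in the example, and the function name is hypothetical. Only six rows of the example are included to keep it short.

```python
REPORT = """\
From To Strand Native MFE Mean MFE STDV Z
------------------------------------------------------------------
101 250 + -22.48 -16.01 3.05 -2.1
121 270 + -24.49 -12.22 2.73 -4.5
141 290 + -30.20 -13.12 3.01 -5.7
161 310 + -29.29 -17.64 3.02 -3.9
181 330 + -32.24 -19.34 3.07 -4.2
201 350 + -34.27 -23.96 3.15 -3.3
"""


def parse_rows(text):
    """Return (from, to, strand, native, mean, stdv, z) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7 or not fields[0].isdigit():
            continue  # skip the header and the separator line
        start, end = int(fields[0]), int(fields[1])
        native, mean_mfe, stdv, z = map(float, fields[3:])
        rows.append((start, end, fields[2], native, mean_mfe, stdv, z))
    return rows


rows = parse_rows(REPORT)
significant = [(r[0], r[1]) for r in rows if r[6] < -3.5]
print("windows below -3.5:", significant)
if significant:
    print("region covered:", significant[0][0], "-", significant[-1][1])
```

For these rows the selected windows span positions 121 to 330, in line with the region named in the text.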
If you shuffle the SRP RNA alignment by filtering it through
shuffle-aln.pl, you destroy the native secondary structure. The resulting
z-scores lie around 0 (within +/-2), which shows that the shuffled
alignment contains no significant secondary structure:
cat SRP-yeast.aln | shuffle-aln.pl | alifoldz.pl -w 150 -x 20 --forward
From  To    Strand  Native MFE  Mean MFE  STDV      Z
-----------------------------------------------------
1     150   +            -9.66    -11.93  2.91    0.8
21    170   +            -5.99     -7.19  2.19    0.5
41    190   +            -8.32     -6.83  2.27   -0.7
61    210   +            -8.63     -9.49  2.72    0.3
81    230   +           -15.29    -13.71  2.86   -0.6
101   250   +           -15.64    -15.87  3.08    0.1
121   270   +            -9.93     -9.91  2.53   -0.0
141   290   +           -12.17    -11.56  2.78   -0.2
161   310   +           -16.55    -16.55  2.62    0.0
181   330   +           -14.73    -17.96  2.95    1.1
201   350   +           -19.83    -22.75  3.17    0.9
221   370   +           -13.08    -18.87  2.97    2.0
241   390   +           -23.73    -23.75  3.35    0.0
261   402   +           -27.90    -24.72  3.27   -1.0
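To give an idea of why shuffling removes the structural signal, here is a naive Python sketch that permutes the columns of an alignment. It is an assumption for illustration only: this page does not describe how shuffle-aln.pl works, and the real script may be more careful (for example about gap patterns). The function name and the toy alignment are hypothetical.

```python
import random


def shuffle_columns(aligned_seqs, seed=None):
    """Return the alignment with its columns in random order.

    Each column is kept intact, so base composition and per-column
    conservation are preserved, while the pairing between distant
    columns is scrambled.
    """
    if len({len(s) for s in aligned_seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    rng = random.Random(seed)
    columns = list(zip(*aligned_seqs))
    rng.shuffle(columns)
    return ["".join(row) for row in zip(*columns)]


# Toy alignment of three made-up sequences.
alignment = ["GGCAUC-GCC", "GGUAUCAGCC", "GACAUC-GUC"]
for seq in shuffle_columns(alignment, seed=1):
    print(seq)
```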
Authors
Stefan Washietl <wash@tbi.univie.ac.at>
Last modified: Tue Mar 16 08:14:10 CET 2004