rnazCluster.pl - Cluster RNAz hits and print a summary of the results.
rnazCluster.pl [options] [file]
Only consider hits with RNAz class probability P>X (Default: 0.5)
Set these flags to print information for "windows" and/or "loci" in the output. By default, both single windows and combined loci are printed.
Print a header explaining the fields of the output (see below for a detailed description of the fields).
Generates HTML-formatted output of the results in the subdirectory results. For this option to work you need to have installed ghostscript and a few programs from the ViennaRNA package. More precisely, you need the following executables in your PATH: gs, RNAalifold, colorrna.pl, coloraln.pl. Alternatively, you can adjust the locations of these programs directly in the rnazCluster.pl script. Please note that if you use this option the program will get very slow because the figures have to be generated. It is also important that you have run RNAz with the --show-gaps option!
Name of directory where HTML pages are stored. Default: results
Prints version information and exits.
Prints a short help message and exits.
Prints a detailed manual page and exits.
rnazCluster.pl reads RNAz output files and combines hits in overlapping windows into "loci". It prints a summary of the windows and/or loci as tab-delimited text to the standard output. An explanation of the fields can be found below. See the user manual for a more detailed explanation of these values.
To work properly, your RNAz output file needs to contain position information. This means that the original alignments you scored with RNAz must have included genomic locations (i.e. MAF files with a reference sequence). Moreover, the original input alignments have to be ordered by the genomic location of the reference sequence.
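The idea of combining overlapping windows into loci can be illustrated with a minimal Python sketch. It is not the code used by rnazCluster.pl: the `cluster_windows` function, the `(seq_id, start, end)` record and the rule that windows sharing at least one position overlap are assumptions made for this example. It does show why the input has to be ordered by genomic location, since each window is compared only with the locus that precedes it.

```python
from typing import List, Tuple

# Hypothetical minimal window record: (sequence identifier, start, end).
Window = Tuple[str, int, int]
# One locus: (sequence identifier, start, end, indices of member windows).
Locus = Tuple[str, int, int, List[int]]


def cluster_windows(windows: List[Window]) -> List[Locus]:
    """Group overlapping windows on the same sequence into loci.

    Assumes the windows are already ordered by sequence and start position.
    Two windows are treated as overlapping when the second one starts at or
    before the end of the current locus (an assumption of this sketch).
    """
    loci: List[Locus] = []
    for idx, (seq_id, start, end) in enumerate(windows):
        if loci and loci[-1][0] == seq_id and start <= loci[-1][2]:
            # Extend the current locus and record the new member window.
            prev_seq, prev_start, prev_end, members = loci[-1]
            loci[-1] = (prev_seq, prev_start, max(prev_end, end), members + [idx])
        else:
            # No overlap with the previous locus: open a new one.
            loci.append((seq_id, start, end, [idx]))
    return loci


# Illustrative coordinates only, not real RNAz results.
windows = [
    ("human.chr1", 100, 220),
    ("human.chr1", 180, 300),
    ("human.chr1", 500, 620),
    ("contig42", 10, 130),
]
loci = cluster_windows(windows)
for number, (seq_id, start, end, members) in enumerate(loci, start=1):
    print(f"locus{number}\t{seq_id}\t{start}\t{end}\t{len(members)} window(s)")
```

If the windows were not sorted, the single pass above would split one locus into several, which is the kind of problem the ordering requirement avoids.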
If you want HTML output, please see the notes for the --html option above.
"Window" lines
Consecutive numbered ID for each window
The locus which this window belongs to
Identifier of the sequence (e.g. human.chr1 or contig42)
Start position of the reference sequence in the window
End position of the reference sequence in the window
Indicates if the reference sequence is from the positive or negative strand
Number of sequences in the alignment
Number of columns in the alignment
Mean pairwise identity of the alignment
Mean minimum free energy of the single sequences as calculated by the RNAfold algorithm
"Consensus MFE" for the alignment as calculated by the RNAalifold algorithm
Contribution to the consensus MFE which comes from the energy part of the RNAalifold algorithm
Contribution to the consensus MFE which comes from the covariance part of the RNAalifold algorithm
Number of different base combinations per predicted pair in the consensus secondary structure
Mean z-score of the sequences in the alignment
Structure conservation index for the alignment
Support vector machine decision value
RNA class probability as calculated by the SVM
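For downstream processing, a window line can be split on tabs and mapped onto the fields in the order listed above. The following Python sketch assumes that a window line contains exactly these 18 tab-separated columns. The column names in `WINDOW_FIELDS` are hypothetical labels chosen for this example, and the sample line contains made-up values rather than real RNAz output.

```python
# Hypothetical column names, in the order the fields are described above.
WINDOW_FIELDS = [
    "window_id", "locus_id", "seq_id", "start", "end", "strand",
    "n_seqs", "columns", "identity", "mean_mfe", "consensus_mfe",
    "energy", "covariance", "combinations_per_pair", "z_score", "sci",
    "decision_value", "probability",
]
INT_FIELDS = {"start", "end", "n_seqs", "columns"}
FLOAT_FIELDS = {
    "identity", "mean_mfe", "consensus_mfe", "energy", "covariance",
    "combinations_per_pair", "z_score", "sci", "decision_value", "probability",
}


def parse_window_line(line):
    """Turn one tab-delimited window line into a dictionary."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(WINDOW_FIELDS):
        raise ValueError(
            f"expected {len(WINDOW_FIELDS)} columns, got {len(values)}"
        )
    record = dict(zip(WINDOW_FIELDS, values))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    for name in FLOAT_FIELDS:
        record[name] = float(record[name])
    return record


# Made-up example line for illustration only.
sample = "\t".join([
    "window1", "locus1", "human.chr1", "100", "220", "+",
    "6", "120", "78.5", "-35.20", "-30.10",
    "-28.00", "-2.10", "1.4", "-3.25", "0.86",
    "1.92", "0.98",
])
record = parse_window_line(sample)
print(record["seq_id"], record["start"], record["end"], record["probability"])
```

A record parsed this way can then be filtered, for instance by keeping only windows whose `probability` exceeds a chosen cutoff.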
"Loci" lines
Consecutive numbered ID for each locus
Identifier of the sequence (e.g. human.chr1 or contig42)
Start position of the reference sequence in the locus
End position of the reference sequence in the locus
Indicates if the reference sequence is from the positive or negative strand
Maximum number of sequences in the alignments of this locus
Maximum mean pairwise identity in the alignments of this locus
Maximum RNA class probability in the alignments of this locus
Minimum z-score in the alignments of this locus
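The locus values are summaries over the windows that belong to a locus: maxima for the number of sequences, the mean pairwise identity and the class probability, and the minimum for the z-score. A minimal Python sketch of this aggregation follows. The `summarize_locus` function and the dictionary keys are hypothetical names, the numbers are made up, and taking the smallest start and largest end as the locus coordinates is an assumption of this sketch.

```python
def summarize_locus(windows):
    """Summarize the windows of one locus as described for the loci lines.

    `windows` is a non-empty list of dictionaries with the hypothetical keys
    used below; all windows are assumed to belong to the same locus.
    """
    return {
        "start": min(w["start"] for w in windows),
        "end": max(w["end"] for w in windows),
        "max_n_seqs": max(w["n_seqs"] for w in windows),
        "max_identity": max(w["identity"] for w in windows),
        "max_probability": max(w["probability"] for w in windows),
        "min_z_score": min(w["z_score"] for w in windows),
    }


# Two made-up windows of the same locus, for illustration only.
locus_windows = [
    {"start": 100, "end": 220, "n_seqs": 6, "identity": 78.5,
     "probability": 0.98, "z_score": -3.25},
    {"start": 180, "end": 300, "n_seqs": 5, "identity": 81.0,
     "probability": 0.91, "z_score": -2.75},
]
summary = summarize_locus(locus_windows)
for key, value in summary.items():
    print(f"{key}\t{value}")
```

Note that the minimum z-score is reported because more negative z-scores indicate more stable structures, so the minimum is the strongest signal in the locus.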
# rnazCluster.pl rnaz.out
Parses and clusters the hits in the file rnaz.out and prints loci and cluster information to the standard output.
# rnazCluster.pl -c 0.9 --html rnaz.out > results90.out
Clusters all hits from the file rnaz.out with P>0.9, writes the tab-delimited output to the file results90.out and, at the same time, generates a website in a subdirectory called results.
Stefan Washietl <wash@tbi.univie.ac.at>