Sampling the Boltzmann Ensemble

  1. Draw a sample of size 10,000 from the Boltzmann ensemble
  2. Calculate the desired property by using a perl script
  $ RNAsubopt -p 10000 < test.seq > tt
  $ perl -nle '$h++ if substr($_,5,3) eq "...";
      END {print $h/$.}' tt
      0.426057394260574

A far better way to calculate this property is to use RNAfold -p to get the ensemble free energy, which is related to the partition function via $F = -RT\ln(Q)$, for the unconstrained ($F_u$) and the constrained case ($F_c$), where the three bases are not allowed to form base pairs (use option -C), and evaluate $p_c
= \exp((F_u - F_c)/RT)$ to get the desired probability.

So let's do the calculation using RNAfold.

  $RNAfold -p

  Input string (upper or lower case); @ to quit
  ....,....1....,....2....,....3....,....4....,....5....,....6....,....7....,....8
  ccgcacagcgggcagugccca
  length = 21
  ccgcacagcgggcagugccca
  .........((((...)))).
   minimum free energy =  -5.70 kcal/mol
  ,,{{...}}||((...))}}.
   free energy of ensemble =  -6.25 kcal/mol
  ..((...))((((...)))). { -4.80 d=3.57}
   frequency of mfe structure in ensemble 0.408876; ensemble diversity 4.29
Now we have calculated the free ensemble energy of the ensemble over all structures (F_u), in the next step we have to calculate it for the structures using a constraint(F_c).

Following notation has to be used for defining the constraint:

  1. $\vert$ : paired with another base
  2. . : no constraint at all
  3. x : base must not pair
  4. $<$ : base i is paired with a base j<i
  5. $>$ : base i is paired with a base j>i
  6. matching brackets ( ): base i pairs base j

So our constraint should look like this:

  .....xxx............
Next call the application with following command and provide the sequence and constraint we just created.
  $ RNAfold -p -C
The output should look like this
  length = 21
  ccgcacagcgggcagugccca
  .........((((...)))).
   minimum free energy =  -5.70 kcal/mol
  .........((((...)))).
   free energy of ensemble =  -5.73 kcal/mol
  .........((((...)))). { -5.70 d=0.12}
   frequency of mfe structure in ensemble 0.95751; ensemble diversity 0.24
Afterwards evaluate the desired probability according to the formula given before e.g. with a simple perl script.
  perl -e 'print exp(-(6.25-5.73)/(0.00198*310.15))."\n"'

You can see that there is a slight difference between the RNAsubopt run with 10,000 samples and the RNAfold run including all structures.

Sven Findeiss 2013-11-22