So, I’m thinking that the most common failure of an RNA to fold will be when it folds into the second-lowest energy state. The score, then, would simply be the difference in free energy between the target shape and the energy of the second lowest, with the larger difference being better.
EDIT: After looking at the ViennaRNA package again I saw there was a built-in program to computer suboptimal shapes, RNAsubopt. The below can be ignored then, I guess just use RNAsubopt with increasing energy margins until a second structure is returned. Ignore the below if you so choose, it’s just me discussing how the algorithm could work if it was needed~
I realize that computing the second-lowest-energy state is something the software is probably not equipped to do, but after reading the through the algorithm used by ViennaRNA, I’m confident that it would be perfectly straightforward. I’ll explain how it could be efficiently implemented below.
The existing algorithm takes subsequences of nts i through j of the RNA and looks for the lowest energy of each. The lowest energy of a larger one can then be found by surveying the j-i ways to split it up, and summing their energies. To find the second lowest-energy all that would be needed would be to store the TWO lowest energies from each subsequence, as the two lowest energies of a larger subsequence will be a combination of the lowest energy from all parts but one.
That is, using the notation in http://www.ncbi.nlm.nih.gov/pmc/artic…, and using V2(i,j) and W2(i,j) to mean the second lowest states, as opposed to V(i,j) and W(i,j) which are the smallest…
V2(i,j) will be the second smallest of:
-E(FH(i,j)) - as there is only one way to create a hairpin, so there is no second-lowest way to arrange it
-either of the 2 lowest values obtained from the “E2” value, instead of just the minimum
-lowest value obtained from a modified “E2” which uses V2 in lieu of V
-either of the 2 lowest values of “E3”
-lowest value of “E3” with one instance of W replaced by W2
W2(i,j) will be the second smallest of:
-either of the 2 lowest values of “E4”
-lowest value of a modified “E4” with one instance of W replaced by W2
You can see that in many ways it’s like the original algorithm, and requires simply adding two new arrays of memoized results (V2 and W2), and a new computation for these two.
I just hope that this request isn’t too tall - but I think it would make an effective prediction algorithm!