RNA Structure Prediction Software

I got a couple of questions about the programs I’ve used to check and refine designs. All of them can be run through web servers.

CONTRAfold
Srna
RNAshapes

CONTRAfold and RNAshapes require you to input your sequence in FASTA format, which just means putting a header line of the form “>sequencename” on the line before your sequence. You can read more about the nuts and bolts of the programs at their respective websites if you’re interested.
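If you would rather not type the header by hand, here is a minimal Python sketch that wraps a plain sequence as a FASTA record. The function name `to_fasta` and the 60-character line width are my own choices for illustration and are not something the servers require.

```python
def to_fasta(name, sequence, width=60):
    """Return a FASTA record: a '>' header line followed by the sequence."""
    # Remove any whitespace pasted along with the sequence and uppercase it.
    seq = "".join(sequence.split()).upper()
    # Break long sequences into fixed-width lines (a common convention).
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "my_design" and the sequence below are made-up placeholders.
    print(to_fasta("my_design", "GGGAAACUUCGGUUUCCC"))
```

Paste the printed text straight into the input box of the server.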

Quasispecies: This is really interesting. Can you tell us more about how you use these tools?

After I design a sequence for a lab challenge, I stick it in a few server-based folding programs to see if there are parts that I need to tweak. The positional entropy feature of RNAfold and the alternative structures generated by Srna and RNAshapes are great ways of identifying “trouble spots” in your sequence. Similarly, if CONTRAfold or NUPACK predicts that your sequence will not adopt the target structure, the alternative structure it predicts can also point out “trouble spots”.
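For anyone curious what positional entropy measures, here is a rough Python sketch. It assumes one common definition: for each base, take the Shannon entropy (in bits) over its possible states, meaning paired with each possible partner or unpaired. I have not checked that RNAfold computes its plot in exactly this way, and the input format (a dict of pair probabilities) is hypothetical.

```python
import math


def positional_entropy(pair_probs, length):
    """Per-base entropy in bits.

    pair_probs maps (i, j) with 1-based positions to the probability
    that bases i and j are paired (hypothetical input format).
    """
    # Collect, for every base, the probabilities of each pairing partner.
    partners = [[] for _ in range(length + 1)]
    for (i, j), p in pair_probs.items():
        partners[i].append(p)
        partners[j].append(p)

    entropies = []
    for i in range(1, length + 1):
        # Whatever probability is left over is the unpaired state.
        unpaired = max(0.0, 1.0 - sum(partners[i]))
        states = partners[i] + [unpaired]
        entropies.append(-sum(p * math.log2(p) for p in states if p > 0))
    return entropies


if __name__ == "__main__":
    # Made-up numbers: base 1 pairs with base 4 half of the time.
    print(positional_entropy({(1, 4): 0.5}, 4))
```

A base that is always in the same state scores 0. The higher the value, the less certain the prediction is about that position, which is why high-entropy stretches are worth a second look.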

Usually there is some degree of agreement between all of the programs - they will predict alternative structures that are somewhat similar. By comparing your sequence folded into the target structure with your sequence folded into the most probable alternative structures, you can sometimes identify “trouble spots” that you can modify.

If the alternative structures have undesired base pairing, you can change the sequence to disrupt the base pairs in the alternative structure. If alternative structures have an undesired loop, you can try to stabilize the helical regions so that loop formation is accompanied by a larger increase in free energy.
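The comparison described above can be sketched in Python. This is only an illustration under my own assumptions: both structures are given in plain dot-parentheses notation without pseudoknots, and the function names are made up.

```python
def base_pairs(structure):
    """Parse dot-parentheses notation into a set of 1-based (i, j) pairs."""
    stack, pairs = [], set()
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.add((stack.pop(), pos))
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return pairs


def trouble_spots(target, alternative):
    """List target pairs the alternative lacks and pairs it adds."""
    t, a = base_pairs(target), base_pairs(alternative)
    return {"missing": sorted(t - a), "unwanted": sorted(a - t)}


if __name__ == "__main__":
    # Made-up structures for illustration.
    print(trouble_spots("((....))....", "....((....))"))
```

The “unwanted” pairs are the ones to disrupt by changing the sequence, and the “missing” pairs mark helices that may need stabilizing.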

Using Srna

The information provided by Srna is similar to that from RNAfold, but the programs seem to work differently, and Srna has the advantage of letting you actually see some of the alternative structures for your sequence. Srna gives you a little more information about the ensemble than RNAfold does. RNAfold shows you only the ensemble centroid, but Srna also shows you the centroids of clusters within the ensemble as well as the “size” of each cluster. You can also view a list of the most probable structures.

There are some settings you can modify, but I have not found any reason to do so. I just input the plain sequence and let it run.

Using RNAshapes

RNAshapes shows you structures beyond the MFE structure or ensemble centroid. It generates those structures differently than Srna, however.

RNAshapes has a few settings that you can modify, and like CONTRAfold it requires the sequence to be in FASTA format. I do two runs in “shape folding” mode, one at abstraction level 1 and one at abstraction level 5. The differences are usually minor. Check the boxes “calculate structure probabilities”, “generate structure graphs”, “allow lonely base pairs”, and “ignore unstable structures”. Leave everything else as is. When your results are ready, click on “results with RNA plots”. It will return several structures, each in the format:
Free energy, probability, dot-parentheses notation, shape notation, link to image

Using CONTRAfold

Input your sequence in FASTA format and let it run. I think someone posted a strategy to the effect of “penalize sequences that don’t fold to the target structure in CONTRAfold”. All I’m doing is implementing that strategy for myself. The trick is actually getting CONTRAfold to reproduce the target structure when you input a trial sequence. CONTRAfold routinely hates a lot of things that players design… possibly because we’re expecting these sequences to adopt wacky structures.

The folding programs I’ve used are:

CONTRAfold
CentroidFold
RNAfold
Sfold
KineFold
NUPACK

CONTRAfold and CentroidFold have been shown to be the most predictive in comparative tests. See CompaRNA.

RNAfold gives some good free energy information.

KineFold has a cool utility that will create a movie of your design being virtually synthesized.

I often will run these on my designs and also on competing lab designs to help me analyze which ones I might vote for. As an aside, none of the proposed designs for Chalk Outline pass all of these analyzers, but some come close.

Here are some links that might be useful:

CentroidFold: http://www.ncrna.org/centroidfold/
CONTRAfold: http://contra.stanford.edu/contrafold/
NUPACK: http://nupack.org/ (doesn’t work with IE)
RNAfold: http://rna.tbi.univie.ac.at/cgi-bin/R…
Sfold: http://sfold.wadsworth.org/cgi-bin/sr…
KineFold: http://kinefold.curie.fr/

I have also found this one but never used it. Now that I understand more, I may consider using this tool as well. It seems to have many features that could be useful.

http://bibiserv.techfak.uni-bielefeld…

I read an article titled “RNA secondary structure prediction by centroids in a Boltzmann weighted ensemble” from the Oct. 2011 edition of RNA (RNA Society):

http://rnajournal.cshlp.org/content/1…

I also found another RNA folding tool referenced in the article:

http://mfold.rna.albany.edu/?q=mfold/…

I’m not sure I like the tool (the article is GREAT, though), but I will keep playing with them all and try to figure out the best way to use them.

It would be really nice if the voting page had a bunch of extra columns with a % value for how close the design came to working in each of these programs, with a click-through to actually view the design as interpreted by the individual program(s). That would save each person from running the sequence through these programs independently, and it would give everyone who submitted a sequence a quick assessment of how they had done according to the programs.
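To make the “% value” idea concrete, here is a small Python sketch of one possible score: the percentage of positions whose pairing state (same partner, or unpaired in both) matches the target. This is just one metric I picked for illustration, not anything the site or the programs define, and the program names and predicted structures below are made up.

```python
def pair_table(structure):
    """partner[i] is the 0-based index paired with i, or -1 if unpaired."""
    partner = [-1] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def percent_match(target, predicted):
    """Percentage of positions with the same partner (or both unpaired)."""
    if not target or len(target) != len(predicted):
        raise ValueError("structures must be non-empty and of equal length")
    t, p = pair_table(target), pair_table(predicted)
    same = sum(1 for a, b in zip(t, p) if a == b)
    return 100.0 * same / len(target)


if __name__ == "__main__":
    target = "((((....))))"
    # Hypothetical predictions pasted in by hand from each server.
    predictions = {
        "program_A": "((((....))))",
        "program_B": "(((......)))",
    }
    for name, structure in predictions.items():
        print(name, round(percent_match(target, structure), 1))
```

A column like this per program would only be a rough guide, since a single shifted helix can lower the score a lot while the overall shape stays close to the target.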

2011 Paper on RNAwolf Noncanonical Structure Algorithm

RNAwolf source, binaries and parameter files (plus other stuff)
http://www.tbi.univie.ac.at/software/…