In EteRNA player puzzles and the RNA lab, we have four “bots” that try to solve puzzles and submit their own RNA sequences.
These bots are computer algorithms designed to take in a “target shape” and output RNA sequences that fold into that shape. They are called “inverse folding” algorithms, as opposed to “folding” algorithms, which try to predict the shape of an RNA given its sequence.
The following link explains inverse folding algorithms in general in more detail.
The four EteRNA bots use the following algorithms:
(1) ViennaRNA : The RNAinverse program from the ViennaRNA package (the same package that provides RNAfold). ViennaRNA mainly runs a stochastic search - in simple terms, it keeps changing bases randomly until it finds a sequence that folds into the target shape. It starts from a random sequence, tries random base changes, and keeps the change that brings the sequence's fold closest to the target shape.
Because of its random nature, ViennaBot’s performance is often unstable. It can solve a very hard puzzle in seconds when lucky, but it can also get stuck on a very easy puzzle when out of luck.
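To make the idea concrete, here is a minimal sketch of this kind of stochastic search. It is not RNAinverse's actual code. The names `toy_fold`, `distance` and `inverse_fold` are made up for illustration, and `toy_fold` is only a stand-in that maximizes the number of base pairs (Nussinov-style) instead of using a real energy model, so that the example runs without ViennaRNA installed.

```python
import random

BASES = "ACGU"
# Watson-Crick and G-U wobble pairs allowed by the toy folding model.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_fold(seq, min_loop=3):
    """Stand-in folding function (assumption): maximize the number of base pairs."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                  # case: j stays unpaired
            for k in range(i, j - min_loop):        # case: k pairs with j
                if (seq[k], seq[j]) in CAN_PAIR:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    shape = ["."] * n
    stack = [(0, n - 1)]
    while stack:                                    # trace back one optimal fold
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in CAN_PAIR:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    shape[k], shape[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(shape)


def distance(shape_a, shape_b):
    """Number of positions whose dot-bracket symbols differ."""
    return sum(a != b for a, b in zip(shape_a, shape_b))


def inverse_fold(target, fold=toy_fold, max_steps=2000, rng=random):
    """Start from a random sequence and keep mutations that do not hurt."""
    seq = [rng.choice(BASES) for _ in target]
    dist = distance(fold("".join(seq)), target)
    for _ in range(max_steps):
        if dist == 0:
            break                                   # folds into the target shape
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(BASES.replace(old, ""))
        new_dist = distance(fold("".join(seq)), target)
        if new_dist <= dist:
            dist = new_dist                         # keep the change
        else:
            seq[pos] = old                          # undo the change
    return "".join(seq), dist


if __name__ == "__main__":
    sequence, remaining = inverse_fold("((...))")
    print(sequence, remaining)
```

Running it several times gives different sequences, and sometimes a nonzero remaining distance when the step budget runs out, which is the same lucky/unlucky behavior described above.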
(2) InfoRNA : INverse FOlding of RNA takes the same approach as ViennaRNA - a stochastic search. The main difference, however, is that it first tries to initialize the sequence so that it minimizes the free energy in the target structure. In EteRNA, this is equivalent to designing an RNA to have the lowest energy in “Target mode”, regardless of whether it folds correctly in “Natural mode.” InfoRNA then runs the usual random process to come up with the answer.
Because of this initialization, InfoRNA is extremely fast and strong. However, its designs usually have an excessive number of GC pairs.
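The following sketch illustrates why a "lowest energy in Target mode" starting point tends to be GC-heavy. It is only a toy: the per-pair energies in `PAIR_ENERGY` are made-up numbers, and the helper names `pair_table`, `target_mode_seed` and `gc_pair_fraction` are hypothetical. The real program works with a full energy model rather than picking each pair independently.

```python
# Made-up per-pair stabilities (more negative = more stable); illustrative only.
PAIR_ENERGY = {
    ("G", "C"): -3.0, ("C", "G"): -3.0,
    ("A", "U"): -2.0, ("U", "A"): -2.0,
    ("G", "U"): -1.0, ("U", "G"): -1.0,
}


def pair_table(shape):
    """Map each position of a dot-bracket string to its partner (-1 if unpaired)."""
    table = [-1] * len(shape)
    stack = []
    for i, ch in enumerate(shape):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            table[i], table[j] = j, i
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return table


def target_mode_seed(shape, unpaired_base="A"):
    """Give every target pair the most stable pair type, ignoring the natural fold."""
    table = pair_table(shape)
    five, three = min(PAIR_ENERGY, key=PAIR_ENERGY.get)
    seq = [unpaired_base] * len(shape)
    for i, j in enumerate(table):
        if j > i:                       # visit each pair once, at its opening base
            seq[i], seq[j] = five, three
    return "".join(seq)


def gc_pair_fraction(seq, shape):
    """Fraction of the target pairs that are G-C or C-G."""
    table = pair_table(shape)
    pairs = [(seq[i], seq[j]) for i, j in enumerate(table) if j > i]
    if not pairs:
        return 0.0
    return sum(1 for p in pairs if set(p) == {"G", "C"}) / len(pairs)


if __name__ == "__main__":
    seed = target_mode_seed("((...))")
    print(seed, gc_pair_fraction(seed, "((...))"))
```

With these toy numbers every pair in the seed is a GC pair. A stochastic search that starts from such a seed and changes only what it must will keep most of them.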
(3) RNASSD : RNA Secondary Structure Design is an algorithm described in the following paper,
A New Algorithm for RNA Secondary Structure Design
M. Andronescu et al. 2003
The authors of the paper generously provided us with the source code to run the RNASSD bot. RNASSD uses stochastic search too, but differs from the previous two algorithms in that it first decomposes the target shape into a hierarchy of substructures. It then runs stochastic searches on the hierarchy recursively, until it comes up with the answer.
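As a rough picture of what "a hierarchy of substructures" can look like, the sketch below turns a dot-bracket target into a tree in which each node is a region closed by a base pair, and lists the regions from the innermost outward. This is a generic decomposition written for illustration, not the splitting rule used by RNASSD, and the names `pair_table`, `build_tree` and `bottom_up` are hypothetical.

```python
def pair_table(shape):
    """Map each position of a dot-bracket string to its partner (-1 if unpaired)."""
    table = [-1] * len(shape)
    stack = []
    for i, ch in enumerate(shape):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            table[i], table[j] = j, i
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return table


def build_tree(shape):
    """Nest the target shape into substructures, one node per enclosing base pair."""
    table = pair_table(shape)

    def node(lo, hi):
        # If lo pairs with hi, the children live strictly inside that pair.
        closed = lo < hi and table[lo] == hi
        i, end = (lo + 1, hi - 1) if closed else (lo, hi)
        children = []
        while i <= end:
            j = table[i]
            if j > i:                   # an outermost pair inside this region
                children.append(node(i, j))
                i = j + 1
            else:
                i += 1
        return {"span": (lo, hi), "shape": shape[lo:hi + 1], "children": children}

    return node(0, len(shape) - 1)


def bottom_up(tree):
    """Order in which a recursive designer could visit regions: children first."""
    order = []
    for child in tree["children"]:
        order.extend(bottom_up(child))
    order.append(tree["span"])
    return order


if __name__ == "__main__":
    for span in bottom_up(build_tree("((...))(...)")):
        print(span)
```

A recursive designer could run a small stochastic search on each innermost region first and then combine the partial sequences as it moves up the tree, so that each search only deals with a small piece of the puzzle.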
(4) NUPACK : NUcleic Acid PACKage is quite different from the other three in that its primary goal is not just to come up with a sequence that folds into the target shape. It tries to find the sequence with the minimal “ensemble defect”. The ensemble defect is the average number of incorrectly paired nucleotides at equilibrium, taken over the ensemble of possible secondary structures (= shapes).
The NUPACK bot is only used in the RNA lab now, and its performance is surprisingly good. There also seems to be a noticeable correlation between a NUPACK design’s ensemble defect and its lab synthesis score (the lower the defect, the higher the score).
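The ensemble defect described above can be written down in a few lines once the base-pairing probabilities are known. The sketch below assumes they are already available as a symmetric N x N matrix `pair_probs`, where `pair_probs[i][j]` is the equilibrium probability that bases i and j are paired (in practice this would come from a partition function calculation, which is not shown here). The function names are made up for illustration and this is not NUPACK's own code.

```python
def pair_table(shape):
    """Map each position of a dot-bracket string to its partner (-1 if unpaired)."""
    table = [-1] * len(shape)
    stack = []
    for i, ch in enumerate(shape):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            j = stack.pop()
            table[i], table[j] = j, i
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return table


def ensemble_defect(pair_probs, target):
    """Expected number of nucleotides whose pairing state differs from the target."""
    table = pair_table(target)
    correct = 0.0
    for i, partner in enumerate(table):
        if partner >= 0:
            # Target says i pairs with partner: count the probability of that pair.
            correct += pair_probs[i][partner]
        else:
            # Target says i is unpaired: count the probability that i pairs with nothing.
            correct += 1.0 - sum(pair_probs[i])
    return len(target) - correct


if __name__ == "__main__":
    # Toy numbers: the single target pair (0, 4) forms 80% of the time.
    probs = [[0.0] * 5 for _ in range(5)]
    probs[0][4] = probs[4][0] = 0.8
    print(ensemble_defect(probs, "(...)"))
```

In the toy case, bases 0 and 4 are each wrong 20% of the time and the three loop bases are never wrong, so the defect is 0.4 nucleotides out of 5. A design whose ensemble always matches the target has a defect of 0.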