Tell us about your EteRNA lab algorithms!

Eli_Fisker · April 27, 2011, 8:58pm

I would love to do this on the highest scoring design so far, Mat’s Branches V1. But “unfortunately” he doesn’t use tetraloops, so there would be more unknown factors in play than just the opposite-turning GC-pairs and the neck. So I think I will wait till a later puzzle, with a more usual design candidate. But thanks for the support!

Berex_NZ · May 4, 2011, 6:43am

Ok well I finally managed to get some time to finish this off.
This is how I go about solving a puzzle, and for 98% of them out there, it’s all you’ll ever need.
Some of the more enterprising player puzzles do utilise specific combinations, though, which you’ll discover if you have the time and patience for it.
Disclaimer: I’m not telling anyone how to play, but this is just how I go about it.
Two main approaches
A) From scratch using standard heuristics
B) Modifying best lab score, using experience gained, stabilising tetraloops and minimising high energy differential neighbours.
Ok, let’s start.

Use Q to highlight the whole puzzle in AU pairs.
Now it’s fine if you have all A’s on one side and U’s on the other. But every now and then you’ll need a pair to alternate to stabilise that stem. For most stem lengths, 2 alternates should be enough.
You locate and identify all the loops.
You should know your 1-1’s, 2-2’s by now. If not, please refer to the loop guide.
Tetraloops, G on the first nucleotide (nt) on the right.
If you so choose, you can leave your tetras empty with A’s. The reason for this is that your puzzle will then have fewer possibilities to bind to the other nt’s. Modifying your tetras can complicate the folding process.
Loops with two stems, G on the same side of each connector, usually on the side with less nt’s.
Internal Bulges are annoying, always wrap them on GC’s on either side.
1-3’s G on bottom and top right nt.
Multiloops can be tricky. No general rule to them but if you are just trying to lower their energy, try a G around the connections. At least one of them will work. And if you are lucky, sometimes the opposite nt will also work, lowering it further.
Play it by ear at this point. Look at your minimap, look at what areas are still red. I work on the strategy of using GC’s to stabilise, then minimise GC’s later. If the puzzle is especially loop heavy, watch your energies and neighbours, because loop heavy designs are more prone to domino effects.
By the time you’ve reached 50k points, you should have developed an intuition already. Trust it.
Now just do three things. First, you mouse-over each quad, checking the energies. Generally, if any two neighbouring quads differ by more than 3, you lessen the differential. This usually happens when a GC is next to an AU.
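The AU fill step at the start of this walkthrough can be sketched in Python. This is only a rough illustration, not how the game or Berex does it: the dot-bracket input, the function names, and the way the flipped pairs are spread evenly along each stem are all assumptions made for the example. It also leaves every loop as plain A’s and does not try to place any G’s.

```python
def parse_pairs(structure):
    """Return the sorted list of (i, j) base pairs in a dot-bracket string."""
    stack, pairs = [], []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            pairs.append((stack.pop(), idx))
    return sorted(pairs)


def split_stems(pairs):
    """Group pairs into stems: runs where (i, j) is followed by (i+1, j-1)."""
    stems = []
    for i, j in pairs:
        if stems and stems[-1][-1] == (i - 1, j + 1):
            stems[-1].append((i, j))
        else:
            stems.append([(i, j)])
    return stems


def fill_au(structure, alternates=2):
    """Fill every stem with A-U pairs and flip up to `alternates` pairs to U-A.

    Spreading the flips evenly along the stem is an assumption of this
    sketch; the post only says that 2 alternates are usually enough.
    """
    seq = ["A"] * len(structure)  # unpaired positions stay A
    for stem in split_stems(parse_pairs(structure)):
        n_flips = min(alternates, len(stem) // 2)
        step = len(stem) / (n_flips + 1) if n_flips else 0
        flip_at = {int(step * (k + 1)) for k in range(n_flips)}
        for pos, (i, j) in enumerate(stem):
            if pos in flip_at:
                seq[i], seq[j] = "U", "A"  # alternated pair
            else:
                seq[i], seq[j] = "A", "U"  # default orientation
    return "".join(seq)
```

Calling `fill_au("((((....))))")` gives a sequence of the same length in which every paired position holds an A-U or U-A pair, with two of the four pairs flipped.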
Second, I go to RNAFold, to measure entropies. Usually I don’t bother with the head or tail, cos they are hard-coded, mostly out of your control. But generally I try to keep the rest of my designs under 0.05.
Third, read your design from the end to the front. aka 3’ to 5’, where 5’ is the beginning of your sequence on nt 1. I do this because that is how the RNA ends up folding, in reverse. And you try to minimise the chances of it mis-folding.
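The first of the three checks can be sketched as a small Python helper. It assumes you have already read the quad energies off the game by hand and typed them in order along the stem; the function name and the example numbers are made up for illustration, and reading “3 difference” as a gap between neighbouring quads is an interpretation.

```python
def flag_differentials(quad_energies, threshold=3.0):
    """Return (k, k + 1, diff) for each neighbouring pair of quads whose
    energies differ by more than `threshold`.

    `quad_energies` is a list of per-quad energies in stem order, as read
    from the mouse-over display (hypothetical input format).
    """
    flagged = []
    for k in range(len(quad_energies) - 1):
        diff = abs(quad_energies[k] - quad_energies[k + 1])
        if diff > threshold:
            flagged.append((k, k + 1, diff))
    return flagged


# Hypothetical energies for five consecutive quads.
energies = [-1.0, -3.0, -2.0, -1.0, -4.5]
for left, right, diff in flag_differentials(energies):
    print(f"quads {left} and {right} differ by {diff}")
```

Any pair it reports is a candidate for lessening the differential, for example by changing one of the neighbouring base pairs.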

Please note: In nature it folds from the front to the end.

If you are designing a lab that is past round 1, you should always base your design on the top winner of the previous round/s. I know some people dislike the refining strategy. But then you can start off with a control and measure what works or doesn’t work in future rounds. E.g. Branches. Look at the difference between mat’s V1 and Berex 3-1.

Hope this helps. Enjoy!

cdmonson · May 4, 2011, 5:14pm

Generally, I start by copying an early design that has a decent number of votes, thus saving time and improving the odds of gaining some points if I vote on both theirs and mine. I also tend towards submissions with stats (bond counts, MP, etc.) that are solidly mid-range.

I then go through pair by pair, strengthening any weak points and then weakening some of the overly-strong sections. I try to strike a balance between stability and ease of bond formation (because that’s what life would do). Not too many GUs or CGs, but not too few either.

I’ll submit that and then make a few minor alterations and submit those as well.

JeehyungLee · May 4, 2011, 8:07pm

This is incredible Berex,

In fact, Eli and mat747 have been talking about your energy differential points in 4. (It was also discussed by AnticNoise in one of the old GetSat posts: http://getsatisfaction.com/eternagame…)

JeehyungLee · May 4, 2011, 8:09pm

This is amazing - thanks cdmonson!

I’m finding more and more people using the energy balance approach. May I ask how you search for overly strong or weak points?

Eli_Fisker · May 28, 2011, 7:31pm

I was stating that there was sometimes a penalty for having all GC-pairs in a string turn in the same direction, even if they are not right beside each other, and that twisting one or two (if there are more than 2 GC-pairs in the string) is helpful. (It should often be the GC-pair closing the tetraloop, so as not to interfere with the same-turning rule of GC-pairs in the multiloop.)

I saw a picture of a miRNA helix in the book Genetics: A Conceptual Approach and had the thought: maybe the reason we sometimes need to twist one or two GC-pairs in a string has to do with the spiral structure of the double-stranded part of RNA, or in other words, with helping the RNA start spiralling and be more stable.

What is the cause of the spiral structure in RNA and DNA anyway?

Let me hear what the rest of you think about this.

Eli_Fisker · July 11, 2011, 12:04am

Hi Adrien and Xmbrst!

Here is the answer to what happens when turning GC-pairs in the opposite direction. Back when I started this post, we didn’t have the energy view mode in the puzzles and therefore could not see what was going on. This is a demonstration of what goes on inside a multiloop at the energy level.

Energy forces at work in multiloops

Adrien_Treuille · July 11, 2011, 11:57pm

Very interesting, Eli. Do you feel that the experimental data bears out this observation, or that further tests might confirm / falsify it?

I guess I’m asking because one of the main points of EteRNA is to glean insights into RNA nano-engineering which are not present in existing models, such as the model which EteRNA itself presents to users.

Eli_Fisker · July 12, 2011, 1:27pm

Hi Adrien!

This is a mixture of things I have been writing to Jee and Jerry lately and sort of an overview of what I found out about direction of GC-pairs in multiloops.

I have been very focused on whether there was a pattern in where opposite GC-pairs in multiloops were allowed (the neck is safest). I think it came down to the fact that I discovered the pattern of direction of GC-pairs before I knew about the energy in multiloops. Now I sort of think that a certain number of opposite GC-pairs are allowed in multiloops, as long as the negative energy in the multiloop doesn’t go below a critical limit. I still think that preferably all GC-pairs should turn in the right direction (red nucleotide to the right), as this gets the negative energy inside the multiloop up and ensures it stays together. It may however be helpful to lower the energy around the neck area, so as to make e.g. a low energy neck (a neck with collective low energy) work together with the multiloop.

It is only in the designs with asymmetric multiloops (different numbers of nucleotides between arms) that I discovered a pattern in where these opposite GC-pairs in multiloops are most likely to be tolerated, other than in the neck. That is in the top arm, the one with the fewest nucleotides on both sides of it, compared to the others. I will do a post on asymmetric design later, when we have a bit more data. The tendency is that most of the rules for the symmetric designs hold in asymmetric ones as well, just a bit more loosely: it is allowed to stray a bit from some of the common rules.

Jerryfu is programming my strategies for direction of GC-pairs in multiloops. He asked a great question (he needed to know this in order to know how to score designs of this type with my upcoming algorithm):

Do multiloops need at least 3 closing pairs (ie. 1 neck, 2 arms), or 2 (1 neck, 1 arm)?

And he has a point. Things do look different in designs like the finger, where multiloops only have two arms or one arm and one neck.

I haven’t been keen on having the finger design in my algorithms for direction of GC-pairs in multiloops, but hadn’t thought up a way to exclude it. The tendency for direction of GC-pairs is not as clear in loops with just two arms, or one arm and a neck. Besides, the data from the Finger lab is contaminated by all the colored nucleotides that are placed in the multiloop ring, so I can’t see which is causing the mispairing in the shape data: the nucleotides in the loop ring or the wrong direction of the GC-pairs closing the loop. I do suspect, however, that direction of GC-pairs does not matter as much here as in loops with more arms. Those (small) multiloops with few arms are not as energetically pressured as the bigger multiloops with more arms, and thus negative energy inside the multiloop is not that important to keep the structure together. Time will tell with this one.

But for lab designs in general, at least 2/3 and preferably 4/5 of all GC-pairs in multiloops should turn the right way. And opposite GC-pairs are best tolerated in the neck.

Yes, I do think experience and data confirm this theory about direction of GC-pairs in multiloops. It is not always a 100% rule, but the tendency is clear. If followed, it mostly pays off.

Hope this answers your questions.
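As a rough illustration, the 2/3 and 4/5 thresholds can be written as a small Python check. It assumes you have already decided by eye, for each GC-pair closing a multiloop, whether it turns the right way; the function name and the verdict labels are invented for this sketch and are not the scoring Jerryfu is programming.

```python
from fractions import Fraction


def rate_gc_directions(turns_right):
    """Rate a design by the share of multiloop GC-pairs turning the right way.

    `turns_right` holds one bool per GC-pair closing a multiloop
    (True = red nucleotide to the right, judged by the designer).
    The labels "preferred", "acceptable" and "risky" are illustrative only.
    """
    if not turns_right:
        return None, "no GC-pairs"
    share = Fraction(sum(turns_right), len(turns_right))
    if share >= Fraction(4, 5):
        verdict = "preferred"
    elif share >= Fraction(2, 3):
        verdict = "acceptable"
    else:
        verdict = "risky"
    return share, verdict


# Hypothetical multiloop with five closing GC-pairs, one of them opposite.
share, verdict = rate_gc_directions([True, True, True, True, False])
print(share, verdict)
```

Exact fractions are used so that a design sitting exactly on 2/3 or 4/5 is not misjudged by floating point rounding. The sketch does not treat the neck differently, even though opposite pairs are best tolerated there.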

jerryfu · July 31, 2011, 3:58pm

Hello Ding,

The dev team is trying to create a script out of your strategy. I want to clarify something before I finalize the script. In your original post, you wrote:

“If the tetraloops aren’t either AAAA or one of the known patterns that gets an energetic bonus in EteRNA I check to see if there are mispairing possibilities with a complementary sequence nearby”

Could you tell me about the known patterns that get an energy bonus?

Thanks for your help!
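For reference, the quoted check could look roughly like the Python sketch below. It is not the dev team’s script. The bonus patterns are left as a parameter because they are exactly what is being asked about here, and the 15 nt window for “nearby”, the restriction to Watson-Crick complements, and all names are assumptions.

```python
import re

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Watson-Crick reverse complement (GU wobble pairs are ignored)."""
    return "".join(COMPLEMENT[nt] for nt in reversed(seq))


def risky_tetraloops(sequence, structure, bonus_patterns=(), window=15):
    """List tetraloops that could mispair with a nearby complementary run.

    Returns (loop_start, loop_sequence, match_position) tuples.
    `bonus_patterns` must be supplied by the caller; `window` is an
    assumed meaning of "nearby".
    """
    allowed = {"AAAA", *bonus_patterns}
    hits = []
    # A tetraloop is four unpaired bases closed directly by a base pair.
    for m in re.finditer(r"\((\.{4})\)", structure):
        start = m.start(1)
        loop = sequence[start:start + 4]
        if loop in allowed:
            continue
        target = reverse_complement(loop)
        lo = max(0, start - window)
        hi = min(len(sequence), start + 4 + window)
        for pos in range(lo, hi - 3):
            overlaps = pos < start + 4 and pos + 4 > start
            if not overlaps and sequence[pos:pos + 4] == target:
                hits.append((start, loop, pos))
    return hits
```

An empty result means every tetraloop is either AAAA, one of the supplied patterns, or has no exact complementary run within the window.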

Eli_Fisker · August 1, 2011, 1:43pm

Comment on my comment above: One of the robots just managed to make a row of 5 AU pairs (Nupack bot design 1, The branches, 80% synthesis). But that doesn’t mean that it will usually work.

Now I have a better idea about why this pattern is tolerated in the neck and why it is not advisable elsewhere in the design.

See explanation here in What’s so special about the neck?