Nature of the bots

Eli_Fisker · June 29, 2011, 9:31am

I’ve been thinking about why the bots fail certain puzzles. When it comes to the strongest bot, INFO-RNA, until now there are only 3 types of puzzles it fails. It doesn’t like zigzags and it doesn’t like really big, symmetrical puzzles. Then, for some reason, it also fails half-circles made of circles. Iroppy says about the bots (in the crop circles): “it is like they have no guide to the obvious boost points.”

I think the bots dislike patterns not commonly found in nature, as this is all their built-in algorithm knows of so far. The bots generally don’t like sharp angles either – by this I mean strings that take a very sudden turn. Nature likes things smooth and curvy. Even when it comes to rocks - it just takes a lot of time.

Do others have an opinion on why some of the bots fail certain types of puzzles? This could be an interesting discussion, now that the EteRNA crew wishes us to point to lab puzzles that beat Vienna and Nupack.

Eli_Fisker · August 8, 2011, 10:09am

Now there are more types of puzzles Infobot fails. But I’m starting to see a new pattern: Infobot failing puzzles that SSD-bot can do. This is interesting.

Quasispecies · August 8, 2011, 10:59pm

This page briefly describes the bots. INFO-RNA apparently starts by trying to find a sequence that minimizes free energy when the molecule is held in the target conformation (I’ll call it the MFE sequence). My gut says that the MFE sequence is probably a bad place to start when solving structures that are highly symmetric or full of closely-spaced loops. Here’s my two cents:

Zigzags and symmetric designs seem to be rare structures. Not many sequences fold into them. Those that do might be separated from the MFE sequence by a considerable distance.

The MFE sequence in a symmetric design probably has a lot of repeats and alternative base pairing options. The MFE sequence for a design with many closely-spaced loops probably has the potential to form fewer, larger loops of even lower energy.

To find a sequence that folds to the target, you need to make several changes to the MFE sequence. I would bet that most close neighbors of the MFE sequence are less similar to the target structure than the MFE sequence itself.

What if the algorithm accepts or rejects random changes to the initial sequence based on whether they bring the new sequence closer to the target structure? Considered individually, most changes on the path from the MFE sequence to the properly-folding sequence fold to something further away from the target structure than the unchanged sequence. Depending on how the algorithm is designed, the bot could get stuck in a local minimum of structural similarity.
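A minimal sketch of the accept/reject search described above, in Python. This is not INFO-RNA's actual algorithm. `fold` is a hypothetical caller-supplied function that returns a dot-bracket string for a sequence, and the names `pairs`, `base_pair_distance` and `greedy_search` are invented for illustration. The acceptance rule (keep a mutation only if the distance does not grow) is an assumption.

```python
import random

def pairs(structure):
    """Return the set of (i, j) base pairs in a dot-bracket string."""
    stack, found = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            found.add((stack.pop(), i))
    return found

def base_pair_distance(a, b):
    """Count the pairs present in one structure but not in the other."""
    return len(pairs(a) ^ pairs(b))

def greedy_search(seq, target, fold, steps=1000, seed=0):
    """Random point mutations, kept only if the predicted fold
    gets no further from the target (assumed acceptance rule)."""
    rng = random.Random(seed)
    best = base_pair_distance(fold(seq), target)
    for _ in range(steps):
        if best == 0:
            break
        pos = rng.randrange(len(seq))
        candidate = seq[:pos] + rng.choice("ACGU") + seq[pos + 1:]
        dist = base_pair_distance(fold(candidate), target)
        if dist <= best:  # moves that increase the distance are always rejected
            seq, best = candidate, dist
    return seq, best
```

Because worse-folding intermediates are never accepted, a search like this stalls whenever every route from the starting sequence to a solution passes through them, which is the local-minimum problem raised above.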

Eli_Fisker · August 9, 2011, 3:33pm

Hi Quasispecies!

Thanks for your fine explanation. Things make a bit more sense now.

paramodic · January 2, 2012, 6:40pm

I’d like to resurrect this thread and give a suggestion. We haven’t been looking at the bot algorithms (player puzzles) nearly as hard as we’ve been looking at the labs. I’d like to think that there’s almost as much information to be gleaned there as there is from the labs. To illustrate, we know generalizations about the bots: they don’t do well with zig-zags, they don’t do well with bond-sparse folds, etc. But that’s really not specific enough to learn anything from.

After talking with Eli Fisker, Starryjess, Edward, and a few others, I feel that it would be beneficial to start a puzzle makers’ collaborative. The purpose of the Collab would be to test the bots for specific errors and trouble points when solving RNAs. Anyone interested?

Also, it’s curious to me that Infobot failed this puzzle when SSD and Vienna bot killed it.
http://eterna.cmu.edu/eterna_page.php…

stevetclark · January 2, 2012, 6:48pm

I’m interested

paramodic · January 2, 2012, 8:55pm

If anyone would like to join me, I’m currently testing how the bots fare against structures formed by short, repetitive sequences.

Edward_Lane · January 2, 2012, 10:53pm

I’m in. I’m still trying to build ‘simple’ sequences the bots can’t manage but humans can. Mismatching so far seems to be the biggest source of bot confusion.

paramodic · January 2, 2012, 11:08pm

Agreed. I suspect that the bots also lack a separate protocol for handling bases that are meant to be unpaired. The result is that three kinds of RNAs will defeat the bots: those with many loops and sparse bonds; those with sharp bends and, as a result, large bulges; and those with repetitious structures, especially repetitious loops. One of the big differences I know of is that a human generally won’t mess with the bases that are meant to be unpaired, unless they have a specific reason to. Judging by the bots’ lab submissions, the bots don’t make that distinction.
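As a rough illustration of what a "separate protocol" for unpaired bases could look like, here is a small Python sketch of a mutation step that only touches positions paired in the target structure and leaves loop bases alone. It is a guess at the idea, not how any of the bots work; `propose_mutation` is a made-up name and the target is assumed to be a dot-bracket string.

```python
import random

def propose_mutation(seq, target, rng):
    """Mutate one base, choosing only among positions paired in the target."""
    paired = [i for i, ch in enumerate(target) if ch in "()"]
    if not paired:
        return seq  # nothing is paired, so nothing is changed
    pos = rng.choice(paired)
    return seq[:pos] + rng.choice("ACGU") + seq[pos + 1:]

# Usage: the loop bases (positions 2-4) are never touched.
rng = random.Random(0)
seq = "GGAAACC"
for _ in range(50):
    seq = propose_mutation(seq, "((...))", rng)
```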

For reference on my findings thus far, check out the thread titled ‘The problem with random bots’.

Eli_Fisker · January 2, 2012, 11:48pm

Thoughts from the recent discussion today. My focus is on puzzles that stump InfoRNA.

It is my perception that too much symmetry and too much asymmetry make the bots go mad.

Paramodic mentioned something: For example, they (bots) seem to also have issues with structures formed by repetitive sequences, so not just symmetrical structures, but repetitive ones.

I think Paramodic is on to something. Repetitive sequences might be the key to part of what makes the bots fail.

Ding’s mirrored snowflakes, for example, are not just symmetric, they are also repetitive.

Big symmetric puzzles are energetically pressured; at least Ding’s snowflakes were, as the strings were relatively short and close to each other.

But small and asymmetric puzzles stump bots too. Like Kudzu. Here the strings are very short, and the structure is very energetically pressured.

I’m just thinking that what the big symmetric puzzles and the small symmetric puzzles that stump bots have in common is relatively short strings.

But smaller symmetric puzzles stump InfoRNA too, especially if mirrored on more than one axis. I wonder if there is something there? Brourd’s clothespin spring is mirrored around two axes.

Ding’s snowflakes 4 (all of them) were mirrored on more than three axes. Notice the mirroring on the smaller arms too.

I think mirroring itself is a problem for the bots. In a mirrored puzzle, too many regions are similar, which in turn raises the chance of mispairing if the arms are solved similarly. Mirroring in itself can also put energetic pressure on the puzzle, especially if elements are close together and strings are short. The puzzle becomes like a handful of tightened bows, wanting to release the energy somewhere and pushing the structure apart.
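One way to put a number on "too many regions are similar" is to look for repeated stretches in the target structure. The Python sketch below is only an illustration under assumptions: the target is written in dot-bracket notation, `repeated_motifs` and `is_mirror_symmetric` are invented names, and the window length `k` is an arbitrary choice. It says nothing about how the bots score a puzzle.

```python
from collections import defaultdict

def repeated_motifs(structure, k):
    """Map each length-k dot-bracket window seen more than once
    to the list of positions where it starts."""
    seen = defaultdict(list)
    for i in range(len(structure) - k + 1):
        seen[structure[i:i + k]].append(i)
    return {motif: starts for motif, starts in seen.items() if len(starts) > 1}

def is_mirror_symmetric(structure):
    """True if the structure reads the same from the 3' end
    once opening and closing brackets are swapped."""
    swap = {"(": ")", ")": "(", ".": "."}
    return structure == "".join(swap[ch] for ch in reversed(structure))

# Two identical hairpins side by side: repetitive and mirrored.
print(repeated_motifs("((..))((..))", 6))
print(is_mirror_symmetric("((..))((..))"))
```

Arms that share a motif can be given the same sequence, and identical arm sequences are exactly what gives the strands a chance to pair with the wrong partner.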

With mirroring around axes come sharp angles. Sharp angles are often a problem for the bots, like 90 degrees or smaller. The more two strings are bent close to each other, the harder the puzzle is for us to solve, and bots don’t like them either.

paramodic · January 3, 2012, 12:10am

Good post, Eli!

I’d like to add that Brourd’s puzzle shows a higher level of symmetry in that the arms of the puzzle are symmetric to each other. If you abbreviate out the middle portion of the RNA, it simply forms an ‘X’. For most things, this would be unimportant, but RNA folds exhibit as many, if not more, global effects as local effects.

I’d also like to point out that Infobot seems to do just fine with fractal structure RNAs, where SSD-bot and Viennabot fail.

Eli_Fisker · January 3, 2012, 12:47am

Thanks for your comments. I really like your observation that SSD-bot and Vienna fail on the fractal structures. It reminds me of the title of a paper I once found: The fractal nature of RNA secondary structure. I must admit I haven’t read it. I was just searching for fun to see if anyone had taken a fractal approach to RNA design. And the answer was yes. Thought you would like it. Click on the text and it becomes bigger and readable. It is just an intro.

jandersonlee · January 4, 2012, 12:09pm

The real question to me is: are the bots failing at something that folds in the wet lab, or is the EteRNA game model too lax, allowing puzzle shapes that do not exist in nature, like the all-GU 2-2 loop?

Eli_Fisker · January 4, 2012, 12:32pm

I’m inclined to think the latter, but I do not have the mathematical or scientific background to say that it is so. So this is just my personal feeling.

Eli_Fisker · January 4, 2012, 1:31pm

To cite what Paramodic said during our chat debate on bots and their puzzle solving: For example, they (bots) seem to also have issues with structures formed by repetitive sequences, so not just symmetrical structures, but repetitive ones.

InfoBot just failed my whole Christmas series: Christmas bird and Christmas special 1 & 2.

There certainly are repetitive sequences in these three puzzles. They are built on a small pattern, a double bulge structure I found in Paramodic’s (RNA) GC only puzzle.

Maybe repetition is why lots of 2-2 loops stump bots too, although it probably is not the whole explanation in that case. A few 2-2 loops alone are usually not enough to stump InfoBot.

paramodic · January 4, 2012, 4:15pm

I strongly feel that the bots don’t have a special or separate protocol for handling unpaired bases, or for boosting. I think that’s why loops tend to stump the bots.

paramodic · January 4, 2012, 4:25pm

A new observation: Infobot seems to have a great deal of difficulty with puzzles featuring 1 nt bulges. The details require a lot more fleshing out at the moment, but it seems to follow this pattern:
-The bulges are on the same side of the fold as each other (ex: Left side)
-The bulges are close to an internal loop
-The bulges appear on sequential or adjacent strings.
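To test the pattern above against more puzzles, it would help to find the 1 nt bulges automatically. The Python sketch below is one possible way, assuming the puzzle target is available as a dot-bracket string. The function names are made up, and reporting the side as the 5' or 3' strand of the stem is an assumption; it will not always match "left" and "right" as drawn in the game.

```python
def pair_table(structure):
    """table[i] is the partner of base i, or None if i is unpaired."""
    table = [None] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            table[i], table[j] = j, i
    return table

def one_nt_bulges(structure):
    """Return (position, strand) for each single unpaired base
    wedged between two otherwise stacked base pairs."""
    table = pair_table(structure)
    bulges = []
    for i, j in enumerate(table):
        if j is None or j < i:
            continue  # visit each pair once, from its opening base
        # 5' strand: i pairs with j, i+1 is unpaired, i+2 pairs with j-1
        if i + 2 < j - 1 and table[i + 1] is None and table[i + 2] == j - 1:
            bulges.append((i + 1, "5'"))
        # 3' strand: i+1 pairs with j-2, j-1 is unpaired
        if i + 1 < j - 2 and table[j - 1] is None and table[i + 1] == j - 2:
            bulges.append((j - 1, "3'"))
    return bulges

# Two bulges on the same strand of one stem.
print(one_nt_bulges("(.(.(...)))"))
```

Counting how many bulges share a strand, and how far each sits from the nearest internal loop, would be the next step in checking the three points listed.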

As evidence, I present the puzzles ‘600!’, ‘(RNA) Cytosine Free’, and ‘Comet Tail’. 600 is where it really caught my attention, since Vienna and SSD bots knocked this relatively simple puzzle out of the park, but Infobot failed. Bear in mind that the (RNA) puzzle may be contaminated by the structure formed by a lack of cytosine, as Infobot also failed the uracil free puzzle. I’d love to see this explored more. Any ideas why Infobot gets stumped by the bulges?

Eli_Fisker · January 4, 2012, 4:32pm

Good observation, Paramodic.

I’ve been thinking it is not all about size. New combinations of elements might also stump bots.

My puzzle A bulge and 1-2 loops made InfoBot time out. It has a new combination of elements.

Brourd’s Small and easy 2. This one required a new and surprising boost solution.

And those puzzles are very small, like Comet tail as you mentioned.

paramodic · January 4, 2012, 7:04pm

Now to be fair, there is a large, repetitious puzzle involving many 1 nt bulges called ‘the fishhook’ that SSDbot solved, but Vienna and Infobots couldn’t. I think that SSD bot is exceedingly good with repetitious structures, though I don’t quite understand why yet. Infobot seems to be hit-and-miss with them, and Vienna obviously has the most trouble with them.

Eli_Fisker · January 5, 2012, 11:44am

Very well noticed. Yes, I think you got something there. SSD bot is good with repetitious structures, compared to Infobot and Vienna. I have been wanting to know for a long time what was the cause of why SSD bot solved puzzles that Vienna and especially Infobot failed. Thanks.