Deploying new scoring function - please give us feedback!

Tom · December 1, 2012, 5:52am

Hi everyone,

We are excited to announce an update to our scoring system for switch lab puzzles. I would first like to thank our data analysis expert Hanjoo Kim for his time and effort improving our scoring system - he has put in a huge amount of work and it really shows! I would also like to thank Brourd for giving us reminders and ideas to help us improve the scoring method. Thanks all around!

Without further ado, here is a link to an image archive of our new scores:

https://www.dropbox.com/sh/qdumrhmn5l…

This link should contain two folders, each full of .png image files corresponding to every switch round, going back to the Simple RNA Switch. (In lab, we refer to each round by its number in the history of EteRNA, which is why the earliest one is Round 51.) One folder contains images for all of the scores you have seen before, by our old metrics; the other folder contains the same data, analyzed and scored by several new procedures! We consider the new scoring scheme to be an improvement over the old one - many of the scores have improved, but a handful have dropped as well. We think the new one is fairer and more accurate.

A note on how to read the image files - you will notice 4 rows of data for each RNA. The top row is SHAPE data (as a reminder, SHAPE can react with just about any unpaired nucleotide) without FMN, the second row has SHAPE data with FMN, the third row has DMS data (DMS reacts with any unpaired A or C) without FMN, and the fourth row has DMS data with FMN.

There are a few important differences in the scoring scheme. Perhaps the most obvious one is that we are now including the aptamer nucleotides in the switch score - previously we had omitted them. As I said, this has boosted the scores for many people and occasionally reduced scores. The other major difference is that Hanjoo has reworked the data analysis protocol. Before we show all the data to the players, the data must be normalized - essentially, we have to compute how light and dark the bands should be for each experimental run. Hanjoo has kindly improved the normalization strategy and we think this will yield more accurate final results that better reflect the experimental results.

A few reminders and explanations of how we score the data:

With the switch labs, we care about whether or not the RNA appears to switch when we add FMN. While we only show the target structure and the most stable structure in the game, RNA actually frequently exists as a mixture of related shapes when it is in solution. I gave a brief explanation of this in the following link:

https://getsatisfaction.com/eternagam…

So, because there may be many different RNA shapes in solution at any time, all of them can contribute to our data! Mostly, we care if your RNA responds to FMN - if there is a change in the pattern of dark bands, it means that your RNA is binding FMN at least a little. However, the FMN binding shape may not make up 100% of the RNA molecules in the test tube, so sometimes it doesn’t look like the RNA has fully switched. In cases where the dark bands move in the correct direction (light to dark or dark to light), but don’t go all black or all white, we give partial credit. Rhiju covered this in a post, with the link below:

https://getsatisfaction.com/eternagam…

In the image files, you will see green circles of various darkness - the more green, the more credit. A light green circle may only get 0.2 or so of a point.

As I mentioned above, we are now scoring the aptamer nucleotides. While they are unpaired, not all of them are expected to receive full protection from FMN. A perfect FMN binder would have reactivities in the presence of FMN as follows:

AGGAUAU AGAAGG
0001000 000000 [SHAPE]
0001010 001100 [DMS]

1 means dark, a reaction, normally associated with unpaired. 0 means white, no reaction, the nucleotide is protected from the chemical probe by FMN (or typically because it is in a base pair). Without FMN around, these nucleotides will behave as normal paired (white bands) or unpaired (dark bands) nucleotides.
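To make the expected pattern above concrete, here is a minimal Python sketch that compares a set of bands against the ideal FMN-bound pattern. It is not the lab's scoring code: the helper name `count_matches` and the idea of reducing each band to a plain 0 or 1 are assumptions for illustration only, and the real scoring gives partial credit.

```python
# Ideal reactivities in the presence of FMN, copied from the table above.
# 1 = dark band (reactive), 0 = white band (protected).
APTAMER = "AGGAUAU" + "AGAAGG"
EXPECTED = {
    "SHAPE": "0001000" + "000000",
    "DMS":   "0001010" + "001100",
}


def count_matches(probe, observed):
    """Return (matches, total) for observed 0/1 bands versus the ideal pattern.

    Hypothetical helper: assumes each band has already been called
    dark ("1") or white ("0"), which ignores partial credit.
    """
    expected = EXPECTED[probe]
    if len(observed) != len(expected):
        raise ValueError("observed pattern must cover all aptamer nucleotides")
    matches = sum(o == e for o, e in zip(observed, expected))
    return matches, len(expected)


# Sanity check on the table itself: DMS only reacts with A or C,
# so every expected dark DMS band should sit on an A or a C.
dms_dark_bases = [b for b, e in zip(APTAMER, EXPECTED["DMS"]) if e == "1"]
print(dms_dark_bases)
print(count_matches("SHAPE", "0001000000000"))
```

The sanity check is a quick way to confirm that the DMS row agrees with the rule that DMS reacts only with unpaired A or C.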

Now, the most important part. Please give us feedback! The player community has proven to be very observant, and we hope you can help us find improvements, errors, or anything that isn’t clear about the new scoring system and results. If you find results that still seem to be scored inaccurately (or mysteriously), please post in this thread about them. We will look forward to any feedback you can give us!

As a final note, you’ll see that the few most recent rounds are missing from the new scoring system data files. The last few rounds have had an assortment of errors due to several recent issues in lab - we have retaken all the lower quality data and are getting around to reanalyzing it with the new scoring system. Those image files will hopefully be available soon.

We are excited to get any feedback on the new scoring system!

Best,
Tom

Tom · December 1, 2012, 5:56am

Also, I noticed that there seems to be an incomplete image file for R57 of the new scoring info. I will track down a complete form of that image.

Eli_Fisker · December 1, 2012, 9:28am

Did I get this correct: The grey squares are for nucleotides that will not receive a score. The red x’s are for nucleotides that do receive a score, but failed to deserve one.

Brourd · December 1, 2012, 11:36am

Question,

In this screenshot, it is clear that the CC sequence switches with DMS, but is shown to not switch with SHAPE. When it comes to scoring how a base switches, does one of the chemical mapping techniques take precedence over the other, or does a base score a full point, as long as it switches according to one of the chemical probes?

Brourd · December 1, 2012, 12:34pm

With the recent FMN Switch 2.0 lab, I was going through the old results, and something caught my eye.

Here, it appears we have 2 identical readings for base 57, yet, one design gets a point for switching, and the other doesn’t. Now, from what I can tell, I would think neither base would get a full point, as it is supposed to switch from paired to unpaired.

So, should this base be removed in the scoring of the bases, as it is near the cut off point for the "grey" data? Or, should both designs be scored on this base and given a full point, or should the bases not be scored a full point? This problem is also present in many other designs, but is usually scored a full point in those cases.

Tom · December 1, 2012, 5:30pm

For the questions so far -

Yes, Eli, a gray box means those nucleotides were not counted. A red X means the nucleotide should switch but didn’t.
Brourd, good question, I forgot to explain. In cases like your top image where one probe shows the switch and another doesn’t, we give credit according to whichever one would score higher, I believe - I will double-check on this and post a correction if that’s not the case.
Hmm, interesting problem cases - here I think the scoring is likely done according to numerical values that may be discernibly different, even if their image representation is not. We will review cases like this one where there is so much signal it’s actually hard to tell what’s going on - thanks for pointing this out!
For the bottom two, the top two lanes are SHAPE and the bottom two are DMS. SHAPE can report on G but DMS cannot, so we just score according to SHAPE. Does that answer your question?

Also, we have reports about the image files being too small to view easily. If you click on one to make it full screen, in the bottom right there is a symbol that will let you download each one individually. Then you can blow up any image to your liking in Adobe Reader or Preview or whatever other image viewer you have. Let me know if there are any issues with this.

Brourd · December 1, 2012, 5:49pm

Thanks for the response Tom.

As for the “bottom two,” I apologize if I had been unclear, that was just the SHAPE and DMS data for the 2 cases above it. Now that I look, I wonder why I put that in there

jnicol · December 1, 2012, 10:24pm

Can we also get the DMS data? A spreadsheet would be best for the SHAPE and DMS data. Sending it back in the http.response to the labs will work also, as is done currently for the SHAPE data.

Thanks,
John

Brourd · December 1, 2012, 10:56pm

Some more minor questions I have.

Will the original structure based scoring be integrated in any way to the final score for each design? For example, in order for it to be considered a successful switch, must it score higher than 94 for a switch score, and pass an overall structure score of 75 for both shapes in order to be considered a winner? Or would we go with a penalty system, where for every 5 points below a set threshold, an example could be 90, we penalize 1 point or 0.5 point off the switch score. Player and developer input on this idea would be much appreciated.
Will DMS data eventually be integrated into the game interface, allowing us to optionally see how those bases scored and switched as well?
I like how each base that switched was either given a green circle or red X in order to signify how many points it gained. Could this be integrated into the game as well? As of right now, we are limited to seeing how all bases switch, but we have no indication of which bases are scored and how they scored. I believe something like this being in the game would be more friendly to casual and new players who do not want to be hardcore looking at dark and light bands, determining which bases received a full point and by how much.
There are other chemical probes like CMCT. Is there any plan in the near future to include these into the chemical mapping and scoring of designs?
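The two options in the first question above (a hard cutoff versus a penalty) can be sketched in Python as follows. This is only an illustration of the proposal as worded, not anything the lab has adopted: the function names are hypothetical, "pass a structure score of 75" is read as at least 75, and the penalty is assumed to apply once per full 5 points below the threshold.

```python
def is_winner(switch_score, structure_scores, switch_min=94, structure_min=75):
    """Hard-cutoff variant: switch score above 94 and both shapes at 75 or more.

    structure_scores holds one structure score per state (both shapes).
    """
    return switch_score > switch_min and all(
        s >= structure_min for s in structure_scores
    )


def penalized_switch_score(switch_score, structure_score,
                           threshold=90, step=5, penalty=1.0):
    """Penalty variant: subtract `penalty` for every full `step` points
    that the structure score falls below `threshold` (assumed rounding)."""
    shortfall = max(0, threshold - structure_score)
    return switch_score - penalty * (shortfall // step)


print(is_winner(95, [80, 75]))
print(penalized_switch_score(95, 78))
```

Setting `penalty=0.5` gives the gentler version mentioned in the question.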

Thanks for the game

Tom · December 2, 2012, 12:31am

Good points, guys. I’ll talk with the devs about prospects of making the DMS data available for each lab round, and incorporating the scoring/partial credit system into the game.

Right now the final switch score number is the average of the switch scores for SHAPE data with/without FMN and DMS data with/without FMN. We have discussed what to do with the “structure based” scoring as you said, but have been focusing on switching so far. We may decide to include it, and more discussion about the best way to do that would certainly be welcome. The thing is that a successful “switch” has a broad definition - it may be something that starts out in all the no-FMN state and then a few of the molecules switch; it could also be something that starts out with a lot of molecules in the FMN-ready shape (even without FMN!) and then switches all the way. The structure based scoring may actually penalize some successful switches, but it is still very much worth considering.
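As a small worked example of the averaging described at the start of the paragraph above, here is a Python sketch. The function name and the sample numbers are made up for illustration, and it assumes the SHAPE and DMS switch scores are on the same scale and weighted equally.

```python
def final_switch_score(shape_switch_score, dms_switch_score):
    """Average the SHAPE-based and DMS-based switch scores (equal weights assumed)."""
    return (shape_switch_score + dms_switch_score) / 2.0


# Made-up numbers: a design that switches well by SHAPE but less clearly by DMS.
print(final_switch_score(90.0, 70.0))
```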

Brourd, very interesting that you read up on CMCT - it is another classic chemical probe. We actually use it in lab for a handful of non-EteRNA experimental projects, but I have to say I doubt we will start using it for EteRNA for two reasons. The first is that our current experimental scheme is efficiently operating at a maximum throughput for 16 samples, and adding CMCT trials would push us over that threshold. Also, CMCT doesn’t give as consistent data as DMS and SHAPE - it is not uncommon for it to react when we think it shouldn’t (or not react when we think it should!). DMS and SHAPE are more predictable and we think better suited for the game.

Thanks for the great discussion points!

Brourd · December 2, 2012, 1:37am

Thanks for the response! I do understand how integrating the old single state scoring could cause issues; perhaps someday we will figure that all out. Like I said, we could make the minimum threshold rather low, an example being around 75, essentially meaning that 3/4 of the molecules folded correctly, but as long as over 94% switched correctly, it still does well.

This is kind of related to the previous conversations with developers related to there being multiple OFF and ON states for RNA switches. The RNA switch design problem will encompass multiple areas, including the design of switches with defined OFF states and defined ON states, as well as designs where neither state is well defined, but will instead focus on measuring how well the molecule binds to the RNA with certain structural motifs required.

An example of the latter will probably end up being something like Brent Townshend’s project, and a score of the shape itself is probably unnecessary. An example of the former is similar to what we are currently doing, and I would think a minimum score of how well the RNA folded in each state will be needed.

The RNA switch design problem has many facets to explore, and I look forward to what we shall be able to discover here in EteRNA.

Thanks for the game

Eli_Fisker · December 2, 2012, 2:16pm

We wish to get base numbers along with the sequence for each design.

I miss base numbers in the pictures. Jnicol was mentioning base numbers to point me to which green bases we were talking about in the newly scored pics.

Eli Fisker: John, you have been sitting counting bases, right?
Eli Fisker: But I bet some will have trouble remembering with where to start counting from
jnicol: yes, counting at first, but I memorized the scored pattern to make it easier to count
Eli Fisker: Hehe, John, you have clearly been watching these pictures for too long. Love that
Eli Fisker: I’m just saying, most will not be able to, which is why we will need the numbers
janelle: Right, numbers would greatly help
jnicol: agreed numbers would have saved a lot of time

We were also discussing which green was which kind of green. As John said: I like Mat’s idea to put numbers on the lab data for the visually impaired.

I think it will help the rest of us too.

Eli_Fisker · December 2, 2012, 3:34pm

I would also like to have the lab name and round on each slide of the data. It will be a great help when we discuss the different labs.

jnicol · December 2, 2012, 7:59pm

After reviewing the new scoring data, I have a few observations:

There are 2 pictures for every design: the eterna_score.png, which shows a total score for each row test, and the switch_score.png, which shows the base points / maximum base points score. I assume the eterna_score pictures are used as a reference and that switch_score is the one used for the actual scoring.
The good bases actually have 4 levels of good, dark green, medium green, light green and white. It appears that these are scored as follows, dark = 1.0, medium = 0.7, light = 0.3 and white = 0.05. Can we please have the exact method for scoring? If the above is correct, I see many discrepancies.
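To show what the guessed values above would imply, here is a Python sketch of a "base points / maximum base points" score. The credit values are the by-eye estimates from this post, not confirmed numbers. Treating a red X as 0 points, skipping gray boxes entirely, and scaling the result to 0-100 are additional assumptions, and the labels and function name are hypothetical.

```python
# Per-base credit as estimated by eye in the post above (unconfirmed guesses).
CREDIT = {
    "dark": 1.0,
    "medium": 0.7,
    "light": 0.3,
    "white": 0.05,
    "red_x": 0.0,  # assumption: a base that should switch but did not earns nothing
}


def switch_score(marks):
    """Return 100 * base points / maximum base points.

    `marks` is one label per base; "gray" bases are not counted at all.
    """
    scored = [m for m in marks if m != "gray"]
    if not scored:
        return 0.0
    points = sum(CREDIT[m] for m in scored)
    return 100.0 * points / len(scored)


# Five scored bases plus one gray base that is ignored.
print(round(switch_score(["dark", "medium", "light", "white", "red_x", "gray"]), 2))
```

Comparing the output of a sketch like this against the numbers printed on the switch_score images would be one way to pin down where the discrepancies come from.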

janelle · December 3, 2012, 4:19am

To the Eterna Team,

A big, “THANK YOU” for the tremendous work that went into the new scoring system. I realize we are all still figuring it out but hopefully, the new data here (and the data to come) will enhance our ability to build “good” molecules.

Brourd · December 3, 2012, 8:13am

Found another one of those mystery “scoring is likely done according to numerical values that may be discernibly different, even if their image representation is not.”

This one is base 44 of the FMN Switch 2.0 lab, and is supposed to switch from a loop to a stack, yet is given a plus score when it appears that there is no discernible shift in the image representation.

This appears to be present in sequences 4, 5, 6, 9, 14, 16 of round 61, base 44.

If this is a case where there is a difference in the numerical value giving a point, could the numerical values for the data also be included for player reference in the future please?

Thanks!

Brourd · December 11, 2012, 8:04pm

Found another weird one in regards to no difference in image representation.

Old Scoring

New Scoring

Now, the score on this design went up quite a bit from the old score to the new, not just due to the aptamer bases, but also due to the bases that showed no visible difference being scored a full point, when before, they were penalized. This is sequence 14, “Good Solution 4” from round 52.

My question is, is there a difference in the numerical data for these bases with high signal, or are bases with high signal handled differently in the new scoring system?

Eli_Fisker · December 15, 2012, 3:55pm

Mat and I have made a guide to the new scoring system. It is made with the intention of helping players get the best out of the extra data we are now getting. We have covered what we can for now and intend to add more later.

Visual guide to the new scoring system