I’m just wondering has there been any thought towards using something other than SHAPE to measure our synthesis results?
I’m not exactly sure what other options out there are viable though.
But just thought I’d ask.
Hi Berex,
Right now SHAPE is the only data we can use - it is the fastest way to analyze RNA shapes, which allows us to run a weekly synthesis & analysis schedule.
There are a few other analysis processes we could use, but they are much more expensive and time-consuming than the current method. Once we come up with a “killer lab puzzle” and someone solves it, we could use those processes to solidify the result.
Hi Jee,
Thank you for that. Which processes would those be, once we end up with a “killer lab puzzle”?
Are we able to get the scoring algorithm used for SHAPE?
That way we would be able to double-check the scores.
The more eyes on it, the more we can replicate and check its validity.
The recent problem in scoring data was actually a problem with the SHAPE data itself, so there’s no way we could have caught it.
As for scoring (confirmed by Jee in chat), it works like this: there are two basic thresholds. The SHAPE threshold (which is listed in the csv) applies to all bonded areas and is used for color-coding results. Then there’s a much lower loop threshold, calculated as 1/4 × (SHAPE threshold) + 3/4 × (SHAPE minimum). Count how many nucleotides in unbonded portions are at or below the loop threshold, and how many nucleotides in bonded portions are at or above the SHAPE threshold. Add those two counts, subtract the sum from the total number of nucleotides there’s SHAPE data for, and express the result as a percent of all nucleotides there’s SHAPE data for.
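In case a concrete restatement of that rule helps anyone double-check it, here is a minimal Python sketch based only on the description above. The names (shape_values, bonded, shape_threshold) and the handling of missing data are my own assumptions for illustration, not the actual scoring code.

```python
def synthesis_score(shape_values, bonded, shape_threshold):
    """Rough score from per-nucleotide SHAPE data, per the description above.

    shape_values   : list of SHAPE reactivities, with None where there is no data
    bonded         : list of booleans, True if the nucleotide is predicted bonded
    shape_threshold: the SHAPE threshold listed in the csv
    """
    # Only nucleotides with SHAPE data count toward the score.
    measured = [v for v in shape_values if v is not None]

    # Loop threshold = 1/4 of the SHAPE threshold + 3/4 of the SHAPE minimum.
    loop_threshold = 0.25 * shape_threshold + 0.75 * min(measured)

    mismatches = 0
    for value, is_bonded in zip(shape_values, bonded):
        if value is None:
            continue  # no SHAPE data for this nucleotide
        if is_bonded and value >= shape_threshold:
            mismatches += 1  # bonded nucleotide at or above the SHAPE threshold
        elif not is_bonded and value <= loop_threshold:
            mismatches += 1  # unbonded nucleotide at or below the loop threshold

    # Subtract the mismatch count from the total with data, then turn it into a percent.
    return 100.0 * (len(measured) - mismatches) / len(measured)
```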
I’ve been checking all scores all along (actually, I use the score calculation as a sanity check on other stuff), and they seem to be right. I’m taking their word for it that the thresholds are set at the best level for scoring, though, because my spreadsheet skills aren’t up to that kind of calculation. However, if the wrong data ends up in the csv, there’s not much to be done about it on our end.
edit (a second time) to add: unless I’m wrong and we could have caught the problem. I was pretty surprised at some of the results, especially in rounds 3 and 4 of the Star Lab; maybe there was more we could have done?