GC Pair Percentage Display

Hi All,

It has occurred to me that we have no metric in the interface to let players know when their GC Pair Percentage is beginning to exceed historical levels where too many GC’s can begin to adversely affect scores.

The recent discovery and increasing use of the RNAFold website has been very helpful in many respects, but I have discovered that very GC-heavy designs, and even outright “Christmas Tree” designs, can look very good in RNAFold’s assessment algorithm. (I theorize this may be because most users of RNAFold would know beforehand that “All GC” designs are not good, and so would never even submit one, so the algorithm was never programmed to assess for that.)

Because of this, I have begun to see some HIGH GC designs being submitted to the Lab, with their authors quoting the great results they had on RNAFold.

For this reason, I think it would be a good idea to begin to display GC Pair Percentages in the EteRNA interface, right along with the individual pair counts, Melting Point, Free Energy, and Dot Plot.
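As a rough illustration of the figure being requested, here is a minimal Python sketch that computes a GC Pair Percentage from a sequence and a dot-bracket secondary structure. The function name and the dot-bracket input are assumptions made for this example; this is not how the EteRNA interface actually computes or stores its pair counts.

```python
def gc_pair_percentage(sequence: str, structure: str) -> float:
    """Return the percentage of base pairs that are G-C (in either orientation).

    Hypothetical helper: `structure` is assumed to be in dot-bracket notation,
    using only '(' , ')' and '.' with no pseudoknots.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    sequence = sequence.upper()
    open_positions = []   # stack of indices of unmatched '('
    pairs = []            # list of (base, base) tuples, one per base pair

    for i, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: extra ')'")
            j = open_positions.pop()
            pairs.append((sequence[j], sequence[i]))

    if open_positions:
        raise ValueError("unbalanced structure: extra '('")
    if not pairs:
        return 0.0  # no pairs at all, so no GC pairs either

    gc_pairs = sum(1 for a, b in pairs if {a, b} == {"G", "C"})
    return 100.0 * gc_pairs / len(pairs)


# A 4-pair hairpin with three GC pairs and one AU pair.
print(gc_pair_percentage("GAGCAAAAGCUC", "((((....))))"))
```

Note that this counts only paired bases, so it is a different number from the overall GC content of the sequence.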

As an adjunct effort, the percentages at which designs enter the danger zone should be researched and published as well, to give Players some reasonable baseline figures to judge from.

I will be posting a first effort in this direction very soon, but it will not be as useful without the accompanying interface enhancement to show the GC Pair Percentage.

Can this be included next update cycle?

Thanks, and Best Regards,

-d9

Great post!

For starters, here’s a link showing that for siRNA (that is, 21 nt RNAs that are complementary to a target single-stranded mRNA), maximum efficiency is achieved using between 30 and 50% GC content. Less than 30% resulted in inefficient binding to the target sequence (i.e. the MFE was too high), while GC content above 60% also had a negative impact (presumably because the siRNA got trapped in structures that were not the MFE).

http://www.ambion.com/techlib/misc/si…

I realize this is slightly different than the design constraints - smaller RNA, and binding of two molecules vs folding of one, but the ranges are amazingly spot on with your meta-analysis of past lab results. But the concepts are really the same - the minimum free energy structure is for all 21 nt to form a duplex with the complementary sequence on the other molecule, and failures due to high GC are due to persistent structures (“suboptimal folds”) on the part of either binding partner.

Similarly, for PCR primers, use of more than 50% GC is correlated with an increased failure rate due to suboptimal secondary structure formation, with almost complete failure of sequences above 70% GC.

http://www.ncbi.nlm.nih.gov/pmc/artic…

Lastly, I can’t find a reference, but I know many DNA synthesis companies will outright refuse to synthesize genes with > 60% GC content (or charge super extra) due to high failure rates (as we have already seen in lab ourselves, so a reference really isn’t needed). Special chemicals are used to ensure that the high GC molecule can be melted, since boiling water alone won’t do it.
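To make the ranges above concrete, here is a small Python sketch that computes overall GC content and compares it against the siRNA figures quoted earlier (under 30% and over 60% being the problem areas). The function names and default cutoffs are illustrative assumptions only; nothing here has been calibrated against lab results.

```python
def gc_content(sequence: str) -> float:
    """Percentage of nucleotides in `sequence` that are G or C."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(1 for base in seq if base in ("G", "C")) / len(seq)


def gc_band(percent: float, low: float = 30.0, high: float = 60.0) -> str:
    """Label a GC percentage relative to assumed cutoffs.

    The defaults reuse the siRNA numbers quoted in this thread; they are
    placeholders, not validated thresholds for lab designs.
    """
    if percent < low:
        return "below range"
    if percent > high:
        return "above range"
    return "within range"


for seq in ("AAUUAAUUGC", "GGCCAAUU", "GCGCGCGCAU"):
    pct = gc_content(seq)
    print(seq, round(pct, 1), gc_band(pct))
```

This measures the GC content of the whole sequence, which is what the siRNA and PCR figures refer to, not the share of GC pairs in a folded structure.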

Hello d9,

We’ll include the percentage of each pair type in the next update.

EteRNA team