Hi All,
Recently, I had a conversation with Sneh in Chat that left me with a significantly increased awareness of the importance of loops in the game in general, and in our Lab Designs in particular. This got me thinking that I needed to learn more about loops.
So, I set off to do a bit of research, and here is what I have found so far. Source:
http://www.bioinfo.rpi.edu/zukerm/cgi…
This is a list of the most abundant, most frequently occurring Tetraloop configurations, including the stack-end pair the loop is attached to (the two bases on the ends). Asking around a bit, I found out that these sequences are in 5’ --> 3’ order, and that this list of 30 Tetraloops accounts for most of the Tetraloop configurations that are found in nature.
Before finding this table, I had no idea that there was such a small, finite number (30) of valid Tetraloops, and no idea either that they were all so … identified, defined, and recorded (I had been creating them mostly by beginner’s guesswork thus far in the game).
This table also contained a few other surprises for me; among them, the solid realization that there ARE places where U’s and even C’s can and should be used in a loop, but usually only one at a time (although there is one configuration that does contain 2 consecutive U’s).
Tetra-loops
Seq     Energy    Seq     Energy    Seq     Energy
GGGGAC  -3.00     CUACGG  -2.50     GGGAAC  -1.50
GGUGAC  -3.00     GGCAAC  -2.50     UGAAAA  -1.50
CGAAAG  -3.00     CGCGAG  -2.50     AGCAAU  -1.50
GGAGAC  -3.00     UGAGAG  -2.50     AGUAAU  -1.50
CGCAAG  -3.00     CGAGAG  -2.00     CGGGAG  -1.50
GGAAAC  -3.00     AGAAAU  -2.00     AGUGAU  -1.50
CGGAAG  -3.00     CGUAAG  -2.00     GGCGAC  -1.50
CUUCGG  -3.00     CUAACG  -2.00     GGGAGC  -1.50
CGUGAG  -3.00     UGAAAG  -2.00     GUGAAC  -1.50
CGAAGG  -2.50     GGAAGC  -1.50     UGGAAA  -1.50
But before going any further, I found an even better and more detailed version of this table in an article that Player alan.robot directed me to (Thank You, Alan). This version includes Frequency-of-Occurrence data for each Tetraloop:
http://bio.gnu.ac.kr/research/miRNA/R…
(see Table 8.)
Table 8. Tetraloop hairpin bonuses
Sequence - Occurrence - Bonus (kcal/mol)
GGGGAC  87  -3.0    CUACGG  17  -2.5    GGGAAC  9  -1.5
GGUGAC  76  -3.0    GGCAAC  17  -2.5    UGAAAA  9  -1.5
CGAAAG  56  -3.0    CGCGAG  16  -2.5    AGCAAU  8  -1.5
GGAGAC  47  -3.0    UGAGAG  16  -2.5    AGUAAU  8  -1.5
CGCAAG  40  -3.0    CGAGAG  14  -2.0    CGGGAG  8  -1.5
GGAAAC  36  -3.0    AGAAAU  13  -2.0    AGUGAU  7  -1.5
CGGAAG  35  -3.0    CGUAAG  11  -2.0    GGCGAC  6  -1.5
CUUCGG  28  -3.0    CUAACG  11  -2.0    GGGAGC  6  -1.5
CGUGAG  23  -3.0    UGAAAG  11  -2.0    GUGAAC  6  -1.5
CGAAGG  18  -2.5    GGAAGC   9  -1.5    UGGAAA  6  -1.5
…Very interesting, but a bit cryptic, and not very accessible for many people, I thought, so I decided it might be a helpful contribution to the EteRNA community if I were to put it into a format that is hopefully a bit more usable and visually appealing.
After reading this article and finding this version of the table, I decided to create the following Excel Table with a color-coded and otherwise enhanced version of this information on valid Tetraloop configurations. Players can then use these configurations in their Lab Designs with confidence that they ARE valid: they ARE found in nature, they are published in many scientific articles, and they are used in RNA folding software packages (such as Vienna RNA and likely EteRNA as well).
It has been my perception that thus far in the game, comparatively speaking, much less progress has been made by most players in improving their knowledge and skill regarding the construction of properly designed loops - than has been made in advancing skills at Stack Design (I know that is true for me, at least) - so it is conceivable this information could be a factor in changing that for the better.
I made the table in two accompanying sorts. The one on the left is sorted as I found it, which is in order of Frequency-of-Occurrence, or Abundance in the Tested Sample of 914 Tetraloops. These most abundant, most frequently occurring Tetraloops were also assigned the lowest (most negative) energy values.
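For anyone who wants to check the numbers, here is a small Python sketch that adds up the Occurrence column of Table 8 and compares it with the sample size of 914. The names `OCCURRENCES_BY_BONUS`, `SAMPLE_SIZE`, and `coverage` are just ones I made up for illustration, and I am assuming that the 914 figure is the right denominator for these counts, which the table itself does not confirm.

```python
# Occurrence counts copied from Table 8, keyed by bonus (kcal/mol).
OCCURRENCES_BY_BONUS = {
    -3.0: [87, 76, 56, 47, 40, 36, 35, 28, 23],
    -2.5: [18, 17, 17, 16, 16],
    -2.0: [14, 13, 11, 11, 11],
    -1.5: [9, 9, 9, 8, 8, 8, 7, 6, 6, 6, 6],
}

# Assumption: the 914-loop sample quoted in this post is the denominator.
SAMPLE_SIZE = 914


def coverage(occurrences_by_bonus, sample_size):
    """Return (total listed occurrences, fraction of the sample they cover)."""
    total = sum(sum(counts) for counts in occurrences_by_bonus.values())
    return total, total / sample_size


total, fraction = coverage(OCCURRENCES_BY_BONUS, SAMPLE_SIZE)
print(f"{total} of {SAMPLE_SIZE} sampled tetraloops ({fraction:.1%}) are in the table")

# Show how occurrence counts line up with the bonus levels.
for bonus in sorted(OCCURRENCES_BY_BONUS):
    counts = OCCURRENCES_BY_BONUS[bonus]
    print(f"{bonus:+.1f} kcal/mol: {len(counts)} loops, "
          f"{min(counts)} to {max(counts)} occurrences each")
```

Under that assumption, the 30 listed Tetraloops cover a bit over 70% of the sample, and the occurrence ranges drop steadily as the bonus gets weaker.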
_The table on the right I re-sorted for use by EteRNA players. It groups the Tetraloops by the bases of the attached stack-end pair, so it is easier to see which Tetraloops one can use with a particular stack-end base configuration when composing a Lab Design._
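If you would rather rebuild that second sort yourself than read it off the image, something like the following Python sketch should do it. It is only my own illustration of the regrouping (I made the Excel table by hand), and the names `TETRALOOPS` and `group_by_closing_pair` are hypothetical. The data is copied from Table 8, with each sequence in 5’ --> 3’ order.

```python
from collections import defaultdict

# (sequence 5'->3', occurrence, bonus in kcal/mol), copied from Table 8.
TETRALOOPS = [
    ("GGGGAC", 87, -3.0), ("GGUGAC", 76, -3.0), ("CGAAAG", 56, -3.0),
    ("GGAGAC", 47, -3.0), ("CGCAAG", 40, -3.0), ("GGAAAC", 36, -3.0),
    ("CGGAAG", 35, -3.0), ("CUUCGG", 28, -3.0), ("CGUGAG", 23, -3.0),
    ("CGAAGG", 18, -2.5), ("CUACGG", 17, -2.5), ("GGCAAC", 17, -2.5),
    ("CGCGAG", 16, -2.5), ("UGAGAG", 16, -2.5), ("CGAGAG", 14, -2.0),
    ("AGAAAU", 13, -2.0), ("CGUAAG", 11, -2.0), ("CUAACG", 11, -2.0),
    ("UGAAAG", 11, -2.0), ("GGAAGC", 9, -1.5), ("GGGAAC", 9, -1.5),
    ("UGAAAA", 9, -1.5), ("AGCAAU", 8, -1.5), ("AGUAAU", 8, -1.5),
    ("CGGGAG", 8, -1.5), ("AGUGAU", 7, -1.5), ("GGCGAC", 6, -1.5),
    ("GGGAGC", 6, -1.5), ("GUGAAC", 6, -1.5), ("UGGAAA", 6, -1.5),
]


def group_by_closing_pair(entries):
    """Map each stack-end pair (e.g. 'C-G') to its loops, most frequent first."""
    groups = defaultdict(list)
    for seq, occurrence, bonus in entries:
        pair = f"{seq[0]}-{seq[-1]}"  # 5' end base and 3' end base
        loop = seq[1:-1]              # the four unpaired loop bases
        groups[pair].append((loop, occurrence, bonus))
    for loops in groups.values():
        loops.sort(key=lambda item: item[1], reverse=True)
    return dict(groups)


groups = group_by_closing_pair(TETRALOOPS)
for pair, loops in sorted(groups.items()):
    print(pair, [loop for loop, _, _ in loops])
```

Each key is the stack-end pair written 5’ base first, so "C-G" means the stack ends with C on the 5’ side of the loop and G on the 3’ side.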
In both tables, I also included each sequence in both 5’ --> 3’ order and 3’ --> 5’ order, to help in cases where the actual game layout differs from the given 5’ --> 3’ order (and to save players from having to mentally transpose the sequences). The Tetraloops that work on one pair will not always work on the flipped pair; they are not freely interchangeable, so some care must be taken to select the proper orientation.
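To make the orientation point concrete, here is a short Python sketch. The helper names `to_3_to_5` and `flip_closing_pair` are my own inventions for illustration, and "not listed" only means the flipped sequence is absent from this table of 30, not that it can never form.

```python
# The 30 sequences from Table 8, each in 5'->3' order.
TABLE_5_TO_3 = {
    "GGGGAC", "GGUGAC", "CGAAAG", "GGAGAC", "CGCAAG", "GGAAAC",
    "CGGAAG", "CUUCGG", "CGUGAG", "CGAAGG", "CUACGG", "GGCAAC",
    "CGCGAG", "UGAGAG", "CGAGAG", "AGAAAU", "CGUAAG", "CUAACG",
    "UGAAAG", "GGAAGC", "GGGAAC", "UGAAAA", "AGCAAU", "AGUAAU",
    "CGGGAG", "AGUGAU", "GGCGAC", "GGGAGC", "GUGAAC", "UGGAAA",
}


def to_3_to_5(seq):
    """Write a 5'->3' sequence in 3'->5' order (same molecule, read backwards)."""
    return seq[::-1]


def flip_closing_pair(seq):
    """Keep the four loop bases in place but swap the two stack-end bases."""
    return seq[-1] + seq[1:-1] + seq[0]


for seq in ("CGAAAG", "CUUCGG"):
    flipped = flip_closing_pair(seq)
    status = "listed" if flipped in TABLE_5_TO_3 else "not listed"
    print(f"{seq} (3'->5': {to_3_to_5(seq)}) on the flipped pair is {flipped}: {status}")
```

So GAAA appears in the table on both a C-G and a G-C stack end, while UUCG appears only on C-G. Reading a sequence in 3’ --> 5’ order is a different thing from flipping the pair: the first is the same Tetraloop written backwards, the second is a different Tetraloop.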
I also inserted a slight separation between the two end-bases (the stack-end pair) and the four center bases of the Tetraloop itself, just in an attempt to further increase clarity and readability.
(Please click on the table for a larger, clearer version)
I hope this table might help make this information a bit more accessible to some of us; that it might help us all to learn these valid Tetraloop configurations, and hopefully that it might also therefore enhance all of our chances to excel and succeed in our future Lab Designs.
Thanks, and Best Regards,
-d9