IUPAC strategy brainstorm

some thoughts about the lab puzzles with IUPAC constraints, where shape doesn’t matter as much.

I think we basically have two types of moves available:
a) stiffening the loops by putting strong bindings inside
b) loosening up stems to shift the center of gravity and have more favorable loops

also 2 observations:

  • in puzzles, loops have only a few points that are significant for the energy change (closing pair, boost points)
  • IUPAC constraints allow us to change only roughly every 3rd base. Most amino acids allow variation only in the 3rd base of the codon; only a few (L, S, R) allow changing the first or second base, and these exceptions have internal dependencies which are often difficult to deal with when exploring solutions manually (“if first base is A then third base can’t be U”).
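
To make the second observation concrete, here is a minimal sketch that derives, from the standard genetic code, which codon positions can change without changing the amino acid. The function names (synonymous_codons, variable_positions) are made up for this post and are not part of any Eterna tool; stop codons are left out.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order (first base slowest).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(aa):
    """All codons that encode the amino acid `aa` (one-letter code)."""
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def variable_positions(aa):
    """Codon positions (0, 1, 2) at which the synonymous codons of `aa` differ."""
    codons = synonymous_codons(aa)
    return {i for i in range(3) if len({c[i] for c in codons}) > 1}


# Amino acids that allow a change in the first or second base.
flexible = sorted(aa for aa in set(AMINO) - {"*"} if variable_positions(aa) & {0, 1})
print(flexible)  # ['L', 'R', 'S']

# The internal dependency for arginine: with A as first base,
# only AGA and AGG remain, so the third base cannot be U (or C).
arg_third_if_A = sorted({c[2] for c in synonymous_codons("R") if c[0] == "A"})
print(arg_third_if_A)  # ['A', 'G']
```

Every other amino acid ends up with at most position 2 as variable, which is the "roughly every 3rd base" limit.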

we may want to screen for solutions where the positions of boost points in loops are divisible by 3 (i.e. fall on the third base of a codon), because this would give us better control of the loops ???
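
A rough sketch of what such a screen could look like. It assumes positions are 1-indexed, that the coding region starts at position 1 unless cds_start says otherwise, and that only the third codon base is free (so the L/S/R exceptions are ignored). The design names and boost positions are invented example data, not real lab results.

```python
def is_third_codon_base(pos, cds_start=1):
    """True if 1-indexed position `pos` is the third base of a codon.

    Assumes the reading frame begins at `cds_start` (also 1-indexed).
    """
    return (pos - cds_start + 1) % 3 == 0


def controllable_fraction(boost_positions, cds_start=1):
    """Fraction of boost positions that land on a third codon base."""
    if not boost_positions:
        return 0.0
    hits = sum(is_third_codon_base(p, cds_start) for p in boost_positions)
    return hits / len(boost_positions)


# Hypothetical candidate designs with the boost positions of their loops.
candidates = {
    "design_a": [3, 6, 7, 10],
    "design_b": [9, 12, 15],
}

# Rank designs by how many of their boost points we could actually mutate.
ranked = sorted(
    candidates,
    key=lambda name: controllable_fraction(candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, controllable_fraction(candidates[name]))
```

A fuller version would treat first and second positions of L, S and R codons as partly controllable as well, instead of counting them as fixed.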

I’ve thought a little about something along these lines, but haven’t developed anything fully enough to present a proposal. I believe we should avoid triloops as degradation hotspots and encourage tetraloops and pentaloops (and larger) with sequences we have found resistant to degradation. The other likely hotspots are certain bulges.

The current DEG-2 is a first stab at machine learning with the RYOS 1 data. It does not take superior tetraloop sequences (boosts) into account. There is currently a Kaggle competition with 1300 teams of data scientists working to find a more intricate degradation algorithm. And we will have even stronger data from RYOS 2 for refining algorithms.

I agree that once we know more about which structures and sequences resist degradation, we should consider developing a tool that will find desirable codons to place in desired locations. It’s a good suggestion, but there may not be enough time to develop such a tool for the final OpenVaccine round.