EFTK calculation

DigitalEmbrace concluded in the Town Hall discussion that the EFTK calculation is a bit off. It does look a bit amiss, but I would like to address her concerns in a bit more detail. This post will also demonstrate how I solve these puzzles when the Delta number is missing.

The short stacks will not hold by themselves in EFTK, so you have to lengthen both stacks using what I call fake pseudoknots, the yellow and green lines in the above chart. The combined (real + fake pseudoknot) pairs form two equally balanced stacks (shown in red) that, with a bit of fiddling, solve the puzzle.

Does the Dot Plot back up the above discussion? Yes, it does: inside the red squares there are more diamonds (pairings) than there are real stack pairings.

My conclusion would be that the ensemble calculation may not be off as much as you think.

I want to quickly provide a bit of clarification to make sure everyone is on the same page here.

What’s going on is that without those “fake pseudoknots” or “unresolved closing pair boosts” (i.e., they’re contributing stability as if they were closing pairs, but do not come out as paired), the probability that the pairs in that short stack will fold falls below EFTK’s threshold (0.15, as we currently have it configured). When you add those additional pairs, the probability of the pairs in that short stack rises above the cutoff. However, those “pairs” you just added? They wind up coming in just below the cutoff. The short stack is slightly more likely to fold than the extra pairs, so the short stack is included in the predicted fold while the extra pairs are not.
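To make the cutoff behavior concrete, here is a minimal sketch of thresholded pair selection. It assumes the selection rule is "keep a pair if each base is the other's most probable partner and the probability is above the threshold"; the function name, the strict `>` comparison, and all of the probabilities are made up for illustration and are not EFTK's actual code or output.

```python
def threshold_pairs(pair_probs, threshold=0.15):
    """Keep mutually-best pairs whose probability is above the threshold.

    pair_probs maps (i, j) base index tuples to a pairing probability.
    This is an illustrative sketch, not EFTK's implementation.
    """
    # For every base, remember its most probable partner.
    best = {}
    for (i, j), p in pair_probs.items():
        for a, b in ((i, j), (j, i)):
            if a not in best or p > best[a][1]:
                best[a] = (b, p)

    kept = []
    for (i, j), p in pair_probs.items():
        mutual = best[i][0] == j and best[j][0] == i
        if mutual and p > threshold:  # assumption: strict comparison
            kept.append((i, j))
    return sorted(kept)


# Hypothetical numbers: the short stack alone sits under the cutoff.
without_boost = {(1, 20): 0.12, (2, 19): 0.13}

# Hypothetical numbers: the extra "fake pseudoknot" pairs lift the short
# stack over the cutoff, but land just under it themselves.
with_boost = {(1, 20): 0.17, (2, 19): 0.18, (5, 30): 0.14, (6, 29): 0.13}

print(threshold_pairs(without_boost))  # []
print(threshold_pairs(with_boost))     # [(1, 20), (2, 19)]
```

In the second case the short stack is predicted while the pairs that stabilized it are dropped, which is the situation described above.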

As far as I know this is not a “true bug” in EFTK - this is working as expected and in accordance with the behavior defined by the Threshknot paper. To the extent that this isn’t reflective of nature, it’s a limitation of the Threshknot algorithm.

It would be interesting to have some test cases of experimental results to see how this would behave in the lab, with varied differences in probability between paired/unpaired and varied “thresholds” that those probabilities are positioned around. Maybe an improved version of the Threshknot algorithm would adopt nearby base pairs if they have a close probability even if they fall below the threshold, or would have some way of dynamically adjusting the threshold or normalizing base pair probabilities. Or maybe we might see this manifest in semi-ambiguous reactivity data suggesting an unstable ensemble!
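As a rough illustration of the "adopt nearby base pairs" idea, here is one way it could be sketched. Everything here is hypothetical: the function name, the `margin` parameter, and the reading of "nearby" as "stacked directly against an already accepted pair" are my assumptions, not anything from the Threshknot paper or EFTK, and the sketch skips the mutual-best-partner check for brevity.

```python
def adopt_near_threshold(pair_probs, threshold=0.15, margin=0.03):
    """Hypothetical variant: accept pairs above the threshold, then also
    adopt pairs within `margin` below it if they stack on an accepted pair.
    """
    kept = {pair for pair, p in pair_probs.items() if p > threshold}
    paired = {base for pair in kept for base in pair}

    changed = True
    while changed:  # repeat so a stack can grow by more than one pair
        changed = False
        for (i, j), p in sorted(pair_probs.items()):
            if (i, j) in kept or p <= threshold - margin:
                continue  # already accepted, or too far below the cutoff
            if i in paired or j in paired:
                continue  # one of the bases is already paired elsewhere
            # Stacked neighbours of (i, j) are (i-1, j+1) and (i+1, j-1).
            if (i - 1, j + 1) in kept or (i + 1, j - 1) in kept:
                kept.add((i, j))
                paired.update((i, j))
                changed = True
    return sorted(kept)


# Made-up probabilities: (3, 18) is just under the cutoff and extends
# the accepted stack; (4, 17) is far below it.
probs = {(1, 20): 0.17, (2, 19): 0.18, (3, 18): 0.14, (4, 17): 0.05}

print(adopt_near_threshold(probs))              # [(1, 20), (2, 19), (3, 18)]
print(adopt_near_threshold(probs, margin=0.0))  # [(1, 20), (2, 19)]
```

Whether a rule like this would match lab results better is exactly the open question; the sketch only shows that the change to the selection step itself would be small.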

Also FWIW, a puzzle like this, which is mostly very short stacks, seems particularly likely to surface weird/unrealistic behavior in folding engines. So to some degree this is unlikely to be representative of widespread misbehavior so much as a case of asking the engine about a silly situation - though more testing is required to validate that, of course. :wink:

That could be a lab in itself. The upcoming shape data from the previous pseudoknots labs should also shed some light.