Going to respond to a bunch of things at once with my perspective:
Interesting, it seems like since it’s an ML model, instead of building up the pairing probabilities, it starts with almost everything being able to pair with something else (albeit with <0.1 confidence), and as you add more pairs, it removes possible bonds until only your intended structure can form.
My intuition is that this is at least somewhat an artifact. E.g., you see this with a string of all As, but I imagine that’s likely because it’s “out of distribution” (i.e., a situation the model has no way of understanding because it’s never seen anything like it before), since the training data has limited “poly-N” sequences due to those not working with the experiments. Though I don’t know if there’s any information on what is expected of that situation in an experiment (does it have some other interesting behavior aside from just sitting around unpaired?)
It seems to like structures like <…(… …)…> a lot more than other folding engines, same with pks
I will note that it has a sorta-similar behavioral bias to EFTK, in that it takes pairing likelihoods and then picks out non-conflicting pairs above some threshold (using the Hungarian algorithm rather than ThreshKnot, but relatively similar), so you can wind up with some weird behavior on “borderline” pairs. That said, I think RibonanzaNet may have more “strong opinions” about pairings, making that show up less often? I haven’t gotten to really play with it enough; I’m sure others may have more experience with that.
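To make the “borderline” point concrete, here’s a minimal sketch of a ThreshKnot-style selection step: keep a pair only if the two bases are each other’s most likely partner and the probability clears a threshold. This is not RibonanzaNet’s or Eterna’s actual code (as noted above, RNet uses the Hungarian algorithm for this step), and `select_pairs`, `toy_matrix`, and the 0.5 threshold are all made up for illustration.

```python
def select_pairs(probs, threshold=0.5):
    """ThreshKnot-style selection (illustrative sketch, not any engine's real code).

    probs is a symmetric n x n matrix of pairing probabilities.
    A pair (i, j) is kept only if i and j are each other's most likely
    partner and probs[i][j] is above the threshold.
    """
    n = len(probs)
    # Most likely partner for every base.
    best = [max(range(n), key=lambda j: probs[i][j]) for i in range(n)]
    pairs = []
    for i in range(n):
        j = best[i]
        # i < j avoids reporting the same pair twice.
        if i < j and best[j] == i and probs[i][j] > threshold:
            pairs.append((i, j))
    return pairs


def toy_matrix(p_borderline):
    """Hypothetical 4-base example: 0-3 pair strongly, 1-2 is borderline."""
    n = 4
    m = [[0.0] * n for _ in range(n)]
    m[0][3] = m[3][0] = 0.9
    m[1][2] = m[2][1] = p_borderline
    return m


# A tiny change in probability flips the predicted structure.
print(select_pairs(toy_matrix(0.51)))  # [(0, 3), (1, 2)]
print(select_pairs(toy_matrix(0.49)))  # [(0, 3)]
```

The 1-2 pair appears or vanishes on a 0.02 change in probability, which is the kind of weirdness I mean. A model with “strong opinions” (probabilities near 0 or 1) would rarely land in that zone.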
And as DigitalEmbrace has mentioned, there may be situations that look weird to us, but could be reflecting some 3D behavior we’re not used to seeing! Using 3D models could help give a clue as to whether it’s more likely an artifact or more likely reflecting some interesting interaction.
I’m not sure that I would say that RNet takes 3D folding into account.
If by 3D contacts you mean pseudoknot bonds then yes, you are correct. The reactivity data includes that information.
Going to “yes and” the discussion here. The RibonanzaNet foundation model, trained on reactivity data, does have the capacity to take 3D contacts into account, even if they’re not Watson-Crick-Franklin pairs - and even potentially if they’re not reflected in the reactivity data directly.
Conventional thermodynamic models (particularly when we’re talking about minimum free energy structures from nearest-neighbor parameters) have to explicitly build any interaction the authors want to consider into their equations. However, RNet’s approach has the ability to pick up on any kind of motif - it can say something like “if I see a mix of As and Gs over here that has some characteristic, and I see a sequence that looks a particular way over there, it’s going to be protected”. And those motifs may represent some interesting 3D structure. And it doesn’t even need to be a specific sequence resulting in a specific reactivity profile - the architecture allows for things like complex patterns and inter-related conditions. This is the reason why there’s hope it could be useful as the basis of a 3D model! We have lots of reactivity data, but little 3D data, so the hope is the model can get a lot of insight from just the reactivity and then contextualize it with the 3D data we have.
That being said, what is ultimately reported by the RibonanzaNet secondary structure algorithm is limited. First off, it was trained to predict Watson-Crick-Franklin pairs (i.e., the training data it was fine-tuned on was the WCF pairs of some RNAs). That said, if there were 3D interactions which determined the WCF pairs, it could still pick those up. Additionally, even if the algorithm did predict a non-WCF pair, we filter those out in Eterna (I don’t think I’ve ever seen that happen, but I also haven’t rigorously checked!). Point being, while the 2D structure may be informed by 3D features, we don’t actually get information on what those 3D features are (nor does the model have the benefit of incorporating that contextual knowledge) - a true 3D model would be another step to giving us more information!
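For a rough picture of what that filtering step amounts to, here’s a small sketch. It is not Eterna’s actual code: `filter_pairs` and the `WCF` set are hypothetical names, and whether the real filter also lets G-U wobble pairs through isn’t something I’ve stated above, so the allowed set is left as a parameter.

```python
# Watson-Crick-Franklin pairs only. Treating G-U wobble as disallowed here
# is an assumption made for illustration; pass a different set to change it.
WCF = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def filter_pairs(sequence, pairs, allowed=WCF):
    """Drop predicted pairs whose two bases are not in the allowed set.

    sequence is an RNA string; pairs is a list of 0-indexed (i, j) tuples.
    """
    return [(i, j) for i, j in pairs if (sequence[i], sequence[j]) in allowed]


# Hypothetical prediction: (0, 6) is G-U, (1, 5) is G-C.
print(filter_pairs("GGAAACU", [(0, 6), (1, 5)]))  # [(1, 5)]
```

Note that whatever the filter drops is simply gone from the reported structure, which is part of why the 2D output can’t tell us what 3D feature (if any) was behind a prediction.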
It would be interesting to see a 3D version of RNet, but it might be an easier first step to release a 3D version of Vienna or EteRNAFold, since it would both be easier to test and it is a simpler model. It might also be the premise for a new EteRNA-like game, where we have to find out the 3D model!
Unfortunately, not really. That requires being able to encode specific 3D folding “rules”, which are hard to determine and measure. 3D behaviors - or rather, how you would model 3D structure algorithmically - are also substantially different from how algorithms like Vienna and EternaFold work. There are other models out there that do take a more traditional approach (such as some classic Rosetta RNA modeling code), but they’re not very good.