Do any of the current player databases have a compiled list of sequences and all structures for all the lab designs that have been synthesized? This would include the native fold structure for each condition the design may be in, and it would be a plus if there were some other details like structure identification and indices, but that’s not really necessary. Yes, I know this can be done with Eternascript, etc., but I’d rather know it’s readily available in a form that can be easily downloaded and manipulated than have to figure that out for myself.
Good question. I was wondering that myself, and I think there is a way using the API.
I am NOT SURE if this data is readily available for ALL labs, but I know that any lab can be searched for in this way.
Simply type/paste the lab id number onto the end of the following URL:
to get http://www.eternagame.org/get/?type=solutions&puznid=6892346, which I hope has the info you seek.
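For anyone who wants to do the same lookup from a script, here is a minimal Python sketch. The base URL and the `type`/`puznid` parameters are copied from the example URL above; the function names are my own, and I am only assuming the response body is JSON, so check what actually comes back before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address taken from the example URL above.
BASE_URL = "http://www.eternagame.org/get/"


def build_solutions_url(lab_id):
    """Return the solutions URL for one lab id (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"type": "solutions", "puznid": lab_id})


def fetch_solutions(lab_id, timeout=30):
    """Download the response for one lab id.

    ASSUMPTION: the body is JSON. If it is not, json.loads will raise
    and the raw text should be inspected instead.
    """
    with urlopen(build_solutions_url(lab_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_solutions(6892346)` makes one network request for that lab. It is worth printing the top-level keys of the result first, since I don't know the field names.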
Right now I’m trying to write a bit of code that takes the structures of an RNA sequence and produces a “base-pairing map” describing the way in which the structure switches. I’m also interested in a few other statistics, like the change in the number of base pairs between states, the composition of the helices, and the base pairs broken and formed between the initial and final product, as well as some other stuff.
Granted, I’m writing this in Python, and as much as I’d like to use the web API to query for these attributes:
- I don’t know how to do a query of that scale. Not to mention, there appears to be no secondary structure information in that query.
- I don’t really want to install ViennaRNA, NUPACK, etc. locally to determine the native fold secondary structure, for a variety of reasons, the main one being that my internal memory is kind of at its limit as is.
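The comparison part of the base-pairing map described above can be sketched without any folding software, provided the structures for each state are already available from somewhere. This sketch assumes they are given as dot-bracket strings of equal length with no pseudoknots. That input format and the function names are my assumptions, and the code does not predict any structures itself.

```python
def parse_dot_bracket(structure):
    """Return the set of (i, j) base pairs (0-based, i < j) in a dot-bracket string."""
    stack, pairs = [], set()
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.add((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def compare_states(state_a, state_b):
    """Summarize which base pairs break, form, or persist going from state_a to state_b."""
    if len(state_a) != len(state_b):
        raise ValueError("both structures must describe the same sequence length")
    pairs_a = parse_dot_bracket(state_a)
    pairs_b = parse_dot_bracket(state_b)
    return {
        "broken": sorted(pairs_a - pairs_b),
        "formed": sorted(pairs_b - pairs_a),
        "kept": sorted(pairs_a & pairs_b),
        "pair_count_change": len(pairs_b) - len(pairs_a),
    }


# Made-up toy structures, not real lab data.
switch_map = compare_states("((((....))))....", "....((((....))))")
```

Helix composition would additionally need the sequence itself, to look up which bases sit at each paired position; the pair sets returned here are the starting point for that.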