i know almost nothing about genetics. i found this site from the nytimes article and have found it fascinating. at the moment it is a rule based puzzle for me but i would like to understand the science underneath the game. can anyone recommend a good introductory book? i looked online but did not find much but textbooks. that would be ok too if there is one that someone could recommend…
The really authoritative book on RNA science is called “The RNA World”, and its 3rd edition is out.
The problem is that it’s for people who already know a lot of genetics and biology. And it’s expensive.
Your question is interesting though – RNA is really hot, but there hasn’t been a book on it yet – it would be a great topic for a popular science book. There have been some magazine profiles though. For example, there are several articles in this issue of The Economist:
Have fun! And thanks for asking the question.
ty for the info! i read through the first chapter online. it is a little over my head but was still informative. there is a copy of the second edition for $20. is that too out of date to be useful for me? 3rd edition seems to be about $130.
Unfortunately, this is a rapidly evolving field, so I suspect that by the time someone gets around to writing a popular science book about it, the book will already be hopelessly obsolete. Even the textbooks have a hard time keeping up as it is, and they come out with new editions every 2 years.
However, the NCBI has several of the better cell biology textbooks available online for free, which you may want to take a look at. They won’t read like a novel but they are clearly written and have nice illustrations.
Note these are undergraduate-level texts, and generally assume that you’ve been exposed to introductory chemistry and biology (for example, AP courses in US high schools). I’m afraid I can’t think of anything at a more accessible level than that.
I would settle for a moderately detailed explanation of what we are actually *doing* in solving these puzzles! How is it that “we know” the starting configurations, but not the bases? How is it we know when we’ve hit a solution, yet apparently the search for valid configs still comprises a challenge? What *is* the state of the art in automated solution of the kinds of problems we see here? Are these puzzles typical of RNA? I mean, DNA is a billion bases long, right? How come these are so short? Plus (for extra credit!) a short spec sheet on G, C, U, and A, and maybe a short note about the difference, structurally, between (short) sequences of RNA versus DNA.
I agree there is a lot more that could be done from a science outreach standpoint; it would be great if the devs could get some educators on board with this project to address these sorts of queries. However, the answers to almost all of your RNA questions can be found in the textbook links I provided, so give them a try, and a lot of this is being discussed in other threads.
Before you volunteer for a billion-base puzzle, I’ll also point out that DNA (and hence RNA) is organized into much smaller sections called genes, which are typically hundreds to low thousands of nucleotides long, and the non-coding RNAs are typically much shorter. So yes, the puzzle sizes are typical. Actually, I think they are longer than most non-coding RNAs in real life, because they have some protein-coding sequences in there, and those don’t have any function other than to code for an amino acid sequence.
The naturally occurring sequences for all the puzzles are in fact known, and the target structures come from an automated folding algorithm applied to the real sequences. So you are being trained on problems with known answers before being asked to design sequences that fit a particular structure. That second task (given a target design, what is the best sequence to use?) is called the inverse folding problem, and it is much tougher. I’m pretty sure I saw a discussion of this in another thread:
The algorithm built into ETERNA is the ViennaFold, which is certainly among the state of the art used in science, but it can only predict base pairs and loops. There’s a lot that it can’t predict that I assume future design competitions will try to address; see this thread
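To make “predict base pairs and loops” a bit more concrete, here is a toy folding sketch in Python. It is not the algorithm the game uses: real folding programs search for the structure with the lowest estimated free energy, whereas this stand-in (a Nussinov-style dynamic program) simply maximizes the number of nested G-C, A-U and G-U pairs. The dot-bracket output format, the function name `fold`, and the `min_loop` parameter are assumptions made for illustration only.

```python
# Toy folder: maximize the number of nested base pairs (Nussinov-style).
# This is a simplified stand-in, NOT an energy-based model.

# Pairs allowed in this sketch: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def fold(seq, min_loop=3):
    """Return a dot-bracket string with a maximum number of nested pairs.

    min_loop (hypothetical parameter) is the minimum number of unpaired
    bases that must sit between two paired bases, so hairpins cannot be
    impossibly tight.
    """
    n = len(seq)
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Walk back through the table to recover one optimal set of pairs.
    structure = ["."] * n
    todo = [(0, n - 1)]
    while todo:
        i, j = todo.pop()
        if j - i <= min_loop:
            continue  # too short to contain any pair
        if best[i][j] == best[i][j - 1]:
            todo.append((i, j - 1))  # base j can be left unpaired
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        todo.append((i, k - 1))
                    todo.append((k + 1, j - 1))
                    break
    return "".join(structure)


print(fold("GGGAAAACCC"))
```

Matching brackets mark paired bases and dots mark unpaired ones, so a run of dots enclosed by brackets is a loop. Counting pairs ignores everything that makes real RNA behave the way it does (stacking, loop penalties, 3D contacts), which is part of why a prediction and a lab result can disagree.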
kewl, i will scan through the book. nice that they are online. ty for the help and the info!
I can answer this specific question a little more directly, although alan’s answer has an incredibly high average quality: "How is it we know when we’ve hit a solution, yet apparently the search for valid configs still comprises a challenge?"
The ViennaFold algorithm is among the state of the art, as alan said, but given a certain desired shape for an RNA, there are hundreds of possible sequences that ViennaFold predicts will work just fine. As it turns out, when we sequence some of them, they do terribly, horribly badly.
So, the challenge is picking out the ones that will “actually work” from the ones that ViennaFold thinks will work. At the moment the community is doing a lot of trial-and-error that is almost science.
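For a rough sense of how many candidates there are to pick from, here is a small Python sketch. It assumes the target shape is written in dot-bracket notation (matching brackets are paired bases, dots are unpaired) and that only G-C, A-U and G-U pairs are allowed; the function names are made up for illustration. It only checks whether a sequence *could* form the target pairs, which is a much weaker test than a folding program predicting that it will, and weaker still than it working in the lab.

```python
# Which sequences are even compatible with a target shape?
# Assumptions: dot-bracket input, no pseudoknots, and only the pairs below.

CAN_PAIR = {"GC", "CG", "AU", "UA", "GU", "UG"}


def pair_table(structure):
    """Map the index of each '(' to the index of its matching ')'."""
    open_positions, pairs = [], {}
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure")
            pairs[open_positions.pop()] = pos
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if open_positions:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(seq, structure):
    """True if every paired position in the target holds an allowed pair."""
    if len(seq) != len(structure):
        return False
    return all(seq[i] + seq[j] in CAN_PAIR
               for i, j in pair_table(structure).items())


def count_compatible(structure):
    """Number of sequences that pass is_compatible for this target."""
    n_pairs = len(pair_table(structure))
    n_unpaired = len(structure) - 2 * n_pairs
    # 6 ordered choices for each pair, 4 choices for each unpaired base.
    return 6 ** n_pairs * 4 ** n_unpaired


target = "(((....)))"
print(is_compatible("GGGAAAACCC", target))  # True
print(count_compatible(target))             # 55296
```

Even this ten-base hairpin has tens of thousands of compatible sequences, and the number grows exponentially with length. A folding program narrows that down to the ones it predicts will adopt the target shape, and the lab results narrow it further, which is where the trial-and-error comes in.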