In the RMDB rdat files, there is a field for the signal-to-noise ratio of the data obtained for an RNA that has been synthesized and sequenced. While I have some notion of what this ratio means, precise definitions of the values involved would be helpful.

What is considered to be "signal," and how is this value calculated for the ratio?

What is considered to be "noise," and how is this value calculated for the ratio?

Is the S/N ratio reported for the RNA as a whole, or is it an average of per-residue S/N measurements? If each residue has a different S/N ratio, would it be useful to report those values as well? The specific definitions of the signal and noise variables may answer this.

Is the reported reactivity error for each individual residue factored into the signal-to-noise ratio, or is that an unrelated measurement derived from similar data?

Signal-to-noise ratio is computed as follows. For each construct:

1. We take the mean of the SHAPE reactivity across all residues.

2. Then we take the mean of the estimated error across all residues. (This error is based on Poisson counting statistics.)

3. Then we take the ratio of those two values.
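The steps above can be sketched in Python. This is only an illustration of the construct-level calculation described here, not the actual implementation (which lives in a MATLAB file in MAPseeker); the function and argument names are made up.

```python
import numpy as np

def signal_to_noise(reactivity, errors):
    """Construct-level S/N: mean SHAPE reactivity divided by mean
    estimated error, both taken across all residues.

    reactivity : per-residue SHAPE reactivities
    errors     : per-residue estimated errors (Poisson-based)
    """
    reactivity = np.asarray(reactivity, dtype=float)
    errors = np.asarray(errors, dtype=float)
    return reactivity.mean() / errors.mean()

# Made-up example values for four residues:
snr = signal_to_noise([0.5, 1.2, 0.1, 0.8], [0.05, 0.10, 0.02, 0.08])
# mean reactivity = 0.65, mean error = 0.0625, so snr = 10.4
```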

We also explored taking the S/N ratio of each nucleotide individually and averaging those, but that gave some wacky results that did not correspond to my manual assessment.
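One plausible way the per-nucleotide average can misbehave (this is an illustrative explanation, not necessarily what was observed): residues with very small estimated errors produce huge individual ratios that dominate the mean. A hypothetical sketch:

```python
import numpy as np

def per_residue_snr_mean(reactivity, errors):
    """Mean of per-residue S/N ratios (the discarded variant).
    A single residue with a tiny estimated error can dominate."""
    r = np.asarray(reactivity, dtype=float)
    e = np.asarray(errors, dtype=float)
    return np.mean(r / e)

# Two residues with identical reactivity, one with a very small error.
# The per-residue average is ~500, while the construct-level ratio
# (mean reactivity / mean error) would be only ~2.
inflated = per_residue_snr_mean([1.0, 1.0], [0.001, 1.0])
```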

The code for doing this is available in the file estimate_signal_to_noise_ratio.m in the MAPseeker package, which is not super well-documented (yet).

Again, do we have a wiki page that you (or others) could help me get this information into?

Oops, the link above is private. What happened is that we decided to put MAPseeker under a license; it will still be free for non-commercial use. Until I get the license page up, though, I can't release the software…