I think this would be extremely helpful for players. Checking the spreadsheet to see whether mutations are violation-free is very time-consuming.
When you were looking at the spreadsheet, did you notice that most of the protein sites are not mutable? Not all of them are constrained already, but ~75% of them are.
As for motifs, I have not seen evidence that all motif structures must be maintained for the ribosome to function. The fact that bases involved in a motif have changed in different species, as evidenced by the IUPAC coding, indicates that indeed those bases can change and not kill the ribosome. Furthermore, one of the researchers on this project is a motif expert and he is the person who decided which bases to lock.
A visual tool showing locations might be helpful for design and analysis; I’m not sure. Maybe disrupting a couple of motifs is the key that helps the ribosome fold better; we don’t know. For now, how great is it to have this detailed spreadsheet with so much data at our fingertips! Bravo Eli!
I am trying to write a JS script to count a group of characters such as “UUUU” in a sequence and report where their start locations are.
I have managed to count the occurrences with "var A2 = (A1.match(/(UUUU)/g) || []).length", where A1 is the sequence to search.
Does anyone know how I can get the indexes (sequence numbers) within the sequence?
As an example, I’m hoping to get a result like: UUUU has 4 occurrences at 12, 29, 47, 1129.
Also, I had to hard-code the UUUU. I need to make it a variable instead.
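One possible approach, offered as a minimal sketch rather than a definitive answer: loop with indexOf, which takes the search string as an ordinary variable, so nothing has to be hard-coded into a regular expression. The names findMotifStarts, reportMotif, and allowOverlap are made up for illustration. I am assuming that "sequence numbers" means 1-based positions and that matches should not overlap by default (the same behaviour as the match(/UUUU/g) count); both assumptions are marked in the comments.

```javascript
// Hypothetical helper: returns the start position of every occurrence
// of `motif` in `sequence`.
// Assumption: positions are 1-based, like sequence numbering.
// Assumption: matches do not overlap unless allowOverlap is true.
function findMotifStarts(sequence, motif, allowOverlap = false) {
  const starts = [];
  if (motif.length === 0) return starts; // an empty motif would never stop matching

  // After each hit, resume either one base later (overlapping)
  // or just past the end of the hit (non-overlapping).
  const step = allowOverlap ? 1 : motif.length;

  let index = sequence.indexOf(motif);
  while (index !== -1) {
    starts.push(index + 1); // convert 0-based index to 1-based position
    index = sequence.indexOf(motif, index + step);
  }
  return starts;
}

// Hypothetical helper: builds the report sentence from the positions.
function reportMotif(sequence, motif, allowOverlap = false) {
  const starts = findMotifStarts(sequence, motif, allowOverlap);
  const noun = starts.length === 1 ? "occurrence" : "occurrences";
  if (starts.length === 0) return `${motif} has 0 ${noun}.`;
  return `${motif} has ${starts.length} ${noun} at ${starts.join(", ")}.`;
}

// Made-up example sequence, with the motif held in a variable.
const A1 = "AAUUUUGGUUUUUC";
const motif = "UUUU";
console.log(reportMotif(A1, motif)); // UUUU has 2 occurrences at 3, 9.
```

Passing true as the third argument also counts overlapping hits, which matters for runs longer than the motif: a run of five U's contains two overlapping UUUU starts. If 0-based indexes are wanted instead, drop the "+ 1".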