[Strategy Market] GU and CU considered dangerous

The nucleotide sequences GU and CU should be considered dangerous because they have a strong potential for mismatch. Short sequences that avoid certain other patterns may sometimes get away with them, but it is probably good practice to avoid them.

If an RNA sequence has a GU sub-sequence that is paired with a GC sub-sequence and there is more than one possibly matching(*) GC sub-sequence, subtract 2 times the number of additional GC sub-sequences.

If an RNA sequence has a GU sub-sequence that is not paired with a GC sub-sequence and there is at least one possibly matching(*) GC sub-sequence, subtract 2 times the number of GC sub-sequences.

If an RNA sequence has a GU sub-sequence that is paired with an AU sub-sequence and there is at least one possibly matching(*) AC sub-sequence, subtract 1 times the number of AC sub-sequences.

If an RNA sequence has a CU sub-sequence that is paired with a GG sub-sequence and there is more than one possibly matching(*) CC sub-sequence, subtract 2 times the number of additional CC sub-sequences.

If an RNA sequence has a CU sub-sequence that is not paired with a GG sub-sequence and there is at least one possibly matching(*) CC sub-sequence, subtract 2 times the number of CC sub-sequences.

If an RNA sequence has a CU sub-sequence that is paired with an AG sub-sequence and there is more than one possibly matching(*) AG sub-sequence, subtract 1 times the number of additional AG sub-sequences.

(*): Note that when considering sub-sequences for alternate pairings they should be at an offset of at least 5 positions (to form at least a tri-loop) from the potential match.
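
As a rough illustration, the three GU rules above could be sketched in Python as follows. This is only one possible reading, not the EteRNA implementation. It assumes the target structure is given in dot-bracket notation, that "paired with" means both bases of the GU are stacked against the two bases of the partner sub-sequence, that the offset is measured between the start positions of the two sub-sequences, and that the rules are applied independently of one another. The names pair_table, candidates, paired_with and gu_penalty are hypothetical.

```python
def pair_table(structure):
    """Map each paired index to its partner, from a dot-bracket string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def candidates(seq, motif, i, min_offset=5):
    """Start indices of `motif` at least `min_offset` positions away from i.

    Assumption: an offset of 5 between start positions leaves room for
    the 3 unpaired bases of a tri-loop between the two sub-sequences.
    """
    return [k for k in range(len(seq) - 1)
            if seq[k:k + 2] == motif and abs(k - i) >= min_offset]


def paired_with(seq, partner, i):
    """Dinucleotide (read 5'->3') stacked against seq[i:i+2], or None."""
    j = partner.get(i)
    if j is None or partner.get(i + 1) != j - 1:
        return None
    return seq[j - 1:j + 1]


def gu_penalty(seq, structure):
    """Total amount to subtract for the GU rules (a positive number)."""
    partner = pair_table(structure)
    penalty = 0
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "GU":
            continue
        mate = paired_with(seq, partner, i)
        n_gc = len(candidates(seq, "GC", i))
        n_ac = len(candidates(seq, "AC", i))
        if mate == "GC":
            # Paired with GC: only the additional GC candidates count.
            if n_gc > 1:
                penalty += 2 * (n_gc - 1)
        elif n_gc >= 1:
            # Not paired with GC: every GC candidate counts.
            penalty += 2 * n_gc
        if mate == "AU" and n_ac >= 1:
            # Paired with AU: every AC candidate counts once.
            penalty += n_ac
    return penalty


print(gu_penalty("GUAAAAGCAAGC", "((....))...."))
```

In this example the GU at the start is paired with the first GC, and the second GC is the one additional candidate, so the sketch reports a penalty of 2. The CU rules could presumably be handled the same way by swapping in the corresponding sub-sequences and weights.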

Dear jandersonlee,

Your strategy has been added to our implementation queue with task id 138. You can check the schedule of the implementation here.

Thanks for sharing your idea!

EteRNA team