Pattern in noncoding regions of the SARS-CoV-2 reference

Abinmorth · February 21, 2021, 3:51pm

I was digging into the noncoding regions of the NCBI reference genome
("unindexed" positions that don't code for a gene):

21556…21562	between ORF1ab…spike	acgaaca
25385…25392	between spike…ORF3a	acgaactt
26221…26244	between ORF3a…ORF4	gcacaagctgatgagtacgaactt
26473…26522	between ORF4…ORF5	acgaactaaatattatattagtttttctgtttggaactttaattttagcc
27192…27201	between ORF5…ORF6	gtgacaacag
27388…27393	between ORF6…ORF7a	acgaac
27888…27893	between ORF7b…ORF8	acgaac
28260…28273	between ORF8…ORF9	acgaacaaactaaa
29534…29557	between ORF9…ORF10	actcatgcagaccacacaaggcag
29658…29674	between ORF10…3' UTR	actttaatctcacatag

most of these (7 of the 10) contain the motif acgaac
does anyone know more about this motif?
is this sars cov 2 specific or does it appear in other sequences?
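
For anyone who wants to re-check which of the regions above carry the motif, here is a minimal Python sketch. The sequences are copied from the table in this post; the dictionary labels and the function name `split_by_motif` are just made up for illustration.

```python
# Intergenic sequences copied from the table above (label -> sequence).
# The labels are shorthand invented for this sketch.
INTERGENIC = {
    "ORF1ab-spike": "acgaaca",
    "spike-ORF3a": "acgaactt",
    "ORF3a-ORF4": "gcacaagctgatgagtacgaactt",
    "ORF4-ORF5": "acgaactaaatattatattagtttttctgtttggaactttaattttagcc",
    "ORF5-ORF6": "gtgacaacag",
    "ORF6-ORF7a": "acgaac",
    "ORF7b-ORF8": "acgaac",
    "ORF8-ORF9": "acgaacaaactaaa",
    "ORF9-ORF10": "actcatgcagaccacacaaggcag",
    "ORF10-3UTR": "actttaatctcacatag",
}

MOTIF = "acgaac"


def split_by_motif(regions, motif):
    """Return (labels containing the motif, labels without it), in table order."""
    with_motif = [name for name, seq in regions.items() if motif in seq]
    without_motif = [name for name, seq in regions.items() if motif not in seq]
    return with_motif, without_motif


with_motif, without_motif = split_by_motif(INTERGENIC, MOTIF)
print(len(with_motif), "of", len(INTERGENIC), "regions contain", MOTIF)
print("without the motif:", without_motif)
```

On the table as posted, the regions without the motif are ORF5…ORF6, ORF9…ORF10 and ORF10…3' UTR.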

edit: found a paper where this sequence has been mentioned:

DigitalEmbrace · February 22, 2021, 4:27pm

How cool that you found this sequence and research showing the role it plays in viral transcription. I’m going to guess those six nucleotides line up exactly with two amino acids, if you would like to check the amino acid sequence (either in the puzzle or elsewhere). That sequence does seem like an intriguing therapeutic target given that it repeats 10x and primarily occurs in stem loops, if I’m reading the paper correctly.

Abinmorth · February 22, 2021, 6:27pm

acgaac = TN

the complement ugcuug = CL
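
The two translations above can be reproduced with a few lines of Python. This is only a sketch using the standard genetic code; the helper names (`to_rna`, `complement`, `translate`) are invented here and not taken from any library.

```python
# Standard genetic code, built from the usual U/C/A/G ordering of codons.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def to_rna(seq):
    """Upper-case a DNA or RNA string and write T as U."""
    return seq.upper().replace("T", "U")


def complement(rna):
    """Base-by-base complement, read in the same left-to-right order."""
    return rna.translate(str.maketrans("ACGU", "UGCA"))


def translate(rna):
    """Translate complete codons starting at the first base ('*' = stop)."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODONS[rna[i:i + 3]] for i in range(0, usable, 3))


motif = to_rna("acgaac")
comp = complement(motif)
print(motif, "=", translate(motif))   # ACGAAC = TN
print(comp, "=", translate(comp))     # UGCUUG = CL

# Base pairing is antiparallel, so the strand that would actually pair with
# the motif, written 5'->3', is the reverse complement.
rev_comp = comp[::-1]
print(rev_comp)                       # GUUCGU
```

One thing to keep in mind: because pairing is antiparallel, a search for a pairing partner would normally look for the reverse complement (gttcgt in DNA letters) rather than the plain complement.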

tgcttg shows up 13 times in ACE2, also in noncoding regions, but that could be coincidence
(I’ve looked up the indexes manually at Homo sapiens angiotensin converting enzyme 2 (ACE2), RefSeqGene on chr - Nucleotide - NCBI - may have missed some)
starting positions are:
454
4826
5391
6916
7907
13203
13688
17858
27434
27443
30347
35351
39686
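
Instead of looking the positions up by hand, the search can be scripted. Below is a small sketch that reports 1-based start positions, including overlapping hits. It runs on a toy string here; the function names and the FASTA file name mentioned afterwards are hypothetical.

```python
def find_motif_starts(sequence, motif):
    """Return the 1-based start position of every match, overlaps included."""
    sequence = sequence.lower()
    motif = motif.lower()
    starts = []
    pos = sequence.find(motif)
    while pos != -1:
        starts.append(pos + 1)                 # convert 0-based index to 1-based
        pos = sequence.find(motif, pos + 1)    # step by one so overlaps are kept
    return starts


def read_fasta_sequence(path):
    """Read a single-record FASTA file and return the sequence as one string."""
    with open(path) as handle:
        return "".join(line.strip() for line in handle if not line.startswith(">"))


# Toy sequence with three copies of the motif, just to show the output format.
toy = "aatgcttgcctgcttgtgcttg"
print(find_motif_starts(toy, "tgcttg"))   # [3, 11, 17]
```

With the RefSeqGene record saved locally as FASTA (say `ace2_refseqgene.fasta`, a made-up file name), `find_motif_starts(read_fasta_sequence("ace2_refseqgene.fasta"), "tgcttg")` should give a list that can be compared against the manual one above.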

DigitalEmbrace · February 22, 2021, 8:38pm

Oh wait, your original post was about non-coding regions. Nothing to do with the amino acid sequence or our puzzles. Scratch what I suggested. Your second post is listing a complementary sequence found in ACE2 that might pair up with the acgaac sequence in SARS-CoV-2?

Abinmorth · February 23, 2021, 8:11am

at least I assumed that those are non-coding regions, because those positions don't show up in the reference record (source Severe acute respiratory syndrome coronavirus 2 isolate Wuhan-Hu-1, co - Nucleotide - NCBI)
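
The "positions that don't show up in the record" idea can be sketched as a gap search over annotated feature ranges. The coordinates below are toy numbers, not taken from the NCBI record, and `uncovered_ranges` is a name invented for this sketch.

```python
def uncovered_ranges(features, start, end):
    """Return inclusive 1-based (start, end) ranges inside [start, end]
    that are not covered by any feature. Features may overlap."""
    gaps = []
    cursor = start                      # first position not yet known to be covered
    for f_start, f_end in sorted(features):
        if f_start > cursor:
            gaps.append((cursor, f_start - 1))
        cursor = max(cursor, f_end + 1)
    if cursor <= end:
        gaps.append((cursor, end))
    return gaps


# Toy annotation: four features, two of which overlap.
toy_features = [(1, 10), (15, 20), (18, 30), (36, 40)]
print(uncovered_ranges(toy_features, 1, 45))   # [(11, 14), (31, 35), (41, 45)]
```

A range coming out of this is only "not annotated as a feature in the list you passed in", which is the same assumption as above: it doesn't prove the region is non-coding.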

I was searching for the complementary sequence in ace2, but tbh the pattern isn’t that long, so you can probably find it anywhere.
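
To put a rough number on "you can probably find it anywhere": under the simplifying assumption that all four bases are equally likely and independent, a given 6-mer is expected about once every 4^6 = 4096 positions. The sequence length of 40,000 used below is an assumption (the last hit listed above starts at 39686, so the record is at least that long), and `expected_hits` is a name made up for this sketch.

```python
def expected_hits(seq_len, motif_len, alphabet_size=4):
    """Expected number of exact matches of one fixed motif on one strand,
    assuming uniform, independent bases."""
    windows = max(seq_len - motif_len + 1, 0)   # number of possible start positions
    return windows / alphabet_size ** motif_len


ASSUMED_LENGTH = 40_000   # assumption, not the actual record length
print(round(expected_hits(ASSUMED_LENGTH, 6), 1))
```

That comes out at roughly 10 expected hits by chance alone, so 13 observed matches is in the range you'd expect from a random sequence of that size.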