Coronavirus: Genome packing signal
“The assembly of infectious coronavirus particles requires the selection of viral genomic RNA from a cellular pool that contains an abundant excess of non-viral and viral RNAs. Among the seven to ten specific viral mRNAs synthesized in virus-infected cells, only the full-length genomic RNA is packaged efficiently into coronavirus particles. Studies have revealed cis-acting elements and trans-acting viral factors involved in the coronavirus genome encapsidation and packaging. Understanding the molecular mechanisms of genome selection and packaging is critical for developing antiviral strategies and viral expression vectors based on the coronavirus genome.”
https://en.wikipedia.org/wiki/Coronavirus
One more thing I took note of is the cis-acting (elements on the RNA molecule itself) and trans-acting (separate factors, such as proteins, that act on it) viral factors. Now I wonder if some of the structures of the conserved sequences in your paper are such factors?
Like how many of them are before a gene?
I also wonder if the job of some of them is to slow down the ribosome after a gene is made, so that the gene part just made can get cut.
I am also searching for an overview of the content of the coronavirus genome, so I can compare it to the numbers in your paper.
“In common with the genomes of all other RNA viruses, coronavirus genomes contain cis-acting RNA elements that ensure the specific replication of viral RNA by a virally encoded RNA-dependent RNA polymerase. The embedded cis-acting elements devoted to coronavirus replication constitute a small fraction of the total genome, but this is presumed to be a reflection of the fact that coronaviruses have the largest genomes of all RNA viruses. The boundaries of cis-acting elements essential to replication are fairly well-defined, and the RNA secondary structures of these regions are understood. However, how these cis-acting structures and sequences interact with the viral replicase and host cell components to allow RNA synthesis is not well understood.” (Also from the same Wikipedia article.)
I took note that only the full-length genome of viral RNA is packaged. So it is really used as one giant mRNA in the host ribosomes.
Rhiju: There is a hypothesis that UUYCGU apical loops (‘hairpins’) in SL5 in the 5’ UTR are packaging signals (‘cis acting factors’)
That’s for SARS Coronavirus and some other beta coronaviruses
I haven’t yet seen experiments that test that idea — keep your eyes open.
here’s the paper with the original proposal, from 2010:
Group-specific structural features of the 5’-proximal sequences of coronavirus genomic RNAs
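To make the UUYCGU motif concrete, here is a minimal Python sketch of how one could scan a sequence for it. Y is the standard code for a pyrimidine (C or U), so the motif covers UUUCGU and UUCCGU. The function name `find_uuycgu` and the toy sequence are made up for illustration; this is not the real 5’ UTR and not the method used in the paper. It only finds the sequence pattern and says nothing about whether a hit actually sits in a hairpin loop.

```python
import re

# UUYCGU with Y = pyrimidine (C or U).
# The lookahead lets overlapping hits be reported as well.
MOTIF = re.compile(r"(?=(UU[CU]CGU))")

def find_uuycgu(seq):
    """Return (start, end, match) for each UUYCGU hit, 1-based inclusive."""
    # Accept DNA-style input too by converting T to U.
    rna = seq.upper().replace("T", "U")
    return [(m.start() + 1, m.start() + 6, m.group(1))
            for m in MOTIF.finditer(rna)]

# Made-up toy sequence, NOT the real SARS-CoV-2 5' UTR.
toy = "GGAUUUCGUCCAAGGUUCCGUAA"
hits = find_uuycgu(toy)
for start, end, match in hits:
    print(f"{match} at positions {start}-{end}")
```

Positions are reported 1-based so they can be compared with position ranges quoted from a genome or a figure.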
Eli: 5’ UTR of the coronavirus as taken from your preprint:
5’ UTR image with Chen/Olsthoorn genomesignal hairpins highlighted
Notice the two marked loops at the bottom of the image.
A partial match to the sequence is also found in the pseudoknot of the frameshift:
Pseudoknot frameshift
At positions 200-205 and 238-243
Eli: Actually this looks exactly like this one:
Conserved hexaloops from the Chen/Olsthoorn paper:
Rhiju das: yes exactly
3 conserved endloops
I wonder if there is anything these UUUCGU sequences are complementary to.
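One simple way to start on that question is to compute the reverse complement of the loop and look for it elsewhere in a sequence. The sketch below is only a rough illustration: it assumes strict Watson-Crick pairing (A-U, G-C) and ignores G-U wobble pairs and any structural constraints. The names `reverse_complement` and `find_partners` and the toy sequence are made up for this example.

```python
# Watson-Crick complement table for RNA (G-U wobble pairs are ignored).
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the reverse complement of an RNA (or DNA-style) sequence."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]

def find_partners(loop, seq):
    """Return 1-based start positions where seq could fully pair with loop."""
    target = reverse_complement(loop)
    rna = seq.upper().replace("T", "U")
    positions = []
    start = rna.find(target)
    while start != -1:
        positions.append(start + 1)
        # Step by one so overlapping sites are not missed.
        start = rna.find(target, start + 1)
    return positions

partner = reverse_complement("UUUCGU")
print(partner)

# Made-up toy sequence, not taken from any real genome.
print(find_partners("UUUCGU", "GGACGAAACC"))
```

For UUUCGU the fully complementary partner is ACGAAA, so any search of a real genome or host transcript would be looking for that six-nucleotide stretch. A six-mer is short enough to turn up by chance, so hits alone would not mean much.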
Eli: These two mostly pyrimidine stretches kind of remind me of this:
iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution
I wonder if the other coronaviruses show the same 4 way junction structure, as above?
Rhiju das: the Chen/Olsthoorn paper shows the SL5 for lots of coronaviruses — the 4WJ is unique to SARS/SARS-II and closely related.
Eli: 4WJ?
Ah (4 way junction)
So structure and hairpin loops are conserved, while longer stretches of sequence in this area are not. Ok, I can see that 80-197 and 120-237 are also pretty conserved according to your paper.
Now I wonder what the packaging signals for DNA look like. Assuming there are some too.
I mean, what starts the assembly of histones? Or nucleoproteins?
I wonder what would happen if one threw the 5’ UTR in a test tube with a lot of nucleoproteins.
Rhiju: Thumbs up
Eli: And I see that Chen/Olsthoorn would also like to know that:
“Future experiments should also verify whether the conserved UUYCGU motifs in SL5 function as PS in group I and II CoVs by interacting with nucleocapsid and/or membrane proteins.”
Eli: If these UUUCGU hairpin loops really are a packaging signal for the viral genome, I wonder if they could be used as a medical target. I mean, no packaging, no further infection. The 6 nt hairpin loops look suspiciously like one half of a kissing loop. I wonder if such a partner could be made to target the hairpin loop and if this would affect replication of coronavirus. Alternatively perhaps an RNA aptamer could be evolved to target this region.
Alternatively, is any protein or drug known to bind to such a sequence/structure?
One more thing I have been wondering about: whether the coronavirus genome can be seen as a sort of operon.
I mean if not all proteins have to be made at the same time, it would make sense to have an operon to orchestrate when what is made.
A normal operon has a promoter and operator. The operator can be downregulated. Not sure coronavirus is interested in downregulating itself. So this may not be there.
On gene regulation and timing I recall reading that the M, N and S proteins were made and followed the same pathway.
“The viral structural proteins S, E, and M move along the secretory pathway into the Golgi intermediate compartment.”
https://en.wikipedia.org/wiki/Coronavirus
Since in E. coli the lac operon is activated by lactose, I have started wondering about viral metabolism: what energy source (ATP, lipids, glucose, steroids, etc.) the virus prefers, and whether such an energy source could be an activator.
I have read a bit more and found that viruses are known to up- and downregulate the metabolism of the host, and that certain viruses have specific preferences. I have read this paper, which I find very interesting:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3004434/
I have yet to find anything on coronaviruses and their metabolism. I would like to know what would happen with lung cells infected with SARS-2 and put on different diets.
Eli: I have found several papers on COVID that say that coronavirus upregulates an inflammation pathway.
Rhiju das:
> “Alternatively, is any protein or drug known to bind to such a sequence/structure?”
I’ve been wondering the same thing
>A normal operon has a promoter and operator. The operator can be downregulated. Not sure coronavirus is interested in downregulating itself. So this may not be there.
It's there. The virus has to change the transcription of its full genome vs. subgenomic RNAs to switch between replication (early) and then packaging (late). Lots of empirical information is known, but it is unclear how the different amounts of RNA are turned on/off. Many have hypothesized that the ‘logic’ is embedded in the structures of these elements.
Gene regulation and the order of the operon by Amoeba Sisters
I have since dropped the idea that the S, E and M proteins are in an operon, since they are not present in coronavirus in the same amounts.