Rachael Kretsch has published an astonishing new preprint paper titled Naturally ornate RNA-only complexes revealed by cryo-EM where she reveals the 3D structure of three large RNA-only molecules. I’m blown away by what she found. All three are much more ordered and symmetrical than I’ve seen in other RNA structures.
I’ll present all three individually for players to view and perhaps use for inspiration in our current 240mer design lab project.
The Ornate Large Extremophilic (OLE) RNA consists of two identical pieces of RNA that connect at three sites.
Our OLE dimer map shows that it is organized as a series of parallel A-form helices, like a bundle of pipes. The exterior ends of these pipes from each chain are interconnected into a five-way junction…
An unusual but highly conserved symmetric interaction comprised of four A-A base pairs between two chains (L4, Fig. 1D), intermolecular base-pairing and stacking interactions connecting L5, L6, and L7 (Fig. 1E), and a kissing loop (L9.3, Fig. 1F) ‘weld’ the pipes together in the middle of the complex. Hereafter we denote these intermolecular interactions ‘bridges’ B1-B3, as used in ribosome nomenclature.
One thing I find useful to learn from this new structure map is how highly paired P5 and P7 are with P5 featuring GU pairs and P7 featuring AU pairs. This almost looks engineered!
Moving on, the other two structures are multimeric, meaning several copies of identical RNA come together to form a structure, in this case a hollow sphere. The ROOL sphere consists of 8 copies and the GOLLD sphere consists of 14 copies. These appear to be an entirely new class of structures.
Symmetric multimers are common among proteins and rationally designed RNA molecules, but observations of natural RNA multimers are rare. When observed, natural RNA homomeric interactions typically involve a single contact.
Both of these structures have numerous intermolecular contacts, referred to as bridges in the paper, that connect the individual chains (copies). The intramolecular contacts are drawn as lines as normal, and the intermolecular contacts are listed as B1, B2, B3, etc. (All of this makes them look even more like a Death Star to me.) Both structures form a nanocage larger than the ribosome!
The Rumen-Originating, Ornate, Large (ROOL) RNA consists of two half-shells, each formed by four chains (copies) of the 659-base RNA.
The mix of pairs in the long stems of the ROOL is more in line with what I typically see in natural RNA, although still a little heavy in AU pairs. I don’t have much more to say about the secondary structure except that AA non-canonical pairs are again more common than I would expect, and the lack of GU/UG crossing pairs makes sense since each RNA is essentially flat. (GU crossing pairs introduce flexibility.)
@DigitalEmbrace, again a supercool explanation of the article!
I was wondering whether there is a protein equivalent of this RNA Death Star, like a self-assembling hollow ball.
It hit me where I had seen it before: the protein capsid (shell) of a virus. Viral coats are typically assembled from repeated copies of the same protein. That is an advantage, as the virus then only needs one gene for its coat protein. It just makes more of it.
The natural question now is whether a virus exists with an RNA coat shell.
Also I wonder if RNA capsids will generally be smaller than protein capsids.
Ok, I think I can answer the latter one myself. The nanocages of those RNA balls can be larger than ribosomes. Ribosomes are similar in size to or much smaller than viral capsids.
The chaperonin GroEL GroES protein complex forms chambers to correct or assist protein folding. It’s ATP driven and has a pair of reciprocating chambers. While one GroEL chamber is occupied and closed off by a GroES cap, the other chamber can’t be used.
Vaults are another example of compartments formed by protein. Not much seems to be known about them. Also composed of two halves. About 95% protein and 5% RNA.
She said: Maybe you can solve the mystery of its function.
I have been speculating about whether these RNA nanocages could be envelopes for viruses. But I had no idea which.
I read up on ROOL and I can see that it exists in both bacteriophages and bacteria (some of the latter are not affected by the phage in question).
My hypothesis is that the nanocages are capsids/envelopes for bacteriophages, as an alternative way for the virus to get out of a cell.
As Wikipedia says on prophages (Host integrated phages):
The cell may fill with new viruses until it lyses or bursts, or it may release the new viruses one at a time in an exocytotic process.
Why then would bacteria have ROOL sequences in their genome? I’m guessing it could be to keep phages handy for an attack on other bacteria.
So I need to figure out whether a phage is small enough in diameter to fit inside its ROOL cage.
Understanding the context of the ROOL sequence
I did a BLAST with the ROOL sequence. It looks like it mainly turns up in an organism L. salivarius with plasmids, specifically on the megaplasmid. Perhaps the nanocage is for the plasmid?
I read a paper referenced in the ROOL family entry in RFAM. It is from before the lncRNA was named ROOL, but I could line its sequence up with my ROOL with a few gaps. This paper says:
High-level lncRNA expression correlated with high megaplasmid copy number.
The presence of megaplasmids is a distinguishing and unifying feature of L. salivarius, and these plasmids range in size from 100 kb to approximately 400 kb, in linear or circular forms
The GOLLD RNA in Lactobacillus brevis ATCC 367 was studied experimentally. This GOLLD RNA is apparently encoded by a prophage, and its transcription is increased during the phage lytic cycle.
I’m imagining another packing style than the usual bacteriophage packaging inside a “syringe”. Rather, the nanocaged phage would be disposed of through the waste disposal system of the attacking bacterium, as an extracellular vesicle. It would then be taken up by an unsuspecting, phage-susceptible bacterium nearby, in the expectation that it is food.
So while the nanocage may be used by the phage genome itself, perhaps the phage has just extended the service to the bacterial plasmid that was kind enough to host it.
Here is the ROOL sequence. When “Change region shown” is changed to the whole sequence, one can see the megaplasmid size.
Genome size of the plasmid with the ROOL = 405494 bp, which is about 405 kilobases (kb).
According to this capsid volume versus genome size slide, my ROOL nanocage is too small to contain the megaplasmid. So I think the phage is cutting itself out of the megaplasmid before it gets packed in the nanocage.
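The volume-versus-genome-size comparison above can be roughed out in a few lines. This is only a back-of-envelope sketch: the DNA dimensions are textbook B-DNA values, while the inner cage diameter (20 nm) and the packing fraction (0.5) are placeholder assumptions of mine, not measurements from the paper or the slide.

```python
import math

BP_RISE_NM = 0.34    # B-DNA rise per base pair (textbook value)
DNA_RADIUS_NM = 1.0  # B-DNA radius, i.e. about 2 nm diameter (textbook value)

def dna_volume_nm3(genome_bp):
    """Volume of double-stranded DNA modelled as a solid cylinder."""
    return math.pi * DNA_RADIUS_NM ** 2 * BP_RISE_NM * genome_bp

def sphere_volume_nm3(inner_diameter_nm):
    """Volume of a spherical cavity with the given inner diameter."""
    radius = inner_diameter_nm / 2
    return 4 / 3 * math.pi * radius ** 3

def max_genome_bp(inner_diameter_nm, packing_fraction=0.5):
    """Largest dsDNA genome (in bp) that fits, given an ASSUMED packing fraction."""
    usable = sphere_volume_nm3(inner_diameter_nm) * packing_fraction
    return int(usable / dna_volume_nm3(1))

def fits(genome_bp, inner_diameter_nm, packing_fraction=0.5):
    """True if a genome of this size fits inside the cavity."""
    return genome_bp <= max_genome_bp(inner_diameter_nm, packing_fraction)

# Hypothetical 20 nm cavity (placeholder value) versus the 405494 bp megaplasmid.
print(max_genome_bp(20))
print(fits(405494, 20))
```

With the placeholder 20 nm cavity the capacity comes out at roughly 2 kb, far below the megaplasmid. Swapping in a real inner diameter changes the number, but the capacity only grows with the cube of the diameter.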
A good bet for where to look for other RNA nanocages could be in phages, plasmids and bacteria, in particular lncRNA’s belonging to phages.
Also, RNA nanocage sequences seem to have a special base frequency. This may have something to do with avoiding misfolding with themselves and elsewhere. They have a generally low content of C’s, a base that would typically be involved in base pairing.
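For anyone who wants to check the base frequency of a sequence themselves, here is a minimal sketch. The function name and the example sequence are made up for illustration; paste in a real ROOL or GOLLD sequence to see its actual composition.

```python
from collections import Counter

def base_frequencies(seq):
    """Return the fraction of A, C, G and U in a sequence (T is read as U)."""
    seq = seq.upper().replace("T", "U")
    counts = Counter(base for base in seq if base in "ACGU")
    total = sum(counts.values())
    if total == 0:
        return {base: 0.0 for base in "ACGU"}
    return {base: counts[base] / total for base in "ACGU"}

# Made-up example sequence, not a real nanocage RNA.
example = "GGAUAAUGAAAGUUAACGAAUUAG"
for base, fraction in base_frequencies(example).items():
    print(base, round(fraction, 2))
```

Comparing the C fraction against a set of unrelated RNAs of similar length would show whether the low C content is really unusual.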
I realized I missed the bit in Rachael’s paper saying that the parent phages are too large to fit inside ROOL and GOLLD. My bad. I also no longer believe that the nanocages carry their phage genome. I have been looking at a lot of sequences with GOLLD and ROOL. My dataset is messy and large, as I built onto it while I learned about different phage tools.
I got the sequences from RFAM. Then I blasted them to look at related sequences. I typically used a cutoff of above 80% for both identity and query coverage.
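The 80% cutoff can also be applied to downloaded BLAST results instead of by eye. The sketch below assumes the hits were saved in BLAST's default tabular format (`-outfmt 6`, where column 3 is percent identity and columns 7 and 8 are query start and end). The function name, the example hit lines and the accession-like IDs are invented for illustration.

```python
def filter_hits(lines, query_len, min_identity=80.0, min_coverage=80.0):
    """Keep hits above both cutoffs; returns (subject_id, identity, coverage)."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        subject_id = cols[1]
        identity = float(cols[2])
        q_start, q_end = int(cols[6]), int(cols[7])
        # Share of the query covered by this single alignment, in percent.
        coverage = 100.0 * (abs(q_end - q_start) + 1) / query_len
        if identity > min_identity and coverage > min_coverage:
            kept.append((subject_id, identity, round(coverage, 1)))
    return kept

# Invented hit lines for a 659-base query (the length of ROOL).
hits = [
    "query\tsubjA\t95.0\t600\t30\t0\t1\t600\t100\t699\t1e-100\t900",
    "query\tsubjB\t75.0\t650\t160\t2\t1\t650\t10\t659\t1e-50\t400",
    "query\tsubjC\t99.0\t200\t2\t0\t1\t200\t5\t204\t1e-80\t350",
]
for hit in filter_hits(hits, query_len=659):
    print(hit)
```

Note that this looks at each alignment separately. A subject whose match is split over several alignments would need its pieces merged before the coverage is computed.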
What types of phages are ROOL and GOLLD found in?
GOLLD and ROOL are found in both temperate and virulent phages. I find the latter particularly interesting as they bring much less genetic baggage. Potentially making it easier to figure out what the nanocages are for. I used Pha-Box to tell me if a bacteriophage was likely virulent or temperate.
GOLLD and ROOL are often positioned in defense islands
I used IslandViewer4 to check where GOLLD and ROOL were positioned in relation to defense islands. Bacteria often have clusters of genes that they use for dealing with phages and stress.
I found that in bacteria, GOLLD and ROOL were often positioned inside such defense islands, along with other genes used for bacterial defense against phages or stress. That is, if they are not in an intact phage in the bacteria; in such cases they are surrounded by phage genes. I used PHASTER to get an idea of whether a phage is intact. The program gives a score based on the number of normal phage parts and whether they are positioned so that they can form a complete phage. I also used it to check if there was an overlap between a potential phage and GOLLD or ROOL.
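Checking whether a GOLLD or ROOL hit overlaps a predicted phage region or a defense island comes down to comparing coordinates. Here is a minimal sketch of that comparison, assuming 1-based inclusive coordinates on the same genome. The function names and all coordinates are hypothetical; they are not output from PHASTER or IslandViewer4.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of bases shared by two 1-based inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)

def classify(feature, region):
    """Return 'inside', 'partial' or 'none' for a feature versus a region."""
    f_start, f_end = feature
    r_start, r_end = region
    shared = overlap_bp(f_start, f_end, r_start, r_end)
    if shared == 0:
        return "none"
    if shared == f_end - f_start + 1:
        return "inside"
    return "partial"

# Hypothetical coordinates for illustration only.
golld_hit = (152_300, 153_100)
regions = {
    "predicted prophage": (140_000, 180_000),
    "defense island": (152_900, 170_000),
    "unrelated island": (300_000, 320_000),
}
for name, region in regions.items():
    print(name, classify(golld_hit, region))
```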
Here is a case where GOLLD is found inside a defense island with IslandViewer4:
One could be snatching the tRNA’s resources or ribosome parts and preventing their degradation, from the phage’s perspective. Or making them unavailable for the phage, from the bacteria’s perspective.
My main hypothesis for now is that the RNA nanocages are used for defense. My reason for this is that they are not only in temperate phages, which often integrate into a bacterium and are not always aggressive. They are also in virulent phages. Virulent phages don’t carry much extra genetic weight that is not used directly for hostile takeover of a bacterium.
Bacteria have toxin-antitoxin systems. As long as their genome is safe, they happily produce enough antitoxin to deactivate the toxin they also have. If a bacteriophage chops up their genome or transcription system, they stop producing antitoxin to counter the toxin, letting the toxin loose on the bacteriophage.
If RNA nanocages function as an antitoxin or defense system, this could explain why both bacteria and phages keep them. In bacteria they are often close to a toxin or defense system. Toxin and antitoxin are normally close to each other in bacteria. But in phages, they seem distant from known toxins. This may suggest that they aren’t for neutralizing a phage toxin, especially because they are also in virulent phages, which are all on attack. Meanwhile they are near hypothetical proteins, which could very well be unknown toxins.
So both bacteriophages and bacteria could have benefited from having a toxin-capturing nanocage: for defense as a bacterium, or for capturing bacterial toxins as a bacteriophage on the attack.
Since ROOL and GOLLD are found near nucleases, they could also be holding back these.
So if the nanocages are antitoxins, they could be there to counter either internal or external toxins.
Basically I think the nanocages could be for capturing things the organism wants or things the organism doesn’t want. Both are useful features.
Overlap between GOLLD and other lncRNA’s
I was looking through bacteria that had GOLLD embedded (in RFAM). I found some where sections of GOLLD matched the histidine or proline tRNA, so the tRNA was not just near the lncRNA as mentioned in the paper. According to the paper, both ROOL and GOLLD exist near tRNA islands in phages. This also regularly happens in bacteria.
Rachael’s figure 4bc illustrates that GOLLD could physically hold a tRNA and ROOL could hold the large ribosomal subunit.
Here are examples of cases where such ribosomal sequences are overlapping with GOLLD
Two tRNA’s, 5.8S + large ribosomal subunit (LSU) matching GOLLD
I have also found multiple hypothetical proteins overlapping with GOLLD. This is particularly the case in phages. Perhaps GOLLD in some bacteriophages has a different target. It could also be that GOLLD has mutated so far from its origin that it no longer forms a cage, and the phage has opted to get some more useful proteins instead.
Nucleases are also regularly near ROOL. The same is the case for HEARO and OLE (ribonuclease).
I was interested in whether other phage clusters from Phamerator had similar space for a nanocage (these lncRNA’s leave holes between the proteins, which are normally very closely packed in the phage genome). So I listed clusters with tRNA islands. I also listed clusters with an exonuclease that has an empty space beside it. I went for sizes above 200 bases. I found a good bunch of other clusters with space in them. My hope is to find GOLLD, ROOL or other potential nanocages/lncRNA’s in these. See my data spreadsheet for these:
Sheets named the following:
Phamerator clusters with tRNA islands
Phamerator clusters with gaps near exonucleases
Phamerator tRNA gap
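The gap search behind these sheets (empty stretches above 200 bases between neighboring genes) can be sketched as below. This is my own minimal illustration, not how Phamerator works internally. It assumes a linear genome with 1-based inclusive gene coordinates, ignores the two genome ends, and uses made-up coordinates.

```python
def find_gaps(genes, min_gap=200):
    """Return (gap_start, gap_end, length) for gene-free stretches above min_gap.

    genes: list of (start, end) tuples, 1-based inclusive, in any order.
    """
    gaps = []
    covered_to = None  # rightmost base covered by any gene seen so far
    for start, end in sorted(genes):
        if covered_to is not None:
            gap_len = start - covered_to - 1
            if gap_len > min_gap:
                gaps.append((covered_to + 1, start - 1, gap_len))
            covered_to = max(covered_to, end)
        else:
            covered_to = end
    return gaps

# Made-up gene coordinates with one overlapping pair and one large hole.
genes = [(1, 500), (520, 1000), (1400, 2000), (900, 1100)]
print(find_gaps(genes))
```

Tracking the rightmost covered base, rather than just the previous gene's end, keeps overlapping or nested genes from producing false gaps.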
GOLLD showing up in the phage Nanosmite in the M3 subcluster as a gap among the proteins:
ROOL and GOLLD are sometimes found in some of the same types of phages and bacteria. They are sometimes also found near similar defense systems. I used DefenseFinder to get an idea of whether ROOL or GOLLD was near a specific type of defense system, e.g. CRISPR.
I find GOLLD and ROOL in company with defense genes in bacteria and plasmids. That is, when ROOL and GOLLD are no longer inside an intact prophage (temperate phage), they are still found around phage genes. So GOLLD and ROOL are probably kept by bacteria while other phage genes are inactivated.
Some of these defense systems do get a ping in IslandViewer4, which highlights defense areas, but not all of them. My current working hypothesis is that these lncRNAs are markers of defense systems, some of which are not yet registered as such. In the case of virulent phages also having them, I suspect they are put to use there for attack. For prophages they are probably for defense against bacteria and other phages.
Perspective
I wonder if GOLLD and ROOL are interchangeable?
I have been wondering if some of the other lncRNA’s with unknown functions would also be found in phages. I found this in the case of HEARO. While there are not many organisms with ARRPOF, I have found a few phage cases there too. HEARO seems to pop up in defense islands also. Sometimes in crazy amounts.
I think it will be worth searching for lncRNA’s with unknown function in phage genomes and in bacterial defense islands. This may give a hint about their true nature. Plus I think it is worth looking for holes with no proteins in phage genomes. It will be particularly interesting if such a gap is conserved among clusters. Phages are nothing if not efficient; they don’t seem to have unused gaps lying around. If they take too long to replicate in their host, they run the risk of not making it to the next phage generation. Potentially these are spots to discover new ln(c)RNA’s. I suspect that phages/transposons are the creators of some of these ln(c)RNAs. If they are useful for phages, bacteria or plasmids, they stick around.
SOME BACKGROUND
Bacteriophage fairytale
I found a beautiful research summary in the blog “Small things considered” that reads like a story.
To sum it up: It has long been a mystery why phages regularly brought their own tRNA’s to the replication party, when they could just steal their host’s.
But the host regularly uses nucleases to degrade its own translational machinery, rather than become a phage factory. Turns out the phages can bring their own less degradable tRNA’s.
Bacterial war horror story
I found a video that nicely describes the battle between bacteria and phages.
Different powers of bacteria and bacteriophage
Power of bacteria
Bacteria can use nucleases to cut up phage tRNA’s or their own ribosomes. This leaves no resources for the phages and potentially saves their bacterial mates.
Bacteria can use retrons – a reverse transcriptase plus a ncRNA template making a DNA-RNA hybrid with itself.
Power of bacteriophage
Bring their own less degradable tRNA’s
Chop up the host genome for access to more building blocks.
Phages like to integrate themselves in tRNA islands or even inside a host tRNA. Either disrupting it but bringing their own equivalent. Or only allowing the host to get the tRNA in question, if the phage is replicated also.
Defense islands
“It has become clear in recent years that anti-phage defense systems cluster non-randomly within bacterial genomes in so-called “defense islands”.
We find that anti-phage defense systems are almost always carried on mobile genetic elements such as prophages, transposons and conjugative elements. These elements integrate at specific locations, or “hotspots”, within the E. coli genome. Different anti-phage defense systems are carried by distinct types of mobile genetic elements that preferentially integrate at specific hotspots, explaining why phage resistance profiles can vary significantly even among closely related E. coli strains.
Bacterial anti-phage defense systems were shown to be non-randomly distributed in microbial genomes [6,9,10]. Such systems were observed to frequently co-localize in bacterial and archaeal genomes, forming so-called “defense islands”: genomic regions in which multiple defense systems cluster together [6,9,10]. The tendency of defense genes to reside next to one another has enabled the discovery of dozens of novel phage resistance systems based on their genomic presence next to known defense systems [4,6,7,11–17].
Prophages were the most abundant MGE type carrying defense systems (Fig 2B)
Multiple of the phages with ROOL and GOLLD were labeled as MAGs
Many of the hotspots that we identified were previously described as integration positions for known MGEs. Specifically, 18 of the hotspots were within tRNA loci in the E. coli genome, which are commonly used by prophages and other MGEs as integration hotspots [26].
Back when my hypothesis was that GOLLD and ROOL are for holding phage genomes, I was on the lookout for bacteriophages with missing phage parts. I did find phages with such missing parts. (They can be found in the data sheet I linked earlier.) This was based on the assumption that if a phage could hitch a ride with a GOLLD or ROOL capsule, it wouldn’t need most of those parts, except for an integrase and perhaps a few defense genes.
Then I got distracted by defense islands, and I began to doubt whether those prophages with missing parts that I had been looking at really were still active phages and not just inactivated ones. Bacteria will try to get rid of prophages, and if important phage genes get mutated, the phages can no longer escape. If a bacteriophage is missing part of its syringe, it isn’t going anywhere. That is an evolutionary dead end for it. I’m also not sure what a rescue would look like, though I have seen bacteria with multiple seemingly intact phages integrated in their genome, sometimes the same phage or related ones. In such cases where phage genes are expressed, viable bacteriophages could be made from mixtures of parts. Plus some of the genomes will be complete. Alternatively, sloppy copying could mean a rescue of an inactivated phage gene part.
However, this is where I’m kind of back with my first hypothesis: that GOLLD and ROOL could be for holding phage genomes. But not as their primary route of transmission. Rather as a security policy where a phage genome could use ROOL or GOLLD as an escape capsule. So even if a bacterium goes into self-destruct mode while phage production is in the process of making new bacteriophages, some ROOL or GOLLD capsules with a safety copy of the phage genome may still make it out of the bacterium beforehand.
Perhaps it is time to update the saying by Peter Medawar that a virus is a piece of bad news wrapped in protein, with the addition, that a virus can be a piece of good news wrapped in RNA too.
In such a case the question is which is faster: assembling a GOLLD or ROOL nanocage with the phage genome inside, or assembling a complete bacteriophage and loading the genome inside. My money is on the nanocage being faster.
I also found phages with really large genomes (100 kb+) where it looked like there were two separate sets of phage genomes in one, one with small parts and one with larger parts, as there were duplicates of tails, heads etc. I was wondering if it was a sort of minimal phage inside a larger phage, where the small one gets cut out and ROOL or GOLLD gets into action if conditions for the full-size phage are not good.
So if a phage could escape, be it as a miniaturized phage version or with parts missing, it still stands a chance of being taken up by another bacterium and inserted in its genome.
ROOL and GOLLD’s evolutionary past
If a “bacteriophage” can exist with a ROOL or GOLLD capsule and without its normal syringe parts, it raises another question. Since this is a much simpler virus design - fewer parts - is it a precursor for a bacteriophage in evolutionary time? Like an RNA relic from before viral protein capsules?
Are there CRISPR-Cas systems against GOLLD or ROOL?
Another thing I’m wondering about is whether there are known CRISPR-Cas systems that target GOLLD or ROOL. I have tried looking at a few CRISPR-Cas databases, but I’m not sure how to find the answer to my question. I mean, if GOLLD and ROOL are seen as phage DNA/RNA and non-self by a bacterium, it would make sense for it to remember to attack them.
Defense islands, ribozymes and lncRNA’s
I think it is not just GOLLD and ROOL that are often associated with defense islands. I think that may also go for a good bunch of ribozymes. I have just placed the Bacterial RNase P class A ribozyme in such a bacterial defense island.
Several of those lncRNA’s like GOLLD and ROOL that have been discovered turn up in phages and leave some holes in an otherwise very closely packed protein genome of phages. I see several of those lncRNA’s that have been highlighted but still with unknown function, pop up in phages. I think they originate from there. Also they pop up in defense islands. Bacteria have defense islands, where they keep phage defense genes, stress genes and metabolic genes together in groups.
I ran a check on mainly bacterial ribozymes (since I’m interested in defense islands). Many of them pop up in similar positions to where I have seen ROOL and GOLLD: in defense islands, in connection with energy metabolism islands, stress genes, inside phages or together with defense systems. Also, like GOLLD and ROOL, together with mobile genetic elements (MGE’s). Basically I wouldn’t be too surprised should GOLLD or ROOL be found together with a ribozyme.
In principle an RNA nanocage could fit multiple different phages, as long as they were short enough to fit inside. One of my early hypotheses was that the GOLLD or ROOL sequence could bind up
Loose thought. This may help us decide which of our pseudoknot discoveries may be worth looking at. If they turn up in bacterial defense islands, besides having other positive data, it may be an extra indicator that they have a function.
Another case of value by association. Location, location, location
I also keep seeing specific types of lncRNA’s or ribozymes each turn up with their own type of defense system. I think looking at defense systems in relation to functional RNA can be a valuable road.
Either by looking for defense systems near functional/structured RNA or looking for RNA with a function near known defense systems.
GOLLD or ROOL as defense island inactivators?
Another hypothesis that I have considered. A phage would want to inactivate bacterial defense islands. Whether they hold retrons, toxins etc. These often seem located together with tRNA, so tRNA may just be the address for the phage to find what it is looking for.
The nanocage lncRNA may bind to tRNA to find the defense island. Since the retron is typically in a tRNA island, the different individual parts of the RNA nanocage could each bind to their own tRNA. And the nanocage could fold up around the bacterial defense island like a ball around a string.
While the phage won’t get more tRNA’s from its host, it may be stopping the bacteria from self-destructing by means of e.g. a retron.