
This is probably one of the coolest papers I’ve read in a while and I encourage everyone to read it. “Origins and Evolution of the Global RNA Virome” by Wolf et al. (Nov/Dec 2018) attempts to reconstruct RNA virus evolution by taking advantage of the massive amount of new virus data that metagenomics advances have produced in the past few years.
The really major takeaways
- dsRNA viruses evolved from +ssRNA viruses at least twice, and the prokaryotic dsRNA viruses actually group with Reoviridae (e.g. rotaviruses) while another group of eukaryotic dsRNA viruses evolved separately
- -ssRNA viruses evolved from dsRNA viruses
- extensive horizontal virus and gene transfer, coexpressed gene exchange across distantly related hosts. Even tips of the tree can have cross-kingdom host ranges
Bacteria have mostly DNA viruses
We’ve found very few RNA viruses in bacteria (and archaea), which the paper suggests could have something to do with bacterial cells not having many compartments or a nuclear envelope. The idea given was that DNA viruses are at a disadvantage to RNA viruses in eukaryotes because they have to deal with more barriers. I’d imagine this could have a compounding effect, as DNA viruses are usually not so great at host-switching and often tightly coevolve with their hosts, while RNA viruses often employ a strategy where they have many potential hosts. Infecting many hosts may facilitate horizontal gene transfer between very different viruses. This, combined with rapid mutation rates in RNA viruses, may further enhance diversity, while the prokaryotes keep getting infected by a clade of often strain-specific dsDNA phages.
RNA viruses have still been found in bacteria (+ssRNA Leviviridae and dsRNA Cystoviridae), but we have never discovered a -ssRNA prokaryotic virus. Bacteriophages do include the well-characterized cystoviruses, which are dsRNA and lump in with the Reo-like eukaryotic viruses (which is quite cool). If bacteria have dsRNA viruses, and -ssRNA viruses in eukaryotes came from dsRNA viruses, it doesn’t seem so unlikely that a similar event could occur twice. Here’s hoping my lab is able to isolate a -ssRNA phage.
United by a single gene
For background, RNA viruses have an RNA genome while their hosts have a DNA genome. This means hosts aren’t making RNA from RNA, only RNA from DNA. So hosts don’t need to encode an RNA-dependent RNA polymerase (RdRp), meaning all RNA viruses are united by this single requirement that they make an RdRp. All other genes are basically impossible to use for creating really deep evolutionary trees, though some genes for capsid proteins, helicases, and capping enzymes might be decent choices for a relatively deep analysis.
The authors looked at 4,617 RNA virus RdRps, did quite a bit of work, and ultimately created a phylogenetic tree with 5 major branches.

Imagine we’re starting in RNA world, and the first branching event is the +ssRNA viruses splitting from our outgroup(s), the Group II introns and the non-LTR retrotransposons (which would be ancient, even older than retroviruses). Reverse transcriptase can bring us into the DNA world. The first major branch leads to the bacteriophage +ssRNA viruses, the Leviviruses, and then splits into fungal and plant virus groups, notably “Mitoviruses”, which infect fungi and mitochondria. It seems the base of the tree was a capsidless RNA replicon reproducing in the bacteria that later became mitochondria. During eukaryotic evolution (wherein endosymbiotic bacteria became mitochondria), these replicons gained either a host-derived single jelly-roll capsid protein or one from a DNA virus to form the ancestral RNA virus. This protein is the most common capsid protein seen in +ssRNA viruses.
*As I’m reading in a 2018 paper, scientists have also found evidence (meaning they found the sequence just not an isolate) of mitoviruses in contemporary plant mitochondria by looking at plant transcriptomes. They add that “genuine plant mitoviruses were immediate ancestors to endogenized mitovirus elements now widespread in land plant genomes.”
The second branch is referred to as the “Picornavirus supergroup” and contains a bunch of +ssRNA viruses, notably the nidoviruses, which include the largest RNA viruses, as well as a branch of dsRNA viruses nested within the group! This is the largest and most diverse branch, with the authors suggesting diversification had already been occurring before the Cambrian explosion. Since they reference ctenophore, sponge and cnidarian viromes, I am assuming they’re indicating substantial diversification had occurred during the Ediacaran. By the way, I love this casual mention of the Cambrian, to remind readers that Opabinia had viruses. The base of this branch also seems to be where the authors placed the origin of viral single jelly-roll capsid proteins, which they say were acquired from cells. As viral genomes get bigger they start acquiring helicases as well.
The third branch contains a bunch more +ssRNA viruses like the flaviviruses and alphaviruses, with the capping enzyme CapE being ancestral to this group, though the authors do point out there were likely three convergent evolution events where viruses acquired this capping enzyme. They suggest gene capture was an especially dominant strategy of these viruses. I wrote about an especially cool group in this branch: the Jingmenviruses, which include a multicomponent animal virus that packages its five segments into five separate virions.
Then comes branch 4, which holds the majority of the dsRNA viruses: the Cystoviruses (which are enveloped bacteriophages), the Reoviridae, and the Totivirus group. It looks like this branch has the broadest host range, infecting protists, bacteria, fungi, plants, and animals. Branch 4 is a pretty puzzling one. I want to know how cystoviruses became exclusively bacterial viruses, and how exactly they came about. Based on the tree, the ancestor appears eukaryotic; however, the authors suggested picobirnaviruses (branch 2 dsRNA) may be prokaryotic viruses, and that totiviruses (branch 4) may potentially include prokaryotic viruses, which could indicate a prokaryotic virus ancestral state. That being said, they were pretty confident that the reovirus group is closer to cystoviruses, as they have these unique T=1 capsids surrounded by T=13 outer shells.
Branch 5 is all the -ssRNA viruses we’ve found, which includes the Mononegavirales (rhabdoviruses like rabies and paramyxoviruses like measles), the Bunyavirales (like hantavirus) and the orthomyxoviruses (like influenza). Their host range is pretty small, perhaps because they’ve diversified most recently. There is one instance of a -ssRNA virus found in protists, which was most definitely a horizontal virus transfer (HVT) event via an arthropod host. No prokaryotic hosts at all, though I wouldn’t be surprised if there was some undiscovered second branch of -ssRNA viruses that came from dsRNA prokaryotic viruses. Another thought that occurred to me was that the -ssRNA viruses, which have coated RNA and very rarely recombine, could be further limited if they have less opportunity for horizontal gene transfer.
The authors humbly make the important point that we know very, very little in the grand scheme of things about virus evolution. RdRp evolutionary history does not necessarily equal RNA virus evolutionary history, but I think it provides a rough framework on which to build. I didn’t really get into it much, but the paper goes into a lot of detail on horizontal virus/gene transfer events and how there’s not a very strong phylogenetic signal in relation to host.
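To keep the five branches straight, here is a minimal sketch that writes the walkthrough above down as a lookup table. The dictionary layout, field names, and the `branches_with` helper are my own illustrative assumptions, not anything from the paper, and the example groups are only the ones named in this post.

```python
# Summary of the five RdRp-tree branches as described in this post.
# The layout and field names are illustrative assumptions, not the paper's notation.
BRANCHES = {
    1: {"genomes": ["+ssRNA"], "examples": ["Leviviruses", "Mitoviruses"]},
    2: {"genomes": ["+ssRNA", "dsRNA"],
        "examples": ["Picornavirus supergroup", "nidoviruses", "picobirnaviruses"]},
    3: {"genomes": ["+ssRNA"],
        "examples": ["flaviviruses", "alphaviruses", "Jingmenviruses"]},
    4: {"genomes": ["dsRNA"],
        "examples": ["Cystoviruses", "Reoviridae", "Totivirus group"]},
    5: {"genomes": ["-ssRNA"],
        "examples": ["Mononegavirales", "Bunyavirales", "orthomyxoviruses"]},
}


def branches_with(genome_type):
    """Return the sorted branch numbers that contain the given genome type."""
    return sorted(
        number for number, info in BRANCHES.items()
        if genome_type in info["genomes"]
    )


# dsRNA shows up in two separate branches, matching the "at least twice" takeaway.
print(branches_with("dsRNA"))
```

Asking for `"dsRNA"` returns branches 2 and 4, which is just the "evolved at least twice" takeaway restated as data.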
RNA viruses in Archaea?
I can’t find any RNA virus isolates that infect archaea; however, metagenomics studies like this one have identified putative archaeal RNA viruses, most likely with Sulfolobus archaea as the host. These putative viral sequences were distinct from one another, indicating there may be abundant diversity within archaeal RNA viruses.
***a note: A pet peeve of mine is using polyphyletic groups as group names. I don’t like that we call bacterial viruses “phages” and eukaryotic viruses “viruses,” while on archaeal viruses we’re split. Cystoviruses (host is Pseudomonas), for example, are closer to human rotaviruses than they are to any other bacteriophage we know of. I’ve even seen yeast viruses called “phages” because yeast are single-celled eukaryotes, and I’ve seen algae viruses called “phages,” but then giant viruses of algae and amoeba are so flashy that virologists of course call them “viruses.” It’s just so much easier to discuss things using monophyletic groups.
Imagine if a zoologist said they only studied animals that fly. “I study butterflies, birds–but not flightless birds (cuz that’s a WHOLE DIFFERENT ANIMAL SO SLOW DOWN THERE!!), bats, pterosaurs, bees, and the occasional flying squirrel.” So why do virus people tend to talk like that? ~end rant
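For anyone who wants the monophyly point spelled out, here is a rough sketch on a toy tree. The topology is a heavily simplified assumption based only on what this post says (leviviruses and mitoviruses sit together in branch 1; cystoviruses and rotaviruses sit together in branch 4, next to totiviruses). The tree, the tip names, and the helper functions are all hypothetical illustrations, not the paper's actual tree.

```python
# Toy tree as nested tuples; strings are tips. This topology is an assumption
# made for illustration and leaves out branches 2, 3 and 5 entirely.
TOY_TREE = (
    ("levivirus", "mitovirus"),
    (("cystovirus", "rotavirus"), "totivirus"),
)


def leaves(node):
    """Collect the set of tip names under a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def is_monophyletic(tree, group):
    """True if some node in the tree has exactly `group` as its set of tips."""
    group = set(group)

    def visit(node):
        if leaves(node) == group:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


# "Phages" here means the bacterial viruses in the toy tree.
print(is_monophyletic(TOY_TREE, {"levivirus", "cystovirus"}))   # False
print(is_monophyletic(TOY_TREE, {"cystovirus", "rotavirus"}))   # True
```

In this toy setup the "phage" grouping fails the check because the smallest clade holding both bacterial viruses also holds the eukaryotic ones, while cystovirus plus rotavirus passes.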