Researchers examine a bat as part of their search for dangerous animal pathogens in the Global Viral Forecasting Initiative Lab in Yaounde, Cameroon. Brent Stirton/Getty Images

One of the longest-running questions about this pandemic is a simple one: where did it come from? How did a virus that had seemingly never infected a human before make a sudden appearance in our species, equipped with what it needed to sweep from China across the globe in a matter of months?

Early analyses of the virus' genome were ambiguous. Some placed its origin within the local bat population. Others highlighted similarities to viruses found in pangolins, which might have been brought to the area by the wildlife trade. Less evidence-based ideas included an escape from a research lab or a misplaced bioweapon. Now, a US-based research team has done a detailed analysis of a large collection of viral genomes, and it finds that evolution pieced together the virus from multiple parts: most from bats, but with a key contribution from pangolins.

Recombination

How do pieces of virus from different species end up being mashed together? The underlying biology is a uniquely viral twist on a common biological process: recombination.

In cells, recombination is a normal part of genetics. Any time two DNA molecules share extensive similarities, it's possible for them to exchange pieces. The result is a hybrid molecule: a stretch of DNA from one parental piece of DNA, followed by a stretch from the other. As a result, some of the differences between the two parent molecules get scrambled—some from each parent will end up on the final molecule.

Recombination is a normal part of the reproduction of complicated cells. If you happen to have offspring, you've given each child a set of chromosomes that are a mix of pieces from the ones you were given by your mother and father. Recombination can also take place in simpler cells, where it's been the primary tool that we've used to engineer new or altered genes into the genomes of bacteria. And, since the molecules that perform the recombination aren't especially picky about which DNA molecules they work with, DNA viruses that infect cells can sometimes recombine if more than one strain of virus infects a single cell.

RNA

Those of you who have followed the virus closely, however, may be wondering what's going on here. All of this recombination takes place between DNA molecules. But the coronavirus genome is composed of RNA. So why would any of it work there?

The answer is that it doesn't. But other processes essentially perform the same function, mixing up pieces of RNA to form distinct genetic combinations. For example, the influenza virus spreads its genome across eight different molecules, allowing cells infected by more than one strain of flu virus to produce viral particles that have a random assortment of molecules from the two strains.
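The flu example can be made concrete with a toy model. The sketch below assumes that each of the eight segments in a new viral particle is drawn independently, with equal odds, from either of two parent strains. That is a simplification for illustration, not a claim about how real cells package the virus. The segment names are the standard influenza A ones; the strain labels are hypothetical.

```python
import random

# The eight genome segments of influenza A.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def reassort(strain_a, strain_b, rng):
    """Build one progeny genome by taking each segment from either parent.

    Assumption: every segment is chosen independently with 50/50 odds.
    """
    return {seg: rng.choice([strain_a[seg], strain_b[seg]]) for seg in SEGMENTS}


# Hypothetical parent strains; each segment is just labeled with its origin.
strain_a = {seg: f"A-{seg}" for seg in SEGMENTS}
strain_b = {seg: f"B-{seg}" for seg in SEGMENTS}

rng = random.Random(0)  # seeded so repeated runs give the same draw
progeny = reassort(strain_a, strain_b, rng)

# Two choices per segment gives 2**8 possible segment combinations.
n_combinations = 2 ** len(SEGMENTS)

print(progeny)
print(n_combinations)
```

Under these assumptions, a cell infected by two strains could in principle produce 256 different segment combinations, two of which are simply the original parents.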

The coronavirus genome is a single, long RNA molecule, so that sort of segment shuffling doesn't work there. But it can still recombine. The enzyme that copies the RNA genome moves down it from one end to the other, making a copy as it goes. Sometimes, however, it can stall and fall off the molecule it's copying, while still hanging on to its partially complete copy. In many cases, the copying will just be aborted. But in others, it can latch on to a new genome and use the partial copy to pick up where it left off.

Critically, the new molecule with which it restarts the copying doesn't have to be the one it was copying originally. It just has to be similar to the first one it copied—it doesn't have to be identical. As a result, this process can allow recombination among viruses that are relatively distantly related from an evolutionary perspective. All they have to do is infect the same host.
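This stall-and-restart process can be sketched in a few lines of code. The example below is a deliberately minimal model: it treats the two genomes as already-aligned strings of equal length, ignores the fact that the enzyme actually builds a complementary strand, and takes the switch position as a given. The function name and the sequences are made up for illustration.

```python
def copy_with_template_switch(template_a, template_b, switch_at):
    """Copy template_a up to switch_at, then finish the copy on template_b.

    Assumptions: both templates are aligned and the same length, and the
    copying enzyme resumes at the same position on the second template.
    """
    if len(template_a) != len(template_b):
        raise ValueError("templates must be the same length")
    if not 0 <= switch_at <= len(template_a):
        raise ValueError("switch_at must fall within the template")
    partial_copy = template_a[:switch_at]        # made before the enzyme falls off
    return partial_copy + template_b[switch_at:]  # finished on the new template


# Two hypothetical, similar-but-not-identical RNA genomes.
genome_a = "ACGUACGUACGU"
genome_b = "ACGAACGUUCGU"

recombinant = copy_with_template_switch(genome_a, genome_b, switch_at=6)
print(recombinant)  # ACGUACGUUCGU
```

The result carries the differences from the first genome before the switch point and the differences from the second genome after it, which is the hybrid pattern described above.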

Swapping genes

Now that we know recombination can take place, how would we go about looking for it? The key here is that we now have a lot of coronavirus sequences from a lot of different hosts available in public databases. Dedicated public health researchers have even gone in and sampled dozens of bat sources to look for strains that might be capable of starting a pandemic. So, for the new analysis, the research team started with a collection of 43 different coronaviruses from a variety of species, including humans, bats, and the pangolin sequences known to be similar to SARS-CoV-2.

The basic genome analysis confirmed that SARS-CoV-2 is most closely related to a number of viruses that had been isolated from bats. But different areas of the virus were more or less related to different bat viruses. In other words, you'd see a long stretch of RNA that's most similar to one virus from bats, but it would then switch suddenly to look most similar to a different bat virus.

This sort of pattern is exactly what you'd expect from recombination, where the switch between two different molecules would cause a sudden change in the sequence at the point where the exchange took place. (You'd see this rather than differences from both parent molecules being spread evenly throughout the genome.)
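One common way to look for that kind of sudden switch is to slide a window along the genome and ask, window by window, which reference virus is the closest match. The sketch below illustrates the idea on toy data. It is not the research team's actual method, and the sequence names, window size, and sequences themselves are all hypothetical. It also assumes the sequences are already aligned and the same length.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closest_reference_by_window(query, references, window, step):
    """For each window of the query, name the most similar reference."""
    results = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        best = max(
            references,
            key=lambda name: identity(query[start:end], references[name][start:end]),
        )
        results.append((start, best))
    return results


def breakpoints(results):
    """Window start positions where the closest reference changes."""
    return [
        start
        for (start, name), (_, previous) in zip(results[1:], results[:-1])
        if name != previous
    ]


# Toy references, exaggerated so the switch is obvious.
references = {
    "bat_virus_1": "A" * 20 + "C" * 20,
    "bat_virus_2": "G" * 20 + "U" * 20,
}
# A query built from the first half of one and the second half of the other.
query = references["bat_virus_1"][:20] + references["bat_virus_2"][20:]

windows = closest_reference_by_window(query, references, window=10, step=10)
print(windows)
print(breakpoints(windows))  # [20]
```

With real genomes the similarities are far less clear-cut, so an analysis like this would need proper sequence alignment and statistical tests before calling a breakpoint.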

Spike protein

But there was a notable exception to this mixing of bat viruses: the spike protein that sits on the virus' surface and latches on to human cells. Here, the researchers found exactly what the earlier studies had suggested: a key stretch of the spike protein, the one that determines which proteins on human cells it interacts with, came from a pangolin version of the virus through recombination.

In other words, both of the ideas from earlier work were right. SARS-CoV-2 is most closely related to bat viruses and most closely related to pangolin viruses. It just depends on where in the genome you look.

The other bit of information to come out of this study is an indication of where changes in the virus' proteins are tolerated. This inability to tolerate changes in an area of the genome tends to be an indication that the protein encoded by that part of the genome has ...


arstechnica
