Exploring the Structural Biology of Viruses
Structures of viral proteins help us discover effective ways to fight infection.
Viruses are a major threat to global health. Historically, pandemics of influenza, polio, smallpox, and many other viral diseases have spread through populations numerous times, killing millions of people. Today, with our continually growing understanding of virus structure and biology, we have many tools to fight viral infection. Antiviral drugs block key viral proteins, preventing viral replication and spread, and vaccines prime our immune system to make us ready for future exposure to common viruses.
This page explores some of the insights provided by structural biology about viruses and how these insights are used to develop new defenses against viral infection.
Viruses typically have two types of life cycles. “Lytic” viruses inject their genome into the cell, then make many new viruses using the cell’s resources, and finally burst the cell, releasing the viruses to infect neighboring cells. Lytic viruses typically are composed of a protein coat surrounding the genome, which can be composed of DNA or RNA. Examples of lytic viruses include bacteriophage T4 (shown here), poliovirus, rhinovirus, and adenovirus.
“Enveloped” viruses, on the other hand, fuse with cell membranes and release their genome into the cell, and then new viruses bud from the surface of the cell. During this budding process, the new viruses capture a coating of the cell membrane. So, enveloped viruses often have many layers: an outer membrane “envelope,” a protein capsid and other interior proteins, and the genome. Examples of enveloped viruses include HIV, coronavirus, influenza virus, and ebolavirus.
PDB-101 includes artistic conceptions of the bacteriophage T4 life cycle and HIV budding from the surface of an infected cell. Note: the free, infectious form of a virus is often termed a “virion,” but here, we will use the term “virus” to encompass all stages of the viral life cycle.
Most viruses are much smaller than living cells and can only hold a small amount of genetic material. For example, circoviruses get by with only two genes: one encodes the capsid protein that protects and delivers the genome in the infectious virus, and the other encodes a replicase protein that hijacks cellular polymerases to create new copies of the viral single-stranded DNA genome. Because of this limit, viruses employ many economical strategies to maximize the use of their genetic information. Often, viral genes encode proteins with several functions, or genes for different proteins overlap with one another. As presented below, viruses also employ structural symmetry to build large structures from small, identical building blocks. In addition, viruses rely on cellular proteins to do most of the work of creating new viruses, and need only to encode proteins that hijack this machinery and shut down the normal functions of the cell.
PDB-101 includes presentations of the 15 proteins encoded in the HIV-1 genome, the 7 proteins in the ebolavirus genome, 4 proteins encoded by simian virus 40, and the ~26 proteins in the SARS-CoV-2 genome.
Viruses break every rule and use whatever mechanisms they need to hijack cells. Most notably, they often don’t follow the traditional flow of information, from DNA to RNA to protein. Instead, viral genomes can be carried in RNA or DNA, can be single- or double-stranded, and can hold either the coding sequence or its complement. To use these different types of genomes, viruses often encode exotic polymerases that perform non-traditional replication and transcription tasks. For example, HIV delivers its genome as a single strand of RNA, and encodes a reverse transcriptase that builds a double-stranded DNA from it. The infected cell can then replicate and transcribe this DNA copy just as it does its own DNA. Other viruses don’t bother with DNA at all, doing everything with RNA. The viral genome is encoded in a strand of RNA, which includes instructions for building a polymerase that makes new RNA strands using RNA as the template. As shown here, this strategy is used by poliovirus and SARS-CoV-2.
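The template-copying logic behind these polymerases can be sketched in a few lines of Python. This is a toy illustration, not a model of the enzymes: the function names and the example sequence are invented for the sketch, and only base pairing is represented.

```python
# Toy model of template copying by viral polymerases (illustrative only).
# All function names and the example sequence are hypothetical.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA template -> DNA product
RNA_TO_RNA = {"A": "U", "U": "A", "G": "C", "C": "G"}  # RNA template -> RNA product
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # DNA template -> DNA product


def copy_template(template: str, pairing: dict) -> str:
    """Return the complementary strand, written 5'->3'.

    The new strand pairs antiparallel to the template, so the
    complement is built from the reversed template.
    """
    return "".join(pairing[base] for base in reversed(template))


def reverse_transcribe(rna: str) -> tuple:
    """RNA -> two DNA strands, the task performed by reverse transcriptase."""
    first_strand = copy_template(rna, RNA_TO_DNA)            # complementary DNA
    second_strand = copy_template(first_strand, DNA_TO_DNA)  # same sense as the RNA
    return first_strand, second_strand


def replicate_rna(rna: str) -> str:
    """RNA -> RNA through a complementary RNA intermediate."""
    complementary = copy_template(rna, RNA_TO_RNA)
    return copy_template(complementary, RNA_TO_RNA)


genome = "AUGGCUUAA"  # made-up 9-base RNA fragment
cdna, sense_dna = reverse_transcribe(genome)
print(cdna)                   # TTAAGCCAT
print(sense_dna)              # ATGGCTTAA
print(replicate_rna(genome))  # AUGGCUUAA
```

Two rounds of copying return the original sense of the sequence in both cases. The difference is only in the alphabet of the product: DNA for the reverse transcriptase route, RNA for the RNA-only route.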
As mentioned above, viral genomes are typically very small and can encode a limited number of proteins. However, viral capsids need to be large enough to enclose the entire genome. Viruses solve this problem by employing symmetry to build huge assemblies using a limited number of building blocks. Many virus capsids have icosahedral symmetry, building a hollow sphere to enclose the genome. Quasisymmetry, where one type of subunit is used in many slightly different structural contexts, is used to make the icosahedral shells even larger. Viruses like tobacco mosaic virus and ebolavirus use helical symmetry to enclose their genomes in a long tube of protein. Other viruses break from these regular symmetries to create more exotic structures, such as the cone-shaped capsid of HIV and the amazing structures of tailed bacteriophages like the one shown here, which still only requires encoding of 9 proteins in the genome.
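The economy of icosahedral quasisymmetry can be made concrete with the Caspar-Klug description of capsid shells, which is not spelled out in the text above. In that description, a shell with triangulation number T = h² + hk + k² (h and k are non-negative integers, not both zero) is built from 60T subunits, arranged as 12 pentamers and 10(T − 1) hexamers. The following is a minimal sketch; the function names are hypothetical.

```python
# Subunit counts for quasisymmetric icosahedral shells (Caspar-Klug description).
# Function names are hypothetical, chosen for this sketch.

def triangulation_number(h: int, k: int) -> int:
    """T = h^2 + h*k + k^2 for non-negative integers h, k (not both zero)."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def subunit_count(h: int, k: int) -> int:
    """Total number of subunits in the shell: 60 * T."""
    return 60 * triangulation_number(h, k)


def capsomer_counts(h: int, k: int) -> tuple:
    """(pentamers, hexamers): always 12 pentamers, plus 10 * (T - 1) hexamers."""
    t = triangulation_number(h, k)
    return 12, 10 * (t - 1)


for h, k in [(1, 0), (1, 1), (2, 0), (2, 1)]:
    t = triangulation_number(h, k)
    pentamers, hexamers = capsomer_counts(h, k)
    print(f"T={t}: {subunit_count(h, k)} subunits, "
          f"{pentamers} pentamers, {hexamers} hexamers")
```

For these four cases the loop reports 60, 180, 240, and 420 subunits. In the simplest case, a single gene is enough to specify the building block for every one of those positions, which is the point of the symmetry strategy.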
During an infection, viruses reproduce in a matter of days to create a huge population of viruses. For example, in an individual infected with HIV, 10 billion new viruses are created every day, and the whole life cycle takes only 2 or 3 days. These are perfect conditions for rapid evolution, which leads to several important consequences. Firstly, viruses are often very specific for particular hosts. Their mechanisms for recognizing cells are often tailored for particular proteins found on the cell surface. For example, influenza virus uses hemagglutinins that recognize specific cell surface glycosylation. Mutation and evolution of these proteins, however, can allow viruses to infect new types of hosts. This is a common occurrence with influenza, where viruses in animal populations, such as birds or pigs, acquire the ability to infect humans.
Secondly, evolution of viral proteins allows them to become resistant to antiviral therapies. For example, resistant strains of HIV emerged very rapidly after the first anti-HIV drugs were deployed in the clinic. As shown in the figure, the molecular basis of this resistance could be observed within a matter of days in the laboratory setting. Today, HIV-infected individuals are treated with a cocktail of different drugs, making it much less probable that the viral population will be able to mutate to avoid all of them simultaneously. Similarly, variants of SARS-CoV-2 emerged rapidly in the global population as people became immune to earlier variants.
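A back-of-the-envelope calculation shows why a drug cocktail is harder to escape than a single drug. The figure of 10 billion new viruses per day comes from the text above. The mutation rate is an assumed round number chosen for illustration, and the model is deliberately crude: it assumes each drug is escaped by one specific point mutation, that mutations arise independently, and it ignores recombination, fitness costs, and variants already present in the population.

```python
# Rough estimate of how often resistant viruses arise (illustrative only).

VIRIONS_PER_DAY = 1e10   # from the text: new HIV particles per day in one individual
MUTATION_RATE = 3e-5     # ASSUMPTION: mutations per base per replication cycle
# ASSUMPTION: one of the three possible substitutions at a base gives resistance.
P_SPECIFIC_MUTATION = MUTATION_RATE / 3


def expected_mutants_per_day(n_required_mutations: int) -> float:
    """Expected new viruses per day carrying all n specific point mutations."""
    return VIRIONS_PER_DAY * P_SPECIFIC_MUTATION ** n_required_mutations


for n_drugs in (1, 2, 3):
    print(f"{n_drugs} drug(s): ~{expected_mutants_per_day(n_drugs):.0e} "
          f"fully resistant viruses arising per day")
```

Under these assumptions, a virus resistant to a single drug arises on the order of 100,000 times per day, a doubly resistant virus about once per day, and a triply resistant virus about once in 100,000 days. The exact numbers depend entirely on the assumed rate, but the steep drop with each added drug is the reason combination therapy is effective.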
By understanding the underlying molecular mechanisms of viral biology, we can find ways to block them and fight viral infection. Structural biology has provided a unique window on viruses, revealing their weak points. Fortunately, many viruses employ novel proteins that are quite different from our cellular proteins, so they are attractive targets for drug therapy. Many of these targets are viral enzymes that play key roles in the viral life cycle. For example, effective drugs that block HIV reverse transcriptase, HIV protease, and HIV integrase, all developed through design efforts guided by knowledge of atomic structures, have turned HIV infection into a manageable disease. Similarly, the structure of the main protease from SARS-CoV-2 guided design of a targeted drug to help fight the COVID-19 pandemic.
Therapeutics are also being designed to block the mechanisms of viral attachment and entry. These therapies often employ therapeutic antibodies that bind to viral surface glycoproteins, using the same defenses that our immune system uses to fight viral infection.
Vaccination is our most powerful tool for protecting us against the ravages of viruses. Vaccination leverages our natural defenses, priming our immune system so it will be ready when we are challenged with a virus. The concept is simple and revolutionary: we introduce a weakened (possibly inert) version of the virus into our system, not strong enough to cause disease but similar enough to the real virus to stimulate creation of antibodies against it. The first vaccines used the viruses themselves, inactivated by chemicals, or used a less dangerous virus similar to the pathogenic one. Today, modern vaccines include only the most antigenic portions of the virus, typically the viral surface glycoproteins. The proteins themselves may be used as the vaccine or, more recently, mRNA vaccines can stimulate some of our own cells to produce the antigenic protein. Structures of these glycoproteins have recently been used to optimize them for maximal effectiveness, making small changes that stabilize them in the shape they adopt on the surface of an infectious virus. These “prefusion stabilized” forms were first developed using the respiratory syncytial virus fusion glycoprotein, and were later used in the mRNA vaccines protecting us from SARS-CoV-2 infection.