Designed Proteins and Citizen Science

What if people with no formal experience in science could help to improve or even rewrite nature, simply by playing a game?

This article was written and illustrated by Changpeng Lu, Natalie Losada, and Nithish Selvaraj as part of a week-long boot camp on "Science Communication in Biology and Medicine" for undergraduate and graduate students hosted by the Rutgers Institute for Quantitative Biomedicine in January 2021.
Four proteins created by Foldit users (image generated with PyMOL).

De novo Design, at a Glance

Since 1971, the Protein Data Bank (PDB) has been the world’s library for 3D structures of biological molecules (like proteins and nucleic acids). But what if we want new proteins that can solve 21st-century problems like breaking down plastic or building drug molecules more efficiently? Building custom proteins from scratch, called de novo design, is notoriously difficult because a protein chain can adopt many shapes in 3D space. Fortunately, the many experimentally determined structures in the PDB can help us understand the rules dictating how amino acid chains fold into functional proteins. Scientists implement these rules as complex algorithms in protein folding and design programs, such as AlphaFold and Rosetta. However, these programs are not perfect, and they sometimes predict a structure that is possible in theory but turns out not to be the most stable when the protein is synthesized and tested. Today, initiatives like Foldit are putting people back into the process and asking them to help fold proteins.

A Solution: Foldit

Foldit is an interactive game that enables players to collaboratively design protein structures. It follows the rules set by Rosetta, building protein structures through stabilizing interactions such as hydrogen bonds, hydrophobic interactions, and hydrophilic interactions. Its easy-to-use interface lets players manipulate structures manually based on their spatial intuition, which makes Foldit accessible to people with varying levels of experience. The game also lets players use their 3D problem-solving skills to explore a unique range of structures that might be missed by purely computational approaches. To start a challenge, scientists upload puzzles in which a protein’s amino acid sequence is provided as a reference. Puzzles range in difficulty from simple modifications of an already folded structure to difficult protein structure prediction problems such as de novo design of an entire protein. Foldit users then build 3D structures, aiming for the highest score in the game. Afterwards, users may see their best predicted structures turn into real proteins, confirmed by scientists in laboratory experiments.

The Impact of de novo Design

Discovering new protein folds has been a cornerstone of de novo design, as new folds can be used as templates to engineer new enzymes and other functional proteins. With the help of scientists, Foldit users were able to create 20 unique protein structures entirely from scratch, one of which had a newly discovered protein fold. Structures were determined for four of these new proteins, shown here from PDB IDs 6mrr, 6nuk, 6msp, and 6mrs. Other scientific groups have used algorithms like Rosetta to design new protein folds from scratch, such as Top7 (1qys, not shown). Designed folds like these provide the basis for new functional proteins, such as proteins that help fight off viral infections (3r2x, not shown) or molecules that can track specific chemicals in cells. Thanks to new computational design tools, there are now more possibilities for de novo designed proteins than ever before.

An engineered Diels-Alderase enzyme (left) and an enzyme modified with Foldit (right). The designed loop (pink) stabilizes the ligand (green), increasing catalytic activity.

Other Successes of Protein Design

De novo protein design may be the end goal, but these tools have also been very effective for structure prediction and optimization. For example, scientists used structure optimization to engineer an efficient PET depolymerase (6tht, not shown) that performs the difficult task of breaking down polyester plastics. In another case, scientists had tried for 10 years without success to determine the structure of the protease from Mason-Pfizer monkey virus (M-PMV). Foldit players correctly folded the protease (3sqf, not shown), and the structure was ready for use in drug development. Foldit players were also able to improve an existing enzyme, increasing its catalytic activity more than 10-fold. The starting point was an engineered protein that performs an unusual Diels-Alder reaction (PDB ID 3i1c). Guided by scientists, the players built a “lid” for the enzyme that holds the substrate more tightly, making the reaction more efficient (PDB ID 3u0s).

Exploring the Structure

How is a protein’s structure stabilized?

For a protein to adopt its stable 3D structure, many different types of interactions must occur between individual amino acids. Carbon-rich amino acids cluster inside the protein, forming a “hydrophobic core,” while charged and polar amino acids are most often arrayed on the surface, where they interact with the surrounding water. Specific interactions, such as ionic interactions and hydrogen bonds, further stabilize the protein and guide the local details of the fold. The interactive JSmol view that accompanies this article displays many of these interactions for a Foldit-designed protein (PDB ID 6nuk).
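
The split between carbon-rich, charged, and polar amino acids can be illustrated with a rough tally of residue types in a sequence. The following is a minimal sketch and is not part of Foldit or Rosetta. The grouping is a common textbook simplification (sources differ on residues such as glycine, proline, cysteine, and histidine), the names `classify` and `composition` are hypothetical, and the example sequence is made up for illustration rather than taken from any PDB entry.

```python
from collections import Counter

# Simplified grouping of the 20 standard amino acids (one-letter codes).
# Assumption: this is one of several textbook conventions, not a Rosetta rule.
HYDROPHOBIC = set("AVLIMFW")   # carbon-rich side chains
POLAR = set("STNQYC")          # uncharged side chains that interact well with water
CHARGED = set("DEKRH")         # acidic and basic side chains (H is only partly charged)
GROUPS = ("hydrophobic", "polar", "charged", "other")


def classify(residue):
    """Return the simplified chemical group for a one-letter residue code."""
    residue = residue.upper()
    if residue in HYDROPHOBIC:
        return "hydrophobic"
    if residue in POLAR:
        return "polar"
    if residue in CHARGED:
        return "charged"
    return "other"  # glycine, proline, and any non-standard letters


def composition(sequence):
    """Return the fraction of residues that fall in each group."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(classify(residue) for residue in sequence)
    return {group: counts[group] / len(sequence) for group in GROUPS}


if __name__ == "__main__":
    example = "MKLLVEELIKKG"  # made-up sequence, for illustration only
    for group, fraction in composition(example).items():
        print(f"{group:12s} {fraction:.2f}")
```

A tally like this only describes what a chain is made of. Whether a given residue ends up buried in the core or exposed on the surface depends on the 3D fold, which is what structure files such as 6nuk record.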

Topics for Further Discussion

  1. Try Foldit yourself at the Foldit website.
  2. Try searching for “de novo” at the main RCSB PDB site to see many designed proteins in the PDB archive.
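
For readers who prefer to start from the specific entries named in this article, the short sketch below builds web addresses for a PDB ID. It is a hedged illustration only: the function name `structure_urls` is hypothetical, the ID check assumes the classic four-character format, and the two URL patterns reflect how the RCSB PDB site is commonly linked, so they should be treated as assumptions and verified in a browser.

```python
import re

# Assumption: classic PDB IDs are four characters, a digit followed by
# three letters or digits (for example 6nuk or 3u0s).
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")

# The four Foldit player designs mentioned in the article.
FOLDIT_DESIGNS = ["6mrr", "6nuk", "6msp", "6mrs"]


def structure_urls(pdb_id):
    """Return the assumed entry page and coordinate file URLs for a PDB ID."""
    if not PDB_ID_PATTERN.fullmatch(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    entry = pdb_id.upper()
    return {
        "page": f"https://www.rcsb.org/structure/{entry}",
        "file": f"https://files.rcsb.org/download/{entry}.pdb",
    }


if __name__ == "__main__":
    # Only builds and prints the addresses; nothing is downloaded.
    for pdb_id in FOLDIT_DESIGNS:
        print(pdb_id, structure_urls(pdb_id)["page"])
```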


References

  1. Senior, A.W., Evans, R., Jumper, J., Kirkpatrick, J., Sifre, L., Green, T., Qin, C., Zidek, A., Nelson, A.W.R., Bridgland, A., Penedones, H., Petersen, S., Simonyan, K., Crossan, S., Kohli, P., Jones, D.T., Silver, D., Kavukcuoglu, K., Hassabis, D. (2020) Improved protein structure prediction using potentials from deep learning. Nature 577: 706–710.
  2. 6tht: Tournier, V., Topham, C.M., Gilles, A., David, B., Folgoas, C., Moya-Leclair, E., Kamionka, E., Desrousseaux, M.L., Texier, H., Gavalda, S., Cot, M., Guémard, E., Dalibey, M., Nomme, J., Cioci, G., Barbe, S., Chateau, M., André, I., Duquesne, S., Marty, A. (2020) An engineered PET depolymerase to break down and recycle plastic bottles. Nature 580: 216–219.
  3. 6mrs, 6mrr, 6msp, 6nuk: Koepnick, B., Flatten, J., Husain, T., Ford, A., Silva, D., Bick, M., Bauer, A., Liu, G., Ishida, Y., Boykov, A., Estep, R., Kleinfelter, S., Nørgård-Solano, T., Wei, L., Foldit Players, Montelione, G.T., DiMaio, F., Popović, Z., Khatib, F., Cooper, S., Baker, D. (2019) De novo protein design by citizen scientists. Nature 570: 390–394.
  4. Feng, J., Wester, B.W., Tinberg, C.E., Mandell, D.J., Antunes, M.S., Chari, R., Morey, K.J., Rios, X., Medford, J.I., Church, G.M., Fields, S., Baker, D. (2015) A general strategy to construct small molecule biosensors in eukaryotes. eLife 4.
  5. 3u0s: Eiben, C.B., Siegel, J.B., Bale, J.B., Cooper, S., Khatib, F., Shen, B.W., Foldit Players, Stoddard, B.L., Popović, Z., Baker, D. (2012) Increased Diels-Alderase activity through backbone remodeling guided by Foldit players. Nature Biotechnology 30: 190–192.
  6. Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., Popović, Z., Foldit Players (2010) Predicting protein structures with a multiplayer online game. Nature 466: 756–760.
  7. 3i1c: Siegel, J.B., Zanghellini, A., Lovick, H.M., Kiss, G., Lambert, A.R., St Clair, J.L., Gallaher, J.L., Hilvert, D., Gelb, M.H., Stoddard, B.L., Houk, K.N., Michael, F.E., Baker, D. (2010) Computational design of an enzyme catalyst for a stereoselective bimolecular Diels-Alder reaction. Science 329: 309–313.
  8. Simons, K.T., Bonneau, R., Ruczinski, I., Baker, D. (1999) Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins 37: 171–176.

July 2021, Changpeng Lu, Natalie Losada, Nithish Selvaraj, David S. Goodsell, Shuchismita Dutta

About Molecule of the Month
The RCSB PDB Molecule of the Month by David S. Goodsell (The Scripps Research Institute and the RCSB PDB) presents short accounts on selected molecules from the Protein Data Bank. Each installment includes an introduction to the structure and function of the molecule, a discussion of the relevance of the molecule to human health and welfare, and suggestions for how visitors might view these structures and access further details.