Deciphering Microbial DUFs

November 2014

Researchers in structural genomics have several ambitious goals. First, PSI researchers have chosen several important organisms, such as members of the microbial flora in our gut and a bacterium that is remarkably resistant to radiation, and are systematically determining the structures of all of their component proteins. As with the elucidation of genome sequences, this will allow an unprecedented understanding of the inner workings of cells. A second goal is woven through the first: to determine enough structures that we can confidently say we have observed all the possible ways that a protein can fold. This will be an invaluable resource for understanding the basics of protein structure and function, and will fuel research on protein structure prediction.

Domains of Mystery

As part of this work, PSI researchers have identified a number of DUFs: domains of unknown function. These are protein domains, identified based on their sequences, that are quite different from any proteins with known structure or function. By studying these DUFs, PSI researchers uncover new ways that proteins can fold and determine the functions of the domains along the way. So far, they have defined 200 new families of proteins from the human microbiome and have solved structures for over 60 of them. Two recent structures from this growing body of knowledge are described here.

A New Calcium-binding Domain

PSI researchers at JCSG have used NMR to reveal the structure and function of the DUF YP_001302112.1 (PDB entry 2lge), a protein secreted by an intestinal bacterium. The protein chain folds into two stacked beta sheets, forming a compact structure with loops at either end. The fold is new, but comparison with distant relatives revealed several calcium-binding proteins with similar stacked beta sheets. Further NMR experiments revealed the ability of the protein to bind calcium.

Discovering the LUD Domain

JCSG researchers have also explored a DUF from a radiation-resistant bacterium, discovering a new domain family in the process. The DUF162 domain is found in several proteins involved in lactate utilization. The crystallographic structure of the domain (PDB entry 2g40), taken from the LutC protein, shows a mix of beta sheets and alpha helices. Based on this structure and other experimental evidence, this domain has been assigned the name "LUD". A look at other proteins in the genome shows that LUD domains may be the primary constituent of a protein, as in LutC, or may be linked with other domains.

Homing in on Function

Comparison of YP_001302112.1 (PDB entry 2lge) with a distant relative, Clostridium perfringens alpha-toxin (PDB entry 1qmd), gives some additional hints about its calcium-binding function. Both proteins have two facing beta sheets with similar topology, although the new structure has a few extra beta strands in one of the sheets. In alpha-toxin, the calcium ions are bound by the loops between the sheets. NMR analysis has revealed that a similar region is perturbed when calcium binds to the YP protein, implicating the corresponding loops in calcium binding. The two structures are compared in more detail in the interactive JSmol view below.

YP_001302112.1 and alpha-Toxin (PDB entries 2lge and 1qmd)

The structure of YP_001302112.1 shows a fold similar to that of alpha-toxin, and the protein also binds calcium ions.


  1. Serrano, P., Geralt, M., Mohanty, B. & Wüthrich, K. Structural representative of the protein family PF14466 has a new fold and establishes links with the C2 and PLAT domains from the widely distant Pfams PF00168 and PF01477. Protein Sci. 22, 1000-1007 (2013).

  2. Hwang, W. C. et al. LUD, a new protein domain associated with lactate utilization. BMC Bioinformatics 14, 341 (2013).