Molecule of the Month: Oct and Sox Transcription Factors

Transcription factors decide when particular genes will be transcribed

Transcription factors Oct1 (turquoise) and Sox2 (blue) bound to a short piece of DNA.
The development of a complete human being from a single cell is one of the great miracles of life. A human egg cell contains about 30,000 genes that encode proteins, and about 3,000 of these encode transcription factors. Transcription factors determine when genes will be turned on and turned off, orchestrating the many processes involved in the development of an embryo and the many tasks performed by each cell after a child is born. Amazingly, there is only about 1 transcription factor for every 10 genes, posing a puzzle: how does this limited set of proteins control the many genes and processes that must be regulated?

Combinatorial Control

One answer to this question can be found by looking at the binding sites for transcription factors in the genome. Typical genes in our cells have extensive regulatory regions before and after the genes, sometimes 100,000 base pairs away, and occasionally even inside the genes. These regions act in many different ways, as enhancers, silencers, insulators, and promoters of the gene. Each gene is controlled by a combination of many transcription factors, which together form a consensus as to whether the gene will be expressed or not at any given time.
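The arithmetic behind this idea can be sketched in a few lines of Python. This is only a rough illustration: it uses the round numbers quoted above and assumes, unrealistically, that any transcription factors can be grouped freely. The function name distinct_groups is hypothetical and chosen just for this example.

```python
from math import comb

# Round numbers quoted in the article; both are approximations.
PROTEIN_CODING_GENES = 30_000
TRANSCRIPTION_FACTORS = 3_000


def distinct_groups(factor_count, group_size):
    """Count the unordered groups of `group_size` factors drawn from `factor_count`."""
    return comb(factor_count, group_size)


# About one transcription factor for every ten genes.
genes_per_factor = PROTEIN_CODING_GENES // TRANSCRIPTION_FACTORS

# Simplifying assumption: every pair or triple of factors is allowed.
pairs = distinct_groups(TRANSCRIPTION_FACTORS, 2)
triples = distinct_groups(TRANSCRIPTION_FACTORS, 3)

print(f"genes per transcription factor: {genes_per_factor}")
print(f"possible pairs of factors:      {pairs:,}")
print(f"possible triples of factors:    {triples:,}")
print(f"pairs alone outnumber genes:    {pairs > PROTEIN_CODING_GENES}")
```

Even under this crude counting, pairs of factors already far outnumber the genes to be regulated, which is the sense in which a limited set of proteins can specify many distinct outcomes.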

Choosing a Path

Oct4 and its cofactor Sox2 are at the center of a collection of transcription factors that control the first decisions in the development of an embryo. Oct4 is present in embryonic stem cells, and its levels drop when the cell starts to divide and differentiate into different types of cells. It has been called the "gatekeeper" of development, since it is necessary for maintaining the stem cell state. The structure shown here, from PDB entry 1gt0, shows the DNA-binding portions of a similar protein, Oct1 (at the bottom in turquoise), and Sox2 (at the top in blue) bound to a short piece of DNA (in orange and pink).


Unfortunately, once stem cells make their choices and differentiate into nerve cells or skin cells or other types of cells, they are normally unable to reverse their choices and become stem cells once again. If this were possible, however, it would be very useful: for instance, imagine taking a few skin cells from a patient with diabetes, and then changing these cells into pancreatic cells that can make insulin. Researchers have recently used Oct4 and Sox2 to make the first steps towards this amazing goal. By adding the genes for these proteins, along with a few other transcription factors, to skin cells, they were able to reprogram the cells into "pluripotent" stem cells that are able to form many other cell types.

DNA-binding domain of Myc and Max bound to DNA (top).

Group Effort

The reprogramming of skin cells or other cells into stem cells requires a few helper proteins that relay the signal of Oct4 to the many genes that must be affected. The two proteins shown here--c-Myc (top) and Klf4 (bottom)--were used in the first successful reprogramming experiments. The DNA-binding portion of c-Myc is shown bound to a small piece of DNA along with its partner protein Max, from PDB entry 1nkp. The DNA-binding portion of Klf transcription factors is composed of three zinc finger domains, shown here from PDB entry 2ebt.

Exploring the Structure

Oct and Sox Transcription Factors (PDB entries 1gt0 and 1o4x)

Combinatorial control, where several transcription factors bind together to control a gene, allows the same proteins to be used in different ways. This is shown in two structures of Oct1 and Sox2 bound to different pieces of regulatory DNA. In PDB entry 1gt0 (left), the proteins are bound to the FGF4 enhancer DNA and the two proteins interact weakly through an extended tail of Sox2. In PDB entry 1o4x (right), the proteins are bound closer together on the Hoxb1 regulatory sequence and they form a stronger interaction. In this way, the different spacing of the binding sites in the DNA can control the binding strength of the Oct and Sox complex.
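The idea of spacing between binding sites can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the two motif strings are simplified stand-ins for Sox and Oct recognition sites, the two DNA sequences are invented toy examples rather than the actual FGF4 or Hoxb1 sequences, and motif_gap is a hypothetical helper.

```python
def motif_gap(sequence, first_motif, second_motif):
    """Return the number of bases between the end of `first_motif` and the
    start of the next `second_motif`, or None if either motif is missing."""
    start_first = sequence.find(first_motif)
    if start_first == -1:
        return None
    end_first = start_first + len(first_motif)
    start_second = sequence.find(second_motif, end_first)
    if start_second == -1:
        return None
    return start_second - end_first


# Illustrative stand-ins for the two recognition sites (assumptions).
SOX_SITE = "CTTTGTT"
OCT_SITE = "ATGCAAAT"

# Invented toy sequences: one with the sites separated, one with them adjacent.
spaced_sites = "GG" + SOX_SITE + "TGG" + OCT_SITE + "CC"
adjacent_sites = "GG" + SOX_SITE + OCT_SITE + "CC"

print("gap in spaced toy sequence:  ", motif_gap(spaced_sites, SOX_SITE, OCT_SITE))
print("gap in adjacent toy sequence:", motif_gap(adjacent_sites, SOX_SITE, OCT_SITE))
```

The sketch only measures distance along one strand of the sequence. It says nothing about binding strength itself, which depends on the three-dimensional contacts seen in the structures.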

Topics for Further Discussion

  1. Researchers have used a variety of other transcription factors along with Oct4 and Sox2 for reprogramming cells, including Nanog and Lin-28. The DNA-binding portions of these proteins are available in the PDB. Can you find similarities and differences with the transcription factors shown here?
  2. Many DNA-binding proteins bend DNA when they bind. Can you find other examples in the PDB?



April 2009, David Goodsell
About Molecule of the Month
The RCSB PDB Molecule of the Month by David S. Goodsell (The Scripps Research Institute and the RCSB PDB) presents short accounts on selected molecules from the Protein Data Bank. Each installment includes an introduction to the structure and function of the molecule, a discussion of the relevance of the molecule to human health and welfare, and suggestions for how visitors might view these structures and access further details.