Regulation of Gene Expression in Eukaryotes
Summary: Danny Reinberg studies the regulation of gene expression in eukaryotes. Gene expression – the process that cells use to produce proteins from genes on DNA strands – is fundamental to all life. DNA sequences in genes are first “transcribed” to RNA molecules, which then become the templates for proteins. But this process must be controlled so that correct amounts and types of proteins are made in a normal cell. Moreover, in multicellular animals, the complex regulation of gene expression that results in different tissues during development and maintains tissue identity in the adult must be established.

In any tissue of the body, there are some genes that are never expressed – they are “silent” genes – and there are some genes that are expressed exclusively in that tissue, giving rise to its particular functions and identity. Somehow, when a cell in one organ divides, the identity of the organ is transmitted accurately to the daughter cells, which then exhibit the same differential gene expression. How does this happen? This transmission of identity is not through the genes themselves, as all cell types contain the identical genetic makeup. Instead, this complex and fascinating process depends on the proteins that structure the body of DNA. My laboratory’s long-term goal is to determine how a gene gets transcribed when it does, and what controls this process. To do this, we set out to determine the criteria that enable or disable transcription as a function of increasingly complex gene organization.


We focused initially on naked DNA and the requirements for the formation of a transcription complex that can synthesize RNA. We established a fully reconstituted transcription system in vitro with all the components defined. This led to the discovery of how different families of protein factors regulate the transcription process.

The large RNA polymerase II (RNAPII) molecule does the actual work of running along the gene’s DNA and producing RNA. But it needs the help of general transcription factors (GTFs) to find the beginning of the gene sequence (promoter) and start transcribing. The GTF family of factors consists of the TATA-binding protein (TBP) subunit of TFIID, TFIIA, TFIIB, TFIIF, TFIIE, and TFIIH. Four factors act as a “dock” to bind the polymerase to the start of the gene (TFIID, TFIIA, TFIIB, TFIIF), and two (TFIIE and TFIIH) push the polymerase on its way down the DNA and function during promoter escape.

Our work contributed information regarding the basic process of eukaryotic gene transcription from naked DNA, yet the context of cellular genes is far more complex. Two meters of DNA are packed in each human cell nucleus, in an ordered hierarchy. First, the DNA itself is wrapped around proteins called histones, forming spools called nucleosomes and a fiber with an 11-nanometer diameter. Then the fiber is coiled into a 30-nm fiber, and this, in turn, is structured into higher-order formations up to the scale of the chromosomes. The genes that are to be transcribed to RNA have to be untangled from the chromosome structure.

We established conditions for reconstituting the 11-nm fiber and asked whether RNAPII could traverse these nucleosomes. We discovered that nucleosomes represent a barrier to RNAPII, but we also discovered the factor FACT (facilitates chromatin transcription), which allows the polymerase to read the DNA while it is still spooled on the nucleosomes. FACT removes half of the histones ahead of the polymerase and places them behind the polymerase, allowing the polymerase molecule to burrow its way through the spool, transcribing the DNA as it goes.

While FACT dismantles and then reestablishes the assembled histones during RNA polymerase migration along the DNA, another factor we uncovered and named RSF (remodeling and spacing factor) acts at the stage of transcription initiation. RSF interacts with transcriptional activators and renders the promoter sequence accessible to the transcription complex. We found that RSF mobilizes nucleosomes in an ATP-dependent manner and, through its chaperone activity, deposits histones to form nucleosomes. Thus, RSF makes the region of the DNA where transcription initiates accessible to the transcription complex.

We have explored the identity and functional roles of the cellular proteins that enable the transcription complex to access and move through nucleosomal DNA by dynamically altering nucleosome structure. We have also focused on the properties of the nucleosomal proteins themselves, which determine how the DNA is organized. It is the proteins that modify these nucleosomal components that actually make the DNA more or less accessible. The complex chromosome structure around silent genes is tightly wound, allowing no access to the polymerase and GTFs, while it is more loosely wrapped around the genes active in RNA synthesis. Over the past years, we and our colleagues in the transcription field have found that the key to these two different structures lies in the histone tails, unstructured, hook-like projections that extend outward from each nucleosome. Post-translational modifications in these histone tails determine whether they lock together tightly or loosely and thus whether the genes spooled around them are active or silent.

One particular histone tail modification has been shown to be reversible – acetylation. In general, genes packaged with acetylated histones are in a state conducive to RNA synthesis, while genes packaged with deacetylated histones are repressed. Another modification of the histone tails appears to be irreversible – lysine methylation. Depending upon the sites of histone lysine methylation, genes may be transcriptionally active or inert. Given the apparently irreversible nature of methyl-lysine modifications, they may constitute the determinant marks of tissue specificity that can be stably transmitted to daughter cells.

We concentrated on the identity and biochemical characteristics of the proteins required for these histone modifications. A big step toward our goal of understanding how dividing cells from a specific tissue can retain tissue identity was our recent isolation of a human protein, PR-Set7, which methylates the histone H4 polypeptide at a specific residue, lysine 20. We discovered that methylation of histone H4 at lysine 20 correlates in vivo with repressive chromatin, and that PR-Set7 expression is regulated during cell division. PR-Set7 is expressed early during mitosis and binds to mitotic chromosomes during prometaphase, prior to separation of the chromosomes. We postulate that binding of PR-Set7 to mitotic chromosomes establishes the basis for propagation of this mark through cell divisions. We hypothesize that PR-Set7 recognizes methylation of histone H4 at lysine 20 in the mother chromosomes and transmits this mark to the daughter chromosomes before separation. This discovery is exciting, because the modification can be passed down to daughter cells, telling them which genes are silent and which are active – in other words, determining the cell’s identity.

A further advance on this front involves Ezh2, a known member of the Polycomb group of proteins that maintain Hox gene repression during development. We found that Ezh2 actually methylates histones and that it participates in a dynamic interplay between histone methylation and acetylation. The activity of Ezh2 is modulated during its association with other proteins. Studies from other labs established that the expression of Ezh2 and some of its partner proteins is cell cycle regulated and coordinated with DNA replication. Thus, while histone methylation by PR-Set7 may establish heritable cellular identity during mitosis, histone methylation by Ezh2 may do so during DNA replication.

A corollary to understanding the normal transmission of a cell’s identity to its progeny is that it may provide insight into which parts of the process go awry in cancer, a disease that involves the loss of the cell’s identity. One or more of the proteins required to sustain the accurate program may be found to function improperly in tumor cells. This may be the case for Ezh2, which has recently been found to be overly abundant in cancer cells. We are currently using a mouse model of prostate oncogenesis to test whether this overabundance correlates with cancer progression, whether it is directly involved in aberrant cell growth, and if so, how.

Grants from the National Institutes of Health provided partial support for these projects.