Eukaryotic Transcription Factors

▶Transcription factor domain structure: Transcription factors other than the general transcription factors of the basal transcription complex were first identified through their affinity for specific motifs in promoters, upstream regulatory elements (UREs) or enhancer regions. These factors have two distinct activities. Firstly, they bind specifically to their DNA-binding site and, secondly, they activate transcription. These activities can be assigned to separate protein domains called activation domains and DNA-binding domains. In addition, many transcription factors occur as homo- or heterodimers, held together by dimerization domains. A few transcription factors have ligand-binding domains which allow regulation of transcription factor activity by binding of an accessory small molecule. The steroid hormone receptors are an example containing all four of these types of domain.

Mutagenesis of the yeast transcription factors Gal4 and Gcn4 showed that their DNA-binding and transcription activation domains were in separate parts of the proteins. Experimentally, these activation domains were fused to the bacterial LexA repressor. The resulting hybrid proteins activated transcription from a promoter containing the lexA operator sequence, indicating that the transcriptional activation function of the yeast proteins was separable from their DNA-binding activity. Experiments of this type are called domain swap experiments.

▶DNA-binding domains:

  • The helix–turn–helix domain

This domain is characteristic of DNA-binding proteins containing a 60-amino-acid homeodomain, which is encoded by a sequence called the homeobox. In the Antennapedia transcription factor of Drosophila, this domain consists of four α-helices in which helices II and III are at right angles to each other and are separated by a characteristic β-turn. The characteristic helix–turn–helix structure is also found in bacteriophage and bacterial DNA-binding proteins such as the λ phage cro repressor, the lac and trp repressors, and the cAMP receptor protein (CRP). The domain binds so that one helix, known as the recognition helix, lies partly in the major groove and interacts with the DNA. The recognition helices of the two homeodomain factors Bicoid and Antennapedia can be exchanged, and this swaps their DNA-binding specificities. Indeed, the specificity of this interaction is demonstrated by the observation that exchanging only one amino acid residue is enough to swap the DNA-binding specificities.

  • The zinc finger domain

This domain exists in two forms. The C2H2 zinc finger has a loop of 12 amino acids anchored by two cysteine and two histidine residues that tetrahedrally co-ordinate a zinc ion. This motif folds into a compact structure comprising two β-strands and one α-helix, the latter binding in the major groove of DNA. The α-helical region contains conserved basic amino acids which are responsible for interacting with the DNA. This structure is repeated nine times in TFIIIA, the RNA Pol III transcription factor. It is also present in three copies in the transcription factor SP1. Usually, three or more C2H2 zinc fingers are required for DNA binding. A related motif, in which the zinc ion is co-ordinated by four cysteine residues, occurs in over 100 steroid hormone receptor transcription factors. These factors function as homo- or heterodimers, in which each monomer contains two C4 ‘zinc finger’ motifs. The two motifs are now known to fold together into a more complex conformation stabilized by zinc, which binds to DNA by the insertion of one α-helix from each monomer into successive major grooves, in a manner reminiscent of the helix–turn–helix proteins.
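The C2H2 arrangement described above (two cysteines, a 12-residue loop, then two histidines) can be written as a simple sequence pattern. The following is only a minimal sketch: the 12-residue loop comes from the text, but the spacings of 2 to 4 residues between the cysteines and 3 to 5 residues between the histidines are assumptions based on the commonly quoted consensus, and the function name `find_c2h2_candidates` and the toy sequence are hypothetical. A match marks a candidate only, since a pattern cannot show that the region actually folds around a zinc ion.

```python
import re

# Assumed consensus: C-x(2,4)-C-x(12)-H-x(3,5)-H.
# Only the 12-residue loop is taken from the text; the other spacings are assumptions.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_candidates(protein_seq):
    """Return (start_index, matched_text) for each non-overlapping candidate motif."""
    seq = protein_seq.upper()
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(seq)]


# Synthetic toy sequence (not a real protein): two candidate fingers joined by a linker.
finger = "C" + "AA" + "C" + "A" * 12 + "H" + "AAA" + "H"
toy = "MK" + finger + "TGEKP" + finger

for start, text in find_c2h2_candidates(toy):
    print(start, text)
```

Counting the candidates in a sequence gives a rough check against the statement that three or more C2H2 fingers are usually required for DNA binding.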

  • The basic domain

A basic domain is found in a number of DNA-binding proteins and is generally associated with one or other of two dimerization domains, the leucine zipper or the helix–loop–helix (HLH) motif. These are referred to as basic leucine zipper (bZIP) or basic HLH proteins. Dimerization of the proteins brings together two basic domains which can then interact with DNA.

▶Dimerization domains:

  • Leucine zippers

Leucine zipper proteins contain a hydrophobic leucine residue at every seventh position in a region that is often at the C-terminal part of the DNA-binding domain. These leucines lie in an α-helical region, and the regular repeat of these residues forms a hydrophobic surface on one side of the α-helix, with a leucine every second turn of the helix. These leucines are responsible for dimerization through interactions between the hydrophobic faces of the α-helices. This interaction forms a coiled-coil structure. bZIP transcription factors contain a basic DNA-binding domain N-terminal to the leucine zipper. The basic domain lies on an α-helix that is a continuation of the C-terminal leucine zipper α-helix. The N-terminal basic domains of the two helices form a symmetrical structure in which each basic domain lies along the DNA in opposite directions, interacting with a symmetrical DNA recognition site so that the protein in effect forms a clamp around the DNA. The leucine zipper is also used as a dimerization domain in proteins that use DNA-binding domains other than the basic domain, including some homeodomain proteins.
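The 'leucine at every seventh position' rule lends itself to a short scan of a protein sequence. An α-helix has about 3.6 residues per turn, so a spacing of seven residues corresponds to roughly two turns, which is why the leucines line up on one face of the helix. The sketch below is illustrative only: the function name `find_leucine_heptads`, the default threshold of four repeats and the toy sequence are assumptions, not part of the original text, and a hit is no proof that the region forms a coiled coil.

```python
def find_leucine_heptads(protein_seq, min_repeats=4):
    """Return (start_index, repeat_count) for each run of leucines spaced 7 residues apart.

    min_repeats is an assumed threshold chosen for illustration.
    """
    seq = protein_seq.upper()
    hits = []
    for i, residue in enumerate(seq):
        if residue != "L":
            continue
        # Skip leucines that merely continue a run started 7 residues earlier.
        if i >= 7 and seq[i - 7] == "L":
            continue
        count = 1
        j = i + 7
        while j < len(seq) and seq[j] == "L":
            count += 1
            j += 7
        if count >= min_repeats:
            hits.append((i, count))
    return hits


# Synthetic toy sequence (not a real protein): four leucines at 7-residue spacing.
toy_zipper = "MK" + "LAAAAAA" * 4 + "GS"
print(find_leucine_heptads(toy_zipper))
```

Real leucine zippers tolerate occasional substitutions at the leucine positions, so a stricter or looser threshold may be appropriate depending on the purpose.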

  • The helix–loop–helix domain

The overall structure of this domain is similar to the leucine zipper, except that a nonhelical loop of polypeptide chain separates two α-helices in each monomeric protein. Hydrophobic residues on one side of the C-terminal α-helix allow dimerization. This structure is found in the MyoD family of proteins. As with the leucine zipper, the HLH motif is often found adjacent to a basic domain that requires dimerization for DNA binding. With both basic HLH proteins and bZIP proteins the formation of heterodimers allows much greater diversity and complexity in the transcription factor repertoire.

▶Transcription activation domains:

  • Acidic activation domains

Comparison of the transactivation domains of yeast Gcn4 and Gal4, mammalian glucocorticoid receptor and herpes virus activator VP16 shows that they have a very high proportion of acidic amino acids. These have been called acidic activation domains or ‘acid blobs’ or ‘negative noodles’ and are characteristic of many transcription activation domains. It is still uncertain what other features are required for these regions to function as efficient transcription activation domains.
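Because these domains are defined by their composition rather than by a known structure, one simple way to describe a candidate region is the fraction of its residues that are acidic (aspartate, D, and glutamate, E). The following is a minimal sketch under that assumption; the function name `residue_fraction` and the toy sequence are hypothetical, and the text gives no threshold above which a region counts as an acidic activation domain.

```python
def residue_fraction(protein_seq, residues="DE"):
    """Return the fraction of positions in protein_seq occupied by the given residues.

    The default "DE" counts the acidic residues aspartate and glutamate.
    """
    seq = protein_seq.upper()
    if not seq:
        return 0.0
    return sum(seq.count(r) for r in residues) / len(seq)


# Synthetic toy sequence (not a real activation domain): 4 acidic residues out of 8.
print(residue_fraction("DDEEAAAA"))  # 0.5
```

Passing a different `residues` string gives the same measure for any other residue class.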

  • Glutamine-rich domains

Glutamine-rich domains were first identified in two activation regions of the transcription factor SP1. As with acidic domains, the proportion of glutamine residues seems to be more important than overall structure. Domain swap experiments between glutamine-rich transactivation regions from the diverse transcription factors SP1 and the Drosophila protein Antennapedia showed that these domains could substitute for each other.

  • Proline-rich domains

Proline-rich domains have been identified in several transcription factors. As with glutamine, a continuous run of proline residues can activate transcription. This domain is found, for example, in the c-Jun, AP2 and Oct-2 transcription factors.

▶Repressor domains: Repression of transcription may occur by indirect interference with the function of an activator. This may occur by:

  • Blocking the activator DNA-binding site (as with prokaryotic repressors)
  • Formation of a non-DNA-binding complex (e.g. the repressors of steroid hormone receptors, or the Id protein which blocks HLH protein–DNA interactions, since it lacks a DNA-binding domain)
  • Masking of the activation domain without preventing DNA binding (e.g. Gal80 masks the activation domain of the yeast transcription factor Gal4)

In other cases, a specific domain of the repressor is directly responsible for inhibition of transcription. For example, a domain of the mammalian thyroid hormone receptor can repress transcription in the absence of thyroid hormone, whereas the receptor activates transcription when bound to its ligand. The product of the Wilms tumor gene, WT1, is a tumor-suppressor protein having a specific proline-rich repressor domain that lacks charged residues.

▶Targets for transcriptional regulation: The presence of diverse activation domains raises the question of whether they each have the same target in the basal transcription complex or different targets for the activation of transcription. They are distinguishable from each other, since the acidic activation domain can activate transcription from a downstream enhancer site, while the proline-rich domain activates only weakly and the glutamine-rich domain not at all. While proline-rich and acidic domains are active in yeast, glutamine-rich domains have no activity, implying that they have a different transcriptional target which is not present in the yeast transcription complex. Proposed targets of different transcriptional activators include:

  • chromatin structure;
  • interaction with TFIID through specific TAFIIs;
  • interaction with TFIIB;
  • interaction with the TFIIH complex, or modulation of its activity, leading to differential phosphorylation of the CTD of RNA Pol II.

It seems likely that different activation domains have different targets, and almost any component or stage in transcription initiation and elongation could be a target for regulation, resulting in multistage regulation of transcription.