Protein power: transcription

As described earlier, transcription relies on the complementary pairing of bases. The
two strands of the double helix separate locally, and one of the separated strands
acts as a template. Next, free ribonucleotides align on the DNA template by pairing
with their complementary bases.

The free ribonucleotide A aligns with T in the
DNA, G with C, C with G, and U with A. The process is catalyzed by the enzyme RNA polymerase, which attaches and moves
along the DNA, adding ribonucleotides to the growing RNA, as shown in Figure 10-6a.

Hence, already we see the two
principles of base complementarity and binding proteins (in this case, the RNA polymerase) in action.

RNA growth is always in the 5′ → 3′ direction: in other words, nucleotides are always
added at a 3′ growing tip, as shown in Figure
10-6b. Because of the antiparallel nature of the nucleotide pairing, the
fact that RNA is synthesized 5′ → 3′ means that the template strand must be oriented
3′ → 5′.
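The pairing rules and strand orientation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a model of the enzyme; the function name `transcribe_template` and the example sequence are invented for this sketch.

```python
# Pairing rules from the text: template DNA base -> ribonucleotide added to the RNA.
PAIRING = {"T": "A", "C": "G", "G": "C", "A": "U"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the RNA, written 5'->3', for a template strand written 3'->5'.

    Reading the template 3'->5' and appending each new ribonucleotide to the
    end of the list mirrors growth at the 3' tip of the RNA.
    """
    rna = []
    for base in template_3_to_5.upper():
        if base not in PAIRING:
            raise ValueError(f"unexpected DNA base: {base!r}")
        rna.append(PAIRING[base])
    return "".join(rna)


if __name__ == "__main__":
    # Hypothetical template, written 3'->5'.
    print(transcribe_template("TACGGCATT"))
```

Because the template is supplied in the 3′ → 5′ orientation, no reversal is needed: the first base read pairs with the 5′ end of the transcript.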

In most prokaryotes, a single RNA polymerase species transcribes all types of
RNA. Figure 10-7 shows the structure of
RNA polymerase from E. coli. We can see that the enzyme
consists of four different subunit types.

The beta (β) subunit has a molecular
weight of 150,000, beta prime (β′) 160,000, alpha (α) 40,000, and sigma (σ)
70,000. The σ subunit can dissociate from the rest of the complex, leaving the
core enzyme.

The complete enzyme with σ is termed the RNA polymerase holoenzyme and is
necessary for correct initiation of transcription, whereas the core enzyme can
continue transcription after initiation.
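As a quick arithmetic check, the subunit weights quoted above can be summed to estimate the size of the core enzyme and the holoenzyme. The sketch below assumes two copies of α and one copy of each other subunit; that stoichiometry is an assumption of this example, not something stated in the passage above, and any subunit without a quoted weight is left out.

```python
# Molecular weights quoted in the text.
SUBUNIT_MW = {
    "alpha": 40_000,
    "beta": 150_000,
    "beta_prime": 160_000,
    "sigma": 70_000,
}

# ASSUMPTION: two alpha subunits per complex, one of each of the others.
CORE_COPIES = {"alpha": 2, "beta": 1, "beta_prime": 1}


def complex_mw(copies: dict) -> int:
    """Sum subunit weights, weighted by copy number."""
    return sum(SUBUNIT_MW[name] * count for name, count in copies.items())


core_mw = complex_mw(CORE_COPIES)
holo_mw = core_mw + SUBUNIT_MW["sigma"]  # holoenzyme = core + one sigma

if __name__ == "__main__":
    print(f"core enzyme: {core_mw:,}")
    print(f"holoenzyme:  {holo_mw:,}")
```

Under these assumptions the core enzyme comes to roughly 390,000 and the holoenzyme to roughly 460,000.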

Next, let’s look at the three distinct stages of transcription: initiation,
elongation, and termination.

The regions of the DNA that signal initiation of transcription in prokaryotes are
termed promoters. (We consider their role in gene regulation in Chapter 11.)
Figure 10-8 shows the promoter sequences from 13 different transcription
initiation points on the E. coli genome.

The bases are aligned according to homologies, or
similar base sequences, that appear just before the first base transcribed
(designated the “initiation site” in Figure
10-8).

Note in Figure 10-8 that two regions of
partial homology appear in virtually every case. These regions have been termed
the −35 and −10 regions because of their locations relative to the transcription
initiation point.

At the bottom of Figure
10-8, an ideal, or consensus, sequence of a promoter is given.
Physical experiments have confirmed that RNA polymerase makes contact with these
two regions when binding to the DNA.

The enzyme then unwinds DNA and begins the
synthesis of an RNA molecule.
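The idea of scoring a sequence against a consensus can be sketched as a small search. The text above does not spell out the consensus itself (it appears only in Figure 10-8), so the hexamers used here, TTGACA for the −35 region and TATAAT for the −10 region, and the 16 to 18 bp spacer are supplied as assumptions from the commonly cited E. coli consensus. The function names are hypothetical.

```python
# ASSUMPTION: commonly cited E. coli consensus hexamers; not quoted in the text.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_promoter_match(seq: str, spacers=(16, 17, 18)):
    """Find the placement of the two hexamers with the fewest mismatches.

    Returns (mismatch_count, start_of_minus_35, start_of_minus_10) using
    0-based indices into seq, or None if seq is too short for any placement.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        for gap in spacers:
            j = i + 6 + gap  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = (mismatches(seq[i:i + 6], MINUS_35)
                     + mismatches(seq[j:j + 6], MINUS_10))
            if best is None or score < best[0]:
                best = (score, i, j)
    return best


if __name__ == "__main__":
    # Hypothetical sequence with a perfect match and a 17 bp spacer.
    demo = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "CCCCCCA"
    print(best_promoter_match(demo))
```

Real promoters rarely match the consensus exactly, which is why the sketch reports a mismatch count instead of a yes or no answer.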

The dissociative subunit of RNA polymerase, the σ factor, allows RNA polymerase
to recognize and bind specifically to promoter regions. First, the holoenzyme
searches for a promoter (Figure 10-9a)
and initially binds loosely to it, recognizing the −35 and −10 regions.

The resulting structure is termed a closed promoter complex (Figure 10-9b). Then,
the enzyme binds more tightly, unwinding bases near the −10 region. When the bound
polymerase causes this local denaturation of the DNA duplex, it is said to form an
open promoter complex (Figure 10-9c).

This initiation step, the formation of an open complex,
requires the sigma factor.

Shortly after initiating transcription, the sigma factor dissociates from the RNA polymerase. The RNA is always synthesized in the 5′ → 3′ direction (Figures 10-10 and 10-11), with nucleoside triphosphates (NTPs) acting as
substrates for the enzyme. The following equation represents the addition of
each ribonucleotide.

NTP + (NMP)n → (NMP)n+1 + PPi

Here (NMP)n denotes an RNA chain of n nucleotide residues. The energy for the
reaction is derived from splitting the high-energy
triphosphate into the monophosphate and releasing inorganic pyrophosphate
(PPi), as shown in Figure
10-10. Figure 10-11 gives a
physical picture of elongation. Note how a “transcription bubble” must be
maintained, because the transcription takes place on a double-stranded template.
The bubble must move along the DNA duplex during elongation. Certain sequences
may cause stalling or pausing, which becomes critical for termination of
transcription.
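The bookkeeping of the addition reaction can be tallied for a whole transcript. The sketch below assumes that the first nucleotide keeps its triphosphate, so that an RNA of n residues requires n − 1 bond-forming steps; that detail is general background rather than something stated in this passage, and the function name is hypothetical.

```python
def elongation_tally(rna_length: int) -> dict:
    """Count substrates and products for synthesizing one RNA chain.

    ASSUMPTION: the 5'-terminal nucleotide retains its triphosphate, so one
    PPi is released per phosphodiester bond, not per nucleotide.
    """
    if rna_length < 1:
        raise ValueError("an RNA chain needs at least one nucleotide")
    bonds = rna_length - 1
    return {
        "ntp_consumed": rna_length,
        "phosphodiester_bonds": bonds,
        "ppi_released": bonds,
    }


if __name__ == "__main__":
    print(elongation_tally(100))
```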

RNA polymerase also recognizes signals for chain termination, which includes the
release of the nascent RNA and the enzyme from the template. There are two major
mechanisms for termination in E. coli.

In the first mechanism, the termination is direct. The terminator sequences
contain about 40 bp, ending in a GC-rich stretch that is followed by a run of
six or more A’s on the template strand.

The corresponding GC sequences on the
RNA are so arranged that the transcript in this region is able to form
complementary bonds with itself, as can be seen in Figure 10-12. The resulting
double-stranded RNA section is called a hairpin loop.

It is followed by the terminal run of U’s that correspond to the A residues on the DNA
template. The hairpin loop and section of U residues appear to serve as a signal
for the release of RNA polymerase and termination of transcription.
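The signal described above, a self-complementary stretch followed by a run of U residues, can be searched for in a transcript with a simple scan. This is a deliberately crude sketch: it accepts only perfectly paired stems, and the stem length, loop sizes, and minimum U run are arbitrary illustrative values, not figures taken from the text. All names are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair antiparallel with rna."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_intrinsic_terminator(rna: str, stem=6, loop_range=(3, 8), min_u=6):
    """Locate a perfect hairpin immediately followed by a run of U's.

    Returns (stem_start, u_run_start) as 0-based indices, or None.
    Assumes rna contains only A, C, G, and U.
    """
    rna = rna.upper()
    shortest = 2 * stem + loop_range[0] + min_u
    for start in range(len(rna) - shortest + 1):
        left = rna[start:start + stem]
        wanted = reverse_complement(left)
        for loop in range(loop_range[0], loop_range[1] + 1):
            right_start = start + stem + loop
            right = rna[right_start:right_start + stem]
            tail = rna[right_start + stem:right_start + stem + min_u]
            if right == wanted and tail == "U" * min_u:
                return start, right_start + stem
    return None


if __name__ == "__main__":
    # Hypothetical transcript: GC-rich stem, 4-base loop, then six U's.
    demo = "CC" + "GCCGCC" + "UAAU" + "GGCGGC" + "UUUUUU" + "A"
    print(find_intrinsic_terminator(demo))
```

A more realistic search would tolerate mismatches in the stem and weigh GC content, but the structure of the signal is the same.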


In the second type, the help of an additional protein factor, termed rho, is required for RNA polymerase
to recognize the termination signals. mRNAs with rho-dependent termination
signals do not have the string of U residues at the end of the RNA and usually
do not have hairpin loops.

A model for rho-dependent termination is shown in
Figure 10-13. Rho is a hexamer
consisting of six identical subunits; the hydrolysis of ATP to ADP and
Pi drives the termination reaction.

The first step in termination
is the binding of rho to a specific site on the RNA termed rut
(Figure 10-13a and b). After
binding, rho pulls the RNA off the RNA polymerase, probably by translocating
along the mRNA, as depicted in Figure
10-13b and c.

The rut sites are located just
upstream from (that is, 5′ from) sequences at which the RNA polymerase tends to
pause.

The efficiency of both mechanisms of termination is influenced by surrounding
sequences and other protein factors, as well.

Transcription (biology)


Transcription is the first of several steps of DNA-based gene expression, in which a particular segment of DNA is copied into RNA (especially mRNA) by the enzyme RNA polymerase.

Both DNA and RNA are nucleic acids, which use base pairs of nucleotides as a complementary language. During transcription, a DNA sequence is read by an RNA polymerase, which produces a complementary, antiparallel RNA strand called a primary transcript.

Transcription proceeds in the following general steps:

  1. RNA polymerase, together with one or more general transcription factors, binds to promoter DNA.
  2. RNA polymerase generates a transcription bubble, which separates the two strands of the DNA helix. This is done by breaking the hydrogen bonds between complementary DNA nucleotides.
  3. RNA polymerase adds RNA nucleotides (which are complementary to the nucleotides of one DNA strand).
  4. RNA sugar-phosphate backbone forms with assistance from RNA polymerase to form an RNA strand.
  5. Hydrogen bonds of the RNA–DNA helix break, freeing the newly synthesized RNA strand.
  6. If the cell has a nucleus, the RNA may be further processed. This may include polyadenylation, capping, and splicing.
  7. The RNA may remain in the nucleus or exit to the cytoplasm through the nuclear pore complex.

The stretch of DNA transcribed into an RNA molecule is called a transcription unit and encodes at least one gene. If the gene encodes a protein, the transcription produces messenger RNA (mRNA); the mRNA, in turn, serves as a template for the protein's synthesis through translation.

Alternatively, the transcribed gene may encode a non-coding RNA such as microRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), or enzymatic RNA molecules called ribozymes.[1]

Overall, RNA helps synthesize, regulate, and process proteins; it therefore plays a fundamental role in performing functions within a cell.

In virology, the term may also be used when referring to mRNA synthesis from an RNA molecule (i.e., RNA replication).

For instance, the genome of a negative-sense single-stranded RNA (ssRNA −) virus may serve as the template for a positive-sense single-stranded RNA (ssRNA +).

This is because the positive-sense strand contains the information needed to translate the viral proteins for viral replication afterwards. This process is catalyzed by a viral RNA replicase.[2]

Background

A DNA transcription unit encoding a protein may contain both a coding sequence, which will be translated into the protein, and regulatory sequences, which direct and regulate the synthesis of that protein.

The regulatory sequence before (“upstream” from) the coding sequence is called the five prime untranslated region (5'UTR); the sequence after (“downstream” from) the coding sequence is called the three prime untranslated region (3'UTR).[1]

As opposed to DNA replication, transcription results in an RNA complement that includes the nucleotide uracil (U) in all instances where thymine (T) would have occurred in a DNA complement.

Only one of the two DNA strands serves as a template for transcription. The antisense strand of DNA is read by RNA polymerase from the 3' end to the 5' end during transcription (3' → 5').

The complementary RNA is created in the opposite direction, in the 5' → 3' direction, matching the sequence of the sense strand with the exception of switching uracil for thymine. This directionality is because RNA polymerase can only add nucleotides to the 3' end of the growing mRNA chain.


This use of only the 3' → 5' DNA strand eliminates the need for the Okazaki fragments that are seen in DNA replication.[1] This also removes the need for an RNA primer to initiate RNA synthesis, as is the case in DNA replication.

The non-template (sense) strand of DNA is called the coding strand, because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine). This is the strand that is used by convention when presenting a DNA sequence.[3]
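The relationship between the two DNA strands and the transcript can be checked directly: transcribing the template gives the same sequence as swapping T for U in the coding strand. The sketch below is illustrative only; the function names and the example sequence are invented here.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_FROM_TEMPLATE = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5_to_3: str) -> str:
    """Template strand written 3'->5', aligned base-for-base with the coding strand."""
    return "".join(DNA_PAIR[base] for base in coding_5_to_3)


def rna_from_template(template_3_to_5: str) -> str:
    """RNA written 5'->3', built by pairing against the template."""
    return "".join(RNA_FROM_TEMPLATE[base] for base in template_3_to_5)


def rna_from_coding(coding_5_to_3: str) -> str:
    """Shortcut: the transcript matches the coding strand with U in place of T."""
    return coding_5_to_3.replace("T", "U")


if __name__ == "__main__":
    coding = "ATGGCTTAA"  # hypothetical coding strand, written 5'->3'
    via_template = rna_from_template(template_from_coding(coding))
    assert via_template == rna_from_coding(coding)
    print(via_template)
```

This equivalence is why sequences are conventionally reported as the coding strand: the transcript can be read off it directly.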

Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA. As a result, transcription has a lower copying fidelity than DNA replication.[4]

Major steps


Transcription is divided into initiation, promoter escape, elongation, and termination.[5]

Initiation

Transcription begins with the binding of RNA polymerase, together with one or more general transcription factors, to a specific DNA sequence referred to as a “promoter” to form an RNA polymerase-promoter “closed complex”. In the “closed complex” the promoter DNA is still fully double-stranded.[5]

RNA polymerase, assisted by one or more general transcription factors, then unwinds approximately 14 base pairs of DNA to form an RNA polymerase-promoter “open complex”. In the “open complex” the promoter DNA is partly unwound and single-stranded. The exposed, single-stranded DNA is referred to as the “transcription bubble.”[5]

RNA polymerase, assisted by one or more general transcription factors, then selects a transcription start site in the transcription bubble, binds to an initiating NTP and an extending NTP (or a short RNA primer and an extending NTP) complementary to the transcription start site sequence, and catalyzes bond formation to yield an initial RNA product.[5]

In bacteria, the RNA polymerase core enzyme consists of five subunits: 2 α subunits, 1 β subunit, 1 β' subunit, and 1 ω subunit. In bacteria, there is one general RNA transcription factor, known as a sigma factor.

The RNA polymerase core enzyme binds to the sigma factor to form the RNA polymerase holoenzyme, which then binds to a promoter.[5]

In archaea and eukaryotes, RNA polymerase contains subunits homologous to each of the five RNA polymerase subunits in bacteria and also contains additional subunits.

In archaea and eukaryotes, the functions of the bacterial general transcription factor sigma are performed by multiple general transcription factors that work together.[5] In archaea, there are three general transcription factors: TBP, TFB, and TFE.

In eukaryotes, in RNA polymerase II-dependent transcription, there are six general transcription factors: TFIIA, TFIIB (an ortholog of archaeal TFB), TFIID (a multisubunit factor in which the key subunit, TBP, is an ortholog of archaeal TBP), TFIIE (an ortholog of archaeal TFE), TFIIF, and TFIIH.

TFIID is the first component to bind to DNA, through its TBP subunit, while TFIIH is the last component to be recruited. In archaea and eukaryotes, the RNA polymerase-promoter closed complex is usually referred to as the “preinitiation complex.”[6]

Transcription initiation is regulated by additional proteins, known as activators and repressors, and, in some cases, associated coactivators or corepressors, which modulate formation and function of the transcription initiation complex.[5]

Promoter escape

After the first bond is synthesized, the RNA polymerase must escape the promoter. During this time there is a tendency to release the RNA transcript and produce truncated transcripts.

This is called abortive initiation, and is common for both eukaryotes and prokaryotes.[7] Abortive initiation continues to occur until an RNA product of a threshold length of approximately 10 nucleotides is synthesized, at which point promoter escape occurs and a transcription elongation complex is formed.

Mechanistically, promoter escape occurs through DNA scrunching, providing the energy needed to break interactions between RNA polymerase holoenzyme and the promoter.[8]

In bacteria, it was historically thought that the sigma factor is always released after promoter clearance. This view was known as the obligate release model. Later data, however, showed that upon and following promoter clearance the sigma factor is released stochastically, a view known as the stochastic release model.[9]

In eukaryotes, at an RNA polymerase II-dependent promoter, upon promoter clearance, TFIIH phosphorylates serine 5 on the carboxy terminal domain of RNA polymerase II, leading to the recruitment of capping enzyme (CE).[10][11] The exact mechanism of how CE induces promoter clearance in eukaryotes is not yet known.


Elongation


One strand of the DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand and uses base pairing complementarity with the DNA template to create an RNA copy (which elongates during the traversal). Although RNA polymerase traverses the template strand from 3' → 5', the coding (non-template) strand and newly formed RNA can also be used as reference points, so transcription can be described as occurring 5' → 3'. This produces an RNA molecule from 5' → 3', an exact copy of the coding strand (except that thymines are replaced with uracils, and the nucleotides contain a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone).

mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA), so many mRNA molecules can be rapidly produced from a single copy of a gene.

The characteristic elongation rates in prokaryotes and eukaryotes are about 10-100 nts/sec.[12] In eukaryotes, however, nucleosomes act as major barriers to transcribing polymerases during transcription elongation.[13][14] In these organisms, the pausing induced by nucleosomes can be regulated by transcription elongation factors such as TFIIS.[14]
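The quoted elongation rates translate into a rough time window for transcribing a gene of a given length. The sketch below simply divides length by rate; it ignores initiation, pausing, and termination, and the function name and the 3,000-nucleotide example are hypothetical.

```python
def transcription_time_range(length_nt: int, slow: float = 10.0, fast: float = 100.0):
    """Return (shortest, longest) elongation time in seconds.

    slow and fast are elongation rates in nucleotides per second; the
    defaults are the 10-100 nt/s range quoted in the text.
    """
    if length_nt < 0 or slow <= 0 or fast <= 0:
        raise ValueError("length must be non-negative and rates positive")
    return length_nt / fast, length_nt / slow


if __name__ == "__main__":
    shortest, longest = transcription_time_range(3000)
    print(f"{shortest:.0f} to {longest:.0f} seconds")
```

For a 3,000-nucleotide transcript this gives somewhere between about half a minute and five minutes of elongation.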

Elongation also involves a proofreading mechanism that can replace incorrectly incorporated bases. In eukaryotes, this may correspond with short pauses during transcription that allow appropriate RNA editing factors to bind. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure.

Termination

Regulation of Gene Expression

For a cell to function properly, necessary proteins must be synthesized at the proper time. All cells control or regulate the synthesis of proteins from information encoded in their DNA. The process of turning on a gene to produce RNA and protein is called gene expression.

Whether in a simple unicellular organism or a complex multi-cellular organism, each cell controls when and how its genes are expressed.

For this to occur, there must be a mechanism to control when a gene is expressed to make RNA and protein, how much of the protein is made, and when it is time to stop making that protein because it is no longer needed.

The regulation of gene expression conserves energy and space. It would require a significant amount of energy for an organism to express every gene at all times, so it is more energy efficient to turn on the genes only when they are required.

In addition, expressing only a subset of genes in each cell saves space, because DNA must be unwound from its tightly coiled structure before it can be transcribed and its message translated.

Cells would have to be enormous if every protein were expressed in every cell all the time.

The control of gene expression is extremely complex. Malfunctions in this process are detrimental to the cell and can lead to the development of many diseases, including cancer.

Learning Objectives

  • Discuss why every cell does not express all of its genes
  • Compare prokaryotic and eukaryotic gene regulation


