
Self-Instructional on DNA Science
Proteins and Nucleic Acids
Proteins are composed of monomeric units called amino acids. These monomers are called amino acids because of the carboxylic acid functional group and the amine group common to all molecules of this classification.
All of the proteins in living systems perform essential functions for organisms. The functions of proteins are defined by the characteristics of the amino acids that compose them. In particular, the sequence of amino acids in a protein determines the function of that protein. Amino acids are linked through peptide bonds. The amino acid chain is termed a polypeptide. The importance of the sequence of amino acids in a polypeptide explains how there can be seemingly limitless types of proteins from a mere 20 different amino acids. Change the amino acid sequence of any protein, and the structure of the protein changes.
Nucleic acids are polymers, as are proteins. The monomeric units of nucleic acids are called nucleotides. Nucleotides are made up of three parts covalently bonded together: a phosphate group, a five-carbon sugar, and a nitrogenous base.
Figure 1. Generalized diagram of two nucleic acids and their components.

A phosphodiester bond links nucleotide units to one another. These bonds are formed through a dehydration reaction, creating the complex polymeric molecules known as DNA and RNA. DNA stands for deoxyribonucleic acid, and RNA stands for ribonucleic acid. The nucleotides of DNA lack one oxygen on the ribose sugar. Thus the prefix "deoxy." Nucleic acids were named with reference to the nucleus of the cell because that is where they were first discovered. DNA does reside in the nucleus of the cell in eukaryotic organisms, but in prokaryotes, which lack a nucleus, DNA is found in the cytoplasm. Different types of RNA are found both in the nucleus of eukaryotes and in the cytoplasm. Nucleic acids have acidic properties because of the phosphate groups, which can act as hydrogen ion donors.
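DNA and RNA differ in several characteristics: DNA contains the sugar deoxyribose, while RNA contains ribose; DNA uses the base thymine, while RNA uses uracil in its place; and DNA is typically double stranded, while RNA is typically single stranded.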
Complementarity in DNA
The double stranded structure of DNA is the result of the formation of hydrogen bonds between the nitrogenous bases on one strand of DNA with the nitrogenous bases on a different strand of DNA. There are four nitrogenous bases in DNA: thymine, adenine, cytosine, and guanine.
These names are abbreviated by the letters T, A, C, and G. The key to DNA's remarkable ability to replicate itself lies in the fact that each nitrogenous base can only form the proper hydrogen bonds with one other nitrogenous base. Thymine can only hydrogen bond with adenine. Cytosine can only hydrogen bond with guanine. This property of the nitrogenous bases is called complementarity because each base has its specific complement. It is very important to remember these base-pairing rules, as they are the single most important feature of nucleic acid structure. Summarized, the base-pairing rules are: A pairs with T, and C pairs with G.
Complementarity allows DNA to replicate itself, and also to be rewritten in the form of RNA. The RNA can then be translated into an amino acid sequence, also through the help of complementarity. The sequence of nitrogenous bases is retained throughout replication, transcription, and translation because of these base-pairing rules.
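The base-pairing rules can be written down as a small lookup table. The following is a minimal Python sketch; the names DNA_PAIRS and complement are chosen here for illustration and do not come from the tutorial.

```python
# Base-pairing rules for DNA: A pairs with T, and C pairs with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand.

    The result is listed in the same left-to-right order as the input,
    so it reads in the opposite 5'/3' orientation.
    """
    return "".join(DNA_PAIRS[base] for base in strand.upper())

print(complement("ATTCG"))  # TAAGC
```

Taking the complement twice returns the original sequence, which is one way to see why the sequence of bases is retained through replication.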
It may seem strange to think that a spiraling, double-stranded molecule of DNA, composed of simple sugars, phosphates, and four different nitrogenous bases, could somehow lead to the production of all proteins necessary for the functions of the cell. The process is really beautifully simple: DNA is transcribed, or rewritten, into the form of RNA; the RNA molecule is translated into an amino acid sequence; and the amino acid sequence folds to produce a three-dimensional protein.
DNA Replication: the duplication of DNA molecules
We all began existence as a single cell. At some point a sperm and an ovum fused together, the ovum carrying one half of the DNA content of our mother and the sperm one half of the DNA content of our father. This fused cell is called a zygote. The zygote grew and divided until it became many cells, all of which needed to have exactly the same DNA content as the original cell.
Every time one cell divides to become two cells it must first make an identical copy of all of its DNA so that it may pass this new DNA to the new cell. The DNA must be replicated, or copied. Error-free DNA replication is absolutely critical to the growth, development, reproduction and maintenance of all organisms. Why is it so essential that DNA be replicated faithfully? The DNA in all of the thousands of cells forming an embryo is used as a set of instructions that tells the embryo how to develop. The development of even the simplest creatures is not yet well understood, but this development is regulated by the DNA within the organism.
All life starts with and relies upon the amazing process of DNA replication.
DNA replication starts with the double stranded (ds) structure of DNA splitting into two single strands. In this process, the hydrogen bonds between the two strands of DNA are broken in one specific location. This location in any DNA molecule is called the origin of replication. Some DNA molecules have only one origin of replication, but large DNA molecules, such as the eukaryotic chromosome, have many origins.
The DNA at the origin of replication is mostly composed of A and T base pairs. If you look at a structural formula showing the base pairing of A-T and C-G you will notice that A and T base pairs form 2 hydrogen bonds while C and G base pairs form 3 hydrogen bonds. This means that A and T base pairs take less energy to pull apart and therefore split apart more easily.
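The hydrogen bond counts above can be tallied for any stretch of double stranded DNA. This Python sketch assumes the simple model in the text (2 bonds per A-T pair, 3 per C-G pair) and ignores every other force that holds the helix together. The example sequences and the names H_BONDS and hydrogen_bonds are made up for illustration.

```python
# Hydrogen bonds formed by the base pair that each base takes part in.
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding this strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())

at_rich = "TTATATAA"  # hypothetical origin-like stretch
gc_rich = "GCCGCGGC"  # hypothetical stretch of the same length

print(hydrogen_bonds(at_rich))  # 16
print(hydrogen_bonds(gc_rich))  # 24
```

For two stretches of equal length, the one richer in A and T has fewer hydrogen bonds to break, which fits the idea that such a region opens more easily.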
Once the double stranded DNA opens up at the origin of replication, an enzyme called DNA polymerase binds to the DNA. The DNA polymerase is a proteinaceous enzyme that catalyzes the formation of phosphodiester bonds between the nucleotides of DNA. DNA polymerase works in the 5' to 3' direction along a strand of DNA. The 5' (read "five prime") and 3' (read "three prime") notation refers to a location on the DNA strand. As you know, DNA is composed of alternately repeating segments of phosphate group and deoxyribose sugar, with bases attached to the sugar. The carbons in the deoxyribose are numbered according to IUPAC rules. The number 5 carbon of deoxyribose is attached directly to a phosphate group, while the number 3 carbon of deoxyribose bears a hydroxyl group that is linked to a subsequent phosphate group. DNA polymerase forms a new phosphodiester bond between the 3' hydroxyl group and the 5' phosphate group of another nucleotide. The head of the newly replicated DNA strand will have a phosphate group attached to the 5' carbon, while the tail end will consist of a hydroxyl group attached to the 3' carbon.
When two DNA strands hydrogen bond together to form a double stranded structure they must lie in an antiparallel fashion so that the bases can be in the proper orientation to form hydrogen bonds. The antiparallel orientation means that the 3' end of one strand is side by side with the 5' end of the complementary strand. One of the strands is essentially upside-down.

Figure 2. The carbon numbering for deoxyribose.
The directional addition of nucleotides and antiparallel orientation are very important concepts to understand before we go on.
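Antiparallel orientation has a practical consequence when sequences are written down: if both strands are written 5' to 3', the partner strand is the complement read backwards. The sketch below illustrates this in Python; the function name partner_strand and the example sequence are assumptions made for illustration.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def partner_strand(strand_5_to_3: str) -> str:
    """Return the strand that pairs with the input, also written 5' to 3'."""
    # Reverse first (antiparallel), then apply the base-pairing rules.
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3.upper()))

top = "ATGCC"
bottom = partner_strand(top)  # "GGCAT" when written 5' to 3'

# Show the duplex with the bottom strand flipped so paired bases line up.
print("5' " + top + " 3'")
print("3' " + bottom[::-1] + " 5'")
```

The second printed line reads TACGG from its 3' end, so each base sits directly under its complement in the first line.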
Replication proceeds outward from the origin in both directions at the same time, with a replication fork at each end of the growing replication bubble. For this reason, DNA replication is said to be bidirectional. Figure 3 illustrates this concept. Please note that bidirectional replication proceeds in both directions around the circular chromosome, which allows for a speedier replication process.
Figure 3. A bidirectional replication bubble on a circular DNA bacterial chromosome

Remember that the contents of the cell contain many solubilized molecules and components. Many nucleotides are floating around next to the DNA polymerase waiting to form the new DNA strand. Each of the nucleotides contains either a thymine, adenine, cytosine, or guanine base. How does DNA polymerase know which of the four nucleotides to insert next? The sequence of bases for the new DNA strand is determined by its complementary strand, using the base pairing rules A-T and C-G. Only a nucleotide with an adenine will fit opposite a nucleotide with a thymine, while only a nucleotide with a cytosine base will fit opposite a nucleotide with a guanine base. If DNA polymerase tries to insert an adenine opposite a cytosine, the hydrogen bonds will not form properly, and the adenine will be removed.
Keep in mind that DNA replication proceeds on both sides of the replication fork, one DNA polymerase moving toward the fork, and another moving away. In this manner, each individual strand of the old DNA is used as a complement for the new strand of DNA. This is termed semiconservative replication, because each new double stranded DNA molecule is composed of one old (parental) strand and one new (daughter) strand.
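Semiconservative replication can be pictured as splitting a duplex and building a fresh partner for each old strand. The Python sketch below is a bookkeeping illustration only, with no enzymes or timing modeled. The names new_partner and replicate and the short example duplex are hypothetical.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def new_partner(old_strand: str) -> str:
    """Build the strand that pairs with old_strand; both are written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(old_strand.upper()))

def replicate(duplex):
    """Split a (strand, strand) duplex and pair each old strand with a new one."""
    old_a, old_b = duplex
    return [
        {"old": old_a, "new": new_partner(old_a)},
        {"old": old_b, "new": new_partner(old_b)},
    ]

parent = ("ATGCC", "GGCAT")  # two complementary strands, each written 5' to 3'
for daughter in replicate(parent):
    print(daughter)
```

Each daughter duplex keeps exactly one parental strand, and both daughters carry the same pair of sequences as the parent.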
Remember that DNA polymerase can only add nucleotides in the 5' to 3' direction. This means that DNA polymerase can only catalyze the addition of a nucleotide onto the 3' OH group at the tail of a DNA strand. If this is true then only the 3' to 5' strand can act as the template for synthesis of the alternate strand, because of the antiparallel orientation of DNA molecules. How then does the other strand of DNA get synthesized? It is synthesized in what we call a discontinuous fashion. Refer back to Figure 3 for illustration.
As the strands of DNA separate, synthesis occurs continuously from the strand running in the 3' to 5' direction.
However, in order to replicate the 5' to 3' strand, the DNA polymerase must jump up to the crotch of the fork and replicate the DNA in the opposite direction. Imagine the DNA polymerase taking a giant leap forward, and then scooting backward, then leaping forward, and so on. Because the DNA polymerase frequently stops and jumps ahead, several small pieces of replicated DNA result. These small pieces of DNA are called Okazaki fragments, after their discoverer. After the Okazaki fragments are synthesized on the discontinuous strand they are joined together, or ligated, by an enzyme aptly named DNA ligase.
A tip on 5' to 3' DNA synthesis: Do not try to remember just the numbers. Remember instead the structure of the nucleotide, and that the 3' hydroxyl group is at the tail of the molecule. DNA polymerase can only add a new nucleotide to an existing hydroxyl group; therefore, all additions must take place at the 3' tail of the new strand.
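The discontinuous synthesis described above can be imitated with a toy model. In this Python sketch the template of the discontinuous strand is exposed in fixed-size chunks as the fork opens, each chunk is copied as its own fragment, and the fragments are then joined. The fixed fragment size, the example sequence, and the function names are assumptions for illustration; real fragments vary in length and the model leaves out everything except the ordering.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def partner(strand_5_to_3: str) -> str:
    """Return the pairing strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(strand_5_to_3))

def okazaki_fragments(template_5_to_3: str, size: int):
    """Copy the template chunk by chunk, in the order the fork exposes it.

    The template is written 5' to 3' in the direction the fork moves,
    so each new chunk appears further toward the template's 3' end.
    """
    chunks = [template_5_to_3[i:i + size]
              for i in range(0, len(template_5_to_3), size)]
    return [partner(chunk) for chunk in chunks]

def ligate(fragments):
    """Join fragments into one strand written 5' to 3'.

    The most recently made fragment sits at the 5' end of the new strand.
    """
    return "".join(reversed(fragments))

template = "AATGCCGT"
fragments = okazaki_fragments(template, 3)
print(fragments)          # ['ATT', 'GGC', 'AC']
print(ligate(fragments))  # ACGGCATT
```

The ligated result equals the partner of the whole template, which is the point: many short 5' to 3' syntheses add up to one complete strand.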
Practice Round One
1. Please draw 2 nucleotides covalently bonded together through a phosphodiester bond. You do not need to draw the structure of the bases, simply write the letter abbreviation at the correct location on the ribose.
2. Write the complementary sequence for the following single strand of DNA:
5' A-T-T-C-C-G-G-A-A-T-T-C-C-C-C-G-T-A-A-T-C-G-T-A-C-C-G-G-T-G-C-A 3'
3. What is the initial site of DNA replication and what special feature/s does it have to facilitate the replication process?
4. Define or explain the following terms which relate to DNA structure and function: Antiparallel, Bidirectional synthesis, Semiconservative synthesis, Replication fork, Continuous strand synthesis, Discontinuous strand synthesis, Okazaki fragment, and DNA ligase.
DNA Transcription: the conversion of DNA to RNA
Literally, the term transcription means to "write across," or convert information. The transcription process converts the double stranded DNA code into a single strand RNA format. The sequence of bases in DNA holds all the information necessary to tell the cell how to manufacture proteins. Why then can protein not be directly synthesized from the double-stranded DNA molecule?
First, double-stranded DNA is a huge, cumbersome molecule. Since cells usually only need a few proteins manufactured at any given time it would be very difficult to locate the specific DNA sequence which codes for a single protein (a gene) on the huge strand of DNA.
Second, in all eukaryotes the DNA in a cell is isolated within the nucleus, separate from the structures used in the assembly and modification of proteins. The most important of these is the ribosome. The ribosome floats in the cytoplasm, or is attached to the endoplasmic reticulum, also in the cytoplasm. For this reason, working ribosomes and DNA chromosomes are isolated from one another in eukaryotic cells. Chromosomes are far too large to fit through a nuclear pore. However, a piece of single-stranded mRNA that codes for one or more proteins can easily pass through a nuclear pore to meet up with a ribosome inside the cytoplasm.
Fortunately for us, the process of transcription is very similar to the process of replication. Just be careful not to get them confused!
Transcription is accomplished by a proteinaceous enzyme called RNA polymerase that catalyzes the formation of phosphodiester bonds between the nucleotides to produce an RNA strand. The sequence of the RNA molecule is determined by its complementarity with the DNA strand, just as in replication. The only difference is that whenever RNA polymerase encounters an adenine nucleotide, it inserts a uracil nucleotide on the new RNA strand, rather than thymine.
RNA polymerase can only catalyze the formation of the phosphodiester bonds in a 5' to 3' direction (just like DNA polymerase) and the DNA and newly made RNA strand are antiparallel (just like double stranded DNA). This means that the template strand of DNA for RNA synthesis is always the 3' to 5' strand.
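Transcription follows the same pairing logic as replication, with uracil taking the place of thymine in the new strand. The Python sketch below assumes the template strand is supplied written 3' to 5', so the resulting RNA reads 5' to 3' in the same left-to-right order. The name transcribe and the short example template are made up for illustration.

```python
# Pairing rules for building RNA on a DNA template: uracil pairs with adenine.
RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5' to 3') made from a DNA template written 3' to 5'."""
    return "".join(RNA_PAIRS[base] for base in template_3_to_5.upper())

print(transcribe("TACGGA"))  # AUGCCU
```

Apart from U replacing T, the RNA has the same sequence as the DNA strand that was not used as the template.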
One final note: transcription as a process means the conversion of DNA to any RNA molecule. Some RNAs made via transcription are not translated into proteins. Instead these RNA molecules have a wide variety of functions inside cells. The best understood types of non-translated RNA are those responsible for the conversion of messenger RNA (mRNA) into protein molecules. These RNAs are transfer RNA (tRNA) and ribosomal RNA (rRNA), which will be discussed in the next section.
Practice Round Two
1. Why must the transcription of DNA into RNA occur?
2. In which direction does RNA synthesis proceed?
3. What enzyme catalyzes transcription?
4. Transcribe the following ds DNA molecule.
5' A-T-T-C-C-G-G-C-C-A-T-C-G-C-T 3'
3' T-A-A-G-G-C-C-G-G-T-A-G-C-G-A 5'
Translation: Conversion of mRNA into Protein
The sequence of the amino acids in a protein is critically important to the structure, and therefore the function of the protein. Amino acid sequence is ultimately determined by DNA sequence. Therefore, the sequence of bases in DNA is equally important because a change in the base sequence can affect the amino acid sequence. If you change the base sequence, the amino acid sequence may change, often resulting in a nonfunctional protein that is useless or even harmful to the cell.
To summarize, the three main types of RNA are: messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA).
How does translation occur?
The mRNA travels to the ribosome in the cytoplasm and binds to the small subunit of the ribosome. A complete ribosome may be thought of as two spherically shaped structures that sit one on top of the other. The complete ribosome is composed of a small spherical ribosomal subunit and a large spherical ribosomal subunit. Figure 4 below is a diagrammatic representation of the subunit structure of ribosomes.
Figure 4. Subunit structure of a ribosome.

It is known that the sequence of the bases in DNA is the foundation for the amino acid sequence of a protein, but how do we get from a sequence of nucleotide bases to a sequence of amino acids? Each nitrogenous base cannot represent a separate amino acid. Obviously, four different bases could only represent four different amino acids. There are twenty different amino acids that must be specified. Neither can a pair of bases code for all twenty different amino acids. A combination of two of the four bases would result in only 16 different possible combinations. The language that codes for each amino acid is represented by a sequence of three bases, called a codon. By definition a codon is a 3 base sequence in messenger RNA which codes for a specific amino acid or a stop translation signal.
All of the possible combinations of 4 bases taken three at a time number 4^3, which gives 64 different combinations. This is more than enough to represent the 20 different amino acids. In fact, even though only 21 codons would strictly be needed (one for each of the 20 amino acids, one of which also serves as the start signal, plus a stop signal), all 64 codons are used: 61 codons code for the 20 amino acids and 3 codons code for a stop translation signal. Because there are more codons than amino acids, most amino acids are specified by more than one codon. This redundancy is why the genetic code is said to be degenerate.
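The counting argument is easy to check by enumeration. This short Python sketch lists every sequence of one, two, and three bases using the standard library function itertools.product; the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"  # the four bases found in mRNA

# How many distinct "words" can be spelled with 1, 2, or 3 bases?
for length in (1, 2, 3):
    print(length, len(BASES) ** length)  # 1 4, then 2 16, then 3 64

# Enumerate all three-base codons explicitly.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
print(len(codons))  # 64
print(codons[:4])   # ['UUU', 'UUC', 'UUA', 'UUG']
```

Only the three-base words give at least the 20 distinct combinations needed for the amino acids.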
Each codon is specific for only one amino acid; however, some amino acids have more than one codon. For example, there are six codons which represent the amino acid leucine. These codons are CUU, CUC, CUA, CUG, UUA and UUG. However, the amino acid methionine has only one codon, AUG, which is also the start codon.
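The number of codons per amino acid can be counted directly from the standard genetic code. The Python sketch below builds the code as a dictionary from a compact listing of the 64 assignments in U, C, A, G order. The names AMINO and CODON_TABLE are illustrative, and "Stop" is treated as if it were an amino acid purely for counting.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Assignments for UUU, UUC, UUA, UUG, UCU, ... GGG (third base changes fastest).
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

counts = Counter(CODON_TABLE.values())
print(counts["Leu"])   # 6
print(counts["Met"])   # 1
print(counts["Stop"])  # 3

leucine = sorted(c for c, aa in CODON_TABLE.items() if aa == "Leu")
print(leucine)  # ['CUA', 'CUC', 'CUG', 'CUU', 'UUA', 'UUG']
```

Subtracting the 3 stop codons from 64 leaves the 61 codons that specify amino acids.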
Now let's get back to the process of translation.
During translation the mRNA binds to the small ribosomal subunit first and then a tRNA molecule carrying an amino acid is called in. The tRNA uses a three-base sequence called an anticodon to bind to the start translation site on the mRNA. Only one specific tRNA with the proper anticodon will be able to bind with the mRNA codon because of the base-pairing rules.
Translation begins with the AUG start codon. This AUG sequence is recognized by a tRNA that carries the amino acid methionine. This special tRNA is called tRNA Met, as an abbreviation for methionine. The tRNA Met molecule has the anticodon UAC, which is complementary to the codon AUG. In this manner, methionine is installed as the first amino acid of the polypeptide chain. The start translation site is always the same for all mRNA molecules. The 3 nucleotide base sequence AUG says "Start Translation!" on the mRNA's of all genes. For this reason it is termed the universal start translation site.
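The codon and anticodon relationship is another use of the base-pairing rules, this time between two RNA molecules. In the Python sketch below the codon is written 5' to 3' and the anticodon is returned in the same left-to-right order, which means it reads 3' to 5'. That orientation convention and the function name are assumptions made here, and the sketch applies strict pairing only.

```python
# RNA-to-RNA pairing: A pairs with U, and C pairs with G.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

def anticodon_3_to_5(codon_5_to_3: str) -> str:
    """Return the anticodon aligned under the codon, so it reads 3' to 5'."""
    return "".join(RNA_PAIRS[base] for base in codon_5_to_3.upper())

print(anticodon_3_to_5("AUG"))  # UAC
```

Because the two molecules pair in antiparallel fashion, the same anticodon written 5' to 3' would be spelled in the reverse order.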
Below is a chart which represents all of the 64 codons of the Genetic Code.
| UUU - Phe | UCU - Ser | UAU - Tyr | UGU - Cys |
| UUC - Phe | UCC - Ser | UAC - Tyr | UGC - Cys |
| UUA - Leu | UCA - Ser | UAA - Stop | UGA - Stop |
| UUG - Leu | UCG - Ser | UAG - Stop | UGG - Trp |
| CUU - Leu | CCU - Pro | CAU - His | CGU - Arg |
| CUC - Leu | CCC - Pro | CAC - His | CGC - Arg |
| CUA - Leu | CCA - Pro | CAA - Gln | CGA - Arg |
| CUG - Leu | CCG - Pro | CAG - Gln | CGG - Arg |
| AUU - Ile | ACU - Thr | AAU - Asn | AGU - Ser |
| AUC - Ile | ACC - Thr | AAC - Asn | AGC - Ser |
| AUA - Ile | ACA - Thr | AAA - Lys | AGA - Arg |
| AUG - Met (Start) | ACG - Thr | AAG - Lys | AGG - Arg |
| GUU - Val | GCU - Ala | GAU - Asp | GGU - Gly |
| GUC - Val | GCC - Ala | GAC - Asp | GGC - Gly |
| GUA - Val | GCA - Ala | GAA - Glu | GGA - Gly |
| GUG - Val | GCG - Ala | GAG - Glu | GGG - Gly |
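Reading the chart by hand amounts to a table lookup, three bases at a time. The Python sketch below is a simplified reader, not a model of the ribosome: it assumes translation starts at the first AUG in the message and stops at the first in-frame stop codon. The function name translate and the example mRNA are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# Assignments for UUU, UUC, UUA, UUG, UCU, ... GGG (third base changes fastest).
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna_5_to_3: str):
    """Translate from the first AUG until a stop codon or the end of the message."""
    mrna = mrna_5_to_3.upper()
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

print(translate("GGAUGCCUUUAGGAUAAGC"))  # ['Met', 'Pro', 'Leu', 'Gly']
```

The example message contains the letters UAG before its real stop codon, but they straddle two codons in the reading frame set by AUG and so are not read as a stop.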
Outline of the Process of Translation
Initiation
Elongation
Termination
The process of translation occurs continuously inside living cells and is an incredibly fast process. The instant that the translation of protein ceases, life ceases.
Practice Round Three
1. Define or explain the following terms: codon, anticodon, universal start codon, stop codons, degeneracy of the DNA code, small and large ribosomal subunits, mRNA, tRNA, and rRNA.
2. What are the three steps of translation called?
3. Why are there 64 codons?
4. Where does translation occur?
5. Why is DNA not directly translated into proteins?
6. List the components of translation and describe what happens to them after termination of translation has occurred.
7. Draw the process of translation initiation, elongation and termination.
POST TEST on DNA Science
1. Outline the processes of Replication of DNA, Transcription of RNA and Translation of Proteins. Please include definitions of all the components of each process. Draw pictures if necessary.
2. Using your codon chart, Replicate, then Transcribe and Translate the following DNA Sequence. Be sure to label the ends of the molecules appropriately.
5' A-A-T-A-T-G-C-T-C-A-T-T-C-C-C-G-G-G-T-T-C-A-T-G-C-C-G-C-G-C-T-G-A-A-T 3'
3. Draw and label appropriately a DNA replication fork. Be sure to label the strands and show the direction of synthesis.
© copyright by Gretchen Kirchner 1996, 2001