DNA and RNA | Computational Medicine Center at Thomas …

Posted: March 31, 2021 at 6:36 am

1. DNA

1.1 DNA basics / structure

DNA (deoxyribonucleic acid) is the genomic material in cells that contains the genetic information used in the development and functioning of all known living organisms. DNA, along with RNA and proteins, is one of the three major macromolecules that are essential for life. Most of the DNA is located in the nucleus, although a small amount can be found in mitochondria (mitochondrial DNA). Within the nucleus of eukaryotic cells, DNA is organized into structures called chromosomes. The complete set of chromosomes in a cell makes up its genome; the human genome has approximately 3 billion base pairs of DNA arranged into 46 chromosomes. The information carried by DNA is held in the sequence of pieces of DNA called genes.

DNA consists of two long polymers of simple units called nucleotides, with backbones made of sugars and phosphate groups joined by phosphodiester bonds. These two strands run in opposite directions to each other and are therefore anti-parallel. Attached to each sugar is one of four types of molecules called nucleobases (bases). It is the sequence of these four bases along the backbone that encodes information. Read through the genetic code, this sequence specifies the sequence of the amino acids within proteins. The ends of DNA strands are called the 5′ (five prime) and 3′ (three prime) ends. The 5′ end has a terminal phosphate group and the 3′ end a terminal hydroxyl group. One of the major structural differences between DNA and RNA is the sugar, with the 2-deoxyribose in DNA being replaced by ribose in RNA.

The structure of DNA

Bases are classified into two types: the purines (A and G), which are fused five- and six-membered rings, and the pyrimidines (C, T and U), which are single six-membered rings. Uracil (U) takes the place of thymine in RNA and differs from thymine by lacking a methyl group on its ring. Uracil is not usually found in DNA, occurring only as a breakdown product of cytosine.

In the DNA double helix, each type of base on one strand normally interacts with just one type of base on the other strand. This is called complementary base pairing. Purines form hydrogen bonds to pyrimidines, with A bonding only to T, and C bonding only to G.
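The pairing rules above can be written down as a small lookup table. The following is a minimal Python sketch, not part of the original article; the function names are illustrative. It derives the partner strand of a given DNA sequence, and reverses the result because the two strands are antiparallel.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, C pairs with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(DNA_PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5' to 3'.

    The strands are antiparallel, so the complement is read in reverse.
    """
    return complement(strand)[::-1]


if __name__ == "__main__":
    top = "ATGCCG"                  # written 5' to 3'
    print(complement(top))          # TACGGC (partner strand, written 3' to 5')
    print(reverse_complement(top))  # CGGCAT (partner strand, written 5' to 3')
```

Applying reverse_complement twice returns the original sequence, which is one way to see that each strand fully determines the other.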

The central dogma of molecular biology is "DNA makes RNA makes protein". This general rule emphasizes the order of events from transcription through translation and provided the basis for much of the genetic code research in the 1950s, after the discovery of the double helix. The central dogma is often expressed as follows: DNA makes RNA, RNA makes proteins, proteins make us. Protein is never back-translated to RNA or DNA. Furthermore, DNA is never translated directly to protein.

The Central Dogma of Molecular Biology

See also: The central dogma (external link).

Cell division is essential for cells to multiply and organisms to grow. DNA replication must occur in order to faithfully transmit genetic material to the progeny of any cell or organism. When a cell divides, it must correctly replicate the DNA in its genome so that the two daughter cells have the same genetic information as their parent. The double-stranded structure of DNA provides a simple mechanism for DNA replication. The two strands are separated and then an enzyme called DNA polymerase recreates each strand's complementary DNA sequence. This enzyme makes the complementary strand by finding the correct base through complementary base pairing. In this way, the base on the old strand dictates which base appears on the new strand, and the cell ends up with a faithful copy of its DNA. As DNA polymerases can only extend a DNA strand in a 5′ to 3′ direction, different mechanisms are used to copy the two antiparallel strands of the double helix. Replication typically takes place during the S phase of the cell cycle.

The process by which DNA achieves its control of cell life and function through protein synthesis is called gene expression. In the simplest view, a gene is a DNA sequence that contains the genetic information for one functional protein. Proteins are essential for the modulation and maintenance of cellular activities. New protein molecules are assembled from amino acid building blocks based on information encoded in DNA/RNA. The amino acid sequence of each protein determines its conformation and properties (e.g. ability to interact with other molecules, enzymatic activity, etc.). Directed protein synthesis follows two major steps: gene transcription and transcript translation.

Transcription is the process by which the genetic information stored in DNA is used to produce a complementary RNA strand. In more detail, the DNA base sequence is first copied into an RNA molecule, called premessenger RNA (pre-mRNA), by an enzyme called RNA polymerase. Pre-mRNA has the same base sequence as the DNA coding strand, except that uracil takes the place of thymine. Genes consist of sequences encoding mRNA (exons) that are interrupted by non-coding sequences of variable length, called introns. Introns are removed and exons joined together before translation begins, in a process called mRNA splicing. Messenger RNA splicing has proved to be an important mechanism for greatly increasing the versatility and diversity of expression of a single gene. In eukaryotes, transcription and splicing take place in the nucleus and lead to the formation of mature mRNA; in bacteria and archaea, which lack a nucleus, transcription takes place in the cytoplasm. Several different mRNA and protein products can arise from a single gene by selective inclusion or exclusion of individual exons from the mature mRNA products. This phenomenon is called alternative mRNA splicing. It permits a single gene to code for multiple mRNA and protein products with related but distinct structures and functions [1]. Once the introns have been excised, the mature mRNA molecule is exported to the cytoplasm through the nuclear pores, where it binds to protein-RNA complexes called ribosomes [2]. Eukaryotic ribosomes contain two subunits: the 60S subunit contains a large (28S) ribosomal RNA molecule, together with the smaller 5.8S and 5S ribosomal RNAs, complexed with multiple proteins, whereas the RNA component of the 40S subunit is a smaller (18S) ribosomal RNA molecule.

DNA transcription
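Transcription and splicing can be sketched as simple string operations. The Python example below is only an illustration under stated assumptions: the toy gene, its exon coordinates and the function names are hypothetical, and real splicing depends on splice-site signals and cellular machinery that are not modeled here.

```python
from typing import Sequence, Tuple


def transcribe(coding_strand: str) -> str:
    """Pre-mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna: str, exons: Sequence[Tuple[int, int]]) -> str:
    """Join the given exon intervals (0-based, end-exclusive) in order.

    Everything outside the listed intervals (the introns) is dropped.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy gene: three exons (uppercase) and two introns (lowercase).
gene = "ATGGCC" + "gtaagtag" + "TTTGGA" + "gtccccag" + "AAATAA"
exons = [(0, 6), (14, 20), (28, 34)]

pre_mrna = transcribe(gene)
full_length = splice(pre_mrna, exons)                  # all three exons
exon_skipped = splice(pre_mrna, [exons[0], exons[2]])  # middle exon excluded

if __name__ == "__main__":
    print(full_length)   # AUGGCCUUUGGAAAAUAA
    print(exon_skipped)  # AUGGCCAAAUAA
```

The second call mimics alternative splicing: the same pre-mRNA yields a different mature mRNA when one exon is left out.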

Although every somatic cell in the human body contains the same genome, genes must be activated and silenced in a cell-type-specific manner. Moreover, a cell must silence expression of genes specific to other cell types to ensure genomic stability. This type of repression must be maintained throughout the life of each cell in normal development. Epigenetic modifications, defined as heritable yet reversible changes that influence the expression of certain genes without altering the primary DNA sequence, are ideal for regulating these events. The best-studied epigenetic modification in humans is DNA methylation. However, it is increasingly acknowledged that DNA methylation does not work alone, but rather occurs in the context of other epigenetic modifications such as histone modifications.

Epigenetic Modifications

RNA is another macromolecule essential for all known forms of life. Like DNA, RNA is made up of nucleotides. Once thought to play ancillary roles, RNAs are now understood to be among a cell's key regulatory players: they catalyze biological reactions, control and modulate gene expression, sense and communicate responses to cellular signals, etc.

The chemical structure of RNA is very similar to that of DNA: each nucleotide consists of a nucleobase, a ribose sugar, and a phosphate group. There are two differences that distinguish DNA from RNA: (a) RNA contains the sugar ribose, while DNA contains the slightly different sugar deoxyribose (a type of ribose that lacks one oxygen atom), and (b) RNA has the nucleobase uracil while DNA contains thymine. Unlike DNA, most RNA molecules are single-stranded and can adopt very complex three-dimensional structures.

DNA and RNA similarities and differences

The universe of protein-coding RNAs and non-protein-coding RNAs (ncRNAs) is very diverse vis-à-vis biogenesis, composition and function, and has been expanding rapidly [5-9]. Among the ncRNAs, microRNAs (miRNAs) represent the best-studied class to date and have been shown to regulate the expression of their protein-coding gene targets in a sequence-dependent manner [10-12].

An RNA molecule is said to be monocistronic when it captures the genetic information for a single molecular transcriptional product, e.g. a single miRNA precursor or a single primary mRNA. Most eukaryotic mRNAs are indeed monocistronic. On the other hand, rRNAs and some miRNAs are known to be polycistronic. In the case of polycistronic mRNAs, the primary transcript comprises several back-to-back mRNAs, each of which will eventually be translated into an amino acid sequence (polypeptide). Such polypeptides usually have a related function (they are often the subunits of a final protein complex), and their coding sequences are grouped into a single primary transcript, which in turn permits them to share a common promoter and to be regulated together.

One of the best-known and best-studied classes of RNAs is the messenger RNAs (mRNAs). mRNAs carry the genetic information that directs the synthesis of proteins by the ribosomes. All cellular organisms use mRNAs. The process of protein synthesis makes use of two more classes of RNAs, the transfer RNAs (tRNAs) and the ribosomal RNAs (rRNAs). The role of tRNAs is to deliver amino acids to the ribosome, where rRNAs link them together to form proteins.

The structure of an mRNA

RNA interference (RNAi) is a process that moderates gene expression in a sequence-dependent manner. The RNAi pathway is found in all higher eukaryotes and was recently found in budding yeast as well. Viruses have also been shown to be "RNAi-aware" in that they use their natural host's RNAi pathway to their benefit.

RNAi is initiated by Dicer, a double-stranded-RNA-specific endonuclease from the RNase III protein family. Dicer cleaves double-stranded RNA (dsRNA) molecules into short fragments of ~21 nucleotides, with a two-nucleotide overhang at their 3′ end, as well as a 5′ phosphate and a 3′ hydroxyl group. The RNAi pathway can be engaged by two types of small regulatory non-coding RNAs: a) small interfering RNAs (siRNAs), which are typically exogenous, and b) microRNAs (miRNAs), which are endogenous. siRNAs are double-stranded ncRNAs that are mainly delivered to the cell experimentally by various transfection methods, although they have also been described as being produced by the cell itself [15]. miRNAs are another type of small ncRNA, transcribed from the organism's DNA. After processing of the primary siRNAs and miRNAs by Dicer, typically one of the two strands is loaded onto the RNA-induced silencing complex (RISC), a complex of RNA and proteins that includes the Argonaute protein, whereas the other strand is discarded. The loaded siRNAs and miRNAs guide RISC's binding to specific mRNAs (targets). The sequence of the siRNA/miRNA determines the identity of the target. The resulting heteroduplex of the siRNA/miRNA and its target mRNA is characterized by base-pairing that generally spans much of the siRNA/miRNA's length. siRNAs are typically designed to be perfectly complementary to their targets. On the other hand, miRNAs need not be fully complementary to the mRNA that they target. This imprecise matching gives miRNAs the potential to target multiple endogenous mRNAs simultaneously. Whether induced by an siRNA or an miRNA, the downstream effect is the down-regulation of the targeted mRNA, either via degradation or translational inhibition.

RNA interference in mammalian cells
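The difference between perfect (siRNA-like) and imprecise (miRNA-like) matching can be illustrated with a toy search. The Python sketch below is an assumption-laden simplification, not a target-prediction method: it uses a short 8-nucleotide guide rather than a ~21-nucleotide one, counts mismatches uniformly along the site, and uses made-up sequences and function names.

```python
# Watson-Crick pairing rules for RNA: A pairs with U, C pairs with G.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement_rna(seq):
    """Return the sequence that pairs perfectly with seq, written 5' to 3'."""
    return "".join(RNA_PAIRS[base] for base in reversed(seq.upper()))


def find_target_sites(guide, mrna, max_mismatches=0):
    """List (position, mismatches) for every mRNA window the guide can pair with.

    A window qualifies if it differs from the perfectly complementary site
    at no more than max_mismatches positions.
    """
    site = reverse_complement_rna(guide)
    hits = []
    for i in range(len(mrna) - len(site) + 1):
        window = mrna[i:i + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Hypothetical guide strand and mRNA carrying one perfect and one imperfect site.
guide = "UAGCUUAU"
mrna = "GGG" + "AUAAGCUA" + "CCC" + "AUAAGGUA" + "GGG"

if __name__ == "__main__":
    print(find_target_sites(guide, mrna))                    # [(3, 0)]
    print(find_target_sites(guide, mrna, max_mismatches=1))  # [(3, 0), (14, 1)]
```

Requiring a perfect match finds a single site, while tolerating one mismatch finds a second one, which mirrors how imprecise matching lets one small RNA reach several targets.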

Designer siRNAs are now widely used in the laboratory to down-regulate specific proteins whose function is under study. At the same time, the ability to engage the RNAi pathway in an on-demand manner suggests the possibility that RNAi can be used in the clinic to reduce the production of those proteins that are over-expressed in a given disease context. Analogously, RNAi can also be used to "sponge" away excess amounts of an endogenous miRNA that would otherwise down-regulate a needed protein. The delivery method remains an important consideration for the development of RNAi-based therapies, as the active molecule needs to be delivered efficiently and in a tissue-specific manner in order to maximize impact and diminish off-target effects.

See also: RNAi (external link).

The expression of proteins is determined by genomic information, and their presence supports the function of cell life. Parts of an organism's genome are transcribed in an orderly tissue- and developmental-phase-specific manner into RNA transcripts that are destined to effect the eventual production of proteins.

Until fairly recently, it was believed that the molecules that are important for the function of a cell are those described by the Central Dogma of biology, namely messenger RNAs and proteins. Things began to change with the discovery of microRNAs more than 20 years ago in plants [16] and animals [17,18]. Subsequent research efforts have demonstrated that large parts of an organism's genome will be transcribed at one time point or another into RNA, but will not be translated into an amino acid sequence. These RNA transcripts have been referred to as ncRNAs, and there is increased appreciation that many of them are indeed functional and affect key cellular processes.

There are many recognizable classes of ncRNAs, each having a distinct functionality. These include: transfer RNAs (tRNAs) [19]; ribosomal RNAs (rRNAs) [20]; the above-mentioned miRNAs [17,18]; small nucleolar RNAs (snoRNAs) [21,22]; piwi-interacting RNAs (piRNAs) [23-25]; transcription initiation RNAs (tiRNAs) [26]; human microRNA-offset RNAs (moRNAs) [27]; sno-derived RNAs (sdRNAs) [28]; long intergenic ncRNAs (lincRNAs) [29]; etc. The full number of distinct ncRNA classes encoded within the human genome is currently unknown but is believed to be large.

miRNA biogenesis

The biological role of long ncRNAs as a class remains largely elusive. Several specific cases have been shown to be involved in transcriptional gene silencing and in the activation of critical regulators of development and differentiation. These exert their regulatory roles by interfering with transcription factors or their co-activators, through direct action on the DNA duplex, by regulating the expression of adjacent protein-coding genes, by mediating DNA epigenetic modifications, etc.

Reverse transcription is the transfer of information from RNA to DNA (the reverse of normal transcription). This is known to occur in the case of retroviruses, such as HIV, as well as in eukaryotes, in the case of retrotransposons and telomere synthesis.

Post-transcriptional modification is a process in cell biology by which primary transcript RNA is converted into mature RNA. A notable example is the conversion of precursor messenger RNA into mature messenger RNA (mRNA), which includes splicing and occurs prior to protein synthesis. This process is vital for the correct translation of the genomes of eukaryotes, as the primary RNA transcript produced by transcription contains both exons, which are the coding sections of the transcript, and introns, which are the non-coding sections.

Post-transcriptional modifications that lead to a mature mRNA include (i) the addition of a methylated guanine cap to the 5′ end of the mRNA and (ii) the addition of a poly-A tail to the other end. The cap and tail protect the mRNA from enzyme degradation and aid its attachment to the ribosome. In addition, (iii) intron (non-coding) sequences are spliced out of the mRNA and exon (coding) sequences are spliced together. The mature mRNA transcript will then undergo translation [64].

Proteins are molecules that carry out the reactions necessary to sustain the life of an organism. A single cell can contain thousands of different proteins.

Following transcription, translation is the next step of protein biosynthesis. In translation, mRNA produced by transcription is decoded by the ribosome to produce a specific amino acid chain, or polypeptide, that will later fold into a protein. Ribosomes read the mRNA sequence in a "ticker tape" fashion three bases at a time, inserting the appropriate amino acid encoded by each three-base code word, or codon, into the appropriate position of the growing protein chain. This process is called mRNA translation. In particular, the mRNA sequence directly relates to the polypeptide sequence by binding to transfer RNA (tRNA) adapter molecules in binding pockets within the ribosome. Each amino acid is encoded by a sequence of three successive bases. Because there are four code letters (A, C, G, and U), and because sequences read in the 5′ to 3′ direction have a different biologic meaning than sequences read in the 3′ to 5′ direction, there are 4^3 = 64 possible codons consisting of three bases. Some specialized codons serve as punctuation points during translation. The methionine codon (AUG) serves as the initiator codon, signaling the first amino acid to be incorporated. All proteins thus begin with a methionine residue, but this is often removed later in the translational process. Three codons, UAG, UAA, and UGA, serve as translation terminators, signaling the end of translation. The completed polypeptide chain then folds into a functional three-dimensional protein molecule and is transferred to other organelles for further processing, or released into the cytosol, where the newly completed chain can associate with other subunits to form complex multimeric proteins.

Protein translation
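The codon-by-codon reading described above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original article: it assumes the standard genetic code, starts at the first AUG it finds, and ignores everything a real ribosome needs beyond the sequence itself. The names CODON_TABLE and translate are chosen for this example.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G.
# One-letter amino acid codes; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no initiator codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(len(CODON_TABLE))                     # 64, i.e. 4^3
    print(translate("GGAUGGCCUUUGGAAAAUAAGG"))  # MAFGK
```

The table has 4^3 = 64 entries, three of which are stop codons, and the example polypeptide begins with methionine (M) because translation starts at AUG.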

Post-translational modification is the chemical modification of a peptide that takes place after its translation. Such modifications represent one of the later steps in protein biosynthesis for many proteins. During protein synthesis, 20 different amino acids can be incorporated in order to form a polypeptide. After translation, the addition of other biochemical functional groups (such as acetate, phosphate, various lipids and carbohydrates) to the protein's amino acids extends the range of functions of the protein. Other modifications change the chemical nature of an amino acid (e.g. citrullination) or make structural changes (e.g. formation of disulfide bridges). In addition, enzymes may remove amino acids from the amino end of the protein, or even cut the peptide chain in the middle. For instance, most nascent polypeptides start with the amino acid methionine because the start codon on mRNA also codes for this amino acid. This amino acid is usually taken off during post-translational modification. Other modifications, like phosphorylation, are part of common mechanisms for controlling the behavior of a protein, for instance activating or inactivating an enzyme.

See also: Inside a cell (external link).
