
Chapter 1 Introduction

1.1 Biological background

1.1.1 Central dogma

The central dogma of molecular biology (Figure 1.1) describes the flow of genetic information within a biological system. It was first stated by Francis Crick [12]. Briefly, the path from gene to functional protein involves four steps:

First, RNA polymerase docks to the chromosome and slides along the gene, transcribing the sequence on one strand of DNA into a single strand of RNA. Next, all introns (the noncoding parts of the initial RNA transcript) are spliced out, and the remaining segments are joined together to make a messenger RNA (mRNA). The mRNA then moves out of the nucleus to the cytosol of the cell, where molecular machines translate it into chains of amino acids. Finally, each chain twists and folds into an intricate three-dimensional shape. Traditionally, proteins have been regarded as the molecules mainly responsible for biological function.
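The first three steps above can be illustrated with a toy Python sketch. It is a simplification under stated assumptions, not a model of the real machinery: the input is taken to be the coding strand (so the transcript is the same sequence with U in place of T), intron positions are supplied by hand rather than recognized from splice signals, and protein folding is not modelled. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def transcribe(coding_strand):
    """Step 1: return the pre-mRNA (coding-strand sequence with T -> U)."""
    return coding_strand.upper().replace("T", "U")


def splice(pre_mrna, introns):
    """Step 2: remove introns, given as half-open (start, end) index pairs."""
    exons, pos = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])
        pos = end
    exons.append(pre_mrna[pos:])
    return "".join(exons)


def translate(mrna):
    """Step 3: translate from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical gene: exon "ATGGCC", intron "GTTTAG", exon "TGGTAA".
gene = "ATGGCCGTTTAGTGGTAA"
pre_mrna = transcribe(gene)
mrna = splice(pre_mrna, [(6, 12)])
print(mrna)             # AUGGCCUGGUAA
print(translate(mrna))  # MAW
```

The codon table is built from the standard 64-letter amino acid string rather than written out codon by codon, which keeps the sketch short.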

Figure 1.1 The central dogma of molecular biology.

1.1.2 Non-coding RNA

Classically, proteins are recognized as having the main responsibility for biological function, with RNA merely a messenger that transfers protein-coding information from DNA [13, 14].

This concept has changed in recent years. While less than 2% of the genome encodes protein, over 80% of the genome produces non-protein-coding RNA transcripts (Figure 1.2) [13-17], and these non-coding RNAs (ncRNAs) have important biological functions, including gene regulation [18, 19], imprinting [20-24], epigenetic regulation [25, 26], cell cycle control [27], and regulation of transcription, translation and splicing [19, 28-32]. Many studies have reported ncRNAs that regulate the expression of protein-coding genes.

An ncRNA is any RNA molecule that is not translated into a protein; examples include piwi-interacting RNAs (piRNAs), microRNAs (miRNAs), short interfering RNAs (siRNAs), long ncRNAs (lncRNAs) and transcribed pseudogenes (TPGs) (Figure 1.3). The following sections introduce the functions of these ncRNAs.

Figure 1.2 Human genome.

Figure 1.3 Schematic representation of the emerging ncRNA world.

1.1.3 Piwi interacting RNA (piRNA)

Piwi-interacting RNAs (piRNAs), which form RNA-protein complexes through interactions with piwi proteins, are the largest class of small RNA (sRNA) molecules expressed in animal cells [33]. The biogenesis of piRNAs is not yet fully understood, although piRNAs have been linked to both epigenetic and post-transcriptional gene silencing of retrotransposons and other genetic elements in germ line cells, particularly during spermatogenesis [34].

1.1.4 MicroRNA (miRNA)

MicroRNAs (miRNAs) play important roles in development, oncogenesis and apoptosis by binding to mRNAs to regulate gene expression at the post-transcriptional level in mammals, plants and insects [35, 36]. The general biogenesis of miRNAs is shown in Figure 1.4. Briefly, miRNAs are single-stranded RNAs of ~22 nt in length generated from endogenous transcripts. A miRNA gene is transcribed by RNA polymerase II [8], and the primary miRNA (pri-miRNA) is first processed by the nuclear RNase III-type enzyme Drosha to release hairpin-shaped intermediates, the precursor miRNAs (pre-miRNAs) [37]. A pre-miRNA is typically 60-70 nt long and forms a hairpin structure that contains an ~22 bp double-stranded stem and an ~10 nt terminal loop. The nuclear export factor Exportin 5 exports the pre-miRNA from the nucleus to the cytoplasm [38]. The pre-miRNA is then cleaved by another RNase III-type enzyme, Dicer, to generate an ~22 nt RNA duplex that includes the mature miRNA, which becomes part of the RNA-induced silencing complex (RISC) [39]. The mature miRNA then binds to complementary sites in the target mRNA to negatively regulate gene expression through two major mechanisms: mRNA degradation, when hybridization between the miRNA and its target site is perfect, and translational repression, when hybridization is imperfect.
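The distinction between perfect and imperfect hybridization can be made concrete with a small Python sketch. It is illustrative only and rests on several assumptions: the miRNA and the candidate site have equal length and are aligned without gaps, only Watson-Crick pairs are counted, and the `max_mismatches` cutoff is a hypothetical parameter rather than a value taken from the literature. The 22-nt sequence is used purely as an example input.

```python
# Watson-Crick pairing partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that pairs perfectly with `rna` (antiparallel)."""
    return "".join(PAIR[base] for base in reversed(rna))


def count_mismatches(mirna, site):
    """Count positions where `site` fails to pair with `mirna`."""
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    return sum(a != b for a, b in zip(reverse_complement(mirna), site))


def predicted_mechanism(mirna, site, max_mismatches=4):
    """Map the degree of complementarity to the two mechanisms in the text.

    `max_mismatches` is a hypothetical cutoff chosen for illustration.
    """
    mismatches = count_mismatches(mirna, site)
    if mismatches == 0:
        return "mRNA degradation"
    if mismatches <= max_mismatches:
        return "translational repression"
    return "no regulation predicted"


mirna = "UGAGGUAGUAGGUUGUAUAGUU"           # example 22-nt sequence
perfect_site = reverse_complement(mirna)  # pairs at every position
imperfect_site = perfect_site[:10] + "AAA" + perfect_site[13:]  # 3 mismatches

print(predicted_mechanism(mirna, perfect_site))    # mRNA degradation
print(predicted_mechanism(mirna, imperfect_site))  # translational repression
```

Real target prediction tools use considerably richer criteria; the sketch only mirrors the two-way split described above.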

Figure 1.4 MicroRNA biogenesis.

1.1.5 Small interfering RNA (siRNA)

Small interfering RNAs (siRNAs) are a class of double-stranded RNA molecules 20-25 bp in length. siRNAs play many roles, but the most notable is in the RNA interference (RNAi) pathway, where they interfere with the expression of specific genes that have complementary nucleotide sequences [40].

1.1.6 Long non-coding RNA (lncRNA)

Long non-coding RNAs (lncRNAs) are generally defined as non-protein-coding transcripts more than 200 nt in length. This cutoff reflects practical considerations, including the separation of RNAs in common experimental protocols. Large-scale sequencing of cDNA libraries and, more recently, transcriptome sequencing by next-generation sequencing indicate that the human genome contains more than ten thousand lncRNAs [41]. The functions of lncRNAs are shown in Figure 1.5 [42]. Briefly, a lncRNA transcribed from an upstream non-coding promoter can negatively (Figure 1.5 ①) or positively (Figure 1.5 ②) affect expression of the downstream gene by inhibiting RNA polymerase II recruitment or inducing chromatin remodelling, respectively. A lncRNA can hybridize to a pre-mRNA and block recognition of the splice sites by the spliceosome, resulting in an alternatively spliced transcript (Figure 1.5 ③). Alternatively, hybridization of sense and antisense transcripts can allow Dicer to generate endo-siRNAs (Figure 1.5 ④). Binding of a lncRNA to a miRNA silences the miRNA (Figure 1.5 ⑤). A complex of a lncRNA with specific protein partners can modulate protein activity (Figure 1.5 ⑥), structure (Figure 1.5 ⑦) or localization (Figure 1.5 ⑧), or affect epigenetic regulation (Figure 1.5 ⑨). Finally, lncRNAs can also produce sRNAs (Figure 1.5 ⑩).

Figure 1.5 The functions of lncRNA.

1.1.7 Pseudogene

Pseudogenes are DNA sequences in the genome that are similar to specific protein-coding genes but are unable to produce functional proteins because of frameshifts, premature stop codons or other deleterious mutations [2]. Pseudogenes have been denoted in several ways, including the prefixed Greek symbol ψ, for example ψPPM1K, or a capital 'P' suffix, for example ZNF355P. There are two major classes of pseudogene (Figure 1.6): processed pseudogenes, which contain poly-A tails, lack introns and arise through retrotransposition, and nonprocessed pseudogenes, which result from gene duplication and retain the exon/intron structure, although sometimes incompletely [2].
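As a rough illustration of the disabling mutations named above, the following Python sketch flags premature stop codons and frameshifts in a candidate sequence. It is a deliberately crude check built on assumptions: both sequences are coding-strand DNA starting at the same start codon, and a frameshift is inferred only from a length difference that is not a multiple of three, which stands in for a proper alignment. The function names and sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA sequence into complete in-frame codons."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_premature_stop(seq):
    """Return the 0-based index of the first stop codon before the final
    codon, or None if there is none."""
    for index, codon in enumerate(codons(seq)[:-1]):
        if codon in STOP_CODONS:
            return index
    return None


def disabling_features(parent_cds, candidate):
    """List the disabling features detected in `candidate`."""
    features = []
    # Crude proxy: an indel whose net length is not a multiple of 3.
    if (len(candidate) - len(parent_cds)) % 3 != 0:
        features.append("frameshift")
    stop_index = first_premature_stop(candidate)
    if stop_index is not None:
        features.append(f"premature stop at codon {stop_index}")
    return features


parent = "ATGGCCTGGAAATAA"          # hypothetical parent coding sequence
nonsense_copy = "ATGGCCTGAAAATAA"   # TGG -> TGA substitution
frameshift_copy = "ATGGCTGGAAATAA"  # one base deleted

print(disabling_features(parent, parent))           # []
print(disabling_features(parent, nonsense_copy))    # ['premature stop at codon 2']
print(disabling_features(parent, frameshift_copy))  # ['frameshift']
```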

Pseudogenes are usually considered to be junk DNA and genomic fossils. However, a number of recent studies have shown that pseudogenes, especially transcribed ones, may not be mere genomic fossils but may function as gene regulators. The following section introduces the functions of TPGs.

Figure 1.6 The mechanisms of pseudogene formation.

1.1.8 Transcribed pseudogene (TPG)

Transcribed pseudogenes (TPGs) are disabled but nonetheless transcribed. TPGs may function as gene regulators by generating endogenous siRNAs (esiRNAs), antisense RNAs or RNA decoys. For instance, the transcript of an NOS pseudogene acts as a natural antisense regulator of neuronal NOS protein synthesis in snails [44, 45]. In mice, reduced expression of makorin1-p1 due to a transgene insertion caused mRNA instability of its parental gene Mkrn1, resulting in polycystic kidneys and bone deformity [10, 46], although contradictory results were also reported [47]. Additionally, the transcript of PTENP1, a highly homologous processed TPG of the tumor suppressor gene PTEN, not only interacts with its cognate sequence but also exerts a growth-suppressive role as a decoy by binding to PTEN-targeting miRNAs [48]. These findings suggest that TPGs may play active regulatory roles in cellular functions.
