microRNA: A Master Regulator of Cellular Processes for Bioengineering Systems


Wei Sun,1 Yi-Shuan Julie Li,2 Hsien-Da Huang,3 John Y-J. Shyy,1 and Shu Chien2

1 Division of Biomedical Sciences, University of California, Riverside, California 92521; email: shyy@ucr.edu

2 Department of Bioengineering and Institute of Engineering in Medicine, University of California, San Diego, La Jolla, California 92093; email: shuchien@ucsd.edu

3 Department of Biological Science and Technology, Institute of Bioinformatics and Systems Biology, National Chiao Tung University, HsinChu, Taiwan

Annu. Rev. Biomed. Eng. 2010. 12:1–27 First published online as a Review in Advance on April 20, 2010

The Annual Review of Biomedical Engineering is online at bioeng.annualreviews.org This article’s doi:

1523-9829/10/0815-0001$20.00

Key Words

microRNA, gene regulation, bioinformatics, deep sequencing

Abstract

microRNAs (miRNAs) are small RNAs 18 to 24 nucleotides in length that serve the pivotal function of regulating gene expression. Instead of being translated into proteins, the mature single-stranded miRNA binds to messenger RNAs (mRNAs) to interfere with the translational process. It is estimated that whereas only 1% of the genomic transcripts in mammalian cells encode miRNA, nearly one-third of the encoded genes are regulated by miRNA. Various bioinformatics databases, tools, and algorithms have been developed to predict the sequences of miRNAs and their target genes. In combination with the in silico approaches in systems biology, experimental studies on miRNA provide a new bioengineering approach for understanding the mechanism of fine-tuning gene regulation. This review aims to provide state-of-the-art information on this important mechanism of gene regulation for researchers working in biomedical engineering and related fields. Particular emphasis is placed on summarizing the current tools and strategies for miRNA study from a bioengineering perspective and the possible applications of miRNAs (such as antagomirs and miRNA sponges) in biomedical engineering research.


mRNA: messenger ribonucleic acid

miRNA: microRNA

pri-miRNA: the original long transcript generated from the miRNA gene, which is processed into the miRNA precursor and then the mature miRNA

pre-miRNA: precursor miRNA

Contents

1. INTRODUCTION
2. miRNA
2.1. Biogenesis of miRNA
2.2. Mechanisms of miRNA Targeting to Cognate mRNA
2.3. Transcriptional Regulation of miRNA
3. miRNA IN HEALTH AND DISEASE
3.1. miRNA Involvement in Cell and Tissue Development
3.2. miRNA and Cancer
3.3. miRNA in the Cardiovascular System
3.4. miRNA Involvement in Metabolism
4. METHODS FOR miRNA PROFILING
4.1. Microarray-Based Screening
4.2. High-Throughput Sequencing-by-Synthesis Technology
5. BIOINFORMATICS ANALYSIS OF miRNAs
5.1. Databases of miRNAs
5.2. Databases of miRNA Targets
5.3. Algorithms and Tools for Identifying miRNAs
5.4. Algorithms and Tools for Identifying miRNA Targets
6. APPLICATIONS OF miRNA IN BIOENGINEERING
6.1. Antagomirs: Antisense Inhibition of miRNA
6.2. miRNA Sponges
6.3. Bioinformatics Perspectives
7. SUMMARY AND CONCLUSIONS

1. INTRODUCTION

The “central dogma” of modern molecular biology, which has provided the guiding principle for the transfer of genomic information into organismal structure and function, involves first the transcription of deoxyribonucleic acid (DNA) into messenger ribonucleic acid (mRNA) and then the translation of mRNA into proteins (1). (To review basic concepts of molecular genetics, see Reference 2, pp. 111–64.) The discovery of microRNAs (miRNAs) and their target mRNAs has uncovered novel mechanisms regulating gene expression beyond this central dogma. miRNAs are small noncoding RNAs that are not translated into proteins. Instead, the genes encoding miRNAs are transcribed from DNA to produce a primary transcript (pri-miRNA) that is processed into a shorter precursor miRNA (pre-miRNA), which is further processed into a mature, single-stranded miRNA that is 18 to 24 nucleotides long. A mature miRNA binds to its mRNA target at their complementary sequences to downregulate gene expression by inhibiting the translation of the mRNA into protein or by inducing mRNA degradation.

The first report on miRNA was presented in 1993 by Ambros and colleagues, who described a 22-nucleotide RNA in Caenorhabditis elegans encoded by the lin-4 gene, which can bind to the lin-14 transcript and interfere with its expression (3). In 1999, Baulcombe and colleagues discovered that miRNAs are also involved in silencing of plant genes (4). The identification of the miRNA let-7, originally discovered in C. elegans, in a variety of animal cells suggests ubiquitous distribution of miRNAs (5, 6). The term microRNA was formally introduced in 2001 (7, 8). Thousands of miRNAs have been identified in animals and plants since these seminal discoveries. The September 2009 release of miRBase (9), a central online repository for miRNAs, consists of 10,883 miRNA loci in 115 species, which express 10,581 distinct mature miRNA sequences. (More information about Release 14 of the miRBase Sequence Database is available at http://www.mirbase.org/.) There is mounting evidence that altered expression of these miRNAs is associated with the regulation of a variety of physiological processes such as development and pathophysiological processes such as cancer and other disease states.

RT-PCR: reverse transcription-polymerase chain reaction

Drosha: an endoribonuclease in the RNase III family that cleaves pri-miRNA into pre-miRNA

DGCR8: acronym for DiGeorge critical region 8, which is a binding partner for Drosha

Exportin-5 (XPO5): pre-miRNA-specific export carrier that mediates nuclear export of pre-miRNAs

Dicer: an endoribonuclease in the RNase III family that cleaves pre-miRNA into mature miRNA

Argonaute (Ago): the catalytic components of the RNA-induced silencing complex (RISC); named after the argonaute (AGO) phenotype of Arabidopsis mutants, which itself was named after its resemblance to argonautes

miRISC: miRNA-induced silencing complex that mediates gene silencing; composed of Argonaute proteins and mature miRNA

Compared with DNA and mRNA, miRNAs have much shorter nucleotide sequences. Conventional methods [e.g., Northern blotting and reverse transcription-polymerase chain reaction (RT-PCR)] used in the analysis of DNA or mRNA have significant limitations in the analysis of miRNAs because of their short nucleotide length. (For details about these techniques, see Reference 10.) Therefore, innovative high-throughput technologies have been developed to facilitate the sequencing of such short nucleotides on a genome-wide scale. These technologies are able to profile up to 10^6 short RNA sequences, but it remains unclear whether these sequenced RNAs are simply degradation products of larger RNAs or indeed miRNAs (or similar small RNAs). The unraveling of such problems requires the development of bioinformatics methods to analyze the large amount of data generated from high-throughput sequencing. Thus, software- and algorithm-based in silico approaches are necessary to help determine the genomic loci, functions, and targets of miRNAs.
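As a first, purely illustrative step in such an in silico analysis, sequenced reads can be restricted to the 18 to 24 nucleotide range of mature miRNAs and identical reads can be collapsed into counts. The sketch below assumes reads are supplied as plain DNA-alphabet strings; the function name `collapse_reads` and the example reads are hypothetical. A length filter alone cannot tell a genuine miRNA from a degradation fragment of the same size, so this is only a preprocessing step, not a classifier.

```python
from collections import Counter

def collapse_reads(reads, min_len=18, max_len=24):
    """Count identical reads whose length lies in the mature-miRNA range."""
    counts = Counter()
    for read in reads:
        # Normalize case and convert the sequencer's DNA alphabet to RNA.
        seq = read.strip().upper().replace("T", "U")
        if min_len <= len(seq) <= max_len:
            counts[seq] += 1
    return counts

# Hypothetical reads: two copies of a 22-nt let-7a-like sequence,
# one fragment that is too short, and one read that is too long.
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAGTAGGTTGTATAGTT",
    "TGAGGTAG",
    "ACGT" * 10,
]
profile = collapse_reads(reads)
```

Here `profile` maps each retained sequence to its read count, which can then be passed to downstream steps such as genome mapping or hairpin prediction.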

The rapid advancements in miRNA biology, including miRNAs’ functions in living cells in health and disease, have provided the motivation to introduce this frontier field to the discipline of bioengineering. This review summarizes, from a bioengineering perspective, the knowledge about biogenesis and functions of miRNAs, the regulatory roles of miRNAs in biological processes and diseases, the bioinformatics techniques and tools for identifying and characterizing miRNAs, and the potential applications of miRNAs in biomedical engineering research.

2. miRNA

2.1. Biogenesis of miRNA

The biogenesis of miRNA begins with the transcription of the miRNA gene in the nucleus. The pri-miRNA contains a 60- to 80-nucleotide hairpin stem-loop structure. As shown in Figure 1, its biogenesis involves the cleavage of this hairpin structure by a protein complex consisting of Drosha and DGCR8 (DiGeorge critical region 8, also known as Pasha). This results in the pre-miRNA, which includes a 22-bp stem, a loop, and a 2-nucleotide 3′ overhang (11–13). Drosha is an RNase III enzyme, and DGCR8 is its binding partner. The pre-miRNA is exported from the nucleus to the cytoplasm by Exportin-5 (XPO5) (14). In the cytoplasm, the pre-miRNA is further cleaved by another RNase III enzyme, Dicer, which removes the loop to yield the ~22-nucleotide miRNA duplex (15, 16). After being unwound by an unidentified helicase, one strand of miRNA (whose 5′ end binds more weakly to the complementary strand) is destined to be the mature miRNA and is termed the guide strand. The complementary strand, termed the passenger strand or miRNA*, is rapidly degraded (17, 18). Together with the Argonaute (Ago) family of proteins, the mature miRNA is then packed into a ribonucleoprotein complex known as miRISC (miRNA-induced silencing complex), which mediates gene silencing. The endonuclease activity of Ago cleaves the double-strand miRNA-mRNA complex but not the single-strand mRNA. Ago may also mediate the repression of protein synthesis through mechanisms that are unclear. The above provides a general scheme of miRNA biogenesis, but some miRNAs may be processed differently. For example, whereas most miRNAs mature in the cytoplasm, the biogenesis of human miR-29b occurs only in the nucleus (19).
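The strand-selection rule described above (the strand whose 5′ end is more weakly paired becomes the guide) can be illustrated with a toy calculation. The sketch below is not a thermodynamic model: the pairing weights, the four-position window, the function names, and the example duplex are all assumptions chosen for illustration, and both strands are assumed to have equal length with 2-nucleotide 3′ overhangs.

```python
# Illustrative pairing weights (an assumption, not measured free energies).
PAIR_SCORE = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna[::-1].translate(str.maketrans("ACGU", "UGCA"))

def five_prime_end_score(strand, partner, n=4, overhang=2):
    """Sum pairing weights over the first n positions at the 5' end of strand.

    Both strands are written 5'->3'; position i of strand is assumed to pair
    with position len(partner) - 1 - overhang - i of partner.
    """
    total = 0
    for i in range(n):
        j = len(partner) - 1 - overhang - i
        total += PAIR_SCORE.get((strand[i], partner[j]), 0)
    return total

def pick_guide(strand_a, strand_b, n=4):
    """Return the strand whose 5' end is more weakly paired (ties favor strand_a)."""
    score_a = five_prime_end_score(strand_a, strand_b, n)
    score_b = five_prime_end_score(strand_b, strand_a, n)
    return strand_a if score_a <= score_b else strand_b

# Hypothetical duplex: a 20-bp stem plus 2-nt 3' overhangs on each strand.
core = "UAUAAGGCAGCAGCGGCGCG"
strand_a = core + "UU"
strand_b = reverse_complement(core) + "UU"
guide = pick_guide(strand_a, strand_b)
```

In this example the 5′ end of `strand_a` is A/U-rich and that of `strand_b` is G/C-rich, so `strand_a` is returned as the guide. Real duplexes contain bulges and mismatches that this simple index-based pairing does not handle.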

Figure 1

Biogenesis and functional targeting of miRNA. pri-miRNA (a 60–80-nucleotide hairpin stem-loop) is transcribed from the miRNA gene by RNA pol II or III and then cleaved by the Drosha/DGCR8 complex to result in the pre-miRNA (with a 22-bp stem and a 2-nucleotide 3′ overhang) in the nucleus. After being exported from the nucleus to the cytoplasm by Exportin-5, pre-miRNA is further cleaved by Dicer to remove the terminal loop. After unwinding, one strand of miRNA acts as the functional guide strand and binds to the target mRNA. The complementary strand (i.e., miRNA*) is rapidly degraded.

There are active investigations regarding the regulatory mechanisms in each of the steps involved in miRNA biogenesis. The important issues that remain to be elucidated are (a) the molecular mechanisms by which Drosha, Dicer, DGCR8, and Ago are regulated; (b) the mechanisms of distinctive cellular localization of miRNAs; and (c) the molecular bases of miRNA transcription, degradation, and turnover. Conceptually, the homeostasis of miRNA, as that of all elements in the living system, is highly regulated by a set of intricate mechanisms. Understanding the structure-function basis of each component involved in miRNA biogenesis would help design the systems-biology approach to elucidate miRNA regulation and functions.

2.2. Mechanisms of miRNA Targeting to Cognate mRNA

The primary action of miRNA is to target the cognate mRNA, a process governed by base pairing. Depending on the extent of complementarity, miRNA may exert one of two effects on the targeted mRNA: (a) mRNA cleavage and degradation, or (b) translation repression. In plants, most miRNAs pair to the target mRNAs in a nearly perfect match, leading to mRNA cleavage and subsequent degradation (Figure 2) (20). However, such a high degree of miRNA-mRNA matching is rare in animal cells, where miRNAs typically make imperfect pairings with their mRNA targets. Investigators have extensively studied the mechanisms of animal miRNA targeting by using bioinformatics approaches in conjunction with experimental validation (21–25). The most important feature of the mechanism is the presence of a seed region comprising a segment of miRNA in which nucleotides 2–8 precisely match their complementary sequences within the 3′ untranslated region (3′ UTR) of the targeted mRNA. Another feature is that mismatches and bulges may be present in the miRNA-mRNA duplex but not in the region where miRNA associates with Ago. The third feature is the universal existence of complementarity between the 3′ half of the miRNA (typically at nucleotides 13–16) and the 3′ UTR of the mRNA to stabilize the duplex. This is necessary because the base pairing between the seed sequence and the target sites on the mRNA is usually not sufficient for repression (22, 24) (Figure 2).

Figure 2

Mechanism of miRNA targeting. The 7-nucleotide seed region, which starts at the second nucleotide from the 5′ end of the miRNA, is required for miRNA-mRNA interaction. The miRNA-mRNA targeting occurs predominantly at the 3′ UTR of the target mRNA, located 3′ (downstream) of the open reading frame (ORF). Bulges or mismatches may be present in the middle part of the miRNA-mRNA duplex. As shown in the oval on the lower left, perfect base pairing between miRNA and mRNA (mostly in plant cells) results in miRISC-mediated endonucleolytic cleavage and hence mRNA degradation. Despite the lack of complete complementarity in animal cells shown on the lower right, base pairing (particularly at nucleotides 13–16 of the miRNA) is still important in stabilizing the miRNA-mRNA interaction, which leads to translational repression. Frequently, multiple miRNAs (red and blue) can target the same 3′ UTR. Conversely, a unique miRNA may target multiple binding sites on the same 3′ UTR in animal cells.

Seed region: a miRNA segment whose sequences precisely match their complementary sequences within the 3′ UTR of the targeted mRNA. The region is usually located at nucleotides 2–8

3′ UTR: the untranslated region at the 3′ end of eukaryotic messenger RNA, which is important in translation regulation

Animal and plant miRNAs differ not only in their miRNA-mRNA base pairing but also in their modes of action. In plant cells, the high miRNA-mRNA complementarity recruits miRISC, leading to the cleavage of mRNA transcript by RNase activity associated with Ago. In animal cells, miRNAs fine-tune protein translation rather than degrade their mRNA targets as seen in plant cells (26). The 3′ UTRs of animal mRNAs often contain more than one site targeted by one or more miRNAs, suggesting the possibility of cooperative repression imposed by multiple miRNAs (22–25). Several databases and Web resources provide information on miRNA-target interactions; miRNA databases, algorithms, and tools for target prediction are reviewed in Section 5.
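The seed rule discussed in this section lends itself to a simple computational illustration. The sketch below takes nucleotides 2–8 of a miRNA, forms the reverse complement, and reports every position in a 3′ UTR where that 7-mer occurs. The function names and the toy UTR are hypothetical, and because seed pairing alone is usually not sufficient for repression, the hits should be read only as candidate sites, not as predicted targets.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return rna[::-1].translate(str.maketrans("ACGU", "UGCA"))

def seed_sites(mirna, utr):
    """Return 0-based start positions in utr that pair perfectly with the seed.

    The seed is taken as miRNA nucleotides 2-8 (Python slice 1:8); both
    sequences are written 5'->3' in the RNA alphabet.
    """
    site = reverse_complement(mirna[1:8])
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        # Advance by one so that overlapping sites are also reported.
        start = utr.find(site, start + 1)
    return positions

let7a = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a-like sequence used for illustration
site = reverse_complement(let7a[1:8])

# Hypothetical 3' UTR carrying two copies of the seed-matched site.
utr = "AAAA" + site + "AGGGA" + site + "UUU"
sites = seed_sites(let7a, utr)
```

Finding two sites for one miRNA in this toy UTR mirrors the observation that a single 3′ UTR may carry several binding sites; scanning the same UTR with several miRNAs would give a first view of possible cooperative targeting.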

In animal cells, miRNA can suppress protein translation by both direct and indirect effects (Figure 3). The direct effects are exerted at various phases of protein translation, including the repression at the initiation (27–30), prevention of ribosome assembly for the initiation of translation (31), repression at postinitiation steps (32), and inhibition of elongation or termination of translation process (33, 34). (For basic knowledge of protein translation, see Reference 2, pp. 132–39.) The indirect translational repression by miRNA occurs at the mRNA level and is caused by the destabilization and subsequent degradation, as well as the compartmentalization and sequestration, of the target mRNAs. In this mode of action, miRNAs destabilize their target mRNAs mainly through deadenylation (35–37). The compartmentalization and sequestration rely on the cytoplasmic foci known as processing bodies (P bodies), where the repressed mRNAs are sequestered with enriched translational repressors (38, 39).

Figure 3

miRNA-mediated translational repression in animal cells. The miRISC-mRNA interaction can lead to several modes of direct translational repression (a–c). (a) Initiation block: The recruitment of 40S and/or 60S ribosomes near the 5′ cap of mRNA is inhibited. (b) Ribosomal drop-off: The 40S/60S ribosomes are dissociated from mRNA. (c) Stalled elongation: The 40S/60S ribosomes are prohibited from joining during the elongation process. (d) The indirect translational repression by miRISC occurs via deadenylation, by which the 3′ poly-A tail of the mRNA is removed, leading to increased mRNA degradation. Alternatively, the destabilized mRNA resulting from deadenylation is localized in P bodies and hence sequestered from translational machinery.

Processing bodies (P bodies): regions within the cytoplasm of the eukaryotic cell consisting of many enzymes involved in mRNA turnover, including mRNA degradation and sequestration

TF: transcription factor

2.3. Transcriptional Regulation of miRNA

Like all classes of RNA, miRNAs are transcribed from DNA. However, the mechanism underlying transcriptional regulation of miRNA had not been understood until recently. Key transcription factors (TFs), including hypoxia-inducible factor, the oncogene c-myc, the tumor suppressor p53, and NF-κB, have been shown to up- or downregulate clusters of miRNAs (40–43). The major challenge for the analyses of TFs is to delineate their exact binding sites in the miRNA promoter regions, which may range from a few kb to more than 50 kb upstream of the miRNA genes. As a result, only a few of the TF binding sites have been identified experimentally. In silico analysis of the miRNA promoter regions will be able to facilitate this identification (44). Because of the highly conserved nature of miRNAs across species, the binding of TFs to promoter regions of conserved candidate miRNAs can be assessed by TRANSFAC and Match. TRANSFAC is a database containing information on TFs, their experimentally proven binding sites, and their target genes (45). Match is a software program specifically designed for identifying TF binding sites in the promoter sequences of genes, including miRNA genes (46).

TRANSFAC: a database for transcription factors that provides information on their experimentally proven binding sites and their target genes

Match: a software program specifically designed for the identification of TF binding sites in the promoter sequences of genes

ES cells: embryonic stem cells

Gastrulation: a phase early in the development of animal embryos during which three embryonic germ layers (the endoderm, ectoderm, and mesoderm) are structured through cell migration

To date, TF regulation of miRNAs has been studied mainly in cancer cells. How various miRNAs are transcriptionally controlled under physiological and pathophysiological conditions and how other transcription regulators such as coactivators and corepressors may affect miRNA expression remain to be investigated. A given TF may regulate a cluster of miRNAs, which may in turn modulate other TFs as their target genes, thus forming genetic circuits. The hierarchical relationships of the TF regulation of miRNA and the miRNA regulation of TF are largely unknown. For bioengineers with expertise in computer-assisted design and analysis, elucidation of transcriptional regulation of miRNA can be a fruitful area of research.
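The in silico search for TF binding sites in miRNA promoter regions mentioned in this section is commonly framed as scanning a sequence with a position weight matrix. The sketch below shows that general idea only; it does not reproduce the Match algorithm or use TRANSFAC data. The aligned sites, the pseudocount, the uniform background, the threshold, and the promoter fragment are all hypothetical values chosen for illustration.

```python
import math

def build_pwm(aligned_sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length DNA sites."""
    width = len(aligned_sites[0])
    pwm = []
    for i in range(width):
        column = [site[i] for site in aligned_sites]
        total = len(column) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits

# Hypothetical aligned binding sites resembling an E-box-like motif.
sites = ["CACGTG"] * 9 + ["CATGTG"]
pwm = build_pwm(sites)

# Hypothetical promoter fragment; the threshold of 8.0 is arbitrary.
promoter = "GGGCACGTGAAACACGTTCC"
hits = scan(promoter, pwm, threshold=8.0)
hit_positions = [start for start, _ in hits]
```

With these toy inputs only the exact motif occurrence passes the threshold, while the one-mismatch window further downstream does not. In practice the threshold controls the trade-off between missed sites and false positives, and promoter regions spanning tens of kb make that choice important.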

3. miRNA IN HEALTH AND DISEASE

miRNA regulates mRNA, which encodes proteins that modulate cellular functions and fate. Therefore, miRNAs play important roles in physiological homeostasis in health and pathophysiological derangement in disease. Certain miRNAs are tissue-specific (e.g., miR-1 in muscle, miR-21 in the heart, and miR-122 in the liver), whereas others are broader in distribution. The temporal expression of the tissue-specific miRNAs correlates closely with the specific physiological or pathological status of the corresponding organs. In this section, we review the involvement of miRNAs in tissue development, cancer, cardiovascular diseases, and metabolism. The roles of miRNAs in health and disease of all organs and tissues are being extensively studied in all biomedical fields, and new knowledge is being developed on a daily basis.

3.1. miRNA Involvement in Cell and Tissue Development

Since the discovery of lin-4 and let-7 in C. elegans, there has been increasing evidence that miRNAs are engaged in vertebrate and invertebrate development, including proliferation and differentiation of embryonic stem (ES) cells, lineage commitment during embryogenesis, and maturation of multiple tissues.

In 2003, Bernstein et al. reported that disrupting the global miRNA biogenesis by ablation of Dicer in mice causes embryo death before gastrulation (47), providing the first demonstration of the essentiality of miRNA in early embryogenesis of mammals. Later studies showed that ES cells isolated from such mice have a slower proliferation rate and impaired differentiation, indicating the involvement of miRNAs in the self-renewal and pluripotency of ES cells (48). miRNA profiling has since been performed to reveal the functions of specific miRNAs in ES cells. For example, the miR-290 cluster is highly expressed in ES cells, and investigators have proposed that miR-290 suppresses the proteins that inhibit the expression of Oct4 (49). On the other hand, miR-21, which causes inhibition of Oct4 as a target, has a low expression in ES cells (50). The synergism of high expression of miR-290 and low expression of miR-21 in ES cells may account for the overall high level of Oct4, which is a TF necessary for the self-renewal of undifferentiated ES cells.

After lineage commitment, a highly coordinated gene-expression program directs the maturation of various tissues. Indeed, many studies have shown that miRNAs are regulators that fine-tune the maturation of nervous, muscle, adipose, and other tissues. miR-133b, highly expressed in dopaminergic neurons, regulates their maturation and function (51). miR-1 and miR-133 are present in skeletal muscle cells: miR-1 promotes their differentiation during myogenesis, and miR-133 enhances the proliferation of myoblasts (52). miR-143 is increased in adipocytes during adipogenesis. The inhibition of miR-143 effectively suppresses the differentiation process by a reduction in triglyceride accumulation and a decreased expression of adipocyte-specific genes (53). Detailed reviews of the involvement of miRNAs in the development of various tissues can be found in References 54 and 55.

miRNA signature: the specific miRNA expression profile in a tissue/organ under a certain physiological or pathological condition

EC: endothelial cell

3.2. miRNA and Cancer

Because miRNAs are directly involved in gene regulation, many of them have been implicated in cancer. In 2005, Lu et al. presented systematic miRNA profiling in multiple human cancer samples, showing that the changes in miRNAs correlate with developmental lineages and differentiation states of the cancers (56). This study demonstrates, for the first time, the potential of using a miRNA signature in cancer diagnosis. Since then, genome-wide profiling has been widely performed to reveal up- or downregulated miRNAs as biomarkers for specific types of cancer (see Reference 57 for review).

Studies on the role of miRNA in cancer development have identified the involvement of miRNAs in tumorigenesis as well as metastasis, the two major processes in cancer progression. The miR-17–92 cluster, which comprises the first identified oncogenic miRNAs in mammals, is a direct effector of the c-myc oncogene and hence named oncomiR-1 (58). In parallel, miR-10b is the first reported metastasis-promoting miRNA; it targets homeobox D10, an antimetastatic gene (59). Along with the discovery of these cancer-promoting miRNAs, many cancer-suppressing miRNAs have also been identified (60). For example, miR-15a and miR-16-1 have been shown to inhibit tumorigenesis through targeting of the Bcl2 oncogene (61). Meanwhile, the list of the metastasis-suppressing miRNAs, including miR-126, miR-335, and the miR-200 family, is also expanding (62–64). A comprehensive update of currently identified miRNAs and their implications in various forms of cancer is provided in the recent publication by Sotiropoulou et al. (65).

The important roles of miRNA in tumorigenesis and metastasis have led to the development of miRNA-based diagnostic and prognostic biomarkers as well as anticancer therapeutic agents. Results from several trials have provided evidence to support the use of antisense miRNA to suppress oncogenic miRNA for cancer therapy (66).

3.3. miRNA in the Cardiovascular System

There is ample evidence that miRNAs also play critical roles in cardiovascular homeostasis. Dicer knockdown in endothelial cells (ECs) alters the expression of genes affecting EC biology and reduces EC proliferation and angiogenesis in vitro (67). With the knockdown of both Dicer and Drosha in ECs, capillary sprouting and tube-forming activity are significantly reduced, which may be a consequence of decreased biogenesis of let-7f and miR-27b (68). The available data indicate that the global reduction of miRNAs through the knockdown of Dicer and/or Drosha significantly affects EC functions in vitro and in vivo, suggesting the important roles of miRNAs in regulating vascular functions. Cardiac-specific deletion of Dicer is postnatal lethal, causing significant changes of miRNA expression with consequential dysregulation of muscular and adhesion proteins, dilated cardiomyopathy in neonates (52), and massive cardiac remodeling such as spontaneous hypertrophy (69).

Several studies have identified the involvement of specific miRNAs in cardiovascular diseases. van Rooij et al. have shown that more than 12 miRNAs are up- or downregulated in the myocardium of mice following transverse aortic constriction and pathological cardiac remodeling (70). Muscle-specific miRNAs (e.g., miR-1 and miR-133) are particularly important in regulating cardiac functions. miR-133 knockout causes dilated cardiomyopathy and heart failure in mice (71). miRNAs also play critical roles in regulating oxidative stress in the cardiovascular system. It has been shown that miRNAs affect EC redox responses through the regulation of the transcription factor HBP1 and the consequent expression of p47(phox) encoding NAD(P)H oxidase (72). Cheng et al. have demonstrated that free radicals induce miR-21 expression, which leads to the protection of cardiomyocytes from oxidative stress (73). Increased miR-21 has also been found in balloon-injured rat carotid arteries (74). More recently, miR-221 and miR-222 have been reported to possess functions similar to those of miR-21 (75).

Antagomirs: chemically modified antisense oligonucleotides for silencing endogenous miRNA

These studies have demonstrated the vital roles of miRNAs in regulating cardiovascular gene expression and, as a consequence, cardiovascular functions under physiological and pathological conditions.

3.4. miRNA Involvement in Metabolism

miRNAs are also regulators of metabolism and energy homeostasis. They are important not only for the differentiation of tissues involved in energy production, utilization, and storage (e.g., liver, skeletal muscle, and adipocytes) but also for the regulation of insulin release and amino acid and lipid metabolism. Insulin secretion by pancreatic β cells in mice is decreased by overexpression of miR-375 and increased by inhibition of miR-375, indicating that this miRNA negatively regulates insulin release (76). In terms of amino acid metabolism, miR-29b targets the mRNA encoding branched-chain α-ketoacid dehydrogenase (BCKD) to prevent its translation in mammalian cells. BCKD catalyzes the first irreversible step in branched-chain amino acid (BCAA) catabolism (77). BCAAs (i.e., leucine, isoleucine, and valine) cannot be made de novo in mammals, and they are important in protein synthesis and nitrogen metabolism. Because leucine can stimulate insulin secretion (78), the targeting of BCKD translation by miR-29b indicates that this miRNA may also regulate insulin metabolism.

Regarding lipid metabolism, it has been found that intravenous administration of antagomirs (see Section 6.1) against miR-122 results in the degradation of miR-122 (the most abundant hepatic miRNA) in the mouse liver (79). Microarray analysis of liver isolated from these mouse models has shown the inhibition of several genes involved in cholesterol biosynthesis, including HMG-CoA reductase, the rate-limiting enzyme of cholesterol biosynthesis. Inhibition of miR-122 in male C57BL/6 mice causes a reduction in plasma cholesterol level, an increase in hepatic β-oxidation, and a decrease in the synthesis of fatty acid and cholesterol in the liver; these changes are in line with the idea that miR-122 downregulates energy storage. Moreover, miR-122 inhibition by antisense nucleotides in a diet-induced obesity mouse model leads to a decrease in plasma cholesterol level and an improvement of liver steatosis, with reduced expression of lipogenic genes. Liver extracts from these miR-122-inhibited mice show a 2.5-fold increase in the activity of adenosine monophosphate–activated protein kinase, which serves as an energy sensor by regulating multiple metabolic processes including energy storage, energy mobilization, and appetite (80). In addition, miR-103/107 loci are located in the introns of the genes that encode the pantothenate kinase, which is the rate-limiting enzyme in generating coenzyme A (CoA) (81). Through the regulation of pantothenate kinase, miR-103/107 may modulate cellular acetyl-CoA and lipid levels. Indeed, target prediction of miR-103/107 reveals that many of its target genes are rate-limiting enzymes in metabolic pathways (82).

Sequencing-by-synthesis (SBS): a high-throughput, large-scale, parallel nucleotide-sequencing technology. Base sequence is determined using fluorescence labeling in the process of nucleotide synthesis

Clonal amplicons: the colonies of DNAs formed as the products of artificial amplification events such as PCR in the process of SBS

Polonies: multiple colonies of DNA fragments immobilized on the surface of planar substrate or beads for SBS

4. METHODS FOR miRNA PROFILING

Several methods have been developed for detecting and identifying miRNA profiles. The earlier approaches are hybridization-based methods, such as Northern blotting and microarray. The recently developed synthesis-based methods of deep-sequencing technologies have enabled the quantitative measurement of multiple miRNA samples within a single setup.

4.1. Microarray-Based Screening

Earlier studies of miRNA profiling depended on Northern blotting, RT-PCR, and cloning (83), which are labor-intensive, time-consuming, and limited in the information that they can obtain about global miRNA expression patterns. The rapid increase in the identified miRNA sequences led to the use of microarray for miRNA profiling; this method can detect thousands of miRNAs and their precursors and analyze multiple samples within a single chip. As in DNA microarray, the sequence-specific probes immobilized on the chip are used to detect miRNA levels in the samples. A variety of commercial sources can provide miRNA profiling using the microarray platform containing the latest version of probes. For example, 866 human and 89 viral miRNA sequences are included in Agilent’s Human miRNA Microarray Version 3. In this platform, small RNAs isolated from cells or tissues are covalently labeled with fluorescent dyes and then hybridized to the immobilized probes on array chips. After the nonbinding miRNAs are washed off, the hybridized miRNAs are detected using a microarray scanner or scanning microscope to determine the fluorescent intensity on each probe spot, which represents the expression level of the corresponding miRNA in the cells (see References 84 and 85 for reviews).
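Once spot intensities have been read from the scanner, a common way to compare two samples is the log2 ratio of intensities per probe. The sketch below is a minimal illustration of that step only, with no background subtraction or normalization; the function name, the intensity floor, and all intensity values are hypothetical. Because hybridization signals depend on probe affinity, such ratios should be treated as relative indications rather than absolute measurements.

```python
import math

def log2_fold_changes(control, treated, floor=1.0):
    """Return log2(treated / control) for every probe present in both samples.

    Intensities below the floor are raised to it so that the ratio and the
    logarithm stay defined for very weak or empty spots.
    """
    changes = {}
    for name in control.keys() & treated.keys():
        c = max(control[name], floor)
        t = max(treated[name], floor)
        changes[name] = math.log2(t / c)
    return changes

# Hypothetical fluorescent intensities (arbitrary units) for three probes.
control = {"miR-21": 200.0, "miR-1": 800.0, "miR-122": 50.0}
treated = {"miR-21": 800.0, "miR-1": 400.0, "miR-122": 50.0}
changes = log2_fold_changes(control, treated)
```

In this toy comparison a positive value indicates higher signal in the treated sample, a negative value indicates lower signal, and zero indicates no change. A set of such values across many probes is one simple representation of a miRNA expression profile.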

With the use of small-scale, custom-made microarrays, early studies identiﬁed the miRNA expression proﬁles (i.e., miRNA signatures) during brain development (86). miRNA signatures of various developmental stages and disease progression have also been investigated using miRNA microarrays (87).

Although the miRNA microarray enables analyses of miRNA expression at the global scale with reasonable reproducibility and a medium-to-high capacity, it inherits the drawbacks of hybridization-based approaches in which the signal intensity is based on the affinity of hybridization between the miRNA and the probe. Therefore, the results are semiquantitative at best. In addition, the similarities in sequences shared by miRNAs and other small RNAs make it difficult to distinguish them from one another. Furthermore, because all probes immobilized on the microarray are previously known miRNAs, this method is not suited for discovery of novel miRNAs.

4.2. High-Throughput Sequencing-by-Synthesis Technology

Sanger sequencing: a DNA sequencing method developed by Frederick Sanger. The key principle is the use of dideoxynucleotide triphosphates (ddNTPs) as DNA chain terminators.

Several high-throughput sequencing technologies, regarded as “next-generation sequencing,” have been developed for genome-wide nucleotide sequencing (88, 89). In contrast to the analog nature of hybridization-based microarrays, these new methods are based on the principle of sequencing-by-synthesis (SBS), which generates data that are digital in nature. The principal method of SBS (Figure 4) is to sequence the clonal amplicons generated in vitro. The small RNA–derived complementary DNAs (cDNAs) collected from cells are ligated first with a short section of nucleotides (containing adaptors or tags), then immobilized on the surface of a planar substrate or micron-scale beads. The amplification involves either in situ multiple colonies (i.e., polonies) or PCR to generate clusters derived from any given single nucleotide molecule (90, 91). Because the nucleotide reads (sequences) determined by SBS are within the length range of pri-miRNA (30–300 nucleotides) and miRNA, SBS is highly suitable for miRNA profiling. Unlike the traditional Sanger sequencing of single nucleotide molecules, SBS does not require bacterial-based cloning and electrophoretic separation. The SBS-based miRNA profiling can generate much more extensive and comprehensive data than the microarray-based approach. Another major advantage of SBS for miRNA sequencing is its ability to discover and identify unknown miRNA sequences, which cannot be done with the microarray-based approach. For example, acute myeloid leukemia–associated miRNAs have been studied using both microarray and SBS technology (92, 93). Compared with the microarray study, SBS profiling has allowed the identification of not only a larger number of known miRNAs but also 55 novel miRNA species.
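The digital character of SBS data can be illustrated with a minimal sketch in Python. It assumes that reads have already been adaptor-trimmed and that a reference set of mature miRNA sequences is available as a dictionary; the function name `profile_reads`, the toy sequences, and the reads-per-million normalization are illustrative assumptions, not part of any platform's software.

```python
from collections import Counter


def profile_reads(reads, known_mirnas):
    """Count reads per known miRNA and collect sequences with no match.

    reads        -- iterable of adaptor-trimmed read sequences (strings)
    known_mirnas -- dict: mature sequence -> miRNA name (hypothetical reference)
    Returns (raw counts, reads-per-million values, unmatched read counts).
    """
    counts = Counter()
    unmatched = Counter()
    for read in reads:
        name = known_mirnas.get(read)
        if name is not None:
            counts[name] += 1      # one read = one digital count
        else:
            unmatched[read] += 1   # possible novel miRNA or other small RNA
    total = sum(counts.values()) + sum(unmatched.values())
    rpm = {name: n * 1_000_000 / total for name, n in counts.items()} if total else {}
    return counts, rpm, unmatched


if __name__ == "__main__":
    reference = {"AAA": "miR-demo-1", "CCC": "miR-demo-2"}  # toy sequences
    counts, rpm, unmatched = profile_reads(["AAA", "AAA", "CCC", "GGG"], reference)
    print(counts, rpm, unmatched)
```

Unlike a probe intensity, each count here is an integer tally of observed molecules, and the unmatched reads are retained as a starting point for the discovery of unknown sequences.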

Currently, the commercially available SBS-based platforms include the 454 GS FLX pyrosequencing-based instrument (Roche Applied Science), the Illumina Genome Analyzer (Illumina Inc.), the SOLiD System (Applied Biosystems), and the HeliScope Sequencer (Helicos BioSciences Corp.). The major technical features of these new platforms, compared with Sanger sequencing, are summarized in Table 1. Because the development of these technologies advances rapidly, readers are referred to the manufacturers’ Web sites for the most current upgrades. These high-throughput sequencing technologies for detecting nucleotide sequences are useful for comparative biology, genomics studies, and medical diagnostics (89, 94–96) as well as biomedical engineering.

Given the large quantity of output reads with short lengths, read accuracy becomes critical for mapping the acquired sequence to a reference genome and for detecting sequence polymorphisms. SBS-based technologies do not yet offer per-base confidence comparable to that of Sanger sequencing. To compensate for this shortcoming, researchers have used multiple overlapping reads for applications in which accuracy is paramount, such as detection of mutations or sequence polymorphisms (97). An increase in the accuracy of detecting individual bases will ultimately reduce the number of repetitive reads currently required for base assignment.
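How overlapping reads compensate for limited single-read accuracy can be sketched as a per-position majority vote. The following Python example is a simplified illustration under stated assumptions: reads are already aligned without gaps, and the thresholds `min_depth` and `min_fraction` are arbitrary illustrative values rather than settings of any published pipeline.

```python
from collections import Counter


def consensus_call(aligned_reads, min_depth=3, min_fraction=0.75):
    """Majority-vote base call per reference position.

    aligned_reads -- list of (start, sequence) pairs; start is the 0-based
                     reference offset of an ungapped alignment (assumption)
    Returns dict: position -> called base, or 'N' when coverage is too low
    or the reads disagree too much.
    """
    columns = {}
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            columns.setdefault(start + offset, Counter())[base] += 1

    calls = {}
    for pos in sorted(columns):
        depth = sum(columns[pos].values())
        base, votes = columns[pos].most_common(1)[0]
        confident = depth >= min_depth and votes / depth >= min_fraction
        calls[pos] = base if confident else "N"
    return calls


if __name__ == "__main__":
    reads = [(0, "ACGT"), (0, "ACGT"), (0, "ACTT"), (0, "ACGT")]
    print(consensus_call(reads))  # the single discordant read at position 2 is outvoted
```

As the accuracy of individual base calls improves, `min_depth` could be lowered, which corresponds to the reduction in repetitive reads anticipated in the text.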

5. BIOINFORMATICS ANALYSIS OF miRNAs

Because of the complexity of miRNA biogenesis and miRNA-target interactions, bioinformatics approaches are necessary for processing the large amount of data involved in miRNA registry, miRNA prediction, and miRNA-target prediction. To date, almost all miRNA studies require bioinformatics resources, such as the multitude of miRNA databases and a variety of computational tools and algorithms (Table 2). Many, if not all, of these bioinformatics resources are Web-based and freely available to the research community. The following sections introduce the commonly used databases for miRNAs and miRNA targets, as well as the algorithms and tools for identifying them. Please refer to Figure 5 for a summary of these tools.

5.1. Databases of miRNAs

Several miRNA databases have been developed to manage the experimentally determined and bioinformatics-computed miRNAs. The major miRNA databases, with Web links and remarks about their features, are listed in Table 2. It is beyond the scope of this review to cover all these databases; among them, miRBase, ASRP, and miRNAMap are the most widely used.

The miRBase database is the largest Web-accessible database for all discovered miRNAs and covers the widest range of species. It contains information about known miRNAs and secondary structures of their precursors, as well as annotations related to their discovery. Currently, miRBase contains 10,581 miRNA sequences belonging to 115 species (Release 14, September 2009). The miRBase Registry can assign official names for newly discovered miRNAs. ASRP (Arabidopsis thaliana Small RNA Project), which was developed especially for plant miRNAs, contains several plant species’ sequences of miRNA precursors, mature miRNAs, and their secondary structures. miRNAMap is a miRNA database for animals; it contains a genomic map of known miRNA genes, miRNA targets, and putative miRNA-target interactions. A unique feature of miRNAMap is that it contains the miRNA expression profiles found in a number of human normal/cancer tissues.

[Figure 4 graphic omitted. Panels: (a) Sample preparation: small RNA cDNA, tag/adaptor ligation. (b) Amplicon generation: bead binding, emulsion PCR, and bead deposition (454 or SOLiD); surface binding and bridge PCR cluster generation (Illumina/Solexa); surface binding (Helicos). (c) Massive parallel sequencing, with a fluorescent image per synthesis cycle: 454, pyrosequencing; SOLiD, supported oligoligation; Illumina, SBS with reversible terminator; Helicos, single-molecule sequencing.]

5.2. Databases of miRNA Targets

Several databases have been developed for collecting experimentally validated miRNA targets. By extracting information from the literature, the most recent version of TarBase has gathered 1333 entries of interactions between miRNAs and the targeted 3′UTR. miRecords collects the animal miRNA-target interactions that have been categorized as experimentally validated or computationally identified. Like TarBase, miRecords obtains the experimentally validated miRNA-target interactions from published literature (Table 2) (9, 23, 24, 98–121).

5.3. Algorithms and Tools for Identifying miRNAs

One of the major bioinformatics tasks in miRNA studies is the identification of miRNA genes. Almost all computational approaches for miRNA-gene prediction are based on biological features of miRNAs, e.g., the hairpin structure of miRNA precursors (see Figure 1) that is evolutionarily conserved among species, and the more stable conformations compared with other RNAs. Strategies used in the computational approaches to identify miRNAs include (a) performing a miRNA search based on RNA conformations, (b) performing a homology search for miRNAs with similar sequences, (c) identifying miRNAs in a new species through comparative genomics, (d) applying machine-learning approaches to miRNA feature descriptors, and (e) elucidating miRNA sequences by direct analysis of next-generation sequencing data. These methods are introduced in the following sections, and the prediction tools developed are listed in Table 2.

Figure 4

High-throughput SBS technologies for miRNA. (a) Sample preparation. The reverse-transcribed cDNA from small RNA (red or green curved lines) is ligated with adaptors or tags (blue or purple straight lines). The tags are used for fragment identification, whereas the adaptors serve as primers for subsequent PCR amplification or adherence onto the binding surface to form an array of immobilized polonies. (b) Amplicon generation (clonal amplification) for sequencing by various methods. The 454 and SOLiD platforms rely on emulsion PCR for amplification of polonies. A single adaptor-cDNA is tethered to the surface of each micron-scale bead. The adaptor-cDNA fragments are amplified in a water-in-oil emulsion. The amplicons generated by continuous PCR from the original, single-template DNA molecules are then enriched by bearing on the surface of the beads. The Illumina/Solexa technology depends on bridge PCR or cluster PCR to generate amplicons. The adaptor-flanked nucleotide fragment is attached on the surface of a solid substrate by a flexible linker. The surface is coated with the primers for bridge PCR. Along with the PCR process, the amplification products originating from any given template DNA remain locally tethered near the point of origin. As a result, each clonal cluster contains ~1000 copies of the template DNA molecule. The HeliScope platform (Helicos) utilizes single-molecule sequencing, which requires no clonal amplification. In the surface-binding process, template DNA molecules tailed with poly-A are captured by hybridization to surface-tethered poly-T oligomers. (c) Massive parallel sequencing. This part denotes the massive parallel sequencing-by-synthesis (SBS) process. In the 454 platform, bead-bearing amplicons are randomly deposited onto a microfabricated surface. DNA polymerase incorporates one of the four nucleotides (A, G, T, C) in each cycle. As a result of nucleotide coupling, an ATP analog and luciferin are used to generate the fluorescence to indicate the specific nucleotide incorporation. The SOLiD platform generates a disordered, dense array with the beads bearing the amplicons. Sequencing is performed with a ligase rather than a polymerase. Each sequencing cycle introduces a population of fluorescent octamers. The octamers are structured to correlate with the identity of the central two bases. After ligation and imaging in four channels, the labeled portion of the octamer is cleaved, leaving a free end for another cycle of ligation. The Illumina technology involves the generation of a dense array of amplicons on a planar surface. Each sequencing cycle includes the simultaneous addition of four modified nucleotide species, each bearing one of four fluorescent labels. DNA polymerase drives the extension of amplicon molecules. The fluorescence is imaged in four channels and then cleaved. The HeliScope platform generates a disordered array of primed, single-molecule sequencing templates. Each cycle consists of the polymerase-driven incorporation of a labeled nucleotide at a subset of templates, followed by fluorescence imaging of the full array and chemical cleavage of the label.

Table 1 Feature comparison of Sanger sequencing and major high-throughput sequencing technologies

| Technology | Representative | Manufacturer | Web site | Method | Amplification | Read length | Data size | Usual run time |
|---|---|---|---|---|---|---|---|---|
| Sanger | DNA sequencer with capillary electrophoresis ABI3730xl | Applied Biosystems | http://www.appliedbiosystems.com | Sanger dideoxy sequencing with dye terminators | In vitro cloning in bacterial hosts | 1 kb | 96 KB | 3 h |
| 454 | 454 GS FLX | Roche Applied Science | http://www.454.com | Polymerase-mediated, bead-support pyrosequencing | In situ emulsion PCR | 400–500 bp | 100 MB | 8 h |
| Illumina | Illumina Genome Analyzer | Illumina Inc. | http://www.illumina.com | Polymerase-mediated, reversible dye-terminator reactions on flat surface | In situ bridge PCR | 36 bp (75+ bp with pair-end technology) | 1–3 GB | 2 days (4 days for pair-end) |
| SOLiD | SOLiD 3 | Applied Biosystems | http://www.appliedbiosystems.com | Ligase-mediated ligation of labeled octamers | In situ emulsion PCR | 36 bp | 2–3 GB | 5 days |
| Helicos | HeliScope Single Molecule Sequencer | Helicos BioSciences Corp. | http://www.helicosbio.com | Polymerase-mediated, single-molecule sequencing | No amplification needed | 25–55 bp | 14–28 GB | 8 days |

Table 2 Bioinformatics resources for miRNAs

| Resource names | Web links | Reference | Remarks |
|---|---|---|---|
| Databases of miRNAs | | | |
| miRBase | http://www.mirbase.org/ | 9 | Comprehensive collection of known miRNAs for all species |
| ASRP | http://asrp.cgrb.oregonstate.edu/ | 99 | Collection of known miRNAs in plants |
| miRNAMap | http://miRNAMap.mbc.nctu.edu.tw/ | 104 | Collection of computationally identified miRNA-target interactions in metazoan genomes |
| miRGen | http://www.diana.pcbi.upenn.edu/miRGen.html | 114 | Identifies animal miRNA-target interactions using multiple target-prediction programs |
| CoGemiR | http://cogemir.tigem.it/ | 113 | Comparative genomics of miRNAs |
| Databases of miRNA targets (miRNA-target interactions) | | | |
| TarBase | http://diana.cslab.ece.ntua.gr/tarbase/ | 117 | Collection of experimental miRNA targets |
| miRecords | http://miRecords.biolead.org/ | 120 | Collection of experimental and computationally identified miRNA targets |
| Algorithms and tools for identifying miRNAs | | | |
| miRseeker | N/A | 110 | Based on sequence conservation and structural conformation (miRNA-like stem-loop structure) (for fly) |
| MiRscan | http://genes.mit.edu/mirscan/ | 112 | Based on sequence conservation and structural conformation (for worm) |
| ProMiR | http://cbit.snu.ac.kr/~ProMiR2/introduction.html | 115, 116 | Based on sequence conservation and structural conformation (for human and mouse) |
| miRDeep | http://www.mdc-berlin.de/en/research/research teams/systems biology of gene regulatory elements/projects/miRDeep/index.html | 101 | Identifies miRNAs using deep-sequencing technique (for worm) |
| miRanalyzer | http://web.bioinformatics.cicbiogune.es/microRNA/ | 102 | Identifies miRNAs using deep-sequencing technique (for worm, fly, and animals) |
| Algorithms and tools for identifying miRNA targets | | | |
| TargetScanS | http://www.targetscan.org/ | 23, 24, 111 | Seed match (SM), sequence complementarity (SC), and minimal free energy (MFE) of miRNA/target duplex; sequence preferences of target sites |
| miRanda | http://www.microrna.org/ | 100, 105 | SM, SC, and MFE |
| PicTar | http://www.pictar.org/ | 109 | Identifies SM, SC, MFE, and combinatorial miRNA-target interactions |
| RNAhybrid | http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/ | 118 | Measures SC, MFE, and statistical significance of miRNA-target interactions |
| PITA | http://genie.weizmann.ac.il/pubs/mir07/mir07 data.html | 107 | Considers SM, SC, MFE, and target-site accessibility |
| mirWIP | http://mirtargets.org/ | 103 | Considers SM, SC, MFE, statistical significance of miRNA-target interactions, and target-site accessibility |
| MirTarget2 | http://mirdb.org/miRDB/ | 119 | Based on machine-learning technique; a computational model was trained by a variety of features concerning miRNA-target interactions |
| DIANA-microT | http://www.diana.pcbi.upenn.edu/cgi-bin/micro t.cgi | 108 | SM, SC, MFE, and sequence preferences of target sites |
| miRcheck | http://web.wi.mit.edu/bartel/pub/software.html | 106 | Sequence complementarity, allowing gap in miRNA/target duplex (for plants) |
| miRU | http://bioinfo3.noble.org/miRNA/miRU.htm | 121 | Sequence complementarity, allowing gap in miRNA/target duplex (for plants) |
| findMiRNA | http://sundarlab.ucdavis.edu/mirna/ | 98 | Sequence complementarity; gap not allowed (for plants) |

Figure 5

Bioinformatics databases, algorithms, and tools commonly used in miRNA research. (a) miRBase, miRNAMap, ASRP, miRGen, and CoGemiR are databases for the registered, known miRNAs. (b) TarBase and miRecords are databases for verified miRNA targets. (c) From left to right: the miRNAs or their precursors, a schematic of miRNA-target interactions, and the miRNA/target duplex, with the nucleotide sequence diagram on top. (d) Methods for identifying miRNAs (e.g., miRseeker, MiRscan, ProMiR, and miRDeep) with the associated principles. (e) Methods for identifying or predicting miRNA-target interactions (e.g., TargetScan, miRanda, PicTar, PITA, MirTarget2, mirWIP, and RNAhybrid) with the associated principles.

[Figure 5 graphic omitted. Principles listed in panel d: RNA conformation, comparative genomics, machine learning, and deep sequencing. Principles listed in panel e: sequence complementarity, binding affinity, seed-region match, mRNA accessibility, and evolutionary conservation.]

5.3.1. miRNA search based on RNA conformations. This major category of computational algorithms and tools is based on the hairpin structures of pre-miRNAs (see Figure 1), which are distinct from those of mRNAs and other noncoding RNAs. The hairpin structure features a unique thermodynamic stability, and the mature miRNA is always located in the stem region of the precursor hairpin.

Several computational tools have been developed to assess whether a putative hairpin is a candidate for a miRNA precursor. Mfold (122) and RNAfold (123) can generate the hairpin structures of miRNA-precursor candidates with minimum free energy (MFE) of the structure. The MFE-RNA-structure prediction is a popular method because of its thermodynamic basis and its high accuracy. The first step of this method is to search the conserved regions in the whole genome, which are divided into segments of approximately 110 nucleotides each, a length sufficient to cover a pre-miRNA. Each segment is then folded by structure-prediction programs (e.g., Mfold or RNAfold) that score these hairpin-shaped stem-loops for potential miRNA candidates. Two computational algorithms, miRseeker and MiRscan, have been developed to identify animal miRNA genes. Whereas both programs analyze the stem-loop secondary structure, MiRscan can also identify miRNAs based on common characteristics, such as positional nucleotide preferences of previously known miRNAs.
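The segment-and-score workflow described above can be sketched in a few lines of Python. This is only a toy illustration: the function `stem_pairing_fraction` is a hypothetical stand-in that counts mirrored base pairs across a fixed-size loop, and it does not compute a minimum free energy as Mfold or RNAfold do. The window step and loop length are likewise assumed values.

```python
# Watson-Crick pairs plus the G-U wobble pair
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def windows(seq, size=110, step=10):
    """Yield (start, segment) for fixed-length segments sliding along seq.

    A sequence shorter than `size` yields a single, shorter segment.
    """
    for start in range(0, max(len(seq) - size, 0) + 1, step):
        yield start, seq[start:start + size]


def stem_pairing_fraction(segment, loop=8):
    """Toy hairpin score (NOT an MFE calculation).

    Treats the segment as 5' arm + loop + 3' arm and returns the fraction of
    5'-arm positions that can pair with the mirrored 3'-arm position.
    """
    seq = segment.upper().replace("T", "U")
    arm = (len(seq) - loop) // 2
    if arm <= 0:
        return 0.0
    paired = sum((seq[i], seq[-1 - i]) in PAIRS for i in range(arm))
    return paired / arm


if __name__ == "__main__":
    hairpin = "GGCAUA" + "AAAAAAAA" + "UAUGCC"   # perfectly paired toy stem
    print(stem_pairing_fraction(hairpin))        # high score
    print(stem_pairing_fraction("A" * 20))       # no pairing possible
```

In practice each window would be passed to a thermodynamic folding program, and the candidates would be ranked by the predicted free energy rather than by this simplified pairing fraction.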

5.3.2. Homology search for miRNAs with similar sequences. Homology search–based methods use sequence homology to identify miRNAs whose sequences are evolutionarily conserved among different species. This method can be classified as genome- or expressed sequence tag (EST)-based search. Based on known miRNA sequences in one species, computational homology search can identify novel miRNAs in other genomes (124). When the genomes of interest are not completely sequenced, EST-based homology search is an alternative way to identify miRNAs. Such an approach has been used to identify 338 plant miRNAs in 60 species via a Basic Local Alignment Search Tool (BLAST) search of the whole GenBank EST database (125). Although homology search is highly efficient in identifying homologous miRNAs, it cannot find totally novel miRNA sequences that have not yet been identified in any species.
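The idea of a homology search can be illustrated with a minimal ungapped scan that tolerates a few mismatches. This sketch is not BLAST and uses no heuristics or statistics; the function name `find_homologs` and the default mismatch limit are assumptions chosen for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_homologs(known_mirna, target, max_mismatches=2):
    """Scan a genomic or EST sequence for near-identical copies of a known miRNA.

    Both inputs are normalised to the DNA alphabet (U -> T).
    Returns a list of (position, matched_sequence, mismatches).
    """
    query = known_mirna.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    k = len(query)
    hits = []
    for i in range(len(target) - k + 1):
        site = target[i:i + k]
        d = hamming(query, site)
        if d <= max_mismatches:
            hits.append((i, site, d))
    return hits


if __name__ == "__main__":
    # toy 8-nt query against a toy EST fragment
    print(find_homologs("ACGUACGU", "TTTACGTACGATTT", max_mismatches=1))
```

Because the scan starts from a known sequence, it shares the limitation noted above: a miRNA without a previously identified relative cannot be recovered this way.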

5.3.3. Algorithms based on comparative genomics. Comparative genomics can be used to identify miRNA genes through the comparison of two completely sequenced genomes. Unlike the miRNA homology–search approach, this method analyzes hairpin structures generated from the whole genome. The miRNA precursors are identified among the hairpins based on the unique characteristics that are highly conserved among species. For example, Jones-Rhoades & Bartel developed a comparative-genomics approach to identify miRNAs conserved in both Arabidopsis and rice (106). This approach has allowed not only the confirmation of a majority of previously known plant miRNAs but also the identification of a number of novel miRNA genes. Using a similar computational program, Bonnet et al. identified 91 new miRNA genes in Arabidopsis and rice (126). Although these methods are widely used in computational identification of miRNA genes, they are limited by the lack of genomic sequences for a majority of animal and plant species.

5.3.4. Machine-learning approaches applied to miRNA feature descriptors. Machine-learning techniques are popular for miRNA identification. A variety of miRNA properties and features, including sequence composition, structural conformation, evolutionary conservation, and sequence preferences, are used to train a learning system. Using a probabilistic model learned from miRNA sequence and structure, ProMiR was the first Web server for miRNA discovery. Recently, Support Vector Machine (SVM) has become a popular machine-learning technique for solving classification problems in bioinformatics. Several tools have been developed to apply SVM to identify miRNAs (127–129).
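What a feature descriptor might look like can be sketched as follows. The example covers sequence composition only (length, GC content, and dinucleotide frequencies); structural and conservation features are omitted, and the classifier itself (for example, an SVM) is not shown. The function name `feature_vector` and the choice of features are illustrative assumptions, not the descriptors used by ProMiR or by the SVM-based tools cited above.

```python
from itertools import product

# Fixed feature order: AA, AC, AG, AU, CA, ... UU
DINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=2)]


def feature_vector(candidate):
    """Turn a candidate precursor sequence into a numeric feature vector.

    Returns [length, GC fraction, 16 dinucleotide frequencies].
    """
    seq = candidate.upper().replace("T", "U")
    n = len(seq)
    if n < 2:
        raise ValueError("candidate must contain at least two nucleotides")
    gc = (seq.count("G") + seq.count("C")) / n
    counts = {d: 0 for d in DINUCLEOTIDES}
    for i in range(n - 1):
        pair = seq[i:i + 2]
        if pair in counts:          # skip pairs containing N or other symbols
            counts[pair] += 1
    freqs = [counts[d] / (n - 1) for d in DINUCLEOTIDES]
    return [n, gc] + freqs


if __name__ == "__main__":
    print(feature_vector("GGCAUAAAAAAAAAUAUGCC"))
```

Vectors of this kind, computed for known precursors and for background hairpins, would form the positive and negative training sets of a learning system.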

5.3.5. Elucidating miRNA sequences by direct analysis of next-generation sequencing data. Next-generation sequencing techniques can be used not only to estimate the expression levels of known miRNAs but also to identify novel miRNAs. miRDeep uses a probabilistic model to identify novel miRNAs by aligning sequencing reads against genomic sequence to obtain the corresponding miRNA precursors (101). miRanalyzer, which contains a machine-learning method to identify novel miRNAs, was developed for comprehensive analysis of deep-sequencing reads of small RNAs.
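The first step of such an analysis, placing reads on the genome and excising the surrounding region as a candidate precursor, can be sketched in Python. This is a simplified illustration and not the miRDeep algorithm: it uses exact forward-strand matching only, and the flank length is an assumed parameter. Scoring the excised regions (for example, by folding and by a probabilistic model) is not shown.

```python
def candidate_precursors(reads, genome, flank=70):
    """Locate each distinct read in the genome and excise flanking sequence.

    Exact, forward-strand matching only (simplifying assumption).
    Returns a list of (read, position, excised_region).
    """
    genome = genome.upper()
    candidates = []
    for read in sorted(set(reads)):
        if not read:
            continue                      # ignore empty reads
        query = read.upper()
        pos = genome.find(query)
        while pos != -1:
            start = max(pos - flank, 0)
            end = min(pos + len(query) + flank, len(genome))
            candidates.append((query, pos, genome[start:end]))
            pos = genome.find(query, pos + 1)
    return candidates


if __name__ == "__main__":
    toy_genome = "TTTT" + "ACGTACGTAC" + "GGGG"
    print(candidate_precursors(["ACGTACGTAC"], toy_genome, flank=2))
```

Each excised region would then be evaluated as a potential hairpin, which is where reads from unannotated loci can reveal novel miRNAs.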

5.4. Algorithms and Tools for Identifying miRNA Targets

A given miRNA may have multiple target mRNAs, and the interaction may occur through select nucleotide regions of the miRNA (e.g., the seed region; see Figure 2). Without bioinformatics predictions, identification of miRNA targets would be a daunting task. Because different mechanisms are involved in miRNA-target interactions in animals versus plants (see Section 2.2), different bioinformatics methodologies are required for identifying miRNA targets in these two kingdoms of living systems. Plant miRNAs show a near-perfect complementarity with their target genes, so miRNA targets can be predicted simply by aligning miRNA sequences to the mRNA sequence database. Researchers have used this algorithm to develop Web-accessible tools that improve accuracy in plant miRNA-target predictions (see Table 2).
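For plants, the alignment-based prediction described above reduces to searching mRNAs for sites that are nearly complementary to the miRNA. The sketch below assumes an ungapped comparison and an arbitrary limit of three mismatches; tools such as miRcheck and miRU additionally allow gaps, which is not reproduced here. The miRNA and gene names in the usage example are arbitrary.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence (T is read as U)."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def plant_targets(mirna, mrnas, max_mismatches=3):
    """Find near-perfectly complementary sites for a miRNA in a set of mRNAs.

    mrnas -- dict: gene name -> mRNA sequence
    Returns a list of (gene, position, mismatches); ungapped comparison only.
    """
    site = reverse_complement(mirna)
    k = len(site)
    hits = []
    for name, seq in mrnas.items():
        seq = seq.upper().replace("T", "U")
        for i in range(len(seq) - k + 1):
            mismatches = sum(a != b for a, b in zip(site, seq[i:i + k]))
            if mismatches <= max_mismatches:
                hits.append((name, i, mismatches))
    return hits


if __name__ == "__main__":
    mirna = "UGACAGAAGAGAGUGAGCAC"          # arbitrary 20-nt example sequence
    mrnas = {
        "geneA": "AAAA" + reverse_complement(mirna) + "AAAA",
        "geneB": "A" * 40,
    }
    print(plant_targets(mirna, mrnas))
```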

In animal cells, the partial and imperfect base-pairing of miRNA and its target sites within the 3′UTR would cause multiple patterns of miRNA-target interactions. Thus, computational identification of miRNA targets is more intricate because of such complex miRNA-target interactions. For animal miRNA-target prediction, most computational approaches rely on a seed-region match algorithm, which identifies the high-complementarity region in the 3′UTR to score possible sites and enumerate putative gene targets. Other biological features such as binding affinities of the miRNA/target duplex, accessibility of target sites, and conservation of target sites are incorporated into the computational methods for identifying miRNA targets to increase accuracy. In the following sections, the three most commonly used software programs for predicting animal miRNA targets are introduced: TargetScan, miRanda, and PicTar.

5.4.1. TargetScan. TargetScan was the first program to use the concept of seed-region matches (seed matches). The algorithm scans a set of orthologous 3′UTR sequences in a group of organisms. In this method, seed matches are defined as small segments of seven nucleotides in the 3′UTR of mRNA with perfect complementarity to the bases in the 2–8 positions of the miRNA. TargetScanS, the upgraded version of TargetScan, relies exclusively on the seed matches of six nucleotides (i.e., the 2–7 positions) instead of the seven nucleotides used in TargetScan. More than 5300 human-coding genes were predicted by TargetScanS as possible targets of miRNAs (23), suggesting that a large fraction of human-coding genes appear to be regulated by miRNAs.
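The seed-match definition given above can be made concrete with a short Python sketch. It implements only the perfect-complementarity test for a chosen seed window and omits the conservation analysis across orthologous 3′UTRs that TargetScan performs; the function names and the toy 3′UTR are illustrative assumptions. The let-7a sequence is used as an example miRNA.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """mRNA sequence perfectly complementary to miRNA positions start..end.

    Positions are 1-based and inclusive, counted from the miRNA 5' end.
    """
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr, start=2, end=8):
    """Return 0-based positions of all perfect seed matches in a 3'UTR."""
    site = seed_site(mirna, start, end)
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"
    toy_utr = "AAACUACCUCAAA"                      # hypothetical 3'UTR fragment
    print(seed_site(let7a))                        # 7-nt site, positions 2-8
    print(seed_site(let7a, 2, 7))                  # 6-nt site, positions 2-7
    print(find_seed_matches(let7a, toy_utr))
```

Shortening the window from positions 2–8 to positions 2–7 makes the site one nucleotide shorter, so it can only match at least as many locations; this is one reason seed-only predictions are numerous and benefit from additional filters.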

5.4.2. miRanda. miRanda, which was initially designed for identifying miRNA targets in Drosophila, has been extended to identify miRNA targets in fish and mammals (human, rodents). For each miRNA, target genes are identified on the basis of three biological features: sequence complementarity of the miRNA/target duplex, free energy of the miRNA/target duplex, and conservation of target sites among different species. With the use of this software, ~2000 human genes conserved in mammals were identified as putative targets of 218 miRNAs (105).

5.4.3. PicTar. PicTar is a computational method for identifying genes that are common targets of sets of miRNAs. In this method, the input of a set of miRNAs and multiple sequence alignments of 3′UTRs are received, and the targets are ranked according to two criteria: One is their likelihood of being a common target of input miRNAs; the other is the probability of each 3′UTR being the miRNA-target site. Using the PicTar method, Krek and coworkers estimated that each vertebrate miRNA may target approximately 200 transcripts on average (109).

5.4.4. Principles of using algorithms and tools for identifying miRNA targets. In summary, most of the miRNA-target prediction tools have been developed by incorporating biological features in ways similar to those described for the above three methods. With high predictive sensitivity but low predictive specificity, these tools can generate a large number of putative miRNA-target interactions, but with many false positives. A rule of thumb is to select consensus targets identified in common by different tools, such as TargetScan, miRanda, and PicTar. As discussed, other tools are available for miRNA-target prediction (see Table 2); users may consider the unique features of these methods for specific applications. It is desirable to have high sensitivity as well as high specificity, but this is difficult because experimental approaches can validate only limited numbers of predicted positives and cannot assess true negatives. To develop more accurate tools, it is necessary to incorporate into the computer models new, defined biological features in the mechanism of miRNA gene silencing; this depends on an increase in experimental identification of miRNA-target interactions.
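The rule of thumb of selecting consensus targets amounts to a vote across tools. A minimal sketch, assuming that each tool's predictions have already been exported as a set of gene symbols (the gene names below are placeholders, and the function `consensus_targets` is hypothetical):

```python
from collections import Counter


def consensus_targets(predictions, min_tools=None):
    """Genes predicted by at least `min_tools` tools (default: all tools).

    predictions -- dict: tool name -> set of predicted target gene symbols
    Returns a sorted list of gene symbols.
    """
    if min_tools is None:
        min_tools = len(predictions)
    votes = Counter(gene for genes in predictions.values() for gene in set(genes))
    return sorted(gene for gene, n in votes.items() if n >= min_tools)


if __name__ == "__main__":
    predictions = {                      # placeholder gene symbols
        "TargetScan": {"GENE_A", "GENE_B", "GENE_C"},
        "miRanda": {"GENE_B", "GENE_C", "GENE_D"},
        "PicTar": {"GENE_C", "GENE_E"},
    }
    print(consensus_targets(predictions))               # predicted by all three
    print(consensus_targets(predictions, min_tools=2))  # predicted by at least two
```

Requiring agreement among all tools favors specificity at the cost of sensitivity; relaxing `min_tools` shifts the balance in the opposite direction.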

The number of false-positive predictions can be reduced if a computational tool searches for a set of genes responding to specific experimental conditions rather than for all genes in the genome. Using this principle, Tsai et al. (130) used a bioinformatics approach to identify target genes of miR-122 among the genes upregulated in hepatocellular carcinoma. Thirty-two target genes were experimentally verified from 45 computationally identified candidates. This improvement in the accuracy of miR-target identification is attributable to the increase in the probability of true positives by the selective conditions (i.e., the upregulated genes in hepatocellular carcinoma in association with miR-122 downregulation).
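The filtering principle and the resulting validation rate can be expressed as a small worked example. The gene identifiers are placeholders, and the helper functions are hypothetical; only the counts 32 and 45 come from the study cited above.

```python
def filter_by_condition(predicted_targets, responsive_genes):
    """Keep only predicted targets that also respond to the experimental condition."""
    return sorted(set(predicted_targets) & set(responsive_genes))


def validation_rate(verified, candidates):
    """Fraction of computational candidates confirmed experimentally."""
    if candidates <= 0:
        raise ValueError("number of candidates must be positive")
    return verified / candidates


if __name__ == "__main__":
    predicted = {"g1", "g2", "g3"}          # placeholder genome-wide predictions
    upregulated = {"g2", "g3", "g4"}        # placeholder condition-specific genes
    print(filter_by_condition(predicted, upregulated))
    # 32 of 45 candidates verified, i.e., roughly 71%
    print(f"{validation_rate(32, 45):.1%}")
```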

6. APPLICATIONS OF miRNA IN BIOENGINEERING

The above indicates that the study of miRNA biogenesis, regulation, and target genes has become a major research topic in many biological systems in health and disease. In view of the important roles of miRNA in the regulation of gene expression and hence tissue functions and phenotypes, investigations of miRNA offer many opportunities for scientists with bioengineering expertise. One bioengineering application is the use of concepts and techniques for gene targeting to achieve the inhibition of miRNAs in vitro and in vivo. In the complementary approach, the development of tools for the delivery of miRNAs to suppress the expression of target genes involved in pathogenesis is equally important. These concepts have been adopted for the development of drugs using miRNAs. Another area where bioengineers can easily integrate miRNA into their research is the use of miRNA inhibition and activation in tissue engineering and regenerative medicine. For bioengineers with bioinformatics expertise, further development of the integrated algorithms and methods for target prediction and function assessment will contribute importantly to the future advancement of this novel research ﬁeld.


Small interfering RNA (siRNA): a class of double-stranded RNA molecules, 20–25 nucleotides in length, interfering with the expression of a specific gene.

shRNA: small hairpin RNA.

6.1. Antagomirs: Antisense Inhibition of miRNA

Antagomirs, a group of modified anti-miRNA oligonucleotides, are currently the most readily available tools for miRNA inhibition. They have been applied successfully to inhibit specific endogenous miRNAs in cell cultures and mice. Antagomirs include oligonucleotides in which the 2′-OH residues of the ribose are modified with 2′-O-methyl or 2′-O-methoxyethyl groups, as well as locked nucleic acids. These modifications are designed to increase nuclease resistance, improve pharmacokinetic properties such as the half-life, and enhance cellular uptake. The therapeutic efficacy of the miR-122 antagomir has been examined and validated in the mouse model in vivo (79, 80). The long-lasting and nontoxic silencing generated by intravenous injection of 2′-O-methyl antagomirs complementary to miR-122 in mice causes the degradation of miR-122 in the liver and reductions of hepatic fatty-acid synthesis and cholesterol synthesis. Because of their superior affinity to target miRNA, high fidelity, low toxicity, and improved metabolic stability, locked nucleic acids have also been used in experimental therapies for cardiovascular diseases and cancers.

To enhance the delivery efficiency of antagomirs to target tissues, several techniques used for small interfering RNA (siRNA) delivery have been applied to conjugate or package the antagomirs. These include methods based on the uses of lipids (e.g., cholesterol or liposomes), peptides (e.g., TAT leading sequences), proteins (e.g., binding proteins or antibodies), viruses (e.g., retroviral and adenoviral vectors), hydrogel, and nanoparticles. Further developments in antagomir oligonucleotide design, packaging, and local delivery through the application of bioengineering principles and technologies will serve to enhance the therapeutic effectiveness of these antagomirs.

6.2. miRNA Sponges

Synthetic RNAs containing miRNA-targeted sites can serve as a “decoy” or “sponge” to compete with miRNA in binding to its target mRNA and thus inhibit miRNA functions. The concept of an miRNA sponge was reported first by Ebert et al. (131), who engineered the tandem repeats of the putative miRNA binding sites into the 3′UTR of green fluorescent protein or luciferase reporter genes. Their results demonstrated that the miRNA sponge effectively suppresses the expression of the reporter gene. Notably, Ebert et al. have shown that the miRNA sponge outperforms antagomir for most miRNAs tested and that the combination of sponges and antagomir exhibits a synergistic effect. Thereafter, Kumar et al. showed that let-7 sponges promote the growth of a mouse lung-cancer cell line (132). Care et al. infected neonatal murine cardiac myocytes with adenoviral vector encoding miR-133 sponges, causing suppression of miR-133 expression and marked hypertrophy (133). These studies indicate that miRNA sponges can effectively modulate the endogenous miRNA and their target functions.
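The tandem-repeat layout of a sponge insert can be sketched computationally. The example below is a design illustration under stated assumptions: it uses perfectly complementary sites purely for simplicity, and the number of repeats and the spacer sequence are arbitrary placeholders rather than the parameters of any published construct. The let-7a sequence serves as the example miRNA.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def binding_site(mirna):
    """Site perfectly complementary to the whole miRNA (simplifying assumption)."""
    return mirna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def sponge_insert(mirna, repeats=4, spacer="AAUU"):
    """Concatenate `repeats` binding sites separated by a short spacer.

    The result is the RNA sequence intended for a reporter 3'UTR; `repeats`
    and `spacer` are hypothetical design parameters.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return spacer.join([binding_site(mirna)] * repeats)


if __name__ == "__main__":
    let7a = "UGAGGUAGUAGGUUGUAUAGUU"
    print(binding_site(let7a))
    print(sponge_insert(let7a, repeats=3))
```

Parameters such as the number of sites, the spacer composition, and the degree of complementarity are exactly the kind of design variables that a bioengineering approach could optimize experimentally.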

Future directions for bioengineers interested in this area include designing and manufacturing miRNA-sponge expression systems, engineering miRNA sponges to be inducible and tissue-specific for therapeutic purposes, and developing and improving various delivery vehicles and tools. It is possible that the engineering of miRNA constructs may use some of the strategies established for other small RNAs. An example is the tunable RNA interference (RNAi) construct (134) with the use of two coupled repressor proteins: one controlling the small hairpin RNA (shRNA) gene expression and another controlling the target gene expression. The shRNA and target gene expressions can thus be controlled by adding inducers specific to their coupled repressors. With such multirepressor modules, the target gene can be temporally tuned in the presence or absence of shRNA. The various components in the construct are modular in nature, thus allowing regulation of a desired gene in tissue-specific and inducible manners. Although originally designed for shRNA targeting, such a strategy may be applicable for engineering an miRNA-based gene switch.
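The logic of such a two-inducer switch can be made concrete with a toy steady-state model in Python. This is a sketch under stated assumptions and not the published construct (134): the Hill-type response, the leak level, the maximal knockdown, and all function and parameter names are hypothetical and chosen only for illustration.

```python
# Toy steady-state model of a two-repressor tunable RNAi switch.
# All numbers and names are illustrative assumptions, not measured values.


def hill(x: float, k: float = 1.0, n: float = 2.0) -> float:
    """Fraction of repression relieved at inducer level x (x >= 0)."""
    return x**n / (k**n + x**n)


def switch_output(inducer_target: float, inducer_shrna: float,
                  leak: float = 0.02, max_knockdown: float = 0.9) -> float:
    """Relative target-gene expression (0 to 1) for two inducer levels.

    inducer_target relieves the repressor on the target gene;
    inducer_shrna relieves the repressor on the shRNA gene.
    """
    transcription = leak + (1.0 - leak) * hill(inducer_target)
    shrna_level = hill(inducer_shrna)
    return transcription * (1.0 - max_knockdown * shrna_level)


for target_ind, shrna_ind in [(0.0, 0.0), (100.0, 0.0), (100.0, 1.0), (100.0, 100.0)]:
    level = switch_output(target_ind, shrna_ind)
    print(f"target inducer={target_ind:6.1f}  shRNA inducer={shrna_ind:6.1f}  output={level:.3f}")
```

Sweeping the two inducer levels independently shows how the target gene can be held off, fully on, or at an intermediate level, which is the behavior the modular design is meant to provide.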


6.3. Bioinformatics Perspectives

Next-generation sequencing technology will lead to a rapid increase in the number of experimentally determined miRNAs. Given the lack of high-throughput methods to confirm miRNA-target interactions, it is crucial to improve the accuracy of computational tools for predicting such interactions and their associated biological functions. The inclusion of experimentally validated data will provide larger training sets for evaluating the performance of algorithms for miRNA-target prediction. An appropriate experimental design can employ a smaller gene group rather than all encoded mRNAs in a genome. The conventional thinking is that the miRNA-target regions are 3′UTRs. However, miRNAs can also target the coding regions, as exemplified by let-7 miRNA (135). The prediction of miRNA-target sites should therefore be expanded to include the coding sequences.
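As an illustration of extending a site search beyond the 3′UTR, the following Python sketch scans a whole transcript for matches to the miRNA seed and labels each hit by region. It is a deliberately simplified assumption: a site is defined only as the 7-nt sequence complementary to miRNA nucleotides 2 to 8, with no conservation, accessibility, or free-energy scoring as used by real prediction tools. The toy transcript and all function names are hypothetical.

```python
# Minimal sketch: seed-match scan over 5'UTR, CDS, and 3'UTR.
# Assumption: a "site" is an exact match to the reverse complement of
# miRNA nucleotides 2-8; real predictors use much richer criteria.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    return rna.upper().translate(COMPLEMENT)[::-1]


def seed_site(mirna: str) -> str:
    """Sequence on the mRNA that pairs with miRNA nucleotides 2-8."""
    return reverse_complement(mirna[1:8])


def find_sites(mirna: str, transcript: str, cds_start: int, cds_end: int):
    """Return (position, region) for every seed match in the transcript.

    cds_start and cds_end are 0-based, end-exclusive CDS coordinates.
    """
    site = seed_site(mirna)
    hits = []
    pos = transcript.find(site)
    while pos != -1:
        end = pos + len(site)
        if end <= cds_start:
            region = "5'UTR"
        elif pos >= cds_end:
            region = "3'UTR"
        elif pos >= cds_start and end <= cds_end:
            region = "CDS"
        else:
            region = "boundary"
        hits.append((pos, region))
        pos = transcript.find(site, pos + 1)
    return hits


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"   # mature let-7a
UTR5 = "GGGAAA"                    # toy sequences, not real genes
CDS = "AUGGCUACCUCGUAA"
UTR3 = "GGCUACCUCAAU"
transcript = UTR5 + CDS + UTR3

hits = find_sites(LET7A, transcript, len(UTR5), len(UTR5) + len(CDS))
print(hits)
```

A search restricted to the 3′UTR would report only the second hit in this toy transcript; scanning the full transcript also recovers the site in the coding region.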

There is also a need for a comprehensive understanding of the transcriptional regulation of miRNAs. Because most miRNAs are transcribed by RNA polymerase II, as is the case for mRNAs, many computational tools used for identifying promoters of protein-coding genes can be employed to predict miRNA promoters. Once the miRNA promoters have been identified, transcriptional regulation of miRNA can be deciphered by analyzing the cis-regulatory elements and their corresponding trans-factors.

Whereas siRNAs are widely used experimentally to knock down a specific gene exogenously, miRNAs provide endogenous mechanisms to repress the expression of multiple genes. Computational methods can be used to design an miRNA sequence that can be introduced exogenously to downregulate a set of genes for desired cellular consequences. With such a multitarget potential, miRNA modulation with exogenously introduced sequences may provide a more effective means to manipulate biological functions experimentally.
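One simple way to approach such a design problem is to look for a seed whose complementary site occurs in as many of the desired target transcripts as possible and in as few unwanted transcripts as possible. The Python sketch below illustrates this idea only; the scoring rule, the toy sequences, the gene labels, and the function names are hypothetical, and a real design would also need to consider the rest of the miRNA sequence and the cellular context.

```python
# Minimal sketch: rank candidate 7-nt sites by coverage of a target gene set.
# Assumptions: exact 7-mer matching, toy 3'UTR sequences, hypothetical names.
from collections import Counter

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    return rna.upper().translate(COMPLEMENT)[::-1]


def kmers(seq: str, k: int = 7) -> set:
    """All distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_sites(targets: dict, off_targets: dict, k: int = 7) -> list:
    """Return (site, n_targets_hit, n_off_targets_hit), best candidates first."""
    on, off = Counter(), Counter()
    for seq in targets.values():
        on.update(kmers(seq, k))      # each transcript counted once per site
    for seq in off_targets.values():
        off.update(kmers(seq, k))
    scored = [(site, on[site], off[site]) for site in on]
    # Prefer many on-target hits, then few off-target hits, then a stable order.
    scored.sort(key=lambda t: (-t[1], t[2], t[0]))
    return scored


targets = {                      # toy 3'UTR fragments of genes to repress
    "geneA": "AAGCUACCUCGA",
    "geneB": "UUCCUACCUCUG",
    "geneC": "GGACUACCUCAC",
}
off_targets = {"geneX": "GGGAAACCCUUU"}   # toy transcript to leave untouched

best_site, n_on, n_off = rank_sites(targets, off_targets)[0]
designed_seed = reverse_complement(best_site)   # miRNA nucleotides 2-8
print(best_site, n_on, n_off, designed_seed)
```

In this toy case the top-ranked site is shared by all three target fragments and absent from the off-target fragment, and the corresponding seed is the one that would be built into the exogenous miRNA.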

7. SUMMARY AND CONCLUSIONS

This review provides a summary of the state-of-the-art information on the rapidly developing new field of miRNA, which plays an important role in modulating virtually all biological processes (e.g., cell proliferation, development, differentiation, adhesion, migration, interaction, and apoptosis) through its fine-tuning of gene regulation.

Investigations on miRNA offer new and exciting opportunities for scientists with bioengineering expertise. Manipulation of miRNA activities can lead to an integrative understanding of the bioengineering basis of homeostatic regulation from molecular to systems levels and also to clinical applications in areas such as tissue engineering, regenerative medicine, drug discovery, and other therapeutic and diagnostic innovations. Bioengineers with computational and bioinformatics expertise can develop new ways to design miRNA sequences that enhance these applications and can improve the algorithms and analysis methods needed for the further advancement of this novel research field.

In conclusion, miRNA serves as a master regulator of cellular processes for bioengineering systems. It presents an excellent opportunity for the development of innovative biomedical engineering research.

DISCLOSURE STATEMENT

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.