Chapter 3 miRNA and Target Gene Prediction Using Tissue-Selective Motif
3.3.4 Predicted Novel miRNAs
Over 70% of the predicted frequent motifs do not match any seed regions of currently known human miRNAs. As these unmatched motifs may be potential target sites of unknown miRNAs, we check each of them to see whether it is located in any of the stem regions in the secondary structures of the miRNA candidates predicted by Pedersen et al. [104].
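As a rough illustration of the stem check described above, the following Python sketch tests whether a motif occurs at fully base-paired positions of a candidate hairpin. It assumes the predicted secondary structure is available in dot-bracket notation, which this chapter does not specify. The function name `stem_hits` and the toy hairpin are hypothetical and are not taken from the candidates of Pedersen et al. [104].

```python
def stem_hits(motif, sequence, structure):
    """Return 0-based start positions where `motif` occurs in `sequence`
    and every covered position is base-paired in `structure`.

    `structure` is assumed to be dot-bracket notation: '(' or ')' marks a
    paired (stem) base, '.' an unpaired (loop or bulge) base.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    hits = []
    start = sequence.find(motif)
    while start != -1:
        window = structure[start:start + len(motif)]
        if all(symbol in "()" for symbol in window):
            hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


# Hypothetical toy hairpin: an 11-nt 5' arm, a 4-nt loop, and an 11-nt 3' arm.
TOY_SEQUENCE = "GGCAUCCUACC" + "GAAA" + "GGUAGGAUGCC"
TOY_STRUCTURE = "(" * 11 + "." * 4 + ")" * 11

print(stem_hits("UAGGAUG", TOY_SEQUENCE, TOY_STRUCTURE))  # motif lies in the 3' arm
print(stem_hits("CCGAAAG", TOY_SEQUENCE, TOY_STRUCTURE))  # motif spans the loop
```

In this toy case the first motif is reported because all seven of its positions are paired, while the second is rejected because it overlaps the unpaired loop.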
Figure 10. Three cases of frequent motifs that match predicted secondary structures of mature miRNA candidates. (A) AAUCUUU and UCCUUGU. (B) AACUUUU and CUGGGCA. (C) UAGGAUG.
Because the locations of seed regions in the predicted secondary structures are unknown, we consider three possible binding situations. First, two different motifs match the two arms (5′ and 3′) of the stem of a miRNA gene candidate (Figure 10A). Second, two motifs map onto the same arm of the stem of a miRNA gene candidate but at two different locations (Figure 10B); the two predicted mature sequences are potential alternative forms of this candidate miRNA. Third, a motif matches a region of a known miRNA gene, but our predicted mature miRNA differs from the known mature sequence in miRBase. For example, in Figure 10C, the motif UAGGAUG matches a region in the mature sequence UUGCAUAUGUAGGAUGUCCCAU of hsa-miR-448, but our predicted mature sequence is ACAUCCUGCAUAGUGCUGCCAG; we call it an alternative mature form of hsa-miR-448. Tables 2 and 3 show 4 and 11 additional examples, respectively.
When G:U pairing is not allowed, 60 frequent motifs that do not match any known miRNA seed regions give rise to 48 mature miRNA candidates (Table 2). However, P-9-3p and P-9-5p are two novel alternative mature forms of hsa-miR-652; that is, these two sequences overlap with that of hsa-miR-652 but have different seed regions. Similarly, P-27-3p and P-27-5p are two novel alternative mature forms of hsa-miR-802 (Table 2). When one G:U pairing is allowed in the seed match, 116 frequent motifs give rise to 93 mature miRNA candidates (Table 3). P-62, P-63-1 and P-63-2, P-64-1 and P-64-2, P-65, P-66, P-67, P-68-1 and P-68-2, and P-69 are novel alternative mature forms of hsa-miR-544, hsa-miR-1264, hsa-miR-1298, hsa-miR-873, hsa-miR-376b, hsa-miR-381, hsa-miR-365, and hsa-miR-448, respectively (Table 3).
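The effect of allowing a G:U pair can be made concrete with a small Python sketch. It assumes that the seed is nucleotides 2 to 8 of the mature sequence and that it pairs antiparallel with the motif. That convention, and the names `count_gu_pairs` and `seed_matches`, are illustrative assumptions rather than the procedure used to build Tables 2 and 3. The two example sequences are P-1 (Table 2) and P-69 (Table 3).

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def count_gu_pairs(seed, motif):
    """Pair `seed` (5'->3') antiparallel with `motif` (5'->3').

    Returns the number of G:U wobble pairs, or None if the lengths differ
    or any position is neither a Watson-Crick nor a G:U pair.
    """
    if len(seed) != len(motif):
        return None
    gu_pairs = 0
    for seed_base, motif_base in zip(seed, reversed(motif)):
        pair = (seed_base, motif_base)
        if pair in WOBBLE:
            gu_pairs += 1
        elif pair not in WATSON_CRICK:
            return None
    return gu_pairs


def seed_matches(mature, motif, max_gu=0):
    """True if nucleotides 2-8 of `mature` (the assumed seed) pair with
    `motif` using at most `max_gu` G:U pairs."""
    seed = mature[1:1 + len(motif)]
    gu_pairs = count_gu_pairs(seed, motif)
    return gu_pairs is not None and gu_pairs <= max_gu


P1 = "AUUUGCAUAAUGGAUGC"        # P-1, motif AUGCAAA (Table 2)
P69 = "ACAUCCUGCAUAGUGCUGCCAG"  # P-69, motif UAGGAUG (Table 3)

print(seed_matches(P1, "AUGCAAA", max_gu=0))
print(seed_matches(P69, "UAGGAUG", max_gu=0))
print(seed_matches(P69, "UAGGAUG", max_gu=1))
```

Under these assumptions, P-1 pairs with its motif without any G:U pair, whereas P-69 pairs with UAGGAUG only when one G:U pair is permitted, which is consistent with their placement in Tables 2 and 3.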
Table 2. Predicted novel miRNAs with no G:U pairing in the seed match. *P-9 (3p and 5p), †P-11 (3p and 5p), and ‡P-27 (3p and 5p) are alternative mature forms of hsa-miR-652, cfa-miR-1839, and hsa-miR-802, respectively; however, no human version of cfa-miR-1839 is found in miRBase. The genomic coordinates of the predicted mature miRNAs and the IDs of the genes in which they are located are given in the last two columns.
Frequent motif | Tissue | Predicted miRNA | Mature sequence | Genomic coordinates (NCBI 36.1) | Gene located
AUGCAAA | artery/aorta, uterus | P-1 | AUUUGCAUAAUGGAUGC | chr2:205539433-205539449 | ENSG00000116117
GACAGAC | | | | |
AGCUGAA | peripheral nerve | P-3 | UUUCAGCUCAUAAAA | chr5:82856771-82856785 | ENSG00000038427
UAAUUGG | ovary | P-4 | UCCAAUUAAGUCUUUUAAAU | chr8:83086984-83087003 |
UUCAAAG | liver/hepato | | CCUUUGAAAAUAUAAAAUC | chr8:35236713-35236732 |
UUUCAAA | cerebrum, intestine, liver/hepato, pancreas | P-5 | CUUUGAAAAUAUAAAAUC | chr8:35236714-35236732 | ENSG00000156687
UCAAUUU | kidney | P-6-3p | UAAAUUGAGGUGGAUCCUGU | chr15:57977629-57977648 |
AAAUUGA | lymphnode | P-6-5p | CUCAAUUUAUUCCUAGAAACA | chr15:57977582-57977602 |
UAAAUUG | placenta | | UCAAUUUAUUCCUAGAAACAG | chr15:57977583-57977603 |
AAUUGAG | kidney | | UCUCAAUUUAUUCCUAG | chr15:57977581-57977597 |
GCAAUUU | peripheral nerve | P-7-3p | AAAUUGCCAUAAAGUG | chr9:80472606-80472621 |
AUGUUCA | artery/aorta | P-8 | AUGAACAUCUGAUUAUU | chr11:132066843-132066859 | ENSG00000183715
CCAUUCA | retina | P-9-3p | UUGAAUGGCGCCACUAGGGUU | chrX:109185270-109185290 | ENSG00000157600 (hsa-miR-652)
UUGUGCA | lymphnode | P-9-5p | CUGCACAACCCUAGGAGAGGG | chrX:109185228-109185248 | ENSG00000157600 (hsa-miR-652)
ACAAAUC | cerebrum | P-10 | UGAUUUGUUCAAGAUGAUGA | chr7:27256727-27256746 |
GGUCUUG | lung | P-11-3p | UCAAGACCUACUUAUCUACC | chr15:81221851-81221870 |
AUCUACC | testis | P-11-5p | GGUAGAUAGAACAGGUCUUG | chr15:81221817-81221836 |
AUGCAUU | lymphnode | P-12-3p | AAAUGCAUGAAAUAGAU | chr1:158032773-158032789 |
AUAAUGA | cerebrum | P-13-5p-1 | UCAUUAUAAAAUGUGAUAAUGU | chr15:51019585-51019606 |
AAUGUCA | placenta | P-13-3p | GUGACAUUAUGACAUUACAUU | chr15:51019643-51019663 |
CUAAUUA | retina | | UUAAUUAGCAAAAAGGCU | chr1:208493640-208493657 |
GCUAAUU | thymus, testis, | | | |
UGUUAAU | placenta | P-15-3p | AAUUAACAGAAUAUUAU | chr5:146055768-146055784 |
AUUAACA | | P-15-5p | UUGUUAAUCAAAAAACUAU | chr5:146055739-146055757 | ENSG00000156475
UGAAUAU | cerebrum, pineal gland, peripheral blood | P-16-5p | UAUAUUCACAUUUAUUGGAU | chr7:146609661-146609680 | ENSG00000174469
UCAAAAC | placenta, lung | P-17 | UGUUUUGAUAACAGUAAUGU | chr8:75780484-75780503 |
UAAUUUC | kidney | | UGAAAUUAUAUUACCAACA | chr10:128235007-128235025 |
UUGUAGU | lymphnode | P-19-1 | GACUACAACUCCCAAGGUA | chr1:153246074-153246092 |
GGAGUUG | vein | P-19-2 | ACAACUCCCAAGGUACAUACA | chr1:153246078-153246098 | ENSG00000160685
AUUAUCU | lymphnode | P-20-1 | UAGAUAAUUUGCACAUUAU | chr14:72223490-72223508 | ENSG00000205683
AUUACAG | | P-21 | CUGUAAUAUAAAUUUAAUUUAUU | chr4:126647864-126647886 |
UUUUAAC | cerebrum, prostate, lung | P-23 | GUAUAUUGUGACAUACAUGU | chr1:1463389-1463408 |
AAUUAAC | placenta | P-24-3p | GUGUUAAUUAAACCUCUAUUUAC | chr8:113724958-113724980 |
AUUUACA | placenta | | AUGUAAAUACAGAUUUAAUUAAC | chr8:113724904-113724926 |
UAUUUAC | cerebrum, placenta, kidney | | UGUAAAUACAGAUUUAAUUAACA | chr8:113724905-113724927 |
UUUACAU | | | | |
UCAUGUU | placenta | P-26 | CCAACAUGAUGCUAAUAAAU | chr17:73293262-73293281 |
AAUCUUU | | | | |
UCCUUGU | ovary, breast | P-27-3p | AACAAGGAGAAUCUUUGUCACU | chr21:36014934-36014955 | ENSG00000159216 (hsa-miR-802)
CUAAUAA | lymphnode, placenta | P-28 | UUUAUUAGUGCCAUAUAAUA | chr2:219696897-219696916 | ENSG00000187736
AUCAAUA | cerebrum, testis | P-29-3p | UUAUUGAUCAGCGUAGCAAACA | chr5:3508678-3508699 |
UCAAUAA | prostate, testis, adrenal gland | | CUUAUUGAUCAGCGUAGCAA | chr5:3508677-3508697 |
UUGAUCA | lymphnode | P-29-5p-1 | CUGAUCAAUAAUAAGAUUGAU | chr5:3508628-3508648 |
UUAUUGA | lymphnode, kidney | P-29-5p-2 | AUCAAUAAUAAGAUUGAUAC | chr5:3508631-3508650 |
UAAACUG | lymphnode | | CCAGUUUAUUUUGUAAAUAUA | chr1:60826128-60826148 |
AUAAACU | pancreas | P-30 | CAGUUUAUUUUGUAAAUAUA | chr1:60826129-60826148 |
AAAUAUG | cerebrum, lymphnode, pancreas | P-31 | UCAUAUUUUCUAUCUCUUUGCUU | chr4:39795271-39795293 | ENSG00000078177
UUCCUUA | placenta | P-32-1 | UUAAGGAAAUUAUGCUGAAC | chr4:21498378-21498397 | ENSG00000185774
AAUUUUC | lymphnode | P-33 | AGAAAAUUAGGUUGAUA | chr6:37530191-37530212 | ENSG00000137200
UUCAAAA | cerebrum | P-34 | CUUUUUGAGUUUUGAGGAAG | chr9:99359944-99359963 | ENSG00000136842
UUACAGC | placenta | P-35 | AGCUGUAAACAGCUCUCCA | chr17:69079331-69079349 |
GAAUAAU | cerebrum, eye, lymphnode, ovary, lung, kidney | P-36 | AUUAUUCUUUUUAUAAAA | chr2:144691036-144691053 | ENSG00000121964
UAUAAAC | liver/hepato, pancreas | P-37 | UGUUUAUAGUAAUGGGAGAUA | chr9:127702054-127702074 | ENSG00000167081
CAAAUAU | peripheral nerve, placenta, kidney | P-38 | AAUAUUUGGAAACAUCCA | chr9:13932445-13932467 |
Table 3. Predicted novel miRNAs with one G:U pairing in the seed match.
Frequent motif | Tissue | Predicted miRNA | Mature sequence | Genomic coordinates (NCBI 36.1) | Gene located
UCCAUCU | retina | P-6-3p | GAGGUGGAUCCUGUUCCAAUU | chr15:57977635-57977655 |
AAUUUGU | lymphnode | P-7-3p | UGCAAAUUGCCAUAAAGUG | chr9:80472603-80472621 |
UAAAGUG | prostate, testis | P-7-5p | CAUUUUAUGGCAAUUUGUU | chr9:80472560-80472583 |
UGAAAUA | | | | |
 | | P-13-5p-2 | AAAUGUGAUAAUGUCAUUGC | chr15:51019593-51019612 |
UUAACAG | placenta, pancreas | P-15-5p | CUUGUUAAUCAAAAAACUAU | chr5:146055738-146055757 | ENSG00000156475
UGGAUAU | cerebrum | | UAUAUUCAUGAAUAUAU | chr7:146609695-146609711 |
GAUAUAU | | | | |
UGAUAUA | kidney | P-18-2 | UUAUAUUACCAACAGAAAU | chr10:128235012-128235030 |
AGAUUAU | lymphnode | P-20-2 | GAUAAUUUGCACAUUAU | chr14:72223492-72223508 | ENSG00000205683
UGAUUUC | bone, lung | P-32-2 | GGAAAUUAUGCUGAACUCAUUU | chr4:21498382-21498403 | ENSG00000185774
UAAUGUU | artery/aorta | P-39 | UGACAUUAGUUCAUUU | chr2:56002041-56002056 | ENSG00000115380
UUUUAAG | lymphnode, skin, placenta, lung, pancreas | P-40 | GCUUAGAAAAGUGACCUAGA | chr2:77350845-77350864 | ENSG00000176204
GCUGUUU | lung | P-41 | AAAGCAGCGUGAAGAUGC | chr2:105075146-105075163 | ENSG00000135972
UGCAUAU | peripheral nerve, placenta, pancreas | P-42 | UAUAUGUAGAUGUAGCUAUAU | chr2:192552553-192552573 | ENSG00000144339
AGAUUAU | lymphnode | P-43 | AUAAUUUAUUAGAACAAUUAG | chr3:60801997-60802017 | ENSG00000189283
AGUAAUU | lymphnode | | UAAUUAUUUUUCUCCAUC | chr3:62045270-62045287 |
GUAAUUA | lymphnode, kidney, pancreas | P-44 | UUAAUUAUUUUUCUCCAUC | chr3:62045269-62045287 | ENSG00000144724
UUCAAUU | lymphnode | P-45 | UAAUUGGAAAUUUCAUUU | chr4:146917406-146917423 | ENSG00000151612
GCACAUU | lung | P-46 | UAAUGUGUAAUGCUGUAGUUU | chr4:153150954-153150974 |
GUCUAUA | lymphnode, lung | P-47 | AUGUAGACAAAACAUCCAGAUAA | chr5:15476433-15476455 |
UGUCUAU | lung | | UGUAGACAAAACAUCCAGAU | chr5:15476434-15476453 |
 | | P-48 | AUUAUUUUAGUAAUUCAACAG | chr5:36750623-36750643 |
AAUAACU | | | | |
 | placenta | | UAACUUUUAAUGUAAGCCUGG | chr6:50426012-50426032 |
UAAGAGU | placenta, pancreas | P-50 | AACUUUUAAUGUAAGCC | chr6:50426013-50426029 |
GCAAAAU | cerebrum, retina, placenta, prostate | P-51 | UGUUUUGCCAGCAUGUGGUUG | chr9:72221811-72221831 |
UAAGUGA | cerebrum | | UUCAUUUAAAAUUAGGC | chr10:13522803-13522819 |
UUAAGUG | lymphnode, placenta, lung | | UCAUUUAAAAUUAGGC | chr10:13522804-13522819 |
UUUAAGU | lymphnode, placenta, kidney | | UCAUUUAAAAUUAGGC | chr10:13522803-13522819 |
AAGUGAA | cerebrum | P-52 | AUUCAUUUAAAAUUAGGC | chr10:13522801-13522819 | ENSG00000165626
UUAAGAU | placenta, lung, pancreas | P-53 | CAUCUUGAAAUAAGUCCUCA | chr11:122110545-122110564 | ENSG00000154127
UUUAAGA | lymphnode, uterus, prostate, testis, lung, kidney, pancreas | | AUCUUGAAAUAAGUCCUCAU | chr11:122110546-122110565 |
UUUUAAG | lymphnode, skin, placenta, lung, pancreas | | UCUUGAAAUAAGUCCUCAUC | chr11:122110547-122110566 |
UUGAUUA | lymphnode, lung, pancreas | P-54 | AUAAUCAGAAACACUAAUCA | chr13:66389595-66389614 | ENSG00000184226
UAAAUGU | lymphnode, placenta | P-55 | UAUAUUUAACAUACACUUG | chr13:104638844-104638862 |
AUUAGUU | lymphnode, kidney | P-56 | GAAUUAAUGGUAUUAA | chr14:56071430-56071448 |
UAGUUUA | placenta | P-57 | UAAAUUAAAAUCAAUAUUUU | chr16:7601200-7601219 | ENSG00000078328
GAAUUUC | skin, placenta | | GGGAAUUCCCACUCUGCAG | chr17:9233266-9233289 |
GGAAUUU | kidney | P-58 | GGAAUUCCCACUCUGCA | chr17:9233267-9233288 | ENSG00000170310
AUUCUGA | kidney | P-59 | UUUAGAAUUCUAAUUA | chr18:26697252-26697267 |
GCAAAUU | lymphnode | P-60 | UGAUUUGCAUUUUAGU | chr18:40000232-40000247 |
AUGAAGU | uterus, lung | P-61 | UAUUUCAUUUUAAUCUUGA | chr19:35550032-35550050 |
UAGCAAG | prostate | | CUUGUUAAAAAGCAGAUUCU | chr14:100584767-100584786 |
UUUUAGC | lung | P-62 | UGUUAAAAAGCAGAUUCUGA | chr14:100584769-100584788 | hsa-miR-544
UUGUUGA | kidney | P-63-1 | CUCAAUAAGUAUUUGUUGA | chrX:113793391-113793409 | ENSG00000147246 (hsa-miR-1264)
UCAAUAA | prostate, adrenal gland | P-63-2 | UUUGUUGAAAGAAUAAAUAA | chrX:113793402-113793421 | ENSG00000147246 (hsa-miR-1264)
AACUUUU | cerebrum, lymphnode, prostate, intestine, lung, kidney | P-64-1 | AAGAGUUCAUUCGGCUGUCCAG | chrX:113855918-113855939 | ENSG00000147246 (hsa-miR-1298)
CUGGGCA | thymus | P-64-2 | CUGUCCAGAUGUAUCCAAGU | chrX:113855932-113855951 | ENSG00000147246 (hsa-miR-1298)
CUCCUGU | vein | | AAUAGGAGACUCACAAGUUCCUG | chr9:28878920-28878942 |
UCCUGUU | eye | | CAAUAGGAGACUCACAAGUU | chr9:28878919-28878938 |
UCUCCUG | breast | P-65 | AUAGGAGACUCACAAGUUCCUG | chr9:28878921-28878942 | hsa-miR-873
CAUGUUU | lymphnode, placenta | P-66 | AAAACGUGGAUAUUCCUUCUAUG | chr14:100576545-100576567 | hsa-miR-376b
GUAUCAA | pancreas | | UUUGGUACUUAAAGCGAGG | chr14:100582005-100582023 |
UAUCAAA | testis | P-67 | GUUUGGUACUUAAAGCGAGG | chr14:100582004-100582023 | hsa-miR-381
CCUUAUU | lymphnode | P-68-1 | AAAUGAGGGACUUUUGGGGGCA | chr16:14310653-14310674 | hsa-miR-365
CCUAAAA | uterus | P-68-2 | CUUUUGGGGGCAGAUGUG | chr16:14310663-14310680 | hsa-miR-365
UAGGAUG | bone | P-69 | ACAUCCUGCAUAGUGCUGCCAG | chrX:113964286-113964307 | ENSG00000147246 (hsa-miR-448)
3.4 Experimental validation of predicted novel miRNAs and their