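National Science Council, Executive Yuan: Research Project Final Report
Discovering Network Motifs in Biological Systems by "Behavior Data Mining"
Project category: Individual project
Project number: NSC93-2213-E-110-029-
Project period: August 1, 2004 to July 31, 2005
Institution: Department of Information Management, National Sun Yat-sen University
Principal investigator: 鄭炳強
Co-principal investigator: 鄒文雄
Project participants: 王修齊, 侯雍聰
Report type: Condensed report
Handling: This report is open to public inquiry
October 26, 2005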
Discovering network motifs by behavior data mining
Abstract
In the post-genome era, scientists recognize that the study of biological systems must move from the molecular level to the network level. Past research shows that certain subnetworks, called network motifs, recur frequently in the biological networks of Saccharomyces cerevisiae and E. coli. It is conjectured that they have been conserved through evolution because they perform specific functions that are useful to the system. Current approaches to discovering network motifs all assume prior knowledge of the links between molecular nodes and justify a motif's existence by statistical techniques. Since a motif has its own dynamic behavior model, it would be interesting to discover motifs directly from existing experimental behavior data using data mining techniques. We made this attempt in this project and discuss what we found and the problems we encountered.
Keywords: Network Motif, Behavior Data Mining, Sequence Alignment
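Network motifs [3] are patterns of interconnection that recur in a real network far more often than in a randomized network, as shown by Milo and colleagues [3, 5].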
Lee et al. [4] identified six types of network motif in the yeast regulatory network: 1. Autoregulatory, 2. Feed-forward loop, 3. Multi-component loop, 4. Single input module, 5. Multi-input module, and 6. Regulator cascade (Figure 1).
1. Feed-forward
loop(FFL)
module
(threshold)
FFL
[3]
(
)
Figure 1. Examples of network motifs in the yeast regulatory network (Lee, T., et al., 2002 [4])
Alon
FFL [13, 14]
(model)
Alon
DNA
Microarray[12]
mRNA
Microarray
[10, 11]
-
[2, 13, 14]
[6, 7, 8, 9]
g
13
7
mRNA
g
24
8
g
1g
2time series
[16]
$T = \{X_1, X_2, \ldots, X_n\}$
(real number)
X
ii
T
jj
mRNA
10
n = 10
$T_1 = \{X_1, X_2, \ldots, X_n\}$, $T_2 = \{Y_1, Y_2, \ldots, Y_n\}$
$f(T_1, T_2) = [(X_1 - Y_1)^2 + (X_2 - Y_2)^2 + \cdots + (X_n - Y_n)^2]^{1/2}$
$f(T_1, T_2)$ is the Euclidean Distance between $T_1$ and $T_2$.
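The Euclidean distance between two equal-length time series can be sketched in a few lines of Python. This is a minimal illustration only: the function name `euclidean_distance` and the length check are assumptions made for the example and do not come from the report.

```python
import math


def euclidean_distance(t1, t2):
    """Euclidean distance between two equal-length numeric sequences.

    Hypothetical helper for illustration; not taken from the report.
    """
    if len(t1) != len(t2):
        raise ValueError("time series must have the same length")
    # Sum the squared point-wise differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(t1, t2)))
```

A distance of zero means the two series are identical point by point; larger values mean the expression levels differ more in absolute terms.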
Microarray
mRNA
[1]
metric
Microarray
$X_i$, $Y_i$
Liping Ji [11]: 1 (up-regulation), -1 (down-regulation), 0 (no change)
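Let the expression data form an $m \times n$ matrix $T$, where $T_{ij}$ is the mRNA expression value of gene $i$ at time point $j$, with $1 \le i \le m$ and $0 \le j \le n$. The change from $T_{ij}$ to $T_{i(j+1)}$ is written $T'_{ij}$ and is computed as follows:
(1) If $T_{ij} \neq 0$, then $T'_{ij} = (T_{i(j+1)} - T_{ij}) / |T_{ij}|$.
(2) If $T_{ij} = 0$ and $T_{i(j+1)} > 0$, then $T'_{ij} = 1$.
(3) If $T_{ij} = 0$ and $T_{i(j+1)} < 0$, then $T'_{ij} = -1$.
(4) If $T_{ij} = 0$ and $T_{i(j+1)} = 0$, then $T'_{ij} = 0$.
Given a Normalization Threshold ($t$), $T'_{ij}$ is then converted to a discrete value $T''_{ij}$:
(1) If $T'_{ij} \ge t$, then $T''_{ij} = 1$.
(2) If $T'_{ij} \le -t$, then $T''_{ij} = -1$.
(3) Otherwise, $T''_{ij} = 0$.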
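The two-step conversion of an expression series into the symbols 1, -1 and 0 can be sketched as follows. The sketch assumes the scheme of [11]: the change between consecutive time points is divided by the absolute value of the earlier point, a zero starting value is handled by the sign of the next value, and the result is compared against a threshold `t` with inclusive bounds. The function names, the use of the absolute value, and the inclusive comparisons are assumptions for illustration.

```python
def relative_changes(series):
    """Relative change between each pair of consecutive time points.

    Hypothetical helper; assumes division by |current value| and the
    special cases for a zero current value described in the text.
    """
    changes = []
    for cur, nxt in zip(series, series[1:]):
        if cur != 0:
            changes.append((nxt - cur) / abs(cur))
        elif nxt > 0:
            changes.append(1.0)
        elif nxt < 0:
            changes.append(-1.0)
        else:
            changes.append(0.0)
    return changes


def discretize(changes, t):
    """Map each change to 1 (up), -1 (down) or 0 using threshold t."""
    symbols = []
    for c in changes:
        if c >= t:
            symbols.append(1)
        elif c <= -t:
            symbols.append(-1)
        else:
            symbols.append(0)
    return symbols
```

Note that a series of n values yields n - 1 symbols, since each symbol describes a transition between two neighbouring time points.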
2(pattern)
metric
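Hamming Distance: given two strings $s$ and $t$ with $|s| = |t| = n$, the Hamming Distance between $s$ and $t$ is
$H(s, t) = \sum_{i=1}^{n} \mathrm{mismatch}(s_i, t_i)$
where $\mathrm{mismatch}(s_i, t_i) = 1$ if $s_i \neq t_i$, and $\mathrm{mismatch}(s_i, t_i) = 0$ if $s_i = t_i$.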
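As a small illustration, the Hamming distance between two equal-length symbol strings counts the positions at which they differ. The function name `hamming_distance` below is an assumption made for the example.

```python
def hamming_distance(s, t):
    """Number of positions at which two equal-length sequences differ.

    Hypothetical helper for illustration; not taken from the report.
    """
    if len(s) != len(t):
        raise ValueError("sequences must have the same length")
    # Each mismatching position contributes 1, each match contributes 0.
    return sum(1 for a, b in zip(s, t) if a != b)
```

The same function works on lists of discretized values such as `[0, 1, 1, -1]`, since it only relies on element-wise comparison.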
____________________________________ 1 2 : ababccglobal
$H(s_1, s_2)$
(local)
H
abababaaaabababa
bcbcbcaaaababab
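Let $T$ be a time series of length $n$, $p$ a starting position, and $w$ the window width, with $w < n$ and $1 \le p \le n - w + 1$. The window at position $p$ is
$C_p = \{X_p, X_{p+1}, \ldots, X_{p+w-1}\}$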
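A sliding window of width w over a series of length n produces n - w + 1 overlapping subsequences, one per starting position. The sketch below is a minimal illustration; the function name `sliding_windows` is an assumption, and Python's 0-based slicing stands in for the 1-based position p used in the text.

```python
def sliding_windows(series, w):
    """All contiguous windows of width w, in order of starting position.

    Hypothetical helper; index p in the text corresponds to p - 1 here.
    """
    n = len(series)
    if not 0 < w < n:
        raise ValueError("window width must satisfy 0 < w < n")
    return [series[p:p + w] for p in range(n - w + 1)]
```

For example, the string `aabbcc` with w = 4 yields the three windows `aabb`, `abbc` and `bbcc`.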
p
w-1
T
aabb bbcc
aabbcc
3
$T_i : \{-1, 0, 1\}$
q
3
qq-cluster[11]
:(0 0)
(0 1)
(1 0)
(1 1)
2
H = 0
q
q-clusters
FFL
[14]
FFL
X
Y
Y
Z
X
)
Y
Z
X
Y Z
q-cluster:
0 1 1 1 1: (3, 2) (5, 3) (7, 4)
0 -1 -1 -1 -1: (3, 7) (5, 8) (7, 8)
3
5
7
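One way to picture how windows sharing the same discretized pattern end up in the same q-cluster is to group every length-q window by its pattern. The sketch below is only an illustration under stated assumptions: it reads each pair listed with a pattern as (gene, window start position), which the surviving text does not confirm, and the function name `group_windows_by_pattern` and the input layout (a dict from gene id to its list of symbols) are hypothetical.

```python
from collections import defaultdict


def group_windows_by_pattern(discretized, q):
    """Map each length-q pattern to the (gene, start) pairs where it occurs.

    discretized: dict of gene id -> list of symbols in {-1, 0, 1}.
    Start positions are reported 1-based (an assumption for illustration).
    """
    clusters = defaultdict(list)
    for gene, symbols in discretized.items():
        for start in range(len(symbols) - q + 1):
            pattern = tuple(symbols[start:start + q])
            clusters[pattern].append((gene, start + 1))
    return dict(clusters)
```

Genes that show the same pattern at different start positions then appear in one group, which is how a time lag between a regulator and its target would show up in this representation.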
Figure 3. (sliding window, FFL)
5
false-positive
Microarray array
[13, 14]
Microarray mRNA
[12]
(FFL-Motif) [13, 14]
Microarray
(
mRNA
false-positive
[12]
Microarray
10
[12]
mRNA
(post-translational modification)
Microarray mRNA
[13, 14]
[15]
Microarray
(
:
-DNA
-knockout
)
Microarray
[1] Patrick O. Brown and David Botstein, "Exploring the new world of the genome with DNA microarrays." Nature Genet. 21, 33-37 (1999).
[2] Ptashne M., "Regulation of transcription: from lambda to eukaryotes." Trends Biochem. Sci. 30, 275-279 (2005).
[3] Milo R. et al., "Network Motifs: Simple Building Blocks of Complex Networks." Science 298, 824-827 (2002).
[4] Tong Ihn Lee, Nicola J. Rinaldi, Francois Robert, Duncan T. Odom, Ziv Bar-Joseph, Georg K. Gerber, Nancy M. Hannett, Christopher T. Harbison, Craig M. Thompson, Itamar Simon, Julia Zeitlinger, Ezra G. Jennings, Heather L. Murray, D. Benjamin Gordon, Bing Ren, John J. Wyrick, Jean-Bosco Tagne, Thomas L. Volkert, Ernest Fraenkel, David K. Gifford, and Richard A. Young, "Transcriptional Regulatory Networks in Saccharomyces cerevisiae." Science 298, 799-804 (2002).
[5] Shen-Orr S.S., Milo R., Mangan S. & Alon U., "Network motifs in the transcriptional regulation network of Escherichia coli." Nature Genet. 31, 64-68 (2002).
[6] Pilpel Y., Sudarsanam P., Church G.M., "Identifying regulatory networks by combinatorial analysis of promoter elements." Nature Genet. 29, 153-159 (2001).
[7] Bailey, Timothy L. and Noble, William Stafford, "Searching for statistically significant regulatory modules." Bioinformatics 19, 16-25 (2003).
[8] Butte A.J., Tamayo P., Slonim D., Golub T.R., Kohane I.S., "Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks." Proc. Natl. Acad. Sci. 22, 12182-6 (2000).
[9] Eisen, M.B., Spellman, P.T., Brown, P.O. and Botstein, D., "Cluster Analysis and Display of Genome-wide expression patterns." Proc. Natl. Acad. Sci. 16, 707-726 (1998).
[10] Kwon, Andrew T., Hoos, Holger H., and Ng, Raymond, "Inference of transcriptional regulation relationships from gene expression data." Bioinformatics 19, 905-912 (2003).
[11] Ji Liping and Tan Kian-Lee, "Identifying Time-Lagged Gene Clusters on Gene Expression Data." Bioinformatics 21, 509-516 (2005).
[12] Spellman, P., Sherlock, G., Zhang, M., Iyer, V., Anders, K., Eisen, M., Brown, P., Botstein, D. and Futcher, B., "Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization." Mol. Biol. Cell 9, 3273-3297 (1998).
[13] Mangan S. and Alon U., "Structure and function of the feed-forward loop network motif." Proc. Natl. Acad. Sci. USA 100, 11980-11985 (2003).
[14] Mangan S., Zaslaver A. & Alon U., "The Coherent Feedforward Loop Serves as a Sign-sensitive Delay Element in Transcription Networks." J. Mol. Biol. 334, 197-204 (2003).
[15] Bar-Joseph, Z., "Analyzing time series gene expression data." Bioinformatics 20, 2493-2503 (2004).
[16] Antunes C.M. & Oliveira A.L., "Temporal data mining: An overview." KDD Workshop on Temporal Data Mining (2001).