Chapter 5 Conclusions and Future Work
5.1 Conclusions
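In this thesis, we studied how to improve the efficiency of contingency cube maintenance and how to handle the absent items that arise when a delta cube is merged. For maintenance efficiency, our original approach was full recomputation, which is inefficient because every maintenance run must rescan the original contingency cube. We therefore proposed incremental maintenance: the data to be added is first built into a delta cube, which is then merged with the contingency cube. This reduces the amount of data that has to be processed and thus improves maintenance efficiency.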
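The merge step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a cube is a mapping from a group-by cell to a report count and that reports are flat records. The names `build_cube` and `merge_delta` and the fields `drug`, `adr`, and `sex` are hypothetical and do not reproduce the data structures used in this thesis.

```python
from collections import Counter
from itertools import combinations


def build_cube(reports, dims):
    """Count reports in every cell of every group-by over `dims`.

    A cell is a tuple of (dimension, value) pairs; the empty tuple is the
    grand total. This flat layout is an assumption made for illustration.
    """
    cube = Counter()
    for report in reports:
        for size in range(len(dims) + 1):
            for subset in combinations(dims, size):
                cell = tuple((dim, report[dim]) for dim in subset)
                cube[cell] += 1
    return cube


def merge_delta(cube, delta):
    """Return a new cube with the delta cube's counts added cell by cell."""
    merged = Counter(cube)   # copy, so the original cube is left untouched
    merged.update(delta)     # Counter.update adds counts; new cells are created
    return merged


DIMS = ("drug", "adr", "sex")  # hypothetical dimensions
old_reports = [
    {"drug": "A", "adr": "rash", "sex": "F"},
    {"drug": "A", "adr": "nausea", "sex": "M"},
]
new_reports = [
    {"drug": "A", "adr": "rash", "sex": "M"},
    {"drug": "B", "adr": "rash", "sex": "F"},  # drug B is absent from the old cube
]

cube = build_cube(old_reports, DIMS)
delta = build_cube(new_reports, DIMS)          # only the new data is scanned
incremental = merge_delta(cube, delta)
recomputed = build_cube(old_reports + new_reports, DIMS)
```

Under these assumptions, `incremental` holds the same counts as `recomputed`, although only the new reports were scanned to build the delta cube.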
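For absent items, we examined the combinations of absent items that can occur and compared the advantages and disadvantages of two approaches: (1) computing the corresponding contingency table only when it is needed, and (2) storing the corresponding contingency table in advance. We also systematically analyzed how absent items should be computed and which values can be stored in advance in each case.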
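The two options can be contrasted with a short sketch. It assumes the standard 2×2 contingency table used in disproportionality analysis (a: reports with both the drug and the reaction, b: drug only, c: reaction only, d: neither) and that pair, drug, reaction, and total counts are available. The function names and the count layout are hypothetical, and the exact rules analyzed in this thesis are not reproduced here.

```python
from collections import Counter
from typing import NamedTuple


class Table(NamedTuple):
    a: int  # reports with the drug and the reaction
    b: int  # reports with the drug but not the reaction
    c: int  # reports with the reaction but not the drug
    d: int  # reports with neither


def table_on_demand(pair_n, drug_n, adr_n, total, drug, adr):
    """Option (1): derive the table from stored counts when it is requested."""
    a = pair_n[(drug, adr)]  # a Counter yields 0 for an absent pair
    n_drug, n_adr = drug_n[drug], adr_n[adr]
    return Table(a, n_drug - a, n_adr - a, total - n_drug - n_adr + a)


def precompute_tables(pair_n, drug_n, adr_n, total):
    """Option (2): store a table for every drug/reaction pair up front."""
    return {
        (drug, adr): table_on_demand(pair_n, drug_n, adr_n, total, drug, adr)
        for drug in drug_n
        for adr in adr_n
    }


# Toy data (assumed): each report carries one drug and one reaction.
reports = [("A", "nausea"), ("A", "rash"), ("B", "nausea"), ("B", "nausea")]
pair_n = Counter(reports)
drug_n = Counter(drug for drug, _ in reports)
adr_n = Counter(adr for _, adr in reports)
total = len(reports)

# ("B", "rash") never occurs together, yet its table is still well defined.
absent = table_on_demand(pair_n, drug_n, adr_n, total, "B", "rash")
stored = precompute_tables(pair_n, drug_n, adr_n, total)
```

In this sketch, option (1) stores nothing for absent pairs but pays a small computation on every lookup, whereas option (2) answers lookups directly at the cost of storage that grows with the number of drugs times the number of reactions.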
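Finally, we ran experiments on real FAERS data to determine whether incremental maintenance is more efficient than full recomputation. The results show that the two approaches differ little in maintenance time when the contingency cube holds little data, but as each quarter (or year) of data is added, the time taken by full recomputation grows much faster than that of incremental maintenance. Incremental maintenance is therefore the more efficient approach when the contingency cube is large.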
5.2 Future Work
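This thesis addresses only the maintenance of the contingency cube structure, and much work remains that could extend it. We describe three directions below: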
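1. The efficiency and resource consumption of our incremental maintenance method can probably be optimized further. For example, when generating the delta cube, one could generate only the cuboid with the most dimensions and derive the other delta cuboids from it. Whether this would affect efficiency is not yet known, so it requires further study.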
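2. Beyond efficiency and resource consumption, we will also consider multiple drugs. This thesis covers only the maintenance of contingency cubes that support the analysis of adverse drug reaction signals caused by a single drug. To bring contingency cubes for multiple drugs into the maintenance process, we will draw on our laboratory's earlier work [7].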
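3. Finally, we plan to extend our maintenance method to a cloud system architecture, for example by adopting MapReduce-style computation. This work will draw on our laboratory's earlier study [30], which computes contingency cubes on the MapReduce architecture.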
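The idea in the first direction above, generating only the highest-dimensional delta cuboid and deriving the others from it, might look like the following sketch. The flat `Counter` layout, the function names, and the dimension names are assumptions for illustration. Whether this is faster in practice is exactly the open question raised in that item.

```python
from collections import Counter
from itertools import combinations


def base_cuboid(reports, dims):
    """Count reports at the finest granularity (all dimensions kept)."""
    return Counter(tuple(report[dim] for dim in dims) for report in reports)


def roll_up(base, dims, keep):
    """Aggregate the base cuboid down to the dimensions listed in `keep`."""
    positions = [dims.index(dim) for dim in keep]
    cuboid = Counter()
    for cell, count in base.items():
        cuboid[tuple(cell[i] for i in positions)] += count
    return cuboid


def derive_all(base, dims):
    """Derive every cuboid from the base cuboid without rescanning reports."""
    return {
        keep: roll_up(base, dims, keep)
        for size in range(len(dims) + 1)
        for keep in combinations(dims, size)
    }


DIMS = ("drug", "adr", "sex")  # hypothetical dimensions
delta_reports = [
    {"drug": "A", "adr": "rash", "sex": "F"},
    {"drug": "A", "adr": "rash", "sex": "M"},
    {"drug": "B", "adr": "nausea", "sex": "F"},
]
delta_base = base_cuboid(delta_reports, DIMS)  # the only pass over the data
delta_cuboids = derive_all(delta_base, DIMS)
```

Here every coarser cuboid is rolled up directly from the base cuboid. Rolling up from the smallest already-computed parent cuboid is a common refinement, but it is not shown here.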
References
[1] A. Bate and S.J.W. Evans, “Quantitative signal detection using spontaneous ADR reporting,” Pharmacoepidemiology and Drug Safety, 2009, pp. 427–436.
[2] A. Bate, A. Lansner, De. Freitas, I. Edwards, M. Lindquist, R. Orre and S. Olsson, “A Bayesian neural network method for adverse drug reaction signal generation,” European Journal of Clinical Pharmacology, vol. 54, no. 4, 1998, pp. 315–321.
[3] A. Bate, A. Lansner, M. Lindquist, R. Orre, “Bayesian neural networks with confidence estimations applied to data mining,” Computational Statistics & Data Analysis, 2000, pp. 473–493.
[4] A. Bosworth, A. Layman, D. Reichart, F. Pellow, H. Pirahesh, J. Gray, M. Venkatrao and S. Chaudhuri, “Data cube: a relational aggregation operator generalizing group-by, cross-tab, and sub-totals,” Data Mining and Knowledge Discovery, vol. 1, no. 1, 1997, pp. 29–53.
[5] S. Chaudhuri and U. Dayal, “An overview of data warehouse and OLAP technology,” ACM SIGMOD Record, 1997, pp. 65–74.
[6] C. C. Chiu, H.Y. Yeh, S.F. Lin, V.W. Soo and Y.T. Huang, “Probability analysis on associations of adverse drug events with drug-drug interactions,” in Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, 2007, pp. 1308–1312.
[7] J. Du, “Detecting adverse drug interactions with multivalued dimension cube technology,” Master's thesis, Dept. of Computer Science and Information Engineering, National University of Kaohsiung, Taiwan, July 2012.
[8] A. Egberts, E. Van and R. Meyboom, “Use of measures of disproportionality in pharmacovigilance: three Dutch examples,” Drug Safety, vol. 25, 2002, pp. 453–458.
[9] C. Ezeife and M. Xu, “Maintaining horizontally partitioned warehouse views,” in Proceedings of International Conference on Data Warehousing and Knowledge Discovery, 2000, pp. 126–133.
[10] D. Fram, J. Almenoff and W. DuMouchel, “Empirical Bayesian data mining for discovering patterns in post-marketing drug safety,” in Proc. 9th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2003, pp. 359–368.
[11] A. Gupta, J. Naughton, P. Deshpande, R. Agrawal, R. Ramakrishnan, S. Agarwal and S. Sarawagi, “On the computation of multidimensional aggregates,” in Proc. 22nd Int. Conf. on Very Large Data Bases, 1996, pp. 506–521.
[12] F.H. Huang, “Effect of drug name inconsistence and duplicate report in SRS data to the detection of ADR signals,” Master's thesis, Dept. of Computer Science and Information Engineering, National University of Kaohsiung, Taiwan, July 2015.
[13] J. Han, J. Pei and Y. Yin, “Mining frequent patterns without candidate generation,” in Proc. ACM SIGMOD Int. Conf. on Management of Data, 2000, pp. 1–12.
[14] W.H. Inmon, Building the Data Warehouse. New York, NY, USA, 1992.
[15] D. Jin, K. Higuchi and T. Tsuji, “An incremental maintenance scheme of data cubes and its evaluation,” Information and Media Technologies, vol. 4, no. 2, 2009, pp. 364–367.
[16] C. Kongkaew, D.M. Ashcroft and P.R. Noyce, “Hospital admissions associated with adverse drug reactions: a systematic review of prospective observational studies,” The Annals of Pharmacotherapy, vol. 42, no. 7, 2008, pp. 1017–1025.
[17] G. Kimura, I. Miki, J.B. Brown, K. Kadoyama, K. Nisiguchi, T. Nakamura, T. Sakaeda and Y. Okuno, “Antipsychotics-associated serious adverse events in children: an analysis of the FAERS database,” International Journal of Medical Sciences, vol. 12, 2015.
[18] K.Y. Lee and M.H. Kim, “Efficient incremental maintenance of data cubes,” in Proc. 32nd Int. Conf. on Very Large Data Bases, 2006, pp. 823–833.
[19] C.F. Lo, H.Y. Li, J.W. Du, V.W. Soo, W.Y. Lin and W.Y. Feng, “iADRs: towards online adverse drug reaction analysis,” SpringerPlus, 2012.
[20] B.S. Mumick, D. Quass and I.S. Mumick, “Maintenance of data cubes and summary tables in a warehouse,” in Proc. ACM SIGMOD Int. Conf. on Management of Data, 1997, pp. 100–111.
[21] A. Mendelzon, A. Vaisman and C. Hurtado, “Maintaining data cubes under dimension updates,” in Proceedings of IEEE International Conference on Data Engineering, 1999, pp. 346–355.
[22] A. Pesonen, A. Wolski, J. Arminen and J. Kiviniemi, “Lazy aggregates for real-time OLAP,” in Proc. 1st Int. Conf. on Data Warehousing and Knowledge Discovery, 1999, pp. 165–172.
[23] B. Peng, C. Wang, K. Fan, L. Xu, X. Mao, X. Sun, Y. Tao, Y. Pan, “High-performance signal detection for adverse drug events using MapReduce paradigm,” in Proc. AMIA 2010 Annual Symposium, 2010, pp. 902–906.
[24] E.A. Rundensteiner and S. Chen, “GPIVOT: Efficient incremental maintenance of complex ROLAP views,” in Proc. IEEE Int. Conf. on Data Engineering, 2005, pp. 552–563.
[25] A. Szarfman, R. Neill, S. Machado, “Use of screening algorithms and computer systems to efficiently signal higher-than-expected combinations of drugs and events in the US FDA’s spontaneous reports database,” Drug Safety, 2002, pp. 381–392.
[26] S. Tsumoto and S. Hirano, “Statistical independence in three-variables contingency cube,” in IEEE International Conference on Cognitive Informatics, 2008, pp. 14–16.
[27] A.K.H. Tung, C.P. Li, G. Cong and S. Wang, “Incremental maintenance of quotient cube for median,” in Proc. 10th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2004, pp. 226–235.
[28] A.A. Vaisman, A.O. Mendelzon, S.G. Cymerman and W. Ruaro, “Supporting dimension updates in an OLAP server,” Information Systems, vol. 29, 2004, pp. 165–185.
[29] P.C. Waller, S. Davis and S.J.W. Evans, “Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports,” Pharmacoepidemiology and Drug Safety, 2001, pp. 483–486.
[30] M. Wang and W. Lin, “MapReduce-based ADR contingency cube computation,” The 2016 Conference on Technologies and Applications of Artificial Intelligence, Hsinchu, Taiwan, pp. 25–27.
[31] H. Xu, M. Liu, M. Matheny and Y. Hu, “Data mining methodologies for pharmacovigilance,” SIGKDD Explorations, vol. 14, no. 1, 2012, pp. 35–42.
[32] F. Yates, “Contingency table involving small numbers and the χ² test,” Suppl J R Stat Soc, 1934, pp. 217–235.
[33] Food and Drug Administration Adverse Event Reporting System, Available: https://www.fda.gov/, May 5, 2018.
[34] Medicines and Healthcare products Regulatory Agency, Available: https://www.gov.uk/government/organisations/medicines-and-healthcare-products-regulatory-agency, May 5, 2018.
[35] Taiwan National Adverse Drug Reactions Reporting System, Available: https://adr.fda.gov.tw/Manager/WebLogin.aspx, May 5, 2018.
[36] National Library of Medicine RxNorm, Available: https://www.nlm.nih.gov/research/umls/rxnorm/, July 7, 2016.
[37] Medical Dictionary for Regulatory Activities, Available: https://www.meddra.org/, May 5, 2018.