Experimental Results on Cube Computation

Chapter 5 Experiments

5.2 Experimental Results on Cube Computation

In this part of the experiments, we first compare our proposed MDC cube structures, type-1 and type-2, with the structure used by ABCM-MS, which stores the reduced transactions in a plain text file (RTF). Figure 5-3 shows the storage requirements of these three structures. As the results demonstrate, the RTF structure consumes the least space, no more than 10 MB. The type-1 and type-2 MDC cubes require significantly more storage, with the type-2 MDC cube consuming less than the type-1 MDC cube. The main reasons for the large difference between RTF and the MDC cubes are:

(1) RTF stores only frequent items, and prunes many unqualified transactions containing no frequent items.

(2) The MDC cube, on the other hand, introduces multivalued dimensions into the structure and needs to store extra attributes such as count, a, b, c, d.
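The two reasons above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this chapter. The report layout, the names `reduce_transactions`, `contingency_cell`, and `SAMPLE_REPORTS`, the restriction of the frequency test to drugs, and the conventional 2x2 reading of a, b, c, d are all assumptions made for illustration. The `count` attribute is left out because this section does not define it.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical report layout (an assumption, not the thesis schema):
# each report is (set of drugs, set of ADRs).
Report = Tuple[FrozenSet[str], FrozenSet[str]]

SAMPLE_REPORTS: List[Report] = [
    (frozenset({"D1", "D2"}), frozenset({"rash"})),
    (frozenset({"D1", "D2"}), frozenset({"nausea"})),
    (frozenset({"D1"}), frozenset({"rash"})),
    (frozenset({"D3"}), frozenset({"rash"})),
    (frozenset({"D2"}), frozenset({"nausea"})),
]


def reduce_transactions(reports: List[Report], min_support: int) -> List[Report]:
    """RTF-style reduction: keep only frequent drugs and drop any
    report that is left with no frequent drug at all."""
    drug_counts = Counter(drug for drugs, _ in reports for drug in drugs)
    frequent = {drug for drug, n in drug_counts.items() if n >= min_support}
    reduced: List[Report] = []
    for drugs, adrs in reports:
        kept = drugs & frequent
        if kept:  # reports with no frequent drug are pruned
            reduced.append((frozenset(kept), adrs))
    return reduced


def contingency_cell(
    reports: List[Report], drug_set: FrozenSet[str], adr: str
) -> Dict[str, int]:
    """Extra per-cell attributes, assuming the conventional 2x2 table:
    a = drug set and ADR, b = drug set without ADR,
    c = ADR without drug set, d = neither."""
    a = b = c = d = 0
    for drugs, adrs in reports:
        has_drugs = drug_set <= drugs  # every drug in drug_set was reported
        has_adr = adr in adrs
        if has_drugs and has_adr:
            a += 1
        elif has_drugs:
            b += 1
        elif has_adr:
            c += 1
        else:
            d += 1
    return {"a": a, "b": b, "c": c, "d": d}
```

With `min_support=2`, the sample report that mentions only `D3` is pruned by `reduce_transactions`, whereas `contingency_cell` must scan every report and store four counters for each drug set and ADR pair. This is consistent with the storage gap described above.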

Figure 5-3. The storage requirements of the three methods, ABCM-MS, MDC-1, and MDC-2.

Next, we compare the two approaches to MDC cube computation: the SQL implementation and the faster implementation. The performance comparison is shown in Figure 5-4. Our faster cube computation is significantly faster than the SQL implementation, and the gap widens as the data size grows.

[Figure 5-4 plot. x-axis: ratio of data set (1/10 to 10/10); y-axis: Time (sec), ticks up to 20000; series: SQL implementation, Faster method.]

Figure 5-4. The execution time comparison between the SQL implementation and the faster method.

Chapter 6

Conclusions and Future Work

6.1 Conclusions

The problem of ADR detection has long been a key issue in the pharmacovigilance community. Although much related research has been published in recent years, most of it has been concerned with the accuracy of the detected adverse drug reactions; very little has addressed the performance issue of how to complete the detection process quickly.

In this thesis, we have pointed out the weakness of the previously proposed contingency cube-based method, CBM-SS, and the performance problem of the CR-tree-based method, ABCM-MS, in adverse drug interaction detection. To remedy the deficiency of the contingency cube, we have proposed the concept of the multivalued contingency cube, which adds multivalued dimensions to the data cube structure. We have proposed two structure types, the type-1 and type-2 MDC cubes. Experimental results have shown that both types of MDC cube facilitate faster detection of adverse drug interactions, and that the type-2 method consumes less storage and is more efficient than the type-1 method.

We have also proposed a faster algorithm for computing the MDC cubes from the input dataset, which is significantly faster than the naive SQL approach to generating the MDC cube.

6.2 Future Work

The concept of MDC cube has shown its effectiveness in supporting faster detection of adverse drug interactions. Among the many topics to be explored in future research, some important ones are listed as follows:

1. In the past few years, our laboratory has established a web-based ADR detection and analysis system called iADRs. The performance of iADRs in detecting adverse drug interactions, however, is unsatisfactory because of the poor efficiency of the ABCM-MS method. We will replace the ABCM-MS method with our MDC cube-based method to establish a more efficient and interactive environment for the iADRs system.

2. Currently, the ADR signals we consider involve only one ADR as the consequent. We will extend the structure of our MDC cube to allow a multivalued ADR dimension, enabling the MDC cube-based method to detect ADR signals with multiple ADRs.

3. In the pharmaceutical industry, there exist important knowledge ontologies about drug ingredients and ADR terms. For example, “ADACEL” is a 3-in-1 vaccine given as a single dose for the prevention of tetanus, diphtheria, and pertussis. Information that reveals an ADR caused by a specific ingredient of a drug would be very useful. Developing efficient algorithms that can incorporate these medical ontologies into the detection of specialized or generalized ADR signals is an important research issue.
