Evaluation on Non-withdrawn Drugs - A Feasibility Study of Rough Set Based Adverse Drug Reaction Signal Detection Methods on Reporting Data with Missing Values

Chapter 4 Experiments

4.3 Evaluation on Non-withdrawn Drugs

In this experiment, we evaluated the methods on non-withdrawn drugs. As before, we first compared the performance of the six rough set based methods. The results in Table 4.5 exhibit the same pattern observed in Table 4.4. That is, Method 1 outperforms all the others, Methods 4 and 6 perform worst, and Methods 4 and 6 have identical accuracy for all rules. Unsurprisingly, the quarterly value ranges of the rule strengths, displayed in Figures 4.11 and 4.12, show the same result: Method 1 yields the narrowest strength range in all cases, followed by Method 2, Method 5, and finally Methods 4 and 6.

Table 4.5. Accuracy of the six rough set based ADR measuring methods for non-withdrawn drugs.

Rule            Method No.
                M1        M2        M3        M4        M5        M6
R4              1         1         1         1         1         1
R5              0.922051  0.911747  0.881026  0.877162  0.881026  0.877162
Total average   0.961026  0.955874  0.940513  0.938581  0.940513  0.938581

Next, we compared Method 1 with the deletion based measures. For rule R4, none of the methods is effective; all fail to generate significant signals during the observed period. Listwise deletion and pairwise deletion generate very similar rule strengths, which are usually bounded by the lower and upper strengths generated by Method 1. For rule R5, however, all methods generate continuously significant signals after 2010Q2, well before the warning issued by the FDA in 2014. Note that although most of the strengths from 2008Q4 to 2010Q1 are higher than 2, their a values are less than 3. Therefore, they cannot be regarded as significant signals.
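The significance check described above can be illustrated with a minimal sketch. It assumes the standard 2x2 contingency layout (a, b, c, d) and the usual PRR and ROR formulas, and it reads the criterion from the text as "strength higher than 2 and a of at least 3"; whether the strength comparison is strict is an assumption. The class and function names are hypothetical, the counts are toy values rather than FAERS data, and zero cells are not handled.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    a: int  # reports with the drug (and rule condition) and the reaction
    b: int  # reports with the drug but without the reaction
    c: int  # reports without the drug but with the reaction
    d: int  # reports with neither the drug nor the reaction


def prr(t: Contingency) -> float:
    """Proportional reporting ratio: [a / (a + b)] / [c / (c + d)]."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: Contingency) -> float:
    """Reporting odds ratio: (a / b) / (c / d) = (a * d) / (b * c)."""
    return (t.a * t.d) / (t.b * t.c)


def is_significant(strength: float, a: int,
                   strength_threshold: float = 2.0,
                   min_cases: int = 3) -> bool:
    """Assumed criterion: strength above the threshold and at least 3 cases."""
    return strength > strength_threshold and a >= min_cases


# Toy quarter: the strength exceeds 2, but a = 2 is below 3, so no signal.
early = Contingency(a=2, b=48, c=100, d=9850)
print(round(prr(early), 2), is_significant(prr(early), early.a))  # 3.98 False
```

This mirrors the situation reported for R5 between 2008Q4 and 2010Q1, where a high strength alone does not qualify as a significant signal.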

Figure 4.11 The result of the ROR & PRR lower and upper for R4.

[Figure placeholder: four panels, (a) to (d), plotting the ROR lower, ROR upper, PRR lower, and PRR upper values of R4 for each method (series M1* to M6*) by quarter, with x-axis labels from 04Q1 to 13Q3 and y-axis label RST. Marketed year: 1940. Warning year: 2014.]

Figure 4.12 The result of the ROR & PRR lower and upper for R5.

[Figure placeholder: four panels, (a) to (d), plotting the ROR lower, ROR upper, PRR lower, and PRR upper values of R5 for each method (series M1* to M6*) by quarter, with x-axis labels from 08Q1 to 13Q3 and y-axis label RST. Marketed year: 2008. Warning year: 2014.]

Figure 4.13 Strength of rule R4 computed by Method 1 and listwise deletion.

[Figure placeholder: two panels, (a) PRR and (b) ROR, for Method 1 M(s, g, g) on R4. Each panel plots by quarter (x-axis labels from 04Q1 to 13Q3) the strength series _ld, _pd, _lower, and _upper against the threshold of 2, together with the A Value series A_ld, A_rs, and A_pd on a second axis. Marketed year: 1940. Warning year: 2014.]

Figure 4.14 Strength of rule R5 computed by Method 1 and listwise deletion.

[Figure placeholder: two panels, (a) PRR and (b) ROR, for Method 1 M(s, g, g) on R5. Each panel plots by quarter (x-axis labels from 04Q1 to 13Q3) the strength series _ld, _pd, _lower, and _upper against the threshold of 2, together with the A Value series A_ld, A_rs, and A_pd on a second axis. Marketed year: 2008. Warning year: 2014.]

Chapter 5 Conclusions and Future Work

5.1 Conclusions

ADR detection from SRSs has long been an important problem in the pharmacovigilance community and the pharmaceutical industry. Although it is well known that SRS datasets contain many missing values, most published research on this topic adopts listwise deletion to eliminate records with missing values. To our knowledge, no prior work has considered the possibility, or examined the effect, of including the incomplete records in the process of ADR detection.

This study is a preliminary exploration of this question. We examine the feasibility of applying rough set theory to the ADR detection problem. We have proposed using characteristic set based approximations to measure the uncertain contingency values when records with missing data have to be counted. Specifically, we propose twelve rough set based measuring methods and show, in terms of two newly proposed concepts, the satisfiable and indistinguishable properties, that only six of them are feasible for this purpose.
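The idea of an uncertain contingency value can be illustrated with a deliberately simplified sketch. It does not reproduce the characteristic set definitions or any of the six methods; it only assumes that a count can be bracketed between records that certainly match a rule condition and records that possibly match it when a missing value is allowed to stand for any value. The function name, the use of None for a missing value, and the toy records are all assumptions.

```python
from typing import Optional

Record = dict[str, Optional[str]]  # None marks a missing attribute value


def match_bounds(records: list[Record],
                 condition: dict[str, str]) -> tuple[int, int]:
    """Return (lower, upper) counts of records matching the condition.

    lower: every condition attribute is present and equal (certain match).
    upper: no present attribute contradicts the condition (possible match).
    """
    lower = upper = 0
    for rec in records:
        pairs = [(rec.get(attr), want) for attr, want in condition.items()]
        certain = all(have == want for have, want in pairs)
        possible = all(have is None or have == want for have, want in pairs)
        lower += certain
        upper += possible
    return lower, upper


# Toy records, not FAERS data: one complete match, one with Gender missing,
# and one that contradicts the condition.
toy = [
    {"Gender": "Female", "Drug": "ZELNORM", "PT": "CEREBROVASCULAR ACCIDENT"},
    {"Gender": None, "Drug": "ZELNORM", "PT": "CEREBROVASCULAR ACCIDENT"},
    {"Gender": "Male", "Drug": "ZELNORM", "PT": "CEREBROVASCULAR ACCIDENT"},
]
rule = {"Gender": "Female", "Drug": "ZELNORM", "PT": "CEREBROVASCULAR ACCIDENT"}
print(match_bounds(toy, rule))  # (1, 2)
```

Listwise deletion would discard the second record entirely, whereas an interval of this kind keeps it as a possible case.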

To examine the effectiveness of these six measuring methods, we conducted various experiments using the publicly accessible FAERS database. We compared the six proposed measuring methods with the traditional frequentist method, examining their capability for timeline warning of noteworthy ADR signals. Two groups of marketed drugs were considered: withdrawn drugs and non-withdrawn drugs.

Experimental results show that most of the time the six rough set based methods exhibit a timeline warning capability similar to that of the traditional method, with both detecting ADR signals earlier than the time at which the FDA issued the warning or withdrawal announcement. However, our methods can yield noteworthy measures (higher than the threshold) for some known ADR signals where the traditional method fails, for example, the following ADR signal:

Gender = Female, Drug = ZELNORM → PT = CEREBROVASCULAR ACCIDENT.

Among the six rough set based methods, Method 6, i.e., M(t, l, c), yields the widest range of the ADR measure and thus the lowest measuring quality, while Method 1, i.e., M(s, g, g), yields the narrowest range and thus the best measuring quality.

5.2 Future Work

Although the results of our study show some merits of rough set based approaches to ADR signal measuring and detection from incomplete datasets, our work has some limitations, which leave several research issues unexplored.

First, our method is only applicable to ADR rules with a nonempty extra condition, that is, rules whose condition involves at least one incomplete attribute other than Drug and PT. However, the approximation concept was originally proposed for complete but inconsistent data. How to devise other rough set based methods that can improve ADR signal detection from complete data is an interesting problem. One promising approach we have identified is to use the rough set membership function to calculate the contingency values. We can then simply use the original formulas for PRR and ROR to measure ADR signals. This method is relatively simple and requires less computation, though its effectiveness needs further examination.
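For reference, the standard rough membership function assigns to each record the fraction of its equivalence class that belongs to the target set, i.e., |[x] ∩ X| / |[x]|. The sketch below computes this fraction per equivalence class on complete data. How such memberships would be turned into contingency values is left open in the text, so the code stops at the membership values; the function name, attribute names, and toy records are assumptions.

```python
from collections import defaultdict


def rough_membership(records, condition_attrs, target_attr, target_value):
    """Map each equivalence class (tuple of condition values) to the share
    of its records whose target attribute equals the target value."""
    size = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        key = tuple(rec[attr] for attr in condition_attrs)
        size[key] += 1
        hits[key] += rec[target_attr] == target_value
    return {key: hits[key] / size[key] for key in size}


# Toy complete records: the (X, F) class is inconsistent with respect to PT.
toy = [
    {"Drug": "X", "Gender": "F", "PT": "Y"},
    {"Drug": "X", "Gender": "F", "PT": "Y"},
    {"Drug": "X", "Gender": "F", "PT": "Z"},
    {"Drug": "X", "Gender": "M", "PT": "Z"},
]
membership = rough_membership(toy, ["Drug", "Gender"], "PT", "Y")
print(membership[("X", "M")])  # 0.0
```

A membership strictly between 0 and 1, as for the (X, F) class here, marks exactly the inconsistent classes that the approximation concept was designed for.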

Second, we assume that all missing values in the dataset are regarded either as lost or as don’t care, but not both simultaneously. If additional information is available, different missing values might be interpreted individually as lost or as don’t care. Our approach is easily modified to handle this case by simply using the general characteristic set.

Third, it is well known that the FAERS dataset has data quality problems other than missing values, such as duplicate records and inconsistent drug names. We will incorporate data cleaning techniques to obtain a better dataset and re-examine our approaches on more known ADR signals.
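As a rough illustration of the two quality problems named above, the following sketch normalizes drug name spellings and drops exact duplicates on a chosen set of fields. It is not a description of any established FAERS cleaning procedure; the normalization rule, the choice of key fields, and the function names are assumptions, and real drug name matching would need a reference vocabulary.

```python
import re


def normalize_drug_name(name: str) -> str:
    """Uppercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]+", " ", name.upper())
    return re.sub(r"\s+", " ", cleaned).strip()


def drop_duplicates(records: list[dict], key_fields: list[str]) -> list[dict]:
    """Keep the first record for each distinct combination of key fields."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec.get(field) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Toy records: two spellings of the same drug for the same hypothetical case.
toy = [
    {"case": "1", "Drug": " Zelnorm. ", "PT": "CEREBROVASCULAR ACCIDENT"},
    {"case": "1", "Drug": "ZELNORM", "PT": "CEREBROVASCULAR ACCIDENT"},
]
for rec in toy:
    rec["Drug"] = normalize_drug_name(rec["Drug"])
print(len(drop_duplicates(toy, ["case", "Drug", "PT"])))  # 1
```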

Finally, our experimental results suggest that an appropriate time granularity is important for timeline ADR signal warning. In this study, we followed the original granularity of the FAERS database, i.e., the quarter, which is, however, too fine to provide enough cases for signal measuring. We will consider a larger time granularity, such as the year, and also compute the strength cumulatively.
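The two options mentioned above, a coarser granularity and cumulative computation, can be sketched on a series of quarterly counts. The sketch assumes quarter labels of the form used on the figure axes (for example 08Q4) and uses toy counts; the function names are hypothetical. In practice each of the four contingency counts would be aggregated in the same way before PRR or ROR is recomputed.

```python
from itertools import accumulate


def yearly_counts(quarterly: dict[str, int]) -> dict[str, int]:
    """Sum counts keyed like '08Q4' into yearly totals keyed like '08'."""
    totals: dict[str, int] = {}
    for quarter, count in quarterly.items():
        year = quarter.split("Q")[0]
        totals[year] = totals.get(year, 0) + count
    return totals


def cumulative_counts(counts: dict[str, int]) -> dict[str, int]:
    """Running totals, following the insertion order of the periods."""
    return dict(zip(counts.keys(), accumulate(counts.values())))


# Toy quarterly a values, each below the minimum of 3 cases on its own.
a_by_quarter = {"08Q4": 1, "09Q1": 2, "09Q2": 0, "09Q3": 2, "09Q4": 1}
print(yearly_counts(a_by_quarter))      # {'08': 1, '09': 5}
print(cumulative_counts(a_by_quarter))  # {'08Q4': 1, '09Q1': 3, '09Q2': 3, '09Q3': 5, '09Q4': 6}
```

With these toy values, no single quarter reaches 3 cases, while both the yearly total for 09 and the cumulative count from 09Q1 onward do.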

