Results - Integrating Hierarchical Category Knowledge into Multi-Label Diagnosis Text Understanding

The baseline results and the results of adding the proposed mechanisms are shown in Table 3.3. On MIMIC3-50, all proposed mechanisms improve almost all metrics, and the best results come from hierarchical learning with average meta-label. This consistent improvement indicates that category knowledge provides informative cues for sharing parameters across low-level codes under the same categories. On MIMIC3-Full, the proposed mechanisms still outperform the baseline CNN model, and the best performance comes from multi-task learning.

One possible reason is that multi-task learning imposes more flexible constraints than hierarchical learning, which makes it better suited to this more challenging, data-imbalanced scenario. In addition, the proposed knowledge integration mechanisms using multi-task learning or hierarchical learning with average meta-label improve on the prior state-of-the-art model, CAML [3], which demonstrates both the superior capability of the mechanisms and the importance of domain knowledge.
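As a rough illustration of how category knowledge can be tied to low-level code predictions, the sketch below derives a category-level (meta-label) probability from per-code probabilities in two ways suggested by the variant names "avg" and "at-least-one" in Table 3.3. The function names, the grouping rule (the part of an ICD-9 code before the decimal point), the formulas, and the probabilities are all assumptions made for illustration; they are not taken from this section and may differ from the definitions used in the experiments.

```python
from collections import defaultdict
from math import prod


def category_of(code):
    """Hypothetical grouping rule: the part of an ICD-9 code before the dot."""
    return code.split(".")[0]


def meta_label_probs(code_probs, mode="avg"):
    """Aggregate per-code probabilities into one probability per category."""
    groups = defaultdict(list)
    for code, p in code_probs.items():
        groups[category_of(code)].append(p)
    if mode == "avg":
        # Assumed "average meta-label": mean of the code probabilities.
        return {cat: sum(ps) / len(ps) for cat, ps in groups.items()}
    if mode == "at-least-one":
        # Assumed "at-least-one": probability that any code in the category fires.
        return {cat: 1.0 - prod(1.0 - p for p in ps) for cat, ps in groups.items()}
    raise ValueError(f"unknown mode: {mode}")


code_probs = {"96.04": 0.8, "96.71": 0.5, "345.90": 0.6}  # made-up probabilities
avg_probs = meta_label_probs(code_probs, "avg")
alo_probs = meta_label_probs(code_probs, "at-least-one")
```

Under these assumptions, a category-level loss computed on such aggregated probabilities would push codes in the same category to share evidence, which is one way to read the parameter-sharing argument above.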

To further investigate the effectiveness of the model, we run experiments on the internal dataset; the results are shown in Table 3.2. Because its clinical notes are shorter and its OOV rate is higher, this dataset is more challenging, and the results are lower than those on MIMIC-3. Nevertheless, the proposed methods still improve performance by integrating category knowledge through multi-task learning or hierarchical learning with average meta-label. In sum, our proposed category knowledge integration mechanisms are capable of improving text understanding performance by combining domain knowledge with neural models, and they achieve state-of-the-art results.

Data-200                        Macro-F1  Micro-F1
+ Hierarchical (avg)            18.4†     45.7†

Table 3.2: The results on internal data.

MIMIC3-50                       P@1    P@3    P@5    MAP    Macro-F  Micro-F  Macro-AUC  Micro-AUC
CNN [5]                         82.8   71.2   61.4   72.4   57.9     63.0     88.2       91.2
+ Cluster Penalty               83.5†  71.9†  62.4†  73.1†  58.3†    63.7†    88.5†      91.3†
+ Multi-Task                    83.5†  71.3†  61.9†  72.5†  57.6     62.8     88.1       91.1
+ Hierarchical (avg)            84.5†  72.1†  62.4†  73.5†  58.6†    64.3†    88.9†      91.4†
+ Hierarchical (at-least-one)   83.4†  72.1†  62.4†  73.4†  58.5†    63.8†    88.4†      91.3†

MIMIC3-Full                     P@1    P@3    P@8    P@15   Macro-F  Micro-F  Macro-AUC  Micro-AUC
CNN [5]                         80.5   73.6   59.6   45.4   3.8      42.9     81.8       97.1
+ Cluster Penalty               88.4   82.4   68.8   54.0   5.4      51.2     87.5       98.3
+ Multi-Task                    89.7†  83.4   69.7†  54.8   6.9†     52.3†    88.8†      98.5†
+ Hierarchical (avg)            89.6   83.5†  70.9†  56.1†  8.2†     53.9†    89.5†      98.6†
+ Hierarchical (at-least-one)   89.4   83.3   69.5   54.8†  6.2†     51.7     88.3       98.4

Table 3.3: The results on MIMIC-3 data (%). † indicates the improvement over the baseline.
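For readers less familiar with the metrics reported in Table 3.3, the following is a minimal sketch of precision at k (P@k) and micro- and macro-averaged F1 for multi-label predictions. It is not the evaluation script used in the experiments: the function names and the toy labels are hypothetical, and macro-F1 is computed here as the mean of per-label F1, whereas some evaluation scripts instead combine macro-averaged precision and recall. This section does not state which convention the reported numbers follow.

```python
def precision_at_k(scores, gold, k):
    """Fraction of the k highest-scoring labels that are in the gold set."""
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(label in gold for label in top_k) / k


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(preds, golds, labels):
    """preds and golds are lists of label sets, one set per document."""
    counts = {label: [0, 0, 0] for label in labels}  # tp, fp, fn per label
    for pred, gold in zip(preds, golds):
        for label in labels:
            if label in pred and label in gold:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in gold:
                counts[label][2] += 1
    # Macro: average the per-label F1 scores, so every label weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels, so frequent labels dominate.
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    return f1(tp, fp, fn), macro


labels = ["A", "B", "C"]                 # toy label set
preds = [{"A", "B"}, {"A"}]              # toy predictions for two documents
golds = [{"A"}, {"A", "C"}]              # toy gold labels
micro, macro = micro_macro_f1(preds, golds, labels)
p_at_2 = precision_at_k({"A": 0.9, "B": 0.6, "C": 0.1}, {"A", "C"}, 2)
```

Because macro-averaging weights every label equally, rare codes with an F1 of zero pull the score down sharply, which is consistent with the large gap between Macro-F and Micro-F on MIMIC3-Full in Table 3.3.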

3.3 Qualitative Analysis

From the prediction results, we find that the proposed mechanisms tend to predict more labels than the baseline models, for both CNN and CAML. Specifically, our methods help the models consider more categories by using the information shared within the hierarchy. The additional codes often contain correct answers, and some fall in the correct category without matching the exact code. Moreover, our mechanisms can correct wrong codes to the correct ones under the same category. The appendix provides some examples for reference.

doi:10.6342/NTU201903870

(a) Clinical notes

admission date discharge date date of birth sex m service surgery allergies no drug allergy information on file attending first name3 lf chief complaint fall from bike major surgical or invasive procedure n a history of present illness 71m who was brought to the hospital1 ed after a fall from his bike past medical history seizure disorder bph spinal stenosis sleep apnea social history n a family history n a physical exam no brainstem reflexes pertinent results n a brief hospital course mr known lastname was admitted after a fall from his bicycle he was seen getting up from the accident and then collapsed shortly thereafter he then was noted to be in asystole when ems arrived the total amount of time the patient was in asystole is not known upon arrival to the ed he had regained a pulse a neuro exam was performed and he had no brainstem reflexes an mri confirmed a c2 level spinal cord injury and changes consistent with an anoxic brain injury the neob was contact name ni but due to unknown circumstances surrounding his cardiac arrest he did not meet donation criteria the family elected to withdraw care he was extubated and expired shortly thereafter medications on admission n a discharge medications n a discharge disposition expired discharge diagnosis odontoid fracture spinal cord injury respiratory failure discharge condition n a discharge instructions n a followup instructions n a

Baseline: 327.23 345.90 348.1 518.81 E826.1

Proposed: 327.23 33.24 345.90 348.1 401.9 518.81 600.00 780.39 780.57 806.01 96.04 96.6 96.71 96.72 E826.1

Ground truth: 288.50 345.90 348.1 356.9 427.5 518.81 600.00 780.57 806.01 807.01 96.04 96.71 E826.1
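The comparison in example (a) can be checked mechanically. The sketch below counts exact code matches for the baseline and the proposed predictions, and lists "near misses": predicted codes that are not in the ground truth but whose category is. The code lists are those of example (a), written one code per token; the category rule (the part of the code before the decimal point) and all variable names are assumptions made for illustration.

```python
def category_of(code):
    """Hypothetical category rule: the part of an ICD-9 code before the dot."""
    return code.split(".")[0]


baseline = set("327.23 345.90 348.1 518.81 E826.1".split())
proposed = set(
    "327.23 33.24 345.90 348.1 401.9 518.81 600.00 780.39 780.57 "
    "806.01 96.04 96.6 96.71 96.72 E826.1".split()
)
gold = set(
    "288.50 345.90 348.1 356.9 427.5 518.81 600.00 780.57 806.01 "
    "807.01 96.04 96.71 E826.1".split()
)

# Exact matches: predicted codes that appear in the ground truth.
exact_baseline = baseline & gold
exact_proposed = proposed & gold

# Near misses: wrong code, but its category contains a ground-truth code.
gold_categories = {category_of(code) for code in gold}
near_misses = sorted(
    code for code in proposed - gold if category_of(code) in gold_categories
)

print(len(exact_baseline), len(exact_proposed), near_misses)
```

Under this reading, the proposed model recovers more of the ground-truth codes than the baseline, and several of its remaining errors stay within a correct category, in line with the observations in Section 3.3.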

(b) Clinical notes

admission date discharge date date of birth sex f service neurosurgery allergies wellbutrin lipitor flagyl levaquin attending first name3 lf chief complaint decline in mental status major surgical or invasive procedure angiogram with embolization of aneurysm history of present illness 63f who began to have mental staus decline dysarthria at home brought to needhan hospital1 where had head ct showing large l parietal hemorrhage was transferred to hospital1 for further treatment upon arrival there was concern for airway safety and she was intubated was reportedly moving all extremities prior to intubation past medical history ccy multiple ercp for biliary strictures benign breast tumor l aneurysm clip no deficit chronic autoimmune hepatitis on steroids osteoporosis social history married she smokes to cigarettes a day does not drink any alcohol she is a retired hospital3 manager she watches her grandson a couple times a week participates in book clubs walks and traveling family history thyroid disease is positive in the family as is rheumatoid arthritis her sister died at years of age of liver disease of unknown cause it is not known whether that also was autoimmune hepatitis there is also cirrhosis in the family physical exam hunt and doctor last name doctor last name gcs 6t e v 1t motor o t afeb bp hr r16 o2sats intubated sedated examined in ed just after intubated heent pupils 4mm reactive neck supple extrem warm and well perfused no c c e neuro no eye opening all to nox pertinent results cta redemonstrated ip ic sah worsened mass effect with 10mm rightward mls and effacement of the basal cisterns there is downward herniation aneurysms ruptured left mca partially calcified right m1 origin aneurysm the latter is amenable to coiling possible third small left mca trifurcation aneurysm await reformations brief hospital course pt was admitted to the icu for close neurological observation in the afternoon of admission the patient s mental status declined including loss 
of cough and gag brain test testing was initiated by the icu and concluded that she was brain dead preparations were made for organ donation per the families request medications on admission all flagyl levaquin statins wellbutrin discharge medications n a discharge disposition expired discharge diagnosis n a discharge condition deceased discharge instructions n a followup instructions n a name6 md name8 md md md number completed by

Baseline: 571.5 733.00 96.04 96.72

Proposed: 305.1 38.91 38.93 39.72 401.9 431 518.81 571.42 571.5 733.00 733.09 88.41 96.04 96.6 96.72

Ground truth: 276.3 276.8 348.4 348.89 38.93 39.72 430 571.42 733.00 88.41 96.04 96.71 V49.86 V58.65

Chapter 4 Conclusion

This paper proposes multiple mechanisms that use refined losses to leverage hierarchical category knowledge and share the semantics of labels under the same category, so that the model can better understand clinical texts even when training samples are limited. The experiments demonstrate the effectiveness of the proposed knowledge integration mechanisms, which achieve state-of-the-art performance and generalize well across multiple datasets.

Bibliography

[1] E. Choi, M. T. Bahadori, A. Schuetz, W. F. Stewart, and J. Sun, “Doctor ai: Predicting clinical events via recurrent neural networks,” in Machine Learning for Healthcare Conference, pp. 301–318, 2016.

[2] L. R. de Lima, A. H. Laender, and B. A. Ribeiro-Neto, “A hierarchical approach to the automatic categorization of medical documents,” in Proceedings of the seventh international conference on Information and knowledge management, pp. 132–139, ACM, 1998.

[3] J. Mullenbach, S. Wiegreffe, J. Duke, J. Sun, and J. Eisenstein, “Explainable prediction of medical codes from clinical text,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 1101–1111, 2018.

[4] W. H. Organization et al., “International statistical classification of diseases and related health problems: tenth revision-version for 2007,” http://apps.who.int/classifications/apps/icd/icd10online/, 2007.

[5] H. Shi, P. Xie, Z. Hu, M. Zhang, and E. P. Xing, “Towards automated icd coding using deep learning,” arXiv preprint arXiv:1711.04075, 2017.

[6] A. E. Johnson, T. J. Pollard, L. Shen, L.-W. H. Lehman, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. A. Celi, and R. G. Mark, “MIMIC-III, a freely accessible critical care database,” Scientific data, vol. 3, p. 160035, 2016.

[7] G. Singh, J. Thomas, I. Marshall, J. Shawe-Taylor, and B. C. Wallace, “Structured multi-label biomedical text tagging via attentive neural tree decoding,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2837–2842, Association for Computational Linguistics, 2018.

[8] Y. Kim, “Convolutional neural networks for sentence classification,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751, 2014.

[9] A. Nie, A. Zehnder, R. L. Page, A. L. Pineda, M. A. Rivas, C. D. Bustamante, and J. Zou, “Deeptag: inferring all-cause diagnoses from clinical notes in under-resourced medical domain,” arXiv preprint arXiv:1806.10722, 2018.

[10] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Advances in neural information processing systems, pp. 3111–3119, 2013.
