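Evaluation of Human Trust and Model Explanation Preference - Adaptability to the SNLI Dataset - Meta-Explainability Modeling for Natural Language Inference

4.3 Adaptability to the SNLI Dataset

4.4.2 Evaluation of Human Trust and Model Explanation Preference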

Metric/Accuracy Correlation (SUM@F1)

Model Accuracy (%) Correct Incorrect Correlation Entailment


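… the span of the relation; Pretrained@AS likewise provides a span from which the relation can be inferred, but it also includes a large amount of redundant, contextual information.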

National Chengchi University (國立政治大學)

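- Contradiction: same characteristics as entailment.

In the model trust evaluation experiment, we want participants to correctly identify the characteristics of the three explanation methods (that is, which robot an explanation came from) and then, according to their preference among the explanation methods, choose the one they consider more trustworthy.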

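We collected answers from 100 participants, each of whom answered 9 questions that required telling the 3 models apart, which yielded 2700 responses. The confusion matrix of the collected results is shown in [Figure 4-2]. The diagonal accounts for 74.14% of the responses: of the 2700 responses in total, 2002 (74.14%) explanations were correctly attributed to the explanation method that produced them.

Figure 4-2 Confusion matrix of participants' judgments of machine explanations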
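The diagonal share quoted above is the sum of the confusion matrix's main diagonal divided by the total number of responses. The following is a minimal sketch of that calculation. The function name `diagonal_share` and the 3 x 3 toy matrix are hypothetical illustrations, not the cell counts of Figure 4-2, which are not reproduced in the text. Only the totals 2002 and 2700 come from the text.

```python
def diagonal_share(confusion):
    """Return the fraction of responses on the main diagonal of a square matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical toy matrix (rows: true method, columns: judged method).
# These counts are illustrative only and are NOT the values of Figure 4-2.
toy = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 3, 7],
]
toy_share = diagonal_share(toy)  # 22 correct out of 30 responses

# Totals reported in the text: 2002 correct attributions out of 2700 responses.
reported_share = 2002 / 2700
```

With the reported totals, `reported_share` comes out at roughly 0.741, in line with the percentage given in the text.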

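To determine whether our participants can correctly identify the characteristics of the explanation methods, we first compute the distribution of participants over the number of correct answers that would be expected under random guessing. We compute this with Equation 16. In our trust experiment, each participant answers a total of 9 questions that ask which explanation method an explanation came from …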
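Equation 16 is not reproduced in this excerpt, so the following is only a sketch of one common way to obtain such a chance baseline: a binomial model. It assumes that each of the 9 answers is an independent guess and that the chance of a correct guess is 1/3 (one of three methods). Both assumptions, and the function name `expected_counts`, are illustrative and may differ from the thesis's actual formula.

```python
from math import comb


def expected_counts(n_questions, p_correct, n_participants):
    """Expected number of participants with k correct answers, k = 0..n_questions,
    if every answer is an independent guess with success probability p_correct."""
    return [
        n_participants
        * comb(n_questions, k)
        * p_correct ** k
        * (1 - p_correct) ** (n_questions - k)
        for k in range(n_questions + 1)
    ]


# Assumed parameters: 9 questions (from the text), 100 participants (from the text),
# and a guessing probability of 1/3 (an assumption, not stated in the excerpt).
counts = expected_counts(n_questions=9, p_correct=1 / 3, n_participants=100)
for k, c in enumerate(counts):
    print(f"{k} correct: {c:.2f} participants expected")
```

Under these assumptions the expected counts sum to the number of participants and peak at 3 correct answers, so a sample in which most participants score well above 3 would be hard to attribute to guessing.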


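The results in [Table 4-18] show that for the three explanation methods, 3MTL@SD, 3MTL@AS, and Pretrained@AS, the amount wagered, that is, the trust placed in the model, differs significantly as the rounds progress.

One-way MANOVA

Multivariate Test

Univariate Test

Source   Measure   F   Sig.   Post-hoc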


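… is Pretrained@AS (37), while the least trusted method is 3MTL@AS (12). 3MTL@AS changes little from the first round to the fourth, but over the four rounds of betting we observe that, although …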

[Figure: Change in human trust. x-axis: round (Round1 to Round4); y-axis: trust level; series: 3MTL@SD, 3MTL@AS, Pretrained@AS]


… (12.66) is significantly lower than the other two. 3MTL@SD (49.39) is higher than Pretrained@AS (37.95), a difference that approaches significance (p = .052).

Table 4-19 Round1 One-way ANOVA

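In [Table 4-20] we can see that in the second round the explanation method still has a significant effect on trust ($F_{(2,156)} = 25.736$, p < .001), but this significance comes mainly from the difference between 3MTL@AS (14.04) and the other two methods. The difference between 3MTL@SD (48.16) and Pretrained@AS (37.08) is further from significance than in the first round (p = .092).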

Round1 One-way ANOVA

Method             N    Mean    Std. Deviation
1: 3MTL@SD         79   49.39   28.54
2: 3MTL@AS         79   12.66   14.86
3: Pretrained@AS   79   37.95   24.88

Test of Within-Subjects Effect

Source           df    F        Sig.
Method           2     33.732   .000
Error (Method)   156

Pairwise Comparisons

Method (I)   Method (J)   Mean Difference   Std. Error   Sig.
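The within-subjects test in these round tables is a one-way repeated-measures ANOVA: every participant wagers on all three methods, so the error term excludes between-participant variation. The sketch below shows how such an F statistic and its degrees of freedom arise. With k = 3 methods and n = 79 participants, it gives df = 2 for Method and (3 - 1)(79 - 1) = 156 for Error, matching the tables. This is not the authors' analysis code. The function name `rm_anova` and the toy scores are hypothetical, and no sphericity correction is applied.

```python
def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: list of rows, one per participant, each holding k condition values.
    Returns (F, df_method, df_error).
    """
    n = len(scores)        # participants
    k = len(scores[0])     # conditions (explanation methods)
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    # Partition the total sum of squares into method, participant, and error parts.
    ss_method = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subject = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_method - ss_subject

    df_method = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_method / df_method) / (ss_error / df_error)
    return f_value, df_method, df_error


# Hypothetical toy wagers for 4 participants x 3 methods (illustrative only).
toy_scores = [
    [5, 1, 3],
    [6, 2, 4],
    [7, 2, 6],
    [6, 3, 3],
]
f_value, df_method, df_error = rm_anova(toy_scores)
print(f"F({df_method}, {df_error}) = {f_value:.3f}")
```

The pairwise comparisons that follow such a test are typically paired comparisons between two methods at a time. The correction used for them is not stated in this excerpt.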


The difference between 3MTL@SD (42.23) and Pretrained@AS (43.54) has essentially disappeared.

Round2 One-way ANOVA

Method             N    Mean    Std. Deviation
1: 3MTL@SD         79   48.16   28.63
2: 3MTL@AS         79   14.04   16.72
3: Pretrained@AS   79   37.08   27.92

Test of Within-Subjects Effect

Source           df    F        Sig.
Method           2     25.736   .000
Error (Method)   156

Pairwise Comparisons

Method (I)   Method (J)   Mean Difference   Std. Error   Sig.


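In the fourth round [Table 4-22], the wagers on 3MTL@SD (39.81) and Pretrained@AS (49.23) do not differ significantly (p = .119), but a clear reversal has emerged. 3MTL@AS remains the least trusted method; its mean (10.96) is still significantly lower than those of the other two models.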

Round3 One-way ANOVA

Method             N    Mean    Std. Deviation
1: 3MTL@SD         79   42.23   29.33
2: 3MTL@AS         79   14.23   17.94
3: Pretrained@AS   79   43.54   29.24

Test of Within-Subjects Effect

Source           df    F        Sig.
Method           2     21.258   .000
Error (Method)   156

Pairwise Comparisons

Method (I)   Method (J)   Mean Difference   Std. Error   Sig.


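Pretrained@AS gains trust from the first round to the fourth (+11.27, p < .001), whereas 3MTL@AS changes little over the same period (-1.69, p = .291). Over the course of the experiment, then, 3MTL@SD gradually loses trust while Pretrained@AS gradually gains it.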

Round4 One-way ANOVA

Method             N    Mean    Std. Deviation
1: 3MTL@SD         79   39.81   26.78
2: 3MTL@AS         79   10.96   12.60
3: Pretrained@AS   79   49.23   27.73

Test of Within-Subjects Effect

Source           df    F        Sig.
Method           2     38.174   .000
Error (Method)   156

Pairwise Comparisons

Method (I)   Method (J)   Mean Difference   Std. Error   Sig.


I was nearly charged with petty theft for pilfering coffee at the illustrious Hippodrome Building. But lest I be judged too quickly I must convey the sublimity of the fourth floor's coffee machine. Harry Houdini performed at the Hippodrome at 1120 Avenue of the Americas near 44th Street. Many of the best and most famous performers of the time appeared there. It was one of the biggest and most successful theaters of its time capable of accommodating 5200 people.

Hypothesis: Harry Houdini was a magician.

Relation: neutral

3MTL@SD: " Harry Houdini performed at the Hippodrome "

3MTL@AS: " Harry Houdini performed "

Pretrained@AS: " Harry Houdini performed at the Hippodrome at 1120 Avenue of the Americas near 44th Street. Many of the best and most famous performers "
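One rough way to see how much surrounding context each explanation carries in this example is to compare the length of each span with the number of its tokens that also appear in the hypothesis. The sketch below does this for the three spans quoted above. The overlap measure and the helper names `tokens` and `overlap_stats` are illustrative assumptions, not an evaluation metric used in the thesis.

```python
import re


def tokens(text):
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap_stats(span, hypothesis):
    """Return (span length in tokens, number of span tokens found in the hypothesis)."""
    span_tokens = tokens(span)
    hyp_vocab = set(tokens(hypothesis))
    shared = sum(1 for t in span_tokens if t in hyp_vocab)
    return len(span_tokens), shared


hypothesis = "Harry Houdini was a magician."
spans = {
    "3MTL@SD": "Harry Houdini performed at the Hippodrome",
    "3MTL@AS": "Harry Houdini performed",
    "Pretrained@AS": (
        "Harry Houdini performed at the Hippodrome at 1120 Avenue of the "
        "Americas near 44th Street. Many of the best and most famous performers"
    ),
}
stats = {name: overlap_stats(span, hypothesis) for name, span in spans.items()}
for name, (length, shared) in stats.items():
    print(f"{name}: {length} tokens, {shared} shared with the hypothesis")
```

All three spans share the same two tokens with the hypothesis ("Harry", "Houdini"), but the Pretrained@AS span is several times longer than the other two, so it carries much more surrounding context.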

Wager amount, Round 4 minus Round 1

Method           Difference   t       df   Sig.
3MTL@SD          -9.58        3.33    78   .001
Pretrained@AS    +11.27       -3.70   78   .000
3MTL@AS          -1.69        1.06    78   .291
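The round-to-round comparison above is a paired-samples t test: each of the 79 participants contributes one difference per method, which gives df = 79 - 1 = 78. A minimal sketch of the statistic follows. The function name `paired_t` and the toy wagers are hypothetical and are not the experiment's data. The sign of t follows the order of subtraction, so swapping the two rounds flips the sign of both the mean difference and t.

```python
from math import sqrt


def paired_t(first, last):
    """Paired-samples t statistic for the differences last - first.

    Returns (mean_difference, t, df).
    """
    if len(first) != len(last):
        raise ValueError("both rounds need one value per participant")
    diffs = [b - a for a, b in zip(first, last)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / sqrt(var_d / n)
    return mean_d, t, n - 1


# Hypothetical toy wagers for 5 participants (illustrative only).
round1 = [50, 40, 60, 30, 45]
round4 = [40, 35, 45, 30, 40]
mean_d, t, df = paired_t(round1, round4)
print(f"mean difference = {mean_d:.2f}, t({df}) = {t:.3f}")
```

The p value is then read from a t distribution with the returned degrees of freedom.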


Barack Obama's presidency and friends said he owned several handguns and an AK-47 assault rifle. Three officers killed. Autopsies show Sciullo 37 died of wounds to the head and torso. Mayhle 29 was shot in the head. A witness awakened by two gunshots told investigators of seeing the gunman standing in the home's front doorway and firing two to three shots into one officer who was already down. Sciullo was later found dead in the home's living room and Mayhle near the front stoop police said. A third officer Eric Kelly 41 was killed as he arrived to assist the first two officers. Kelly was in uniform but on his way home when he responded and was gunned down in the street.

Kelly's radio call for help summoned other officers including a SWAT team.

Hypothesis: Sciullo was killed by Poplawski.

Relation: entailment

3MTL@SD: " Three officers killed. Autopsies show Sciullo 37 "

3MTL@AS: " killed. Autopsies show Sciullo "


Pretrained@AS: " Poplawski was concerned about his weapons being seized during Barack Obama's presidency and friends said he owned several handguns and an AK-47 assault rifle. Three officers killed. Autopsies show Sciullo "

In the question above, all three methods capture text related to the hypothesis H; in …


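… combines the self-attention mechanism and, from the pre-training stage onward, already has a certain grasp of natural language processing problems.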


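… for explanations of the unrelated (neutral) relation, how to give explanations that better fit its characteristics; and for judging the contradiction relation, how to handle issues such as negative words, so as to improve the model's performance when judging contradiction.

Regarding self-attention and explainability, we hope to carry out a more rigorous and broader exploration. Our current evaluation of explanation methods is limited to the natural language inference task. Natural language inference, however, is a task whose performance is measured by accuracy, and some trade-off between explainability and accuracy is hard to avoid: if accuracy is high enough, can explainability be disregarded? Deeper research in other settings, such as medical decision-making, where both high accuracy and explainability are required, or in other areas of natural language processing, would greatly help in deploying artificial intelligence systems in real applications and in improving how they interact with humans and earn their trust.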




Appendix

1.1 Content of the Trust Evaluation Experiment
