1.3 Outline

The remaining chapters are organized as follows. Chapter 2 begins with a spectrum of essential background that helps the reader grasp the main concepts of this dissertation, including prior work on machine learning, especially neural networks, and the methods and tasks that are prominent in natural language processing. Chapter 3 then describes our approaches for adapting pre-trained models to various NLP tasks, including classification, sequence labeling, sentiment analysis, entailment, and machine translation. The experiments on these tasks are presented in Chapter 4. Chapter 5 provides a discussion of the theoretical aspects of the experiments on the robustness of self-attentive models as compared with recurrent neural networks. Finally, Chapter 6 concludes this work by summarizing the results of the previous chapters and proposing advances that can be made in the future.

1.4 Publications

The current dissertation is based upon previous work by the author and other collaborators, listed below.

1. Yu-Lun Hsieh, Minhao Cheng, Da-Cheng Juan, Wei Wei, Wen-Lian Hsu, Cho-Jui Hsieh, “On the Robustness of Self-Attentive Models,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).

2. Yu-Lun Hsieh, Yung-Chun Chang, Nai-Wen Chang, Wen-Lian Hsu, “Identifying Protein-protein Interactions in Biomedical Literature using Recurrent Neural Networks with Long Short-Term Memory,” in Proceedings of the Eighth International Joint Conference on Natural Language Processing (IJCNLP 2017).

3. Yu-Lun Hsieh, Yung-Chun Chang, Yi-Jie Huang, Shu-Hao Yeh, Chun-Hung Chen, Wen-Lian Hsu, “MONPA: Multi-objective Named-entity and Part-of-speech Annotator for Chinese using Recurrent Neural Network,” in Proceedings of the Eighth International Joint Conference on Natural Language Processing (IJCNLP 2017).

4. Yu-Lun Hsieh, Shih-Hung Liu, Kuan-Yu Chen, Hsin-Min Wang, Wen-Lian Hsu, Berlin Chen, “Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization,” in Proceedings of the 28th International Conference on Computational Linguistics and Speech Processing (ROCLING 2016).

Other work related to this topic includes:

5. Yu-Lun Hsieh, Yung-Chun Chang, Chun-Han Chu, Wen-Lian Hsu, “How Do I Look? Publicity Mining From Distributed Keyword Representation of Socially Infused News Articles,” in Proceedings of the Fourth International Workshop on Natural Language Processing for Social Media (collocated with EMNLP 2016).

6. Yu-Lun Hsieh, Shih-Hung Liu, Yung-Chun Chang, Wen-Lian Hsu, “Neural Network-Based Vector Representation of Documents for Reader-Emotion Categorization,” in Proceedings of the 2015 IEEE International Conference on Information Reuse and Integration (IRI), pp. 569–573, San Francisco, CA, USA, 2015.

7. Yu-Lun Hsieh, Shih-Hung Liu, Yung-Chun Chang, Wen-Lian Hsu, “Distributed Keyword Vector Representation for Document Categorization,” in Proceedings of the 2015 Conference on Technologies and Applications of Artificial Intelligence (TAAI), pp. 245–251, 2015.

The following publications were also completed during the course of the Ph.D. program:

8. Zheng-Wen Lin, Yung-Chun Chang, Chen-Ann Wang, Yu-Lun Hsieh, Wen-Lian Hsu, “CIAL at IJCNLP-2017 Task 2: An Ensemble Valence-Arousal Analysis System for Chinese Words and Phrases,” in Proceedings of the IJCNLP 2017, Shared Tasks.

9. Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu, “Exploiting Graph Regularized Nonnegative Matrix Factorization for Extractive Speech Summarization,” in Proceedings of APSIPA 2016.

10. Shih-Hung Liu, Kuan-Yu Chen, Yu-Lun Hsieh, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, Wen-Lian Hsu, “Exploring Word Mover’s Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization,” in Proceedings of INTERSPEECH 2016.

11. Ting-Hao Yang, Yu-Lun Hsieh, You-Shan Chung, Cheng-Wei Shih, Shih-Hung Liu, Yung-Chun Chang, Wen-Lian Hsu, “Principle-Based Approach for Semi-Automatic Construction of a Restaurant Question Answering System from Limited Datasets,” in Proceedings of the 2016 IEEE International Conference on Information Reuse and Integration (IRI), pp. 520–524, Pittsburgh, PA, 2016.

12. Nai-Wen Chang, Hong-Jie Dai, Yu-Lun Hsieh, Wen-Lian Hsu, “Statistical Principle-Based Approach for Detecting miRNA-Target Gene Interaction Articles,” in Proceedings of the 2016 IEEE 16th International Conference on Bioinformatics and Bioengineering (BIBE), 2016.

Chapter 2

Background and Related Work

In this chapter, we briefly review the foundations of natural language processing, the main focus of this research. We then describe the fundamental technologies that are utilized in the rest of this work, and introduce the reader in more detail to the essential models under investigation in this dissertation, namely neural networks.

2.1 Natural Language Processing

Natural language processing (NLP) has been an essential part of the development of artificial intelligence since as early as the 1950s [69]. The main aim of this field is to design models and methods that allow a computer, or any machine, to store, process, and eventually understand human languages.

There are many levels of processing when dealing with language. Depending on the language, one may first need to perform lemmatization or stemming, steps that reduce words to smaller, meaningful units. A related task is “word segmentation,” in which one must find the word boundaries within a sentence when they are not marked explicitly; Chinese, Japanese, and Thai are examples of languages in this category. Part-of-speech (POS) labeling, or tagging, refers to assigning to each word a label that represents its part of speech. POS categories are classes of words used for grammatical description; they include the verb, the noun, the pronoun, the adjective, the adverb, the preposition, the conjunction, the article, and the interjection [29]. For example, in “the dog barks,” “dog” is a noun and “barks” is a verb.
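
To make the segmentation problem just described concrete, the following is a minimal sketch of greedy forward maximum matching, a classical dictionary-based baseline; the lexicon and example sentence are invented for illustration and are not part of the systems studied in this dissertation.

```python
def max_match(sentence: str, lexicon: set, max_len: int = 4) -> list:
    """Greedy forward maximum matching: at each position, take the longest
    dictionary word that matches, falling back to a single character."""
    tokens, i = [], 0
    while i < len(sentence):
        for j in range(min(len(sentence), i + max_len), i, -1):
            if sentence[i:j] in lexicon or j == i + 1:
                tokens.append(sentence[i:j])
                i = j
                break
    return tokens

# Toy lexicon. 研究生命起源 is ambiguous between 研究/生命/起源
# ("study the origin of life") and a reading beginning with 研究生
# ("graduate student"); the greedy heuristic picks the wrong one.
lexicon = {"研究", "研究生", "生命", "命", "起源"}
print(max_match("研究生命起源", lexicon))  # ['研究生', '命', '起源']
```

This failure on a classic ambiguous example is precisely why modern segmenters learn boundaries from annotated data rather than relying on greedy dictionary lookup.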

Other major components of NLP include:

Syntactic (constituency) parsing involves creating a structured representation of the syntactic relationships among the words of a sentence.

Dependency parsing aims at identifying the grammatical relations within a sentence, such as its subject, object, and predicate. This is done by labeling the relationship between pairs of words, typically a head and its dependent.

Named entity recognition focuses on finding the named entities in a sentence, including persons, places, organizations, etc.

Sentiment analysis, or opinion mining, refers to identifying the affective content of text. It is commonly employed to analyze product reviews, survey responses, social media posts, etc., for use in applications such as marketing or customer service.

Entailment detection aims to determine the directional relationship between statements. Given a piece of text T and a hypothesis H, we say that T entails H if one who reads T would infer that H is very likely to be true. The directionality means that the reverse does not automatically hold, i.e., H does not necessarily entail T.
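
As a toy illustration of this directionality (the sentences, threshold, and heuristic below are invented for exposition; real entailment systems are trained classifiers, not word-overlap rules):

```python
def entails(premise: str, hypothesis: str, threshold: float = 0.9) -> bool:
    """Toy heuristic: call H entailed by T when nearly all of H's words
    also appear in T. Illustrative only."""
    p, h = set(premise.lower().split()), set(hypothesis.lower().split())
    return len(h & p) / max(len(h), 1) >= threshold

T = "a man is playing a guitar on stage"
H = "a man is playing a guitar"
print(entails(T, H))  # True: every word of H is covered by T
print(entails(H, T))  # False: "on" and "stage" are not covered by H
```

The asymmetric outcome mirrors the definition above: T entails H, but H does not entail T.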

Machine translation models generate a translation from one language to another based on the training data in a bilingual corpus. The task can be traced back to Weaver’s proposal [72] that a machine could be utilized to handle translation. Traditionally, statistical machine translation was the common approach; in recent years, the application of neural networks has boosted performance to a new peak.

Summarization has historically received more attention in its extractive form, while work on abstractive summarization was rather rare. In view of the recent success of deep learning, research on abstractive summarization has been growing, and recent literature has preliminarily verified the effectiveness of RNNs for producing rewritten, i.e., abstractive, summaries of documents. Moreover, many have noticed the contribution of the attention mechanism: while generating text, it increases the weight given to key segments of the input, thereby composing a better summary.
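
To make this weighting concrete, one standard formulation of attention over encoder hidden states $h_1, \dots, h_n$ (in the style of additive attention; the notation here is generic rather than taken from any specific system in this dissertation) computes, at each decoding step $t$:

\begin{align}
e_{t,i} &= \operatorname{score}(s_{t-1},\, h_i), \\
\alpha_{t,i} &= \frac{\exp(e_{t,i})}{\sum_{j=1}^{n} \exp(e_{t,j})}, \\
c_t &= \sum_{i=1}^{n} \alpha_{t,i}\, h_i,
\end{align}

where $s_{t-1}$ is the previous decoder state, the weights $\alpha_{t,i}$ quantify how strongly each input segment influences the current output word, and the context vector $c_t$ is fed to the decoder when generating that word.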
