
6.2 Generative Adversarial Networks

6.2.2 Generative Adversarial Network Experimental Results

Model / Dataset    IMDB      20Newsgroups
MLP                74.35%    65.48%
GAN                75.61%    66.83%

Table 6.3 Experimental results of the generative adversarial network
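The gap between the two rows of Table 6.3 can be checked with a short calculation. The sketch below assumes the percentages are classification accuracies and simply subtracts the MLP row from the GAN row; the function name `gain_over_baseline` and the dictionary layout are illustrative choices, not part of the thesis.

```python
# Values (%) copied from Table 6.3; treating them as accuracies is an assumption.
accuracy = {
    "MLP": {"IMDB": 74.35, "20Newsgroups": 65.48},
    "GAN": {"IMDB": 75.61, "20Newsgroups": 66.83},
}


def gain_over_baseline(results, model="GAN", baseline="MLP"):
    """Return the per-dataset difference (model minus baseline) in percentage points."""
    return {
        dataset: round(results[model][dataset] - results[baseline][dataset], 2)
        for dataset in results[model]
    }


gains = gain_over_baseline(accuracy)
for dataset, delta in gains.items():
    print(f"{dataset}: {delta:+.2f} percentage points")
```

Under these assumptions the GAN is ahead of the MLP by 1.26 percentage points on IMDB and 1.35 percentage points on 20Newsgroups.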

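Table 6.3 presents the experimental results of the generative adversarial network. The results are not better than those of the Siamese neural network described earlier. We believe the reason lies in the noise input: tuning its dimension is a key factor, yet we cannot control how the generative model uses the noise. As a result, both the training process and the results of the generative adversarial network are currently hard to control. To stabilize generative adversarial networks, many researchers have since proposed model improvements and theoretical analyses.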
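To make the role of the noise dimension concrete, the following is a minimal sketch of where it enters a generator. It is not the architecture used in this thesis: the one-layer generator G(z) = tanh(Wz + b), the names `make_generator` and `sample_noise`, the standard normal prior, the output size of 16, and the candidate dimensions 8, 32, and 128 are all assumptions chosen for illustration.

```python
import math
import random


def make_generator(noise_dim, feature_dim, seed=0):
    """Build a hypothetical one-layer generator G(z) = tanh(W z + b)."""
    rng = random.Random(seed)
    # W has feature_dim rows and noise_dim columns, so the noise dimension
    # directly sets the number of input weights of the generator.
    weights = [[rng.gauss(0.0, 0.1) for _ in range(noise_dim)]
               for _ in range(feature_dim)]
    bias = [0.0] * feature_dim

    def generate(z):
        if len(z) != noise_dim:
            raise ValueError(f"expected noise of length {noise_dim}, got {len(z)}")
        return [math.tanh(sum(w * x for w, x in zip(row, z)) + b)
                for row, b in zip(weights, bias)]

    return generate


def sample_noise(noise_dim, rng):
    """Draw z from a standard normal prior, one value per noise dimension."""
    return [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]


rng = random.Random(42)
for noise_dim in (8, 32, 128):  # candidate values, not taken from the thesis
    generator = make_generator(noise_dim, feature_dim=16)
    fake = generator(sample_noise(noise_dim, rng))
    print(noise_dim, len(fake))
```

The output length stays at 16 for every noise dimension, while the number of input weights grows with `noise_dim`. The noise dimension is therefore a hyperparameter to tune, but how the trained weights actually use each noise component is not something the experimenter sets directly.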


Chapter 7 Conclusion and Future Work

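This thesis proposes using Siamese neural networks and generative adversarial networks to learn better representations for automatic text classification. With the Siamese neural network, the model learns the topical association between a text and its category, which effectively improves performance on automatic text classification tasks.

In future work on the Siamese neural network, we will try more complex sub-network architectures and examine the relationship between the sub-network and the similarity function. For the generative adversarial network, we will try other adversarial networks, examine how they differ, and aim to build an architecture dedicated to automatic text classification.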


