
Language model research dates back to the n-gram language model, which is built on word frequency counts and multinomial distributions. The original goal of the n-gram language model was to determine the probability of a given word sequence. Following this model, researchers proposed a variety of language model architectures that capture fine- or coarse-grained semantic and syntactic regularities. The wide array of language models developed so far falls roughly into four main categories: 1) word-regularity models, 2) topic models, 3) continuous language models, and 4) neural network-based language models (cf. Chapter 2.1). Founded on this body of pioneering research, this thesis has proposed several novel extensions, described new developments, and shared interesting findings.
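To make the counting view concrete, the following is a minimal sketch (maximum-likelihood bigram estimation without smoothing; purely illustrative and not a method proposed in this thesis):

from collections import Counter

def train_bigram(corpus):
    """corpus: a list of token lists; returns bigram and history counts."""
    bigram, history = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent + ["</s>"]
        history.update(toks[:-1])
        bigram.update(zip(toks[:-1], toks[1:]))
    return bigram, history

def sequence_prob(sent, bigram, history):
    """P(w_1 ... w_n), approximated as a product of count-ratio bigram probabilities."""
    toks = ["<s>"] + sent + ["</s>"]
    prob = 1.0
    for prev, cur in zip(toks[:-1], toks[1:]):
        if history[prev] == 0 or bigram[(prev, cur)] == 0:
            return 0.0  # unseen event: probability collapses to zero without smoothing
        prob *= bigram[(prev, cur)] / history[prev]
    return prob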

Figure 8.1 summarizes some important language models year by year, together with the contributions of this thesis.

The Unified Framework for Pseudo-Relevance Feedback

Language models have been widely used for information retrieval. However, this approach has two major challenges: 1) a query is often a vague expression of the underlying information need, and 2) there can be word usage mismatch between a query and a document even if they are topically related to each other.

To mitigate these problems, in Chapter 4 we reformulated the original queries with relevance-based language models derived from different objective functions, and then proposed a principled framework that unifies the relationships among most of the widely used query modeling formulations. This line of research has also been introduced to extractive summarization.
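For illustration only (a common instantiation of relevance-based query reformulation, not necessarily the exact objective functions unified in Chapter 4), the reformulated query model can be taken as an interpolation of the original query model with a feedback relevance model estimated from the set F of top-ranked documents,

\hat{\theta}_Q = (1 - \alpha)\,\theta_Q + \alpha\,\theta_F, \qquad
P(w \mid \theta_F) \propto \sum_{D \in F} P(w \mid \theta_D)\, P(Q \mid \theta_D)\, P(D),

after which documents can be re-ranked, for example, by the negative Kullback-Leibler divergence -\mathrm{KL}(\hat{\theta}_Q \,\|\, \theta_D).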

The I-vector based Language Modeling Framework for Retrieval

The i-vector technique, which reduces a series of acoustic feature vectors of a speech utterance to a low-dimensional vector representation, has yielded great performance improvements in language identification and speaker recognition.
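For orientation, in the standard total-variability formulation (standard notation, not taken verbatim from this thesis), the utterance-dependent GMM mean supervector M is modeled as

M = m + T\,x,

where m is the universal background model (UBM) mean supervector, T is a low-rank total variability matrix, and the posterior mean of the latent variable x serves as the i-vector.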

In Chapter 5, we adopted this concept to develop the i-vector based language model (IVLM) for information retrieval. The major challenge of using IVLM for query modeling is that queries usually consist of only a few words, which makes it difficult to learn reliable representations.

Figure 8.1. The important language models and the proposed frameworks, summarized year by year:

• Probabilistic Latent Semantic Analysis (1999)
• Discriminative Training Language Model (2000)
• Relevance-based Language Model (2001)
• Three Mixture Model (2002)
• Minimum Word Error Training Language Model (2005)
• Word Topic Model (2006)
• Word Vicinity Model (2006)
• Gaussian Mixture Language Model (2007)
• Global Conditional Log-linear Model (2007)
• Log-bilinear Language Model (2007)
• Continuous Topic Language Model (2008)
• Pseudo-conventional N-gram Model (2008)
• C&W Neural Network Language Model (2008)
• Tied-Mixture Language Model (2009)
• I-vector Technique (2009)
• Recurrent Neural Network Language Model (2010)
• Round-robin Discriminative Language Model (2011)
• Continuous Bag-of-words Representation (2013)
• Skip-gram Representation (2013)
• Global Vector (2014)
• Unified Framework (2014)
• RNN for Summarization (2014)
• I-vector based Language Modeling (2014)
• Word Embeddings for Summarization (2015)

To more accurately represent users' information needs, three novel query reformulation methods were proposed for use in spoken document retrieval (SDR). It is also expected that conventional language identification and speaker recognition applications can benefit from our methods. In addition, IVLM training yields a useful by-product: document (or query) and word embeddings.

The RNNLM-based Framework for Summarization

Language models have been used for unsupervised summarization. However, it remains challenging to formulate the sentence models and to estimate their parameters for each document to be summarized. In Chapter 6, we proposed a novel recurrent neural network language modeling (RNNLM) framework, trained with a curriculum learning strategy, to render word usage cues and to capture long-span structural information about word co-occurrence relationships within documents. We also explored different model complexities and combination strategies, and provided in-depth elucidations of the modeling characteristics and the associated summarization performance of the various instantiated methods.
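As a purely illustrative sketch of the ranking idea (assuming PyTorch; the layer sizes and helper names here are hypothetical and not the configuration of Chapter 6), a small RNN language model can be fit to the document to be summarized, and each candidate sentence scored by its average per-word log-probability:

import torch
import torch.nn as nn

class TinyRNNLM(nn.Module):
    def __init__(self, vocab_size, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.RNN(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, ids):                  # ids: (batch, seq_len)
        hidden, _ = self.rnn(self.embed(ids))
        return self.out(hidden)               # logits over the vocabulary

def sentence_score(model, ids):
    """Average per-word log-probability of a token-id tensor of shape (1, n)."""
    inp, tgt = ids[:, :-1], ids[:, 1:]
    log_probs = torch.log_softmax(model(inp), dim=-1)
    token_lp = log_probs.gather(-1, tgt.unsqueeze(-1)).squeeze(-1)
    return token_lp.mean().item()

# Usage idea: train TinyRNNLM on the sentences of one document (e.g., ordered
# from short to long as a simple curriculum), then rank each candidate sentence
# by sentence_score and select the top-ranked ones as the extractive summary.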

The Word Embedding Framework for Summarization

Recently, word embedding has become a popular research area due to its excellent performance in many natural language processing (NLP) tasks. However, as far as we are aware, there has been little work investigating its use in extractive spoken document summarization. A common way of leveraging word embeddings is to represent a document (or sentence) by averaging the embeddings of the words occurring in it; the cosine similarity measure can then be used to determine the degree of relevance between a pair of representations. Beyond the continued efforts to improve word representations, in Chapter 7 we proposed novel and efficient ranking models based on general word embedding methods. In addition, we presented a novel probabilistic modeling framework for learning word and sentence representations, which not only inherits the advantages of the original word embedding methods but also boasts a clear and rigorous probabilistic foundation.
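A minimal sketch of this averaging-plus-cosine usage (assuming pre-trained embeddings are available as a mapping from word to vector; the function names are hypothetical):

import numpy as np

def average_embedding(words, emb, dim=300):
    # Mean of the embeddings of the in-vocabulary words; zero vector if none.
    vecs = [emb[w] for w in words if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def rank_sentences(doc_words, sentences, emb):
    # Score each sentence (a list of words) by its similarity to the whole document.
    doc_vec = average_embedding(doc_words, emb)
    scores = [cosine(doc_vec, average_embedding(s, emb)) for s in sentences]
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)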

I believe this thesis will help make statistical language modeling more attractive for future research. Still, additional experiments and analyses are needed, and there are plenty of related research subtopics that remain to be investigated. It is my hope that this work will prove to be a cornerstone for me and others in establishing more elegant, elaborate, and powerful methods in the near future.
