
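In this thesis, we investigated applying nonnegative matrix factorization (NMF) to the modulation spectra of speech features, and further discussed how to extract more robust information within this framework. The contributions fall into two main parts: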

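1. Extensions of nonnegative matrix factorization, in two forms:

a. The first extension clusters the training data so that utterances with different characteristics are processed separately, which describes the important local information in the speech modulation spectrum at a finer granularity. When clustering is applied and global and local information are used together, recognition accuracy improves markedly; combined with CMVN, it even surpasses the recognition performance of AFE. This thesis also discusses how to choose the number of clusters. The results show that recognition accuracy drops sharply when too many clusters are used, because each cluster is then left with insufficient training data.

b. The second extension concerns the sparsification of NMF, applying sparsity constraints to each column or each row of the basis matrix. The results show that column-wise sparsification works better and clearly benefits speech recognition. Sparse NMF used alone improves on the original NMF by about 10%. This thesis also examines the sparsity-ratio parameter of the sparse-NMF-based modulation spectrum normalization method: the lower the sparsity ratio, the higher the redundancy among the basis vectors, and the higher the sparsity ratio, the less information is duplicated across the basis vectors. The recognition results show that a sparsity ratio that is either too high or too low degrades recognition, so the best value has to be found by experimental tuning.

2. An extension of compressive sensing: this thesis proposes a novel idea, applying compressive sensing to the modulation spectrum and reconstructing the data from a linear combination of a small number of more relevant training samples.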

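The experimental results also show that the low-frequency part of the modulation spectrum carries more of the information that matters for speech recognition. Therefore, if the NMF basis vectors are mostly distributed over the low frequencies and the high-frequency part is free of noise interference, the recognition rate improves noticeably.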

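Future work can be discussed along two directions:

1. Additional processing of the training data, attaching other helpful information to it. For example, the training data could be grouped by difficulty or by degree of consistency and, imitating human learning strategies, presented for training in order from easy to hard, in the hope that NMF-related methods converge to a better solution.

2. Studying different sparse NMF methods, comparing the strengths and weaknesses of different sparsity constraints such as the L1-norm, L2-norm, and L0-norm, and validating them on larger amounts of data.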




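Appendix

Appendix 1: Derivation of the Nonnegative Matrix Factorization Formulas

With the concept of nonnegative matrix factorization in place, we now discuss in detail how its update rules are derived.

(1) Derivation of the update rules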

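Starting from the least-squares loss function, the update rule for H under the Euclidean distance can be rewritten as:

$$H_{im} \leftarrow H_{im} + \eta_{im}\left[(W^{T}V)_{im} - (W^{T}WH)_{im}\right] \tag{A-1}$$

When $\eta_{im}$ is a small positive number, Eq. (3-3) can be regarded as conventional gradient descent, and for $\eta_{im}$ sufficiently close to 0 the update decreases the loss $\|V - WH\|$. If we instead set $\eta_{im}$ to:

$$\eta_{im} = \frac{H_{im}}{(W^{T}WH)_{im}} \tag{A-2}$$

then substituting Eq. (3-6) into Eq. (3-5) yields the H update rule of Eq. (3-4):

\begin{align*}
H_{im} &\leftarrow H_{im} + \frac{H_{im}}{(W^{T}WH)_{im}}\left[(W^{T}V)_{im} - (W^{T}WH)_{im}\right] \\
H_{im} &\leftarrow \frac{H_{im}(W^{T}WH)_{im} + H_{im}(W^{T}V)_{im} - H_{im}(W^{T}WH)_{im}}{(W^{T}WH)_{im}} \\
H_{im} &\leftarrow \frac{H_{im}(W^{T}V)_{im}}{(W^{T}WH)_{im}} \\
H_{im} &\leftarrow H_{im}\,\frac{(W^{T}V)_{im}}{(W^{T}WH)_{im}}
\end{align*}

Similarly, starting from the loss function, the update rule for W under the Euclidean distance can be rewritten as: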

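$$W_{ni} \leftarrow W_{ni} + \eta_{ni}\left[(VH^{T})_{ni} - (WHH^{T})_{ni}\right] \tag{A-3}$$

For $\eta_{ni}$ sufficiently close to 0, the update again decreases the loss $\|V - WH\|$. If we set $\eta_{ni}$ to:

$$\eta_{ni} = \frac{W_{ni}}{(WHH^{T})_{ni}} \tag{A-4}$$

then substituting Eq. (4-8) into Eq. (4-5) yields the W update rule of Eq. (4-4):

\begin{align*}
W_{ni} &\leftarrow W_{ni} + \frac{W_{ni}}{(WHH^{T})_{ni}}\left[(VH^{T})_{ni} - (WHH^{T})_{ni}\right] \\
W_{ni} &\leftarrow \frac{W_{ni}(WHH^{T})_{ni} + W_{ni}(VH^{T})_{ni} - W_{ni}(WHH^{T})_{ni}}{(WHH^{T})_{ni}} \\
W_{ni} &\leftarrow \frac{W_{ni}(VH^{T})_{ni}}{(WHH^{T})_{ni}} \\
W_{ni} &\leftarrow W_{ni}\,\frac{(VH^{T})_{ni}}{(WHH^{T})_{ni}}
\end{align*}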
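The two multiplicative rules derived above can be exercised on a toy matrix. The following is a minimal pure-Python sketch, not the implementation used in this thesis: the helper names (`matmul`, `update_H`, `update_W`, `nmf`), the random initialization, and the small constant `eps` added to the denominators to avoid division by zero are all assumptions made for illustration.

```python
import random

def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def loss(V, W, H):
    """Squared Euclidean loss ||V - WH||^2."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def update_H(V, W, H, eps=1e-12):
    """H_im <- H_im * (W^T V)_im / (W^T W H)_im (eps is an added safeguard)."""
    Wt = transpose(W)
    num = matmul(Wt, V)
    den = matmul(matmul(Wt, W), H)
    return [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
            for rh, rn, rd in zip(H, num, den)]

def update_W(V, W, H, eps=1e-12):
    """W_ni <- W_ni * (V H^T)_ni / (W H H^T)_ni (eps is an added safeguard)."""
    Ht = transpose(H)
    num = matmul(V, Ht)
    den = matmul(W, matmul(H, Ht))
    return [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
            for rw, rn, rd in zip(W, num, den)]

def nmf(V, rank, iters=50, seed=0):
    """Alternate the two updates and record the loss after every iteration."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random start (an illustrative choice).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    history = [loss(V, W, H)]
    for _ in range(iters):
        H = update_H(V, W, H)
        W = update_W(V, W, H)
        history.append(loss(V, W, H))
    return W, H, history

# Toy nonnegative matrix, made up for this example.
V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.5], [0.5, 1.0, 1.0], [3.0, 1.0, 0.2]]
W, H, history = nmf(V, rank=2, iters=30)
print(history[0], history[-1])
```

Because every factor in the updates is nonnegative, W and H stay nonnegative without any explicit clipping, and the recorded loss should not increase from one iteration to the next.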

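Because the step size $\eta_{im}$ chosen here (and likewise $\eta_{ni}$) is not a small value, the update is not guaranteed to satisfy the descent property of gradient descent. We therefore examine next whether this choice of step size is valid.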

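(2) Proof of convergence

To prove that the update rule (Eq. (3-4)) converges, we need to define an auxiliary function, in a manner similar to the Expectation-Maximization (EM) algorithm. The auxiliary function $G$ is introduced to help us bound the actual function $F$.

$G(h, h')$ is an auxiliary function for the actual function $F(h)$ if the following conditions hold:

$$G(h, h') \ge F(h), \qquad G(h, h) = F(h)$$

The relationship between the auxiliary function and the actual function is shown in Figure A1-1:

Figure A1-1: Illustration of the auxiliary function and the actual function; $G(h, h')$ is the auxiliary function and $F(h)$ is the actual function.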

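As Figure A1-1 shows, each time $h$ is updated the value of the function $F$ becomes smaller, that is, it moves closer to a minimum. The update can be written as:

$$h^{t+1} = \arg\min_{h} G(h, h^{t})$$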
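Why this minimization step cannot increase the actual function follows directly from the two defining conditions of an auxiliary function. Writing the auxiliary function as $G$ and the actual function as $F$ (these letters are an assumption, following the usual presentation of this argument), the chain of inequalities is:

```latex
F(h^{t+1}) \;\le\; G(h^{t+1}, h^{t}) \;\le\; G(h^{t}, h^{t}) \;=\; F(h^{t})
```

The first inequality is the upper-bound condition, the second holds because $h^{t+1}$ minimizes $G(\cdot, h^{t})$, and the final equality is the condition that the auxiliary function touches the actual function at $h = h^{t}$.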

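We define a matrix that is to be shown positive semidefinite:

$$M_{ab}(h^{t}) = h^{t}_{a}\,\bigl(K(h^{t}) - W^{T}W\bigr)_{ab}\,h^{t}_{b}$$

$M$ is positive semidefinite exactly when $K(h^{t}) - W^{T}W$ is positive semidefinite. For any vector $\nu$:

\begin{align}
\nu^{T}M\nu &= \sum_{ab} \nu_{a} M_{ab} \nu_{b} \notag \\
&= \sum_{ab} \left[ h^{t}_{a}(W^{T}W)_{ab}h^{t}_{b}\,\nu_{a}^{2} - \nu_{a}h^{t}_{a}(W^{T}W)_{ab}h^{t}_{b}\,\nu_{b} \right] \notag \\
&= \sum_{ab} (W^{T}W)_{ab}\,h^{t}_{a}h^{t}_{b}\left[\tfrac{1}{2}\nu_{a}^{2} + \tfrac{1}{2}\nu_{b}^{2} - \nu_{a}\nu_{b}\right] \tag{A-7} \\
&= \tfrac{1}{2}\sum_{ab} (W^{T}W)_{ab}\,h^{t}_{a}h^{t}_{b}\,(\nu_{a} - \nu_{b})^{2} \notag \\
&\ge 0 \tag{A-9}
\end{align}

Hence $G(h, h') \ge F(h)$ is proved.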
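The identity behind this proof can be checked numerically. The sketch below assumes the diagonal choice $K_{ab}(h^{t}) = \delta_{ab}(W^{T}Wh^{t})_{a}/h^{t}_{a}$ from Lee and Seung's standard argument, which this appendix does not state explicitly; the function name `quad_forms` and the random test values are likewise illustrative only.

```python
import random

def quad_forms(W, h, nu):
    """Return (nu^T M nu, 0.5 * sum_ab (W^T W)_ab h_a h_b (nu_a - nu_b)^2)."""
    r = len(h)
    # A = W^T W, an r x r symmetric matrix with nonnegative entries.
    A = [[sum(row[a] * row[b] for row in W) for b in range(r)] for a in range(r)]
    # Assumed definition: K is diagonal with K_aa = (W^T W h)_a / h_a.
    K_diag = [sum(A[a][b] * h[b] for b in range(r)) / h[a] for a in range(r)]
    lhs = 0.0
    rhs = 0.0
    for a in range(r):
        for b in range(r):
            K_ab = K_diag[a] if a == b else 0.0
            M_ab = h[a] * (K_ab - A[a][b]) * h[b]
            lhs += nu[a] * M_ab * nu[b]
            rhs += 0.5 * A[a][b] * h[a] * h[b] * (nu[a] - nu[b]) ** 2
    return lhs, rhs

# Random positive W and h, and a test vector nu of arbitrary sign.
rng = random.Random(0)
W = [[rng.random() + 0.1 for _ in range(3)] for _ in range(4)]
h = [rng.random() + 0.1 for _ in range(3)]
nu = [rng.uniform(-1.0, 1.0) for _ in range(3)]
print(quad_forms(W, h, nu))
```

The two returned values should agree up to rounding, and since the second is a sum of nonnegative terms, the quadratic form is nonnegative for every test vector.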


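Appendix 2: Detailed Algorithm for Sparse Nonnegative Matrix Factorization and Its Explanation

Algorithm 1: Sparse nonnegative matrix factorization.

(1) Initialize W and H.

(2) If W is to be sparsified, project its vectors so that they are nonnegative, keeping the L2 norm fixed and adjusting the L1 norm to reach the desired sparsity ratio.

(3) If H is to be sparsified, project its vectors so that they are nonnegative, rescaling the L2 norm to unit length and adjusting the L1 norm to reach the desired sparsity ratio.

(4) Iterate:

a. If W is to be sparsified:

i. Set $W \leftarrow W - \mu_{W}(WH - V)H^{T}$.

ii. Project the vectors of W so that they are nonnegative, keeping the L2 norm fixed and adjusting the L1 norm to reach the desired sparsity ratio.

If W is not sparsified, use the original NMF update, i.e. $W_{ni} \leftarrow W_{ni}\,\dfrac{(VH^{T})_{ni}}{(WHH^{T})_{ni}}$.

b. If H is to be sparsified:

i. Set $H \leftarrow H - \mu_{H}W^{T}(WH - V)$.

ii. Project the vectors of H so that they are nonnegative, rescaling the L2 norm to unit length and adjusting the L1 norm to reach the desired sparsity ratio.

If H is not sparsified, use the original NMF update, i.e. $H_{im} \leftarrow H_{im}\,\dfrac{(W^{T}V)_{im}}{(W^{T}WH)_{im}}$.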
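Step (4-a-i) is a plain gradient step on the squared Euclidean loss with respect to W. The sketch below applies that step once to toy matrices. The helper names and the numbers are hypothetical, and the sparsifying projection of step (4-a-ii), which is more involved, is deliberately not reproduced here.

```python
def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def loss(V, W, H):
    """Squared Euclidean loss ||V - WH||^2."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def gradient_step_W(V, W, H, mu):
    """Step (4-a-i): W <- W - mu * (WH - V) H^T, with no projection applied."""
    WH = matmul(W, H)
    residual = [[x - v for x, v in zip(rx, rv)] for rx, rv in zip(WH, V)]
    grad = matmul(residual, transpose(H))
    return [[w - mu * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad)]

# Toy example: a small step size should lower the loss.
V = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
W = [[0.5, 0.5], [0.5, 0.5]]
H = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
W_new = gradient_step_W(V, W, H, mu=0.01)
print(loss(V, W, H), loss(V, W_new, H))
```

The step size `mu` plays the role of $\mu_{W}$; a value that is too large can increase the loss, which is one reason it has to be chosen with care.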

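In Algorithm 1, step (1) initializes the basis matrix and the weight matrix. Steps (2) and (3) describe how the basis matrix and the weight matrix are sparsified. Step (4) then iterates to obtain a sparse, nonnegative basis matrix and weight matrix. Step (4-a) is executed when the basis matrix is to be sparsified: the basis matrix is first updated as in (4-a-i), a formula obtained by differentiating the loss function, and step (4-a-ii) then sparsifies the basis matrix and checks that it is nonnegative. The detailed procedure of this step ...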
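Steps (2) and (3) control sparsity through the ratio of the L1 norm to the L2 norm. A minimal sketch of that relationship follows, assuming the sparsity ratio is Hoyer's sparseness measure, $(\sqrt{n} - \|x\|_{1}/\|x\|_{2})/(\sqrt{n} - 1)$; the text here does not restate the definition, and the function names `sparseness` and `target_l1` are hypothetical.

```python
import math

def sparseness(x):
    """Hoyer's measure: 0 for a vector with equal entries, 1 for a single nonzero entry."""
    n = len(x)
    l1 = sum(abs(v) for v in x)
    l2 = math.sqrt(sum(v * v for v in x))
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)

def target_l1(n, l2, s):
    """L1 norm a length-n vector with the given L2 norm needs to reach sparseness s."""
    return l2 * (math.sqrt(n) - s * (math.sqrt(n) - 1))

# With the L2 norm held fixed, a higher desired sparseness means a smaller L1 norm.
print(target_l1(4, 1.0, 0.2), target_l1(4, 1.0, 0.8))
```

This is only the bookkeeping side of the projection: it says which L1 norm to aim for, while finding the closest nonnegative vector with those two norms is the job of the projection step itself.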
