…the Pairwise f-scores are all low, because the line segmentation annotated by fans is difficult for the algorithm to analyze. For 《被動》, 《全日愛》, 《突然好想你》, and 《紅豆》, only the line-word-count SSM cannot be analyzed: when the line structures of the verse and the chorus are very similar, the ranges of the discovered Family segments tend to overlap, so the line-word-count structure cannot be combined. Among these, 《被動》 and 《全日愛》 reach a Pairwise f-score of 1.0 on all other features. For 《紅豆》, the pinyin, tone, and part-of-speech SSM analyses all achieve a Pairwise f-score of 0.93, yet the linearly combined SSM drops to 0.74, possibly because the line-word-count feature is unsuitable for this song. For 《千千萬萬個我》 and 《真的》, only the tone SSM cannot be analyzed; in these two songs, even lines within the same verse or chorus diverge too much in content, so the Pairwise f-scores of the lyric-form results from the other features are also not high.
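For reference, the Pairwise f-score used above compares two segmentations through the set of line pairs that fall in the same segment. A minimal sketch (the function names are illustrative, not from the thesis; segmentations are given as one segment id per line):

```python
from itertools import combinations

def same_segment_pairs(segment_ids):
    """All index pairs (i, j), i < j, whose lines share a segment."""
    return {(i, j) for i, j in combinations(range(len(segment_ids)), 2)
            if segment_ids[i] == segment_ids[j]}

def pairwise_f_score(reference, estimated):
    """Harmonic mean of pairwise precision and recall between two segmentations."""
    ref_pairs = same_segment_pairs(reference)
    est_pairs = same_segment_pairs(estimated)
    if not ref_pairs or not est_pairs:
        return 0.0
    hits = len(ref_pairs & est_pairs)
    if hits == 0:
        return 0.0
    precision = hits / len(est_pairs)
    recall = hits / len(ref_pairs)
    return 2 * precision * recall / (precision + recall)
```

A perfect match yields 1.0; moving one line across a segment boundary drops both precision and recall, which matches how the scores above penalize disagreements in line grouping.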

5.4. Lyrics-Melody Matching Experiments

The phrase segmentation of the main melody is annotated manually. Annotation requires listening to the MIDI melody while marking the lines of the original lyrics, which is quite time-consuming, so we annotated phrase segmentations, together with musical form, for only 8 Mandarin pop MIDI main melodies. We then recommend lyrics for these 8 melodies. The lyrics database contains 85 lyrics in total, including the original lyrics of the 8 MIDI melodies being recommended for, so we expect the original lyrics to be ranked as high as possible.

5.4.1. Algorithm Parameter Settings

In matching lyric-line structure to melodic-phrase structure, the weight α in the total cost c_total of a mapping m is set to 0.6. In c_nwr, singlimit is set to 3, i.e., one character may be sung over at most three notes. In c_merge, mergelimit is set to 5, i.e., once the number of merges of either lyric lines or melodic phrases reaches 5 or more, the cost of c_merge is 1.
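One possible reading of the merge penalty is a cost that grows with the merge count and saturates at 1; the text fixes only the saturation point (mergelimit = 5), so the linear ramp below it in this sketch is an assumption:

```python
MERGE_LIMIT = 5  # merges at or beyond this count incur the full cost of 1

def merge_cost(num_merges, merge_limit=MERGE_LIMIT):
    """Merge cost saturating at 1.0 once merge_limit is reached.

    The saturation at merge_limit follows the thesis text; the linear
    ramp below it is an illustrative assumption, not the thesis formula.
    """
    if num_merges <= 0:
        return 0.0
    return min(num_merges / merge_limit, 1.0)
```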

For the Step Type weights between characters and notes, w1 = 0.18, w2 = 0.28, and w3 = 0.54; we set them according to the cost scores on the c_nwr cost curve at t = 1, 2, 3, respectively.
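Since the three weights sum to one, they can be obtained by normalizing the c_nwr cost scores at t = 1, 2, 3; the raw cost values below are made up for illustration, chosen only so that the normalization reproduces the reported proportions 0.18 / 0.28 / 0.54:

```python
def normalize_weights(costs):
    """Scale raw cost scores so the resulting weights sum to 1."""
    total = sum(costs)
    return [c / total for c in costs]

# Hypothetical raw c_nwr cost scores at t = 1, 2, 3 (illustrative values,
# not from the thesis; only the resulting weights match the text).
w1, w2, w3 = normalize_weights([0.9, 1.4, 2.7])
```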


…; among them, 《如果有一天》 has 14 structurally matching lyrics and 《紅豆》 has 18, yet the computation time of 《紅豆》 is 8 seconds less than that of 《如果有一天》. This may be because, among the 14 structurally matching lyrics of 《如果有一天》, the mapping sequences are longer, so more character-to-note correspondences have to be computed, which lengthens the computation time.


Chapter 6. Conclusions and Future Research

Popular music today is essentially formed by pairing music with lyrics. We hope that, by recombining and re-matching lyrics and melodies, effects such as new melodies for old lyrics, new lyrics for old melodies, or old songs sung anew can be achieved. We therefore first perform lyric-form analysis to identify the verse, chorus, and other sections of the lyrics, and then use the lyric-form information for lyrics-melody matching.

In lyric-form analysis, we mainly search for repeated patterns in the lyrics from four perspectives: line word count, pinyin, part of speech, and tonal pitch. We compute line-to-line feature-sequence similarity with a lyric-line structure alignment algorithm and with DTW. This yields an SSM for each of the four features, which are then linearly combined into a single SSM. On a given SSM, we run Instance Path Search for every candidate segment pattern to find the Family formed by that pattern, and then search over pairwise combinations of Families for the best combination. Finally, the segmentation induced by the best Family combination is automatically labeled with section tags. Lyric-form analysis achieves a Pairwise f-score of 0.83 and a Labeling Recover Ratio of 0.78. Lyrics-melody structure matching uses a two-level mapping; experimental results show that the original lyrics of the recommended melodies all rank first.
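The linear combination step above can be sketched as a weighted sum of the per-feature self-similarity matrices; the weight values and matrix entries below are illustrative, not those used in the thesis:

```python
def combine_ssms(ssms, weights):
    """Linearly combine per-feature SSMs into a single SSM.

    ssms: equally-sized square similarity matrices, one per feature
    (word count, pinyin, part of speech, tone); weights should sum to 1.
    """
    n = len(ssms[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, ssms))
             for j in range(n)]
            for i in range(n)]

# Two illustrative 2x2 feature SSMs combined with equal weights.
combined = combine_ssms(
    [[[1.0, 0.2], [0.2, 1.0]],
     [[1.0, 0.6], [0.6, 1.0]]],
    [0.5, 0.5])
```

In the thesis the weights are tuned per feature; with four features the call would simply pass four matrices and four weights.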

The SSMs of the four features may require different weights for lyrics of different eras; for example, lyrics from the folk-song era differ greatly in form from today's lyrics, which could be explored further in the future. We also hope to apply the lyric-form analysis method to musical-form analysis, but the SSM formed from music would be very large and the number of candidate segment patterns too high, making Family combination very time-consuming, so an efficient algorithm would need to be designed. For lyrics-melody structure matching, many-to-many correspondences could be added at the line-to-phrase mapping stage; moreover, pop songs mixing Chinese and English are not currently handled, and the characteristics of English could also be taken into account in the future.
