Conclusions and Future Work - An Emotion-Based Automatic Music Selection System

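The first section of this chapter summarizes the research methods and experimental results of this thesis. The second section discusses suggested improvements to the experimental methods and possible applications of the system.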

5.1 Conclusions

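This thesis applies content-based music information retrieval techniques to build a search engine for musical content. The goal is to find music information retrieval algorithms whose results agree with listeners' general perception and emotional response. Feature extraction algorithms are designed on the basis of music theory to obtain numerical features of the data, and similarity measurement algorithms are used to find similar content in the music data. Four feature extraction algorithms are tested. The closeness of two items is computed with both local and global similarity measures, and four similarity measurement algorithms are tested. Finally, human-labeled data are used to test whether the retrieval results of each algorithm agree with the general perception of musical content and with emotional response.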

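In the music genre test, the highest retrieval performance was 94.17%, obtained with the constant-Q transform (CQT) combined with interval features and a cosine-distance similarity measure, applied to long-duration music segments. In the emotion test, the best-performing algorithm combined Mel-frequency cepstral coefficients (MFCC) with spectral features and a cosine-distance similarity measure, again on long-duration segments, and reached an average precision of 98.75%. In terms of computation time, feature extraction is the most time-consuming step in the whole system.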
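The retrieval and scoring steps mentioned above can be illustrated with a minimal sketch. It assumes that each track is already represented by a fixed-length feature vector and that precision is measured over the top-k retrieved items. The function names, the toy vectors, and the choice of k are hypothetical and are not taken from the thesis.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); 0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumption: an all-zero vector is treated as dissimilar
    return 1.0 - dot / (norm_u * norm_v)

def rank_by_distance(query, library):
    """Sort (label, vector) entries from nearest to farthest from the query."""
    return sorted(library, key=lambda item: cosine_distance(query, item[1]))

def precision_at_k(ranked_labels, query_label, k):
    """Fraction of the top-k retrieved items whose label matches the query."""
    top = ranked_labels[:k]
    if not top:
        return 0.0
    return sum(1 for label in top if label == query_label) / len(top)

# Hypothetical toy library: (human-assigned label, feature vector)
library = [
    ("jazz", [0.9, 0.1, 0.0]),
    ("pop/rock", [0.1, 0.8, 0.3]),
    ("jazz", [0.8, 0.2, 0.1]),
    ("pop/rock", [0.0, 0.9, 0.4]),
]
query = [1.0, 0.1, 0.0]  # hypothetical query vector labeled "jazz"

ranked = rank_by_distance(query, library)
labels = [label for label, _ in ranked]
print(precision_at_k(labels, "jazz", 2))  # 1.0
```

Averaging this score over many queries gives one possible summary of retrieval performance. Whether the thesis computes its precision figure in exactly this way is not stated in this chapter.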

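The experimental results also lead to the following conclusions. In the genre test, feature extraction algorithms based on the absolute magnitude of the values within individual frames achieve higher retrieval performance. In the emotion test, algorithms based on the change in values between adjacent frames achieve higher retrieval performance. In the test that requires both the same genre and a continuous change in emotion, the four feature extraction algorithms perform about equally. In the similarity experiments, the choice of similarity measurement algorithm makes little difference to retrieval performance. The duration of the analyzed music segment also has little effect on retrieval performance, while the feature extraction algorithm has the largest effect.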
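The distinction between the two feature designs can be made concrete with a small sketch. It assumes the audio has already been split into frames and that each frame is represented by a magnitude spectrum. The Euclidean norm and the Euclidean frame-to-frame difference are illustrative choices, not the thesis's exact feature definitions, and the data are hypothetical.

```python
import math

def frame_magnitudes(frames):
    """Per-frame feature: Euclidean norm of each frame's spectrum."""
    return [math.sqrt(sum(x * x for x in frame)) for frame in frames]

def frame_flux(frames):
    """Inter-frame feature: Euclidean distance between adjacent spectra."""
    return [
        math.sqrt(sum((b - a) ** 2 for a, b in zip(prev, curr)))
        for prev, curr in zip(frames, frames[1:])
    ]

# Hypothetical magnitude spectra: four frames, three frequency bins each
frames = [
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0],
]

print(frame_magnitudes(frames))  # [1.0, 1.0, 2.0, 2.0]
print(frame_flux(frames))        # zero except at the change between frames 2 and 3
```

The first feature describes what each frame contains, while the second responds only when the content changes. N frames yield N magnitude values but N - 1 flux values.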

5.2 Future Work

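Based on the inferences drawn from the experimental results, the following directions for future research and application are suggested, discussed for each step of the system.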

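A. Feature extraction: combine multiple features so that the similarity judgment can satisfy several conditions at once and improve retrieval performance. The feature extraction algorithms can also be adapted to different kinds of musical content so that pitch or rhythm is estimated more accurately and the influence of noise is reduced. Finally, computation speed can be improved.

B. Similarity measurement: improving the local and global similarity algorithms may raise retrieval performance.

C. Human-computer interaction: because there is no uniform standard for emotional response, a user feedback mechanism is recommended so that the system has more complete information for judging whether the musical content matches the user's perception. In addition, metadata could be entered together with the query song, giving the system more data for judging song similarity and improving retrieval performance.

D. Future applications: the platform could be ported to mobile phones and MP3 players. The function could also be added to music player software on computers or used for retrieval in online digital music databases.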
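As one possible reading of suggestion A, several feature vectors can be fused into a single descriptor before similarity is computed. The sketch below normalizes each vector to unit length so that no feature dominates purely because of its scale, then concatenates them with per-feature weights. The normalization scheme, the weights, and the example vectors are assumptions made for illustration and were not evaluated in this thesis.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0.0 else list(vec)

def fuse_features(feature_sets, weights):
    """Concatenate L2-normalized feature vectors, each scaled by its weight."""
    if len(feature_sets) != len(weights):
        raise ValueError("one weight per feature set is required")
    fused = []
    for vec, weight in zip(feature_sets, weights):
        fused.extend(weight * x for x in l2_normalize(vec))
    return fused

# Hypothetical per-track descriptors: one pitch-related, one rhythm-related
pitch = [3.0, 4.0]
rhythm = [0.0, 10.0, 0.0]

fused = fuse_features([pitch, rhythm], weights=[0.5, 0.5])
print(len(fused))  # 5
```

The fused vector can be passed to any vector-based similarity measure. The weights could later be tuned, for example from the user feedback proposed in suggestion C.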


Appendix: English-Chinese Reference Table of Adjectives for Music Emotion Labeling

Adjective List

English | Chinese | English | Chinese

# | Artist | Filename | Genre | Adjective | Emotion
1 | 盧廣仲 | 100 種生活.mp3 | pop/rock | leisurely | 4
3 | Stars | Ageless Beauty.mp3 | electronic | vigorous | 8
4 | 王若琳 | As Love Begins To Mend.mp3 | pop/rock | yielding | 3
5 | Doug Munro, Mariano | | | |
10 | Charles Mingus | Boogie Stop Shuffle [Unedited Form].mp3 | jazz | quaint | 5
11 | Beirut | Brandenburg.mp3 | country | gloomy | 2
12 | The Red Hot Chili Peppers | Buckle Down.mp3 | pop/rock | robust | 8
13 | Selfkill | Cake on Body.mp3 | pop/rock | plaintive | 3
14 | 順子 | Can't Get Enough.mp3 | pop/rock | passionate | 7
15 | Kings of Convenience | Cayman Islands.mp3 | pop/rock | serene | 4
16 | The Eagles | Chug All Night.mp3 | pop/rock | exalting | 8
 | Underground | Femme Fatale.mp3 | pop/rock | yearning | 3
27 | Herbie Hancock | Firewater.mp3 | jazz | leisurely | 4
28 | Tizzy Bac | For the Way I Live.mp3 | pop/rock | doleful | 2
33 | DEPAPEPE | Hachiroku.mp3 | easy listening | light | 5
34 | Sarah Vaughan | He's My Guy.mp3 | jazz | tender | 3
44 | Alison Krauss | It Wouldn't Have Made Any Difference.mp3 | country | yielding | 3
45 | Zion Lockwood | Jazzy June.mp3 | electronic | quaint | 5
48 | The Eagles | Life In The Fast Lane.mp3 | pop/rock | agitated | 7
56 | Stereophonics | Madame Helga.mp3 | pop/rock | impetuous | 7
57 | Kings of Convenience | Me in You.mp3 | pop/rock | serene | 4
58 | K 歌情人 (movie soundtrack) | Meaningless Kiss.mp3 | pop/rock | sentimental | 3
59 | Dave Weckl | | | |
66 | Cannonball Adderley | Mystified (aka Angel Face).mp3 | jazz | light | 5
67 | Piano Magic | Night Of The Hunter.mp3 | pop/rock | melancholy | 2
68 | Sonny Rollins | Nishi.mp3 | jazz | graceful | 5
69 | DEPAPEPE | Old Beach.mp3 | easy listening | leisurely | 4
70 | The Triplets of Belleville (movie soundtrack) | Opening Theme.mp3 | stage & screen | light | 5
71 | Frente! | Ordinary Angels.mp3 | pop/rock | bright | 6
72 | The Eagles | Out Of Control.mp3 | pop/rock | impetuous | 7
73 | João Gilberto | Para Machuchar Meu Coracao (To Hurt My Heart).mp3 | jazz | satisfying | 4
74 | Corinne Bailey Rae | Put Your Records On.mp3 | r&b | joyous | 6
82 | Cannonball Adderley | Somethin' Else.mp3 | jazz | exciting | 7
83 | 胡德夫 | Standing on my land.mp3 | country | solemn | 1
84 | The Who | Substitute.mp3 | stage & screen | vigorous | 8
85 | Radiohead | Subterranean Homesick Alien.mp3 | pop/rock | heavy | 2
86 | 9 Lazy 9 | Sunday Monday.mp3 | electronic | sprightly | 5
87 | Dave Brubeck Quartet | Take Five.mp3 | jazz | delicate | 5
88 | Bebel Gilberto | Tanto Tempo [Peter Kruder Remix].mp3 | latin | fanciful | 5
89 | Weather Report | Teen Town.mp3 | jazz | exhilarated | 7
105 | Metropolitan Jazz Affair | Yunowhathislifeez [Motorcity Mix].mp3 | jazz | whimsical | 5
106 | Peter Broderick | a glacier.mp3 | ambient | melancholy | 2
137 | 陳建年 | 孩子與妳我的天堂.mp3 | blues | tender | 3

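# | Feature extraction | Similarity | Genre (long duration) | Genre (short duration) | Emotion (long duration) | Emotion (short duration) | Genre & Emotion (long duration) | Genre & Emotion (short duration)
14 | FFT FLUX | CD | 56.25 | 69.58 | 51.67 | 57.08 | 53.96 | 63.33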
