
Chapter 5 Conclusion and Future Work

5.2   Future Work

Since CAMSHIFT relies on color distribution alone, disturbances to color (colored lighting, dim illumination, overly bright illumination, etc.) introduce errors into the tracking procedure. More sophisticated trackers combine multiple cues, such as feature tracking and motion analysis, to compensate for this, but the added complexity would undermine the original design criterion of CAMSHIFT as a lightweight, real-time tracker.
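The sketch below (not code from this thesis; it assumes OpenCV's Python bindings and a hypothetical initial window) illustrates this dependence on color: the tracker follows nothing but the hue histogram built from the initial region, so any lighting change that distorts that histogram degrades the search.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # default camera; the video source is an assumption

# Hypothetical initial search window (x, y, w, h) around the tracked hand/baton.
track_window = (200, 150, 80, 80)

# Build a hue histogram of the initial region; CAMSHIFT tracks this color
# distribution and nothing else, which is why lighting shifts derail it.
ok, frame = cap.read()
x, y, w, h = track_window
hsv_roi = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Stop after 10 iterations or when the window moves less than 1 pixel.
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back-projection: per-pixel likelihood of belonging to the tracked colors.
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    # CAMSHIFT shifts and resizes the window toward the mode of back_proj.
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, term_crit)
    pts = np.int32(cv2.boxPoints(rot_rect))
    cv2.polylines(frame, [pts], True, (0, 255, 0), 2)
    cv2.imshow("CAMSHIFT", frame)
    if cv2.waitKey(30) & 0xFF == 27:  # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
```

Other possible improvements include: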

• Improve tempo following: the current system cannot react to the complex and subtle movements of a professional conductor; it responds only to direction changes. We could replace our beat detection and analysis module with a more sophisticated gesture recognition algorithm, so that the module can adapt to users with different levels of conducting skill. A minimal sketch of the simple direction-change rule such an algorithm would replace appears after this list.


• Include a time-stretching algorithm: time stretching is the process of changing the speed or duration of an audio signal without affecting its pitch. While our system already adjusts the music playback speed according to beat events, pitch-preserving time stretching would help users better perceive the conducting speed they actually performed; a minimal sketch also follows this list.

• New application areas: our framework can be applied to several other areas that connect vision-based interfaces with multimedia. We can estimate not only the timing accuracy of beat events in conducting gestures, but also the movements of a dancer. Building on the previous items, we could design a system for a dancer whose routine is no longer constrained by the tempo of a recording, so that the music reacts spontaneously to his/her movements.
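For the tempo-following item above, here is the referenced sketch. It is our own illustration under assumed names, not the thesis's actual module: a beat fires at the bottom turning point of the vertical hand trajectory, i.e. where downward motion reverses to upward.

```python
def detect_beats(trajectory):
    """Return beat timestamps where vertical motion reverses from downward
    to upward (the bottom turning point of a conducting stroke).
    `trajectory` is a list of (t, y) samples, with y growing downward
    as in image coordinates."""
    beats = []
    prev_dy = 0.0
    for (t0, y0), (t1, y1) in zip(trajectory, trajectory[1:]):
        dy = y1 - y0
        if prev_dy > 0 and dy < 0:  # down-to-up reversal marks the beat
            beats.append(t0)
        if dy != 0:
            prev_dy = dy
    return beats

# Example: the hand rises after t = 0.2, so one beat is reported there.
print(detect_beats([(0.0, 10), (0.1, 30), (0.2, 50), (0.3, 40)]))  # [0.2]
```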
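For the time-stretching item, a minimal sketch assuming the librosa library (whose time_stretch is a phase-vocoder implementation) and a hypothetical file name:

```python
import librosa
import soundfile as sf

# Hypothetical input file; sr=None keeps the original sampling rate.
y, sr = librosa.load("recording.wav", sr=None)

# rate < 1 slows playback (longer duration), rate > 1 speeds it up;
# in both cases the pitch is left unchanged.
y_slow = librosa.effects.time_stretch(y, rate=0.8)
sf.write("recording_slow.wav", y_slow, sr)
```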

In conclusion, since the system we propose is a framework that combines the video and audio processing domains, applications of this technology can help us explore new territory in interfaces with music and other multimedia. We could build an "interactive karaoke" system in which a user sings a song along to a recording while the recording adjusts to the user's tempo; a brief sketch of the tempo-mapping step follows. Other applications can be implemented on the same principles, including conductor training, live performance, and music synthesis control. We hope that the flexible and interchangeable modules will make further research easier.
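As a sketch of the tempo-mapping step mentioned above (all names hypothetical, not taken from this thesis): the ratio between the score's nominal beat period and the user's latest inter-beat interval gives the rate at which the recording should be time-stretched.

```python
def playback_rate(user_beat_times, score_beat_period):
    """Map the user's latest inter-beat interval to a time-stretch rate:
    rate > 1.0 plays the recording faster, rate < 1.0 plays it slower."""
    if len(user_beat_times) < 2:
        return 1.0  # no tempo estimate yet; keep the original speed
    user_period = user_beat_times[-1] - user_beat_times[-2]
    return score_beat_period / user_period

# A score beat every 0.5 s, but the user's beats arrive 0.6 s apart:
# play at 0.5 / 0.6 ~ 0.83x speed until the next beat event.
print(playback_rate([0.0, 0.6], 0.5))
```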


