Database reduction - THE PROPOSED METHOD - Online Handwritten Music Score Recognition System

CHAPTER 3 THE PROPOSED METHOD

3.5 Database reduction

Table 3.6 List of the supported modification operations.

Delete a note: Drag the note to the deleting area.
Delete a dot of a note: Drag the dot of the note to the deleting area.
Delete a flag of a note: Drag the flag of the note to the deleting area.
Delete a head of a note: Drag the head of the note to the deleting area.
Delete several notes: Draw a line across the notes to be deleted, then drag it to the deleting area.

The database consists of a large number of templates. Comparing the features of
an input stroke with every template is very inefficient. If we could reduce the
number of templates, the recognition time would be reduced as well. The problem
is how to obtain the representative templates and discard the others. Clustering
methods such as K-Means have been proposed to solve this problem. However, we do
not know the distribution of the templates and therefore cannot specify the
number of initial seeds in advance. So we apply MBSAS (Modified Basic Sequential
Algorithmic Scheme) [14] to cluster the templates, because it determines the
number of clusters automatically.

In our databases, the shape feature and the direction feature are trained
separately, and the training process is applied to every symbol separately. As a
result, we obtain a set of representative templates for each symbol.
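As a rough illustration of how a sequential scheme can reduce a template set without fixing the number of clusters in advance, the following Python sketch follows the two-pass structure of MBSAS as described in [14]: a first pass that only creates clusters, and a second pass that assigns the remaining vectors. It is a minimal sketch, not the implementation used in this work. The Euclidean distance, the threshold `theta`, the cap `max_clusters`, and the choice of the member closest to each cluster mean as the representative template are all assumptions made for illustration, and the sample vectors are toy data.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(vector, reps):
    """Return (distance, index) of the closest cluster representative."""
    return min((euclidean(vector, rep), k) for k, rep in enumerate(reps))


def mbsas(vectors, theta, max_clusters):
    """Two-pass sequential clustering; returns (clusters, representatives).

    clusters is a list of lists of indices into `vectors`.
    """
    # Pass 1 (cluster determination): a vector opens a new cluster when it
    # is farther than theta from every existing cluster representative.
    clusters = [[0]]
    reps = [list(vectors[0])]
    seeds = {0}
    for i in range(1, len(vectors)):
        dist, _ = nearest(vectors[i], reps)
        if dist > theta and len(clusters) < max_clusters:
            clusters.append([i])
            reps.append(list(vectors[i]))
            seeds.add(i)

    # Pass 2 (assignment): every remaining vector joins its nearest cluster,
    # and that cluster's representative is updated to the member mean.
    for i, vec in enumerate(vectors):
        if i in seeds:
            continue
        _, k = nearest(vec, reps)
        clusters[k].append(i)
        members = [vectors[j] for j in clusters[k]]
        reps[k] = [sum(col) / len(members) for col in zip(*members)]
    return clusters, reps


def representative_templates(vectors, clusters, reps):
    """Assumed selection rule: keep the member closest to each cluster mean."""
    return [min(members, key=lambda j: euclidean(vectors[j], rep))
            for members, rep in zip(clusters, reps)]


# Toy feature vectors forming two well-separated groups.
templates = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0),
             (5.2, 5.0), (0.0, 0.2), (5.0, 5.2)]
clusters, reps = mbsas(templates, theta=1.0, max_clusters=10)
kept = representative_templates(templates, clusters, reps)
print(clusters, kept)
```

In this sketch the number of clusters follows from `theta` rather than from a preset seed count, which is the property that motivates using MBSAS instead of K-Means here. Running it once per symbol and per feature type would mirror the separate training described above.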

CHAPTER 4 EXPERIMENTAL RESULTS

Experiments were conducted to evaluate the performance of the proposed method.
A total of 13801 strokes, collected from 14 distinct writers, were used to test
our algorithm: 6509 strokes were taken as the training data, and the remaining
7292 strokes were the testing data. Every stroke in the testing data is examined
by symbol recognition, which outputs the symbol most similar to the stroke. The
experiments were run on a notebook (Intel T2300 CPU, 1.66 GHz, only a single
core used; 1 GB memory) with a digital tablet.

In order to measure the performance, we define the “precision” as follows:

$$\text{Precision} = \frac{\text{Correct}}{\text{Correct} + \text{Incorrect}} \tag{7}$$

The precision for each symbol is shown in Table 4.1. The average precision over
all symbols is 98.35% for our method, which is better than the 97.54% of
Miyao-Maruyama’s method [11].
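The precision defined in Eq. (7) can be computed per symbol with a short routine such as the Python sketch below. It assumes that the counts are grouped by the true symbol of each test stroke, which the text does not state explicitly, and the labelled pairs are toy data, not results from these experiments.

```python
from collections import Counter


def precision_per_symbol(pairs):
    """pairs: iterable of (true_symbol, recognized_symbol) for test strokes.

    Returns {symbol: Correct / (Correct + Incorrect)}, grouped by the true
    symbol (an assumption about how Eq. (7) is applied per symbol).
    """
    correct = Counter()
    total = Counter()
    for truth, recognized in pairs:
        total[truth] += 1
        if truth == recognized:
            correct[truth] += 1
    return {symbol: correct[symbol] / total[symbol] for symbol in total}


# Toy example: one WHead stroke is misrecognized as BHead.
toy_pairs = [("WHead", "WHead"), ("WHead", "BHead"),
             ("BHead", "BHead"), ("BHead", "BHead")]
print(precision_per_symbol(toy_pairs))
```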

From the misclassified strokes, we find that misclassification arises because
some users have no domain knowledge of music theory and are not familiar with
writing music notation. They sometimes ignore the details that distinguish
symbols, such as the curvature or the corners of a stroke, which makes some
strokes ambiguous to recognize. For example, if the user ignores the curvature
between the slash and the circle in BHead, the stroke is easily recognized [...]

Table 4.1 Precision of each symbol.

Symbol name    Our method (%)

For the misclassified strokes, we provide semantic correction to fix the
mistakes. Two rules are defined at the note level of notation recognition.
First, when a WHead is misclassified as a BHead and is combined with a half
note, the system converts the BHead to a WHead and performs the combination.
Second, when a BHead is misclassified as a WHead and is combined with a note
with a filled head, the system converts the WHead to a BHead and performs the
combination. With semantic correction, the precisions of WHead and BHead rise to
99.48% and 99.38%, respectively.
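The two note-level rules can be expressed as a small lookup, sketched in Python below. The function name `semantic_correction` and the note-kind labels `"half_note"` and `"filled_head_note"` are hypothetical names introduced for illustration; only the symbol names WHead and BHead come from the text.

```python
# Head symbol that each kind of note requires (labels are hypothetical).
REQUIRED_HEAD = {
    "half_note": "WHead",         # a half note needs a white (hollow) head
    "filled_head_note": "BHead",  # a note with a filled head needs a black head
}


def semantic_correction(head, note_kind):
    """Return the head symbol to use when combining `head` into `note_kind`.

    If the recognized head contradicts the note it is combined with, it is
    converted to the required head; otherwise it is returned unchanged.
    """
    required = REQUIRED_HEAD.get(note_kind)
    if required is not None and head in ("WHead", "BHead") and head != required:
        return required
    return head


print(semantic_correction("BHead", "half_note"))         # rule 1
print(semantic_correction("WHead", "filled_head_note"))  # rule 2
```

Because the correction only fires when the head and the note kind disagree, correctly recognized heads pass through untouched.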

Processing the 7292 testing strokes takes about 157.38 seconds in total, so the
average processing time is about 0.0216 seconds per stroke. This is faster than
Miyao-Maruyama’s method, which takes 0.0731 seconds per stroke on a PC (Pentium
4 CPU, 1.8 GHz; 512 MB memory). A user therefore spends less time waiting while
writing. Furthermore, our method is better suited to handheld touch-screen
devices, which have low computing power, so that the user could compose a music
score anywhere.
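The figures quoted above can be checked with a few lines of Python. The sketch uses only numbers stated in this chapter; the ratio at the end is a rough indication only, since the two methods were timed on different machines.

```python
# Data split reported for the experiments.
training_strokes = 6509
testing_strokes = 7292
assert training_strokes + testing_strokes == 13801

# Average processing time per testing stroke.
total_seconds = 157.38
per_stroke = total_seconds / testing_strokes
print(f"{per_stroke:.4f} seconds per stroke")

# Ratio to the reported per-stroke time of Miyao-Maruyama's method [11];
# not a like-for-like comparison because the hardware differs.
baseline_per_stroke = 0.0731
ratio = baseline_per_stroke / per_stroke
print(f"about {ratio:.1f} times the per-stroke time of our method")
```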

CHAPTER 5 CONCLUSION

This study proposed a method for recognizing music scores from the properties of
strokes. A stroke is recognized as a symbol, and the symbol is combined with
other symbols to form a music notation. First, preprocessing is applied to
eliminate distortion in the stroke. Next, the simple symbol classifier tries to
recognize the stroke as a simple symbol. If it cannot, three feature extraction
methods are performed on the stroke and the complex symbol classifier is
applied: a decision tree with three classifiers recognizes the stroke as a
complex symbol. Finally, the output symbol is combined with nearby symbols by
rules at three levels to produce a music notation. Both the recognition rate and
the recognition speed of our method are better than those of the existing
method.
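The control flow summarized above can be sketched in Python as follows. This is only an outline of the order of the stages under stated assumptions: every function name and the toy stand-ins are hypothetical, the real classifiers and feature extractors are those described in Chapter 3, and the final three-level combination step is left out.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Stroke = Sequence[Point]


def recognize_stroke(
    stroke: Stroke,
    preprocess: Callable[[Stroke], Stroke],
    simple_classifier: Callable[[Stroke], Optional[str]],
    extractors: List[Callable[[Stroke], object]],
    complex_classifier: Callable[[List[object]], str],
) -> str:
    """Return the symbol recognized for a single stroke."""
    cleaned = preprocess(stroke)            # remove distortion
    symbol = simple_classifier(cleaned)     # try the simple symbols first
    if symbol is not None:
        return symbol
    # Otherwise extract the features and use the complex symbol classifier.
    features = [extract(cleaned) for extract in extractors]
    return complex_classifier(features)


# Toy stand-ins so that the sketch runs; they do not reflect the real stages.
def toy_preprocess(stroke: Stroke) -> Stroke:
    return list(stroke)


def toy_simple_classifier(stroke: Stroke) -> Optional[str]:
    return "simple" if len(stroke) <= 2 else None


def toy_complex_classifier(features: List[object]) -> str:
    return "complex"


toy_extractors = [len, lambda s: s[0], lambda s: s[-1]]

short = [(0.0, 0.0), (1.0, 1.0)]
longer = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
for s in (short, longer):
    print(recognize_stroke(s, toy_preprocess, toy_simple_classifier,
                           toy_extractors, toy_complex_classifier))
```

Passing the stages in as callables keeps the sketch independent of any particular classifier, and it shows why simple symbols are cheap to recognize: the feature extraction is skipped entirely for them.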

This system is robust enough for general use. It provides all the common music
notations and easy-to-use modification operations. Furthermore, a score playing
function lets users listen to the melody while they are editing the music score.
Users are able to compose a complete music score with this system.

Future work includes the following:

• Multi-stroke input: in the proposed system, a single stroke is taken as the
input at a time. If the system supported multiple strokes in a single input,
the user would spend less time waiting for recognition.

• More symbols supported: the symbols are related to the writing styles of a
notation. The system currently supports 17 kinds of symbols. Supporting more
symbols means more ways to write a notation.

• More uncommonly used notations supported: the commonly used music notations
are supported, but some notations used only for specific purposes, such as
the C clef, are not included in the system. In the future, these notations
would be added so that the system is suitable for professional use.

• Semantic hints: when the music score violates music theory, the system would
show hints to the user. This is very useful for users who are not familiar
with music theory.

REFERENCES

[1] J. Anstice, T. Bell, A. Cockburn and M. Setchell, “The Design of a Pen-Based

Musical Input System,” In Proceedings of the 6th Australian Conference on

Computer-Human Interaction (OZCHI 1996), Hamilton, New Zealand, pp.

260-267, Nov. 1996.

[2] MagicScore Maestro software, DG software. (http://www.dgalaxy.net/)

[3] Allegro, Finale software. (http://www.finalemusic.com/)

[4] S. E. George, “Online Pen-Based Recognition of Music Notation with Artificial

Neural Networks,” Computer Music Journal, vol. 27, no. 2, pp. 70-79, Jun. 2003.

[5] A. Forsberg, M. Dieterich, and R. Zeleznik, “The music notepad,” In Proceedings

of the 11th annual ACM symposium on User interface software and technology,

San Francisco, CA, USA, pp. 203-210, Nov. 1998.

[6] Calligrapher, ParaGraph International, Inc. (http://www.paragraph.com/)

[7] D. Rubine, “Specifying Gestures by Example,” In Proceedings of ACM

SIGGRAPH ’91, New York, USA, pp. 329-337, Jul. 1991.

[8] E. Ng, T. Bell and A. Cockburn, “Improvements to a Pen-Based Musical Input

System,” OzCHI’98: The Australian Conference on Computer-Human

Interaction, Adelaide, South Australia, pp. 178-185, Dec. 1998.

[9] [...] thesis, Brown University, 2005.

[10] S. Macé, E. Anquetil and B. Coüasnon, “A generic method to design pen-based

systems for structured document composition: Development of a musical score

editor,” In Proceedings of the 1st Workshop on Improving and Assessing

Pen-Based Input Techniques, Edinburgh, Scotland, pp. 15-22, Sep. 2005.

[11] H. Miyao and M. Maruyama, “An Online Handwritten Music Score Recognition

System,” In Proceedings of the 17th International Conference on Pattern

Recognition (ICPR 2004), Cambridge, United Kingdom, pp. 461-464, Aug.

2004.

[12] S. Connell and A. K. Jain, “Template-based Online Character Recognition,”

Pattern Recognition, vol. 34, no. 1, pp. 1-13, 2001.

[13] X. Li and N. S. Hall, “Corner detection and shape classification of on-line

handprinted Kanji strokes,” Pattern Recognition, vol. 26, no. 9, pp. 1315-1334, 1993.

[14] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, 2006.
