CHAPTER 3 THE PROPOSED METHOD
3.5 Database reduction
Table 3.6 List of the supported modification operations.
Delete a note: Drag the note to the deleting area.
Delete a dot of a note: Drag the dot of the note to the deleting area.
Delete a flag of a note: Drag the flag of the note to the deleting area.
Delete a head of a note: Drag the head of the note to the deleting area.
Delete several notes: Draw a line across the notes to be deleted and drag it to the deleting area.
The database consists of a large number of templates, and comparing the feature
with every template is very inefficient. If we could reduce the number of templates,
the matching time would be reduced. The problem is how to obtain the representative
templates and discard the others. Clustering methods, such as K-Means, have been
proposed to solve this problem. However, we do not know the distribution of the
templates and therefore cannot specify the number of initial seeds in advance. So, we
apply MBSAS (Modified Basic Sequential Algorithmic Scheme) [14] to cluster the
templates, since it determines the number of clusters automatically.
In our databases, we train the shape feature and the direction feature separately.
The training process is applied to every symbol separately. Finally, for each symbol,
we obtain the representative templates.
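The clustering step can be illustrated with a minimal sketch of a two-pass MBSAS in the style of [14]. It is not the implementation used in this work: the Euclidean distance, the use of cluster means as representative templates, and the names `mbsas`, `theta`, and `max_clusters` are all assumptions made for illustration, and the threshold values used for the real databases are not given here.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mbsas(vectors, theta, max_clusters, dist=euclidean):
    """Two-pass sequential clustering sketch.

    theta        -- dissimilarity threshold for opening a new cluster (assumed parameter)
    max_clusters -- upper bound on the number of clusters (assumed parameter)
    Returns (means, members): one mean vector and one member list per cluster.
    """
    if not vectors:
        return [], []

    # Pass 1: decide which vectors open a cluster. A vector opens a new
    # cluster when it is farther than theta from every existing seed.
    seeds = [0]
    for i in range(1, len(vectors)):
        if len(seeds) >= max_clusters:
            break
        if min(dist(vectors[i], vectors[s]) for s in seeds) > theta:
            seeds.append(i)

    # Pass 2: assign every remaining vector to its nearest cluster and
    # update that cluster's mean incrementally.
    means = [list(vectors[s]) for s in seeds]
    members = [[vectors[s]] for s in seeds]
    seed_set = set(seeds)
    for i, v in enumerate(vectors):
        if i in seed_set:
            continue
        k = min(range(len(means)), key=lambda j: dist(v, means[j]))
        members[k].append(v)
        n = len(members[k])
        means[k] = [m + (x - m) / n for m, x in zip(means[k], v)]
    return means, members
```

Under these assumptions, the function would be called once per symbol and per feature type, and the returned means would serve as the reduced template set. The number of clusters is not supplied by the caller; it follows from the threshold.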
CHAPTER 4 EXPERIMENT RESULT
Experiments are conducted to evaluate the performance of the proposed method.
13801 strokes, collected from 14 distinct writers, are used to test our algorithm. 6509
of the 13801 strokes are taken as the training data, and the remaining 7292 strokes are
the testing data. Every stroke in the testing data is examined by symbol recognition,
and the symbol most similar to the stroke is taken as the output. In our experiments, a
notebook (Intel T2300 CPU; only a single CPU used; 1.66GHz; 1GB memory) and a
digital tablet are used.
In order to measure the performance, we define the “precision” as follows:
\[
\text{Precision} = \frac{\text{Correct}}{\text{Correct} + \text{Incorrect}} \tag{7}
\]
The precision for each symbol is shown in Table 4.1. The average precision over
the symbols for our method is 98.35%, which is higher than the 97.54% of
Miyao-Maruyama’s method [11].
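As an illustration of how the precision in (7) could be tallied per symbol, the following sketch counts correct and incorrect recognitions for each symbol. It assumes that the strokes are grouped by their true symbol; the function name `precision_per_symbol` and the input format are hypothetical.

```python
from collections import defaultdict


def precision_per_symbol(results):
    """Compute Correct / (Correct + Incorrect) in percent for each symbol.

    results -- iterable of (true_symbol, recognized_symbol) pairs
               (hypothetical input format, grouped here by true symbol).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, recognized in results:
        total[truth] += 1  # Correct + Incorrect
        if recognized == truth:
            correct[truth] += 1  # Correct
    return {symbol: 100.0 * correct[symbol] / total[symbol] for symbol in total}
```

For example, two WHead strokes of which one is recognized as BHead would give a precision of 50% for WHead.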
From the misclassified strokes, we find that the misclassification is due to the fact
that some users have no domain knowledge of music theory and are not familiar with
writing music notations. They sometimes ignore the details that distinguish symbols,
such as the curvature or the corners in a stroke, which makes some strokes ambiguous
to recognize. For example, if the user ignores the curvature between the slash and the
circle in BHead, the stroke is easily misrecognized.
Table 4.1 Precision of each symbol.
Symbol name    Our method (%)
For the misclassified strokes, we provide semantic correction to fix the
mistakes. Two rules are defined at the note level of notation recognition. First,
when a WHead is misclassified as a BHead and is to be combined with a Half note, the
system converts the BHead to a WHead and performs the combination. Second, when a
BHead is misclassified as a WHead and is to be combined with a Note with filled head,
the system converts the WHead to a BHead and performs the combination. With the
semantic correction, the precisions of WHead and BHead rise to 99.48% and 99.38%,
respectively.
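The two rules can be expressed as a small lookup, sketched below. The symbol names are taken from the text, but representing symbols as strings, and the names `EXPECTED_HEAD` and `semantic_correction`, are assumptions for illustration only.

```python
# Head type required by each combination partner (names from the text;
# the string representation is an assumption of this sketch).
EXPECTED_HEAD = {
    "Half note": "WHead",
    "Note with filled head": "BHead",
}


def semantic_correction(head, partner):
    """Return the head symbol to use when `head` is combined with `partner`.

    If a recognized WHead/BHead contradicts what the partner requires,
    the head is converted; otherwise it is returned unchanged.
    """
    if head in ("WHead", "BHead") and partner in EXPECTED_HEAD:
        return EXPECTED_HEAD[partner]
    return head
```

For instance, `semantic_correction("BHead", "Half note")` returns `"WHead"`, which corresponds to the first rule.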
The total time for processing the 7292 testing strokes is about 157.38 seconds. Thus,
the average processing time is about 0.0216 seconds per stroke. This is faster than
Miyao-Maruyama’s method, which takes 0.0731 seconds per stroke on a PC (Pentium
4 CPU; 1.8GHz; 512MB memory). Thus, the user spends less time waiting while writing.
Furthermore, our method is more suitable for migration to handheld devices with
touch screens, which have low computing power, so that the user could compose a
music score anywhere.
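The per-stroke figure follows directly from the totals reported above. The ratio between the two methods is shown only as a rough indication, since the two timings were measured on different machines:

\[
\frac{157.38\ \text{s}}{7292\ \text{strokes}} \approx 0.0216\ \text{s per stroke},
\qquad
\frac{0.0731}{0.0216} \approx 3.4
\]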
CHAPTER 5 CONCLUSION
This study proposed a method for recognizing music scores from the properties of
strokes. A stroke is recognized as a symbol, and the symbol is combined with other
symbols to form a music notation. First, preprocessing is applied to eliminate
distortion in the stroke. Next, the simple symbol classifier tries to recognize the
stroke as a simple symbol. If it cannot, three feature extraction methods are performed
on the stroke, and then the complex symbol classifier is applied. A decision tree with
three classifiers is used to recognize the stroke as a complex symbol. Finally, the
output symbol is combined with nearby symbols by rules at three levels to output a
music notation. Both the recognition rate and the recognition speed of our method are
better than those of the existing method.
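The control flow summarized above can be sketched as follows. Every component is passed in as a callable, and all names (`recognize_stroke`, `preprocess`, `simple_classifier`, `extractors`, `complex_classifier`) are hypothetical placeholders rather than the actual modules of the system. The final combination with nearby symbols is left out.

```python
def recognize_stroke(stroke, preprocess, simple_classifier, extractors, complex_classifier):
    """Recognize one stroke as a symbol (control-flow sketch only).

    preprocess         -- removes distortion from the raw stroke
    simple_classifier  -- returns a simple symbol, or None if it cannot decide
    extractors         -- feature extraction functions (three in the text)
    complex_classifier -- maps the extracted features to a complex symbol
    """
    cleaned = preprocess(stroke)

    # Try the simple symbol classifier first.
    symbol = simple_classifier(cleaned)
    if symbol is not None:
        return symbol

    # Otherwise extract the features and use the complex symbol classifier.
    features = [extract(cleaned) for extract in extractors]
    return complex_classifier(features)
```

Feature extraction runs only when the simple classifier cannot decide, so strokes that are simple symbols skip that step.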
This system is robust enough for general use. It provides all the common music
notations and easy-to-use modification operations. Furthermore, a music score playing
function is supported so that users can listen to the melody while they are editing the
music score. Users are able to compose a complete music score with this system.
Future work is as follows:
• Multi-stroke input: In the proposed system, a single stroke is taken as
input at a time. If the system supported multiple strokes in a single input, the
user would spend less time waiting for the recognition.
• More symbols supported: The symbols are related to the writing styles of the
notation. This system supports 17 kinds of symbols. Supporting more
symbols means providing more ways to write a notation.
• More uncommon notations supported: The commonly used music
notations are supported, but some notations used only for specific purposes
are not included in the system, like C Clef ..., etc. In the future, these
notations would be added to the system so that the system would be suitable
for professional use.
• Semantic hint: When the music score violates music theory, the
system would show hints to the user. This would be very useful for users who
are not familiar with music theory.
REFERENCES
[1] J. Anstice, T. Bell, A. Cockburn and M. Setchell, “The Design of a Pen-Based
Musical Input System,” In Proceedings of the 6th Australian Conference on
Computer-Human Interaction (OZCHI 1996), Hamilton, New Zealand, pp.
260-267, Nov. 1996.
[2] MagicScore Maestro software, DG software. (http://www.dgalaxy.net/)
[3] Allegro, finale software. (http://www.finalemusic.com/)
[4] S. E. George, “Online Pen-Based Recognition of Music Notation with Artificial
Neural Networks,” Computer Music Journal, vol. 27, no. 2, pp. 70-79, Jun. 2003.
[5] A. Forsberg, M. Dieterich, and R. Zeleznik, “The music notepad,” In Proceedings
of the 11th annual ACM symposium on User interface software and technology,
San Francisco, CA, USA, pp. 203-210, Nov. 1998.
[6] Calligrapher, ParaGraph International, Inc. (http://www.paragraph.com/)
[7] D. Rubine, “Specifying Gestures by Example,” In Proceedings of ACM
SIGGRAPH ’91, New York, USA, pp. 329-337, Jul. 1991.
[8] E. Ng, T. Bell and A. Cockburn, “Improvements to a Pen-Based Musical Input
System,” OzCHI’98: The Australian Conference on Computer-Human
Interaction, Adelaide, South Australia, pp. 178-185, Dec. 1998.
thesis, Brown University, 2005.
[10] S. Macé, E. Anquetil and B. Coüasnon, “A generic method to design pen-based
systems for structured document composition : Development of a musical score
editor,” In Proceedings of the 1st Workshop on Improving and Assessing
Pen-Based Input Techniques, Edinburgh, Scotland, pp. 15-22, Sep. 2005.
[11] H. Miyao and M. Maruyama, “An Online Handwritten Music Score Recognition
System,” In Proceedings of the 17th International Conference on Pattern
Recognition (ICPR 2004), Cambridge, United Kingdom, pp. 461-464, Aug.
2004.
[12] S. Connell and A. K. Jain, “Template-based Online Character Recognition,”
Pattern Recognition, vol. 34, no. 1, pp. 1-13, 2001.
[13] X. Li and N. S. Hall, “Corner detection and shape classification of on-line
handprinted Kanji strokes,” Pattern Recognition, vol. 26, no. 9, pp. 1315-1334, 1993.
[14] S. Theodoridis and K. Koutroumbas, Pattern Recognition, Academic Press, 2006.