4.3 Multistage character recognition

4.3.2 Batch training in character recognition

The design of the character recognizer is also based on the SPDNN model. For a K-character recognition problem, an SPDNN character recognizer consists of K subnets. Subnet i in the SPDNN recognizer estimates the distribution of the patterns of character i only, and treats all patterns that do not belong to character i as "non-i" patterns. Combined features such as CCT, BSPN, and STKO are used in the SPDNN character recognizer. The character SPDNN was trained with the ISLUG principle. During the retrieving phase, each subnet corresponding to a candidate character from the coarse classifier produces a score according to its discriminant function φ(x(t), wi). The subnet that produces the highest score is the winner, and its corresponding reference character is taken as the result of the character recognizer.
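The retrieving phase described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the exact form of φ is not given in this section, so the sketch assumes a log-likelihood of a diagonal-covariance Gaussian mixture, and the names `subnet_score`, `recognize`, and the toy parameters are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def subnet_score(x, clusters):
    """Assumed discriminant: log of a weighted sum of Gaussian clusters.

    clusters is a list of (weight, mean, var) tuples for one character.
    """
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in clusters]
    peak = max(logs)
    # log-sum-exp keeps the mixture numerically stable
    return peak + math.log(sum(math.exp(value - peak) for value in logs))

def recognize(x, subnets, candidates):
    """Score only the candidate subnets and return (winner, scores)."""
    scores = {c: subnet_score(x, subnets[c]) for c in candidates}
    winner = max(scores, key=scores.get)
    return winner, scores

# Toy example: three characters, two of them proposed by the coarse classifier.
subnets = {
    "A": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "B": [(0.5, [3.0, 3.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
    "C": [(1.0, [-4.0, 0.0], [1.0, 1.0])],
}
winner, scores = recognize([2.8, 3.1], subnets, candidates=["A", "B"])
```

Only the subnets named by the coarse classifier are evaluated, so subnet "C" is never scored in this toy run.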

Figure 4.5: Preprocessing and STKO feature extraction of a Chinese character.

Two experiments with the SPDNN character recognizer are discussed. The first experiment was performed on the CCL/HCCR1 [37] handwriting database, which has been used by several handwriting recognition research groups [34, 35, 1]. The second experiment explored the ability of SPDNN to deal with the multilinguistic handwriting recognition problem, which has seldom been discussed in the character recognition literature.

(1) Experiment 1–Handwritten Chinese Recognition:

We conducted experiments on the CCL/HCCR1 database, which contains more than 200 samples for each of 5401 frequently used Chinese characters. The samples were collected from 2600 people, including junior high school and college students as well as employees of ERSO/ITRI. According to recent surveys on handwriting recognition [25, 27, 39], most handwritten Chinese OCR studies are designed for small databases, i.e., they train and test on very small character sets of, e.g., a few hundred characters. Among studies that address a complete set of commonly used Chinese characters, Xia [40] developed an experimental system with a 3755-character set and achieved an 80% recognition rate. In [1], Li and Yu reported 88.65% recognition accuracy on the CCL/HCCR1 database. More recently, Tseng et al. [2] used the M distance method in their recognition system to achieve 88.55% accuracy on the same database. Table 4.2 shows the training and testing accuracy of the Gaussian model as well as of SPDNN with and without the unsupervised growing phase. Each subnet of SPDNN is initialized with one Gaussian cluster. Table 4.3 shows the distribution of the number of clusters per subnet at the end of training. The recognition accuracy of the Gaussian model without any training is 83.11%; it improves to 85.18% when the decision boundaries between classes are fine-tuned by the supervised learning of the SPDNN. After the unsupervised growing phase, the recognition accuracy improves further to 86.12%.

Table 4.2: Training and testing accuracy of the Gaussian model and of SPDNN with and without the unsupervised growing phase. Each subnet of SPDNN is initialized with one Gaussian cluster.

Data set       Gaussian model   SPDNN (mix=1)   SPDNN
Training set   89.49%           94.87%          97.78%
Testing set    83.11%           85.18%          86.12%

Table 4.3: The distribution of the number of clusters in each class.

Number of clusters   1      2      3     4    5
Number of classes    3697   1376   252   69   7

Table 4.4 summarizes a performance comparison of these systems evaluated on the CCL/HCCR1 database. We comment on the overall performance of these systems as follows. Compared with the large number of character features used by other researchers, e.g., 400 features in [1] or 256 features in [2], the SPDNN recognizer uses only 92 features. A more relevant comparison could be made if a comparable number of training and testing features for these two systems were available. In fact, the SPDNN character recognizer is designed to use no more than 100 sets of features, since more feature sets would require more memory storage and longer recognition time.

Two reasons explain why an SPDNN-based system can use fewer features yet achieve comparable performance. (1) The mixed Gaussian-based discriminant function permits SPDNN to learn the character decision boundary precisely. (2) The self-growing rules let a small number of Gaussian clusters suffice to represent the character image distribution.
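As a quick check of reason (2), the cluster counts reported in Table 4.3 can be summarized numerically. The short calculation below uses only the figures from that table; the variable names are illustrative.

```python
# Table 4.3: number of clusters per subnet -> number of classes with that count
cluster_histogram = {1: 3697, 2: 1376, 3: 252, 4: 69, 5: 7}

total_classes = sum(cluster_histogram.values())
total_clusters = sum(k * n for k, n in cluster_histogram.items())

# Average number of Gaussian clusters needed per character class
mean_clusters = total_clusters / total_classes

# Fraction of classes that never grew beyond their initial single cluster
single_cluster_share = cluster_histogram[1] / total_classes

print(total_classes, total_clusters)
print(round(mean_clusters, 2), round(single_cluster_share, 2))
```

The class counts sum to the 5401 characters in the database, and the average works out to about 1.39 clusters per class, with roughly two thirds of the classes keeping a single cluster.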

Table 4.4: Performance of different handwriting recognizers on the CCL/HCCR1 database. A portion of this table is adapted from Li et al. [1] and Tseng et al. [2].

Various systems   Recognition accuracy   Features used   Training and testing data used   Classification time
SPDNN             86.12%                 92              50-50                            0.24 sec/char
Li et al.         88.65%                 400             50-1                             NA
Tseng et al.      88.55%                 256             100-100                          0.6 sec/char

(2) Experiment 2–Multilinguistic Handwriting Recognition:

A search of major conference proceedings, journals, and Web sites turned up no performance reports on this type of handwriting. We therefore conducted experiments on the combined CCL/HCCR1 and CEDAR [38] databases. The training and testing data sets for Chinese characters were selected in the same way as in Experiment 1. The CEDAR database contains various styles of handwritten alphanumerics, which were lifted from envelope address blocks in the USA. Of these data, 4000 alphanumerics were used for training and 2000 for testing. We also conducted experiments with rejection. The rejection criterion was implemented through the threshold value Ti, which can be learned by the reinforced and antireinforced learning rules. In general, when an input character is correctly recognized with a certain confidence, the output of its discriminant function should exceed the second-largest output among the other discriminant functions by a gap larger than Ti.
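The rejection rule can be illustrated with a short sketch. It assumes that the threshold applied is the Ti of the winning subnet and that a lone candidate is accepted without a margin test; neither detail is specified in this section, and the function name `classify_with_rejection` is hypothetical. How Ti is learned is not shown here.

```python
def classify_with_rejection(scores, thresholds):
    """Return the top-scoring label, or None when the input is rejected.

    scores:     dict mapping label -> discriminant function output
    thresholds: dict mapping label -> per-subnet threshold Ti
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    best = ranked[0]
    if len(ranked) == 1:
        # Assumption: with no competitor there is no gap to test.
        return best
    gap = scores[best] - scores[ranked[1]]
    # Accept only if the winner leads the runner-up by more than its own Ti.
    return best if gap > thresholds[best] else None

thresholds = {"a": 2.0, "b": 2.0}
accepted = classify_with_rejection({"a": 5.0, "b": 1.0}, thresholds)
rejected = classify_with_rejection({"a": 5.0, "b": 4.5}, thresholds)
```

In the first call the gap of 4.0 exceeds the threshold, so the label is returned; in the second the gap of 0.5 is too small and the input is rejected.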

The experimental results are as follows. For the sake of comparison, we adjusted the thresholds Ti so that the proposed system has a 0% false rejection rate during the training phase. The recognition accuracies at false rejection rates of 0% and 6.7% in the testing phase are shown in Table 4.5.

Li and Yu's Bayesian-rule-based statistical recognition system [1] cannot provide a rejection function. SPDNN's rejection function, by contrast, is based on the reinforced and antireinforced learning rules, so each subnet, which represents one character in SPDNN, can have its own rejection criterion. We think this characteristic is beneficial for real-world applications.

Table 4.5: Performance of SPDNN handwritten character recognizers with and without rejection on the CCL/HCCR1 and CEDAR databases.

Systems            Top 1 Accu.   Top 2 Accu.   Top 3 Accu.
SPDNN (rej=0%)     90.12%        93.49%        94.75%
SPDNN (rej=6.7%)   94.11%        97.01%        97.67%
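For readers unfamiliar with the column headings of Table 4.5, the sketch below shows the usual definition of top-k accuracy: a test sample counts as correct if its true label appears among the k highest-scoring candidates. This is assumed to be the measure reported; the thesis does not state how rejected samples enter the count, so rejection is left out of the sketch, and the function name is hypothetical.

```python
def top_k_accuracy(ranked_predictions, labels, k):
    """Fraction of samples whose true label is among the first k predictions.

    ranked_predictions: one list per sample, labels ordered by descending score
    labels:             the true label of each sample
    """
    hits = sum(
        1 for ranked, truth in zip(ranked_predictions, labels)
        if truth in ranked[:k]
    )
    return hits / len(labels)

# Toy data: the true label "a" is ranked first, second, and third respectively.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
truth = ["a", "a", "a"]
top1 = top_k_accuracy(ranked, truth, 1)
top2 = top_k_accuracy(ranked, truth, 2)
top3 = top_k_accuracy(ranked, truth, 3)
```

By construction the value cannot decrease as k grows, which matches the left-to-right increase in each row of the table.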
