
If the entropy of node S is low, most of the training instances reaching node S belong to the same class. Otherwise, if the entropy is high, many of the training instances reaching node S belong to different classes, and node S should therefore be split further.
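Although Eq. 15 is not reproduced in this excerpt, a minimal sketch of this test, assuming the standard Shannon entropy over the class distribution at a node (the helper name node_entropy is ours), might look as follows:

import math
from collections import Counter

def node_entropy(labels):
    # Shannon entropy of the class labels reaching a node (cf. Eq. 15).
    total = len(labels)
    probs = [c / total for c in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probs)

# A pure node has zero entropy and can be labelled as a leaf ...
print(node_entropy(["crying", "crying", "crying"]))     # 0.0
# ... while a mixed node has high entropy and should be split further.
print(node_entropy(["crying", "laughing", "yawning"]))  # ~1.585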

The correlation coefficient $r_{A_iA_j}$ (Eq. (14)) between two attributes $A_i$ and $A_j$ of a training instance can be used to split the training instances.

If $r_{A_iA_j} > 0$, a training instance is routed to one child node (the "yes" branch); otherwise, it is routed to the other. The accuracy of the split can then be measured by the split-accuracy measure of Eq. (16). Finally, the best correlation coefficient selected by the system is $r_{A_{i^*}A_{j^*}}$, the coefficient that maximizes this accuracy (Eq. (17)).

Note that once a correlation coefficient has been selected at a node, it cannot be selected again by that node's descendants.
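Since Eq. 14 is not reproduced in this excerpt, the following sketch assumes it is the Pearson correlation coefficient between two attribute sequences computed over the frames of one training instance; the helper name, attribute indices, and random test sequence are purely illustrative:

import numpy as np

def corr_sign(seq, i, j):
    # True if the correlation coefficient r_{A_i A_j} between attributes
    # A_i and A_j, computed over the frames of one sequence, is positive.
    return np.corrcoef(seq[:, i], seq[:, j])[0, 1] > 0

# Route a fifteen-frame sequence of the seven Hu moments at a node whose
# split function is, e.g., r_{H3 H5} > 0 (zero-based attribute indices 2, 4).
seq = np.random.rand(15, 7)  # hypothetical moment sequence for one instance
branch = "yes" if corr_sign(seq, 2, 4) else "no"
print(branch)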

The algorithm to construct a binary classification tree is shown here:

Algorithm: Decision tree construction

Step 1: Initially, put all the training instances into the root node $S_R$. Regard $S_R$ as an internal decision node and insert $S_R$ into a decision node queue.

Step 2: Select an internal decision node S from the decision node queue. Calculate the entropy of node S using Eq. 15. If the entropy of node S is larger than a threshold $T_s$, proceed to Step 3; otherwise, label node S as a leaf node and proceed to Step 4.

Step 3: Find the best correlation coefficient $r_{A_{i^*}A_{j^*}}$ to split the training instances in node S using Eqs. 16 and 17. Split the training instances in S into two child nodes S1 and S2 using the selected correlation coefficient, regard S1 and S2 as internal decision nodes, and insert them into the decision node queue.

Step 4: If the decision node queue is not empty, return to Step 2; otherwise, stop the algorithm.
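The queue-based procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it reuses node_entropy and corr_sign from the earlier sketches, and, because Eqs. 16 and 17 are not reproduced in this excerpt, it scores candidate splits by entropy reduction as a stand-in for the paper's split-accuracy measure. The threshold value is illustrative.

from collections import Counter, deque
from itertools import combinations

class Node:
    def __init__(self, data):
        self.data = data              # list of (sequence, label) pairs
        self.pair = None              # attribute pair (i, j) used to split
        self.left = self.right = None
        self.label = None             # majority class, set only for leaves

def split(data, i, j):
    # Partition instances by the sign of r_{A_i A_j} (Step 3).
    left = [(s, y) for s, y in data if corr_sign(s, i, j)]
    right = [(s, y) for s, y in data if not corr_sign(s, i, j)]
    return left, right

def split_score(data, i, j):
    # Stand-in for Eq. 16: entropy reduction achieved by the split.
    left, right = split(data, i, j)
    if not left or not right:
        return float("-inf")          # degenerate split, never preferred
    before = node_entropy([y for _, y in data])
    after = sum(len(part) / len(data) * node_entropy([y for _, y in part])
                for part in (left, right))
    return before - after

def build_tree(data, n_attr, T_s=0.3):  # T_s: illustrative entropy threshold
    root = Node(data)
    queue = deque([(root, frozenset())])             # Step 1
    while queue:                                     # Step 4
        node, used = queue.popleft()                 # Step 2
        labels = [y for _, y in node.data]
        # A coefficient already selected on this path cannot recur.
        pairs = [p for p in combinations(range(n_attr), 2) if p not in used]
        if node_entropy(labels) <= T_s or not pairs:
            node.label = Counter(labels).most_common(1)[0][0]   # leaf node
            continue
        best = max(pairs, key=lambda p: split_score(node.data, *p))  # Step 3
        left, right = split(node.data, *best)
        if not left or not right:                    # unsplittable: make leaf
            node.label = Counter(labels).most_common(1)[0][0]
            continue
        node.pair = best
        node.left, node.right = Node(left), Node(right)
        queue.append((node.left, used | {best}))
        queue.append((node.right, used | {best}))
    return root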

7 EXPERIMENTAL RESULTS

The input data for our system was acquired using a SONY TRV-900 video camera mounted above the crib and processed on a PC with an Intel® Core™ 2 1.86 GHz CPU. The input video sequences, recorded at a rate of 30 frames/second, were down-sampled to six frames/second, which is the processing speed of our current system. To increase the processing rate further, we also reduced the size of each image from 640 × 480 pixels to 320 × 240 pixels.

Figure 6: The decision tree of the Hu moments.
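As a minimal sketch of this preprocessing step (assuming OpenCV; the function name and input file are hypothetical):

import cv2

def preprocess(video_path, src_fps=30, dst_fps=6, size=(320, 240)):
    # Down-sample a 30 fps clip to 6 fps and shrink frames to 320 x 240.
    cap = cv2.VideoCapture(video_path)
    step = src_fps // dst_fps            # keep every fifth frame
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            frames.append(cv2.resize(frame, size))
        index += 1
    cap.release()
    return frames

frames = preprocess("infant_clip.avi")   # hypothetical input file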

Five infant facial expressions, namely crying, dazing, laughing, yawning, and vomiting, were classified in this study. Three poses of the infant head were considered: front, turned left, and turned right (Figure 4 shows an infant yawning in each of the three poses), giving a total of fifteen classes.

In the first experiment, the Hu moments and their correlation coefficients were calculated using Eqs. 7 and 14, and a corresponding decision tree was constructed using the decision tree construction algorithm. Figure 6 shows the decision tree built from the correlation coefficients between the Hu moments as the split functions; the left subtree is depicted in Figure 7 and the right subtree in Figure 8. The split functions at the roots of the left and right subtrees are $r_{H_3H_5} > 0$ and $r_{H_6H_7} > 0$, respectively.

When Figure 7 and Figure 8 are compared with each other, it can be seen that most of the sequences with the infant head position 'turn right' are classified into the left subtree shown in Figure 7. Similarly, many sequences with the infant head position 'turn left' are classified into the right subtree shown in Figure 8.

Similarly, the same fifty-nine fifteen-frame sequences were used to train and create the decision trees of the R and Zernike moments. The R moments and their correlation coefficients were calculated using Eqs. 8 and 14. The decision tree created from the correlation coefficients of the R moments consists of fifteen internal nodes and seventeen leaves, with a height of ten. The experimental results are shown in Table 2.

Moreover, the Zernike moments and their correlation coefficients were calculated using Eqs. 9 and 14. The decision tree created from the correlation coefficients of the Zernike moments includes nineteen internal nodes and twenty leaves, with a height of seven.

Table 2 also shows the classification results of the same thirty testing sequences. We observe that the correlation coefficients of the moments are useful attributes for classifying infant facial expressions. Moreover, the classification tree created from the Hu moments has a smaller height and fewer nodes, yet a higher classification rate.

Figure 7: The left subtree of the decision tree depicted in Figure 6.

Table 2: The experimental results. Columns: (1) number of training sequences; (2) number of nodes (internal nodes + leaves); (3) height of the decision tree; (4) number of testing sequences; (5) classification rate.

8 CONCLUSION

This paper presented an infant facial expression recognition technique for a vision-based infant surveillance system. To obtain more reliable experimental results, we will collect more experimental sequences and construct a more complete infant facial expression database. The binary classification trees constructed in this study may be less tolerant of noise: if the correlation coefficients are close to zero, noise can greatly affect the classification results. Fuzzifying the decision tree may help solve this problem. The infant facial expression recognition system is only one part of an intelligent infant surveillance system, and we hope that this recognition system will be embedded into such a system in the near future.

ACKNOWLEDGMENT

The authors would like to thank the National Science Council of the Republic of China, Taiwan, for financially supporting this research under Contract Nos. NSC 98-2221-E-003-014-MY2 and NSC 98-2631-S-003-002.

REFERENCES

Doi, M., Inoue, H., Aoki, Y., and Oshiro, O., 2006. Video surveillance system for elderly person living alone by person tracking and fall detection, IEEJ Transactions on Sensors and Micromachines, Vol. 126, pp. 457-463.

Department of Health, Taipei City Government, 2007. http://www.health.gov.tw/.

The State of Alaska, 2005. Unintentional infant injury in Alaska, Women's, Children's, and Family Health, Vol. 1, pp. 1-4, http://www.epi.hss.state.ak.us/mchepi/pubs/facts/fs2005na_v1_18.pdf.

Pal, P., Iyer, A. N., and Yantorno, R. E., 2006. Emotion detection from infant facial expressions and cries, IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 2, pp. 14-19.

Zhi, R., and Ruan, Q., 2008. A comparative study on region-based moments for facial expression recognition, The Proceedings of Congress on Image and Signal Processing, Vol. 2, pp. 600-604.

Hu, M. K., 1962. Visual pattern recognition by moment invariants, IRE Transactions on Information Theory, Vol. 8, pp. 179-187.

Liu, J., Liu, Y., and Yan, C., 2008. Feature extraction technique based on the perceptive invariability, Proceedings of the Fifth International Conference on Fuzzy Systems and Knowledge Discovery, Shandong, China, pp. 551-554.

Alpaydin, E., 2004. Introduction to Machine Learning, Chapter 9, MIT Press, Massachusetts, USA.