o Loss function: cross entropy. Batch size: 64. Epochs: 100.
o Output: SoftMax. PairNet and ConvNet, 14: layers 1~3 use 128 and layers 4~5 use 256.
o Activation: Rectified Linear Unit (ReLU), with a batch normalization layer.
o Optimizer: AdaDelta [43] for PairNet and ConvNet; 2 × 1.
o Optimizer: Adam [44] for Bi-LSTM and Bi-GRU; 256. Bi-LSTM and Bi-GRU (13): 256; 256 with 0.25 DropOut [38].
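The SoftMax output and cross-entropy loss named in the settings above can be illustrated with a minimal sketch. It is plain Python rather than the framework used for the experiments, which the text does not name. The batch size of 64 is taken from the settings; the class count of 6 and the all-zero scores are assumptions chosen only for illustration.

```python
import math

def softmax(logits):
    """Return SoftMax probabilities for a list of raw class scores."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Cross-entropy loss of one sample whose correct class index is true_class."""
    return -math.log(probs[true_class])

def batch_loss(batch_logits, batch_labels):
    """Mean cross-entropy over one mini-batch."""
    losses = [cross_entropy(softmax(z), y)
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

BATCH_SIZE = 64   # from the settings above
NUM_CLASSES = 6   # assumption for illustration only

# An untrained model that scores every class equally (hypothetical input).
logits = [[0.0] * NUM_CLASSES for _ in range(BATCH_SIZE)]
labels = [i % NUM_CLASSES for i in range(BATCH_SIZE)]
uniform_loss = batch_loss(logits, labels)
```

With equal scores for every class the loss equals the natural logarithm of the number of classes, which is a common sanity check for the first training step.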
a
3-4
4-1 10 1 6 PairNet
0
( n 19 6
( 638 ) PairNet
97.81% ( 1000 )
PairNet 93.1% h 90% Bi-LSTM
PairNet ) 5.3% ) ( 588 )
PairNet Bi-GRU ) 12.82%
Bi-GRU 25.03% PairNet )12.78%
1.
19. PairNet
)12.78% CNN 78.9% n
PairNet PairNet
72.21% PairNet 85.03% )12.82%
2 PairNet CNN ) 58
14 PairNet 108 ( Bi-LSTM 112
PairNet )12.82%
2.
20. 6 PairNet
CNN ) 58 CNN 65% PairNet 106 Bi-LSTM
with Att. 112
•
21. 14
Google Daydream Controller
20ms ( 50 ) 20 0 14
B z n
n
7910 2~5
294
0
o Loss function: cross entropy. Batch size: 128. Epochs: 100.
o Output: SoftMax. PairNet and ConvNet, 13: layers 1~3 use 128 and layers 4~5 use 256.
o Activation: ReLU, with a batch normalization layer.
o Optimizer: AdaDelta [43] for PairNet and ConvNet; 2 × 1.
o Optimizer: Adam for Bi-LSTM and Bi-GRU; 128. Bi-LSTM and Bi-GRU (12): 128; 128 with 0.25 DropOut.
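The 0.25 DropOut setting above can be sketched as follows. This is a generic inverted-dropout illustration in plain Python, not the implementation used in the experiments; the function name `dropout`, the fixed seed, and the 128 constant activations are assumptions made for the example.

```python
import random

def dropout(values, rate=0.25, training=True, rng=None):
    """Inverted dropout: zero each value with probability `rate` during
    training and rescale the survivors by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(values)  # inference: values pass through unchanged
    if rng is None:
        rng = random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

# Hypothetical layer output of width 128, all ones for easy inspection.
activations = [1.0] * 128
dropped = dropout(activations, rate=0.25, rng=random.Random(0))
```

Rescaling the surviving values keeps the expected activation unchanged, so no correction is needed at test time.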
3 6 PairNet 99.38%
Bi-GRU 1.22%
3.
22. PairNet 100% 99.4%
4 ) 6 PairNet
) 63% 28 PairNet (
Bi-LSTM 30
4.
• PairNet
14 PairNet ConvNet
c
1 2 PairNet
) ( 2 1)
(Overlapping) (Non-overlapping) (Stride) z 1 2
5 6 PairNet
(Max-pooling) c (Global Average Pooling) 6 6 PairNet
c
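The contrast drawn above between overlapping and non-overlapping windows (stride 1 versus stride 2) and between max-pooling and global average pooling can be sketched in plain Python. The window width of two samples, the pool size of 2, the helper names, and the sample values are assumptions for illustration; this is not the PairNet implementation.

```python
def pair_windows(sequence, stride):
    """Slide a window of two consecutive samples over the sequence.
    stride=1 gives overlapping pairs, stride=2 gives non-overlapping pairs."""
    return [(sequence[i], sequence[i + 1])
            for i in range(0, len(sequence) - 1, stride)]

def max_pool_1d(features, size=2):
    """Local max-pooling over non-overlapping blocks of `size` values."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]

def global_average_pool(features):
    """Collapse a whole feature sequence into its mean value."""
    return sum(features) / len(features)

samples = [1, 2, 3, 4, 5, 6]                        # hypothetical time steps
overlapping = pair_windows(samples, stride=1)       # 5 pairs
non_overlapping = pair_windows(samples, stride=2)   # 3 pairs

features = [1, 5, 2, 4, 3, 3]                       # hypothetical feature map
pooled_max = max_pool_1d(features)
pooled_avg = global_average_pool(features)
```

A stride of 1 reuses every sample in two neighbouring pairs, while a stride of 2 uses each sample once and halves the output length.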
5. 6 c
5 6 6 PairNet
c 1 3 ConvNet
-n 6 )
(Three-axis Accelerometer) (Three-axis Gyroscope) B
a n PairNet n
6 h - PairNet
h 12.82% 6 (
60~65% PairNet
h (Generalization)
PairNet 6 ResNet
(Shortcut) [41] h 6
PairNet a
n (
[1] Cabral, Marcio C., Carlos H. Morimoto, and Marcelo K. Zuffo. "On the usability of gesture interfaces in virtual reality environments." Proceedings of the 2005 Latin American conference on Human-computer interaction. ACM, 2005.
[2] Vogler, Christian, and Dimitris Metaxas. "Parallel hidden Markov models for American Sign Language recognition." Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on. Vol. 1. IEEE, 1999.
[3] Chai, Xiujuan, et al. "Sign language recognition and translation with Kinect." IEEE Conf. on AFGR. 2013.
[4] Sun, Chao, et al. "Discriminative exemplar coding for sign language recognition with Kinect." IEEE Transactions on Cybernetics 43.5 (2013): 1418-1428.
[5] Zhang, Xu, et al. "A framework for hand gesture recognition based on accelerometer and EMG sensors." IEEE Transactions on Systems, Man, and Cybernetics-Part A: Systems and Humans 41.6 (2011): 1064-1076.
[6] Starner, Thad, and Alex Pentland. "Real-time American Sign Language recognition from video using hidden Markov models." Motion-Based Recognition. Springer, Dordrecht, 1997. 227-243.
[7] Xu, Deyou. "A neural network approach for hand gesture recognition in virtual reality driving training system of SPG." Pattern Recognition, 2006. ICPR 2006. 18th International Conference on. Vol. 3. IEEE, 2006.
[8] Murthy, G. R. S., and R. S. Jadon. "A review of vision based hand gestures recognition." International Journal of Information Technology and Knowledge Management 2.2 (2009): 405-410.
[9] Mitra, Sushmita, and Tinku Acharya. "Gesture recognition: A survey." IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 37.3 (2007): 311-324.
[10] Stotts, David, Jason McC Smith, and Karl Gyllstrom. "Facespace: endo-and exo-spatial hypermedia in the transparent video facetop." Proceedings of the fifteenth ACM conference on Hypertext and hypermedia. ACM, 2004.
[11] Rautaray, Siddharth S., and Anupam Agrawal. "Vision based hand gesture recognition for human computer interaction: a survey." Artificial Intelligence Review 43.1 (2015): 1-54.
[12] Sharma, Rajeev, et al. "Speech/gesture interface to a visual computing environment for molecular biologists." Pattern Recognition, 1996., Proceedings of the 13th International
[13] O'Hagan, R. G., Alexander Zelinsky, and Sebastien Rougeaux. "Visual gesture interfaces for virtual environments." Interacting with Computers 14.3 (2002): 231-250.
[14] Molchanov, Pavlo, et al. "Hand gesture recognition with 3D convolutional neural networks." Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2015.
[15] Ohn-Bar, Eshed, and Mohan Manubhai Trivedi. "Hand gesture recognition in real time for automotive interfaces: A multimodal vision-based approach and evaluations." IEEE transactions on intelligent transportation systems 15.6 (2014): 2368-2377.
[16] Gupta, Hari Prabhat, et al. "A continuous hand gestures recognition technique for human-machine interaction using accelerometer and gyroscope sensors." IEEE Sensors Journal 16.16 (2016): 6425-6432.
[17] Elmezain, Mahmoud, et al. "A hidden Markov model-based continuous gesture recognition system for hand motion trajectory." Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, 2008.
[18] Potter, Leigh Ellen, Jake Araullo, and Lewis Carter. "The Leap Motion controller: a view on sign language." Proceedings of the 25th Australian computer-human interaction conference: augmentation, application, innovation, collaboration. ACM, 2013.
[19] Luzhnica, Granit, et al. "A sliding window approach to natural hand gesture recognition using a custom data glove." 3D User Interfaces (3DUI), 2016 IEEE Symposium on. IEEE, 2016.
[20] Marin, Giulio, Fabio Dominio, and Pietro Zanuttigh. "Hand gesture recognition with Leap Motion and Kinect devices." Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014.
[21] Lee, Hyeon-Kyu, and Jin-Hyung Kim. "An HMM-based threshold model approach for gesture recognition." IEEE Transactions on pattern analysis and machine intelligence 21.10 (1999): 961-973.
[22] Laurel, Brenda, and S. Joy Mountford. The art of human-computer interface design. Addison-Wesley Longman Publishing Co., Inc., 1990.
[23] Huang, Deng-Yuan, Wu-Chih Hu, and Sung-Hsiang Chang. "Vision-based hand gesture recognition using PCA+ Gabor filters and SVM." Intelligent Information Hiding and Multimedia Signal Processing, 2009. IIH-MSP'09. Fifth International Conference on. IEEE, 2009.
[24] Hong, Pengyu, Matthew Turk, and Thomas S. Huang. "Gesture modeling and recognition using finite state machines." Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on. IEEE, 2000.
[25] Lefebvre, Grégoire, et al. "BLSTM-RNN based 3D gesture classification." International Conference on Artificial Neural Networks. Springer, Berlin, Heidelberg, 2013.
[26] Shin, Sungho, and Wonyong Sung. "Dynamic hand gesture recognition for wearable devices with low complexity recurrent neural networks." Circuits and Systems (ISCAS), 2016 IEEE International Symposium on. IEEE, 2016.
[27] Ordóñez, Francisco Javier, and Daniel Roggen. "Deep convolutional and LSTM recurrent neural networks for multimodal wearable activity recognition." Sensors 16.1 (2016): 115.
[28] Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. "Learning long-term dependencies with gradient descent is difficult." IEEE transactions on neural networks 5.2 (1994): 157-166.
[29] Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. "On the difficulty of training recurrent neural networks." International Conference on Machine Learning. 2013.
[30] Hochreiter, Sepp. "The vanishing gradient problem during learning recurrent neural nets and problem solutions." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6.02 (1998): 107-116.
[31] Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
[32] Gers, Felix A., Jürgen Schmidhuber, and Fred Cummins. "Learning to forget: Continual prediction with LSTM." (1999).
[33] Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013).
[34] Gers, Felix A., Nicol N. Schraudolph, and Jürgen Schmidhuber. "Learning precise timing with LSTM recurrent networks." Journal of machine learning research 3.Aug (2002): 115-143.
[35] Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
[36] Chung, Junyoung, et al. "Empirical evaluation of gated recurrent neural networks on sequence modeling." arXiv preprint arXiv:1412.3555 (2014).
[37] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
[38] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
[39] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale
[40] Szegedy, Christian, et al. "Going deeper with convolutions." CVPR, 2015.
[41] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
[42] Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention." International Conference on Machine Learning. 2015.
[43] Zeiler, Matthew D. "ADADELTA: an adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).
[44] Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).