With the increasing popularity of video over the Internet and the versatility of video applications such as video surveillance, vision-based control, human-computer interfaces, medical imaging, and robotics, the availability and efficiency of video will rely heavily on object detection and related object tracking capabilities. Hence, we present an effective, efficient, and reliable scheme for automatically extracting independently moving video objects using motion vector fields. Running our system with a specific configuration as a preliminary step before the object detection algorithm improves performance and reduces the computation time of the object detection system as a whole, as verified by our experimental results. Furthermore, we believe our scheme can satisfy the requirements mentioned in Chapter 7, where we emphasize the importance of accurate motion vectors to human perception; motion information is well known to be an important cue for humans perceiving video content. We gain additional efficiency and speed by staying fully in the compressed domain, using only P-frames, and using a simple filter implementation. Moreover, initialization and operation of our system are simple as well.
For the texture filter, the AC coefficients of a DCT-transformed macroblock indirectly indicate how textured the corresponding image area is. Low-textured regions tend to be matched poorly during encoding. Because B- and P-frames are residue coded, their texture measures are propagated from the I-frames by inverse motion compensation. The macroblock resulting from inverse motion compensation can overlap four 8x8 DCT blocks in the I-frame, so we take the average energy of these four neighboring blocks as an approximation of the true block energy. The texture measure is then based on AC energy: the AC DCT coefficients are grouped into horizontal, vertical, and diagonal energy groups, and the texture measure is the average of the three group energies.
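The texture measure described above can be sketched as follows. This is a minimal illustration, not the exact implementation used in our system. It rests on several assumptions: the horizontal group is the first row of the block (excluding DC), the vertical group is the first column, the diagonal group is every remaining AC coefficient, and energy is the sum of squared coefficients. The function names are hypothetical.

```python
from typing import Sequence, Tuple

# An 8x8 block of DCT coefficients indexed as block[v][u];
# block[0][0] is the DC term and is excluded from all AC groups.
Block = Sequence[Sequence[float]]


def ac_energy_groups(block: Block) -> Tuple[float, float, float]:
    """Return (horizontal, vertical, diagonal) AC energies of one DCT block.

    Assumed grouping: first row = horizontal, first column = vertical,
    all other AC coefficients = diagonal. Energy = sum of squares.
    """
    n = len(block)
    horizontal = sum(block[0][u] ** 2 for u in range(1, n))
    vertical = sum(block[v][0] ** 2 for v in range(1, n))
    diagonal = sum(block[v][u] ** 2 for v in range(1, n) for u in range(1, n))
    return float(horizontal), float(vertical), float(diagonal)


def texture_measure(block: Block) -> float:
    """Average of the three group energies for a single DCT block."""
    return sum(ac_energy_groups(block)) / 3.0


def propagated_texture(overlapped_blocks: Sequence[Block]) -> float:
    """Approximate the texture of a motion-compensated macroblock as the
    plain average over the I-frame blocks it overlaps (four in the text)."""
    if not overlapped_blocks:
        raise ValueError("at least one overlapped block is required")
    total = sum(texture_measure(b) for b in overlapped_blocks)
    return total / len(overlapped_blocks)
```

A block containing only a DC term yields a texture measure of zero under this sketch, so a threshold on the returned value could serve to reject low-textured macroblocks; the threshold itself would have to be chosen experimentally.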
In future work, we plan to extend our proposed scheme by adopting a method that converts motion vectors in the MPEG coded domain into a uniform set that is independent of the frame type and the direction of prediction. By using these normalized motion vectors in our system, we expect to achieve better object detection performance.
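One possible form of such a normalization is sketched below. It is only an illustration of the idea and not the specific method we intend to adopt. It assumes that a motion vector is made independent of frame type by dividing it by the temporal distance (in frames) to its reference frame, and independent of prediction direction by negating backward-predicted vectors. The function name and parameters are hypothetical.

```python
from typing import Tuple


def normalize_motion_vector(
    mv: Tuple[float, float],
    frame_distance: int,
    backward: bool = False,
) -> Tuple[float, float]:
    """Convert a raw motion vector to a per-frame, direction-consistent one.

    mv             -- (dx, dy) as decoded from the bitstream
    frame_distance -- number of frames between the current frame and its
                      reference frame (assumed to be known from the GOP
                      structure)
    backward       -- True if the vector refers to a future reference frame
    """
    if frame_distance <= 0:
        raise ValueError("frame_distance must be positive")
    dx, dy = mv
    # Assumption: flipping the sign aligns backward-predicted vectors
    # with forward-predicted ones.
    sign = -1.0 if backward else 1.0
    return (sign * dx / frame_distance, sign * dy / frame_distance)
```

Under these assumptions, a forward vector of (6, -3) spanning three frames and a backward vector of (-2, 1) spanning one frame both map to (2.0, -1.0), so vectors from P- and B-frames can be compared directly.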