Conclusion and Future Work - Integration of a Videoconference Transmitter Using MPEG-4 Object-Based Video Coding

We designed and implemented a videoconference system on a personal computer.

We use the functions supplied by the SDK library to capture the video and audio streams.

Video segmentation extracts the foreground, which is then compressed by the MPEG-4 video encoder. On the audio side, we simply feed the audio stream into the MPEG-4 audio encoder. Finally, the compressed data are delivered over the network through RTP.
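As an illustration of the last step, the sketch below serializes the fixed 12-byte RTP header that precedes each payload, following the field layout in RFC 1889. It is not the code used in this system, which relies on an RTP library; the struct and function names are hypothetical, and the choice of payload type is left to the caller as an assumption.

```cpp
#include <array>
#include <cstdint>

// Fields of the fixed RTP header (RFC 1889, section 5.1).
// The names are hypothetical and chosen for this sketch only.
struct RtpHeader {
    std::uint8_t  payloadType;  // 7-bit payload type
    bool          marker;       // meaning is profile-defined, e.g. end of a frame
    std::uint16_t sequence;     // increases by one for each packet sent
    std::uint32_t timestamp;    // sampling instant in media clock units
    std::uint32_t ssrc;         // synchronization source identifier
};

// Serialize the header in network byte order (big-endian).
std::array<std::uint8_t, 12> packRtpHeader(const RtpHeader& h) {
    std::array<std::uint8_t, 12> b{};
    b[0] = 0x80;  // version 2, no padding, no extension, zero CSRC entries
    b[1] = static_cast<std::uint8_t>((h.marker ? 0x80 : 0x00) | (h.payloadType & 0x7F));
    b[2] = static_cast<std::uint8_t>(h.sequence >> 8);
    b[3] = static_cast<std::uint8_t>(h.sequence & 0xFF);
    for (int i = 0; i < 4; ++i) {
        b[4 + i] = static_cast<std::uint8_t>((h.timestamp >> (24 - 8 * i)) & 0xFF);
        b[8 + i] = static_cast<std::uint8_t>((h.ssrc >> (24 - 8 * i)) & 0xFF);
    }
    return b;
}
```

The compressed video or audio data would follow these 12 bytes in each packet, with the sequence number incremented per packet so that the receiver can detect loss and reordering.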

Starting from the existing video capture program, the video segmentation system, and the optimized MPEG-4 video encoder, we modified the video capture to write into memory instead of directly to a virtual disk. We sped up the video segmentation by modifying the noise estimation stage, correcting a bug concerning the U and V components, removing the residual interface, and using MMX instructions. We also modified the input and output of the MPEG-4 video encoder to make it suitable for the videoconference system. In addition, we integrated several new components into the videoconference system, namely audio capture, the MPEG-4 audio encoder, and the RTP programs.

The integration work also required setting up the development environment.

We converted the C program files into C++ program files and integrated the libraries from the Visual Studio 6.0 edition into the Visual Studio .NET edition. We also integrated the Intel compiler into the .NET edition.

To improve quality, several improvements to the main project remain as future work.

1. Increasing the frame rate.

Although we spent much time on speeding up the system, the frame rate, about 10 frames per second, is still unsatisfactory. After analyzing the whole videoconference system, we found that the bottleneck is the MPEG-4 video encoder. There are two ways forward: continue optimizing the current encoder, or switch to another version or a different encoder.
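One way to confirm where the time goes is to accumulate the elapsed time of each stage over many frames and compare the totals. The following is a minimal sketch of such a measurement, not the profiling method used in this work; the class name `StageTimer` and the stage labels are hypothetical.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <string>

// Accumulates wall-clock time per named pipeline stage (hypothetical helper).
class StageTimer {
public:
    using Duration = std::chrono::steady_clock::duration;

    // Add an already measured duration to the stage's running total.
    void add(const std::string& stage, Duration elapsed) { totals_[stage] += elapsed; }

    // Run `work` and add its elapsed time to the stage's running total.
    void measure(const std::string& stage, const std::function<void()>& work) {
        const auto start = std::chrono::steady_clock::now();
        work();
        add(stage, std::chrono::steady_clock::now() - start);
    }

    // Name of the stage with the largest total; empty if nothing was recorded.
    std::string slowest() const {
        std::string name;
        Duration worst = Duration::zero();
        for (const auto& entry : totals_) {
            if (entry.second > worst) {
                worst = entry.second;
                name = entry.first;
            }
        }
        return name;
    }

private:
    std::map<std::string, Duration> totals_;
};
```

Wrapping each per-frame call, for example `timer.measure("video encode", [&] { /* encode one frame */ });`, and calling `slowest()` after a few hundred frames would indicate which stage limits the frame rate.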

2. Making the segmentation system more suitable.

The segmentation system we obtained sometimes does not work well. For example, when the scene contains an object with a lattice pattern, the segmentation system always classifies it as foreground. We should add more functions to handle such conditions.

3. Improving the stability.

The program is not very stable, especially when it communicates with the decoder.

We need to add a sleep instruction so that the decoder can receive the data and decode it; otherwise, the decoder receives nothing from the network. In addition, when the decoder is interrupted, the encoder is sometimes interrupted as well. We should correct these bugs.
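A fixed sleep is sensitive to how long the rest of the loop takes. One alternative, shown here only as a sketch and not as the code of this system, is to pace the sender so that successive sends are spaced at least a chosen interval apart. The class name `SendPacer` and the interval value are assumptions.

```cpp
#include <algorithm>
#include <chrono>
#include <thread>

// Spaces successive sends at least `interval` apart (hypothetical helper).
class SendPacer {
public:
    using Clock = std::chrono::steady_clock;

    explicit SendPacer(Clock::duration interval)
        : interval_(interval), next_(Clock::now()) {}

    // Block until the next send slot is due, then reserve the following slot.
    void waitForSlot() {
        const auto now = Clock::now();
        if (now < next_) {
            std::this_thread::sleep_until(next_);
        }
        // If the sender is already late, restart the schedule from `now`
        // so that a backlog of sends is not released in a burst.
        next_ = std::max(now, next_) + interval_;
    }

private:
    Clock::duration   interval_;
    Clock::time_point next_;
};
```

Calling `pacer.waitForSlot()` before each packet or frame is sent would give the receiver time to read and decode the previous data, while sleeping only for the part of the interval that the sender has not already spent on capture and encoding.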

4. Improving the compatibility of the environment.

The compatibility of the environment is also not very good. Take the Intel compiler as an example: every time we open a project, we have to change the setting of the “EnableWPO” item under Intel Specific(R). This is because the compatibility between .NET and the Intel compiler is imperfect. We should find another way to integrate the environment to improve this situation.

