National Science Council, Executive Yuan: Final Report of a Subsidized Research Project
Applications of Multimedia Framework and Digital Video Watermarking on Internet
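Project type: □ Individual project ☑ Integrated project
Project number: NSC-89-2213-E-009-234
Project period: August 1, 2000 to July 31, 2001
Principal investigator: Prof. 蔣迪豪
This report includes the following required attachments:
□ Report on an overseas business trip or study
□ Report on a business trip or study in mainland China
□ Report on attendance at an international academic conference, with one copy of each paper presented
□ Overseas research report for an international cooperative research project
Executing institution: Institute of Electronics, National Chiao Tung University
August 1, 2001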
Project participants: 李俊毅, 林耀中, 王俊能 (Institute of Electronics, National Chiao Tung University)
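Summary
Advances in multimedia technology have driven the broad development of applications for information access and data delivery. With so many competing multimedia applications, however, it has become difficult for them to interoperate when delivering and accessing data. MPEG-21 therefore defines a framework and rules for universal multimedia access. Through this framework, the many heterogeneous elements of a multimedia application environment, such as access devices, networks, terminals, natural environments, and user preferences, can connect and interact by negotiation. The framework also provides means of protecting intellectual property rights. Users can thus enjoy multimedia information anytime, anywhere, and in whatever way they wish.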
Keywords: MPEG-21, MPEG-4, MPEG-7, Universal Multimedia Access, FGS (Fine Granularity Scalable), digital watermarking, transcoder
ABSTRACT
MPEG-21 provides a solution, named Universal Multimedia Access (UMA), for constructing a multimedia content delivery and rights management framework. Based on the concepts of UMA, we build a simplified UMA model on the Internet. In this model, the source video with an embedded watermark is encoded and saved as FGS bitstreams. Video features such as motion vectors are recorded in MPEG-7 descriptors. To support video content in different formats, we adopt a transcoder that converts the bitstream from the FGS format to a format that fits the terminal capability. Moreover, the database is managed following the MPEG-7 syntax.
Keywords: MPEG-21, Universal Multimedia Access, MPEG-4, MPEG-7, Digital Watermarking, FGS, Transcoding.
I. Introduction
Multimedia technology now provides the different players in the multimedia value and delivery chain with an excess of information and services. Ubiquitous terminals and networks give access to this information and these services from almost anywhere at any time. So far, however, no solution enables all the heterogeneous communities to interact and interoperate with one another.
MPEG-21 provides a solution named Universal Multimedia Access (UMA) [1]. Its major aim is to describe how various elements, such as networks, terminals, user preferences, and natural environments, fit together. It defines a multimedia framework that enables transparent and augmented use of multimedia resources across the wide range of networks and devices employed by different communities. Additionally, it allows content to be adapted to the terminal capability.
II. Multimedia Content Delivery and Rights Management Framework
The basic idea of UMA is to find effective ways for all heterogeneous communities to negotiate with one another. The key issue is how to deliver abundant multimedia content to any user almost anywhere at any time. Rights management is very important for protecting intellectual property when digital content is delivered, because progress in digital technology has made multimedia data easy to reproduce and retransmit. We therefore first introduce the concepts of multimedia content delivery and rights management (MCD&RM) under the UMA framework using the IMPRIMATUR model [6]. Following this model, we try to construct an MCD&RM framework on the Internet.
A. European IMPRIMATUR model
The IMPRIMATUR business model, funded by the European Commission, provides a conceptual framework for developing a prototype Electronic Rights Management System (ERMS) in an e-commerce trading environment. It is designed to embrace a complete ERMS in which digital content, payment mechanisms, and rights management fully comply with the requirements of an electronic trading architecture. For rights management, each basic building block represents a fundamental activity in the trading of intellectual property rights (IPRs) and is necessary for one or more aspects of trading to take place. The basic building blocks of the conceptual business architecture defined within the IMPRIMATUR Business Architecture are summarized below.
The Creator creates the information and the associated IPRs. The Creation Provider makes the creation available for commercial exploitation. The Media Distributor distributes the creation. The Rights Holder holds the IPRs. The IPR Database retains current information on IPR ownership and restrictions, so the Purchaser can acquire information from both the Media Distributor and the IPR Database. The Monitoring Service Provider checks whether information is used legally or illegally. The Certification Authority authenticates users (media distributors and purchasers). The Unique Number Issuer provides a unique number for a creation, or a mechanism for uniquely identifying Digital Items. The Bank facilitates payments. These basic building blocks, their relationships, and their transactions can be expressed in many ways; the diagram in Figure 1 is just one example.
B. The proposed architecture
To achieve UMA, we propose a video server that contains the key modules described in MPEG-21 [1]. The model combines tools from MPEG-4 Fine Granularity Scalability (FGS) [2], MPEG-4 Simple Profile [3], MPEG-7 [4], digital watermarking techniques [5], and Internet protocols. The proposed architecture consists of seven modules, as shown in Figure 2(a). Each module is described below.
1) Video Input
The source video is captured by the image acquisition device and saved in a digital form such as the YUV or RGB format.
2) Feature Extractor
It extracts the important features of the video signals. Features such as motion vectors are stored in the MPEG-7 Database for advanced processing.
3) FGS Bitstream Archive
Each video signal is encoded as FGS bitstreams and stored in mass storage. When a user issues a request, the server either sends the bitstreams directly to the terminal or passes them through the Transcoder when format conversion is necessary.
4) MPEG-7 Database:
The video contents are indexed and retrieved with the approaches defined in the MPEG-7 standard. In addition, recorded information such as the motion vectors of each video signal can speed up transcoding when the bitstream format must be converted prior to delivery.
[Figure 1: diagram of the IMPRIMATUR business model. Roles shown: Creator, Unique Number Issuer, Rights Holder, Creation Provider, Media Distributor, Purchaser, IPR Database, Monitoring Service Provider, and Certification Authority. Exchanges shown include creation, creation description, unique number, value, conditional licence, application for licence, assignment of rights, IPR information, identity and authorisation checks, and logs. Legend entries: imprint of Purchaser ID, imprint of Media Distributor ID, imprint of Unique Number, role exchanges value with Bank.]
5) Real-time Transcoder:
The transcoding depends on the channel conditions, terminal capabilities, and video content features. After assessing the transmission conditions, the server asks the transcoder to convert the archived FGS bitstreams adaptively into other bitstreams with the specified formats, sizes, and qualities.
6) Channel Monitor:
It accepts feedback information from the terminals and estimates the characteristics of the channels, namely the round-trip time, packet loss ratio, bit error rate, and bandwidth. All obtainable information is passed to the server so that it can adapt the content delivery.
7) Terminals:
Prior to receiving the bitstream, the terminal exchanges its capabilities with the server. The terminal then receives and reconstructs the requested video signals.
Thus, in our framework the source video is encoded and archived as FGS bitstreams, which can provide various levels of QoS in the manner of SNR scalable video coding schemes [2]. With the FGS bitstreams saved in the FGS Bitstream Archive module and the retrieval features of each bitstream stored in the MPEG-7 Database, the proposed MCD&RM system can serve heterogeneous terminals over the Internet. Moreover, the Channel Monitor can adapt the resources delivered to each terminal according to the capabilities of the Internet connection and the terminal device.
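The interplay of the Channel Monitor, the terminal capabilities, and the Transcoder can be illustrated with a minimal sketch. It is not the decision rule used in our server: the class names, field names, the 0.9 safety margin, and the three delivery modes are assumptions made for illustration. The only property it relies on is that an FGS enhancement layer can be truncated at any point, so a terminal that decodes FGS needs only the base layer to fit the channel.

```python
from dataclasses import dataclass


@dataclass
class ChannelReport:
    """Measurements from a channel monitor (hypothetical field names)."""
    bandwidth_kbps: float
    packet_loss_ratio: float  # between 0.0 and 1.0


@dataclass
class TerminalCapability:
    """Capabilities announced by a terminal before streaming (hypothetical)."""
    decodes_fgs: bool
    max_bitrate_kbps: float


def plan_delivery(channel, terminal, base_layer_kbps, safety_margin=0.9):
    """Return (mode, target_kbps) for one request.

    mode is "fgs" when the archived bitstream can be sent directly
    (truncated to target_kbps), "transcode" when the terminal needs
    another format, and "reject" when an FGS terminal cannot even
    receive the base layer.
    """
    # Discount lost packets and keep some headroom; the margin is an assumption.
    usable = channel.bandwidth_kbps * (1.0 - channel.packet_loss_ratio) * safety_margin
    target = min(usable, terminal.max_bitrate_kbps)
    if not terminal.decodes_fgs:
        return ("transcode", target)
    if target < base_layer_kbps:
        return ("reject", target)
    return ("fgs", target)
```

For example, a terminal that only decodes MPEG-4 Simple Profile at up to 384 kbits/sec on a clean 512 kbits/sec channel would be served by the transcoder at 384 kbits/sec.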
C. Digital Watermarking on IPMP
Digital watermarking secretly and imperceptibly embeds a signal with certain characteristics into the media of interest. By applying a watermark, a person marks the host signal as his or her intellectual property. When data ownership is disputed, the product owner is the only one who can detect the watermark and thus prove ownership of the data. To be effective and practical, a digital watermark should be imperceptible, secure, and robust. These properties minimize the chance that the watermark is detected or removed, even by those who know the principle of the embedding algorithm, or lost when the data undergo common manipulations during unauthorized distribution, such as lossy compression, filtering, cropping, and resampling [4].
One example of digital watermarking embeds the data into the least significant bit (LSB) of the motion vector fields of each video bitstream. Hiding a bit in a motion vector alters it by 1 pixel at most. Based on this approach, we embed the watermark into blocks that are selected in a predefined and secure way. The embedded data can be extracted from the specified motion vector fields when authenticating video sequences obtained from the Internet. If the extracted watermark is, with high probability, equal to the one embedded earlier, we can use it to demonstrate ownership of the sequences.
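The LSB approach can be sketched as follows. This is a simplified illustration rather than our embedding code: motion vectors are modeled as `(dx, dy)` integer pairs, the bit is written into the horizontal component only, and the "predefined and secure" block selection is stood in for by Python's `random.Random` seeded with a secret key, which is not cryptographically secure. All function names are hypothetical.

```python
import random


def select_blocks(num_blocks, num_bits, key):
    """Choose the blocks that carry watermark bits, reproducibly from a key.

    random.Random is a stand-in for a secure keyed selection (assumption).
    """
    rng = random.Random(key)
    return rng.sample(range(num_blocks), num_bits)


def embed(motion_vectors, bits, key):
    """Return a copy of motion_vectors with each bit written into the LSB
    of the horizontal component of a selected block."""
    marked = list(motion_vectors)
    for index, bit in zip(select_blocks(len(marked), len(bits), key), bits):
        dx, dy = marked[index]
        marked[index] = ((dx & ~1) | bit, dy)  # dx changes by at most 1
    return marked


def extract(motion_vectors, num_bits, key):
    """Read the LSBs back from the same key-selected blocks."""
    selected = select_blocks(len(motion_vectors), num_bits, key)
    return [motion_vectors[i][0] & 1 for i in selected]


def match_ratio(extracted, expected):
    """Fraction of extracted bits that equal the expected watermark bits."""
    return sum(a == b for a, b in zip(extracted, expected)) / len(expected)
```

Ownership would be claimed when `match_ratio` for a suspect sequence is close to 1; the threshold is a design choice that this sketch leaves open.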
Hiding data in the motion vector fields is compatible with source encoders and adds no transmission overhead, but its security and robustness are very weak, because re-encoding the watermarked bitstream replaces the motion vectors and removes the watermark. Thus, we are looking for an improved watermarking technique for digital video.
[Figure 2 diagrams. Blocks shown in (a) Proposed architecture: Video Inputs, Feature Extractor, MPEG-7 Database, FGS Bitstream Archives, Real-time Transcoder, Channel Monitor, and Terminals, connected through the Internet. Blocks shown in (b) Simulated model: Video signal, FGS Encoder, FGS Archive, MPEG-7 Database, Feature Extractor, and Transcoder, connected through the Internet. The diagrams also include a Buffer block and a legend distinguishing data paths from control paths.]
Figure 2. Illustration of the proposed multimedia content delivery and rights management architecture, which adopts the tools defined in various existing MPEG standards and Internet protocols. Figure (b) is the simplified model of the system in Figure (a).
III. System Simulations
To demonstrate the performance of the proposed MCD&RM architecture in Figure 2(a), we build a simulated end-to-end model, shown in Figure 2(b). In the simulated model, each sequence is pre-encoded and stored in the FGS Bitstream Archive. The video sequences, Foreman, Akiyo, and News, are in CIF resolution and YUV format. Moreover, we assume that the terminals demand MPEG-4 Simple Profile bitstreams, so the transcoder must convert the FGS bitstream into an MPEG-4 bitstream [3] before sending the video signals to the terminals.
Figure 3 shows the coding efficiency of the simulated model when different sources of motion vectors are used in transcoding. At a given bitrate, the experimental results show that transcoded MPEG-4 bitstreams that employ the MPEG-7 motion descriptors perform almost the same as those that use motion vectors obtained by Full Search. Thus, with the MPEG-7 motion descriptors, we can reduce the computational cost of the transcoder by about 50% without significant loss in coding efficiency.
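The saving comes from skipping the motion search. A minimal sketch of the two options is given below. The report does not state the block size, search range, or matching criterion, so the sum of absolute differences (SAD), the frame layout (lists of rows of luma samples), and all function names are assumptions. Full search evaluates up to (2r+1)^2 candidate displacements per block for a search range r, whereas a stored vector costs a single lookup.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of the current frame
    and the reference block displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )


def full_search(cur, ref, bx, by, size, search_range):
    """Test every displacement within +/- search_range that stays inside
    the reference frame and return the (dx, dy) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= bx + dx and bx + dx + size <= width
                      and 0 <= by + dy and by + dy + size <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return (best[1], best[2])


def motion_vector(cur, ref, bx, by, size, search_range, stored=None):
    """Reuse a stored vector for the block at (bx, by) when one exists,
    otherwise fall back to full search."""
    if stored is not None and (bx, by) in stored:
        return stored[(bx, by)]
    return full_search(cur, ref, bx, by, size, search_range)
```

Here `stored` plays the role of the motion information recorded in the MPEG-7 Database, keyed by block position.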
Figure 3 also compares the system performance under bandwidth variations. When the original video signals are available, the coding efficiency at every given bitrate is optimal; this case is labeled 'Original+ME' in the figures. When the available bandwidth between the FGS archive and the transcoder decreases, the reconstructed video quality of the MPEG-4 bitstreams is reduced. For fast-motion sequences such as Foreman and Container, the PSNR loss is about 0 to 7 dB, while for a slow-motion sequence such as News the loss is about 2 to 10 dB. Thus, we will continue to work out transmission rules that transfer as much of the FGS bitstream as possible to the transcoder, so that it can serve high-quality MPEG-4 bitstreams.
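For reference, the PSNR values quoted above follow the usual definition for 8-bit video, PSNR = 10 log10(255^2 / MSE), computed on the Y component. The sketch below assumes frames are given as lists of rows of integer luma samples; the function name and this data layout are illustrative and not taken from our simulation software.

```python
import math


def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally sized luma planes.

    Returns math.inf when the planes are identical (zero error).
    """
    squared_error = 0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            squared_error += (o - r) ** 2
            count += 1
    if squared_error == 0:
        return math.inf
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)
```

A sequence-level figure is typically obtained by averaging the per-frame values, which is an assumption about how the curves in Figure 3 would be reproduced.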
In this report, we demonstrate several key modules of the proposed MCD system using a simulated system. Realizing this system on the real Internet requires further study of the remaining issues, for example the handling of network fluctuation, support for transcoding formats, traffic monitoring and flow management, and multicast streaming of video. This work is in progress and will continue over the remaining two years of the program.
References
[1] A. Vetro, “AHG report on describing terminal capabilities for UMA,” MPEG01/m7275, Sydney, Australia, July 2001.
[2] W. Li, “Overview of fine granularity scalability in MPEG-4 video standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 301-317, Mar. 2001.
[3] MPEG Video Group, “Information technology - Coding of audio-visual objects - Part 2: Visual,” International Standard, ISO/IEC JTC1/SC29/WG11 N4350, July 2001.
[4] J. Magalhaes and F. Pereira, “Describing user environments for UMA applications,” MPEG01/m7312, Sydney, Australia, July 2001.
[5] F. Hartung and M. Kutter, “Multimedia watermarking techniques,” Proc. IEEE, vol. 87, no. 7, pp. 1079-1107, July 1999.
[6] “Information technology - Multimedia framework (MPEG-21) - Part 1: Vision, technologies and strategy,” ISO/IEC JTC1/SC29/WG11 N4333, 2001.
[Figure 3 plots: PSNR (dB) versus bit rate (kbits/sec) at 128, 256, 512, 768, and 1024 kbits/sec, with curves Original+ME, FGS(all)+ME, FGS(all)+MVD, FGS(2)+ME, and FGS(2)+MVD. Panels: (a) Foreman_CIF (Y), (b) News_CIF (Y), (c) Container_CIF (Y).]
Figure 3. PSNR versus bitrate of the Y component for each video sequence, where ‘ME’ means the motion vectors derived from Full Search and ‘MVD’ means the motion vectors from MPEG-7 motion descriptors.