Practical issues - A Study on Digital Watermarking for Network Multimedia and Its Applications

In this section, we discuss some practical issues that arise when the proposed methods are applied in real-world situations.

Issue 1: In general, the security of the proposed fingerprinting method is closely related to the number of watermarks embedded into the media content. However, embedding too many watermarks in an image or a video frame often degrades the visual quality, i.e., lowers the PSNR value.

Issue 2: Nowadays, most media content is stored and transmitted in compressed formats. Some lossy compression techniques reduce the total data size by deleting or quantizing high-frequency components. Thus, properly choosing the number and locations of the coefficients used for fingerprint embedding is an important consideration for digital rights protection.

Issue 3: The two JFD methods presented in Sections 5.3 and 5.4 have their strengths and weaknesses in different circumstances. The major strength of the JFD method proposed in Section 5.4 is its multicast capability, which saves transmission bandwidth at the server side. However, this method is highly dependent on the BIBD scheme. When U is not equal to (v² − v)/(k² − k), the multicast capability of the method is not fully utilized. Under these circumstances, we can find v, k and U_ε for the given U such that U = (v² − v)/(k² − k) + U_ε is satisfied, where U_ε is the smallest positive number over all possible pairs of v and k in a (v, k, 1)-BIBD. Then, the method proposed in Section 5.4 can be used to multicast the (v, k, 1)-BIBD based descrambling subkeys to U − U_ε clients, and the method described in Section 5.3 can be used to unicast U_ε descrambling keys to the remaining clients. It is also possible to divide the clients into several groups and multicast descrambling subkeys to each group with a different BIBD.
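The search for v, k and U_ε described in Issue 3 can be sketched as follows. This is a minimal illustration, not the procedure used in this work: the function names `bibd_blocks` and `split_clients` and the candidate block sizes `ks` are assumptions, and the divisibility tests are only the standard necessary conditions for a (v, k, 1)-BIBD, so a returned pair still has to be backed by an actual construction.

```python
from typing import Optional, Tuple


def bibd_blocks(v: int, k: int) -> Optional[int]:
    """Number of blocks b = (v^2 - v)/(k^2 - k) of a (v, k, 1)-BIBD,
    or None when the necessary divisibility conditions fail."""
    if k < 2 or v <= k:
        return None
    if (v - 1) % (k - 1) != 0 or (v * (v - 1)) % (k * (k - 1)) != 0:
        return None
    return v * (v - 1) // (k * (k - 1))


def split_clients(U: int, ks=(3, 4, 5)) -> Optional[Tuple[int, int, int]]:
    """Return (v, k, U_eps) with the smallest U_eps = U - b >= 0.

    b clients are served by multicast (Section 5.4) and the
    remaining U_eps clients by unicast (Section 5.3).
    """
    best = None
    for k in ks:
        v = k + 1
        # b grows with v, so stop as soon as the bound exceeds U.
        while v * (v - 1) // (k * (k - 1)) <= U:
            b = bibd_blocks(v, k)
            if b is not None:
                candidate = (U - b, v, k)
                if best is None or candidate < best:
                    best = candidate
            v += 1
    if best is None:
        return None
    u_eps, v, k = best
    return v, k, u_eps
```

For example, under these assumptions a server with 75 clients would multicast to 70 of them with a (21, 3, 1) design and unicast to the remaining 5.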

Issue 4: When many watermarks are embedded into an image or video frame, they may overlap with each other. That is, some coefficients may be used to embed more than one watermark. The overlap slightly decreases the detection strength of these watermarks. In practice, we can partition the coefficients into v disjoint sets, one for each of the v watermarks, to avoid the overlap.
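One simple way to obtain non-overlapping embedding positions, as suggested in Issue 4, is to shuffle the candidate coefficient indices once and cut the result into v slices. The sketch below is only illustrative; the function name, the seed parameter and the flat index representation of the coefficients are assumptions.

```python
import random


def disjoint_coefficient_sets(num_coeffs, v, length, seed=0):
    """Assign `length` distinct coefficient indices to each of v watermarks
    so that no index is shared between two watermarks."""
    if v * length > num_coeffs:
        raise ValueError("not enough coefficients for disjoint embedding")
    rng = random.Random(seed)          # the seed acts as a shared secret
    indices = list(range(num_coeffs))
    rng.shuffle(indices)
    # Consecutive slices of one shuffled list are disjoint by construction.
    return [indices[i * length:(i + 1) * length] for i in range(v)]
```

The price of this choice is capacity: v watermarks of length L need at least v·L usable coefficients, whereas overlapping embedding has no such limit.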

Figure 5.2: Methods to prevent malicious users from using key collusion attacks.

Issue 5: If we randomly decompose a scrambling key into v subkeys, we can only represent and transmit v − 1 subkeys by random seeds. To tackle this problem, we can decompose a scrambling key into v + 1 subkeys. The first v subkeys are used to generate JFD subkeys as described in Section 5.4.2. As shown in Figure 5.2(b), a k-th root calculation is applied to the (v + 1)-th subkey, and the k-th root value of the (v + 1)-th subkey is then multiplied with the JFD subkeys ϑi to produce new JFD subkeys ^kϑi. Similar to the derivation of Eq. (5.11), the calculation of the descrambling key for client 2 becomes:

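\begin{align}
K^{(2)} &= K_1 \times K_2 \times K_3 \times {}^{k}\vartheta_4 \times {}^{k}\vartheta_5 \times {}^{k}\vartheta_6 \times K_7 \times K_8 \times K_9 \tag{5.15} \\
&= K_1 \times K_2 \times K_3 \times \left(\sqrt[3]{K_{10}} \times \vartheta_4\right) \times \left(\sqrt[3]{K_{10}} \times \vartheta_5\right) \times \left(\sqrt[3]{K_{10}} \times \vartheta_6\right) \times K_7 \times K_8 \times K_9 \notag \\
&= K_1 \times K_2 \times K_3 \times \vartheta_4 \times \vartheta_5 \times \vartheta_6 \times K_7 \times K_8 \times K_9 \times K_{10} \tag{5.16}
\end{align}

Thus, a scrambled video clip can be descrambled and fingerprinted with the method proposed in Section 5.4.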
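The step from Eq. (5.15) to Eq. (5.16) relies on the fact that the k-th root of the extra subkey appears exactly k times in the product. The following numerical check illustrates this for k = 3 and v = 9. It treats every subkey as a positive scalar and uses made-up values, both of which are simplifying assumptions for illustration only.

```python
import math


def check_extra_subkey(k=3):
    """Compare both sides of the derivation (5.15) -> (5.16) numerically."""
    # Hypothetical subkeys K_1..K_10 and JFD subkeys theta_4..theta_6.
    K = {i: 1.0 + 0.1 * i for i in range(1, 11)}
    theta = {4: 1.7, 5: 0.9, 6: 1.2}

    # New JFD subkeys: k-th root of the (v + 1)-th subkey times theta_i.
    root = K[10] ** (1.0 / k)
    k_theta = {i: root * t for i, t in theta.items()}

    plain = math.prod(K[i] for i in (1, 2, 3, 7, 8, 9))
    lhs = plain * math.prod(k_theta.values())        # Eq. (5.15)
    rhs = plain * math.prod(theta.values()) * K[10]  # Eq. (5.16)
    return lhs, rhs
```

Because client 2 holds exactly k = 3 of the modified JFD subkeys, the three cube roots multiply back to the full subkey K10, which is why the client never needs to receive K10 directly.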

Issue 6: There are systematic methods for constructing infinite families of BIBDs. For example, (v, 3, 1) systems (also known as Steiner triple systems) are known to exist if and only if v ≡ 1 or 3 (mod 6) [35]. Techniques for constructing several kinds of BIBDs can be found in [37].
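The existence condition for Steiner triple systems quoted in Issue 6, and the defining property of a (v, k, 1)-BIBD (every pair of points lies in exactly one block), can be checked mechanically. The sketch below uses the (7, 3, 1) design generated by the difference set {0, 1, 3} modulo 7 as a small example; the function names are assumptions introduced for illustration.

```python
from itertools import combinations


def steiner_triple_system_exists(v):
    """Existence condition for a (v, 3, 1)-BIBD: v = 1 or 3 (mod 6)."""
    return v % 6 in (1, 3)


def is_bibd(points, blocks, k, lam=1):
    """True if every block has k distinct points and every pair of
    points occurs in exactly `lam` blocks."""
    if any(len(set(block)) != k for block in blocks):
        return False
    counts = {pair: 0 for pair in combinations(sorted(points), 2)}
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            if pair not in counts:
                return False
            counts[pair] += 1
    return all(c == lam for c in counts.values())


# (7, 3, 1)-BIBD: translate the difference set {0, 1, 3} modulo 7.
FANO_BLOCKS = [tuple(sorted(((i + d) % 7) for d in (0, 1, 3)))
               for i in range(7)]
```

Calling `is_bibd(range(7), FANO_BLOCKS, 3)` confirms the pair-coverage property for this small design; larger designs such as the (21, 3, 1) case used in Section 5.6 can be verified in the same way once their blocks have been constructed.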

5.6 Experimental results

In this section, two types of experiments are conducted to demonstrate the performance of the proposed fingerprinting methods. In these experiments, a few watermarks are embedded in a video frame as a user's fingerprint. In the first type of experiment, we evaluate how the number of embedded watermarks affects the visual quality of fingerprinted images. Usually, more embedded watermarks provide better protection for a fingerprinted image. When several watermarks are embedded at randomly selected coefficients of an image, a few coefficients may be selected repeatedly to embed different watermarks. Thus, the second type of experiment evaluates the robustness of a fingerprint containing multiple watermarks. The watermark embedding method used in the experiments in this section is modified from the method proposed by Cox [33] to fit the 8×8 DCT transformed video frames, which form the basis of MPEG-1 and MPEG-2 frames. A JFD subkey ϑi is first calculated from an independent Gaussian watermark βi according to Eq. (5.4) or Eq. (5.10); the watermark is then embedded at a randomly selected location in the middle-frequency coefficient area of the 8×8 DCT blocks after the descrambling process.

To detect a watermark, the 8×8 DCT coefficients in which the original watermark was embedded are extracted, and Eq. (5.17) is used to recover the watermark Ŵ. Then, Ŵ and βi are compared for similarity according to Eq. (5.18). When the computed similarity value is larger than a predetermined threshold Tc, the recovered watermark Ŵ is considered to be a valid watermark βi.

W =ˆ 1
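The detection procedure described above can be illustrated with a short sketch. Since Eqs. (5.17) and (5.18) are not reproduced here, the code assumes the multiplicative embedding rule of Cox-style spread-spectrum watermarking, c' = c(1 + αw), and a normalized correlation (cosine similarity) as the similarity measure. Both choices, as well as all function names and test values, are assumptions rather than the exact formulas of this work.

```python
import math
import random


def embed(coeffs, watermark, alpha):
    """Assumed multiplicative embedding: c' = c * (1 + alpha * w)."""
    return [c * (1.0 + alpha * w) for c, w in zip(coeffs, watermark)]


def extract(marked, original, alpha):
    """Invert the assumed embedding rule; original coefficients must be nonzero."""
    return [(m / c - 1.0) / alpha for m, c in zip(marked, original)]


def similarity(a, b):
    """Normalized correlation between two watermark sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def detect(marked, original, candidate, alpha, threshold=0.7):
    """Declare the candidate watermark present if similarity exceeds Tc."""
    return similarity(extract(marked, original, alpha), candidate) > threshold


# Hypothetical data: L = 1000 nonzero coefficients and Gaussian watermarks.
rng = random.Random(1)
coeffs = [rng.choice((-1, 1)) * rng.uniform(10.0, 100.0) for _ in range(1000)]
beta = [rng.gauss(0.0, 1.0) for _ in range(1000)]
other = [rng.gauss(0.0, 1.0) for _ in range(1000)]
marked = embed(coeffs, beta, alpha=0.3)
```

With these definitions, `detect(marked, coeffs, beta, 0.3)` accepts the embedded watermark while `detect(marked, coeffs, other, 0.3)` rejects an unrelated one, because independent Gaussian sequences of this length are nearly uncorrelated.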

In the following experiments, images were extracted from the I-frames of the Table Tennis video sequence, obtained from http://media.xiph.org/video/derf/. The strength value α of a watermark is limited to between 0.1 and 0.5, the length of a watermark is set to 1000 real values (i.e., L = 1000), and the predetermined threshold Tc for fingerprint detection is 0.7.

Figure 5.3: Relationship between image visual quality (PSNR) and the number of embedded watermarks, plotted for α = 0.1, 0.2, 0.3, 0.4 and 0.5 (horizontal axis: number of watermarks; vertical axis: PSNR).

In the first type of experiment, the images used for fingerprint embedding are extracted from the I-frames of the Table Tennis video sequence. Each extracted 352 × 288 frame is first transformed into the 8×8 DCT domain. Then, various numbers (from 1 to 20) of watermarks are embedded into the DCT coefficients of the transformed video frames according to Eq. (5.11). The PSNR values of the watermarked images are depicted in Figure 5.3. As stated in [38], the visual quality of an image is acceptable when its PSNR is larger than 30 dB. For every tested watermark strength α (i.e., 0.1 ≤ α ≤ 0.5), the PSNR is higher than 30 dB, which is a generally accepted level of visual quality. According to the proposed fingerprint embedding method, when 19 watermarks are embedded into an image simultaneously at the client side, the (v, k, 1)-BIBD based method allows 70 clients to receive a multicast video clip with a (21, 3, 1)-BIBD. This number of clients seems to be satisfactory for most real-world needs.
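For reference, the PSNR values discussed above follow the usual definition PSNR = 10·log10(peak² / MSE). The sketch below assumes 8-bit samples (peak value 255) given as flat sequences; the function name and this data layout are illustrative assumptions, not details taken from the experiments.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a flat sequence of pixel values."""
    if len(original) != len(distorted) or not original:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Under this definition, the 30 dB acceptance level corresponds to a mean squared error of about 65 for 8-bit images, so stronger or more numerous watermarks are acceptable only as long as the accumulated distortion stays below that level.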

In the second type of experiment, each frame extracted from the Table Tennis video sequence is embedded with various numbers (from 1 to 30) of watermarks with strength α from 0.1 to 0.5. Overlapping among the embedded watermarks is allowed in order to test the robustness of the proposed fingerprinting method. Table 5.1 shows the similarity values between the extracted fingerprints and the original fingerprints for various numbers of embedded watermarks and various strengths α.

As shown in Table 5.1, although watermarks may overlap with each other, the number of corrupted fingerprints is very small. Furthermore, when the number of embedded watermarks increases to 20, the average similarity of the watermarks decreases only slightly, to 0.65. Thus, we claim that the proposed fingerprint embedding method is both practical and robust for real-world applications.

Table 5.1: Average similarity values of four testing images.

5.7 Summary

In this chapter, we propose a new video scrambling and fingerprinting approach for digital media rights protection. The proposed method contains two parts: (1) video scrambling at the server side, and (2) fingerprint embedding at the client side. First, a content server scrambles and multicasts video content to clients. Then, by applying a (v, k, 1)-BIBD scheme, the server partitions the scrambling key into O(√U) descrambling subkeys and multicasts them to U clients. On receiving a designated secret key from the content server, each client combines descrambling subkeys into a descrambling key with a fingerprint embedded. By using the descrambling key, a scrambled video becomes a fingerprinted video designated to the user. In general, embedding fingerprints adds some noise to the video content. According to the experimental results, when the fingerprint data is less than 22% of the data size of an image or a video frame, the PSNR can be 35 or higher.

Chapter 6 Application: Web-Based Multimedia News Archive

6.1 Introduction

Due to the recent advances of web technology in multimedia, creating a web-based multimedia news system has become possible. Such a web-based service can provide a well-organized daily news list, a convenient searching mechanism, and rich multimedia content. Most importantly, such a system can generate its content fully automatically. Because multimedia data can be duplicated and distributed so easily, a content server needs effective copyright protection tools. Thus, as an application of the proposed watermarking and fingerprinting approaches, this chapter proposes a fully automated web-based TV-news system, a systematic methodology that can automatically generate semantic labels from news video, and statistical methods to discover hidden information. We expect the system to be significant for the following reasons:

• Although web news provides another efficient way to access news, watching TV news has already become a habit for many people. In addition, most web-news systems can only provide text-based news.

• So many channels provide TV news that people need more information to find the channels that match their interests.

• Although almost every channel claims to be impartial, real impartiality is hard to achieve with human editing. Some form of evaluation is needed to check whether a channel is really impartial.

Moreover, this TV news archive is used as a practical application of the proposed progressive image watermarking and video fingerprinting.

The rest of this chapter is organized as follows. First, some related works are discussed in Section 6.2. Then, an overview of the proposed TV-news archive is presented in Section 6.3. In Section 6.4, methods for generating the necessary semantic labels from the recorded TV news video are presented. Section 6.5 focuses on information mining from these semantic labels. Section 6.6 introduces the overall concepts of the multimedia TV news archive. Finally, a summary and concluding remarks are given in Section 6.7.

6.2 Related works

Among the major sources of news, TV has clearly had the dominant influence at least since the 1960s. Yet while it is easy to find old newspapers on microfilm in any public library, it is impossible to find old television news footage in the same library.

A TV news archive has existed in the United States for 35 years. Paul C. Simpson founded the Vanderbilt University Television News Archive in 1968. In [39], a team at the University of Missouri-Columbia decided to do a content analysis of the coverage of the 1989 Tiananmen Massacre by the three US networks, and located the relevant news items in the Vanderbilt Archive Index, as shown in Fig. 6.1. The Vanderbilt archive promptly provided 11 hours of video clips, all related to the Tiananmen Massacre. At the same time, the Missouri team also planned a comparable study of Taiwanese reportage on the Tiananmen Massacre, but no equivalent of the Vanderbilt archive existed in Taiwan at that time. Therefore, that study only contained the US perspective on the Tiananmen Massacre.

Figure 6.1: A sample of evening news index from the Vanderbilt TV news archive.

In this chapter, we propose an integrated methodology for the construction of, and information mining on, a multimedia TV news archive in Taiwan. As described in [40, 41], a fully automated web-based TV-news system was implemented to achieve the following goals:

1. Academic and applied aspects: This archive will greatly improve the quality of TV news. Dan Rather, the CBS anchorman, once mentioned that he lives with two burdens: the ratings and the Vanderbilt Television News Archive. Once the archive exists, researchers and the public will be able to perform content analysis on TV news, and journalists will be more careful about what they report.

2. Timing factor: The Vanderbilt archive started its project with Betacam videotapes in 1968. Preservation is a problem because these tapes deteriorate over the years. Today, we can save all the TV news on hard disks, VCDs or DVDs.

Informedia [42] is an integrated project launched at Carnegie Mellon University. Its overall goal is to use modern AI techniques to archive video and film media. VACE-II [43], a sub-project of Informedia, automatically detects, extracts, and edits people, patterns, story evolution, and trends of high interest in the visual content of news video.

Figure 6.2: Flow chart of automatic news content generation.
