
Chapter 2 Related Work

2.4 Summary

In this chapter, we introduced the advantages and disadvantages of several well-known frameworks and algorithms. Based on these characteristics, we design and modify our proposed software watermark framework and algorithms.

Chapter 3

The Proposed Software Watermarking Framework

The previous chapter explained the problems of recognition failure, low bit rate and coloring in the QP and QPS algorithms. In this chapter, we propose static software watermark algorithms that leverage concepts from the CT dynamic algorithm. The proposed static watermark algorithms aim to improve on the following characteristics:

• Easier construction and extraction

• Higher bit rates

• Better robustness against common attacks

• Support for more programming languages

We propose a flexible framework that can adopt not only the proposed graph encoding algorithms but also other graph algorithms. To make watermark construction and extraction easy, we modify the heavy random walk procedure of the Graph Theoretic Approach. The framework can also be implemented easily in more kinds of programming languages.

3.1 Framework

The framework is divided into two phases, embedding and recognition, which are shown in Figure 3.1 and Figure 3.3 respectively.

In the embedding phase, the process proceeds through three steps:

Transformation step: Program P is parsed into a graph G, which is composed of vertices and edges.

Each vertex of the graph represents a basic block consisting of instructions. Each edge is a directed edge that represents a function call between two basic blocks. The vertices are ordered by the depth-first search (DFS) algorithm. Message M can be a statement or a special integer in binary format. For the binary conversion, a hash function or another transform function can also be adopted to achieve higher security.

To increase the complexity and improve the privacy, graph G is segmented into n subgraphs $\{g_1, g_2, \ldots, g_n\}$, where $g_i$ is one of the subgraphs of G. This is written $G = \{g_i \mid 1 \le i \le n\}$, where n is the number of subgraphs. Message M is also segmented into n fragmental messages $\{m_1, m_2, \ldots, m_n\}$ using the random seed ω, where $m_i$ is one of the fragmental messages of M. This is written $M = \{m_i \mid 1 \le i \le n\}$, where n is the number of subgraphs. Each subgraph is constructed according to the vertices selecting rule, so that its corresponding fragmental message can be embedded into it successfully.

Figure 3.1 Embedding Phase of Proposed Framework

The vertices selecting rule is defined as follows:

1. Select the vertex in the higher level of the sub-graph as the first-selected vertex.

2. Select the vertices which aren’t the children of the first vertex of the sub-graph.

Each subgraph should have these characteristics:

1. The size of the subgraph is decided by the corresponding fragmental message.

2. A sub-graph should contain at least three vertices.

3. If a sub-graph is constructed from k vertices, the fragmental message should be k-2 bits.

Embed step: Once the fragmental messages and subgraphs are generated, the next move is to select the graph encoding algorithm. There are three kinds of graph encoding algorithms: link encoding (LE), color encoding (CE) and link with color encoding (LCE). If LE or LCE is selected, path analysis is performed. During path analysis, as in Figure 3.2, the subgraph is analyzed with a tree diagram to find the path that offers more levels for embedding. The first vertex is selected according to the result of path analysis. Starting from this first vertex, the fragmental message is embedded into its corresponding subgraph by adding directed edges between pairs of vertices.

Figure 3.2 Example of Path Analysis

Merge step: The last step merges the original graph G and each subgraph into a new graph. A directed edge is added between two vertices in the original graph if the same vertices in a subgraph have been embedded with a bit of the fragmental message. The colors of the vertices in the original graph are changed to match the same vertices in the subgraphs. Finally, a new graph G’ that contains the embedded message is generated.

Figure 3.3 Recognition Phase of Proposed Framework

In the recognition phase, there are two steps:

Recognition step: When the watermarked graph G’ is received, the information about its subgraphs, its vertices and their corresponding colors, and its edges is also known. The graph encoding algorithm that was used to embed the message is also analyzed and identified. In light of this information, graph G’ is segmented into n subgraphs $\{g'_1, g'_2, \ldots, g'_n\}$, where $g'_i$ is one of the subgraphs of the watermarked graph G’. This is written $G' = \{g'_i \mid 1 \le i \le n\}$, where n is the number of subgraphs.

According to the recognition algorithm of that graph encoding, the fragmental messages $\{m'_1, m'_2, \ldots, m'_n\}$ are extracted from the subgraphs, where $m'_i$ is one fragment of message M’. This is written $M' = \{m'_i \mid 1 \le i \le n\}$, where n is the number of subgraphs.

Message processing step: The fragmental messages are combined into a new message M’. The new message M’ does not have to be equal to the original message M, as long as the correct hidden information can be verified from M’.

Figure 3.4 is an example of proposed framework in embedding phase:

1. Input is m = 1011 and program P.

2. P is parsed into graph G, and index is arranged by DFS algorithm.

3. m is segmented into two fragmental messages m1 = 10 and m2 = 11. G is also segmented into two subgraphs g1 and g2 with the vertices selecting rule.

4. When the link encoding algorithm is selected, each subgraph is analyzed by path analysis and the pivot vertex is decided. Then, fragmental messages m1 and m2 are embedded into subgraphs g1 and g2.

5. Merge graph G and subgraphs g1 and g2 into new graph G’.

Figure 3.4 Example of Proposed Embedding Phase Framework
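The segmentation in step 3 can be sketched as follows. The text does not specify how the random seed ω selects the fragment boundaries, so the use of a seeded generator to draw cut points is an assumption, and the names `segment_message` and `required_vertices` are hypothetical. The subgraph size follows the stated rule that k vertices carry k-2 bits.

```python
import random


def segment_message(bits, n, seed):
    """Split a bit string into n non-empty fragments at seed-determined cut points."""
    if not 1 <= n <= len(bits):
        raise ValueError("need 1 <= n <= len(bits)")
    rng = random.Random(seed)  # assumption: the seed plays the role of omega
    cuts = sorted(rng.sample(range(1, len(bits)), n - 1))
    bounds = [0] + cuts + [len(bits)]
    return [bits[a:b] for a, b in zip(bounds, bounds[1:])]


def required_vertices(fragment):
    """Subgraph size implied by the rule 'k vertices carry k - 2 bits'."""
    return len(fragment) + 2


fragments = segment_message("1011", n=2, seed=7)
sizes = [required_vertices(f) for f in fragments]
print(fragments, sizes)
```

The same seed always reproduces the same fragments, which is what the recognizer would need in order to reassemble M’. The cut points chosen here need not match the even split m1 = 10, m2 = 11 of Figure 3.4.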

3.2 Proposed Graph Encoding Algorithms

In this section, we introduce the proposed graph encoding algorithms.

First, we give the following definitions:

A fragmental message $m_i$ is represented as $\{b_{i1}, b_{i2}, \ldots, b_{i\beta}\}$, that is, $m_i = \{b_{ij} \mid 1 \le j \le \beta\}$, where β is the number of bits in the fragmental message. A subgraph Gi (Vi, Ei) contains the vertices set Vi and the edges set Ei. For a vertex va, the vertices pair {va1, va2} consists of the two nearest vertices va1 and va2 that are not connected to va; that is, the edges (va, va1) and (va, va2) are not in the edges set Ei.

3.2.1 Link Encoding (LE)

This algorithm improves robustness by linking the embedded edges in the manner of a linked list. The pivot vertex is the vertex selected according to path analysis [24] [22], so that the subgraph has a higher capacity for embedding the message.

Figure 3.5 Example of Link Encoding in Embedding Phase

Input: watermark m=b1b2…bn and a graph G(V,E)

Figure 3.6 Pseudo Code of LE Embedding Algorithm

Embedding phase: va is the selected pivot vertex. For the first bit bi1, the vertices pair {va1, va2}, the nearest vertices pair not connected to va, is found. If bi1 = 0, the directed edge (va, va1) is added. Otherwise, bi1 = 1 and the directed edge (va, va2) is added. Then, the vertex va is treated as an invisible vertex and its connected vertex (va1 or va2) is treated as the new pivot vertex, which will be embedded with the next bit bi2. The step is repeated until all bits of mi = bi1 bi2…bin are embedded into subgraph Gi (Vi, Ei) and a new graph G’i (V’i, E’i) is generated. One last vertex is left unused during the embedding step, and the information of the pivot vertex and this remaining vertex is useful for robustness. Figure 3.5 is an example of link encoding in the embedding phase. For the given subgraph in Figure 3.5 (a), mi = 0011 and pivot vertex v1 is selected. The first directed edge (v1, v2) is added because bi1 = 0 and {v2, v3} is the nearest vertices pair not connected to v1. Then v2 is treated as the new pivot vertex and v1 as an invisible vertex when bi2 = 0 is ready for embedding. The step is repeated as in Figure 3.5 (b) until mi = bi1 bi2 bi3 bi4 is embedded. The pseudo code for the embedding phase of the LE algorithm is given in Figure 3.6.
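The embedding walk described above can be sketched in a few lines. This is an illustration under stated assumptions, not the pseudo code of Figure 3.6: vertices are identified by their integer index, "nearest" is taken to mean the smallest index distance from the pivot with ties going to the higher index, and the names `nearest_pair` and `le_embed` are hypothetical.

```python
def nearest_pair(pivot, vertices, edges, invisible):
    """Two visible vertices closest to the pivot (by index) that share no edge with it."""
    candidates = [
        v for v in vertices
        if v != pivot and v not in invisible
        and (pivot, v) not in edges and (v, pivot) not in edges
    ]
    # Assumption: "nearest" = smallest index distance; ties go to the higher index.
    candidates.sort(key=lambda v: (abs(v - pivot), -v))
    return candidates[:2]


def le_embed(vertices, edges, pivot, bits):
    """Add one directed edge per bit; return the added edges and the unused vertices."""
    original = set(edges)
    added, invisible = [], set()
    for bit in bits:
        pair = nearest_pair(pivot, vertices, original, invisible)
        if len(pair) < 2:
            raise ValueError(f"no vertex pair available at v{pivot}")
        target = pair[0] if bit == "0" else pair[1]  # 0 -> va1, 1 -> va2
        added.append((pivot, target))
        invisible.add(pivot)  # the old pivot is hidden from later steps
        pivot = target        # the chosen vertex becomes the new pivot
    remaining = [v for v in vertices if v not in invisible and v != pivot]
    return added, remaining


# Subgraph of Section 3.3: V = {v1, v3, v5, v6}, E = {(v5, v6)}, message 01.
added, remaining = le_embed([1, 3, 5, 6], {(5, 6)}, pivot=1, bits="01")
print(added, remaining)  # [(1, 3), (3, 6)] [5]
```

On the subgraph of Section 3.3 the sketch adds the edges (v1, v3) and (v3, v6), the same edges reported there, and leaves v5 as the remaining vertex.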

Input: watermarked graph G’(V’,E’), pivot vertex vp, remaining vertex vr

Output: watermark m=b1b2…bn

Figure 3.7 Pseudo Code of LE Recognition Algorithm

Recognition phase: How can we extract the message from the graph? Given the graph G’i (V’i, E’i) and the pivot vertex va, count the vertices not connected to va that lie between va and its connected vertex. If the count is zero, bi1 is 0, and if the count is one, bi1 is 1. Then va is treated as an invisible vertex, the vertex connected to va becomes the new pivot vertex, and the next bit is extracted. The step is repeated until only one vertex is left in the subgraph, and this vertex is compared with the remaining vertex. The message is confirmed to be correct if the two are the same. The pseudo code for the recognition phase of the LE algorithm is given in Figure 3.7.
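A matching recognition walk might look like the sketch below. It is not the pseudo code of Figure 3.7. It assumes that the recognizer knows the original edges of the subgraph, so that the watermark edges can be isolated as the difference between the two edge sets, and it uses the same hypothetical `nearest_pair` ordering (index distance, ties to the higher index) on both sides. The bit is the position of the connected vertex within the nearest pair, which equals the count of skipped unconnected vertices.

```python
def nearest_pair(pivot, vertices, edges, invisible):
    """Two visible vertices closest to the pivot (by index) that share no edge with it."""
    candidates = [
        v for v in vertices
        if v != pivot and v not in invisible
        and (pivot, v) not in edges and (v, pivot) not in edges
    ]
    # Assumption: "nearest" = smallest index distance; ties go to the higher index.
    candidates.sort(key=lambda v: (abs(v - pivot), -v))
    return candidates[:2]


def le_recognize(vertices, original_edges, watermarked_edges, pivot, remaining):
    """Return (bit string, verified) by walking the watermark edges from the pivot."""
    original = set(original_edges)
    added = set(watermarked_edges) - original  # assumption: original edges are known
    invisible, bits = set(), []
    while True:
        targets = [v for (u, v) in added if u == pivot]
        if not targets:
            break  # end of the embedding path
        if len(targets) > 1:
            raise ValueError(f"ambiguous watermark edges at v{pivot}")
        pair = nearest_pair(pivot, vertices, original, invisible)
        if targets[0] not in pair:
            raise ValueError(f"edge from v{pivot} does not fit the LE rule")
        bits.append(str(pair.index(targets[0])))  # 0 skipped -> bit 0, 1 skipped -> bit 1
        invisible.add(pivot)
        pivot = targets[0]
    leftover = [v for v in vertices if v not in invisible and v != pivot]
    return "".join(bits), leftover == [remaining]


message, verified = le_recognize(
    [1, 3, 5, 6], {(5, 6)}, {(5, 6), (1, 3), (3, 6)}, pivot=1, remaining=5
)
print(message, verified)  # 01 True
```

The final comparison of the leftover vertex with the known remaining vertex is the check the text describes as confirming that the message is correct.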

3.2.2 Color Encoding (CE)

Color encoding uses vertex colors to increase the bit rate. Initially, all the vertices in a subgraph have the same color. The color rule is defined as follows:

va and its connected vertex vb are same color, a 00 message is embedded.

Figure 3.8 Pseudo Code of CE Embedding Algorithm

Embedding phase: For mi = bi1 bi2…bin, the message is embedded in units of two bits in this method. Find the vertices pair {va1, va2} that is the nearest pair not connected to va, then apply the following rule:

bij bi(j+1) = 00: edge (va, va2) is added; va and va2 have the same color.

bij bi(j+1) = 01: edge (va, va2) is added; va and va2 have different colors.

bij bi(j+1) = 11: edge (va, va1) is added; va and va1 have the same color.

bij bi(j+1) = 10: edge (va, va1) is added; va and va1 have different colors.

A new graph G’i (V’i, E’i) is generated, which contains the new edges set E’i and the new vertices set V’i with its color information. The pseudo code for the embedding phase of the CE algorithm is given in Figure 3.8.

Figure 3.9 is a simple example of color encoding in the embedding phase. First, mi = 1101 is segmented into 11 and 01. v1 is selected as va, and {v2, v3} is the nearest vertices pair not connected to v1. For the first two bits 11, the directed edge (v1, v2) is added, and v1 and v2 keep the same color. Next, v2 is selected as va, and {v3, v4} is the nearest vertices pair not connected to v2. For the next two bits 01, the directed edge (v2, v4) is added, and v2 and v4 are given different colors.

Figure 3.9 Example of Color Encoding in Embedding Phase
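The two-bit rule can be sketched as a lookup table driving the same kind of walk. This is an illustration rather than the pseudo code of Figure 3.8. It assumes two colors (0 and 1), integer vertex indices, a four-vertex subgraph without prior edges for the example, and "nearest" meaning smallest index distance with ties to the higher index; `RULES`, `nearest_pair` and `ce_embed` are hypothetical names.

```python
def nearest_pair(pivot, vertices, edges, invisible):
    """Two visible vertices closest to the pivot (by index) that share no edge with it."""
    candidates = [
        v for v in vertices
        if v != pivot and v not in invisible
        and (pivot, v) not in edges and (v, pivot) not in edges
    ]
    # Assumption: "nearest" = smallest index distance; ties go to the higher index.
    candidates.sort(key=lambda v: (abs(v - pivot), -v))
    return candidates[:2]


# Rule table from the text: symbol -> (position in the pair, same color?)
# Position 0 is va1 and position 1 is va2.
RULES = {"00": (1, True), "01": (1, False), "11": (0, True), "10": (0, False)}


def ce_embed(vertices, edges, pivot, bits):
    """Embed two bits per step: one through the chosen edge, one through the color."""
    if len(bits) % 2:
        raise ValueError("bit string must have even length")
    original = set(edges)
    color = {v: 0 for v in vertices}  # all vertices start with the same color
    added, invisible = [], set()
    for i in range(0, len(bits), 2):
        pair = nearest_pair(pivot, vertices, original, invisible)
        if len(pair) < 2:
            raise ValueError(f"no vertex pair available at v{pivot}")
        which, same = RULES[bits[i:i + 2]]
        target = pair[which]
        color[target] = color[pivot] if same else 1 - color[pivot]
        added.append((pivot, target))
        invisible.add(pivot)
        pivot = target
    return added, color


added, color = ce_embed([1, 2, 3, 4], set(), pivot=1, bits="1101")
print(added, color)  # [(1, 2), (2, 4)] {1: 0, 2: 0, 3: 0, 4: 1}
```

For mi = 1101 the sketch adds (v1, v2) with equal colors and then (v2, v4) with different colors, matching the example of Figure 3.9 as described in the text.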

Recognition phase: The graph G’i (V’i, E’i) and the vertices set with its color information are given. Count the vertices not connected to va that lie between va and its connected vertex vb. The embedded bits are extracted according to the following rule:

The number is 0, va and vb are the same color: mi = 11

The number is 0, va and vb are different colors: mi = 10

The number is 1, va and vb are the same color: mi = 00

The number is 1, va and vb are different colors: mi = 01

The pseudo code for the recognition phase in CE algorithm is described in Figure 3.10.

Input: a graph G(V,E), an embedding vertex set Vx and its color set C

Output: m=b1b2…bn

Figure 3.10: Pseudo Code of CE Recognition Algorithm
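As a quick consistency check, the recognition rule can be written as a table and compared with the embedding rule. The table names `ENCODE` and `DECODE` are hypothetical; the entries are copied from the two rule lists in this section, with position 0 standing for va1 (no unconnected vertex skipped) and position 1 for va2 (one skipped).

```python
# Embedding rule: symbol -> (position of the chosen vertex in the pair, same color?)
ENCODE = {"00": (1, True), "01": (1, False), "11": (0, True), "10": (0, False)}

# Recognition rule: (number of unconnected vertices skipped, same color?) -> symbol
DECODE = {(0, True): "11", (0, False): "10", (1, True): "00", (1, False): "01"}

# Every symbol survives a round trip through both tables.
round_trip = {symbol: DECODE[ENCODE[symbol]] for symbol in ENCODE}
print(round_trip)
```

Read this way, the first bit selects the vertex (1 for va1, 0 for va2), and the two vertices share a color exactly when the two bits are equal.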

3.2.3 Link with Color Encoding (LCE)

The third method, link with color encoding, combines link encoding and color encoding, as its name implies. The LCE method achieves both higher robustness and a higher bit rate.

Following the definitions of link encoding and color encoding, this method is introduced as follows:

Embedding phase: Given message mi = bi1 bi2…bin and a subgraph Gi (Vi, Ei), va is the selected pivot vertex. Find the vertices pair {va1, va2} that is the nearest vertices pair not connected to va. The first two bits of the message, bi1 bi2, are embedded according to the following rule:

bi1 bi2 = 00: edge (va, va2) is added; va and va2 have the same color.

bi1 bi2 = 01: edge (va, va2) is added; va and va2 have different colors.

bi1 bi2 = 11: edge (va, va1) is added; va and va1 have the same color.

bi1 bi2 = 10: edge (va, va1) is added; va and va1 have different colors.

Then, the pivot vertex va is treated as an invisible vertex and its connected vertex is treated as the new pivot vertex, which will be embedded with the next two bits bi3 bi4. The step is repeated until all bits of mi = bi1 bi2…bin are embedded into subgraph Gi (Vi, Ei) and a new graph G’i (V’i, E’i) is generated.

The pseudo code for the embedding phase of the LCE algorithm is given in Figure 3.11. Figure 3.12 is an example of the LCE method in the embedding phase.

Input: G(V,E), m=b1b2…bn, and color set C = {c1,c2,…cn}

} }

return G’(V,E’) and C’;

Figure 3.11: Pseudo Code of LCE Embedding Algorithm

Figure 3.12: Example of Link with Color Encoding in Embedding Phase

Recognition phase: The graph G’i (V’i, E’i) and the vertices set with its color information are given. Count the vertices not connected to va that lie between va and its connected vertex vb. The first embedded bits bi1 bi2 are extracted according to the following rule:

The number is 0, va and vb are the same color: bi1 bi2 = 11

The number is 0, va and vb are different colors: bi1 bi2 = 10

The number is 1, va and vb are the same color: bi1 bi2 = 00

The number is 1, va and vb are different colors: bi1 bi2 = 01

Then va is treated as an invisible vertex, the vertex vb connected to va becomes the new pivot vertex, and the next two bits are extracted. The step is repeated until only one vertex is left in the subgraph, and this vertex is compared with the remaining vertex. The message is confirmed to be correct if the two are the same. The pseudo code for the recognition phase of the LCE algorithm is given in Figure 3.13.

Input: a watermarked graph G’(V,E’), the pivot vertex vp, the remaining vertex vr and the color set C

Output: m=b1b2…bn

Algorithm:

start from the pivot vertex vp;

let visiting vertex va equals to vp;

find the closest vertices va1, va2 that are not connected to va;

foreach two bits bjbj+1

{

count for the vertices whose indices are in between a1 and a2, and are not connected to va;

switch (count)

Figure 3.13: Pseudo Code of LCE Recognition Algorithm
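Since the pseudo code of Figure 3.13 is only partly preserved, the following is a separate sketch of the LCE recognition walk rather than a reconstruction of it. It assumes that the original edges of the subgraph are available to the recognizer, that vertices carry integer indices, and that "nearest" means smallest index distance with ties to the higher index; all function and table names are hypothetical.

```python
def nearest_pair(pivot, vertices, edges, invisible):
    """Two visible vertices closest to the pivot (by index) that share no edge with it."""
    candidates = [
        v for v in vertices
        if v != pivot and v not in invisible
        and (pivot, v) not in edges and (v, pivot) not in edges
    ]
    # Assumption: "nearest" = smallest index distance; ties go to the higher index.
    candidates.sort(key=lambda v: (abs(v - pivot), -v))
    return candidates[:2]


# Recognition rule from the text: (vertices skipped, same color?) -> two bits
DECODE = {(0, True): "11", (0, False): "10", (1, True): "00", (1, False): "01"}


def lce_recognize(vertices, original_edges, watermarked_edges, color, pivot, remaining):
    """Return (bit string, verified) by following watermark edges and comparing colors."""
    original = set(original_edges)
    added = set(watermarked_edges) - original  # assumption: original edges are known
    invisible, symbols = set(), []
    while True:
        targets = [v for (u, v) in added if u == pivot]
        if not targets:
            break  # end of the embedding path
        if len(targets) > 1:
            raise ValueError(f"ambiguous watermark edges at v{pivot}")
        target = targets[0]
        pair = nearest_pair(pivot, vertices, original, invisible)
        if target not in pair:
            raise ValueError(f"edge from v{pivot} does not fit the LCE rule")
        symbols.append(DECODE[(pair.index(target), color[pivot] == color[target])])
        invisible.add(pivot)
        pivot = target
    leftover = [v for v in vertices if v not in invisible and v != pivot]
    return "".join(symbols), leftover == [remaining]


message, verified = lce_recognize(
    [1, 2, 3, 4], set(), {(1, 2), (2, 4)},
    color={1: 0, 2: 0, 3: 0, 4: 1}, pivot=1, remaining=3,
)
print(message, verified)  # 1101 True
```

The input graph here is a hypothetical four-vertex subgraph with two watermark edges; the walk recovers 11 from the first edge (no vertex skipped, same color) and 01 from the second (one vertex skipped, different colors).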

3.3 Example

In this section, we show how the proposed graph encoding algorithms can be applied to an example program, shown in Figure 3.14: a prime number generator that generates prime numbers no larger than an integer a. A two-bit message 01 will be embedded into the program.

int k(int);

int main()

{ int a, b, sum; //v1

printf("insert a prime number \n"); //v2

scanf("%d",&a); //v3

In the transformation phase, we select the blocks as vertices and the program is parsed into a graph as in Figure 3.15. Given the length of the message, a four-vertex graph is required. We select a four-vertex graph with vertices set V = {v1, v3, v5, v6} and edges set E = {(v5, v6)}. According to path analysis, v1 is selected as the pivot vertex. In the embedding phase, the LE algorithm is used to embed the message 01 by adding the edges (v1, v3) and (v3, v6), as in Figure 3.16. The watermarked program and its parsed graph are shown in Figure 3.17 and Figure 3.18 respectively.

Figure 3.15: The Parsed Program

Figure 3.16: The Graph of Embedding

int k(int);

printf("insert a prime number \n"); //v2

v3: scanf("%d",&a); //v3

Figure 3.17: The Watermarked Program

Figure 3.18: The Parsed Watermarked Program

3.4 Path Analysis

Path analysis is a useful tool that not only finds the longer path but also provides a means of testing and verification. We introduce the concept of path analysis using Figure 3.19 as an example. The original graph in Figure 3.19 (a) is a 4-vertex graph, and v1 is selected as the pivot vertex. The LE algorithm is used to embed a 2-bit message into this 4-vertex graph. For a 2-bit message, Figure 3.19 (b) shows the 4 possible paths that embed the 4 possible messages. Given another graph as in Figure 3.20 (a), it is easy to see that the embedding process encounters a problem if the message is 00 or 01. And the graph in Figure 3.21 (a) cannot embed any message if the pivot vertex is v1.

Figure 3.19 Example of 4 Possible Paths with 4 Vertices and LE Algorithm

Figure 3.20 Example of 2 Possible Paths with 4 Vertices and LE Algorithm

Figure 3.21 Example of Zero Possible Paths with 4 Vertices and LE Algorithm

Success and failure of embedding can be checked with path analysis. In the same way, the possible paths are analyzed and found once a vertex is selected as the pivot vertex. Taking Figure 3.19 as an example, the graph diagram is shown in Figure 3.22 (a). The graph diagram shows the possible paths from the pivot vertex. From v1, the vertices pair (v2, v3) is the nearest one not connected to v1, so (v1, v4) is not a possible path. Under the rule of the LE encoding algorithm, the graph diagram expands into the tree diagram in Figure 3.22 (b). In the tree diagram, it is easy to find the four possible paths, which correspond to Figure 3.19 (b).

Figure 3.22 Example of Path Analysis with 4 Vertices and LE Algorithm

Figure 3.23 is another example, corresponding to Figure 3.20. The graph diagram shows the possible paths from the pivot vertex v1. The tree diagram in Figure 3.23 (b) expands from the graph diagram. There are only two possible paths under the rule of the LE encoding algorithm.

Figure 3.23 Example of Path Analysis with 2 Vertices and LE Algorithm
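Path analysis for LE can be sketched as an enumeration of the tree of choices from the pivot vertex. The three graphs below are hypothetical stand-ins chosen to yield four, two and zero embeddable messages; they are not taken from Figures 3.19 to 3.21, whose edges are not given in the text. The sketch assumes integer vertex indices and "nearest" meaning smallest index distance with ties to the higher index.

```python
def nearest_pair(pivot, vertices, edges, invisible):
    """Two visible vertices closest to the pivot (by index) that share no edge with it."""
    candidates = [
        v for v in vertices
        if v != pivot and v not in invisible
        and (pivot, v) not in edges and (v, pivot) not in edges
    ]
    # Assumption: "nearest" = smallest index distance; ties go to the higher index.
    candidates.sort(key=lambda v: (abs(v - pivot), -v))
    return candidates[:2]


def embeddable_messages(vertices, edges, pivot, n_bits):
    """List every n_bits-long message that LE can embed starting from the pivot."""
    edges = set(edges)
    results = []

    def walk(current, invisible, prefix):
        if len(prefix) == n_bits:
            results.append(prefix)
            return
        pair = nearest_pair(current, vertices, edges, invisible)
        if len(pair) < 2:
            return  # dead end: no vertex pair, so neither bit can be embedded here
        for bit, target in zip("01", pair):
            walk(target, invisible | {current}, prefix + bit)

    walk(pivot, frozenset(), "")
    return results


V = [1, 2, 3, 4]
print(embeddable_messages(V, set(), 1, 2))              # all four messages
print(embeddable_messages(V, {(2, 4)}, 1, 2))           # only 10 and 11
print(embeddable_messages(V, {(1, 2), (1, 3)}, 1, 2))   # none
```

Running the enumeration for each candidate pivot and keeping the one with the most embeddable messages is one way to read "the pivot vertex is selected according to path analysis".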

3.5 Error Detection

Path analysis described in the previous section can also be applied to error detection. Using the information of the pivot vertex and the remaining vertex, we can verify whether some bits of the message are in error. Figure 3.24 shows an example of error detection:

Figure 3.24 Example of Error Detection

Given the watermarked graph G′ with the LE algorithm as in Figure 3.24 (a), the start vertex and the remaining vertex are v1 and v4 respectively. From the pivot vertex v1, and consistent with the remaining vertex v4, an embedding path E = {(v1, v2), (v2, v3), (v3, v5), (v5, v6)} is found. A Hamiltonian path can be found if we add a virtual directed path (v6, v4). The message is extracted correctly by the LE recognition algorithm.

Figure 3.24 (b) shows an error in which the edge (v2, v3) has been replaced by an altered edge (v2, v6). From the pivot vertex v1, with the information of the remaining vertex v4, we cannot find any possible path. A critical error has occurred in the graph if no Hamiltonian path can be found, and we can use path analysis to recover the message. First, from the LE embedding algorithm, we analyze the graph and find that the edge (v1, v2) carries a correct bit. The next pivot vertex is v2, {v3, v4} is its nearest vertices pair, and so there should be an edge (v2, v3) or (v2, v4). According to the information of the edge (v3, v5), we can recover the edge (v2, v3) and find the correct embedding path. A Hamiltonian path is found if we add a virtual directed path (v5, v4). The message is recovered and extracted correctly by the LE recognition algorithm.
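The detection half of this example can be sketched as a simple path check: follow the watermark edges from the pivot and confirm that they visit every vertex except the remaining one. The function name `check_embedding_path` is hypothetical, the watermark edges are assumed to be already separated from the original edges, and the recovery step described above is not implemented here.

```python
def check_embedding_path(vertices, added_edges, pivot, remaining):
    """Follow watermark edges from the pivot; valid if all vertices but `remaining` are visited."""
    successors = {}
    for u, v in added_edges:
        successors.setdefault(u, []).append(v)
    visited, current = [pivot], pivot
    # Keep walking while the current vertex has exactly one outgoing watermark edge.
    while len(successors.get(current, [])) == 1:
        nxt = successors[current][0]
        if nxt in visited:
            break  # a cycle cannot be part of a valid embedding path
        visited.append(nxt)
        current = nxt
    ok = set(visited) == set(vertices) - {remaining}
    return ok, visited


V = [1, 2, 3, 4, 5, 6]
intact = [(1, 2), (2, 3), (3, 5), (5, 6)]    # path of Figure 3.24 (a)
altered = [(1, 2), (2, 6), (3, 5), (5, 6)]   # (v2, v3) replaced by (v2, v6)

print(check_embedding_path(V, intact, pivot=1, remaining=4))   # (True, [1, 2, 3, 5, 6])
print(check_embedding_path(V, altered, pivot=1, remaining=4))  # (False, [1, 2, 6])
```

In the altered case the walk stops at v6 after visiting only three vertices, which signals the error; the stranded edge (v3, v5) is the evidence the text then uses to recover (v2, v3).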

3.6 Summary

In this chapter, we proposed a graph-based watermark framework that can adopt more than three kinds of graph encoding algorithms. Depending on the requirements for robustness and bit rate, the LE, CE or LCE algorithm can be applied.

Chapter 4

Analysis

Having illustrated the proposed framework and algorithms, we now analyze them and compare them with other frameworks and algorithms. Stealth, robustness and flexibility are the criteria for evaluating software watermark algorithms.

4.1 Security Analysis

After the watermarked graph is produced, it may face many kinds of adversaries. A robust watermark algorithm prevents specific attacks from extracting or destroying the embedded message. Since the graph is constructed from vertices and edges, a graph-based watermark is vulnerable to attacks on those edges and vertices, so we focus on preventing additive/subtractive attacks.

In this section, we illustrate the software watermark attacks on graph edges and vertices.

• Edges additive/subtractive attack

The path analysis described in Section 3.4 can be used to detect redundant or missing edges, which could destroy the watermark information in the software module. Before extraction, the embedding path must be found in order to construct a Hamiltonian path. A single altered edge can be recovered as described in Section 3.5. If a few bits have been altered, we compare against the path analysis information from the original graph, using the start vertex, the remaining vertex and the graph encoding algorithm. Possible paths for recovering the message are found during this comparison.

• Vertices additive/subtractive attack

The number of vertices is tied to the number of bits in the message. For example, with the LE algorithm a subgraph of N vertices carries an (N-2)-bit message. Redundant or lost vertices are therefore detected through this property of the graph encoding algorithm. To recover the message, we compare against the path analysis information from the original graph, using the start vertex, the remaining vertex and the graph encoding algorithm. During the comparison, we find the changes in the vertices and reconstruct the graph to extract the correct message.

Figure 4.1 is an example of a vertex subtractive attack. Adversaries have deleted v6, so the edges (v3, v6) and (v5, v6) become null pointers. It is obvious that a vertex has been deleted or lost. The correct message can still be extracted by rebuilding the graph.

Figure 4.1 Example of Vertex Subtractive Attack
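Detecting a deleted vertex from its dangling edges can be sketched as follows. The edge list is a hypothetical graph that merely contains the two edges named in the text, (v3, v6) and (v5, v6); the function name `dangling_edges` is also an assumption.

```python
def dangling_edges(vertices, edges):
    """Edges with at least one endpoint that is no longer in the vertex set."""
    present = set(vertices)
    return sorted(e for e in edges if e[0] not in present or e[1] not in present)


# v6 has been deleted, but edges that pointed to it are still recorded.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 6), (5, 6)]

broken = dangling_edges(vertices, edges)
missing = sorted({v for e in broken for v in e} - set(vertices))
print(broken, missing)  # [(3, 6), (5, 6)] [6]
```

The missing endpoint identifies which vertex to restore before the recognition algorithm is run on the rebuilt graph.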

Figure 4.2 Example of Edge-flip Attack

• Edge-flip Attack

An edge-flip attack against the watermark reorders the edge between vertices. The outgoing of
