Motivation - A Resilient Push-Pull Peer-to-Peer Streaming Method Based on Application-Level Multicast

Chapter 1 Introduction

1.2 Motivation

The tree-based architecture has a low start-up delay but is less resilient to node failures than the mesh-based architecture, which results in a low delivery ratio and unstable quality of the received multimedia. In this thesis we propose a multi-streaming scheme based on the structure of SplitStream [9] and application-level multicast [16]. We split the video streaming data and build multiple trees to transfer it. We integrate a forward error correction (FEC) [17] algorithm to recover the original data, and we combine the pull method with the tree-based method, which is push-based. As a result, our approach has good resilience to node failures. The rest of this thesis is structured as follows. Chapter 2 reviews related work and compares different P2P streaming systems. Chapter 3 describes the design approach. Chapter 4 presents simulation results. Chapter 5 discusses implementation issues, and Chapter 6 gives conclusions and future work.

Chapter 2 Related Work

2.1 Existing P2P Streaming Systems

In this chapter, we review some existing P2P streaming systems.

Scribe [16]: Scribe is a scalable application-level multicast system. It builds multicast trees and supports a large number of groups. Scribe is built on Pastry, a peer-to-peer location and routing substrate, and uses Pastry for all group operations: nodes can create groups, join groups, and send messages to other nodes in the same group. Scribe provides best-effort reliability guarantees and scales well across a wide range of groups.

SplitStream [9]: In a tree-based system, the forwarding load is poorly balanced across nodes. SplitStream therefore constructs a forest of multicast trees that distributes the forwarding load among the participating nodes. In particular, a node is an interior node in only one tree and a leaf node in all the others. SplitStream relies on a structured peer-to-peer overlay to construct and maintain these trees. In this thesis, we enhance SplitStream with a data encoding technique to make it more robust and resilient to node failures.

PRIME [14]: PRIME is a mesh-based P2P streaming system for swarming content delivery. First, PRIME uses proper peer connectivity to minimize bandwidth bottleneck. Second, PRIME employs an efficient content delivery pattern to minimize content bottleneck. PRIME classifies the delivery pattern into two phases, the

CoolStreaming [5]: There are three key modules in this system: 1) membership manager, which maintains a partial view of the overlay; 2) partnership manager, which establishes and maintains partnership with other peer nodes; 3) scheduler, which is responsible for the schedule of stream transmissions. This system is a mesh-based and receiver-driven design of a streaming overlay. Every node periodically exchanges the Buffer Map which records the data availability information with partner nodes. CoolStreaming uses the pull method to retrieve unavailable data from partners.

HCPS [12]: This system proposes a hierarchical architecture. In HCPS, the nodes are grouped into clusters and retrieve data from the source server or the cluster head. The system applies the perfect scheduling algorithm within each cluster and fully utilizes the bandwidth to achieve a high streaming rate with short delay.

Trickle [11]: Trickle is a peer-to-peer real-time media streaming system built upon SplitStream. It constructs multiple multicast trees and combines an erasure correction code with the recruitment of many peer helpers. The system can transport video streams with low link stress and stable sub-second frame delays.

Gridmedia [13]: CoolStreaming is a receiver-driven system, but its delay is long. To address this drawback, Gridmedia proposes a push-pull streaming method. To reduce the delay at nodes and to offer resilience to the high churn rate in the overlay, the nodes in Gridmedia are organized into an unstructured overlay. The pull method is the same as that of CoolStreaming. Nodes first use the pull mode; when the partnership between nodes is stable, the delivery mode changes to the push mode.

2.2 Qualitative Comparison of Existing P2P Streaming Systems

We compare existing P2P streaming systems qualitatively in Table 1. Systems such as CoolStreaming and PRIME use a mesh-based architecture. These systems have a longer start-up delay, but they have good resilience to node failures, good load balancing between nodes, and high bandwidth utilization. In contrast, systems such as Scribe, Trickle, and SplitStream use a tree-based architecture. Scribe uses a single tree, so it performs poorly under node failures and its load balancing is poor. SplitStream is a multi-stream scheme that improves load balancing and mitigates peer churn problems, but compared with the mesh-based architecture it is less resilient to node failures. Trickle combines the IDA data recovery algorithm with SplitStream, so it has better resilience to node failures. Another architecture, the push-pull method used by Gridmedia, combines the mesh-pull and tree-push methods. Its advantages include a short start-up delay, good resilience to node failures, and good load balancing, but it also incurs much more control overhead than the other methods. Our approach is based on the tree-push method, combined with the FEC recovery algorithm and the pull method, so it has a short start-up delay, good resilience to node failures, and good load balancing. In addition, our approach sends fewer control messages than Gridmedia because a node sends request messages only when it has missed data; it then sends requests to the spare nodes specified in its SpareTable. The pull method in Gridmedia is

Table 1. Qualitative comparison of existing streaming systems

... | ... | ... sec. | Good | Good | Pull | High | Mesh
PRIME [14] | MDC | 30 sec. ~ 1 min | Good | Good | Pull | High | Mesh
Gridmedia [13] | None | 30 sec. ~ 1 min | Good | Good | Push-pull | High | Mesh
Scribe [16] | None | 10~30 sec. | Poor | Poor | Push | Medium | Tree
SplitStream [9] | None | 10~30 sec. | Medium | Good | Push | High | Multiple Tree
Trickle [11] | IDA | 10~30 sec. | Medium | Good | Push | High | Multiple Tree
HyStream (proposed) | FEC | 10~30 sec. | Good | Good | Push-pull | High | Multiple Tree

Chapter 3 Design Approach

The architecture of the proposed P2P streaming system is based on a DHT and application-level multicast. We use Pastry [15] as the DHT layer to implement the basic structure of our streaming system. The application-level multicast layer is responsible for constructing a multicast overlay network, as in Scribe [16]. Our approach focuses on the streaming layer, which is responsible for transferring streaming data. Traditionally, tree-based approaches use the push method, in which each node forwards data to its child nodes. This approach has a low start-up delay. However, it has two main problems: (1) if the bandwidth of an interior node is low, its child nodes may lose data; (2) when an interior node fails, its children cannot receive data until the tree is repaired. Therefore, we propose a hybrid method that combines the mesh-pull method and the tree-push method to resolve these two problems while keeping the advantages of both the tree-based and mesh-based approaches. Our approach is composed of three parts: (1) Streaming data fragmentation and building a forest: we split the streaming data and build a forest to transfer it. (2) Data restoration: we integrate a forward error correction (FEC) algorithm to recover lost data. (3) Data retransmission: when data is lost, we use a data retransmission method, which is a pull method, to retrieve it. We describe these three parts in detail below.

3.1 Streaming Data Fragmentation and Building a Forest

The concept of multiple data transmissions using a forest was described in [9]. Video data are divided into frames, and each frame is assigned a sequence number that represents its playback order. We then split each frame into several stripes and transfer each stripe over an individual multicast tree formed by the participating nodes. To distribute the forwarding load among all participating nodes, the nodes form interior-node-disjoint multicast trees [9]. The basic architecture of our proposed approach is similar to SplitStream. We use Scribe multicast trees to form a forest, and we exploit the properties of Pastry to construct interior-node-disjoint trees. In our P2P network, every node has a nodeId and every stripe has a stripeId. Each stripe's stripeId starts with a different digit, and the nodeIds of the interior nodes of a tree share a prefix with that tree's stripeId. Figure 1 shows an example forest construction [9]. Since the nodeId of node A starts with 1, node A is an interior node in the tree whose stripeId starts with 1 and a leaf node in the other trees. In the forest, each node is thus an interior node in one multicast tree and a leaf node in all other multicast trees. Figure 2 shows an example of building two multicast trees for two stripes. Each interior node in Tree 1 is a leaf node in Tree 2.
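The prefix rule described above can be sketched in a few lines of Python. This is only an illustration, not the FreePastry implementation: the helper names `shares_first_digit` and `roles` and the hexadecimal ids are hypothetical, and the sketch only says where a node is eligible to be an interior node, which is the property that keeps the trees interior-node-disjoint.

```python
from __future__ import annotations


def shares_first_digit(node_id: str, stripe_id: str) -> bool:
    """True when the nodeId and the stripeId start with the same digit."""
    return node_id[0] == stripe_id[0]


def roles(node_id: str, stripe_ids: list[str]) -> dict[str, str]:
    """Map each stripeId to the role this node may take in that stripe's tree.

    A node may be an interior node only in the tree whose stripeId shares
    its first digit; in every other tree it is a leaf.
    """
    return {
        stripe_id: "interior" if shares_first_digit(node_id, stripe_id) else "leaf"
        for stripe_id in stripe_ids
    }


# Hypothetical ids chosen only for illustration.
stripe_ids = ["0A3F", "1B22", "2C91"]
node_a = "1D07"
node_a_roles = roles(node_a, stripe_ids)
print(node_a_roles)
```

Because every stripeId starts with a different digit, at most one stripe can match a given nodeId, so no node forwards data in more than one tree.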

Figure 1. An example forest construction.

Figure 2. An example of building two multicast trees for two stripes.

3.2 Data Restoration

We use multiple trees to transfer stripes. Nodes may lose some stripes, so we use an FEC algorithm to recover a complete data frame. Forward error correction (FEC) is an error correction technique in which the sender adds redundant packets, also known as an error correction code, to the original data. The use of FEC begins with a proper selection of the parameters $k$ and $n$ ($k < n$). We split one frame into $k$ packets, and $n$ is the number of encoded packets. In our approach, $n$ is equal to the number of stripes. The source node transfers the $n$ encoded packets to its child nodes. Even if packets are lost in some stripes, we can still recover the original data from at least $k$ encoded packets. With this technique, the number of extra redundant packets is $n - k$. The details of the FEC encoder and decoder are described in [17]. The encoding and decoding processes are shown in Figure 3. Let $\vec{x}$ be the source data. The encoded data $\vec{y}$ is generated by $\vec{y} = G\vec{x}$, where $G$ is an $n \times k$ matrix with rank $k$.

The matrix $G$ is called the generator matrix [17], and each component of $\vec{y}$ is a linear combination of the components of $\vec{x}$ with coefficients taken from a row of $G$. Assuming that only $k$ components of $\vec{y}$ are successfully received at the receiver, we can still restore the source data from these $k$ components. The decoding solution is $\vec{x} = G'^{-1}\vec{y}'$, obtained from $\vec{y}' = G'\vec{x}$, where $\vec{x}$ is the source data and $\vec{y}'$ is a subset of $k$ components of $\vec{y}$. The matrix $G'$ consists of the rows of $G$ corresponding to the components of $\vec{y}'$. $G'$ is a $k \times k$ invertible matrix, so we can decode the source data by multiplying $G'^{-1}$ by $\vec{y}'$.
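The matrix view of encoding and decoding can be tried out with a small Python sketch. It rests on assumptions that are not taken from [17]: the generator matrix is a Vandermonde matrix (so any $k$ of its rows are invertible), and the arithmetic uses exact rational numbers instead of the finite field that practical erasure codes typically use. All function names are hypothetical.

```python
from fractions import Fraction


def generator_matrix(n, k):
    """n x k Vandermonde matrix over the rationals (an assumed choice of G).

    Row i is [1, a, a^2, ...] with a = i + 1, so any k rows are invertible.
    """
    return [[Fraction(i + 1) ** j for j in range(k)] for i in range(n)]


def encode(G, x):
    """Compute y = G x."""
    return [sum(g * xi for g, xi in zip(row, x)) for row in G]


def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with exact fractions."""
    k = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(k):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, k) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(k):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][k] for r in range(k)]


def decode(G, received):
    """Recover x from k received components.

    received maps a row index of G to the received component of y;
    the selected rows form G' and the values form y'.
    """
    idx = sorted(received)
    G_sub = [G[i] for i in idx]
    y_sub = [received[i] for i in idx]
    return solve(G_sub, y_sub)


n, k = 4, 3
G = generator_matrix(n, k)
x = [Fraction(9), Fraction(5), Fraction(13)]   # three source packets as integers
y = encode(G, x)

# Suppose the component carried by stripe index 1 is lost.
survivors = {i: y[i] for i in (0, 2, 3)}
recovered = decode(G, survivors)
print(recovered == x)
```

The same `decode` call works for any other choice of three surviving components, which is the property the retransmission scheme relies on: a frame is complete as soon as any $k$ of its $n$ blocks are in the buffer.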

An example of FEC is shown in Figure 4, and the detailed calculation is described in [22]. We set the three packets to be $P_1 = [1001]$, $P_2 = [0101]$, and $P_3 = [1101]$. And the generating matrix

After encoding by $\vec{y} = G\vec{x}$, we get the redundant packets, where $P_4 = 2P_1 + 3P_2 + P_3 = [1001]$,

Figure 3. The encoding and decoding processes represented by matrix operations. $\vec{y}'$ and $G'$ correspond to the grey areas of $\vec{y}$ and $G$ [17].

Figure 4. An example FEC. We can decode the original data with any k blocks out of n blocks.

Figure 5. Spare nodes selection algorithm.

Spare Nodes Selection Algorithm
INPUT:
    nodeId: nodeId of the new node;
    degree[i]: degree of tree i;
    num_stripe: number of stripes;
    sub_tree[j]: set of nodeIds in sub-tree j of tree i;
OUTPUT:
    SpareTable: SpareTable for this node;
For i ← 1 to num_stripe
    For j ← 1 to degree[i]
        If nodeId ∉ sub_tree[j] then
            SpareTable[i] ← SpareTable[i] ∪ sub_tree[j]
        End if;
    End for j;
End for i;

3.3 Data Retransmission

Here we first explain how to construct a SpareTable and then describe the proposed retransmission mechanism. The SpareTable records information about spare nodes; a node looks up its targets in the SpareTable when it wants to send requests. We present a spare nodes selection algorithm, shown in Figure 5, to construct a SpareTable. When a new node joins the network, it contacts the source node, and the source node selects spare nodes for it. In a multicast tree, a node failure prevents its children from receiving data, but nodes in other sub-trees are not affected. The nodes in other sub-trees are therefore suitable targets for requesting lost data. In our P2P network, there are several multicast trees.

Assume the number of stripes is N, so there are N multicast trees. Assume the degree of multicast Tree 1 is M (M < N). We divide multicast Tree 1 into M sub-trees. The source node takes every node from the other M-1 sub-trees, that is, all sub-trees except the one the new node belongs to, and constructs one entry in the SpareTable. Repeating this for each tree gives N entries in the SpareTable.

The following example is based on Figure 2. Since there are two multicast trees in Figure 2, there are two entries in the SpareTable of every node, as shown in Figure 6. For node A in Figure 2, the SpareTable has one entry for stripe 1 and one for stripe 2. For stripe 1, we choose nodes B, E, H, and F as spare nodes because they are in the other sub-tree. For stripe 2, we choose D, E, G, and C as spare nodes.

Figure 6. The spare table (SpareTable) of node A
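The SpareTable construction for node A can be sketched in Python as follows. The function name `build_spare_table` and the data layout are hypothetical, and the sub-tree memberships below are an assumption chosen to be consistent with the spare-node lists quoted for node A; they are not read from Figure 2 itself.

```python
def build_spare_table(node_id, forest):
    """Collect, per stripe, all nodes outside the sub-tree that holds node_id.

    forest maps a stripe number to the list of sub-trees of that stripe's
    multicast tree; each sub-tree is a set of nodeIds.
    """
    table = {}
    for stripe, sub_trees in forest.items():
        spares = set()
        for sub_tree in sub_trees:
            if node_id not in sub_tree:
                spares |= sub_tree
        table[stripe] = spares
    return table


# Assumed sub-tree memberships for a forest of two trees with degree 2.
forest = {
    1: [{"A", "C", "D", "G"}, {"B", "E", "F", "H"}],
    2: [{"A", "B", "F", "H"}, {"C", "D", "E", "G"}],
}

spare_table_a = build_spare_table("A", forest)
print(spare_table_a)
```

With these assumed sub-trees, node A gets B, E, F, H for stripe 1 and C, D, E, G for stripe 2, the same sets as in the example above.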

In the P2P network, nodes receive frames and store them in a buffer. When a node receives a frame with a new sequence number, say i, it checks whether frame (i-1) has been decoded. If it has, there is no need to send any request for that frame. If not, the node determines which stripes the lost blocks of frame (i-1) belong to and, for each of them, sends a request message to a randomly selected node from the corresponding entry in the SpareTable. Conversely, when a node receives a request message, it checks its receiving buffer for the block with the specified <sequence, stripe>. If the requested block is available, the node sends it back to the requesting node. If not, the node does not send a reply.
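The request and reply rules above can be sketched as a small Python class. This is a simplified model under stated assumptions, not the HyStream implementation: the class `Node`, its method names, and the message tuples are hypothetical, sequence numbers are assumed to start at 0, a frame counts as decodable once k of its blocks are buffered, and a request is generated for every missing block of the previous frame.

```python
import random


class Node:
    def __init__(self, name, num_stripes, k, spare_table, rng=None):
        self.name = name
        self.num_stripes = num_stripes
        self.k = k                        # blocks needed to decode one frame
        self.spare_table = spare_table    # {stripe: list of spare node names}
        self.buffer = {}                  # {(seq, stripe): block}
        self.rng = rng or random.Random()

    def decodable(self, seq):
        """A frame can be decoded once at least k of its blocks are buffered."""
        have = sum((seq, s) in self.buffer for s in range(1, self.num_stripes + 1))
        return have >= self.k

    def on_block(self, seq, stripe, block):
        """Store a pushed block and return the retransmission requests to send.

        Requests are only generated when the block starts a new frame and the
        previous frame cannot be decoded yet.
        """
        first_of_frame = not any(s == seq for (s, _) in self.buffer)
        self.buffer[(seq, stripe)] = block
        if not first_of_frame or seq == 0 or self.decodable(seq - 1):
            return []
        requests = []
        for s in range(1, self.num_stripes + 1):
            if (seq - 1, s) not in self.buffer:
                target = self.rng.choice(self.spare_table[s])
                requests.append((target, seq - 1, s))
        return requests

    def on_request(self, seq, stripe):
        """Return the requested block, or None, which stands for sending no reply."""
        return self.buffer.get((seq, stripe))


# Hypothetical scenario: E missed stripes 1 and 3 of frame 0.
e = Node("E", num_stripes=3, k=2, spare_table={1: ["G"], 2: ["A"], 3: ["D"]})
g = Node("G", num_stripes=3, k=2, spare_table={1: ["E"], 2: ["C"], 3: ["B"]})

g.on_block(0, 1, b"s1")                 # G holds stripe 1 of frame 0
e.on_block(0, 2, b"s2")                 # E only holds stripe 2 of frame 0
requests = e.on_block(1, 1, b"next")    # first block of frame 1 triggers the check
reply = g.on_request(0, 1)
print(requests, reply)
```

Requests are sent only when a loss is detected, so a node that receives every frame intact generates no control traffic at all.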

Figure 9 illustrates the retransmission process. In Figure 9, node B has failed. Nodes E and F may lose some blocks in stripe 1, so node E sends a request to node G. If the specified <sequence, 1> block is available at node G, it sends the block back to node E. Node F does the same with spare node D. In tree 2, when node C loses blocks in stripe 2, it sends a request to a node in entry 2 of the SpareTable. Here node C sends a retransmission request to spare node A in multicast tree 2 and gets a reply from it.

Figure 8. The flowchart of the retransmission process: the process of a node sending a reply.

Figure 9. The retransmission processes of two multicast trees. (a) Nodes E and F send retransmission requests and get replies from spare nodes G and D in multicast tree 1. (b) Node C sends a retransmission request and gets a reply from spare node A in multicast tree 2.

Chapter 4 Simulation Results

We first used FreePastry (version 2.0_04) [18] to implement the DHT layer (Pastry) [15], the application-layer multicast (Scribe), and the streaming layer (SplitStream). We then implemented the proposed HyStream scheme on top of this structure. We compare HyStream with SplitStream in Section 4.1 and with CoolStreaming in Section 4.2.

4.1 Simulation against SplitStream

We built our simulation environment on a Pastry overlay with 800 nodes and constructed a forest structure with these nodes. Repair time is determined primarily by SplitStream's failure detection period, which triggers a tree repair when no heartbeats or data packets have been received for 30 seconds. The delivery ratio is defined as the number of packets that arrive at each node before the playback deadline divided by the total number of delivered packets.
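As a concrete reading of the delivery ratio definition, the following Python sketch computes it for one node. The function name and data layout are hypothetical, and it assumes that "delivered packets" means all packets sent by the source, each with its own playback deadline.

```python
def delivery_ratio(arrivals, deadlines):
    """Share of packets that reach the node before their playback deadline.

    deadlines maps every packet sequence number sent by the source to its
    playback deadline; arrivals maps the sequence numbers that actually
    arrived to their arrival times. Lost packets are absent from arrivals.
    """
    if not deadlines:
        return 0.0
    on_time = sum(
        1 for seq, t in arrivals.items() if seq in deadlines and t < deadlines[seq]
    )
    return on_time / len(deadlines)


# Hypothetical trace: packet 1 arrives late and packet 2 is lost.
deadlines = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
arrivals = {0: 0.5, 1: 2.5, 3: 3.9}
ratio = delivery_ratio(arrivals, deadlines)
print(ratio)
```

Under this reading, a packet that arrives after its deadline counts the same as a lost packet.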

Mean time to failure (MTTF) [19] is defined as the time interval at which a portion of the nodes is killed.

The node failure rate [20] is defined as the percentage of nodes that fail simultaneously. In Figure 10, we evaluate the delivery ratio under node failure rates between 1% and 20% (that is, 8 to 160 simultaneous failures in the overlay with 800 nodes) with a fixed MTTF of 100 seconds. The results show that the delivery ratio decreases as the node failure rate increases in both HyStream and SplitStream. We set the FEC parameters to (16, 15), so the encoded data has about 6% extra redundant packets. We can see that adding an FEC recovery method improves the delivery ratio. However, when the node failure rate

average improvement is 11.7%. We can see that as the node failure rate increases, the improvement of HyStream increases.

Figure 10. Delivery ratios with various node failure rates under a fixed MTTF (100 seconds).

Figure 11. Extra control overhead with various node failure rates under a fixed MTTF (100 seconds).


In Figure 11, we show the extra control overhead under various node failure rates. We define the extra control overhead as the ratio of control traffic volume to video traffic volume at each node. The extra control traffic of HyStream consists of the retransmission requests. We observed that the extra control overhead increases as the node failure rate increases. The maximum extra control overhead is lower than 0.5%. When the node failure rate is under 3%, we can recover most of the data with a small number of retransmission requests and low extra control overhead.

4.2 Simulation against CoolStreaming

Here we evaluate two metrics: delivery ratio and delivery latency, the latter measured as start-up delay. We compare the proposed HyStream with CoolStreaming on both metrics, with CoolStreaming's simulation results obtained from [5].

(1) Delivery ratio:

We first compare the delivery ratio of HyStream and CoolStreaming. The delivery ratio is defined as the number of packets that arrive at each node before the playback deadline divided by the total number of delivered packets. We implemented a simulation environment according to [5]. We set the streaming rate to 500 Kbps and the overlay size to 200 nodes.

Each node changes its status according to an ON/OFF model. A node actively participates in the overlay during its ON period and leaves (or fails) during its OFF period. Both ON and OFF periods are exponentially distributed with an average of T. Simulation results are shown in Figure 12. We found that a shorter ON/OFF period leads to a lower delivery ratio. We also found that the delivery ratio of SplitStream is lower with a shorter ON/OFF period because SplitStream is a push-based method. Our approach uses a data

Figure 12. Delivery ratio as a function of ON/OFF period T (sec).

(2) Delivery latency:

We define the start-up delay as the time a node waits, after joining the overlay, until it has received enough data to start playing. We simulated 1000 nodes in the overlay and recorded the cumulative distribution function (CDF) of the start-up delay, as shown in Figure 13. HyStream had the same start-up delay as SplitStream. At the 90th percentile, the start-up delay was 15 seconds in HyStream and 50 seconds in CoolStreaming, so HyStream is 35 seconds shorter. Because our approach is push-based, its start-up delay is very short.

Figure 13. CDF of start-up delays between HyStream and CoolStreaming. (x-axis: start-up delay in seconds; y-axis: CDF of start-up delay; curves: CoolStreaming and HyStream (proposed, same as SplitStream).)

Chapter 5 Implementation Issues

5.1 Applying our Approach to SplitStream

In this section, we describe how to implement our HyStream method. HyStream is an enhancement of SplitStream's streaming multicast. It can be implemented on top of the FreePastry project [18], an open-source project whose source code we can modify to implement our method. The original SplitStream method
