Peer-to-Peer Network - 一個在同儕式網狀串流系統中之有效率的來源選擇方法

Chapter 2 Background

2.1 Peer-to-Peer Network

2.1.1 Introduction

In the traditional client-server architecture, services must be provisioned by specific machine which has sufficient resource and is connected to the network. The machine is called

“server”, which provides contents such as texts, multimedia streams or images, and “client”

that connects to the server to get the data.

In the client-server communication model, a server could be a bottleneck of the service operation when the user grows up because of the restriction of resources such as computing power, network bandwidth, and the storage size of the server. Even worse, all services will be terminated when the server crashes. To reduce the impact of the server failure problem, server cluster or server farm has been used to avoid the single point failure problem. However, as the number of clients grows up, the cost of servers in such system would be too high to be affordable.

To enable the application to support a large number of clients, P2P architecture is employed in recent years. The end hosts participating in the service contribute their resources to help forward and replicate received stream to other clients. The clients, considered as peers in the system, are no longer only data receivers, but also data providers to increase system

capacity and service quality.

2.1.2 P2P file sharing applications

A great number of P2P file sharing applications have been developed over the world with different philosophies and operation modes after Napster raised the interest of P2P in late 1990s. Every peer has the ability to be the server. There is no restriction on what contents a peer can publish and what contents can be obtained from which peer. The most popular P2P file sharing applications are Gnutella, eDonkey and BitTorrent [7], etc.

a) Gnutella

Gnutella, developed by Justin Frankel and Tom Pepper of Nullsoft in 2000, is one of the earliest P2P file sharing tools. It is also one of those pure P2P applications that do not have a centralized server. Gnutella works as follows [8]. A Gnutella peer joins the network via at least one known peer, whose IP address is either obtained via a bootstrapping process or an existing list of pre-configured address. When the user wants to do a search, the source peer will send the search request to all actively connected peers as shown in Figure 2.1. The recipient peer answers the query if it has the data, otherwise it forwards the request.

Figure 2.1 Purely decentralized P2P architecture

A noticeable feature is that Gnutella is under nobody’s specific control and is effectively impossible to shutdown. However, because of too many search requests, when the number of Gnutella users grows too large, the search requests overwhelm the Internet and thus performance gets prematurely dropped.

b) eDonkey

The eDonkey was conceived in Sept. 2000 by MetaMachine Inc. The eDonkey network works as a hybrid P2P network which consists of clients and servers. The eDonkey client downloads and shares the files and the eDonkey server acts as a communication hub and distributes addresses of other eDonkey servers to the clients.

The eDonkey network is based upon an open protocol. As a result, there are many versions of eDonkey clients and servers.

c) BitTorrent

BitTorrent (BT) was created in 2002 by Rram Cohen. It runs on an open protocol. To share a file or a set of files through BitTorrent, a torrent file is first created. The torrent file contains the metadata of the shared content, which includes the information of the tracker that coordinates the file distributed. When a BitTorrent user wants to retrieve a data, it first needs the torrent file. The client then contacts the

“tracker server” listed in the torrent file which can be a single computer or a distributed set of computers. The “Tracker Server” in BT system logs who are downloading the file at the same time and helps peers to find each other.

The most significant contribution of BT is dividing a file into several fixed size pieces, called chunks. A peer can download a file from different peers for different chunks simultaneously. This idea greatly improves download efficiency and reduces a great deal of download time.

Peer A

As Figure 2.2 shows, every peer connects to tracker server. When peer C wants to download the file, it asks tracker and gets a response which indicates that peer A has chunk 2 and chunk 4, peer B has chunk 1 and chunk 3. With the help of the tracker, peer C can download different chunks from different peers simultaneously.

The concept of “chunk” is used for many areas, such as P2P streaming. We will discuss P2P streaming in more detail in Section 2.2.4.

2.1.3 P2P VoIP-skype

Skype is a P2P VoIP (voice over Internet protocol) application. Compared with public VoIP standard such as SIP and H.263, Skype uses a proprietary protocol and relies on the Skype P2P network for user directory. As a result, Skype can easily scale to a very large size.

Skype uses a two tier infrastructure, which consists of super-nodes and clients. When a peer

joins the network, it always starts as a client node. If a client has sufficient ability, Skype automatically promotes the client to a super-node. The Skype client finds at least one super-node as its default gateway, and first tries to establish UDP connection to the super-node. If that fails, it tries to establish a TCP connection on an arbitrary port to the super-node. The overlay used in Skype is shown in Figure 2.3.

Figure 2.3 Two tier P2P overlay used in Skype

2.1.4 P2P streaming

To provide IPTV service in the Internet, the most intuitive approach is through centralized server. However as described above, client-server model has its restrictions. To solve this problem, researchers try to make IPTV based on P2P architecture. We take a look on the architecture in following.

a) Centralized server architecture:

IPTV based on centralized architecture is very similar to the traditional

“client-server”. The difference is that the server provides video contents instead of

images and text. Since IPTV has the potential to overwhelm the Internet backbone and heavy load on the server, we need to consider other approaches to overcome this problem.

b) IP layer multicast:

The first solution is to employ IP multicast scheme on routers. Instead of delivering to every user a copy of the content, server only sends video content to a multicast address. The video content will be replicated by the router so as to reduce the workload on the server. However, the routers in core network belong to different ISPs, a large portion of them do not enable router multicast function.

c) P2P architecture:

In P2P architecture, every peer downloads and shares video contents with each other. Because every peer plays as not only a client but also a server, thus, it reduces the workload on the server. The basic concept is that server provides content to a small subset of peers, and then shares the content with each other, as shown in Figure 2.4.

Figure 2.4 P2P architecture on video stream delivery

在文檔中一個在同儕式網狀串流系統中之有效率的來源選擇方法 (頁 15-21)