
3. Regenerating Code with Cache

3.2. System design

Our scheme builds on the scheme in [27]. The differences are that we encode files with a Regenerating Code, take the LRU cache into account, and add information to the DHT as peers issue requests, which makes both access and encoding more efficient. Each peer takes on three roles, indexer, register, and maintenance, which together control data placement, data lookup, and data availability in the P2P storage system. These mechanisms are supported by the Chord protocol at the bottom layer.

For the indexer and register roles, each peer periodically registers the unique file IDs of all the blocks held in its database with the indexers of each file. The coding blocks that a peer holds for the same file share the same ID but have different coefficients. When an indexer receives a report, it records a new index entry containing the IP address of the reporting peer and the file IDs in the report, and it sets a timer that decides when the entry should be removed. The timer lets the indexer detect whether an indexed peer is still alive in the system. The first indexer of a file is determined by the DHT hash function H(ID); the other indexers are the consecutive successors of the first indexer.

We illustrate this in Figure 3-3.

Figure 3-3 A peer registers block d with 3 indexers
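
The indexer selection and the timer-based index table described above can be sketched as follows. This is only an illustration under stated assumptions: the 16-bit identifier ring, the use of SHA-1 as a stand-in for H, and the names ring_hash, indexers_for, and IndexTable are hypothetical and do not come from the original design.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16  # assumed identifier-space size, for illustration only


def ring_hash(key: str) -> int:
    """Map a key onto the identifier ring (stand-in for the DHT hash H)."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def indexers_for(file_id: str, node_ids: list[int], replicas: int = 3) -> list[int]:
    """First indexer = successor of H(file_id); the rest are its next successors."""
    ring = sorted(node_ids)
    # The first node whose ID is >= H(file_id), wrapping around the ring.
    start = bisect_left(ring, ring_hash(file_id)) % len(ring)
    count = min(replicas, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]


class IndexTable:
    """Per-indexer table: file ID -> {peer address: expiry time}."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.entries: dict[str, dict[str, float]] = {}

    def register(self, peer: str, file_ids: list[str], now: float) -> None:
        """Record (or refresh) one index entry per reported file ID."""
        for fid in file_ids:
            self.entries.setdefault(fid, {})[peer] = now + self.ttl

    def live_peers(self, file_id: str, now: float) -> list[str]:
        """Drop entries whose timer has expired and return the remaining peers."""
        peers = self.entries.get(file_id, {})
        for peer in [p for p, expiry in peers.items() if expiry <= now]:
            del peers[peer]
        return sorted(peers)
```

A peer that keeps re-registering before its entry expires stays in the table; a peer that has left the system silently disappears once its timer runs out, which is how the indexer learns that it is no longer alive.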

For maintenance, each peer is assigned to manage the blocks of some files according to the hash function H(ID), and it periodically evaluates the number of registered blocks in its index table. If the number is below the redundancy target, the peer communicates with the other indexers to confirm that the redundancy really is insufficient, and then triggers the generation of new coding blocks to increase the availability of the file.
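
A minimal sketch of this periodic check is given below. It assumes that each registered peer counts as one registered block and that "confirming" means merging the views of the other indexers; both are simplifications for illustration, and the name blocks_missing is hypothetical.

```python
def blocks_missing(local_peers: set[str],
                   other_indexer_views: list[set[str]],
                   target: int) -> int:
    """Return how many new coding blocks should be generated for one file.

    local_peers: peers registered for the file in this indexer's table.
    other_indexer_views: registered peers reported by the other indexers.
    target: redundancy target for the file.
    """
    # Cheap local test first: nothing to do if the local table meets the target.
    if len(local_peers) >= target:
        return 0
    # Confirm with the other indexers before repairing, because a registration
    # may have reached them but not this indexer.
    known = set(local_peers)
    for view in other_indexer_views:
        known |= view
    return max(0, target - len(known))
```

For example, with a target of 4, a local view of two peers, and another indexer that knows one additional peer, the function reports that one block is missing rather than two.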

Next, we explain how a peer requests a file and how a new coding block is created. In both operations, each peer checks its database and its LRU cache before sending a request to another peer, and each requested peer likewise answers from both its database and its LRU cache. When a peer requests a file from a target peer, the communication takes two steps. First, the peer sends the request to the target peer, which compares the IDs of all the blocks it holds with the file ID and produces a list of block information containing only the 49 coefficients of each block. Second, the requesting peer uses the list to fetch the blocks it does not already have; the blocks themselves are transferred only in this step. Thanks to the two steps, the requesting peer avoids downloading blocks it already owns.
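
The two-step exchange can be sketched as follows. The sketch is illustrative only: the Peer class and its methods are hypothetical, the database and the LRU cache are merged into a single in-memory store for brevity, and the usage example uses short coefficient vectors instead of 49 coefficients per block.

```python
Coeffs = tuple[int, ...]


class Peer:
    """Toy peer; database and LRU cache are merged into one store here."""

    def __init__(self) -> None:
        # file ID -> {coefficient vector: block payload}
        self.blocks: dict[str, dict[Coeffs, bytes]] = {}

    def store(self, file_id: str, coeffs: Coeffs, payload: bytes) -> None:
        self.blocks.setdefault(file_id, {})[coeffs] = payload

    def list_blocks(self, file_id: str) -> list[Coeffs]:
        """Step 1: reply with the coefficients only, not the payloads."""
        return list(self.blocks.get(file_id, {}))

    def fetch(self, file_id: str, wanted: list[Coeffs]) -> dict[Coeffs, bytes]:
        """Step 2: transfer only the blocks the requester asked for."""
        held = self.blocks.get(file_id, {})
        return {c: held[c] for c in wanted if c in held}


def request_from(requester: Peer, target: Peer, file_id: str) -> int:
    """Run both steps and return the number of blocks actually transferred."""
    offered = target.list_blocks(file_id)
    have = requester.blocks.get(file_id, {})
    wanted = [c for c in offered if c not in have]
    received = target.fetch(file_id, wanted)
    for coeffs, payload in received.items():
        requester.store(file_id, coeffs, payload)
    return len(received)
```

Because only coefficient lists travel in the first step, a requester that already holds some of the target's blocks pays for the missing ones only.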

For requests, the indexers of each file always additionally index the peer that last requested the file, and they send the information of both that peer and the registered peers to the requesting peer. To request a file d, a peer first asks the first indexer for the peer information of the file. After receiving the peer list, the requesting peer connects to the last requesting peer first and then follows the peer list until it has collected 7 independent blocks. If the information is still insufficient, the peer looks up the other indexers. If none of the indexers can provide sufficient information, it retries later, up to its maximum lookup time. After collecting 7 independent blocks, the peer decodes them and puts them into its LRU cache for the next access.
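
The collection of independent blocks can be illustrated with a rank test on the coefficient vectors. The following is a sketch under assumptions that the text does not state: coefficients are taken from the prime field GF(257), each block is represented by a single coefficient vector, and the names rank_mod_p and collect_independent are hypothetical.

```python
P = 257  # assumed prime field size; the text does not name the field
K = 7    # independent blocks needed to decode (from the text)


def rank_mod_p(rows, p: int = P) -> int:
    """Rank of a list of coefficient vectors over GF(p), by Gaussian elimination."""
    rows = [list(r) for r in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def collect_independent(peer_order, offers, k: int = K):
    """Visit peers in order and keep only vectors that raise the rank.

    peer_order: peers to contact, the last requesting peer first.
    offers: peer -> list of coefficient vectors that peer can supply.
    Returns the chosen vectors; fewer than k means another lookup is needed.
    """
    chosen = []
    for peer in peer_order:
        for vec in offers.get(peer, []):
            if rank_mod_p(chosen + [vec]) > len(chosen):
                chosen.append(vec)
                if len(chosen) == k:
                    return chosen
    return chosen
```

If the returned list is shorter than k, the caller would query the other indexers or retry later, as described above.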

To create a new coding block, we introduce another type of indexed peer. As above, the indexers always additionally index, for each file, the peer that last requested the file, but here only if that peer is not among the registered members. The reason is that the peer holding the most blocks should be the first choice for generating a new coding block by itself, which saves maintenance cost. When the indexers decide to create new coding blocks, they first notify this additionally indexed peer and then randomly choose other peers outside the list of registered members; the notification carries the list of registered members for the file. A chosen peer that has 7 coding blocks in its LRU cache generates the coding block by itself. If its LRU cache does not hold enough coding blocks, it first asks the last accessed peer. If that peer has 7 blocks, the coding block is generated by the peer. If not, the chosen peer turns to the registered members and has two options, depending on the number of useful peers it gathers for creating the coding block. If the number is at least 13, it follows the Regenerating Code to create a coding block. If the number is smaller than 13, it instead collects 7 blocks to reconstruct the file and then creates a new coding block. Finally, if the number is smaller than 7, then, as in the request process, it asks the indexers again later for more information and repeats the steps above until a timeout occurs.

The flow chart of the process appears in Figure 3-4.

Figure 3-4 The flow chart of creating a coding block
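
The decision steps for creating a coding block can be summarized in a short sketch. The thresholds 7 and 13 are taken from the text; the function name repair_plan and the returned labels are hypothetical, and the sketch reflects one reading of the description rather than the flow chart itself.

```python
K = 7   # blocks needed to reconstruct the file (from the text)
D = 13  # useful peers needed for a Regenerating Code repair (from the text)


def repair_plan(cached_blocks: int, last_peer_blocks: int, useful_peers: int) -> str:
    """Choose how a notified peer obtains a new coding block.

    cached_blocks: coding blocks of the file in the peer's own LRU cache.
    last_peer_blocks: blocks of the file held by the last accessed peer.
    useful_peers: registered members that can contribute to the repair.
    """
    if cached_blocks >= K:
        return "generate-from-own-cache"
    if last_peer_blocks >= K:
        return "use-last-accessed-peer"
    if useful_peers >= D:
        return "regenerating-code-repair"
    if useful_peers >= K:
        return "reconstruct-file-then-encode"
    # Too few sources: ask the indexers again later, until the timeout.
    return "retry-later"
```

The order of the checks mirrors the preference in the text: sources that need no or little network transfer are tried before a repair that contacts 13 peers or a full reconstruction from 7 blocks.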

Source: A Low-Cost, High-Availability Peer-to-Peer Storage Method Using Regenerating Codes and Caching (pages 23-26)