**Placement of Web-Server Proxies with Consideration of Read and Update Operations on the Internet**

### Xiaohua Jia, Deying Li, Xiaodong Hu, Weili Wu and Dingzhu Du

### THE COMPUTER JOURNAL, Vol.46, No.4, 2003

**Outline**

### Introduction

### Problem Formulation

### Optimal Placement of k Proxies

### Optimal Number of Proxies

**Caching**

### Caching is a technology to alleviate traffic congestion and improve the response time of Web servers.

### There are basically two types of Web caching:

### client-based caching

### server-based caching

**Two Problems of the Placement of Proxies**

### 1. Given $k$ proxies, find the optimal placement of the proxies in the network, such that the overall access cost (including both read and update costs) is minimized.

### 2. For an unconstrained number of proxies, find the optimal number of proxies and their placement, such that the overall access cost is minimized.

### They formulate both problems using dynamic programming. Optimal results are obtained in polynomial time.

**Outline**

## Problem Formulation

**Notations and Definitions**

### The network is modeled by a connected graph $G(V, E)$. For a link $(u, v) \in E$, $d(u, v)$ is the distance of the link.

### The Web server is $s$.

### Each node $v \in V$ is associated with a non-negative number $r(v)$, which is the access frequency to $s$. $r(v)$ can be the number of accesses during a certain period of time, combined with the data size retrieved from the server.

### Server $s$ has an update frequency $w$, which is the number of update operations (combined with the updated data size) during a certain period of time.

**Stable Routing**

### The implied routing method in the Internet is shortest path routing: data is always accessed along the shortest paths.

### If the routing is stable, the routes to all clients as viewed by $s$ form a shortest path tree, where the root of the tree is $s$.

### Let $T_s$ denote such a tree induced from the original network graph.
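### The shortest path tree rooted at the server can be obtained with Dijkstra's algorithm. The following is a minimal sketch, not taken from the paper: the function name `shortest_path_tree`, the adjacency-dict representation and the four-node network are assumptions made for illustration.

```python
import heapq

def shortest_path_tree(graph, server):
    """Return (dist, parent) describing the shortest path tree rooted at `server`.

    `graph` maps each node to a dict {neighbor: link_distance}.
    `parent[v]` is the next hop from v towards the server.
    """
    dist = {server: 0}
    parent = {server: None}
    heap = [(0, server)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry, a shorter route was already found
        for nbr, length in graph[node].items():
            nd = d + length
            if nbr not in dist or nd < dist[nbr]:
                dist[nbr] = nd
                parent[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return dist, parent

# Hypothetical network: server "s" and three clients.
graph = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 2, "c": 5},
    "b": {"s": 4, "a": 2, "c": 1},
    "c": {"a": 5, "b": 1},
}
dist, parent = shortest_path_tree(graph, "s")
# parent is {"s": None, "a": "s", "b": "a", "c": "b"}: a chain s - a - b - c
```

### The `parent` map is all that later steps need: it encodes the tree, and following it from any client leads to the server.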

**Update Models**

### There are basically two models for $s$ to transmit updated data to the proxies: the unicast model and the multicast model.

### In this paper, the multicast model is used.

**Notations and Definitions (Cont.)**

### Let $\pi(u, v)$ be the path connecting $u$ and $v$ in $T_s$. Then, $d(u, v) = \sum_{(x, y) \in \pi(u, v)} d(x, y)$.

### Let $p(v)$ denote the first proxy that is met while going from client $v$ to $s$ along the tree $T_s$.

### Suppose the hit ratio of a proxy is $h$. So, a fraction $h$ of a client's requests can be served by a proxy if the proxy is met, and the remaining fraction $1 - h$ will be served by the server.

### Here, they assume the upstream flow from a proxy has a hit ratio of zero (i.e. a request will be served by the server if it is missed by a proxy).

### The hit ratio is an average value over many accesses to a proxy, measured over a long period of time.

### They assume the hit ratios of all proxies and for all clients are the same (i.e. a single constant $h$).

**Cost**

### The cost for client $v$ to access $s$ is $r(v)\, d(v, s)$. The total cost for all clients in $T_s$ to access $s$ is $\sum_{v \in T_s} r(v)\, d(v, s)$.

### Let $P$ denote a set of proxies in the network and $MT(P)$ the shortest path tree rooted from $s$ to reach all proxies in $P$.

### The overall cost to update the proxies is

### $w \sum_{(x, y) \in MT(P)} d(x, y)$

### The total cost of all clients in $T_s$ to access $s$ with a set of proxies $P$ is

### $C(T_s, P) = \sum_{v \in T_s} r(v) \bigl[ h\, d(v, p(v)) + (1 - h)\, d(v, s) \bigr] + w \sum_{(x, y) \in MT(P)} d(x, y)$
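### The cost model can be checked numerically on a small tree. The sketch below is an illustration under stated assumptions, not the authors' code: the function `total_cost`, the child-to-parent encoding of the tree, and all frequencies and distances are hypothetical. A client whose path meets no proxy is served entirely by the server, and each tree link used for multicast updates is charged once.

```python
def total_cost(parent, link_dist, read_freq, proxies, server, hit_ratio, update_freq):
    """Read cost plus multicast update cost for one proxy set on a tree.

    parent[v]    : next node from v towards the server
    link_dist[v] : length of the link between v and parent[v]
    """
    proxies = set(proxies) | {server}  # the server is the last-resort proxy

    def climb(v, stop):
        """Distance from v up the tree to the first node for which stop() holds."""
        d = 0.0
        while not stop(v):
            d += link_dist[v]
            v = parent[v]
        return d

    read = 0.0
    for v, r in read_freq.items():
        to_proxy = climb(v, lambda x: x in proxies)
        to_server = climb(v, lambda x: x == server)
        read += r * (hit_ratio * to_proxy + (1 - hit_ratio) * to_server)

    # Multicast: every link on a server-to-proxy path is paid for once.
    multicast_links = set()
    for p in proxies:
        while p != server:
            multicast_links.add(p)  # a link is named by its child endpoint
            p = parent[p]
    update = update_freq * sum(link_dist[v] for v in multicast_links)
    return read + update

# Hypothetical tree: s - a - b - c, with e hanging off a.
parent = {"a": "s", "b": "a", "c": "b", "e": "a"}
link_dist = {"a": 1, "b": 2, "c": 1, "e": 3}
read_freq = {"a": 1, "b": 2, "c": 4, "e": 1}

no_proxy = total_cost(parent, link_dist, read_freq, set(), "s", 0.5, 1)
proxy_at_b = total_cost(parent, link_dist, read_freq, {"b"}, "s", 0.5, 1)
# no_proxy is 27.0 and proxy_at_b is 21.0
```

### With these numbers a proxy at `b` saves 9 units of read cost and adds 3 units of update cost, so the total drops from 27 to 21.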

**Outline**

## Optimal Placement of k Proxies

**The Meaning of Left**

### Let $T_v$ denote a subtree of $T_s$ rooted at a node $v$, $v \in T_s$.

### Given $u$ and $u'$ in $T_v$, $u$ is said to be to the left of $u'$ if there exist $x$ and $y$ such that $u \in T_x$, $u' \in T_y$, and $x$ and $y$ are siblings with $x$ being to the left of $y$.

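**Partitioning of Subtree**

### $T_v$ can be partitioned into three subgraphs by a dividing point $u$, $u \in T_v$: $L_{v,u}$, the nodes of $T_v$ to the left of $u$; $T_u$, the subtree rooted at $u$; and $R_{v,u}$, the remaining nodes of $T_v$.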
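### The three-way split of a subtree by a dividing point can be computed directly from an ordered tree. This is a minimal sketch with hypothetical names (`subtree`, `partition`, the `children` map): a node counts as "left" of the dividing point when it lies under a sibling that precedes a node on the path from the subtree root to the dividing point.

```python
def subtree(children, v):
    """All nodes of the subtree rooted at v, in preorder."""
    nodes = [v]
    for c in children.get(v, []):
        nodes.extend(subtree(children, c))
    return nodes

def partition(children, v, u):
    """Split the subtree rooted at v by dividing point u.

    Returns (left, t_u, rest): nodes to the left of u, the subtree rooted
    at u, and everything else (including v and the path down to u).
    `children` maps a node to its ordered list of children.
    """
    def path(x):
        # Path from x down to u, or None if u is not below x.
        if x == u:
            return [x]
        for c in children.get(x, []):
            p = path(c)
            if p:
                return [x] + p
        return None

    p = path(v)
    if p is None:
        raise ValueError("u is not in the subtree rooted at v")
    left = []
    for node, nxt in zip(p, p[1:]):
        for c in children.get(node, []):
            if c == nxt:
                break  # siblings after the path node are not to the left
            left.extend(subtree(children, c))
    t_u = subtree(children, u)
    taken = set(left) | set(t_u)
    rest = [x for x in subtree(children, v) if x not in taken]
    return left, t_u, rest

# Hypothetical ordered tree rooted at "v".
children = {"v": ["a", "b", "c"], "b": ["d", "u", "e"], "u": ["f"]}
left, t_u, rest = partition(children, "v", "u")
# left is ["a", "d"], t_u is ["u", "f"], rest is ["v", "b", "e", "c"]
```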

**Partitioning of $R_{v,u}$**

### Here $R_{v,u}$ is the part of $T_v$ that is neither in $T_u$ nor to the left of $u$. Let $u' \in R_{v,u}$ and let $u$ be to the left of $u'$.

### $R_{v,u}$ is partitioned into three parts: $R_{v,u} \cap L_{v,u'}$, $T_{u'}$ and $R_{v,u'}$.

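**Simplicity of Notation**

### The cost function can be rewritten as:

### $C(T_s, P) = \sum_{v \in T_s} r(v) \bigl[ h\, d(v, p(v)) + (1 - h)\, d(v, s) \bigr] + w \sum_{(x, y) \in MT(P)} d(x, y)$

### $= h \sum_{v \in T_s} r(v)\, d(v, p(v)) + (1 - h) \sum_{v \in T_s} r(v)\, d(v, s) + w \sum_{(x, y) \in MT(P)} d(x, y)$

### The second term does not depend on the proxy set $P$. For simplicity of notation, omit the second term:

### $C(T_s, P) = h \sum_{v \in T_s} r(v)\, d(v, p(v)) + w \sum_{(x, y) \in MT(P)} d(x, y)$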

**Cost Notations of Optimal Placement of k Proxies**

### To recursively define the placement of proxies in a tree, the server at the root is regarded as one of the proxies.

### For $v \in T_s$ and $u \in T_v$, define $C(T_v, t)$ (or $C(R_{v,u}, t)$) as the minimal access cost by placing $t$ proxies in $T_v$ (or $R_{v,u}$).

**Dynamic Programming**

### When $t = 1$, we always place the only proxy at root $v$. So, $C(T_v, 1) = h \sum_{x \in T_v} r(x)\, d(x, v)$.

### When $t > 1$, we can always find a node $u$, $u \in T_v$ and $u \neq v$, which satisfies:

### (1) a proxy is placed at $u$

### (2) no proxy is placed in $L_{v,u}$, the part of $T_v$ to the left of $u$ ($L_{v,u}$ could be empty)

### (3) no proxy is placed on the path between $v$ and $u$, excluding $v$ and $u$ themselves.

**Dynamic Programming (cont.)**

### When $t > 1$, assume $t'$ proxies are placed in $T_u$, $1 \le t' \le t - 1$. So $t - t'$ proxies will be placed in $R_{v,u}$.

### Therefore, $C(T_v, t)$ is the minimum, over all such $u$ and $t'$, of the combined cost of the three parts $L_{v,u}$, $T_u$ and $R_{v,u}$, where the cost of the proxy-free part is $h \sum_{x \in L_{v,u}} r(x)\, d(x, v)$.

**Dynamic Programming (cont.)**

### $C(T_v, t)$ and $C(R_{v,u}, t)$ are each defined by a two-case recurrence: one case for $t = 1$ and one case for $t > 1$.

**Algorithm 1**

**Tracing the Placement of k Proxies**

### A 2-dimensional array records the minimal cost of $T_v$ by placing $t$ proxies in it.

### A second array records the partitioning point of $T_v$, say $u$, and the number of proxies placed in $T_u$.

### The final result of proxy placement is derived from these two arrays.
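### On small trees, a dynamic programming implementation can be cross-checked against exhaustive search. The sketch below is plain brute force, not the paper's algorithm, and every name and number in it is an assumption for illustration. Here `k` counts proxies placed in addition to the server, and the cost follows the read-plus-multicast-update model with a common hit ratio.

```python
from itertools import combinations

def cost(parent, link_dist, read_freq, proxies, server, hit_ratio, update_freq):
    """Read cost plus multicast update cost of one proxy set (server included)."""
    proxies = set(proxies) | {server}
    total, multicast_links = 0.0, set()
    for v, r in read_freq.items():
        to_proxy, to_server, seen_proxy = 0.0, 0.0, False
        while v != server:
            seen_proxy = seen_proxy or v in proxies
            if not seen_proxy:
                to_proxy += link_dist[v]  # still below the first proxy
            to_server += link_dist[v]
            v = parent[v]
        total += r * (hit_ratio * to_proxy + (1 - hit_ratio) * to_server)
    for p in proxies:
        while p != server:
            multicast_links.add(p)  # a link is named by its child endpoint
            p = parent[p]
    return total + update_freq * sum(link_dist[v] for v in multicast_links)

def best_placement(parent, link_dist, read_freq, server, k, hit_ratio, update_freq):
    """Try every set of k non-server nodes; return (minimal cost, proxy set)."""
    best_cost, best_set = None, None
    for chosen in combinations(sorted(parent), k):
        c = cost(parent, link_dist, read_freq, chosen, server, hit_ratio, update_freq)
        if best_cost is None or c < best_cost:
            best_cost, best_set = c, set(chosen)
    return best_cost, best_set

# Hypothetical tree: s - a - b - c, with e hanging off a.
parent = {"a": "s", "b": "a", "c": "b", "e": "a"}
link_dist = {"a": 1, "b": 2, "c": 1, "e": 3}
read_freq = {"a": 1, "b": 2, "c": 4, "e": 1}

best_cost, best_set = best_placement(parent, link_dist, read_freq, "s", 1, 0.5, 1)
# best_cost is 21.0 with best_set {"b"}
```

### The search visits every combination of `k` nodes, so it is only practical for very small trees, which is exactly why a polynomial-time dynamic program matters.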

**Time Complexity of Algorithm 1**

### The two procedures of Algorithm 1 compute the arrays that hold $C(T_v, t)$ and $C(R_{v,u}, t)$, respectively.

### For a tree with $n$ nodes, the two arrays have $O(nk)$ and $O(n^2 k)$ different entries, respectively.

### It takes at most $O(nk)$ time to compute an entry.

### So it takes $O(n^3 k^2)$ time to compute.

**Outline**

## Optimal Number of Proxies

**Cost Notations of Optimal Number of Proxies**

### We want to find the optimal number of proxies required in the system to make the total cost minimal.

### Define $C(T_v)$ as the minimal total cost of $T_v$ when the number of proxies is unconstrained.

### Considering the optimal placement of proxies in $T_v$, there are always two choices: placing no proxy in $T_v$ (except $v$) or placing some proxies in $T_v$ (besides $v$).

### Let $C_0(T_v)$ denote the total cost in $T_v$ with no proxy placed in it and $C_1(T_v)$ the total cost with some proxies in $T_v$, so that $C(T_v) = \min\{C_0(T_v), C_1(T_v)\}$.

### $C_0(T_v) = h \sum_{x \in T_v} r(x)\, d(x, v)$
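### A short worked inequality may help show why adding proxies does not always pay off. The symbols are assumptions introduced for illustration: $r(x)$ is the read frequency of node $x$, $h$ the common hit ratio, $w$ the update frequency of the server, $u$ a candidate node whose subtree $T_u$ holds no proxy yet, and $q$ the nearest proxy (or the server) above $u$.

```latex
\begin{align*}
\text{read saving} &= h \, d(u, q) \sum_{x \in T_u} r(x), \\
\text{extra update cost} &\le w \, d(u, q), \\
\text{so a proxy at } u \text{ lowers the total cost whenever}\quad
  h \sum_{x \in T_u} r(x) &> w .
\end{align*}
```

### The bound on the update cost is tight when no link on the path from $q$ to $u$ already carries multicast updates. Frequently updated content (large $w$) therefore supports fewer proxies.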

**Dynamic Programming**

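### The cost with some proxies in $T_v$ is computed with the same partitioning of $T_v$ by a dividing point $u$ as in the placement of $k$ proxies, minimizing over $u$ only because the number of proxies is no longer fixed.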

**Algorithm 2**

**Time Complexity of Algorithm 2**

### The two procedures of Algorithm 2 compute the cost arrays for $T_v$ and $R_{v,u}$, respectively.

### For a tree with $n$ nodes, the two arrays have $O(n)$ and $O(n^2)$ different entries, respectively.

### It takes at most $O(n)$ time to compute an entry.

### So it takes $O(n^3)$ time to compute.