Placement of Web-Server Proxies with Consideration of Read and Update
Operations on the Internet
Xiaohua Jia, Deying Li, Xiaodong Hu, Weili Wu and Dingzhu Du
THE COMPUTER JOURNAL, Vol.46, No.4, 2003
Outline
Introduction
Problem Formulation
Optimal Placement of k Proxies
Optimal Number of Proxies
Caching
Caching is a technology to alleviate traffic congestion and improve the response time of Web servers.
There are basically two types of Web caching:
client-based caching
server-based caching
Two Problems of the Placement of Proxies
1. Given $k$ proxies, find the optimal placement of the proxies in the network, such that the overall access cost (including both read and update costs) is minimized.
2. For an unconstrained number of proxies, find the optimal number of proxies and their placement, such that the overall access cost is minimized.
The authors formulate both problems using dynamic programming and obtain optimal solutions in polynomial time.
Problem Formulation
Notations and Definitions
The network is modeled by a connected graph $G = (V, E)$. For a link $(u, v) \in E$, $d(u, v)$ is the distance of the link.
The Web server is $s$.
Each node $v \in V$ is associated with a non-negative number $r(v)$, which is the access frequency to $s$.
$r(v)$ can be the number of accesses during a certain period of time, combined with the data size retrieved from the server.
Server $s$ has an update frequency $w$, which is the number of update operations (combined with the updated data size) during a certain period of time.
Stable Routing
The implied routing method in the Internet is the shortest path routing.
It always takes the shortest paths to access data.
If the routing is stable, the routes to all clients as viewed by $s$ form a shortest path tree, where the root of the tree is $s$.
Let $T_s$ denote such a tree induced from the original network graph.
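The shortest path tree described above can be sketched with Dijkstra's algorithm. This is only an illustration: the function name `shortest_path_tree`, the adjacency-dict representation, and the four-node example network with its distances are assumptions, not taken from the paper.

```python
import heapq

def shortest_path_tree(graph, server):
    """Return (parent, dist) for the shortest path tree rooted at `server`.

    `graph` maps each node to a dict {neighbor: link distance}.
    """
    dist = {server: 0}
    parent = {server: None}
    heap = [(0, server)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, a shorter route was already found
        for v, length in graph[u].items():
            nd = d + length
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u  # v is reached through u on its shortest route
                heapq.heappush(heap, (nd, v))
    return parent, dist

# Hypothetical network; node names and distances are made up for illustration.
graph = {
    "s": {"a": 1, "b": 4},
    "a": {"s": 1, "b": 2, "c": 5},
    "b": {"s": 4, "a": 2, "c": 1},
    "c": {"a": 5, "b": 1},
}
parent, dist = shortest_path_tree(graph, "s")
```

In this example the tree is the chain s, a, b, c: the direct link from the server to b (distance 4) is not used because the route through a is shorter (distance 3).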
Update Models
There are basically two models for $s$ to transmit updated data to the proxies: the unicast model and the multicast model.
In this paper, the multicast model is used.
Notations and Definitions (Cont.)
Let $\pi(u, v)$ be the path connecting $u$ and $v$ in $T_s$. Then, $d(u, v) = \sum_{(x, y) \in \pi(u, v)} d(x, y)$.
Let $p(v)$ denote the first proxy that is met while going from client $v$ to $s$ along the tree $T_s$.
Suppose the hit ratio of a proxy is $\rho$. So, a fraction $\rho$ of a client’s requests can be served by a proxy if the proxy is met, and the remaining $1 - \rho$ of the requests will be served by the server.
Here, the authors assume the upstream flow from a proxy has a hit ratio of zero (i.e. a request will be served by the server if it is missed by a proxy).
The hit ratio is an average value over many accesses to a proxy, measured over a long period of time.
They assume the hit ratios of all proxies and for all clients are the same (i.e. a single constant $\rho$).
Cost
The cost for client $v$ to access $s$ is $r(v)\, d(v, s)$. The total cost for all clients in $V$ to access $s$ is $\sum_{v \in V} r(v)\, d(v, s)$.
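Let $P$ denote a set of proxies in the network and $T_P$ the shortest path tree rooted from $s$ to reach all proxies in $P$.
The overall cost to update the proxies is
$$w \sum_{(x, y) \in T_P} d(x, y)$$
The total cost of all clients in $V$ to access $s$ with a set of proxies $P$ is
$$\mathrm{Cost}(V, P) = \sum_{v \in V} r(v) \left[ \rho\, d(v, p(v)) + (1 - \rho)\, d(v, s) \right] + w \sum_{(x, y) \in T_P} d(x, y)$$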
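As a rough illustration of the cost model, the sketch below evaluates the read cost plus the multicast update cost for a given proxy set on a tree. The function name `total_cost`, the parent-pointer representation, and all numbers in the example are assumptions made for this sketch. A request that meets no proxy on its way to the root is charged the full distance to the server.

```python
def total_cost(parent, link_dist, freq, proxies, hit_ratio, update_freq, server):
    """Read cost plus multicast update cost for a given proxy set.

    parent[v]    : parent of v in the tree rooted at `server` (None for the root)
    link_dist[v] : distance of the link (v, parent[v])
    freq[v]      : access frequency of client v
    """
    def dist_to_server(v):
        d = 0
        while parent[v] is not None:
            d += link_dist[v]
            v = parent[v]
        return d

    def dist_to_first_proxy(v):
        # Walk toward the root until a proxy (or the server itself) is met.
        d = 0
        while v not in proxies and v != server:
            d += link_dist[v]
            v = parent[v]
        return d

    read = sum(
        freq[v] * (hit_ratio * dist_to_first_proxy(v)
                   + (1 - hit_ratio) * dist_to_server(v))
        for v in parent
    )

    # Multicast update: each tree link on a path from the server to some
    # proxy is counted exactly once.
    update_links = set()
    for p in proxies:
        v = p
        while parent[v] is not None and v not in update_links:
            update_links.add(v)  # v identifies the link (v, parent[v])
            v = parent[v]
    update = update_freq * sum(link_dist[v] for v in update_links)
    return read + update

# Hypothetical chain s - a - b - c; all numbers are made up for illustration.
parent = {"s": None, "a": "s", "b": "a", "c": "b"}
link_dist = {"a": 1, "b": 2, "c": 1}
freq = {"s": 0, "a": 2, "b": 1, "c": 3}
no_proxy = total_cost(parent, link_dist, freq, set(), 0.5, 1, "s")  # 17.0
with_b = total_cost(parent, link_dist, freq, {"b"}, 0.5, 1, "s")    # 14.0
```

Placing a proxy at b lowers the read cost from 17 to 11 but adds an update cost of 3, so the total falls to 14. With a higher update frequency the same proxy would stop paying off.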
Optimal Placement of Proxies
The Meaning of Left
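Let $T_v$ denote the subtree of $T_s$ rooted at a node $v$, $v \in V$.
Given $x$ and $y$ in $T_v$, $x$ is said to be to the left of $y$ if there exist $u$ and $u'$ such that $x \in T_u$, $y \in T_{u'}$, and $u$ and $u'$ are siblings with $u$ being to the left of $u'$.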
Partitioning of Subtree
can be partitioned into three subgraphs by dividing point ,
.
Partitioning of
and
is to the left of
.
is partitioned into three parts:
,
and
.
Simplicity of Notation
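The cost function can be rewritten as:
$$\mathrm{Cost}(V, P) = \sum_{v \in V} r(v) \left[ \rho\, d(v, p(v)) + (1 - \rho)\, d(v, s) \right] + w \sum_{(x, y) \in T_P} d(x, y)$$
$$= \rho \sum_{v \in V} r(v)\, d(v, p(v)) + (1 - \rho) \sum_{v \in V} r(v)\, d(v, s) + w \sum_{(x, y) \in T_P} d(x, y)$$
For simplicity of notation, omit the second term, which does not depend on the placement of proxies:
$$\mathrm{Cost}(V, P) = \rho \sum_{v \in V} r(v)\, d(v, p(v)) + w \sum_{(x, y) \in T_P} d(x, y)$$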
Cost Notations of Optimal Placement of Proxies
To recursively define the placement of proxies in a tree, the server at the root is regarded as one of the proxies.
For
, define
(or
) is the minimal access cost by placing
proxies in
(or
).
Dynamic Programming
When
, we always place the only proxy at root
. So,
P
.
When
, we can always find a node
,
and
, which satisfies:
(1) a proxy is placed at
(2) no proxy is placed in
(
could be empty) (3) no proxy is placed in
.
Dynamic Programming (cont.)
When
, assume proxies are placed in
,
. So
proxies will be placed in
.
Therefore, we have
, where
P
.
Dynamic Programming (cont.)
[Equations: two piecewise recurrences, each defined by two cases.]
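For small trees, the result of a dynamic program for a fixed number of proxies can be cross-checked against exhaustive search. The sketch below is a brute-force baseline, not the dynamic programming algorithm of the paper: it tries every set of non-root nodes and evaluates a simplified cost (hit-weighted read distance to the first proxy plus multicast update cost). The function names, the convention that the root counts as one of the `k` proxies, and the example data are assumptions.

```python
from itertools import combinations

def placement_cost(parent, link_dist, freq, proxies, hit_ratio, update_freq):
    """Simplified cost: hit-weighted read cost to the first proxy plus
    multicast update cost (the proxy-independent term is dropped)."""
    read = 0.0
    for v in parent:
        d, u = 0, v
        # Climb until a proxy or the root (treated as a proxy) is reached.
        while parent[u] is not None and u not in proxies:
            d += link_dist[u]
            u = parent[u]
        read += hit_ratio * freq[v] * d
    links = set()  # tree links used to push updates, each counted once
    for p in proxies:
        while parent[p] is not None and p not in links:
            links.add(p)
            p = parent[p]
    return read + update_freq * sum(link_dist[v] for v in links)

def best_placement(parent, link_dist, freq, k, hit_ratio, update_freq):
    """Try every set of k - 1 non-root nodes (the root counts as a proxy)."""
    root = next(v for v in parent if parent[v] is None)
    candidates = [v for v in parent if v != root]
    return min(
        (placement_cost(parent, link_dist, freq, set(c), hit_ratio, update_freq),
         sorted(c))
        for c in combinations(candidates, k - 1)
    )

# Hypothetical tree: s has children a and b; a has children c and d.
parent = {"s": None, "a": "s", "b": "s", "c": "a", "d": "a"}
link_dist = {"a": 1, "b": 2, "c": 2, "d": 1}
freq = {"s": 0, "a": 1, "b": 5, "c": 3, "d": 2}
best_k2 = best_placement(parent, link_dist, freq, 2, 0.5, 1)
best_k3 = best_placement(parent, link_dist, freq, 3, 0.5, 1)
```

The search is exponential in the number of nodes, so it is only useful as a correctness check on toy inputs.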
Algorithm 1
Tracing the Placement of Proxies
, a 2-dimensional array, records the minimal cost of by placing
proxies in it.
records the partitioning point of - say
- and the number of proxies placed in
.
The final result of proxy placement is derived from arrays
and
.
Time Complexity of Algorithm 1
The work of procedures
and
is to compute results in arrays
and
, respectively.
and
have
and
different entries, respectively.
It takes at most
time to compute an entry.
So it takes
time to compute.
Optimal Number of Proxies
Cost Notations of Optimal Number of Proxies
We want to find the optimal number of proxies required in the system to make the total cost minimal.
Define
Considering the optimal placement of proxies in $T_v$, there are always two choices: placing no proxy in $T_v$ (except at $v$) or placing some proxies in $T_v$ (besides $v$).
Let
denote the total cost in
with no proxy placed in it and
the total cost with some proxies in
.
X
Dynamic Programming
X
X
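The trade-off behind the optimal number of proxies can be illustrated by exhaustive search over proxy sets of every size. Again, this is a brute-force sketch for toy inputs, not the paper's dynamic program. The function names, the simplified cost (read distance to the first proxy weighted by the hit ratio, plus multicast update cost), and the example tree are assumptions.

```python
from itertools import combinations

def simplified_cost(parent, link_dist, freq, proxies, hit_ratio, update_freq):
    """Hit-weighted read cost to the first proxy plus multicast update cost."""
    read = 0.0
    for v in parent:
        d, u = 0, v
        while parent[u] is not None and u not in proxies:
            d += link_dist[u]
            u = parent[u]
        read += hit_ratio * freq[v] * d
    links = set()  # each update link is counted once
    for p in proxies:
        while parent[p] is not None and p not in links:
            links.add(p)
            p = parent[p]
    return read + update_freq * sum(link_dist[v] for v in links)

def best_over_all_sizes(parent, link_dist, freq, hit_ratio, update_freq):
    """Search proxy sets of every size; ties keep the smaller set."""
    root = next(v for v in parent if parent[v] is None)
    candidates = sorted(v for v in parent if v != root)
    best = None
    for size in range(len(candidates) + 1):
        for chosen in combinations(candidates, size):
            cost = simplified_cost(parent, link_dist, freq, set(chosen),
                                   hit_ratio, update_freq)
            if best is None or cost < best[0]:
                best = (cost, list(chosen))
    return best

# Hypothetical tree: s has children a and b; a has children c and d.
parent = {"s": None, "a": "s", "b": "s", "c": "a", "d": "a"}
link_dist = {"a": 1, "b": 2, "c": 2, "d": 1}
freq = {"s": 0, "a": 1, "b": 5, "c": 3, "d": 2}
rare_updates = best_over_all_sizes(parent, link_dist, freq, 0.5, 2)
frequent_updates = best_over_all_sizes(parent, link_dist, freq, 0.5, 10)
```

With an update frequency of 2 the cheapest choice in this example is two proxies, at a and b. With an update frequency of 10 every proxy costs more in updates than it saves in reads, so the best choice is to place none beyond the server itself.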
Algorithm 2
Time Complexity of Algorithm 2
Procedures
and
compute arrays
and
, respectively.
and
have
and
different entries, respectively.
It takes at most
time to compute an entry.
So it takes