Hardness of Packing and Placement for Direct Mapped Cache

CHAPTER 4 PRACTICAL APPROACHES

4.1 Hardness of Packing and Placement for Direct Mapped Cache

Section 3.3 analyzes the properties of the packing and placement problem for the K-set direct mapped cache. It proposes a method that transforms the object access trace into a graph by using the Degree-2 trace information. That graph expresses the relations among objects, memory blocks, and sets. The temporal relations among entities classify the edges in the graph into three types (Type-I, Type-B, and Type-S). The goal of the packing and placement problem is to create a memory layout that minimizes cache misses when the same object access trace is replayed. Because the trace information is pair-wise, we derive the following formula to estimate cache misses:

|BT| - ( Length(Type-I-Edges) + Length(Type-S-Edges) ) (4.1)

The length of the block access trace |BT| is a constant in this formula, whereas the lengths of the Type-I and Type-S edges are determined by the object layout. In other words, maximizing the sum of these lengths minimizes the estimated cache misses. Because |BT| equals the total length of all three edge types, minimizing the total length of all Type-B edges is the dual problem of Equation (4.1), as the following derivation shows.

|BT| - ( Length(Type-I-Edges) + Length(Type-S-Edges) )

= ( Length(Type-I-Edges) + Length(Type-B-Edges) + Length(Type-S-Edges) ) - ( Length(Type-I-Edges) + Length(Type-S-Edges) )

= Length(Type-B-Edges) (4.2)
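Identity (4.2) can be checked numerically on a toy instance. The sketch below is only an illustration. It assumes that a Type-I edge joins two objects in the same memory block, a Type-S edge joins objects in different sets, and a Type-B edge joins objects in different blocks of the same set (Section 3.3 gives the precise definitions). The helper name `classify_edges`, the weights, and the layout are hypothetical.

```python
from collections import defaultdict

def classify_edges(weights, layout):
    """Sum pair-wise edge lengths per edge type.

    weights: {(obj_a, obj_b): length}      toy Degree-2 trace information
    layout:  {obj: (set_index, block_index)}
    """
    totals = defaultdict(int)
    for (a, b), length in weights.items():
        set_a, block_a = layout[a]
        set_b, block_b = layout[b]
        if set_a != set_b:
            totals["S"] += length      # assumed Type-S: different sets
        elif block_a == block_b:
            totals["I"] += length      # assumed Type-I: same block
        else:
            totals["B"] += length      # assumed Type-B: same set, different blocks
    return totals

# Hypothetical data: four objects, two sets.
weights = {("a", "b"): 5, ("a", "c"): 2, ("b", "c"): 4, ("c", "d"): 3}
layout = {"a": (0, 0), "b": (0, 0), "c": (0, 1), "d": (1, 0)}

totals = classify_edges(weights, layout)
bt_length = sum(weights.values())              # |BT| as substituted in Equation (4.2)
estimate = bt_length - (totals["I"] + totals["S"])
print(totals["I"], totals["S"], totals["B"], estimate)
```

For this layout the estimate from Equation (4.1) coincides with the Type-B total, as Equation (4.2) states.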

Therefore, the packing and placement problem can be defined as follows.

Definition 4.1. Consider a K-set direct mapped cache and a set of objects to be allocated in memory, defined as O = {o₁, o₂, o₃, …}. The memory space is partitioned into K disjoint sets of memory blocks. A set s_i = {b_i,1, b_i,2, b_i,3, …} is a collection of memory blocks, where each b_i,j denotes a memory block belonging to the i-th set s_i. The size of each memory block b_i,j is M. The purpose is to find a legal mapping function f_pp(o_i) → b_r,t that assigns each object to a memory block in a specific set. The mapping must satisfy the condition that the total size of the objects assigned to any memory block b_r,t does not exceed M, while minimizing the following cost function:

∑ w(o_i, o_j), taken over all object pairs with f_pp(o_i) = b_r,t, f_pp(o_j) = b_r,u, and t ≠ u.

In the last equation, w(o_i, o_j) is the value from the Degree-2 trace information or, equivalently, the length of the Type-B edge between o_i and o_j.
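Definition 4.1 can be made concrete with a brute-force search on a tiny instance. The following is only a sketch under stated assumptions: the legality condition is taken to be that the object sizes in a block sum to at most M, and the cost is taken to be the sum of w over pairs placed in the same set but in different blocks. The object sizes, the number of blocks per set, and the helper names `is_legal`, `cost`, and `best_layout` are hypothetical, and exhaustive enumeration is used purely for illustration, not as a proposed method.

```python
from itertools import product

def is_legal(layout, sizes, M):
    """Assumed block-size condition: objects sharing a block must fit in M."""
    used = {}
    for obj, slot in layout.items():
        used[slot] = used.get(slot, 0) + sizes[obj]
    return all(total <= M for total in used.values())

def cost(layout, weights):
    """Sum w(o_i, o_j) over pairs in the same set but different blocks."""
    total = 0
    for (a, b), w in weights.items():
        (set_a, blk_a), (set_b, blk_b) = layout[a], layout[b]
        if set_a == set_b and blk_a != blk_b:
            total += w
    return total

def best_layout(sizes, weights, K, blocks_per_set, M):
    """Enumerate every mapping of objects to (set, block) slots."""
    objs = sorted(sizes)
    slots = [(s, b) for s in range(K) for b in range(blocks_per_set)]
    best = None
    for choice in product(slots, repeat=len(objs)):
        layout = dict(zip(objs, choice))
        if not is_legal(layout, sizes, M):
            continue
        c = cost(layout, weights)
        if best is None or c < best[0]:
            best = (c, layout)
    return best

# Hypothetical instance: each block holds exactly one object (size 2, M = 2).
sizes = {"a": 2, "b": 2, "c": 2, "d": 2}
weights = {("a", "b"): 5, ("a", "c"): 2, ("b", "c"): 4, ("c", "d"): 3}
min_cost, layout = best_layout(sizes, weights, K=2, blocks_per_set=2, M=2)
print(min_cost, layout)
```

The search space grows as (number of slots) to the power of the number of objects, so this enumeration is feasible only for toy inputs.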

Next, we show that finding an optimal solution to this problem is at least as hard as solving the MIN k-PARTITION problem, which is the dual problem of MAX k-CUT [43].

Consider a graph G₁ = (V, E) divided into K partitions, where |V(G₁)| = Q and each vertex is associated with the value K. The numbers of vertexes in the partitions are denoted as (n₁, n₂, …, n_K), and the vertex set is written as V = {v_r,j | 1 ≤ r ≤ K, 1 ≤ j ≤ n_r}, where v_r,j is the j-th vertex of the r-th partition. In other words, the vertex subset {v_r,1, …, v_r,n_r} contains the vertexes belonging to the r-th partition. Figure 4.1 shows an example of G₁, in which the dashed lines divide the graph into partitions. We use different notations to distinguish edges within and across partitions: p(v_r,h, v_r,s) denotes the length of an edge inside the r-th partition, and w(v_r,h, v_q,s) denotes the length of an edge across two distinct partitions r and q.

Since the graph G₁ is assumed to satisfy the conditions of MIN k-PARTITION, the total length of the edges within partitions, ∑p(u,v), is the minimum over all possible K-partitions of G₁. Meanwhile, the condition ∑w(x,y) > ∑p(u,v) holds.
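The two sums are complementary: for any partition, ∑p(u,v) + ∑w(x,y) equals the total edge length, which is why minimizing the intra-partition sum (MIN k-PARTITION) amounts to maximizing the inter-partition sum (MAX k-CUT). A minimal sketch on a hypothetical four-vertex graph with K = 2 follows; the helper `partition_sums` and the edge lengths are made up for illustration.

```python
from itertools import product

def partition_sums(edges, part_of):
    """Return (sum of intra-partition lengths, sum of inter-partition lengths)."""
    intra = sum(length for (u, v), length in edges.items()
                if part_of[u] == part_of[v])
    inter = sum(length for (u, v), length in edges.items()
                if part_of[u] != part_of[v])
    return intra, inter

# Hypothetical graph: a triangle v1-v2-v3 plus a pendant vertex v4.
edges = {("v1", "v2"): 1, ("v2", "v3"): 6, ("v1", "v3"): 2, ("v3", "v4"): 7}
vertices = ["v1", "v2", "v3", "v4"]
K = 2
total = sum(edges.values())

results = []
for labels in product(range(K), repeat=len(vertices)):
    sum_p, sum_w = partition_sums(edges, dict(zip(vertices, labels)))
    assert sum_p + sum_w == total      # the two sums always add up to the total
    results.append((sum_p, sum_w, labels))

min_p, w_at_min_p, best_labels = min(results)
max_w = max(w for _, w, _ in results)
print(min_p, w_at_min_p, max_w, best_labels)
```

In this toy graph the partition with the smallest ∑p is also the one with the largest ∑w, and it satisfies ∑w(x,y) > ∑p(u,v).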

Next, we create a mapping F(v) that transforms G₁=(V,E) into G₂=(V’,E’), where G₂ is an instance of a restricted version of the packing and placement problem for the direct mapped cache. The mapping F(v) works in the following way.

∀ v_i,j ∈ V, F(v_i,j) = {v’_i,j,1, …, v’_i,j,K}, where v’_i,j,t ∈ V’. (4.4)

The mapping evenly splits every vertex v_i,j into K fractional vertexes. The value is evenly distributed among the fractions as well; that is, the value of each fraction v’_i,j,t is K/K = 1. As a result, we get a transformed vertex set, written as V’ = {v’_i,j,t | v_i,j ∈ V, 1 ≤ t ≤ K}. The fractional vertexes {v’_i,j,1, …, v’_i,j,K} of each original vertex are connected to each other and form a complete graph K_K. Therefore, K(K−1)/2 edges per original vertex are appended to the edge set E’(G₂). The edge length h is given to all edges of this kind, where h = ∑w(x,y) + ∑p(u,v) is the total length of all edges in G₁, both inter-partition and intra-partition. This ensures that h is at least as large as any edge length in E(G₁). The fractional vertex v’_i,j,1 takes over the role of v_i,j, and the edges connected to v_i,j are re-attached to v’_i,j,1 correspondingly. Therefore, the edge set E’(G₂) consists of the re-attached edges of E(G₁) together with the K(K−1)/2 edges of length h inside each group of fractional vertexes.

For example, the graph G₂ in Figure 4.2 is constructed from the graph G₁ in Figure 4.1 using the method described above. Each sub-graph enclosed by a shaded area in G₂ is expanded from a single vertex in G₁.
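The construction of G₂ from G₁ can be sketched in a few lines. This is a minimal illustration, assuming that G₁ is given as a dictionary of edge lengths and that fraction t of vertex v is represented by the tuple (v, t); the function name `transform` and the sample graph are hypothetical.

```python
from itertools import combinations

def transform(edges, vertices, K):
    """Build G2 from G1 by splitting each vertex into K unit-size fractions.

    edges:    {(u, v): length} for G1
    vertices: list of G1 vertex names
    Returns (sizes, new_edges) describing G2.
    """
    h = sum(edges.values())            # h = sum of all edge lengths in G1
    # Each vertex has value K, so each of its K fractions has value K / K = 1.
    sizes = {(v, t): 1 for v in vertices for t in range(1, K + 1)}
    new_edges = {}
    for v in vertices:
        # The K fractions of one vertex form a complete graph with length h.
        for a, b in combinations(range(1, K + 1), 2):
            new_edges[((v, a), (v, b))] = h
    for (u, v), length in edges.items():
        # Original edges are re-attached to the first fraction of each endpoint.
        new_edges[((u, 1), (v, 1))] = length
    return sizes, new_edges

# Hypothetical G1 with Q = 3 vertexes, transformed with K = 3.
edges = {("v1", "v2"): 1, ("v2", "v3"): 6, ("v1", "v3"): 2}
sizes, g2_edges = transform(edges, ["v1", "v2", "v3"], K=3)
print(len(sizes), len(g2_edges))
```

With Q = 3 and K = 3, the sketch yields Q·K = 9 fractional vertexes and Q·K(K−1)/2 = 9 clique edges in addition to the 3 re-attached edges.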

Next, we show that the optimal partition of G₁ that satisfies MIN k-PARTITION can be transformed into an optimal layout of G₂ for the K-set packing and placement problem. To this end, G₂ is interpreted as an object access graph: each vertex v’_i,j,t represents an object, its value corresponds to the size of that object, and the length of each edge is given by the Degree-2 trace information.

In addition, the block size constraint is assumed to be K. Since the vertex subset {v_r,1, v_r,2, …, v_r,n_r} belongs to the same partition in G₁, the vertex subset {{v’_r,1,1, …, v’_r,1,K}, …, {v’_r,n_r,1, …, v’_r,n_r,K}} is grouped into the same r-th partition in G₂. Consider the sub-graph enclosed within the r-th partition. An edge connecting v’_r,x,1 and v’_r,y,1 has length p(u,v), which is smaller than h by our scheme. Therefore, every subset {v’_r,t,1, v’_r,t,2, …, v’_r,t,K} can be consolidated into one memory block, and the total size of the objects in that block, K × 1 = K, fulfills the block size requirement. The Type-B edges are then exactly the edges of length p(u,v), and ∑p(u,v) < ∑w(x,y) ≤ h·K(K−1)Q/2 holds. Therefore, the layout satisfies the problem requirements.
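The argument above can be traced end to end on a small instance. The sketch below is a hedged illustration, not the thesis's implementation: it rebuilds G₂ for a hypothetical four-vertex G₁ with K = 2, places all fractions of one original vertex in one memory block of the set given by its partition, and compares the resulting Type-B total with ∑p(u,v). All names and edge lengths are assumptions, and Type-B is taken to mean "same set, different blocks".

```python
from itertools import combinations

K = 2
# Hypothetical G1 and a 2-partition of it (v3 alone in partition 1).
edges = {("v1", "v2"): 1, ("v2", "v3"): 6, ("v1", "v3"): 2, ("v3", "v4"): 7}
part_of = {"v1": 0, "v2": 0, "v3": 1, "v4": 0}
h = sum(edges.values())

# G2: fraction t of vertex v is the object (v, t), each of size 1.
g2_edges = {((u, 1), (v, 1)): length for (u, v), length in edges.items()}
for v in part_of:
    for a, b in combinations(range(1, K + 1), 2):
        g2_edges[((v, a), (v, b))] = h

# Induced layout: partition r becomes set r; the K fractions of v share a block.
block_index = {v: i for i, v in enumerate(part_of)}
layout = {(v, t): (part_of[v], block_index[v])
          for v in part_of for t in range(1, K + 1)}

# Block-size check: every block holds exactly K unit-size objects.
fill = {}
for obj, slot in layout.items():
    fill[slot] = fill.get(slot, 0) + 1
blocks_ok = all(count == K for count in fill.values())

# Type-B total (same set, different blocks) versus the intra-partition sum.
type_b = sum(length for (a, b), length in g2_edges.items()
             if layout[a][0] == layout[b][0] and layout[a][1] != layout[b][1])
sum_p = sum(length for (u, v), length in edges.items()
            if part_of[u] == part_of[v])
print(blocks_ok, type_b, sum_p)
```

The edges of length h never contribute to the Type-B total because both endpoints always share a block, which is the reason h is chosen to dominate the other lengths.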

Consequently, the K-set packing and placement problem is at least as hard as MIN k-PARTITION, and hence as MAX k-CUT. Since no polynomial-time algorithm is known that finds an optimal solution to MIN k-PARTITION, none is known for the K-set packing and placement problem either.

[Figure 4.1: a graph with vertexes v_1,1 through v_4,3, intra-partition edge lengths p1 to p8, and inter-partition edge lengths w1 to w9.]

Figure 4.1. A partitioned graph that satisfies MIN k-PARTITION. The symbols w_i and p_i denote edge lengths.

[Figure 4.2: the transformed graph, with fractional vertexes such as v’_4,3,1.]

Figure 4.2. A sample graph transformed from Figure 4.1.
