Triangle Counting in Large Sparse Graph

Meng-Tsung Tsai r95065@cise.ntu.edu.tw

Problem Setting

Problem Setting (1/3)

Goal:

Calculate the cluster coefficient of a given graph G(V, E), where |V| = n and |E| = m.

The cluster coefficient indicates the probability that a friend of one’s friend is also one’s friend.

The cluster coefficient is one of the important features for examining whether a synthetic (man-made) graph fits a real one.

In terms of graph theory,

$$CC(G) = \frac{3 \times \text{number of triangles in } G}{\text{number of triples in } G}.$$

[Figure: illustration of a triangle and a triple.]

Problem Setting (2/3)

Example:

[Figure: example graph on four vertices.]

number of triangles = 2

number of triples = 8

cluster coefficient = 3 × 2 / 8 = 0.75
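The arithmetic above can be checked with a small brute-force sketch. The edge list `DIAMOND` is an assumption: the slide's drawing is not recoverable, so a four-vertex graph with five edges (a 4-cycle plus one chord) is used here because it has exactly 2 triangles and 8 triples. The function name `cluster_coefficient` is illustrative only.

```python
from itertools import combinations


def cluster_coefficient(n, edges):
    """Return (triangles, triples, CC) for a simple graph on vertices 1..n."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # A triple is a path of two edges centred at some vertex.
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # A triangle is a set of three mutually adjacent vertices.
    triangles = sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    # Assumes the graph has at least one triple (otherwise CC is undefined).
    return triangles, triples, 3 * triangles / triples


# Assumed example graph: 4-cycle 1-2-3-4 plus the chord (2, 4).
DIAMOND = [(1, 2), (2, 3), (3, 4), (1, 4), (2, 4)]
print(cluster_coefficient(4, DIAMOND))  # (2, 8, 0.75)
```

This brute force checks every vertex triple, so it is only meant for verifying small examples, not for large sparse graphs.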

Problem Setting (3/3)

Requirement:

We seek an efficient algorithm that counts the number of triangles using $O(m)$ space and $o(n^3)$ time.

We focus on social network graphs, for which the cluster coefficient is especially important.

In social networks, $m = o(n^2)$ usually holds; that is, the graph is sparse.

Triangle Counting (Trivial Algorithm)

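Trivial Algorithm

[Figure: a path of length two between u and v, plus the edge (u, v), forms a triangle.]

Let M be a matrix such that $M_{i,j}$ is 1 if and only if an edge connects vertices i and j.

Let $M^2$ be $M \cdot M$. What does $(M^2)_{i,j}$ mean? It is the number of paths of length two between i and j.

$$\triangle = \frac{1}{6} \sum_{i,j} (M^2)_{i,j} \cdot M_{i,j}$$

Simple matrix multiplication, the Strassen algorithm, and the Winograd algorithm all require $O(n^2)$ space to obtain $M^2$. Not acceptable!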
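A minimal sketch of the matrix formula, assuming 0-indexed vertices and plain cubic-time multiplication (no Strassen or Winograd). The function name and the edge list are illustrative. The entries of the squared matrix are computed on the fly, but the dense matrix M alone already takes quadratic space, which is the objection raised above.

```python
def count_triangles_matrix(n, edges):
    """Count triangles as (1/6) * sum over i, j of (M^2)[i][j] * M[i][j]."""
    # Dense adjacency matrix: this is where the O(n^2) space goes.
    M = [[0] * n for _ in range(n)]
    for u, v in edges:
        M[u][v] = M[v][u] = 1
    total = 0
    for i in range(n):
        for j in range(n):
            if M[i][j]:
                # (M^2)[i][j] = number of paths of length two from i to j.
                total += sum(M[i][k] * M[k][j] for k in range(n))
    # Each triangle is counted once per ordered pair of its vertices: 6 times.
    return total // 6


# Assumed example: four vertices, five edges, two triangles.
EDGES = [(0, 1), (1, 2), (0, 3), (1, 3), (2, 3)]
print(count_triangles_matrix(4, EDGES))  # 2
```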

Triangle Counting (Forward Algorithm)

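Forward Algorithm (1/2)

[Figure: the example graph with vertices indexed 1, 2, 3, 4. Each vertex v is annotated with the set $N_v$ of its lower-indexed neighbors.]

$N_1 = \{\}$, $N_2 = \{1\}$, $N_3 = \{2\}$, $N_4 = \{1, 2, 3\}$

$\{1\} \cap \{1, 2, 3\} = \{1\}$

[Figure: the same graph with the vertices re-indexed as 4, 2, 3, 1.]

$N_4 = \{1, 2\}$, $N_2 = \{1\}$, $N_3 = \{1, 2\}$, $N_1 = \{\}$

$$\triangle = \sum_{(u,v) \in E} |N_u \cap N_v|$$

All triangles can be found, and all found objects are triangles.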

Forward Algorithm (2/2)

Assign indices to vertices according to their degree: the higher the degree of a vertex, the lower its index.

If $\deg(v) \le \sqrt{2m}$, then $|N_v| \le \deg(v) \le \sqrt{2m}$.

If $\deg(v) \ge k$, at most $2m/k$ vertices have a higher degree, because the degrees sum to $2m$. Thus $|N_v| \le \sqrt{2m}$ where $\deg(v) \ge \sqrt{2m}$.
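The degree ordering and the edge sum can be sketched as follows. This is a simplified illustration of the formula on the slides, not necessarily how the original Forward Algorithm is implemented. It assumes 0-indexed vertices, breaks degree ties by vertex id, and uses Python sets for the $N_v$. The names `forward_count` and `EDGES` are hypothetical.

```python
def forward_count(n, edges):
    """Return (number of triangles, largest |N_v|) using degree ordering."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Higher degree gets a lower index; ties are broken by vertex id.
    order = sorted(range(n), key=lambda v: (-deg[v], v))
    index = {v: i for i, v in enumerate(order)}
    # lower[v] plays the role of N_v: neighbours of v with a lower index.
    lower = [set() for _ in range(n)]
    for u, v in edges:
        if index[u] < index[v]:
            lower[v].add(u)
        else:
            lower[u].add(v)
    # Sum |N_u ∩ N_v| over all edges.
    triangles = sum(len(lower[u] & lower[v]) for u, v in edges)
    return triangles, max(len(s) for s in lower)


# Assumed example: four vertices, five edges (m = 5), two triangles.
EDGES = [(0, 1), (1, 2), (0, 3), (1, 3), (2, 3)]
print(forward_count(4, EDGES))  # (2, 2)
```

In this example the largest $N_v$ has 2 elements, within the bound $\sqrt{2m} = \sqrt{10} \approx 3.16$.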

Triangle Counting (Four-Russians’ Algorithm)

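Four-Russians’ Algorithm

Each bit vector is split into sectors, and each sector is encoded as an integer (two bits per sector in this example):

{1, 0, 1, 1, . . .}  →  {2, 3, . . .}
{0, 1, 0, 0, . . .}  →  {1, 0, . . .}

A precomputed table gives the inner product of any two sectors:

      0  1  2  3
  0   0  0  0  0
  1   0  1  0  1
  2   0  0  1  1
  3   0  1  1  2

The table used in the Four-Russians’ Algorithm is $2^{\log n}$ by $2^{\log n}$. Thus, its speedup is $O(\log n)$.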
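The table lookup can be sketched in a few lines. The sketch assumes two-bit sectors as in the slide's example, packs the most significant bit first, and uses illustrative names (`build_table`, `to_sectors`, `inner_product`) that do not come from the slides.

```python
SECTOR_BITS = 2  # sector length used in the example above


def build_table(bits):
    """table[a][b] = inner product of sectors a and b = popcount(a AND b)."""
    size = 1 << bits
    return [[bin(a & b).count("1") for b in range(size)] for a in range(size)]


def to_sectors(vector, bits):
    """Pack each run of `bits` entries into one integer (first entry is the high bit)."""
    assert len(vector) % bits == 0
    sectors = []
    for i in range(0, len(vector), bits):
        value = 0
        for bit in vector[i:i + bits]:
            value = (value << 1) | bit
        sectors.append(value)
    return sectors


def inner_product(x, y, table, bits):
    """Inner product of two 0/1 vectors using one table lookup per sector."""
    return sum(table[a][b] for a, b in zip(to_sectors(x, bits), to_sectors(y, bits)))


TABLE = build_table(SECTOR_BITS)
print(to_sectors([1, 0, 1, 1], SECTOR_BITS))  # [2, 3]
print(to_sectors([0, 1, 0, 0], SECTOR_BITS))  # [1, 0]
print(inner_product([1, 0, 1, 1], [1, 1, 0, 1], TABLE, SECTOR_BITS))  # 2
```

With sectors of b bits the table has $2^b \times 2^b$ entries and each lookup replaces b multiply-add steps, which is the trade-off between table space and speedup.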

Triangle Counting (FFR Algorithm)

FFR Algorithm

The intersection term $|N_u \cap N_v|$ (highlighted in red on the slide) of

$$\triangle = \sum_{(u,v) \in E} |N_u \cap N_v|$$

in the Forward Algorithm can be sped up with the Four-Russians’ Algorithm.

Let the length of sectors be $\frac{1}{2} \log m$; the additional space for the table is $\Theta(m)$.

The number of non-all-zero sectors in $N_v$ is $O(\sqrt{m/\log m})$ where $\deg(v) \le \sqrt{m/\log m}$.

The number of non-all-zero sectors in $N_v$ is $O(\sqrt{m/\log m})$ where $\deg(v) \ge \sqrt{m/\log m}$.

FFR needs $O(m^{3/2}/\log^{1/2} m)$ time.
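The stated bounds can be checked with a short calculation. It assumes base-2 logarithms, that each table entry fits in O(1) machine words, and that one intersection costs time proportional to the number of non-all-zero sectors. These assumptions are not spelled out on the slide.

```latex
\begin{align*}
\text{table entries} &= 2^{\frac{1}{2}\log m} \cdot 2^{\frac{1}{2}\log m}
                      = 2^{\log m} = m, \\
\text{time} &= \sum_{(u,v) \in E} O\!\left(\sqrt{m/\log m}\right)
             = m \cdot O\!\left(\sqrt{m/\log m}\right)
             = O\!\left(\frac{m^{3/2}}{\log^{1/2} m}\right).
\end{align*}
```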

CPU Instruction versus Memory Access

Instruction versus Memory (1/3)

The inner product in the Four-Russians’ Algorithm can be accomplished by two CPU instructions. CPU instructions are known to execute much faster than memory accesses.

"logical and": $C = A \mathbin{\mathring{\wedge}} B$, where $C_i = \min(A_i, B_i)$

"population count": $d = \mathring{\sigma} A$, where $d = \sum_{i=1}^{g} A_i$
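As a sketch of the two-instruction inner product, the snippet below stores a neighbor set as the bits of a Python integer. Python has no direct access to the hardware instruction, so `bin(x).count("1")` stands in for the population count. The helper names are hypothetical.

```python
def popcount(x):
    """Stand-in for a hardware population-count instruction."""
    return bin(x).count("1")


def to_mask(members):
    """Encode a set of small non-negative integers as a bit mask."""
    mask = 0
    for i in members:
        mask |= 1 << i
    return mask


def intersection_size(a_mask, b_mask):
    """One "logical and" followed by one "population count"."""
    return popcount(a_mask & b_mask)


print(intersection_size(to_mask({1}), to_mask({1, 2, 3})))  # 1
```

On real hardware the two steps map to a bitwise AND and a population-count instruction over one register-sized sector, with no table lookup in memory.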

Instruction versus Memory (2/3)

[Figure: wall time (seconds per 10,000 runs) for ALGO 5, ALGO 2 with p = 8, and ALGO 2 with p = 16.]

[Figure: wall time (seconds per 10,000 runs) versus bit density (x out of 64 bits are 1) for ALGO 2 with p = 8, p = 16, and p = 22.]

Instruction versus Memory (3/3)

CPU instructions can handle sectors of size g, where g is the length of a CPU register.

Is g a constant in the analysis of algorithms?

Are all instructions O(1)-executable?

Is g a constant?

Assume a program is executed on M, a random access machine, using $\Theta(S)$ memory space.

$\Theta(S)$ memory addresses are required.

A register must be able to hold any of these addresses, so the length of the registers in M is $\Omega(\log S)$.

Are all instructions O(1)-executable?

$AC^0$ instructions are those that can be realized by a circuit of polynomial size and constant depth.

Multiplication is not an $AC^0$ instruction.

To access a multi-dimensional array in constant time, multiplication must be executable in constant time.

We suggest that instructions which can be implemented faster than multiplication be treated as constant-time executable.

Population Count

Population Count (1/3)

$\mathring{\sigma}$ is not supported by all types of CPU.

Any alternative way?

Previous work shows a bit-twiddling method to realize the population count. The method needs $O(\log^{(2)} g)$ basic instructions. Hence, the speedup is $O(g^{1/2}/\log^{(2)} g) = \Omega(\log^{1/2} m/\log^{(3)} m)$ due to $g = \Omega(\log m)$.

Any faster solution?
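For illustration, here is one well-known bit-twiddling population count for 64-bit words, built only from shifts, masks, additions, and one subtraction. It is an assumption that this resembles the method the slide refers to. This variant uses a number of steps logarithmic in the word length, so it may not match the instruction count cited above.

```python
MASK64 = (1 << 64) - 1


def popcount64(x):
    """Count the 1 bits of a 64-bit word without a popcount instruction."""
    x &= MASK64
    # Each 2-bit field now holds the number of 1s among its two bits.
    x = x - ((x >> 1) & 0x5555555555555555)
    # Each 4-bit field holds the count for its four bits.
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    # Each byte holds the count for its eight bits (at most 8).
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    # Fold the byte counts together; the total (at most 64) lands in the low byte.
    x = x + (x >> 8)
    x = x + (x >> 16)
    x = x + (x >> 32)
    return x & 0x7F


print(popcount64(0b1011))  # 3
```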

Population Count (2/3)

    { 1 1 0 0 }
    { 1 0 1 0 }
  + { 1 1 0 0 }
  = 2^0 × { 1 0 1 0 } + 2^1 × { 1 1 0 0 }

Use this method to reduce $2^d - 1$ population counts ($\mathring{\sigma}$) to $d$. The speedup is $\Omega(\log^{1/2} m/\log^{(4)} m)$.
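The three-to-two reduction in the example (the case d = 2) can be sketched as a bitwise full adder: the position-wise sum bits carry weight $2^0$ and the carry bits carry weight $2^1$. The function names are hypothetical, and `bin(x).count("1")` again stands in for the population-count instruction.

```python
def popcount(x):
    """Stand-in for a hardware population-count instruction."""
    return bin(x).count("1")


def compress3(a, b, c):
    """Bitwise full adder: return (sum bits, carry bits) of three words."""
    sum_bits = a ^ b ^ c
    carry_bits = (a & b) | (a & c) | (b & c)
    return sum_bits, carry_bits


def popcount3(a, b, c):
    """popcount(a) + popcount(b) + popcount(c) using two population counts."""
    sum_bits, carry_bits = compress3(a, b, c)
    return popcount(sum_bits) + 2 * popcount(carry_bits)


print(compress3(0b1100, 0b1010, 0b1100) == (0b1010, 0b1100))  # True
print(popcount3(0b1100, 0b1010, 0b1100))  # 6
```

Applying the same compression repeatedly is one way to turn $2^d - 1$ words into $d$ weighted words, at the cost of a few extra AND, OR, and XOR instructions per step.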

Instruction versus Memory (2/3)

[Figure: elapsed wall time (seconds) versus rewiring probability for ALGO 3, ALGO 7[12 <- ALGO 10], and ALGO 7[12 <- ALGO 12].]

[Figure: speedup relative to ALGO 3 (%) for ALGO 7[12 <- ALGO 10] and ALGO 7[12 <- ALGO 12].]

Conclusion

The previous efficient algorithm, the Forward Algorithm, needs $O(m^{3/2})$ time and $O(m)$ space.

To develop algorithms on random access machines, we come up with two arguments.

Based on these arguments, our algorithm has an $\Omega(\log^{1/2} m/\log^{(4)} m)$ speedup.

Though it may be slightly worse than the FFR Algorithm in ...

Future Work

Some graph features may be more suitable to analyze than degeneracy when the algorithm for computing the intersection of two given sets changes.

The same arguments on random access machines can be applied to many other algorithms.

Thanks for your attention!

Any Questions?