Bridging the Gap between von Neumann Graph Entropy and Structural Information: Theory and Applications

Xuecheng Liu

Shanghai Jiao Tong University [email protected]

Luoyi Fu

Shanghai Jiao Tong University [email protected]

Xinbing Wang

Shanghai Jiao Tong University [email protected]

ABSTRACT

The von Neumann graph entropy is a measure of graph complexity based on the Laplacian spectrum. It has recently found applications in various learning tasks driven by networked data. However, it is computationally demanding and hard to interpret using simple structural patterns. Due to the close relation between the Laplacian spectrum and the degree sequence, we conjecture that the structural information, defined as the Shannon entropy of the normalized degree sequence, might be a good approximation of the von Neumann graph entropy that is both scalable and interpretable.

In this work, we therefore study the difference between the structural information and the von Neumann graph entropy, which we name the entropy gap. Based on the knowledge that the degree sequence is majorized by the Laplacian spectrum, we prove for the first time that the entropy gap is between 0 and $\log_2 e$ in any undirected unweighted graph. Consequently, we certify that the structural information is a good approximation of the von Neumann graph entropy that achieves provable accuracy, scalability, and interpretability simultaneously. We further study two entropy based applications which can benefit from the bounded entropy gap and structural information: network design and graph similarity measure. We combine a greedy method and a pruning strategy to develop a fast algorithm for the network design, and propose a novel graph similarity measure with a fast incremental algorithm for graph streams. Our experimental results on graphs of various scales and types show that the very small entropy gap readily applies to a wide range of graphs, including weighted graphs. As an approximation of the von Neumann graph entropy, the structural information is the only one that achieves both high efficiency and high accuracy among the prominent methods. It is at least two orders of magnitude faster than SLaQ [40] with comparable accuracy. Our structural information based methods also exhibit superior performance in two entropy based applications.

ACM Reference Format:

Xuecheng Liu, Luoyi Fu, and Xinbing Wang. 2021. Bridging the Gap between von Neumann Graph Entropy and Structural Information: Theory and Applications. In WWW ’21: Proceedings of The Web Conference 2021, April 19–23, 2021, Ljubljana, Slovenia. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/1122445.1122456

© 2021 Association for Computing Machinery.

ACM ISBN 978-1-4503-XXXX-X/18/06. . . $15.00 https://doi.org/10.1145/1122445.1122456

1 INTRODUCTION

Evidence has rapidly grown in the past few years that graphs are ubiquitous in our daily life; online social networks, metabolic networks, transportation networks, and collaboration networks are just a few examples that could be represented precisely by graphs. One important issue in graph analysis is to measure the complexity of these graphs [4, 28], which refers to the level of organization of the structural features such as the scaling behavior of degree distribution, community structure, etc. In order to capture the inherent structural complexity of graphs, many entropy based graph measures [5, 13, 21, 28, 36, 37] have been proposed, each of which is a specific form of the Shannon entropy for different types of distributions extracted from the graphs.

As one of the aforementioned entropy based graph complexity measures, the von Neumann graph entropy, defined as the Shannon entropy of the spectrum of the trace rescaled Laplacian matrix of a graph (see Definition 3.1), is of special interest to scholars and practitioners [2, 7, 8, 12, 15, 22, 30, 40]. This spectral based entropy measure distinguishes between different graph structures. For instance, it is maximal for complete graphs, minimal for graphs with only a single edge, and takes on intermediate values for ring graphs. The entropy measure originates from quantum information theory, where it is used to describe the mixedness of a quantum system. Braunstein et al. were the first to use the von Neumann entropy to measure the complexity of graphs, by viewing each pure state of a quantum system as one of the edges of a graph [5].

Built upon the Laplacian spectra, the von Neumann graph entropy is a natural choice to capture the graph complexity since the Laplacian spectra are well known to contain rich information about the multi-scale structure of graphs [17, 20]. As a result, it has recently found applications in downstream tasks of complex network analysis and pattern recognition. For example, the von Neumann graph entropy facilitates the measure of graph similarity via Jensen-Shannon divergence, which could be used to compress multilayer networks [15] and detect anomalies in graph streams [7]. As another example, the von Neumann graph entropy could be used to measure edge centrality [30] and design entropy-driven networks [33].

1.1 Motivations

However, despite its popularity in applications, the main obstacle encountered in practice is the computational inefficiency of the exact von Neumann graph entropy. Indeed, as a spectral based entropy measure, the von Neumann graph entropy suffers from computational inefficiency since the computational complexity of the graph spectrum is cubic in the number of nodes. Meanwhile, the existing approximation approaches [7, 8, 40] such as quadratic approximation fail to capture the presence of non-trivial structural patterns that seem to interpret the spectral based entropy measure. Therefore, there is a strong desire to find a good approximation that achieves accuracy, scalability, and interpretability simultaneously.

Figure 1: The close relation between Laplacian spectra and degree sequence in two representative real-world graphs: (a) Zachary’s karate club and (b) Dolphins. Both the Laplacian spectra and degree sequence are sorted in non-increasing order. The x-axis represents the index of the sorted sequences, and the y-axis represents the value of Laplacian spectrum and degree.

Instead of starting from scratch, we are inspired by the well-known fact that there is a close relationship between the combinatorial characteristics of a graph and the algebraic properties of its associated matrices [9]. To illustrate, we plot the Laplacian spectrum and degree sequence together in the same figure for two representative real-world graphs. As shown in Fig. 1, the sorted spectrum sequence and the sorted degree sequence almost coincide with each other. A similar phenomenon can also be observed in larger scale-free graphs, which indicates that it is possible to reduce the approximation of the von Neumann graph entropy to the time-efficient computation of simple node degree statistics. Therefore, we ask without hesitation the first research question,

RQ1: Does there exist some non-polynomial function $\phi$ such that $\sum_{i=1}^{n} \phi\left(d_i / \sum_{j=1}^{n} d_j\right)$ is close to the von Neumann graph entropy?

Here $d_i$ is the degree of the node $i$ in a graph of order $n$.

We emphasize the non-polynomial property of the function $\phi$ since most previous works that are based on polynomial approximations fail to fulfill the interpretability. The challenges from scalability and interpretability are translated directly into two requirements on the function $\phi$ to be determined. First, the explicit expression of $\phi$ must exist and stay simple to ensure the interpretability of the sum over degree statistics. Second, the function $\phi$ should be graph-agnostic to meet the scalability requirement, that is, $\phi$ should be independent of the graph to be analyzed. One natural choice for the non-polynomial function $\phi$, suggested by the entropy nature of the graph complexity measure, is $\phi(x) = -x \log_2 x$.

The sum $-\sum_{i=1}^{n} \frac{d_i}{\sum_{j=1}^{n} d_j} \log_2 \frac{d_i}{\sum_{j=1}^{n} d_j}$ has been named one-dimensional structural information by Li et al. [28] in a connected graph since it has an entropy form and captures the information of a classic random walker in a graph. We extend this notion to arbitrary undirected graphs. Following the question RQ1, we raise the second research question,

RQ2: Is the structural information an accurate proxy of the von Neumann graph entropy?

To address the second question, we conduct what is, to our knowledge, the first study of the difference between structural information and von Neumann graph entropy, which we name the entropy gap.

1.2 Contributions

To study the entropy gap, we build on a fundamental relationship between the Laplacian spectrum $\lambda$ and the degree sequence $d$ in undirected graphs: $d$ is majorized by $\lambda$. In other words, there is a doubly stochastic matrix $P$ such that $P\lambda = d$. Leveraging the majorization and the classic Jensen’s inequality, we prove that the entropy gap is no less than 0 in arbitrary undirected graphs. By exploiting the Jensen’s gap [29], which is an inverse version of the classic Jensen’s inequality, we further prove that the entropy gap is no more than $\log_2 e$ in arbitrary unweighted undirected graphs. The constant lower and upper bounds on the entropy gap are further sharpened using more advanced knowledge about the Laplacian spectrum and degree sequence, such as the Grone-Merris majorization [1]. We also apply a similar technique to bound the entropy gap in weighted graphs.

In a nutshell, our paper makes the following contributions:

• Theory and interpretability: Inspired by the close relation between Laplacian spectrum and degree sequence, we for the first time bridge the gap between the von Neumann graph entropy and structural information by proving that the entropy gap is between 0 and $\log_2 e$ in any unweighted graph. To the best of our knowledge, the constant bounds on the approximation error in unweighted graphs are sharper than those of any existing approaches with provable accuracy, such as FINGER [7]. Therefore, the answers to both RQ1 and RQ2 are YES! Besides, the structural information provides a simple geometric interpretation of the von Neumann graph entropy as a measure of degree heterogeneity. Thus, the structural information is a good approximation of the von Neumann graph entropy that achieves provable accuracy, scalability, and interpretability simultaneously.

• Applications and efficient algorithms: Using the structural information as a proxy of the von Neumann graph entropy with bounded error (entropy gap), we develop fast algorithms for two entropy based applications: network design and graph similarity measure. For the network design aiming to maximize the von Neumann entropy, we combine a greedy method and a pruning strategy to speed up the search. For the graph similarity measure, we propose a new distance measure based on structural information and Jensen-Shannon divergence. We further show that the proposed measure is a pseudometric and devise a fast incremental algorithm to compute the similarity between adjacent graphs in a graph stream.

• Extensive experiments and evaluations: We use 3 random graph models, 9 real-world static graphs, and 2 real-world temporal graphs to evaluate the properties of the entropy gap and the proposed algorithms. The results show that the entropy gap is small in a wide range of graphs, including weighted graphs, and that it is insensitive to the change of graph size. Compared with prominent methods for approximating the von Neumann graph entropy, the structural information is superior in both accuracy and computational speed. It is at least 2 orders of magnitude faster than the accurate SLaQ [40] algorithm with comparable accuracy. Our proposed algorithms based on structural information also exhibit superb performance in two entropy based applications.

Roadmap: The remainder of this paper is organized as follows. We review two related issues in Section 2. In Section 3 we introduce the definitions of the von Neumann graph entropy, structural information, and the notion of entropy gap. Section 4 shows the close relationship between von Neumann graph entropy and structural information by bounding the entropy gap. Section 5 presents efficient algorithms for two graph entropy based applications. Section 6 provides experimental results. Section 7 offers some conclusions and directions for future research.

Table 1: Comparison of methods for approximating the von Neumann graph entropy in terms of fulfilled (✓) and missing (✗) properties. Columns: [7], [40], [8], Structural Information (Ours). Rows: Provable accuracy, Scalability, Interpretability.

2 RELATED WORK

We review two main issues arising from the broad applications [2, 6, 11, 15, 26, 30, 31, 33] of the von Neumann graph entropy: computation and interpretation.

Approximate computation of the von Neumann graph entropy: In an effort to overcome the computational inefficiency of the von Neumann graph entropy, past works have resorted to various numerical approximations. Chen et al. [7] first compute a quadratic approximation of the entropy via Taylor expansion, then derive two finer approximations with accuracy guarantees by spectrum-based and degree-based rescaling, respectively. Before the work of Chen et al., the Taylor expansion was widely adopted to give computationally efficient approximations [45], but there was no theoretical guarantee on the approximation accuracy. Following the work of Chen et al., Choi et al. [8] propose several more complex quadratic approximations based on advanced polynomial approximation methods, whose superiority is verified through experiments.

Besides, there is a trend to approximate spectral sums using stochastic trace estimation based approximations [19], the merit of which is the provable error-bounded estimation of the spectral sums. For example, Kontopoulou et al. [22] propose three randomized algorithms based on Taylor series, Chebyshev polynomials, and random projection matrices to approximate the von Neumann entropy of density matrices. As another example, based on the stochastic Lanczos quadrature technique [41], Tsitsulin et al. [40] propose an efficient and effective approximation technique called SLaQ to estimate the von Neumann entropy and other spectral descriptors for web-scale graphs. However, the approximation error bound of SLaQ for the von Neumann graph entropy is not provided. The disadvantages of such stochastic approximations are also obvious: their computational efficiency depends on the number of random vectors used in stochastic trace estimation, and they are not suitable for applications like anomaly detection in graph streams and entropy-driven network design.

The comparison of methods for approximating the von Neumann graph entropy is presented in Table 1. One of the common drawbacks of the aforementioned methods is the lack of interpretability, that is, none of these methods provide enough evidence to interpret this spectral based entropy measure in terms of structural patterns. By contrast, as a good proxy of the von Neumann graph entropy, the structural information offers us the intuition that the spectral based entropy measure is closely related to the degree heterogeneity of graphs.

Spectral descriptor of graphs and its structural counterpart: Researchers in spectral graph theory have always been interested in establishing a connection between combinatorial characteristics of a graph and the algebraic properties of its associated matrices. For example, the algebraic connectivity (also known as Fiedler eigenvalue), defined as the second smallest eigenvalue of a graph Laplacian matrix, has been used to measure the robustness [20] and synchronizability [46] of graphs. The magnitude of the algebraic connectivity has also been found to reflect how well connected the overall graph is [17]. As another example, the Fiedler vector, defined as the eigenvector corresponding to the Fiedler eigenvalue of a graph Laplacian matrix, has been found to be a good sign of the bi-partition structure of a graph [14]. However, there are some other spectral descriptors that have found applications in graph analytics, but require more structural interpretations, such as the heat kernel trace [39, 44] and von Neumann graph entropy.

Simmons et al. [38] suggest interpreting the von Neumann graph entropy as the centralization of graphs, which is very similar to our interpretation using structural information. They derive both upper and lower bounds on the von Neumann graph entropy in terms of graph centralization under some restrictive assumptions on the range of the von Neumann graph entropy. Therefore, their results cannot be directly converted to accuracy guaranteed approximations of the von Neumann graph entropy for arbitrary simple graphs. By contrast, our work shows that the structural information is an accurate, scalable, and interpretable proxy of the von Neumann graph entropy for arbitrary simple graphs. Besides, the techniques used in our proof are also quite different from [38].

3 PRELIMINARIES

In this paper, we study the undirected graph $G = (V, E, A)$ with positive edge weights, where $V = \{1, \ldots, n\}$ is the node set, $E$ is the edge set, and $A \in \mathbb{R}_{+}^{n \times n}$ is the symmetric weight matrix with positive entry $A_{ij}$ denoting the weight of an edge $(i, j) \in E$. If the node pair $(i, j) \notin E$, then $A_{ij} = 0$. If graph $G$ is unweighted, the weight matrix $A \in \{0, 1\}^{n \times n}$ is called the adjacency matrix of $G$. The degree of node $i \in V$ in graph $G$ is defined as $d_i = \sum_{j=1}^{n} A_{ij}$. The Laplacian matrix of graph $G$ is defined as $L = D - A$ where $D = \mathrm{diag}(d_1, \ldots, d_n)$ is the degree matrix. Let $\{\lambda_i\}_{i=1}^{n}$ be the sorted eigenvalues of $L$ such that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n = 0$, which is called the Laplacian spectrum. We define $\mathrm{vol}(G) = \sum_{i=1}^{n} d_i$ as the volume of graph $G$; then $\mathrm{vol}(G) = \mathrm{tr}(L) = \sum_{i=1}^{n} \lambda_i$ where $\mathrm{tr}(\cdot)$ is the trace operator. For the convenience of delineation, we define a special function $f(x) \triangleq x \log_2 x$ on the support $[0, \infty)$ where $f(0) \triangleq \lim_{x \downarrow 0} f(x) = 0$ by convention. In the following, we present formal definitions of the von Neumann graph entropy, structural information, and the entropy gap. Slightly different from the one-dimensional structural information proposed by Li et al. [28], our definition of structural information does not require the graph $G$ to be connected.

Definition 3.1 (von Neumann graph entropy). The von Neumann graph entropy of an undirected graph $G = (V, E, A)$ is defined as $H_{vn}(G) = -\sum_{i=1}^{n} f(\lambda_i / \mathrm{vol}(G))$, where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n = 0$ are the eigenvalues of the Laplacian matrix $L = D - A$ of the graph $G$, and $\mathrm{vol}(G) = \sum_{i=1}^{n} \lambda_i$ is the volume of $G$.

Definition 3.2 (Structural information). The structural information of an undirected graph $G = (V, E, A)$ is defined as $H_1(G) = -\sum_{i=1}^{n} f(d_i / \mathrm{vol}(G))$, where $d_i$ is the degree of node $i$ in $G$ and $\mathrm{vol}(G) = \sum_{i=1}^{n} d_i$ is the volume of $G$.

Definition 3.3 (Entropy gap). The entropy gap of an undirected graph $G = (V, E, A)$ is defined as $\Delta H(G) = H_1(G) - H_{vn}(G)$.

The von Neumann graph entropy and structural information are well-defined for all undirected graphs except those with an empty edge set, for which $\mathrm{vol}(G) = 0$. When $E = \emptyset$, we set $H_1(G) = H_{vn}(G) = 0$ by convention.
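Definitions 3.1 to 3.3 can be evaluated directly on small graphs. The following is a minimal NumPy sketch rather than the authors' implementation: the function names are illustrative assumptions, and the dense eigendecomposition is only practical for small $n$. It also applies the convention $H_1(G) = H_{vn}(G) = 0$ for graphs without edges.

```python
import numpy as np

def f(x):
    """f(x) = x * log2(x), with the convention f(0) = 0."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    pos = x > 0
    out[pos] = x[pos] * np.log2(x[pos])
    return out

def von_neumann_entropy(A):
    """H_vn(G) = -sum_i f(lambda_i / vol(G)) for a symmetric weight matrix A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    vol = d.sum()
    if vol == 0:                       # empty edge set: entropy is 0 by convention
        return 0.0
    lam = np.linalg.eigvalsh(np.diag(d) - A)
    lam = np.clip(lam, 0.0, None)      # remove tiny negative round-off
    return float(-f(lam / vol).sum())

def structural_information(A):
    """H_1(G) = -sum_i f(d_i / vol(G)); needs only the degree sequence."""
    d = np.asarray(A, dtype=float).sum(axis=1)
    vol = d.sum()
    if vol == 0:
        return 0.0
    return float(-f(d / vol).sum())

def entropy_gap(A):
    """Delta H(G) = H_1(G) - H_vn(G)."""
    return structural_information(A) - von_neumann_entropy(A)

# Complete graph K4: degrees are all 3 and the Laplacian spectrum is (4, 4, 4, 0),
# so H_1 = log2(4) = 2 and H_vn = log2(3).
K4 = np.ones((4, 4)) - np.eye(4)
print(structural_information(K4), von_neumann_entropy(K4), entropy_gap(K4))
```

The structural information touches only the row sums of $A$, while the von Neumann entropy needs the full spectrum, which is where the difference in cost comes from.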

4 APPROXIMATION ERROR ANALYSIS

In this section we bound the entropy gap in the undirected graphs of order $n$. Since the nodes with degree 0 have no contribution to structural information and von Neumann graph entropy, without loss of generality we assume that $d_i > 0$ for any node $i \in V$.

4.1 Bounds on the Approximation Error

We first provide the additive approximation errors in Theorem 4.1, Corollary 4.5, and Corollary 4.6, then obtain the multiplicative approximation error in Theorem 4.7.

Theorem 4.1 (Bounds on the absolute approximation error). For any undirected graph $G = (V, E, A)$, the inequality

$$0 \le \Delta H(G) \le \frac{\log_2 e}{\delta} \cdot \frac{\mathrm{tr}(A^2)}{\mathrm{vol}(G)} \tag{1}$$

holds, where $\delta = \min\{d_i \mid d_i > 0\}$ is the minimum positive degree.

Before proving Theorem 4.1, we introduce two techniques: majorization and Jensen’s gap. The former is a preorder on vectors of reals, while the latter is an inverse version of Jensen’s inequality. Their definitions are presented as follows.

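Definition 4.2 (Majorization [32]). For a vector $x \in \mathbb{R}^d$, we denote by $x^{\downarrow} \in \mathbb{R}^d$ the vector with the same components, but sorted in descending order. Given $x, y \in \mathbb{R}^d$, we say that $x$ majorizes $y$ (written as $x \succ y$) if and only if $\sum_{i=1}^{k} x^{\downarrow}_i \ge \sum_{i=1}^{k} y^{\downarrow}_i$ for $k = 1, \ldots, d$ and $x^{\top}\mathbf{1} = y^{\top}\mathbf{1}$.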

Lemma 4.3 (Jensen’s gap [29]). Let $X$ be a one-dimensional random variable with mean $\mu$ and support $\Omega$. Let $\psi(x)$ be a twice differentiable function on $\Omega$ and define the function

$$h(x) = \frac{\psi(x) - \psi(\mu)}{(x - \mu)^2} - \frac{\psi'(\mu)}{x - \mu},$$

then $E[\psi(X)] - \psi(E[X]) \le \sup_{x \in \Omega}\{h(x)\} \cdot \mathrm{var}(X)$. Additionally, if $\psi'(x)$ is convex, then $h(x)$ is monotonically increasing in $x$, and if $\psi'(x)$ is concave, then $h(x)$ is monotonically decreasing in $x$.

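Lemma 4.4. The function $f(x) = x \log_2 x$ is convex, and its first order derivative $f'(x) = \log_2 x + \log_2 e$ is concave.

Proof. The second order derivative $f''(x) = (\log_2 e)/x > 0$, thus $f(x) = x \log_2 x$ is convex. The third order derivative $f'''(x) = -(\log_2 e)/x^2 < 0$, thus $f'(x)$ is concave. □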

We can see that majorization characterizes the degree of concentration between two vectors: $x \succ y$ means that the entries of $y$ are more concentrated on its mean $y^{\top}\mathbf{1}/\mathbf{1}^{\top}\mathbf{1}$ than the entries of $x$. An equivalent definition of majorization [32] using linear algebra says that $x \succ y$ if and only if there exists a doubly stochastic matrix $P$ such that $Px = y$. As a famous example of majorization, the Schur-Horn theorem [32] says that the diagonal elements of a positive semidefinite Hermitian matrix are majorized by its eigenvalues. Since $x^{\top} L x = \sum_{(i, j) \in E} A_{ij}(x_i - x_j)^2 \ge 0$ for any vector $x \in \mathbb{R}^n$, the Laplacian matrix $L$ is a positive semidefinite symmetric matrix whose diagonal elements form the degree sequence $d$ and whose eigenvalues form the spectrum $\lambda$. Therefore, $\lambda \succ d$, implying that there exists some doubly stochastic matrix $P = (p_{ij}) \in [0, 1]^{n \times n}$ such that $P\lambda = d$.
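The relation $\lambda \succ d$ can be checked numerically from the partial-sum characterization in Definition 4.2. The sketch below is illustrative only: `majorizes` and its tolerance `tol` are assumed names, and a floating-point check on one graph is a sanity test, not a proof.

```python
import numpy as np

def majorizes(x, y, tol=1e-9):
    """Return True if x majorizes y: equal totals and dominating partial sums
    of the components sorted in descending order."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]
    ys = np.sort(np.asarray(y, dtype=float))[::-1]
    if xs.shape != ys.shape:
        raise ValueError("x and y must have the same length")
    if abs(xs.sum() - ys.sum()) > tol:
        return False
    return bool(np.all(np.cumsum(xs) >= np.cumsum(ys) - tol))

# Path graph on 4 nodes: degrees (1, 2, 2, 1).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
d = A.sum(axis=1)
lam = np.linalg.eigvalsh(np.diag(d) - A)   # Laplacian spectrum

print(majorizes(lam, d))   # spectrum majorizes the degree sequence
print(majorizes(d, lam))   # the converse fails for this graph
```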

Using the fact that $P\lambda = d$ and the convexity of $f(x)$ in Lemma 4.4, we can now proceed to prove Theorem 4.1.

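Proof of Theorem 4.1. For each $i \in V$, we define a discrete random variable $X_i$ with probability mass function $\sum_{j=1}^{n} p_{ij}\,\delta_{\lambda_j}(x)$, where $\delta_a(x)$ is the Kronecker delta function. Then the expectation $E[X_i] = \sum_{j=1}^{n} p_{ij}\lambda_j = d_i$ and the variance $\mathrm{var}(X_i) = \sum_{j=1}^{n} p_{ij}(\lambda_j - d_i)^2 = \sum_{j=1}^{n} p_{ij}\lambda_j^2 - d_i^2$.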

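First, we express the entropy gap in terms of the Laplacian spectrum and the degree sequence. Since

$$\begin{aligned}
H_1(G) &= -\sum_{i=1}^{n} \frac{d_i}{\mathrm{vol}(G)} \log_2 \frac{d_i}{\mathrm{vol}(G)} \\
&= -\frac{1}{\mathrm{vol}(G)} \left( \sum_{i=1}^{n} f(d_i) - \sum_{i=1}^{n} d_i \log_2(\mathrm{vol}(G)) \right) \\
&= \log_2(\mathrm{vol}(G)) - \frac{\sum_{i=1}^{n} f(d_i)}{\mathrm{vol}(G)},
\end{aligned} \tag{2}$$

and similarly $H_{vn}(G) = \log_2(\mathrm{vol}(G)) - \sum_{i=1}^{n} f(\lambda_i)/\mathrm{vol}(G)$, we have

$$\Delta H(G) = H_1(G) - H_{vn}(G) = \frac{\sum_{i=1}^{n} f(\lambda_i) - \sum_{i=1}^{n} f(d_i)}{\mathrm{vol}(G)}. \tag{3}$$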

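Second, we use Jensen’s inequality to prove $\Delta H(G) \ge 0$. Since $f(x)$ is convex, $f(d_i) = f(E[X_i]) \le E[f(X_i)]$ for any $i \in \{1, \ldots, n\}$. By summing over $i$, we have

$$\sum_{i=1}^{n} f(d_i) \le \sum_{i=1}^{n} E[f(X_i)] = \sum_{i=1}^{n} \sum_{j=1}^{n} p_{ij} f(\lambda_j) = \sum_{j=1}^{n} f(\lambda_j),$$

where the last equality holds because every column of the doubly stochastic matrix $P$ sums to 1. Therefore, $\Delta H(G) \ge 0$ for any undirected graph.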

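Finally, we use Jensen’s gap to prove $\Delta H(G) \le \frac{\log_2 e}{\delta} \cdot \frac{\mathrm{tr}(A^2)}{\mathrm{vol}(G)}$. Applying Jensen’s gap to $X_i$ and $f(x)$ gives

$$E[f(X_i)] - f(E[X_i]) \le \sup_{x \in [0, \mathrm{vol}(G)]}\{h_i(x)\} \cdot \mathrm{var}(X_i), \tag{4}$$

where

$$h_i(x) = \frac{f(x) - f(E[X_i])}{(x - E[X_i])^2} - \frac{f'(E[X_i])}{x - E[X_i]}.$$

Since $f'(x)$ is concave, $h_i(x)$ is monotonically decreasing in $x$. Therefore, $\sup_{x \in [0, \mathrm{vol}(G)]}\{h_i(x)\} = h_i(0)$. Since

$$h_i(0) = \frac{f(0) - f(d_i)}{d_i^2} + \frac{f'(d_i)}{d_i} = \frac{\log_2 e}{d_i} \le \frac{\log_2 e}{\delta},$$

the inequality in (4) can be simplified as

$$\sum_{j=1}^{n} p_{ij} f(\lambda_j) - f(d_i) \le \frac{\log_2 e}{\delta} \cdot \left( \sum_{j=1}^{n} p_{ij}\lambda_j^2 - d_i^2 \right). \tag{5}$$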

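By summing both sides of the inequality (5) over $i$, we get an upper bound $UB$ on $\sum_{j=1}^{n} f(\lambda_j) - \sum_{i=1}^{n} f(d_i)$ as

$$\begin{aligned}
UB &= \frac{\log_2 e}{\delta} \cdot \sum_{i=1}^{n} \left( \sum_{j=1}^{n} p_{ij}\lambda_j^2 - d_i^2 \right)
= \frac{\log_2 e}{\delta} \cdot \left( \sum_{j=1}^{n} \lambda_j^2 - \sum_{i=1}^{n} d_i^2 \right) \\
&= \frac{\log_2 e}{\delta} \cdot \left( \mathrm{tr}(L^2) - \mathrm{tr}(D^2) \right)
= \frac{\log_2 e}{\delta} \cdot \left( \mathrm{tr}(A^2) - \mathrm{tr}(AD) - \mathrm{tr}(DA) \right)
= \frac{\log_2 e}{\delta} \cdot \mathrm{tr}(A^2),
\end{aligned}$$

where the last equality holds because $A$ has a zero diagonal, so $\mathrm{tr}(AD) = \mathrm{tr}(DA) = 0$. As a result,

$$\Delta H(G) = \frac{\sum_{i=1}^{n} f(\lambda_i) - \sum_{i=1}^{n} f(d_i)}{\mathrm{vol}(G)} \le \frac{\log_2 e}{\delta} \cdot \frac{\mathrm{tr}(A^2)}{\mathrm{vol}(G)}.$$

□

To illustrate the tightness of the bounds in Theorem 4.1, we further derive bounds on the entropy gap for unweighted graphs, especially the regular graphs. Via multiplicative error analysis, we show that the structural information converges to the von Neumann graph entropy as graph size grows.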

Corollary 4.5 (Constant bounds on the entropy gap). For any unweighted, undirected graph $G$, $0 \le \Delta H(G) \le \log_2 e$ holds.

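Proof. In an unweighted graph $G$, $\mathrm{tr}(A^2) = \sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}A_{ji} = \sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij} = \sum_{i=1}^{n} d_i = \mathrm{vol}(G)$ and $\delta \ge 1$, therefore $0 \le \Delta H(G) \le \frac{\log_2 e}{\delta} \cdot \frac{\mathrm{tr}(A^2)}{\mathrm{vol}(G)} = \frac{\log_2 e}{\delta} \le \log_2 e$. □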
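The bounds of Theorem 4.1 and Corollary 4.5 can be spot-checked on a random unweighted graph. This is a hedged sketch under stated assumptions, not an experiment from the paper: the helper `gap_and_bound`, the graph size, the edge probability, and the seed are all illustrative choices, and a single numerical check does not replace the proof.

```python
import numpy as np

def xlog2x(x):
    """f(x) = x * log2(x), with f(0) = 0 by convention."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)     # log2(1) = 0, so zero entries contribute 0
    return x * np.log2(safe)

def gap_and_bound(A):
    """Return (entropy gap via Eq. (3), right-hand side of Eq. (1))."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    vol = d.sum()
    lam = np.clip(np.linalg.eigvalsh(np.diag(d) - A), 0.0, None)
    gap = (xlog2x(lam).sum() - xlog2x(d).sum()) / vol
    delta = d[d > 0].min()             # minimum positive degree
    bound = np.log2(np.e) / delta * np.trace(A @ A) / vol
    return float(gap), float(bound)

# Random unweighted graph (assumed parameters: 30 nodes, edge probability 0.2).
rng = np.random.default_rng(0)
n = 30
upper = np.triu(rng.random((n, n)) < 0.2, k=1)
A = (upper | upper.T).astype(float)

gap, bound = gap_and_bound(A)
print(gap, bound, np.log2(np.e))
```

For unweighted graphs the computed bound equals $\log_2 e / \delta$, so it never exceeds the constant $\log_2 e$ of Corollary 4.5.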

Corollary 4.6 (Entropy gap of regular graphs). For any unweighted, undirected, regular graph $G_d$ of degree $d$, the inequality $0 \le \Delta H(G_d) \le \frac{\log_2 e}{d}$ holds.

Proof sketch. In any unweighted, regular graph $G_d$, $\delta = d$. □

Theorem 4.7 (Convergence of the multiplicative approximation error). For almost all unweighted graphs $G$ of order $n$, $\frac{H_1(G)}{H_{vn}(G)} - 1 \ge 0$ and decays to 0 at the rate of $O(1/\log_2 n)$.

Proof. Dairyko et al. [10] proved that for almost all unweighted graphs $G$ of order $n$, $H_{vn}(G) \ge H_{vn}(K_{1,n-1})$ where $K_{1,n-1}$ stands for the star graph. Since $H_{vn}(K_{1,n-1}) = \log_2(2n-2) - \frac{n}{2n-2}\log_2 n = 1 + \frac{1}{2}\log_2 n + o(1)$,

$$\frac{H_1(G)}{H_{vn}(G)} - 1 = \frac{\Delta H(G)}{H_{vn}(G)} \le \frac{\log_2 e}{H_{vn}(K_{1,n-1})} = O\!\left(\frac{1}{\log_2 n}\right).$$

□
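The closed form for the star graph used in this proof can be compared against a direct spectral computation. The sketch below is an illustration with assumed function names; it relies only on the fact that the Laplacian spectrum of $K_{1,n-1}$ is $(n, 1, \ldots, 1, 0)$ with $\mathrm{vol} = 2n - 2$.

```python
import numpy as np

def star_entropy_closed_form(n):
    """H_vn(K_{1,n-1}) = log2(2n - 2) - n * log2(n) / (2n - 2)."""
    return float(np.log2(2 * n - 2) - n * np.log2(n) / (2 * n - 2))

def star_entropy_spectral(n):
    """Same quantity computed from the Laplacian eigenvalues of the star graph."""
    A = np.zeros((n, n))
    A[0, 1:] = 1.0                      # node 0 is the hub
    A[1:, 0] = 1.0
    d = A.sum(axis=1)
    lam = np.linalg.eigvalsh(np.diag(d) - A)
    p = lam[lam > 1e-12] / d.sum()      # drop the zero eigenvalue (f(0) = 0)
    return float(-(p * np.log2(p)).sum())

for n in (5, 50, 200):
    print(n, star_entropy_closed_form(n), star_entropy_spectral(n))

# The closed form approaches 1 + 0.5 * log2(n) as n grows.
n_large = 10**6
print(star_entropy_closed_form(n_large) - (1 + 0.5 * np.log2(n_large)))
```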

4.2 Sharpened Bounds on the Entropy Gap

Though the constant bounds on the entropy gap are tight enough for applications, we can still sharpen the bounds on the entropy gap in unweighted graphs using more advanced majorizations.

Theorem 4.8 (Sharpened lower bound on entropy gap). For any unweighted, undirected graph $G$, $\Delta H(G)$ is lower bounded by $\left(f(d_{\max} + 1) - f(d_{\max}) + f(\delta - 1) - f(\delta)\right)/\mathrm{vol}(G)$ where $d_{\max}$ is the maximum degree and $\delta$ is the minimum positive degree.

Proof. The proof is based on the advanced majorization [18]: $\lambda \succ (d_1 + 1, d_2, \ldots, d_{n-1}, d_n - 1)$ holds on any unweighted, undirected graph $G$ where $d_1 \ge d_2 \ge \cdots \ge d_n$ is the sorted degree sequence of $G$. Similar to the proof of Theorem 4.1, we have $\sum_{i=1}^{n} f(\lambda_i) \ge f(d_1 + 1) + f(d_n - 1) + \sum_{i=2}^{n-1} f(d_i)$. Then the sharpened lower bound follows from the equation (3) since $d_1 = d_{\max}$ and $d_n = \delta$. □

Theorem 4.9 (Sharpened upper bound on entropy gap). For any unweighted, undirected graph $G = (V, E)$, $\Delta H(G)$ is upper bounded by $\min\{\log_2 e, b_1, b_2\}$ where

$$b_1 = \frac{\sum_{i=1}^{n} f(d^*_i)}{\mathrm{vol}(G)} - \frac{\sum_{i=1}^{n} f(d_i)}{\mathrm{vol}(G)}
\quad\text{and}\quad
b_2 = \log_2\!\left(1 + \frac{\sum_{i=1}^{n} d_i^2}{\mathrm{vol}(G)}\right) - \frac{\sum_{i=1}^{n} f(d_i)}{\mathrm{vol}(G)}.$$

Here $(d^*_1, \ldots, d^*_n)$ is the conjugate degree sequence of $G$ where $d^*_k = |\{i \mid d_i \ge k\}|$.

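Proof. We first prove $\Delta H(G) \le b_1$ using the Grone-Merris majorization [1]: $(d^*_1, \ldots, d^*_n) \succ \lambda$. Similar to the proof of Theorem 4.1, we have $\sum_{i=1}^{n} f(d^*_i) \ge \sum_{i=1}^{n} f(\lambda_i)$, thus

$$b_1 \ge \frac{\sum_{i=1}^{n} f(\lambda_i) - \sum_{i=1}^{n} f(d_i)}{\mathrm{vol}(G)} = \Delta H(G).$$

We then prove $\Delta H(G) \le b_2$. Since, by the concavity of $\log_2$,

$$\frac{\sum_{i=1}^{n} f(\lambda_i)}{\mathrm{vol}(G)} = \sum_{i=1}^{n} \left( \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j} \right) \log_2 \lambda_i \le \log_2\!\left( \frac{\sum_{i=1}^{n} \lambda_i^2}{\sum_{j=1}^{n} \lambda_j} \right)$$

and

$$\frac{\sum_{i=1}^{n} \lambda_i^2}{\sum_{i=1}^{n} \lambda_i} = \frac{\mathrm{tr}(L^2)}{\mathrm{vol}(G)} = 1 + \frac{\sum_{i=1}^{n} d_i^2}{\mathrm{vol}(G)},$$

we have $\Delta H(G) = \frac{\sum_{i=1}^{n} \left( f(\lambda_i) - f(d_i) \right)}{\mathrm{vol}(G)} \le b_2$. □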
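The quantities in Theorem 4.9 depend only on the degree sequence and its conjugate, so the sharpened upper bound is cheap to evaluate. The following sketch uses assumed helper names and a dense eigendecomposition only to compare against the exact gap on a tiny example. For the star graph $K_{1,4}$ the conjugate degree sequence $(5, 1, 1, 1, 0)$ coincides with the Laplacian spectrum, so $b_1$ equals the exact entropy gap.

```python
import numpy as np

def xlog2x(x):
    """f(x) = x * log2(x), with f(0) = 0 by convention."""
    x = np.asarray(x, dtype=float)
    safe = np.where(x > 0, x, 1.0)     # log2(1) = 0, so zero entries contribute 0
    return x * np.log2(safe)

def conjugate_degrees(d):
    """d*_k = |{i : d_i >= k}| for k = 1, ..., n."""
    d = np.asarray(d)
    return np.array([(d >= k).sum() for k in range(1, len(d) + 1)], dtype=float)

def entropy_gap(A):
    """Exact gap via Eq. (3), for comparison on small graphs only."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    lam = np.clip(np.linalg.eigvalsh(np.diag(d) - A), 0.0, None)
    return float((xlog2x(lam).sum() - xlog2x(d).sum()) / d.sum())

def sharpened_upper_bound(A):
    """min{log2(e), b1, b2} for an unweighted graph with adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    vol = d.sum()
    sum_f_d = xlog2x(d).sum()
    b1 = (xlog2x(conjugate_degrees(d)).sum() - sum_f_d) / vol
    b2 = np.log2(1.0 + (d ** 2).sum() / vol) - sum_f_d / vol
    return float(min(np.log2(np.e), b1, b2))

# Star graph K_{1,4}: degrees (4, 1, 1, 1, 1), conjugate degrees (5, 1, 1, 1, 0).
star = np.zeros((5, 5))
star[0, 1:] = 1.0
star[1:, 0] = 1.0
print(conjugate_degrees(star.sum(axis=1)))
print(entropy_gap(star), sharpened_upper_bound(star))
```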

5 APPLICATIONS AND ALGORITHMS

As a measure of the structural complexity of a graph, the von Neumann entropy has been applied in a variety of applications. For example, the von Neumann graph entropy is exploited to measure the importance of an edge [30]. As another example, the von Neumann graph entropy can also be used to measure the distance between graphs for graph classification and anomaly detection [2, 7]. In addition, the von Neumann graph entropy is used in the context of network embedding [11] to learn low-dimensional feature representations of nodes. We observe that, in these applications, the von Neumann graph entropy is used to address the following primitive tasks:

• Entropy-based network design: Change the existing graph to a new graph such that the entropy requirement is attained with minimal perturbations on the existing graph. For example, Minello et al. [33] use the von Neumann entropy to explore the potential network growth model via experiments.

• Graph similarity measure: Compute a real positive number to reveal the similarity between two graphs. For example, Domenico et al. [15] use the von Neumann graph entropy to compute the Jensen-Shannon distance between graphs for the purpose of compressing multilayer networks.

Resolving both tasks requires computing the von Neumann graph entropy exactly. To reduce the computational cost and preserve the interpretability, we can use the accurate proxy, structural information, to approximately solve these tasks.

5.1 Entropy-based network design

Network design aims to minimally perturb the network to fulfill some goals. Consider the goal of maximizing the von Neumann entropy of a graph; it helps to understand how different structural patterns influence the entropy value. The entropy-based network design problem is formulated as follows,

Problem 1 (MaxEntropy). Given an unweighted, undirected graph $G = (V, E)$ of order $n$ and an integer budget $k$, find a set $F$ of non-existing edges of $G$ whose addition to $G$ creates the largest increase of the von Neumann graph entropy and $|F| \le k$.

Due to the spectral nature of the von Neumann graph entropy, it is not easy to find an effective strategy to perturb the graph, especially in the scenario where there are an exponential number of combinations for the subset $F$. If we use the structural information as a proxy of the von Neumann entropy, Problem 1 reduces to maximizing $H_1(G')$ where $G' = (V, E \cup F)$ such that $|F| \le k$. To further alleviate the computational pressure rooted in the exponential size of the search space for $F$, we adopt the greedy method in which the new edges are added one by one until either the structural information attains its maximum value $\log_2 n$ or $k$ new edges have already been added. We denote the graph with $l$ new edges as $G_l = (V, E_l)$, then $G_0 = G$. Now suppose that we have $G_l$ whose structural information is less than $\log_2 n$, then we want to find a new edge $e_{l+1} = (u, v)$ such that $H_1(G_{l+1})$ is maximized, where $G_{l+1} = (V, E_l \cup \{e_{l+1}\})$. Since $H_1(G_{l+1})$ can be rewritten as

$$\log_2(2|E_l| + 2) - \frac{1}{2|E_l| + 2} \left( f(d_u + 1) + f(d_v + 1) + \sum_{i \ne u, v} f(d_i) \right),$$

the edge $e_{l+1}$ maximizing $H_1(G_{l+1})$ should also minimize the edge centrality $EC(u, v) = f(d_u + 1) - f(d_u) + f(d_v + 1) - f(d_v)$, where $d_i$ is the degree of node $i$ in $G_l$.
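The rewriting above means that the structural information after adding one edge can be obtained from the current degrees and the edge centrality alone. A minimal pure-Python sketch follows; the function names are illustrative assumptions, and for clarity the sums are recomputed rather than cached.

```python
import math

def f(x):
    """f(x) = x * log2(x), with f(0) = 0 by convention."""
    return x * math.log2(x) if x > 0 else 0.0

def edge_centrality(du, dv):
    """EC(u, v) = f(d_u + 1) - f(d_u) + f(d_v + 1) - f(d_v)."""
    return f(du + 1) - f(du) + f(dv + 1) - f(dv)

def structural_information(degrees):
    """H_1 = log2(vol) - sum_i f(d_i) / vol, as in Eq. (2)."""
    vol = sum(degrees)
    if vol == 0:
        return 0.0
    return math.log2(vol) - sum(f(d) for d in degrees) / vol

def structural_information_after_edge(degrees, u, v):
    """H_1 of the graph obtained by adding the non-existing edge (u, v)."""
    vol = sum(degrees)                        # equals 2|E_l|
    sum_f = sum(f(d) for d in degrees)
    return math.log2(vol + 2) - (sum_f + edge_centrality(degrees[u], degrees[v])) / (vol + 2)

# Path 1 - 0 - 2 plus an isolated node 3; add the edge (1, 3).
degrees = [2, 1, 1, 0]
updated = [2, 2, 1, 1]
print(structural_information_after_edge(degrees, 1, 3), structural_information(updated))
```

In an incremental setting one would keep `vol` and `sum_f` as running totals, so that each candidate edge costs constant time to evaluate.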

We present the pseudocode of our fast algorithm EntropyAug in Algorithm 1, which leverages the pruning strategy to accelerate the process of finding a single new edge that creates the largest increase of the von Neumann entropy. EntropyAug starts by initializing an empty set $F$ used to contain the node pairs to be found and an entropy value $H$ used to record the maximum structural information in the graph evolution process (line 1). In each following iteration, it sorts the set of nodes $V$ in non-decreasing degree order (line 3). Note that the edge centrality $EC(u, v)$ has a nice monotonic property: $EC(u_1, v_1) \le EC(u_2, v_2)$ if $\min\{d_{u_1}, d_{v_1}\} \le \min\{d_{u_2}, d_{v_2}\}$ and $\max\{d_{u_1}, d_{v_1}\} \le \max\{d_{u_2}, d_{v_2}\}$. With the sorted list of nodes $V_s$, the monotonicity of $EC(u, v)$ can be translated into $EC(V_s[i_1], V_s[j_1]) \le EC(V_s[i_2], V_s[j_2])$ if the indices satisfy $i_1 < j_1$, $i_2 < j_2$, $i_1 < i_2$, and $j_1 < j_2$. Thus, using the two pointers {head, tail} and a threshold $T$, it can prune the search space and find the desired non-adjacent node pair as fast as possible (lines 4-12). It then adds the non-adjacent node pair minimizing $EC(u, v)$ into $F$ and updates the graph $G$ (line 13). The structural information of the updated graph is computed to determine whether $F$ is the optimal subset up to the current iteration (lines 14-15).
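The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' released implementation: nodes are assumed to be labeled 0 to n-1, the names `entropy_aug`, `best_added`, and `best_h` are hypothetical, the best subset seen so far is tracked explicitly (one reading of lines 14-15), the loop stops early when no non-adjacent pair remains, and the comparison with $\log_2 n$ uses a floating-point tolerance.

```python
import math

def f(x):
    """f(x) = x * log2(x), with f(0) = 0 (assumed convention)."""
    return x * math.log2(x) if x > 0 else 0.0

def structural_information(degrees):
    """Shannon entropy (in bits) of the normalized degree sequence."""
    vol = sum(degrees)
    return math.log2(vol) - sum(f(d) for d in degrees) / vol

def edge_centrality(du, dv):
    return f(du + 1) - f(du) + f(dv + 1) - f(dv)

def entropy_aug(n, edges, k):
    """Greedily add up to k edges, each minimizing the edge centrality."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    added, best_added, best_h = [], [], 0.0
    while len(added) < k:
        # Line 3: sort the nodes in non-decreasing degree order.
        order = sorted(adj, key=lambda v: len(adj[v]))
        head, tail, threshold, pair = 0, n - 1, math.inf, None
        # Lines 4-12: two-pointer scan with threshold-based pruning.
        while head < tail:
            h = order[head]
            for i in range(head + 1, tail + 1):
                c = order[i]
                ec = edge_centrality(len(adj[h]), len(adj[c]))
                if ec >= threshold:      # no better pair further right
                    tail = i - 1
                    break
                if c not in adj[h]:      # non-adjacent pair with smaller EC
                    pair, threshold = (h, c), ec
                    tail = i - 1
                    break
            head += 1
        if pair is None:                 # graph is complete (assumed guard)
            break
        u, v = pair
        adj[u].add(v)                    # line 13: update the graph and F
        adj[v].add(u)
        added.append(pair)
        h1 = structural_information([len(adj[x]) for x in adj])
        if h1 > best_h:                  # line 14: record the best subset
            best_h, best_added = h1, list(added)
        if math.isclose(best_h, math.log2(n)):   # line 15
            break
    return best_added, best_h
```

For instance, `entropy_aug(4, [(0, 1), (1, 2), (2, 3)], 1)` closes the path into a cycle, which makes the degree sequence uniform and therefore reaches the maximum value $\log_2 4$.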

5.2 Graph Similarity Measure

Entropy based graph similarity measure aims to compare graphs using Jensen-Shannon divergence. The Jensen-Shannon divergence, as a symmetrized and smoothed version of the Kullback-Leibler divergence, is defined formally in the following Definition 5.1.

Definition 5.1 (Jensen-Shannon divergence). Let $P$ and $Q$ be two probability distributions on the same support set $\Omega_N = \{1, \ldots, N\}$.

Algorithm 1: EntropyAug

Input: The graph G = (V, E) of order n, the budget k
Output: A set of node pairs

 1  F ← ∅, H ← 0;
 2  while |F| < k do
 3      V_s: list ← sort V in non-decreasing degree order;
 4      head ← 0, tail ← |V_s| − 1, T ← +∞;
 5      while head < tail do
 6          for i = head + 1, head + 2, . . . , tail do
 7              if EC(V_s[head], V_s[i]) ≥ T then
 8                  tail ← i − 1; break;
 9              if (V_s[head], V_s[i]) ∉ E then
10                  u ← V_s[head], v ← V_s[i], T ← EC(u, v);
11                  tail ← i − 1; break;
12          head ← head + 1;
13      E ← E ∪ {(u, v)}, F ← F ∪ {(u, v)};
14      if H_1(G) > H then H ← H_1(G), F ← F;
15      if H = log_2 n then break;
16  return F.

The Jensen-Shannon divergence between $P$ and $Q$ is defined as

$$D_{JS}(P, Q) = H((P + Q)/2) - H(P)/2 - H(Q)/2,$$

where $H(P) = -\sum_{i=1}^{N} p_i \log p_i$ is the entropy of the distribution $P$.

Endres et al. [16] prove that $\sqrt{D_{JS}(P, Q)}$ is a bounded metric on the space of distributions over $\Omega_N$, with its maximum value $\sqrt{\log 2}$ being attained when $\min\{p_i, q_i\} = 0$ for every $i \in \Omega_N$. Since the von Neumann graph entropy is an entropy of the spectrum based distribution, Lamberti et al. [25] define a quantum Jensen-Shannon distance between two graphs, which is closely related to the von Neumann graph entropy, in the following Definition 5.2.
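The definition and its maximum value can be illustrated numerically. The sketch below is a plain-Python illustration with hypothetical function names; the `base` parameter is an assumption added to show that the bound $\log 2$ in nats corresponds to 1 when entropies are measured in bits.

```python
import math

def entropy(p, base=2.0):
    """Shannon entropy of a probability vector, skipping zero entries."""
    return -sum(x * math.log(x, base) for x in p if x > 0)

def js_divergence(p, q, base=2.0):
    """D_JS(P, Q) = H((P + Q) / 2) - H(P) / 2 - H(Q) / 2."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(mix, base) - entropy(p, base) / 2 - entropy(q, base) / 2

# Two distributions with disjoint supports: min{p_i, q_i} = 0 for every i.
p = [0.5, 0.5, 0.0, 0.0]
q = [0.0, 0.0, 0.25, 0.75]

d_bits = js_divergence(p, q)            # divergence measured in bits
d_nats = js_divergence(p, q, math.e)    # the same divergence in nats
```

With disjoint supports the mixture entropy exceeds the average entropy by exactly one bit, which is why the divergence attains its maximum in this case.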

Definition 5.2 (Quantum Jensen-Shannon distance). The quantum Jensen-Shannon distance between two weighted, undirected graphs $G_1 = (V, E_1, A_1)$ and $G_2 = (V, E_2, A_2)$ is defined as

$$D_{QJS}(G_1, G_2) = \sqrt{H_{vn}(\overline{G}) - (H_{vn}(G_1) + H_{vn}(G_2))/2},$$

where $\overline{G} = (V, E_1 \cup E_2, \overline{A})$ is a weighted graph with $\overline{A} = \frac{A_1}{2\,\mathrm{vol}(G_1)} + \frac{A_2}{2\,\mathrm{vol}(G_2)}$.

Based on the quantum Jensen-Shannon distance, we consider the following problem, which can be applied in anomaly detection and multiplex network compression.

Problem 2. Compute the quantum Jensen-Shannon distance between adjacent graphs in a stream of graphs $\{G_k = (V, E_k, t_k)\}_{k=1}^{K}$, where $t_k$ is the timestamp of the graph $G_k$ and $t_k < t_{k+1}$.

As a distance measure between graphs, $D_{QJS}$ is typically required to be a pseudometric [39], that is, it should be symmetric and satisfy the triangle inequality. However, to the best of our knowledge, it is still an open problem whether $D_{QJS}$ fulfills the triangle inequality [25]. Meanwhile, the quantum Jensen-Shannon distance inherits the computational inefficiency from the von Neumann graph entropy. Therefore, to solve Problem 2 efficiently we propose a new distance measure based on structural information as a surrogate for $D_{QJS}$.

Algorithm 2: IncreSim

Input: G_1 and {∆G_k}_{k=1}^{K−1}
Output: {D_SI(G_k, G_{k+1})}_{k=1}^{K−1}

 1  d ← the degree sequence of the graph G_1;
 2  m ← Σ_{i=1}^{n} d_i / 2;
 3  H_1(G_1) ← log_2(2m) − (1/(2m)) Σ_{i=1}^{n} f(d_i);
 4  for k = 1, . . . , K − 1 do
 5      ∆d ← the degree sequence of the signed graph ∆G_k;
 6      ∆m ← Σ_{i∈V_k} ∆d_i / 2;
 7      Compute a, b, y, z in Lemma 5.6 via iterating over V_k;
 8      Compute H_1(G_{k+1}) and H_1(Ḡ_k) based on Lemma 5.6;
 9      D_SI(G_k, G_{k+1}) ← sqrt( H_1(Ḡ_k) − (H_1(G_k) + H_1(G_{k+1}))/2 );
10      m ← m + ∆m;
11      foreach i ∈ V_k do d_i ← d_i + ∆d_i;
12  return {D_SI(G_k, G_{k+1})}_{k=1}^{K−1}

Definition 5.3 (Structural information distance). The structural information distance between two weighted, undirected graphs $G_1 = (V, E_1, A_1)$ and $G_2 = (V, E_2, A_2)$ is defined as

$$D_{SI}(G_1, G_2) = \sqrt{H_1(\overline{G}) - (H_1(G_1) + H_1(G_2))/2},$$

where $\overline{G} = (V, E_1 \cup E_2, \overline{A})$ is a weighted graph with $\overline{A} = \frac{A_1}{2\,\mathrm{vol}(G_1)} + \frac{A_2}{2\,\mathrm{vol}(G_2)}$.
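Since the structural information depends only on the degree sequence, the distance can be computed from two degree vectors without forming the averaged adjacency matrix. The following is a minimal sketch with hypothetical function names; it assumes both graphs share the same node ordering, that each has at least one edge, and that $\mathrm{vol}(G)$ is the sum of the (possibly weighted) degrees.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits, skipping zero entries."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def structural_information_distance(deg1, deg2):
    """D_SI computed from the degree sequences of G1 and G2."""
    vol1, vol2 = sum(deg1), sum(deg2)
    p = [d / vol1 for d in deg1]               # normalized degrees of G1
    q = [d / vol2 for d in deg2]               # normalized degrees of G2
    avg = [(a + b) / 2 for a, b in zip(p, q)]  # degrees of the averaged graph
    gap = shannon_entropy(avg) - (shannon_entropy(p) + shannon_entropy(q)) / 2
    return math.sqrt(max(gap, 0.0))            # clamp tiny negative round-off

# Path 0-1-2-3 versus the cycle obtained by adding the edge (0, 3).
d_path_cycle = structural_information_distance([1, 2, 2, 1], [2, 2, 2, 2])
# Two graphs whose edges touch disjoint node sets.
d_disjoint = structural_information_distance([1, 1, 0, 0], [0, 0, 1, 1])
```

The averaged degree of node $i$ is the mean of its two normalized degrees, so the three entropies in the definition reduce to entropies of probability vectors of length $n$.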

It is a little surprising to find that $D_{SI}$ is a pseudometric, the details of which are stated in Theorem 5.4.

Theorem 5.4 (Properties of the distance measure $D_{SI}$). The distance measure $D_{SI}(G_1, G_2)$ is a pseudometric on the space of undirected graphs:

• $D_{SI}$ is symmetric, i.e., $D_{SI}(G_1, G_2) = D_{SI}(G_2, G_1)$;

• $D_{SI}$ is non-negative, i.e., $D_{SI}(G_1, G_2) \ge 0$, where the equality holds if and only if $\frac{d_{i,1}}{\sum_{k=1}^{n} d_{k,1}} = \frac{d_{i,2}}{\sum_{k=1}^{n} d_{k,2}}$ for every node $i \in V$, where $d_{i,j}$ is the degree of node $i$ in $G_j$;

• $D_{SI}$ obeys the triangle inequality, i.e., $D_{SI}(G_1, G_2) + D_{SI}(G_2, G_3) \ge D_{SI}(G_1, G_3)$;

• $D_{SI}$ is upper bounded by 1, i.e., $D_{SI}(G_1, G_2) \le 1$, where the equality holds if and only if $\min\{d_{i,1}, d_{i,2}\} = 0$ for every node $i \in V$, where $d_{i,j}$ is the degree of node $i$ in $G_j$.
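A short derivation, offered as a reader's sketch rather than the authors' proof, suggests where these properties come from. It assumes the notation $p_i = d_{i,1}/\mathrm{vol}(G_1)$ and $q_i = d_{i,2}/\mathrm{vol}(G_2)$ for the normalized degree distributions $P$ and $Q$ (symbols introduced here for illustration only), with $\mathrm{vol}(G_j) = \sum_k d_{k,j}$ and all entropies taken in base 2:

```latex
\begin{align*}
\overline{d}_i &= \frac{d_{i,1}}{2\,\mathrm{vol}(G_1)} + \frac{d_{i,2}}{2\,\mathrm{vol}(G_2)} = \frac{p_i + q_i}{2}, \\
H_1(\overline{G}) &= -\sum_{i=1}^{n} \overline{d}_i \log_2 \overline{d}_i = H\!\left(\tfrac{P+Q}{2}\right),
\qquad H_1(G_1) = H(P), \quad H_1(G_2) = H(Q), \\
D_{SI}(G_1, G_2)^2 &= H\!\left(\tfrac{P+Q}{2}\right) - \frac{H(P) + H(Q)}{2} = D_{JS}(P, Q).
\end{align*}
```

Under these assumptions $D_{SI}$ is the square root of a Jensen-Shannon divergence between degree distributions, so the metric result of Endres et al. [16] applies to $P$ and $Q$. The measure is only a pseudometric on graphs because distinct graphs can share the same normalized degree sequence.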

To establish a connection between $D_{SI}$ and $D_{QJS}$, we study their extreme values and present the results in Theorem 5.5.

Theorem 5.5 (Connection between $D_{QJS}$ and $D_{SI}$). Both $D_{QJS}(G_1, G_2)$ and $D_{SI}(G_1, G_2)$ attain the same maximum value of 1 under the identical condition that $\min\{d_{i,1}, d_{i,2}\} = 0$ for every node $i \in V$, where $d_{i,j}$ is the degree of node $i$ in $G_j$.

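In order to compute the structural information distance between adjacent graphs in the graph stream $\{G_k = (V, E_k, t_k)\}_{k=1}^{K}$, we first compute the structural information $H_1(G_k)$ for each $k \in \{1, \ldots, K\}$, which takes $\Theta(Kn)$ time. Then we compute the structural information of $\overline{G}_k$, whose adjacency matrix is $\overline{A}_k = \frac{A_k}{2\,\mathrm{vol}(G_k)} + \frac{A_{k+1}}{2\,\mathrm{vol}(G_{k+1})}$, for each $k \in \{1, \ldots, K-1\}$. Since the degree of node $i$ in $\overline{G}_k$ is $\overline{d}_{i,k} = \frac{d_{i,k}}{2\,\mathrm{vol}(G_k)} + \frac{d_{i,k+1}}{2\,\mathrm{vol}(G_{k+1})}$ and $\sum_{i=1}^{n} \overline{d}_{i,k} = 1$, the structural information of $\overline{G}_k$ is $H_1(\overline{G}_k) = -\sum_{i=1}^{n} f(\overline{d}_{i,k})$, which takes $\Theta(n)$ time for each $k$. Therefore, the total computational cost is $\Theta((2K-1)n)$.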

In practice, the graph stream is fully dynamic, so it is more efficient to represent it as a stream of edge insertions and deletions over time rather than as a sequence of graphs. Suppose that the graph stream is represented as an initial base graph $G_1 = (V, E_1, t_1)$ and a sequence of graph changes $\{\Delta G_k = (V_k, E_{+,k}, E_{-,k}, t_k)\}_{k=1}^{K-1}$, where $t_k$ is the timestamp of the set $E_{+,k}$ of edge insertions and the set $E_{-,k}$ of edge deletions, and $V_k$ is the subset of nodes covered by $E_{+,k} \cup E_{-,k}$. We can view the graph change $\Delta G_k$ as a signed network where each edge in $E_{+,k}$ has positive weight $+1$ and each edge in $E_{-,k}$ has negative weight $-1$. The degree of node $i \in V_k$ in the graph change $\Delta G_k$ refers to $\sum_{j \in V_k} \left( \mathbb{I}\{(i, j) \in E_{+,k}\} - \mathbb{I}\{(i, j) \in E_{-,k}\} \right)$. Using the information about the previous graph $G_k$ and the current graph change $\Delta G_k$, we can compute the entropy statistics of the current graph $G_{k+1}$ incrementally and efficiently via the following lemma, whose proof can be found in the appendix.
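The signed degree sequence of a graph change can be accumulated in a single pass over the inserted and deleted edges. The sketch below uses hypothetical names (`signed_degree`, `insertions`, `deletions`) and assumes each undirected edge is listed once as a node pair.

```python
from collections import defaultdict

def signed_degree(insertions, deletions):
    """Signed degree of every node touched by an edge insertion or deletion."""
    delta = defaultdict(int)
    for u, v in insertions:     # each inserted edge has weight +1
        delta[u] += 1
        delta[v] += 1
    for u, v in deletions:      # each deleted edge has weight -1
        delta[u] -= 1
        delta[v] -= 1
    return dict(delta)

# Two insertions and one deletion on nodes {0, 1, 2, 3}.
delta_d = signed_degree(insertions=[(0, 3), (1, 3)], deletions=[(1, 2)])
# Net change in the number of edges: half the sum of the signed degrees.
delta_m = sum(delta_d.values()) // 2
```

Only the nodes in $V_k$ appear in the result, which is what keeps the per-step cost proportional to $|V_k|$ rather than $n$.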

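Lemma 5.6. Using the degree sequence $d$ of the graph $G_k$, the structural information $H_1(G_k)$, and the degree sequence $\Delta d$ of the signed graph $\Delta G_k$, the structural information of the graph $G_{k+1}$ can be efficiently computed as

$$H_1(G_{k+1}) = \frac{f(2(m + \Delta m)) - a - f(2m) + 2m H_1(G_k)}{2(m + \Delta m)},$$

where $m = \sum_{i=1}^{n} d_i / 2$, $\Delta m = \sum_{i \in V_k} \Delta d_i / 2$, and $a = \sum_{i \in V_k} \left( f(d_i + \Delta d_i) - f(d_i) \right)$. Moreover, the structural information of the averaged graph $\overline{G}_k$ between $G_k$ and $G_{k+1}$ can be efficiently computed as

$$H_1(\overline{G}_k) = -b - (2m - y) f(c) - c \left( f(2m) - 2m H_1(G_k) - z \right),$$

where $y = \sum_{i \in V_k} d_i$, $z = \sum_{i \in V_k} f(d_i)$, $c = \frac{2m + \Delta m}{4m(m + \Delta m)}$, and $b = \sum_{i \in V_k} f\left( \frac{d_i}{4m} + \frac{d_i + \Delta d_i}{4(m + \Delta m)} \right)$.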
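The first identity of the lemma can be recovered in a few lines. The steps below are a reader's sketch (the full proof is in the paper's appendix), assuming $f(x) = x \log_2 x$, $H_1(G) = \log_2(2m) - \frac{1}{2m}\sum_i f(d_i)$ as in line 3 of Algorithm 2, and $\Delta d_i = 0$ for every node $i \notin V_k$:

```latex
\begin{align*}
\sum_{i=1}^{n} f(d_i) &= f(2m) - 2m\,H_1(G_k), \\
\sum_{i=1}^{n} f(d_i + \Delta d_i) &= \sum_{i=1}^{n} f(d_i) + a, \\
H_1(G_{k+1}) &= \log_2\bigl(2(m + \Delta m)\bigr) - \frac{f(2m) - 2m\,H_1(G_k) + a}{2(m + \Delta m)} \\
&= \frac{f(2(m + \Delta m)) - a - f(2m) + 2m\,H_1(G_k)}{2(m + \Delta m)}.
\end{align*}
```

The second identity follows the same pattern: for a node outside $V_k$ the averaged degree is $c\,d_i$ with $c = \frac{2m + \Delta m}{4m(m + \Delta m)}$, and $f(c\,d_i) = c\,f(d_i) + d_i\,f(c)$, so the sum over untouched nodes is expressed through $y$, $z$, and $H_1(G_k)$.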

The pseudocode of our fast algorithm IncreSim for computing the structural information distance in a graph stream is shown in Algorithm 2. It starts by computing the structural information of the base graph $G_1$ (lines 1-3), which takes $\Theta(n)$ time. In each following iteration, it first computes the values of $a$, $b$, $c$, $y$, $z$ (lines 5-7), then calculates the structural information distance between two adjacent graphs (lines 8-9), and finally updates the edge count $m$ and the degree sequence $d$ (lines 10-11). The time cost of each iteration is $\Theta(|V_k|)$; consequently the total time complexity is $\Theta(n + \sum_{k=1}^{K-1} |V_k|)$.
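The incremental procedure can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' implementation: `incre_sim` and its argument layout are hypothetical, each graph change is given as a dictionary mapping a node in $V_k$ to its signed degree, $c$ is taken as $\frac{2m + \Delta m}{4m(m + \Delta m)}$, every graph in the stream is assumed to keep at least one edge, and the value of $H_1$ is carried over explicitly between iterations.

```python
import math

def f(x):
    """f(x) = x * log2(x), with f(0) = 0 (assumed convention)."""
    return x * math.log2(x) if x > 0 else 0.0

def incre_sim(degrees, changes):
    """Structural information distances between adjacent graphs in a stream.

    degrees: degree sequence of the base graph G_1.
    changes: list of dicts {node: signed degree change}, one per step.
    """
    d = list(degrees)
    m = sum(d) / 2
    h = math.log2(2 * m) - sum(f(x) for x in d) / (2 * m)    # H1(G_1)
    distances = []
    for delta in changes:
        dm = sum(delta.values()) / 2
        # Statistics of Lemma 5.6, computed over the touched nodes only.
        a = sum(f(d[i] + dd) - f(d[i]) for i, dd in delta.items())
        y = sum(d[i] for i in delta)
        z = sum(f(d[i]) for i in delta)
        c = (2 * m + dm) / (4 * m * (m + dm))
        b = sum(f(d[i] / (4 * m) + (d[i] + dd) / (4 * (m + dm)))
                for i, dd in delta.items())
        h_next = (f(2 * (m + dm)) - a - f(2 * m) + 2 * m * h) / (2 * (m + dm))
        h_avg = -b - (2 * m - y) * f(c) - c * (f(2 * m) - 2 * m * h - z)
        # Clamp tiny negative values caused by floating-point round-off.
        distances.append(math.sqrt(max(h_avg - (h + h_next) / 2, 0.0)))
        # Update the edge count, the degree sequence, and H1 for the next step.
        m += dm
        for i, dd in delta.items():
            d[i] += dd
        h = h_next
    return distances

# Path 0-1-2-3: first insert the edge (0, 3), then delete the edge (1, 2).
dist = incre_sim([1, 2, 2, 1], [{0: 1, 3: 1}, {1: -1, 2: -1}])
```

Each iteration touches only the entries listed in the change dictionary, matching the $\Theta(|V_k|)$ per-step cost stated above.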

6 EXPERIMENTS AND EVALUATIONS

We conduct extensive experiments over both synthetic and real-world datasets to answer the following questions:

Q1. Universality of the entropy gap over arbitrary simple graphs: Is the entropy gap close to 0 for a wide range of graphs? Is the structural information a good proxy of the von Neumann graph entropy for a wide range of graphs?

Q2. Sensitivity of the entropy gap to graph properties: How do graph properties affect the value of the entropy gap?

Q3. Accuracy of the approximation: As a proxy of the von Neumann graph entropy, is the structural information more accurate than its prominent competitors?

Table 2: Real-world datasets used in our experiments.

Static graphs without timestamps

| Name | #Nodes | #Edges | Category | Statistics (avg. degree) |
|---|---|---|---|---|
| Zachary (ZA) | 34 | 78 | Friendship | 4.59 |
| Dolphins (DO) | 62 | 159 | Animal | 5.13 |
| Jazz (JA) | 198 | 2,742 | Contact | 27.70 |
| Skitter (SK) | 1,696,415 | 11,095,298 | Internet | 13.08 |
| Brightkite (BK) | 58,228 | 214,078 | Friendship | 7.35 |
| Caida (CA) | 26,475 | 53,381 | Internet | 4.03 |
| YouTube (YT) | 1,134,890 | 2,987,624 | Friendship | 5.27 |
| LiveJournal (LJ) | 3,997,962 | 34,681,189 | Friendship | 17.35 |
| Pokec (PK) | 1,632,803 | 22,301,964 | Friendship | 27.32 |

Dynamic graphs with timestamps

| Name | #Nodes | #Edges | Category | Statistics (#snapshots) |
|---|---|---|---|---|
| Wiki-IT (WK) | 1,204,009 | 34,826,283 | Hyperlink | 100 |
| Facebook (FB) | 61,096 | 788,135 | Friendship | 29 |

Q4. Speed of the computation: Is the computation of the structural information faster than its prominent competitors?

Q5. Extensibility of the entropy gap to weighted graphs: Is the entropy gap sensitive to the change of edge weights? Is the entropy gap still close to 0 for weighted graphs?

Q6. Performance analysis (Appendix A): What is the performance of EntropyAug (Algorithm 1) in maximizing the von Neumann graph entropy? What is the performance of IncreSim (Algorithm 2) in analyzing graph streams? Can the structural information distance be further used to detect anomalies in a graph stream?

6.1 Experimental Settings

Datasets: We consider both synthetic graphs and real-world graphs. The synthetic graphs are generated from three well-known random graph models: the Erdős-Rényi (ER) model, the Barabási-Albert (BA) model [3], and the Watts-Strogatz (WS) model [43]. The real-world graphs [24, 27, 42] used in our experiments are listed in Table 2, which contains both static graphs with varying size and average degree, and temporal graphs with varying size and time span. In every static graph, we ignore the direction and weight of all edges and remove both self-loops and multiple edges. We treat every temporal graph as a stream of undirected weighted edges with timestamps. For the convenience of analysis, we partition these edges into several groups where each group is within a certain time interval.

Hardware: The experiments were performed on a server with an Intel(R) Xeon(R) CPU at 2.40 GHz (32 virtual cores) and 256 GB RAM, averaging over 10 runs for random algorithms and random inputs unless stated otherwise.

Implementation: All of the proposed algorithms and baselines are implemented in Python.

Reproducibility: The code and datasets used in the paper are available at https://github.com/xuecheng27/WWW21-Structural-Information.

6.2 Q1. Universality (Fig. 2)

To evaluate the universality of the entropy gap, we measure the structural information and the exact von Neumann entropy on a set of synthetic graphs with 2,000 nodes. For the ER and BA models, we generate graphs with average degree in {2, 4, . . . , 200}. For the WS model, we generate graphs with edge rewiring probability in {0, 1/20, . . . , 1} for each average degree in {6, 10, 20, 50}. We additionally measure the sharpened lower and upper bounds of the entropy gap. The results are shown in Fig. 2.

The observations are threefold. First, the entropy gap is close to 0 for a wide range of graphs. The entropy gap of each synthetic graph is no more than 0.2, whereas the exact von Neumann entropy is greater than 10. Second, the entropy gap is negatively correlated with the average degree. Dense graphs tend to have a very small entropy gap. Third, the structural information is linearly correlated with the von Neumann graph entropy, with only a few exceptions. There is no exception for the ER synthetic graphs. For the BA synthetic graphs, the exceptions are those graphs with extremely small average degree. For the WS synthetic graphs, the exceptions are those graphs with extremely small edge rewiring probability.
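The size of the entropy gap can be illustrated without an eigensolver on two graph families whose Laplacian spectra are known in closed form. The sketch below is not part of the paper's experiments: it assumes the von Neumann graph entropy is the Shannon entropy (in bits) of the Laplacian eigenvalues normalized by their sum, uses the standard spectra of the complete graph (0 once, $n$ with multiplicity $n-1$) and the star graph (0 once, 1 with multiplicity $n-2$, $n$ once), and all function names are hypothetical.

```python
import math

def entropy_bits(weights):
    """Shannon entropy in bits of non-negative weights normalized to sum 1."""
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

def complete_graph(n):
    """Degree sequence and Laplacian spectrum of the complete graph K_n."""
    return [n - 1] * n, [0.0] + [float(n)] * (n - 1)

def star_graph(n):
    """Degree sequence and Laplacian spectrum of the star on n nodes."""
    return [n - 1] + [1] * (n - 1), [0.0] + [1.0] * (n - 2) + [float(n)]

def entropy_gap(degrees, spectrum):
    """Structural information minus von Neumann graph entropy."""
    return entropy_bits(degrees) - entropy_bits(spectrum)

n = 2000
gap_complete = entropy_gap(*complete_graph(n))   # dense graph
gap_star = entropy_gap(*star_graph(n))           # sparse, highly skewed graph
```

For the complete graph the gap has the closed form $\log_2 \frac{n}{n-1}$, which vanishes as $n$ grows; this is consistent with the observation that dense graphs tend to have a very small entropy gap.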

6.3 Q2. Sensitivity (Fig. 2, Fig. 3)

To evaluate the sensitivity of the entropy gap to graph properties such as average degree, graph size, and rewiring probability, we further measure the entropy gap of 10 synthetic graphs with graph size in {500, 1000, . . . , 5000} for each random model. The average degree is chosen from {2, 5, 10, 20, 50, 100} for the ER and BA models, and the edge rewiring probability is chosen from {0, 0.1, 0.2, 0.4, 0.8, 1} for the WS model.

The observations from Fig. 2 and Fig. 3 are threefold. First, the entropy gap decreases as the average degree increases for all three random graph models. Second, the entropy gap decreases as the edge rewiring probability increases for the WS model. Third, the entropy gap is nearly insensitive to the change of graph size.

6.4 Q3. Accuracy (Fig. 4)

To evaluate the accuracy of the structural information as an approximation of the von Neumann graph entropy, we measure the structural information, the exact von Neumann entropy (when the graph size is small), and three prominent approximations (as competitors) on 9 real-world static graphs. The competitors are 1) FINGER-$\widehat{H}$ [7], defined as $\widehat{H}_F(G) = -Q \log_2(\lambda_{\max}/\mathrm{tr}(L))$ where $Q = 1 - \mathrm{tr}(L^2)/\mathrm{tr}^2(L)$, 2) FINGER-$\widetilde{H}$ [7], defined as $\widetilde{H}_F(G) = -Q \log_2(2 d_{\max}/\mathrm{tr}(L))$, and 3) SLaQ [40]. The results in Fig. 4 show that the structural information is an accurate approximation of the von Neumann graph entropy. The approximation error of the structural information is much smaller than those of $\widehat{H}_F$ and $\widetilde{H}_F$. It is comparable to the approximation error of SLaQ, with only a few exceptions such as YT and SK where the structural information is slightly better.
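For an unweighted simple graph, the FINGER-$\widetilde{H}$ baseline can be evaluated from the degree sequence alone, which makes a small side-by-side comparison possible. The sketch below is an illustration, not the baseline's reference implementation: it assumes $\mathrm{tr}(L) = \sum_i d_i$ and $\mathrm{tr}(L^2) = \sum_i d_i^2 + \sum_i d_i$ (valid when every edge has weight 1), and the function names are hypothetical.

```python
import math

def finger_h_tilde(degrees):
    """-Q * log2(2 * d_max / tr(L)) for an unweighted simple graph."""
    trace = sum(degrees)                                 # tr(L) = 2m
    trace_sq = sum(d * d for d in degrees) + trace       # tr(L^2)
    q = 1.0 - trace_sq / trace ** 2
    return -q * math.log2(2 * max(degrees) / trace)

def structural_information(degrees):
    """Shannon entropy in bits of the normalized degree sequence."""
    vol = sum(degrees)
    return -sum((d / vol) * math.log2(d / vol) for d in degrees if d > 0)

# Complete graph K_4, whose exact von Neumann entropy is log2(3) because
# its Laplacian eigenvalues are 0, 4, 4, 4.
degrees = [3, 3, 3, 3]
exact = math.log2(3)
err_finger = abs(finger_h_tilde(degrees) - exact)
err_structural = abs(structural_information(degrees) - exact)
```

On this toy graph the structural information is closer to the exact entropy than the FINGER-$\widetilde{H}$ value; a single small example of course says nothing about the real-world graphs in Fig. 4.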

6.5 Q4. Speed (Fig. 5)

To evaluate the computational speed of the structural information, we measure the running time of the structural information and its three competitors on 9 real-world static graphs. The results in Fig. 5 show that the computation of structural information is fast. It is about 2 orders of magnitude faster than $\widehat{H}_F$, at least 2 orders of magnitude faster than SLaQ, and comparable to $\widetilde{H}_F$. Combining

Figure 1: The close relation between Laplacian spectra and degree sequence in two representative real-world graphs.
Table 1: Comparison of methods for approximating the von Neumann graph entropy in terms of fulfilled (✓) and missing (✗) properties.
Figure 2: The structural information, von Neumann graph entropy, and entropy gap of synthetic graphs generated from three random graph models with 2,000 nodes, varying average degree, and edge rewiring probability.