Compact Encodings of Planar Graphs via Canonical Orderings and Multiple Parentheses

Richie Chih-Nan Chuang 1, Ashim Garg 2, Xin He 2,*, Ming-Yang Kao 3,**, and Hsueh-I Lu 1

1 Department of Computer Science and Information Engineering, National Chung-Cheng University, Chia-Yi 621, Taiwan, {cjn85, hil}@cs.ccu.edu.tw

2 Department of Computer Science, State University of New York at Buffalo, Buffalo, NY 14260, USA, {agarg,xinhe}@cs.buffalo.edu

3 Department of Computer Science, Yale University, New Haven, CT 06250, USA, kao-ming-yang@cs.yale.edu

Abstract. We consider the problem of coding planar graphs by binary strings. Depending on whether O(1)-time queries for adjacency and degree are supported, we present three sets of coding schemes which all take linear time for encoding and decoding. The encoding lengths are significantly shorter than the previously known results in each case.

1 Introduction

This paper investigates the problem of encoding a graph G with n nodes and m edges into a binary string S. This problem has been extensively studied with three objectives: (1) minimizing the length of S, (2) minimizing the time needed to compute and decode S, and (3) supporting queries efficiently.

A number of coding schemes with different trade-offs have been proposed.

The adjacency-list encoding of a graph is widely useful but requires 2m⌈log n⌉ bits. (All logarithms are of base 2.) A folklore scheme uses 2n bits to encode a rooted n-node tree into a string of n pairs of balanced parentheses. Since the total number of such trees is at least (1/(2(n-1))) · (2n-2)!/((n-1)!(n-1)!), the minimum number of bits needed to differentiate these trees is the log of this quantity, which is 2n - o(n).

Thus, two bits per edge up to an additive o(1) term is an information-theoretic tight bound for encoding rooted trees. Works on encodings of certain other graph families can be found in [7, 12, 4, 17, 5, 16].
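As a numeric sanity check of the 2n - o(n) bound, the following Python sketch evaluates the logarithm of the number of rooted ordered trees on n nodes. It assumes the standard count of such trees, the Catalan number C(n-1), which is not stated in the text above; the function names are illustrative only.

```python
from math import comb, log2

def tree_count(n: int) -> int:
    """Number of rooted ordered trees with n nodes (assumed: Catalan number C(n-1))."""
    return comb(2 * n - 2, n - 1) // n

def min_bits(n: int) -> float:
    """Information-theoretic minimum number of bits to tell these trees apart."""
    return log2(tree_count(n))

for n in (10, 100, 1000, 10000):
    gap = 2 * n - min_bits(n)  # slack of the 2n-bit folklore scheme
    print(n, round(min_bits(n), 1), round(gap, 1))
```

The printed gap grows only logarithmically in n, which is consistent with the o(n) term.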

Let G be a plane graph with n nodes, m edges, f faces, and no self-loop. G need not be connected or simple. We give coding schemes for G which all take O(m + n) time for encoding and decoding. The bit counts of our schemes depend on the level of required query support and the structure of the encoded graphs.

For applications that require support of certain queries, Jacobson [6] gave a Θ(n)-bit encoding for a simple planar graph G that supports traversal in Θ(log n) time per node visited. Munro and Raman [15] recently gave schemes to encode a planar graph using 2m + 8n + o(m + n) bits while supporting adjacency and degree queries in O(1) time. We reduce this bit count to 2m + (5 + 1/k)n + o(m + n) for any

* Research supported in part by NSF Grant CCR-9205982.

** Research supported in part by NSF Grant CCR-9531028.


                      | adjacency and degree            | adjacency            | no query
                      | [15]     | ours                 | old | ours           | [13]  | ours
self-loops            |          |                      |     |                | 3.58m |
general               | 2m + 8n  | 2m + (5 + 1/k)n      |     | 2m + (14/3)n   |       |
simple                |          | (5/3)m + (5 + 1/k)n  |     | (4/3)m + 5n    |       |
degree-one free       |          |                      |     |                | 3m    |
triconnected          |          | 2m + 3n              |     | 2m + 3n        |       |
simple & triconnected |          | 2m + 2n              |     | 2m + 2n        |       | 1.5(log 3)m
triangulated          |          | 2m + 2n              |     | 2m + 2n        |       |
simple & triangulated |          | 2m + n               |     | 2m + n         | 1.53m | (4/3)m

Fig. 1. This table compares our results with previous ones, where k is a positive constant. The lower-order terms are omitted. All but row 1 assume that G has no self-loop.

constant k > 0 with the same query support. If G is triconnected or triangulated, our bit count decreases to 2m + 3n + o(m + n) or 2m + 2n + o(m + n), resp. With the same query support, we can encode a simple G using only (5/3)m + (5 + 1/k)n + o(n) bits for any constant k > 0. If a simple G is also triconnected or triangulated, the bit count is 2m + 2n + o(n) or 2m + n + o(n), resp. If only O(1)-time adjacency queries are supported, our bit counts for a general G and a simple G become 2m + (14/3)n + o(m + n) and (4/3)m + 5n + o(n), resp.

If we only need to reconstruct G with no query support, the code length can be substantially shortened. For this case, Turán [19] used 4m bits. This bound was improved by Keeler and Westbrook [13] to 3.58m bits. They also used 1.53m bits for a triangulated simple G, and 3m bits for a connected G free of self-loops and degree-one nodes. For a simple triangulated G, we improve the count to (4/3)m + O(1). For a simple G that is free of self-loops, triconnected and thus free of degree-one nodes, we improve the bit count to 1.5(log 3)m + O(1). Figure 1 summarizes our results and compares them with previous ones.

Our coding schemes employ two new tools. One is a set of new techniques for processing strings of multiple types of parentheses. The other is a set of new properties of canonical orderings for plane graphs, which were introduced in [3, 8]. These concepts have also proven useful for drawing plane graphs [10, 11, 18]. §2 discusses the new tools. §3 describes the coding schemes that support queries. §4 presents the more compact coding schemes which do not support queries. Due to space limitation, the proofs of most lemmas are omitted.

2 New Encoding Tools

A simple (resp., multiple) graph is one that does not contain (resp., may contain) multiple edges between two distinct vertices. A multiple graph can be viewed as a simple one with positive integral edge weights, where each edge's weight indicates its multiplicity. The simple version of a multiple graph is one obtained from the graph by deleting all but one copy of each edge. In this paper, all graphs are multiple unless explicitly stated otherwise. The degree of a node v in a graph is the number of edges, counting multiple edges, incident to v in the graph. A node v is a leaf of a tree T if v has exactly one neighbor in T. Since T may have multiple edges, a leaf of T may have a degree greater than one.

2.1 Multiple Types of Parentheses

Let S be a string. S is binary if it contains at most two kinds of symbols. Let S[i] be the symbol at the i-th position of S, for 1 ≤ i ≤ |S|. Let select(S, i, □) be the position of the i-th □ in S. Let rank(S, k, □) be the number of □ that precede or are at the k-th position of S. Clearly, if k = select(S, i, □), then i = rank(S, k, □).

Let S1 + ... + Sk denote the concatenation of strings S1, ..., Sk. (In this paper, the encoding of G is usually a concatenation of several strings. For simplicity, we ignore the issue of separating these strings. This can be handled by using well-known data compression techniques with log n + O(log log n) bits [1].)

Let S be a string of multiple types of parentheses. Let S[i] and S[j] be an open and a close parenthesis with i < j of the same type. S[i] and S[j] match in S if every parenthesis enclosed by S[i] and S[j] that is the same type as S[i] and S[j] matches a parenthesis enclosed by S[i] and S[j].

Here are some queries defined for S:

- Let match(S, i) be the position of the parenthesis in S that matches S[i].

- Let first_k(S, i) (resp., last_k(S, i)) be the position of the first (resp., last) parenthesis of the k-th type that succeeds (resp., precedes) S[i].

- Let enclose_k(S, i1, i2) be the positions (j1, j2) of the closest matching parenthesis pair of the k-th type that encloses S[i1] and S[i2].

S is balanced if every parenthesis in S belongs to a matching parenthesis pair.

Note that the answer to a query above may be undefined. If there is only one type of parentheses in S, the subscript k in first_k(S, i), last_k(S, i), and enclose_k(S, i, j) may be omitted; thus, first(S, i) = i + 1 and last(S, i) = i - 1. If it is clear from the context, the parameter S may also be omitted.
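To make the query definitions concrete, here is a naive Python sketch that answers them by scanning the string. It is only a reference implementation under our own assumptions: positions are 1-based as in the text, types are numbered 0, 1, 2 for round, square and curly symbols, undefined answers are returned as None, and nothing here achieves the O(1) query time discussed in this section.

```python
OPEN = {"(": 0, "[": 1, "{": 2}   # open symbol -> type index (assumed numbering)
CLOSE = {")": 0, "]": 1, "}": 2}  # close symbol -> type index

def type_of(c):
    return OPEN.get(c, CLOSE.get(c))

def rank(S, k, sym):
    """Number of occurrences of sym at or before position k."""
    return S[:k].count(sym)

def select(S, i, sym):
    """Position of the i-th sym, or None if there are fewer than i."""
    pos = [p for p, c in enumerate(S, 1) if c == sym]
    return pos[i - 1] if 1 <= i <= len(pos) else None

def match_table(S):
    """Partner of every matched position; each type is matched on its own stack."""
    stacks, partner = {}, {}
    for p, c in enumerate(S, 1):
        if c in OPEN:
            stacks.setdefault(OPEN[c], []).append(p)
        elif c in CLOSE and stacks.get(CLOSE[c]):
            q = stacks[CLOSE[c]].pop()
            partner[p], partner[q] = q, p
    return partner

def match(S, i):
    return match_table(S).get(i)

def first(S, i, k):
    """First parenthesis of type k that succeeds S[i]."""
    return next((p for p in range(i + 1, len(S) + 1) if type_of(S[p - 1]) == k), None)

def last(S, i, k):
    """Last parenthesis of type k that precedes S[i]."""
    return next((p for p in range(i - 1, 0, -1) if type_of(S[p - 1]) == k), None)

def enclose(S, i1, i2, k):
    """Closest matching pair of type k that strictly encloses S[i1] and S[i2]."""
    partner = match_table(S)
    lo, hi = min(i1, i2), max(i1, i2)
    for p in range(lo - 1, 0, -1):
        if OPEN.get(S[p - 1]) == k and partner.get(p, 0) > hi:
            return (p, partner[p])
    return None
```

For the unbalanced string S = "[[({)]({}}(])", match(S, 3) is 5, while match(S, 7) is undefined because the round parenthesis at position 7 is never closed.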

Fact 1 ([2, 14, 15])

1. Let S be a binary string. An auxiliary binary string μ1(S) of length o(|S|) can be obtained in O(|S|) time such that rank(S, i, □) and select(S, i, □) can be answered from S + μ1(S) in O(1) time.

2. Let S be a balanced string of one type of parentheses. An auxiliary binary string μ2(S) of length o(|S|) can be obtained in O(|S|) time such that match(S, i) and enclose(S, i, j) can be answered from S + μ2(S) in O(1) time.

The next theorem generalizes Fact 1 to handle a string of multiple types of parentheses that is not necessarily balanced.

Theorem 1. Let S be a string of O(1) types of parentheses that may be unbalanced. An auxiliary o(|S|)-bit string α(S) can be obtained in O(|S|) time such that rank(S, i, □), select(S, i, □), match(S, i), first_k(S, i), last_k(S, i), and enclose_k(S, i, j) can be answered from S + α(S) in O(1) time.

Proof. The statement for rank(S, i, □) and select(S, i, □) is a straightforward generalization of Fact 1(1). The statement for first_k(S, i) can be shown as follows. Let f(S, i, □) be the position of the first □ that succeeds S[i]. Clearly, f(S, i, □) = select(S, 1 + rank(S, i, □), □) and first_k(S, i) = min{f(S, i, "("), f(S, i, ")")}, where ( and ) are the open and close parentheses of the k-th type in S, resp. The statement for last_k(S, i) can be shown similarly.

To prove the statement for match(S, i) and enclose_k(S, i, j), first we can show that Fact 1 can be generalized to an unbalanced binary string S (proof omitted).

Suppose S has ℓ types of parentheses. Let Sk (1 ≤ k ≤ ℓ) be the string obtained from S as follows.

- Every open (resp., close) parenthesis of the k-th type is replaced by two consecutive open (resp., close) parentheses of the k-th type.

- Every parenthesis of any other type is replaced by a matching parenthesis pair of the k-th type.

Each Sk is a string of length 2|S| consisting of one type of parentheses and each symbol Sk[i] can be determined from S[⌊i/2⌋] in O(1) time. For example,

S = [[({)]({}}(])

S1 = ()()((()))()((()()()((()))

S2 = [[[[[][][]]][][][][][]]][]

The queries for S can be answered by answering the queries for Sk as follows.

- match(S, i) = ⌊match(Sk, 2i)/2⌋, where S[i] is a parenthesis of the k-th type.

- Given i and j, let A = {2i, 2i + 1, match(Sk, 2i), match(Sk, 2i + 1)} ∪ {2j, 2j + 1, match(Sk, 2j), match(Sk, 2j + 1)}. Let i1 = min A, j1 = max A, and (i2, j2) = enclose(Sk, i1, j1). Then enclose_k(S, i, j) = (⌊i2/2⌋, ⌊j2/2⌋).

Note that each of the above queries on some Sk can be answered in O(1) time by Sk + μ2(Sk). Since each symbol Sk[i] can be determined from S[⌊i/2⌋] in O(1) time, the theorem holds by letting α(S) = μ2(S1) + μ2(S2) + ... + μ2(Sℓ). □
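The reduction in the proof can be sketched in Python as follows. This is an illustration, not the constant-time structure: it assumes 0-based positions, so that S[i] corresponds to positions 2i and 2i + 1 of the doubled string and the floor-based formula applies directly, and it answers match on the one-type string with a plain stack. All function names are hypothetical.

```python
def single_type(S, open_k, close_k):
    """Build S_k: double every type-k parenthesis, turn any other symbol into '()'."""
    pieces = []
    for c in S:
        if c == open_k:
            pieces.append("((")
        elif c == close_k:
            pieces.append("))")
        else:
            pieces.append("()")
    return "".join(pieces)

def match_one_type(P):
    """Stack-based matching on a one-type string; unmatched positions are absent."""
    stack, partner = [], {}
    for p, c in enumerate(P):
        if c == "(":
            stack.append(p)
        elif stack:
            q = stack.pop()
            partner[p], partner[q] = q, p
    return partner

def match_via_reduction(S, i, open_k, close_k):
    """match(S, i) for a type-k parenthesis at 0-based position i, or None."""
    partner = match_one_type(single_type(S, open_k, close_k))
    j = partner.get(2 * i)
    return None if j is None else j // 2
```

On the example string of the proof, single_type(S, "(", ")") reproduces S1, and the matches of the round parentheses are recovered by halving positions in S1.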

Let S1, ..., Sk be k strings, each of O(1) types of parentheses. For the remainder of the paper, let α(S1, S2, ..., Sk) denote α(S1) + α(S2) + ... + α(Sk).

2.2 Encoding Trees

An encoding for a graph G is weakly convenient if it takes linear time to reconstruct G; O(1) time to determine the adjacency of two nodes in G; O(d) time to determine the degree of a node; and O(d) time to list the neighbors of a node of degree d. A weakly convenient encoding for G is convenient if it takes O(1) time to determine the degree of a node.

The folklore encoding F(T) of a simple rooted unlabeled tree T of n nodes uses a balanced string S of one type of parentheses to represent the preordering of T. Each node of T corresponds to a matching parenthesis pair in S.

Fact 2 Let vi be the i-th node in the preordering of a rooted simple tree T. The following properties hold for the folklore encoding S of T.

1. The parenthesis pair for vi encloses the parenthesis pair for vj in S if and only if vi is an ancestor of vj.

2. The parenthesis pair for vi precedes the parenthesis pair for vj in S if and only if vi and vj are not related and i < j.

3. The i-th open parenthesis in S belongs to the parenthesis pair for vi.
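The folklore encoding can be sketched in a few lines of Python. The tree representation (a dict from a node to its ordered list of children) and the function name are our assumptions for illustration.

```python
def folklore(children, root):
    """Return the balanced-parentheses string of a rooted ordered tree and its preorder.

    children: dict mapping a node to the ordered list of its children (assumed input format).
    """
    out, order = [], []
    stack = [(root, False)]
    while stack:
        v, closing = stack.pop()
        if closing:
            out.append(")")          # leave v: close its pair
            continue
        out.append("(")              # enter v: open its pair
        order.append(v)
        stack.append((v, True))
        for c in reversed(children.get(v, [])):
            stack.append((c, False)) # leftmost child is processed first
    return "".join(out), order

s, order = folklore({"r": ["a", "b"], "a": ["c"]}, "r")
print(s, order)
```

In this small example the i-th open parenthesis belongs to the i-th node of the returned preorder, in line with Fact 2(3).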

Fact 3 ([15]) Let T be a simple rooted tree of n nodes. F(T) + μ2(F(T)) is a weakly convenient encoding for T of 2n + o(n) bits, obtainable in O(n) time.

We show Fact 3 holds even if S is mixed with other O(1) types of parentheses.

Theorem 2. Let T be a simple rooted unlabeled tree. Let S be a string of O(1) types of parentheses such that a given type of parentheses in S gives the folklore encoding of T. Then S + α(S) is a weakly convenient encoding of T.

Proof. Let the parentheses, denoted by ( and ), in S used by the encoding of T be the k-th type. Let v1, ..., vn be the preordering of T. Let pi = select(S, i, "(") and qi = match(S, pi). By Theorem 1, pi and qi can be obtained from S + α(S) in O(1) time. The index i can be obtained from pi or qi in O(1) time by i = rank(S, pi, "(") = rank(S, match(S, qi), "("). The queries for T are as follows.

Case: adjacency queries. Suppose i < j. Then, (pi, qi) = enclose_k(pj, qj) if and only if vi is adjacent to vj in T, i.e., vi is the parent of vj in T.

Case: neighbor queries. Suppose that vi has degree d in T. The neighbors of vi in T can be listed in O(d) time as follows. First, if i ≠ 1, output vj, where (pj, qj) = enclose_k(pi, qi). Then, let pj = first_k(pi). As long as pj < qi, we repeatedly output vj and update pj by first_k(match(pj)).

Case: degree queries. Since T is simple, the degree d of vi in T is simply the number of neighbors in T, which is obtainable in O(d) time. □

We next improve Theorem 2 to obtain convenient encodings for multiple trees. For a condition P, let δ(P) = 1 if P holds; let δ(P) = 0 otherwise.

Theorem 3. Let T be a rooted unlabeled tree of n nodes, n1 leaves and m edges. Let S + α(S) be a weakly convenient encoding of Ts (the simple version of T).

1. A string D of (2m - n + n1) bits can be obtained in O(m + n) time such that S + D + α(S, D) is a convenient encoding for T of 2m + n + n1 + o(m) bits.

2. If T is simple, a string D of n1 bits and a string Y of n bits can be obtained in O(m + n) time such that S + D + α(S, D, Y) is a convenient encoding for T and has 2n + n1 + o(n) bits.

Proof. Let v1, ..., vn be the preordering of Ts. Let di be the degree of vi in T. We show how to use a string D to store the information required to obtain di in O(1) time. We only prove Statement 1.

Let δi = δ(vi is internal in Ts). Since S + α(S) is a weakly convenient encoding for Ts, each δi can be obtained in O(1) time from S + α(S). Initially, D is just n copies of 1. Let bi = di - 1 - δi. We add bi copies of 0 right after the i-th 1 in D for each vi. Since the number of internal nodes in Ts is n - n1, the bit count of D is n + Σ_{i=1..n} (di - 1 - δi) = n + 2m - n - (n - n1) = 2m - n + n1. D can be obtained from T in O(m + n) time. The number bi of 0's right after the i-th 1 in D is select(D, i + 1, 1) - select(D, i, 1) - 1. Since di = 1 + δi + bi, the degree of vi in T can be computed in O(1) time from S + D + α(S, D). □
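The construction of D and the degree formula can be checked with a short Python sketch. The inputs (a list of degrees and a list of internal/leaf flags in preorder) are assumed to be given; the sentinel position |D| + 1 used for the last node is our own convention, since select(D, n + 1, 1) is otherwise undefined.

```python
def build_D(degrees, internal):
    """For each node in preorder: a 1 followed by b_i = d_i - 1 - delta_i zeros."""
    return "".join("1" + "0" * (d - 1 - int(is_int))
                   for d, is_int in zip(degrees, internal))

def degree(D, i, internal_i):
    """Recover d_i = 1 + delta_i + b_i, where b_i is read off D with select."""
    ones = [p for p, c in enumerate(D, 1) if c == "1"]
    ones.append(len(D) + 1)            # sentinel standing in for select(D, n + 1, 1)
    b = ones[i] - ones[i - 1] - 1      # select(D, i+1, 1) - select(D, i, 1) - 1
    return 1 + int(internal_i) + b

# Hypothetical multiple tree: v1 is joined to v2 by two parallel edges and to v3 by one.
degrees = [3, 2, 1]
internal = [True, False, False]        # v2 and v3 each have one neighbor, so they are leaves
D = build_D(degrees, internal)
print(D, [degree(D, i, internal[i - 1]) for i in (1, 2, 3)])
```

Here n = 3, m = 3 and n1 = 2, so D has 2m - n + n1 = 5 bits, as the bit count in the proof predicts.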

[Drawing omitted: a triconnected plane graph on nodes 1 to 14, built in the following steps.]

step j:       1        2     3  4  5       6   7   8
interval Ij:  3, 4, 5  6, 7  8  9  10, 11  12  13  14

Fig. 2. A triconnected plane graph G and a canonical ordering of G.

2.3 Canonical Orderings

In this subsection, we describe the canonical ordering of plane graphs. It was first introduced for plane triangulations in [3], and extended to triconnected plane graphs in [8]. We prove some new properties of this ordering. Let G be a simple triconnected plane graph. Let v1, ..., vn be a node ordering of G. Let Gi be the subgraph of G induced by v1, v2, ..., vi. Let Hi be the exterior face of Gi.

Definition 1. Let v1, v2, ..., vn be a node ordering of a simple triconnected plane graph G = (V, E), where (v1, v2) is an arbitrary edge on the exterior face of G. The ordering is canonical if there exist ordered intervals I1, ..., IK that partition the interval [3, n] such that the following properties hold for every 1 ≤ j ≤ K: Suppose Ij = [k, k + q]. Let Cj be the path (vk, v_{k+1}, ..., v_{k+q}).

- The graph G_{k+q} is biconnected. Its boundary H_{k+q} contains the edge (v1, v2) and the path Cj. Cj has no chords in G.

- If q = 0, vk has at least two neighbors in G_{k-1}, each of which is on H_{k-1}.

- If q > 0, the path Cj has exactly two neighbors in G_{k-1}, each of which is on H_{k-1}. The leftmost neighbor v_ℓ is incident only to vk and the rightmost neighbor v_r is incident only to v_{k+q}.

- For each vi (k ≤ i ≤ k + q), if i < n, vi has at least one neighbor in G - G_{k+q}.

Figure 2 shows a canonical ordering of G. Every triconnected plane graph has a canonical ordering which can be constructed in O(n) time [8].

Given a canonical ordering of G with interval partition I1, I2, ..., IK, we can obtain G = Gn from G2, which consists of the single edge (v1, v2), through the following K steps: Suppose Ij = [k, k + q]. The j-th step obtains G_{k+q} from G_{k-1} by adding q + 1 nodes vk, v_{k+1}, ..., v_{k+q} and their incident edges in G_{k+q}.

Let T be the edge (v1, v2) plus the union of the paths (v_ℓ, vk, v_{k+1}, ..., v_{k+q}) over all intervals Ij = [k, k + q], 1 ≤ j ≤ K, where v_ℓ is the leftmost neighbor of vk on H_{k-1}. One can easily see that T is a spanning tree of G rooted at v1. T is called a canonical spanning tree of G. In Figure 2, T is indicated by thick lines.
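The definition of T translates directly into a parent map. The sketch below assumes that the interval partition and, for each interval [k, k + q], the index of the leftmost neighbor of vk on H_{k-1} are already known; computing them from an embedding is outside its scope, and the function name is hypothetical.

```python
def canonical_spanning_tree(intervals, leftmost):
    """Parent map of T, rooted at v1.

    intervals: list of (k, k + q) pairs partitioning [3, n].
    leftmost:  dict k -> index of the leftmost neighbor of v_k on H_{k-1} (assumed given).
    """
    parent = {1: None, 2: 1}                 # the edge (v1, v2)
    for lo, hi in intervals:
        parent[lo] = leftmost[lo]            # edge from the leftmost neighbor to v_k
        for i in range(lo + 1, hi + 1):      # path v_k, v_{k+1}, ..., v_{k+q}
            parent[i] = i - 1
    return parent

# K4 with v3 adjacent to v1, v2 and v4 adjacent to v1, v2, v3: two single-node steps.
print(canonical_spanning_tree([(3, 3), (4, 4)], {3: 1, 4: 1}))
```

For this K4 example the tree is the star centered at v1, with n - 1 = 3 edges.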

We show every canonical spanning tree T has the following property.

Lemma 1. Let T be the canonical spanning tree rooted at v1 corresponding to a canonical ordering v1, v2, ..., vn of G.

1. Let (vi, vi') be an edge in G - T. Then vi and vi' are not related in T.

2. For each node vi, the edges incident to vi show the following pattern around vi in counterclockwise order: the edge from vi to its parent in T; followed by a block of nontree edges from vi to lower-numbered nodes; followed by a block of tree edges from vi to its children in T; followed by a block of nontree edges from vi to higher-numbered nodes. (Any of these blocks may be empty.)

3 Schemes with Query Support

In this section we present our coding schemes that support queries. We give a weakly convenient encoding for a simple triconnected graph G in §3.1, which illustrates our basic techniques. We give the schemes for triconnected plane graphs in §3.2. We state our results for triangulated and general plane graphs in §3.3.

3.1 Basis

Let T be a canonical spanning tree of a simple triconnected plane graph G. We encode G using a balanced string S of two types of parentheses. The first type (parentheses) is for the edges of T. The second type (brackets) is for the edges of G - T.

The encoding. Let S be the folklore encoding for T. Let vi be the i-th node in the counterclockwise preordering of nodes of T. Let (i and )i be the parenthesis pair corresponding to vi in S. We augment S by inserting a pair [e and ]e of brackets for every edge e = (vi, vj), where i < j, of G - T as follows: we place [e right after )i and ]e right after (j.

Suppose that vi is adjacent to gi (resp., hi) lower- (resp., higher-) numbered nodes in G - T. Then S has the following pattern for every 1 ≤ i ≤ n: The open parenthesis (i is immediately followed by gi close brackets. The close parenthesis )i is immediately followed by hi open brackets. The following properties are clear.

Fact 4 Let e = (vi, vj) be an edge of G - T, where i < j. Then

1. [e is located between )i and the first parenthesis that succeeds )i in S;

2. ]e is located between (j and the first parenthesis that succeeds (j in S.

The following property for S is immediate from Fact 4:

Property A: The last parenthesis that precedes an open bracket is close. The last parenthesis that precedes a close bracket is open.

Let e = (vi, vj) be an edge of G - T, where i < j. By Lemma 1 and Fact 2, )i precedes (j in S. By Fact 4, S has the following property:

F a c t 5 Let e be an edge of G - T. Then [e precedes ]e in S.

Lemma 2. Let e and f be two edges in G - T with no common end vertex. Suppose that [e < [f. Then either [e < ]e < [f < ]f or [e < [f < ]f < ]e.

([e < [f indicates that [e precedes [f.) The above lemma implies that ]e and the bracket that matches [e in S are in the same block of brackets. From now on, we rename the close brackets by redefining ]e to be the close bracket that matches [e in S. It is clear that Property A and Facts 4, 5 still hold for S.

The queries. We show that S + α(S) is a weakly convenient encoding for G. Since T is simple, by Theorem 2, S + α(S) is a weakly convenient encoding for T. It remains to show that S + α(S) is also a weakly convenient encoding for G - T.

Let pi and qi be the positions of (i and )i in S, resp.

- Adjacency. Suppose i < j. Note that vi and vj are adjacent in G - T if and only if qi < p < q < first_1(pj), where (p, q) = enclose_2(first_1(qi), pj), as indicated by the following figure:

    )i   [   ...   (j   ]   ...
    qi < p < first_1(qi) ≤ pj < q < first_1(pj)

- Neighbors and degree. The neighbors, and thus the degree, of a degree-d node vi in G - T can be obtained in O(d) time as follows.

  • For every position p such that qi < p < first_1(qi), we output vj, where pj = last_1(match(p)). ((vi, vj) is an edge in G - T with j > i.)

  • For every position q such that pi < q < first_1(pi), we output vj, where qj = last_1(match(q)). ((vi, vj) is an edge in G - T with j < i.)

The bit count. Clearly |S| = 2n + 2(m - n) = 2m. Since there are four symbols in S, S can be encoded by 4m bits. We can improve the bit count by the following:

Lemma 3. Let S be a string of p parentheses and b brackets that satisfies Property A. Then S can be encoded by a string of 2p + b + o(p + b) bits, from which each S[i] can be determined in O(1) time.

Proof. Let S1 and S2 be two binary strings defined as follows.

- S1[i] = 1 if and only if S[i] is a parenthesis, 1 ≤ i ≤ p + b.

- S2[j] = 1 if and only if the j-th parenthesis in S is open, 1 ≤ j ≤ p.

Each S[i] can be determined from S1 + S2 + α(S1) in O(1) time as follows. Let j = rank(S1, i, 1). If S1[i] = 1, S[i] is a parenthesis. Whether it is open or close can be determined from S2[j]. If S1[i] = 0, S[i] is a bracket. The last parenthesis that precedes S[i] is the j-th parenthesis in S, located at position select(S1, j, 1), so by Property A whether S[i] is open or close can be determined from S2[j]: S[i] is an open bracket if and only if S2[j] = 0. □

We summarize the above arguments as follows.

L e m m a 4. A simple triconnected plane graph of n nodes and m edges has a weakly convenient encoding that has 2m + 2n + o(n) bits.

3.2 Triconnected Plane Graphs

We adopt all notation of §3.1 in this subsection. We first show that the weakly convenient encoding for a simple triconnected plane graph G given in §3.1 can be further shortened to 2(m + n - n1) + o(n) bits, where n1 is the number of leaves in T. We then give a convenient encoding for G that has 2m + 2n + o(n) bits. Finally we augment both encodings to handle multiple edges.

Let vi be a leaf of T, where 2 < i < n. By the definition of T and Definition 1, vi is adjacent to a higher-numbered node and a lower-numbered node in G - T. This implies that (i is immediately succeeded by a ], and )i is immediately succeeded by a [, for every such vi. Let P be the string obtained from S by removing a ] that immediately succeeds (i, and removing a [ that immediately succeeds )i, for every leaf vi of T, where 2 < i < n. If each S[j] were obtainable in O(1) time from P + α(P), the string S could then be replaced by P + α(P). This does not seem likely. However, we can show that there exists a string Q of length |P|, each symbol Q[i] of which can be obtained from P + α(P) in O(1) time, such that P + α(P, Q) is a weakly convenient encoding for G. Since S satisfies Property A and P is obtained from S by removing some brackets, P also satisfies Property A. Since P has 2n parentheses and 2(m - (n - 1) - n1) brackets, by Lemma 3 G has a weakly convenient encoding of 2(m + n - n1) + o(n) bits.

Next we augment our weakly convenient encoding for G to a convenient one. Note that the degree of vi in G - T can be obtained in O(1) time from P + α(P, Q). It remains to supply O(1)-time degree query for T. By Theorem 3 we know that n1 + o(n) more bits suffice. Therefore there exists a (2m + 2n - n1 + o(n))-bit convenient encoding for G that can be obtained in O(m + n) time.
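The two bit counts of this subsection follow from Lemma 3 by direct substitution. The short derivation below spells out the arithmetic, writing p and b for the numbers of parentheses and brackets of P; the additive constant 2 is absorbed into the o(n) term.

```latex
\begin{align*}
p &= 2n, \qquad b = 2\bigl(m - (n-1) - n_1\bigr),\\
2p + b &= 4n + 2m - 2n + 2 - 2n_1 = 2(m + n - n_1) + 2,\\
\underbrace{2(m + n - n_1) + o(n)}_{\text{weakly convenient}}
  + \underbrace{n_1 + o(n)}_{\text{degree queries for } T}
  &= 2m + 2n - n_1 + o(n).
\end{align*}
```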

The above convenient encoding can be extended to handle multiple edges as follows. Let Ga be a multiple graph obtained from G by adding some multiple edges between nodes that are adjacent in G - T. Note that the above arguments in this subsection also hold for Ga in exactly the same way. Suppose that Ga has ma edges. Then Ga has a weakly convenient encoding of 2(ma + n - n1) + o(ma + n) bits, from which the degree of a node in Ga - T can actually be determined in O(1) time. Let Gb be a multiple graph obtained from Ga by adding some multiple edges between nodes that are adjacent in T. Suppose that Gb has mb edges. Let Tb be the union of multiple edges of Gb between the nodes that are adjacent in T. In order to obtain a convenient encoding for Gb, it remains to supply O(1)-time query for the degree of a node in Tb. Clearly Tb has mb - ma + n - 1 edges. By Theorem 3, 2(mb - ma + n - 1) - n + n1 + o(mb) more bits suffice.

We summarize the subsection as follows.

Lemma 5. Let G be a triconnected plane graph of n nodes and m edges. Let Gs be the simple version of G, which has ms edges. Let n1 be the number of leaves in a canonical spanning tree of Gs. Then G (resp., Gs) has a convenient encoding of 2m + 3n - n1 + o(m + n) (resp., 2ms + 2n - n1 + o(n)) bits. All these encodings can be obtained in linear time.

3.3 Plane Triangulations and General Plane Graphs

Lemma 6. Let G be a plane triangulation of n ≥ 3 nodes and m edges. Let Gs be the simple version of G, which has ms = 3n - 6 edges. Then G (resp., Gs) has a convenient encoding of 2m + 2n + o(m + n) (resp., 2ms + n + o(n)) bits. All these encodings can be obtained in linear time.

Lemma 7. Let G be a plane graph of n nodes and m edges. Let Gs be the simple version of G, which has ms edges. Let k be a positive constant. Then G has a convenient encoding of 2m + (5 + 1/k)n + o(m + n) bits and a weakly convenient encoding of 2m + (14/3)n + o(m + n) bits. Gs has a convenient encoding of (5/3)ms + (5 + 1/k)n + o(n) bits and a weakly convenient encoding of (4/3)ms + 5n + o(n) bits.

4 More Compact Schemes

In some applications, the only requirement for the encoding is to reconstruct the graph; no queries are needed. In this case, we can obtain even more compact encodings for simple triconnected and triangulated plane graphs.

Let G be a simple triconnected plane graph. Let T be a canonical spanning tree of G. Let v1, ..., vn be the counterclockwise preordering of T. By using techniques in [8], it can be shown that this ordering is also a canonical ordering of G. (In Figure 2, the canonical ordering shown is the counterclockwise preordering of T.) This special canonical ordering is used in our encoding.

Let I1, ..., IK be the interval partition corresponding to the canonical ordering. G can be constructed from a single edge (v1, v2) through K steps. The j-th step corresponds to the interval Ij = [k, k + q]. There are two cases:

Case 1: A single node vk is added.

Case 2: A chain of q + 1 (q > 0) nodes vk, ..., v_{k+q} is added.

The last node added during a step is called a type a node. Other nodes are type b nodes. Thus the single node vk added during a Case 1 step is of type a. For a Case 2 step, the nodes vk, ..., v_{k+q-1} are of type b and v_{k+q} is of type a.

Consider the interval Ij = [k, k + q]. Let c1 (= v1), c2, ..., ct (= v2) be the nodes of the exterior face H_{k-1} ordered consecutively along H_{k-1} from left to right above the edge (v1, v2). We define the following terms.

Case 1. Let c_ℓ and c_r (1 ≤ ℓ < r ≤ t) be the leftmost and rightmost neighbors of vk in H_{k-1}, resp. The edge (c_ℓ, vk) is in T. The edge (c_r, vk) is called an external edge. The edges (ci, vk) where ℓ < i < r, if present, are internal edges.

Case 2. Let c_ℓ and c_r (1 ≤ ℓ < r ≤ t) be the neighbors of vk and v_{k+q} in H_{k-1}, resp. The edges (c_ℓ, vk), (vk, v_{k+1}), ..., (v_{k+q-1}, v_{k+q}) are in T. The edge (c_r, v_{k+q}) is called an external edge.

For each vk (1 ≤ k ≤ n - 1), let B(vk) denote the edge set {(vk, vj) | k < j}. By Definition 1 and Lemma 1, the edges in B(vk) show the following pattern around vk in counterclockwise order: a block (maybe empty) of tree edges; followed by at most one internal edge; followed by a block (maybe empty) of external edges. Next, we show that if we know the sets B(vk) (1 ≤ k ≤ n - 1) and the type of vk (3 ≤ k ≤ n), then we can uniquely reconstruct G.

First the edge (v1, v2) is drawn. Then we perform the following K steps. The j-th step processes Ij = [k, k + q]. Before the j-th step, the graph G_{k-1} and its exterior face H_{k-1} have been constructed. We need to determine the leftmost neighbor c_ℓ and the rightmost neighbor c_r of the nodes added in this step. We know (c_ℓ, vk) is a tree edge in T. Since v1, ..., vn is the counterclockwise preordering of T, c_ℓ is the rightmost node that has a remaining tree edge and c_r is the leftmost node that is to the right of c_ℓ and has a remaining external edge.

There are two cases:

If vk is of type a, this is a Case 1 step and vk is the single node added during this step. We add the edges (c_ℓ, vk) and (c_r, vk). For each ci with ℓ < i < r, if B(ci) contains an internal edge, we also add the edge (ci, vk).

If vk is of type b, this is a Case 2 step. Let q be the integer such that vk, v_{k+1}, ..., v_{k+q-1} are of type b and v_{k+q} is of type a. The chain vk, ..., v_{k+q} is added between c_ℓ and c_r.

This completes the j-th step. When the process terminates, we obtain the graph G. Thus, if we can encode the type of each vk and the sets B(vk), 1 ≤ k ≤ n - 1, then we get an encoding of G. We first define the type of a set B(vk), which tells us the types of the edges contained in B(vk). We use T to denote the tree edges, X the external edges, and I the internal edges. The type of B(vk) is a combination of the symbols T, X, I. For example, if B(vk) has type TXI, then B(vk) contains tree edges, external edges and an internal edge, and so on.

We further divide type a nodes vk into two subtypes: If B(vk) contains no tree edges, then vk is a type a1 node. If B(vk) contains tree edges, then vk is a type a2 node. For a type b node vk, since vk is not the last node added during a Case 2 step, by the definition of T, B(vk) contains at least one tree edge.

Our encoding of G uses two strings S1 and S2, both using three symbols 0, 1, *. The length of S1 is n. S1[k] (1 ≤ k ≤ n) indicates whether vk is of type a1, a2, or b. S2 encodes the sets B(vk) (1 ≤ k ≤ n - 1). Each B(vk) is specified by a code word, denoted by Code[vk]. S2 is the concatenation of Code[vk] (1 ≤ k ≤ n - 1). The length of Code[vk] equals the number of the edges in B(vk). Depending on the type of vk and the type of B(vk), Figure 3 gives the format of Code[vk].

In the table, the number of the tree edges (external edges, resp.) in B(vk) is denoted by α (β, resp.). 1^α denotes a string of α copies of 1, and so on. A symbol T (resp., X or I) under Code[vk] denotes the portion in Code[vk] corresponding to the tree (resp., external or internal) edges.

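Type of vk | Type of B(vk) | Code[vk]              | Portions
a1         | XI            | 1^β 0                 | X: 1^β; I: 0
a1         | I             | 0                     | I: 0
a1         | X             | 1^(β-1) *             | X: 1^(β-1) *
a2 or b    | T             | 0^(α-1) *             | T: 0^(α-1) *
a2 or b    | TXI           | 1^α 0^β *             | T: 1^α; X: 0^β; I: *
a2 or b    | TX            | 1^(α-1) 0 0^(β-1) 1   | T: 1^(α-1) 0; X: 0^(β-1) 1
a2 or b    | TI            | 1^α *                 | T: 1^α; I: *

Fig. 3. Code Word Table.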

From S1, S2 and the Code Word Table, we can easily recover the type of each vk and the sets B(vk). It is straightforward to implement the encoding and decoding procedures in O(n) time. The length of S1 is n. The length of S2 is m. We use the binary representation S of S1 and S2 to encode G. Since both S1 and S2 use 3 symbols, |S| = (log 3)(n + m). Thus we have the following:

Lemma 8. Any simple triconnected plane graph with n nodes and m edges can be encoded using at most (log 3)(n + m) bits. Both encoding and decoding procedures take O(n) time.
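One simple way to obtain a binary representation of about log 3 bits per ternary symbol is to read the string as a base-3 numeral. The Python sketch below does this with big-integer arithmetic; it is an illustration of the bit count only, is not linear-time, and is not claimed to be the conversion the authors use. The symbol-to-digit map and the function names are assumptions.

```python
from math import ceil, log2

SYMBOLS = "01*"  # assumed digit values: '0' -> 0, '1' -> 1, '*' -> 2

def pack(T):
    """Read T as a base-3 numeral; return its length and a fixed-width binary string."""
    value = 0
    for c in T:
        value = 3 * value + SYMBOLS.index(c)
    width = ceil(len(T) * log2(3))      # 3**len(T) <= 2**width, so value always fits
    return len(T), format(value, "b").zfill(width)

def unpack(length, bits):
    """Invert pack: peel off base-3 digits from the least significant end."""
    value, out = int(bits, 2), []
    for _ in range(length):
        value, d = divmod(value, 3)
        out.append(SYMBOLS[d])
    return "".join(reversed(out))

length, bits = pack("10*01*")
print(length, bits, unpack(length, bits))
```

The length of the ternary string has to be known to the decoder; here it is returned alongside the bits, whereas in the paper's setting it equals n + m.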

We can improve Lemma 8 as follows. Let G* be the dual of G. G* has f nodes, m edges and n faces. Since G is triconnected, so is G*. Furthermore, if n > 3, then f > 3 and G* has no self-loop or multiple edge. Thus, we can use the coding scheme of Lemma 8 to encode G* with at most (log 3)(f + m) bits. Since G can be uniquely determined from G*, to encode G, it suffices to encode G*. To make S shorter, if n < f, we encode G using at most (log 3)(n + m) bits; otherwise, we encode G* using at most (log 3)(f + m) bits. This new encoding uses at most (log 3)(min{n, f} + m) bits. Since min{n, f} ≤ (n + f)/2, the bit count is at most (log 3)(1.5m + 1) by Euler's formula n + f = m + 2. We use one extra bit to denote whether we encode G or G*. Thus we have proved the following:

Theorem 4. Any simple triconnected plane graph with n nodes, m edges and f faces can be encoded using at most (log 3)(min{n, f} + m) + 1 ≤ 1.5(log 3)m + 3 bits. Both encoding and decoding take O(n) time.
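For completeness, the inequality in Theorem 4 can be verified step by step from Euler's formula; the only additional fact used is log 3 < 2.

```latex
\begin{align*}
\min\{n, f\} &\le \frac{n + f}{2} = \frac{m + 2}{2} = \frac{m}{2} + 1,\\
(\log 3)\bigl(\min\{n, f\} + m\bigr) + 1
  &\le (\log 3)(1.5m + 1) + 1
   = 1.5(\log 3)\,m + \log 3 + 1
   < 1.5(\log 3)\,m + 3 .
\end{align*}
```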

Theorem 5. Any simple plane triangulation of n nodes and m edges can be encoded using 4n - 7 = (4/3)m + 1 bits. Both encoding and decoding take O(n) time.

References

1. T. BELL, J. G. CLEARY, AND I. WITTEN, Text Compression, Prentice-Hall, 1990.

2. D. R. CLARK, Compact Pat Tree, PhD thesis, University of Waterloo, 1996.

3. H. DE FRAYSSEIX, J. PACH, AND R. POLLACK, How to draw a planar graph on a grid, Combinatorica, 10 (1990), pp. 41-51.

4. H. GALPERIN AND A. WIGDERSON, Succinct representations of graphs, Information and Control, 56 (1983), pp. 183-198.

5. A. ITAI AND M. RODEH, Representation of graphs, Acta Informatica, 17 (1982), pp. 215-219.

6. G. JACOBSON, Space-efficient static trees and graphs, in proc. 30th FOCS, 30 Oct.-1 Nov. 1989, pp. 549-554.

7. S. KANNAN, M. NAOR, AND S. RUDICH, Implicit representation of graphs, SIAM Journal on Discrete Mathematics, 5 (1992), pp. 596-603.

8. G. KANT, Drawing planar graphs using the lmc-ordering (extended abstract), in proc. 33rd FOCS, 24-27 Oct. 1992, pp. 101-110.

9. G. KANT, Algorithms for Drawing Planar Graphs, PhD thesis, Univ. of Utrecht, 1993.

10. G. KANT AND X. HE, Regular edge labeling of 4-connected plane graphs and its applications in graph drawing problems, TCS 172 (1997), pp. 175-193.

11. M. Y. KAO, M. FÜRER, X. HE, AND B. RAGHAVACHARI, Optimal parallel algorithms for straight-line grid embeddings of planar graphs, SIAM Journal on Discrete Mathematics, 7 (1994), pp. 632-646.

12. M. Y. KAO AND S. H. TENG, Simple and efficient compression schemes for dense and complement graphs, in Fifth Annual Symposium on Algorithms and Computation, LNCS 834, Beijing, China, 1994, Springer-Verlag, pp. 201-210.

13. K. KEELER AND J. WESTBROOK, Short encodings of planar graphs and maps, Discrete Applied Mathematics, 58 (1995), pp. 239-252.

14. J. I. MUNRO, Tables, in proc. of 16th Conf. on Foundations of Software Technology and Theoret. Comp. Sci., LNCS 1180, 1996, Springer-Verlag, pp. 37-42.

15. J. I. MUNRO AND V. RAMAN, Succinct representation of balanced parentheses, static trees and planar graphs, in proc. 38th FOCS 20-22 Oct. 1997.

16. M. NAOR, Succinct representation of general unlabeled graphs, Discrete Applied Mathematics, 28 (1990), pp. 303-307.

17. C. H. PAPADIMITRIOU AND M. YANNAKAKIS, A note on succinct representations of graphs, Information and Control, 71 (1986), pp. 181-185.

18. W. SCHNYDER, Embedding planar graphs on the grid, in Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms, 1990, pp. 138-148.

19. G. TURÁN, On the succinct representation of graphs, Discrete Applied Mathematics, 8 (1984), pp. 289-294.