**LINEAR-TIME COMPRESSION OF BOUNDED-GENUS GRAPHS INTO INFORMATION-THEORETICALLY OPTIMAL NUMBER OF BITS**^{∗}

HSUEH-I LU^{†}

**Abstract.** A *compression scheme* $A$ for a class $\mathbb{G}$ of graphs consists of an encoding algorithm $\mathit{Encode}_A$ that computes a binary string $\mathit{Code}_A(G)$ for any given graph $G$ in $\mathbb{G}$ and a decoding algorithm $\mathit{Decode}_A$ that recovers $G$ from $\mathit{Code}_A(G)$. A compression scheme $A$ for $\mathbb{G}$ is *optimal* if both $\mathit{Encode}_A$ and $\mathit{Decode}_A$ run in linear time and the number of bits of $\mathit{Code}_A(G)$ for any $n$-node graph $G$ in $\mathbb{G}$ is information-theoretically optimal to within lower-order terms. Trees and plane triangulations were the only known nontrivial graph classes to admit optimal compression schemes. Based upon Goodrich's separator decomposition for planar graphs and Djidjev and Venkatesan's planarizers for bounded-genus graphs, we give an optimal compression scheme for any hereditary (i.e., closed under taking subgraphs) class $\mathbb{G}$ under the premise that any $n$-node graph of $\mathbb{G}$ to be encoded comes with a genus-$o(n/\log^2 n)$ embedding. By Mohar's linear-time algorithm that embeds a bounded-genus graph on a genus-$O(1)$ surface, our result implies that any hereditary class of genus-$O(1)$ graphs admits an optimal compression scheme. For instance, our result yields the first-known optimal compression schemes for planar graphs, plane graphs, graphs embedded on genus-1 surfaces, graphs with genus 2 or less, 3-colorable directed plane graphs, 4-outerplanar graphs, and forests with degree at most 5. For nonhereditary graph classes, we also give a methodology for obtaining optimal compression schemes. From this methodology, we give the first-known optimal compression schemes for triangulations of genus-$O(1)$ surfaces and floorplans.

**Key words. trees, planar graphs, graph algorithms, data structures, compression**
**AMS subject classifications. 05C05, 05C10, 05C85, 68P05, 68P30**

**DOI. 10.1137/120879142**

**1. Introduction. Compact representations of graphs are fundamentally impor-**
tant and useful in many applications, including representing the meshes in ﬁnite
element analysis, terrain models of GIS, three-dimensional (3D) models of graph-
ics [48, 64, 80, 81, 82, 85, 89, 92], and VLSI design [56, 84], designing compact
routing tables of computer networks [1, 3, 16, 35, 36, 38, 66, 77, 94, 95], and compressing the link structure of the Internet [2, 5, 7, 15, 21, 88]. Let $\mathbb{G}$ be a class of graphs. Let $\mathrm{num}(\mathbb{G}, n)$ denote the number of distinct $n$-node graphs in $\mathbb{G}$. The information-theoretically optimal number of bits to encode an $n$-node graph in $\mathbb{G}$ is $\log \mathrm{num}(\mathbb{G}, n)$.^{1} For instance, if $\mathbb{G}$ is the class of rooted trees, then $\mathrm{num}(\mathbb{G}, n) \approx 2^{2n}/n^{3/2}$ and $\log \mathrm{num}(\mathbb{G}, n) = 2n - O(\log n)$; if $\mathbb{G}$ is the class of plane triangulations, then $\log \mathrm{num}(\mathbb{G}, n) = n \log \frac{256}{27} + o(n) \approx 3.2451n + o(n)$ [97]. A *compression scheme* $A$ for $\mathbb{G}$ consists of an encoding algorithm $\mathit{Encode}_A$ that computes a binary string $\mathit{Code}_A(G)$ for any given graph $G$ in $\mathbb{G}$ and a decoding algorithm $\mathit{Decode}_A$ that recovers graph $G$ from $\mathit{Code}_A(G)$. A compression scheme $A$ for a graph class $\mathbb{G}$ with $\log \mathrm{num}(\mathbb{G}, n) = O(n)$

*∗*Received by the editors May 29, 2012; accepted for publication (in revised form) January 9, 2014;

*published electronically March 20, 2014. A preliminary version of this paper appeared in Proceedings*
*of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 2002, pp. 223–224. This*
research was supported in part by NSC grant 101–2221–E–002–062–MY3.

http://www.siam.org/journals/sicomp/43-2/87914.html

*†*Department of Computer Science and Information Engineering, National Taiwan University,
Taipei 106, Taiwan, ROC (hil@csie.ntu.edu.tw, http://www.csie.ntu.edu.tw/*∼hil/). The author also*
holds joint appointments from the Graduate Institute of Networking and Multimedia and the Grad-
uate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University.

^{1}All logarithms throughout the paper are to the base of two.


*is optimal if the following three conditions hold:*

*Condition C1. The running time of algorithm Encode*_{A}*(G) is linear in the size of*
*G.*

*Condition C2. The running time of algorithm Decode*_{A}*(Code*_{A}*(G)) is linear in*
*the bit count of Code*_{A}*(G).*

*Condition C3. For all positive constants β with log num(G, n) ≤ βn + o(n), the*
*bit count of Code*_{A}*(G) for an n-node graph G in* G is no more
*than βn + o(n).*

*Note that Condition C3 basically says the bit count of Code*_{A}*(G) is information-*
theoretically optimal to within lower-order terms. Although there has been con-
siderable work on compression schemes, trees (see, e.g., [11, 50, 67, 72]) and plane
triangulations [79] were the only known nontrivial graph classes to admit optimal com-
*pression schemes. A graph class is hereditary if it is closed under taking subgraphs.*
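To make the information-theoretic bound behind Condition C3 concrete, the following Python sketch evaluates $\log \mathrm{num}(\mathbb{G}, n)$ numerically for one tree class. It is an illustration only: it assumes that the trees are rooted ordered trees, which are counted by the Catalan numbers, and the function name `log2_num_ordered_trees` is hypothetical rather than taken from the paper.

```python
import math


def log2_num_ordered_trees(n: int) -> float:
    """log2 of the number of n-node rooted ordered trees.

    Assumption for illustration: these trees are counted by the Catalan
    number C_{n-1} = binom(2(n-1), n-1) / n.
    """
    catalan = math.comb(2 * (n - 1), n - 1) // n
    return math.log2(catalan)


# The gap 2n - log2(num) grows only logarithmically in n.
for n in (10, 100, 1000):
    print(n, 2 * n - log2_num_ordered_trees(n))

# The per-node constant quoted for plane triangulations.
print(math.log2(256 / 27))
```

Under this assumption the printed gaps stay within a small multiple of $\log n$, which is consistent with the $2n - O(\log n)$ estimate quoted above.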

Below is the main result of the paper.

Theorem 1.1. *Any hereditary class $\mathbb{G}$ of graphs with $\log \mathrm{num}(\mathbb{G}, n) = O(n)$ admits an optimal compression scheme, as long as each input $n$-node graph in $\mathbb{G}$ to be encoded comes with a genus-$o(n/\log^2 n)$ embedding.*

*By Theorem 1.1 and Mohar’s linear-time genus-O(1) embedding algorithm for*
*genus-O(1) graphs [54, 70] (see Lemma 2.5), any hereditary class of genus-O(1) graphs*
admits an optimal compression scheme. For instance, our result yields the ﬁrst-known
optimal compression schemes for planar graphs, plane graphs, graphs embedded on
genus-1 surfaces, graphs with genus 2 or less, 3-colorable directed plane graphs, 4-
outerplanar graphs, and forests with degree at most 5. For nonhereditary graph
classes, we also give an extension (see Corollary 5.1) of Theorem 1.1. As summarized
*in the following theorem, we show two classes of genus-O(1) graphs whose optimal*
compression schemes are obtainable via this extension, where the class of ﬂoorplans
is deﬁned in related work below.

Theorem 1.2. *The following two classes of graphs admit optimal compression*
*schemes:*

*(1) triangulations of a genus-g surface for any integral constant g,*
*(2) ﬂoorplans.*

*Technical overview.* The kernel of the proof of Theorem 1.1 is a linear-time disjoint partition $G_0, \ldots, G_p$ of an $n$-node graph $G$ embedded on a genus-$o(n/\log^2 n)$ surface.^{2} Let $\mathrm{poly}(n)$ denote $O(n^{O(1)})$. Based upon Goodrich's separator decomposition of planar graphs [40] and Djidjev and Venkatesan's planarizer [26], partition $G_0, \ldots, G_p$ satisfies the following conditions, where $n_i$ is the number of nodes of $G_i$ and $d_i$ is the number of times that the nodes of $G_i$ are duplicated in some $G_j$ with $j \ne i$:^{3} (a) $n_0 = o(n/\log n)$, (b) $n_i = \mathrm{poly}(\log n)$ holds for each $i = 1, 2, \ldots, p$, (c) $\sum_{i=1}^{p} d_i = o(n/\log n)$, and (d) $\sum_{i=0}^{p} n_i = n + o(n/\log n)$. By condition (a), $G_0$ can be encoded in $o(n)$ bits. By conditions (b) and (c), the information required to recover $G$ from $G_0, G_1, \ldots, G_p$ can be encoded into $o(n)$ bits (see Lemma 4.1). By condition (d), we have $\log \mathrm{num}(\mathbb{G}, n) \le o(n) + \sum_{i=1}^{p} \log \mathrm{num}(\mathbb{G}, n_i)$. Therefore, the disjoint partition reduces the problem of encoding an $n$-node graph in $\mathbb{G}$ to the problem of encoding a $\mathrm{poly}(\log n)$-node graph in $\mathbb{G}$. Applying such a reduction for one more level, it remains to encode a $\mathrm{poly}(\log \log n)$-node graph in $\mathbb{G}$ into an information-theoretically optimal number of bits, which can be resolved by the standard technique (see, e.g., [47, 72, 78]) of precomputation tables (see Lemma 2.3).

^{2}Precisely, the disjoint partition $G_0, \ldots, G_p$ of the edges of the embedded graph $G$ in the proof of Theorem 1.1 is $G[V_0], G(V_1), \ldots, G(V_p)$, where $[V_0, \ldots, V_p]$ is both (i) a 1-separation $\mathbb{S}_1$ of an arbitrary triangulation $\Delta$ of $G$ and (ii) a refinement of the 0-separation $\mathbb{S}_0 = [\emptyset, \mathit{Node}(\Delta)]$ of $\Delta$.

^{3}As a matter of fact, in our construction, all duplicated nodes of $G_i$ with $i \ge 1$ belong to $G_0$.

Fig. 1. *Three floorplans with 14 nodes, 6 internal faces, and 19 edges. Floorplans (a) and (b) are equivalent, and floorplans (b) and (c) are not equivalent.*

*Related work. The compression scheme of Turán [96] encodes an n-node plane*
*graph that may have self-loops into 12n bits.*^{4} Keeler and Westbrook [55] improved
*this bit count to 10.74n. They also gave compression schemes for several families*
*of plane graphs. In particular, they used 4.62n bits for plane triangulation, and 9n*
bits for connected plane graphs free of self-loops and degree-one nodes. For plane
*triangulations, He, Kao, and Lu [46] improved the bit count to 4n. For triconnected*
*plane graphs, He, Kao, and Lu [46] also improved the bit count to at most 8.585n bits.*

This bit count was later reduced to at most $\frac{9 \log_2 3}{2}\, n \approx 7.134n$ by Chuang et al. [20].

*For any given n-node graph G embedded on a genus-g surface, Deo and Litow [25]*

*showed an O(ng)-bit encoding for G. These compression schemes all take linear time*
for encoding and decoding, but Condition C3 does not hold for them. The compression
schemes of He, Kao, and Lu [47] (respectively, Blelloch and Farzan [14]) for planar
graphs, plane graphs, and plane triangulations (respectively, separable graphs) satisfy
*Condition C3, but their encoding algorithms require Ω(n log n) time on n-node graphs.*

Floorplanning is a fundamental issue in circuit layout [4, 8, 17, 24, 32, 43, 51,
57, 58, 62, 68, 69, 84, 91, 106, 108]. Motivated by VLSI physical design, various
representations of ﬂoorplans were proposed [33, 109, 110]. Designing a ﬂoorplan to
meet a certain criterion is NP-complete in general [44, 87, 100], so heuristic techniques
such as simulated annealing [17, 101, 102] are practically useful. The length of the
*encoding aﬀects the size of the search space. A ﬂoorplan, which is also known as*
*rectangular drawing, is a division of a rectangle into rectangular faces using horizontal*
*and vertical line segments. Two ﬂoorplans are equivalent if they have the same adja-*
cency relations and relative positions among the nodes. For instance, Figure 1 shows
three ﬂoorplans: Floorplans (a) and (b) are equivalent. Floorplans (b) and (c) are not
*equivalent. Let G be the input n-node ﬂoorplan. Under the conventional assumption*
*that each node of G, other than the four corner nodes, has exactly three neighbors*
*(see, e.g., [45, 107]), one can verify that G has 0.5n faces and 1.5n−2 edges. Yamanaka*
*and Nakano [103] showed how to encode G into 2.5n bits. Chuang [19] reduced the bit*
*count to 2.293n. Takahashi, Fujimaki, and Inoue [90] further reduced the bit count to*
*2n. All these compression schemes for ﬂoorplans satisfy Conditions C1 and C2, but*
not Condition C3. Takahashi, Fujimaki, and Inoue [90] also showed that the number
of distinct $n$-node floorplans is no more than $3.375^{n+o(n)} \approx 2^{1.755n+o(n)}$. Therefore,
*our Theorem 1.2(2) encodes an n-node ﬂoorplan into at most 1.755n bits.*

^{4}For brevity, we omit all lower-order terms of bit counts in our discussion of related work.

For applications that require query support, Jacobson [50] gave a $\Theta(n)$-bit encoding for a connected and simple planar graph $G$ that supports traversal in $\Theta(\log n)$
time per node visited. Munro and Raman [71] improved this result and gave schemes
*to encode binary trees, rooted ordered trees, and planar graphs. For a general n-node*
$m$-edge planar graph $G$, they used $2m + 8n$ bits while supporting adjacency and degree queries in $O(1)$ time. Chuang et al. [20] reduced this bit count to $2m + (5 + \frac{1}{k})n$ for any constant $k > 0$ with the same query support. The bit count can be further reduced if only $O(1)$-time adjacency queries are supported, or if $G$ is simple,
triconnected, or triangulated [20]. Chiang, Lin, and Lu [18] reduced the number of
*bits to 2m + 2n. Yamanaka and Nakano [105] showed a 6n-bit encoding for plane*
triangulations with query support. The succinct encodings of Blandford, Blelloch,
and Kash [13] and Blelloch and Farzan [14] for separable graphs support queries. Yamanaka and Nakano [104] also gave a compression scheme for floorplans with query support. For labeled planar graphs, Itai and Rodeh [49] gave an encoding of $\frac{3}{2} n \log n$ bits. For unlabeled general graphs, Naor [74] gave an encoding of $\frac{1}{2} n^2$ bits. For certain
graph families, Kannan, Naor, and Rudich [52] gave schemes that encode each node
*with O(log n) bits and support O(log n)-time testing of adjacency between two nodes.*

Galperin and Wigderson [34] and Papadimitriou and Yannakakis [75] investigated complexity issues arising from encoding a graph by a small circuit that computes its adjacency matrix. Related work on various versions of succinct graph representations can be found in [6, 9, 28, 29, 30, 31, 37, 42, 53, 73, 76, 83] and the references therein.

*Outline. The rest of the paper is organized as follows. Section 2 gives the pre-*
liminaries. Section 3 shows our algorithm for computing graph separations. Section 4
gives our optimal compression scheme for hereditary graph classes. Section 5 shows
a methodology for obtaining optimal compression schemes for nonhereditary graph
*classes and applies this methodology on triangulations of genus-O(1) graphs and ﬂoor-*
plans. Section 6 concludes the paper with a couple of open questions.

**2. Preliminaries. Unless clearly stated otherwise, all graphs throughout the**
paper are simple, i.e., have no multiple edges or self-loops.

**2.1. Segmentation prefix.** Let $\|X\|$ denote the number of bits of binary string $X$. A binary string $X_0$ is a *segmentation prefix* of binary strings $X_1, \ldots, X_d$ if (a) it takes $O(\sum_{i=1}^{d} \|X_i\|)$ time to compute $X_0$ from $X_1, \ldots, X_d$ and (b) given the concatenation of $X_0, X_1, \ldots, X_d$, it takes $O(\sum_{i=0}^{d} \|X_i\|)$ time to recover all $X_i$ with $1 \le i \le d$.

Lemma 2.1 (see, e.g., [10, 27]). *Any binary strings $X_1, \ldots, X_d$ with $d = O(1)$ have a segmentation prefix with $O(\log \sum_{i=1}^{d} \|X_i\|)$ bits.*

Lemma 2.2. *Any binary strings $X_1, X_2, \ldots, X_d$ have an $O(\min\{m, d \log m\})$-bit segmentation prefix, where $m = \|X_1\| + \cdots + \|X_d\|$.*

*Proof.* Let $X$ be the concatenation of $X_1, \ldots, X_d$. If $m \le d \log m$, let $X'$ be the $m$-bit binary string with exactly $d$ copies of 1-bits such that the $j$th bit of $X'$ is 1 if and only if $j = \|X_1\| + \cdots + \|X_i\|$ holds for some $i = 1, \ldots, d$. Otherwise, let $X'$ store the $O(\log m)$-bit numbers $\|X_1\| + \cdots + \|X_i\|$ for all $i = 1, \ldots, d$. Let $X'_0$ be the segmentation prefix of $X'$ and $X$ as ensured by Lemma 2.1. The concatenation of $X'_0$ and $X'$ is a segmentation prefix $X_0$ of $X_1, \ldots, X_d$ with $O(\min\{m, d \log m\})$ bits. The lemma is proved.
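The first case of the proof above (a boundary bitmap of $m$ bits) can be sketched in Python as follows. This is a minimal illustration rather than the paper's implementation: bit strings are modeled as Python strings over `"0"` and `"1"`, every $X_i$ is assumed to be nonempty so that the $d$ boundary positions are distinct, and the helper names `boundary_bitmap` and `split_by_bitmap` are hypothetical. The self-delimiting prefix supplied by Lemma 2.1 is omitted.

```python
from itertools import accumulate


def boundary_bitmap(strings):
    """Return the m-bit string whose j-th bit (1-indexed) is 1 iff
    j equals |X_1| + ... + |X_i| for some i. Assumes nonempty strings."""
    m = sum(len(x) for x in strings)
    bits = ["0"] * m
    for end in accumulate(len(x) for x in strings):
        bits[end - 1] = "1"
    return "".join(bits)


def split_by_bitmap(bitmap, concatenated):
    """Recover X_1, ..., X_d from their concatenation and the bitmap."""
    parts, start = [], 0
    for j, bit in enumerate(bitmap, start=1):
        if bit == "1":
            parts.append(concatenated[start:j])
            start = j
    return parts


strings = ["101", "0", "1100"]
bitmap = boundary_bitmap(strings)
recovered = split_by_bitmap(bitmap, "".join(strings))
```

Both functions scan their input once, matching the linear-time requirements (a) and (b) in the definition of a segmentation prefix.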

For the rest of the paper, let $X_1 \circ \cdots \circ X_d$ be the concatenation of $X_0, X_1, \ldots, X_d$, where $X_0$ is the segmentation prefix of $X_1, \ldots, X_d$ as ensured by Lemma 2.2.

**2.2. Precomputation table.** Let $|S|$ denote the cardinality of set $S$. Let $\mathit{Node}(G)$ consist of the nodes in graph $G$, and let $\mathit{node}(G) = |\mathit{Node}(G)|$. For any subset $V$ of $\mathit{Node}(G)$, let $G[V]$ denote the subgraph of $G$ induced by $V$, and let $G \setminus V$ denote the subgraph of $G$ obtained by deleting $V$ and their incident edges. Two disjoint subsets $V$ and $V'$ of $\mathit{Node}(G)$ are *adjacent* in $G$ if there is an edge $(v, v')$ of $G$ with $v \in V$ and $v' \in V'$. For any subset $V$ of $\mathit{Node}(G)$, let $\mathit{Nbr}_G(V)$ consist of the nodes in $\mathit{Node}(G) \setminus V$ that are adjacent to $V$ in $G$, and let $\mathit{nbr}_G(V) = |\mathit{Nbr}_G(V)|$. A *connected component* of graph $G$ is a maximal subset $C$ of $\mathit{Node}(G)$ such that $G[C]$ is connected.

Fig. 2. *(a) A 9-node plane graph $G$. (b) A separator decomposition $\mathbb{T}$ of $G$.*

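Lemma 2.3. *Let $\mathbb{G}$ be a graph class satisfying $\log \mathrm{num}(\mathbb{G}, n) = O(n)$. Given positive integers $\ell$ and $n$ with $\ell = \mathrm{poly}(\log \log n)$, it takes overall $o(n)$ time to compute (i) a labeling $\mathit{Label}(H)$ and a $\lceil \log \mathrm{num}(\mathbb{G}, \mathit{node}(H)) \rceil$-bit binary string $\mathit{Optcode}(H)$ for each distinct graph $H \in \mathbb{G}$ with at most $\ell$ nodes and (ii) an $o(n)$-bit string $\mathit{Table}(\mathbb{G}, \ell)$ such that the following statements hold:*

*(1) Given a graph $H \in \mathbb{G}$ with $\mathit{node}(H) \le \ell$, it takes $O(\mathit{node}(H))$ time to obtain $\mathit{Optcode}(H)$ and $\mathit{Label}(H)$ from $\mathit{Table}(\mathbb{G}, \ell)$.*

*(2) Given $\mathit{Optcode}(H)$ for a graph $H \in \mathbb{G}$ with $\mathit{node}(H) \le \ell$, it takes $O(\mathit{node}(H))$ time to obtain $H$ and $\mathit{Label}(H)$ from $\mathit{Table}(\mathbb{G}, \ell)$.*

*Proof.* It is straightforward by $O(1)^{\mathrm{poly}(\ell)} = o(n)$.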
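The one-line estimate in the proof can be unpacked as follows. This is a hedged expansion, not text from the paper; the constant $c > 0$ is an assumed name for the exponent hidden in the poly(log log n) bound on the table parameter.

```latex
\[
  O(1)^{\mathrm{poly}(\ell)}
  \;=\; 2^{O\left((\log\log n)^{c}\right)}
  \;=\; 2^{o(\log n)}
  \;=\; n^{o(1)}
  \;=\; o(n).
\]
```

The second equality uses $(\log \log n)^c = o(\log n)$ for any fixed $c$, so the number of table entries and the cost of enumerating them stay sublinear in $n$.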

**2.3. Separator decomposition of planar graphs.** Sets $S_1, S_2, \ldots, S_d$ form a *disjoint partition* of set $S$ if $S_1, \ldots, S_d$ are pairwise disjoint and $S = S_1 \cup \cdots \cup S_d$. A subset $S$ of $\mathit{Node}(G)$ is a *separator* of graph $G$ with respect to $S_1$ and $S_2$ if (1) $S$, $S_1$, and $S_2$ form a disjoint partition of $\mathit{Node}(G)$, (2) $S_1$ and $S_2$ are not adjacent in $G$, (3) $|S| = O(\mathit{node}(G)^{1/2})$, and (4) $\max\{|S_1|, |S_2|\} \le \frac{2}{3} \cdot \mathit{node}(G)$. A *separator decomposition* [12] of $G$ is a rooted binary tree $\mathbb{T}$ on a disjoint partition of $\mathit{Node}(G)$ such that the following two statements hold, where "nodes" specify elements of $\mathit{Node}(G)$ and "vertices" specify elements of $\mathit{Node}(\mathbb{T})$. Statement 1: Each leaf vertex of $\mathbb{T}$ consists of a single node of $G$. Statement 2: Each internal vertex $S$ of $\mathbb{T}$ is a separator of $G[\mathit{Offspring}(S)]$ with respect to $\mathit{Offspring}(S_1)$ and $\mathit{Offspring}(S_2)$, where $S_1$ and $S_2$ are the child vertices of $S$ in $\mathbb{T}$ and $\mathit{Offspring}(S)$ (respectively, $\mathit{Offspring}(S_1)$ and $\mathit{Offspring}(S_2)$) is the union of all the vertices in the subtree of $\mathbb{T}$ rooted at $S$ (respectively, $S_1$ and $S_2$). See Figure 2 for an illustration.
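The four conditions in the definition of a separator can be checked mechanically. The Python sketch below is an illustration under stated assumptions: the graph is given as an adjacency dictionary, the function name `is_separator` is hypothetical, and the asymptotic condition (3) is replaced by a concrete bound $|S| \le c \sqrt{\mathit{node}(G)}$ with an arbitrarily chosen constant `c`, since an $O(\cdot)$ bound cannot be tested on a single graph.

```python
import math


def is_separator(adj, S, S1, S2, c=3.0):
    """Check conditions (1)-(4) for S separating S1 from S2 in the graph adj.

    adj maps each node to the set of its neighbors (simple undirected graph).
    The constant c stands in for the hidden constant of condition (3).
    """
    nodes = set(adj)
    S, S1, S2 = set(S), set(S1), set(S2)
    # (1) S, S1, S2 form a disjoint partition of the node set.
    if S | S1 | S2 != nodes or len(S) + len(S1) + len(S2) != len(nodes):
        return False
    # (2) no edge joins S1 and S2.
    if any(u in S2 for v in S1 for u in adj[v]):
        return False
    # (3) the separator is small (concrete stand-in for O(sqrt(n))).
    if len(S) > c * math.sqrt(len(nodes)):
        return False
    # (4) both sides are balanced.
    return max(len(S1), len(S2)) <= 2 * len(nodes) / 3


# A 5-node path 0-1-2-3-4: the middle node separates the two halves.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
ok = is_separator(path, {2}, {0, 1}, {3, 4})
```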

Lemma 2.4 (*Goodrich [40]). It takes O(n) time to compute a separator decom-*
*position for any given n-node planar graph.*

**2.4. Planarizers for nonplanar graphs.** The *genus* of a graph $G$ is defined to be the smallest integer $g$ such that $G$ can be embedded on an orientable surface
*with g handles without edge crossings [41]. For example, the genus of a planar graph*
*is zero. By Euler’s formula (see, e.g., [39]), an n-node genus-O(n) graph has O(n)*
edges. Determining the genus of a general graph is NP-complete [93], but Mohar
[70] showed that it takes linear time to determine whether a graph is of genus $g$ for any $g = O(1)$. Mohar's algorithm is simplified by Kawarabayashi, Mohar, and Reed [54].

Fig. 3. *(a) A 9-node plane graph with a separation $[V_0, \ldots, V_3]$. (b) $G[V_0]$, $G(V_1)$, $G(V_2)$, and $G(V_3)$ form a disjoint partition of the edges of $G$.*

Lemma 2.5 (*Kawarabayashi, Mohar, and Reed [54] and Mohar [70]). It takes*
*O(n) time to compute a genus-O(1) embedding for any given n-node genus-O(1)*
*graph.*

Gilbert, Hutchinson, and Tarjan [39] gave an $O(n + g)$-time algorithm to compute an $O((gn)^{0.5})$-node separator of an $n$-node genus-$g$ graph, generalizing Lipton and Tarjan's classic separator theorem for planar graphs [63]. Our result relies on the following planarization algorithm.

Lemma 2.6 (Djidjev and Venkatesan [26]). *Given an $n$-node graph $G$ embedded on a genus-$g$ surface, it takes $O(n + g)$ time to compute a subset $V$ of $\mathit{Node}(G)$ with $|V| = O((gn)^{0.5})$ such that $G \setminus V$ is planar.*

**3. Separation and refinement.** We say that $[V_0, V_1, \ldots, V_p]$ with $p \ge 1$ is a *separation* of graph $G$ if the following properties hold:

*Property S1.* $V_0, V_1, \ldots, V_p$ form a disjoint partition of $\mathit{Node}(G)$.

*Property S2.* Any two $V_i$ and $V_{i'}$ with $1 \le i \ne i' \le p$ are not adjacent in $G$.

Figure 3(a) shows a separation $[V_0, V_1, V_2, V_3]$ of graph $G$, and Figure 4(a) shows another separation $[U_0, U_1, U_2]$ of $G$. For any subset $V$ of $\mathit{Node}(G)$, let $G(V)$ be the subgraph of $G$ induced by $V \cup \mathit{Nbr}_G(V)$ excluding the edges of $G[\mathit{Nbr}_G(V)]$. If $[V_0, \ldots, V_p]$ is a separation of $G$, then $G[V_0], G(V_1), \ldots, G(V_p)$ form a disjoint partition of the edges of $G$. See Figures 3(b) and 4(b) for illustrations. Let $\log^{(0)} n = n$. For any positive integer $k$, let $\log^{(k)} n = \log(\log^{(k-1)} n)$. For notational brevity, for any nonnegative integer $k$, let $\ell_k = \max\{1, \log^{(k)} n\}$.
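The claim that a separation splits the edges of the graph into disjoint pieces can be illustrated on a toy example. The Python sketch below is illustrative only: graphs are adjacency dictionaries of simple undirected graphs, edges are modeled as two-element frozensets, and the helper names `edges_induced` and `edges_of_G_paren` are hypothetical. It relies on the observation that the subgraph on a part together with its neighbors, minus the edges among those neighbors, keeps exactly the edges with at least one endpoint in the part.

```python
def edges_induced(adj, V):
    """Edges of the induced subgraph G[V]: both endpoints lie in V."""
    V = set(V)
    return {frozenset((u, v)) for v in V for u in adj[v] if u in V}


def edges_of_G_paren(adj, V):
    """Edges of G(V): edges with at least one endpoint in V."""
    V = set(V)
    return {frozenset((u, v)) for v in V for u in adj[v]}


# A 5-node path 0-1-2-3-4 with separation [V0, V1, V2] = [{2}, {0,1}, {3,4}].
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
V0, V1, V2 = {2}, {0, 1}, {3, 4}

pieces = [edges_induced(path, V0),
          edges_of_G_paren(path, V1),
          edges_of_G_paren(path, V2)]
all_edges = {frozenset((u, v)) for v in path for u in path[v]}
```

Here the three pieces are pairwise disjoint and their union is the full edge set, as the statement about separations predicts for this small example.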

For any nonnegative integer $k$, separation $[V_0, \ldots, V_p]$ of an $n$-node graph $G$ is a *$k$-separation* of $G$ if the following three properties hold:

*Property S3.* $|V_0| = o(n/\ell_k)$ and $p = o(n/\ell_k) + 1$.

*Property S4.* $|V_i| + \mathit{nbr}_G(V_i) = \mathrm{poly}(\ell_k)$ holds for each $i = 1, \ldots, p$.

*Property S5.* $\sum_{i=1}^{p} \mathit{nbr}_G(V_i) = o(n/\ell_k)$.

One can easily verify that $[\emptyset, \mathit{Node}(G)]$ is a 0-separation of $G$.^{5} Let $[V_0, \ldots, V_p]$ and $[U_0, \ldots, U_q]$ be two separations of graph $G$. We say that $[V_0, \ldots, V_p]$ is a *refinement* of $[U_0, \ldots, U_q]$ if the following three properties hold:

*Property R1.* $U_0 \subseteq V_0$.

*Property R2.* For each index $i = 1, \ldots, p$, there is an index $j$ with $1 \le j \le q$ and $V_i \subseteq U_j$.

*Property R3.* For any indices $i, i', i''$ with $1 \le i < i' < i'' \le p$, if $V_i \cup V_{i''} \subseteq U_j$, then $V_{i'} \subseteq U_j$.

For instance, in Figure 4(a), $[V_0, V_1, V_2, V_3]$ is a refinement of $[U_0, U_1, U_2]$. Below is the main lemma of the section.

^{5}The "+1" in Property S3 is redundant for $k \ge 1$. However, we need it so that $[\emptyset, \mathit{Node}(G)]$ is a 0-separation of $G$, since $1 \ne o(n/\ell_0)$.

Fig. 4. *(a) Separation $[V_0, V_1, V_2, V_3]$ is a refinement of separation $[U_0, U_1, U_2]$. (b) Subgraphs $G[U_0]$, $G(U_1)$, and $G(U_2)$ of $G$.*

Lemma 3.1. *Let $k$ be a positive integer. Let $G$ be an $n$-node connected graph embedded on a genus-$o(n/\ell_k^2)$ surface. Given a $(k-1)$-separation $\mathbb{S}_{k-1}$ of $G$, it takes $O(n)$ time to compute a $k$-separation $\mathbb{S}_k$ of $G$ that is a refinement of $\mathbb{S}_{k-1}$.*

The proof of Lemma 3.1 needs the following lemma, which can be proved by Lemmas 2.4 and 2.6.

Lemma 3.2. *Let $k$ be a positive integer. Given an $n$-node graph $G$ embedded on a genus-$o(n/\ell_k^2)$ surface, it takes $O(n)$ time to compute an $o(n/\ell_k)$-node subset $V$ of $\mathit{Node}(G)$ such that each node of $\mathit{Node}(G) \setminus V$ has degree at most $\ell_k^2$ in $G$ and each connected component of $G \setminus V$ has at most $\ell_k^4$ nodes.*

*Proof.* We first apply Lemma 2.6 to compute in $O(n)$ time an $o(n/\ell_k)$-node subset $V'$ of $\mathit{Node}(G)$ such that $G \setminus V'$ is planar. We then apply Lemma 2.4 to compute in $O(n)$ time a separator decomposition $\mathbb{T}$ of $G \setminus V'$. For each vertex $S$ of $\mathbb{T}$, let $\mathit{Offspring}(S)$ denote the union of all the vertices in the subtree of $\mathbb{T}$ rooted at $S$, and let $\mathit{offspring}(S) = |\mathit{Offspring}(S)|$. Let $r = \ell_k^2$. Let $V''$ consist of the nodes of $G$ with degree more than $r$ in $G$. Let $V'''$ be the union of all the vertices $S$ of $\mathbb{T}$ with $\mathit{offspring}(S) > r^2$. Let $V = V' \cup V'' \cup V'''$. By $V' \cup V''' \subseteq V$ and the definition of $\mathbb{T}$, each connected component of $G \setminus V$ has at most $r^2$ nodes. By $V'' \subseteq V$, each node of $\mathit{Node}(G) \setminus V$ has degree at most $r$ in $G$. Since $G$ has $O(n)$ edges, $|V''| = O(n/r) = o(n/\ell_k)$.

It remains to show that $|V'''| = o(n/\ell_k)$. For each index $i \ge 1$, let $I_i$ consist of the vertices $S$ of $\mathbb{T}$ with $r^2 \cdot (\frac{3}{2})^{i-1} < \mathit{offspring}(S) \le r^2 \cdot (\frac{3}{2})^{i}$. By $r^2 \ge 1$ and $i \ge 1$, each $S \in I_i$ is an internal vertex of $\mathbb{T}$. By definition of $\mathbb{T}$, we know that $\mathit{Offspring}(S)$ and $\mathit{Offspring}(S')$ are disjoint for any two distinct elements $S$ and $S'$ of $I_i$, implying that $\sum_{S \in I_i} \mathit{offspring}(S) \le n$ holds. Since $\mathit{offspring}(S) > r^2 \cdot (1.5)^{i-1}$ holds for each $S \in I_i$, we have $|I_i| < \frac{n}{r^2 \cdot (1.5)^{i-1}}$. Since each $S \in I_i$ is an internal vertex of $\mathbb{T}$, $S$ is a separator of $G[\mathit{Offspring}(S)]$. Therefore, $|S| = O(r \cdot (1.5)^{i/2})$ holds for each vertex $S$ in $I_i$. We have

$$|V'''| = \sum_{i \ge 1} \sum_{S \in I_i} |S| = \sum_{i \ge 1} O\left(\frac{n}{r \cdot (1.5)^{i/2}}\right) = O\left(\frac{n}{r}\right) = o\left(\frac{n}{\ell_k}\right).$$

The lemma is proved.
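The final summation in the proof can be expanded term by term. The display below is a hedged worked version using only the bounds stated in the proof (the bound on the number of vertices in each class and the separator-size bound for each vertex in that class); the explicit numerical bound 5 on the geometric series is our own arithmetic, not a constant from the paper.

```latex
\begin{align*}
\sum_{S \in I_i} |S|
  &\le |I_i| \cdot O\bigl(r \cdot 1.5^{i/2}\bigr)
   < \frac{n}{r^{2} \cdot 1.5^{\,i-1}} \cdot O\bigl(r \cdot 1.5^{i/2}\bigr)
   = O\Bigl(\frac{n}{r \cdot 1.5^{i/2}}\Bigr), \\
\sum_{i \ge 1} \frac{1}{1.5^{i/2}}
  &= \frac{1}{\sqrt{1.5} - 1} < 5 .
\end{align*}
```

Since the geometric series converges to a constant, summing the first line over all classes gives a total of $O(n/r)$, in agreement with the bound claimed at the end of the proof.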

**Algorithm 1**

Let $p = 0$, and let all elements of $\mathbb{C}$ be initially unmarked.
For each $j = 1, \ldots, q$, perform the following repeat-loop.
  Repeat the following steps until all elements of $\mathbb{C}_j$ are marked:
    Let $v_0$ be an arbitrary node of $V_0$ adjacent to some unmarked element of $\mathbb{C}_j$.
    Let $\mathbb{U}$ consist of the unmarked elements of $\mathbb{C}_j$ that are adjacent to $v_0$ in $G$.
    Let $C_{i_1}, \ldots, C_{i_3}$ be the elements of $\mathbb{U}$ in clockwise order around $v_0$ in $G$.
    Mark all $i_3 - i_1 + 1$ elements of $\mathbb{U}$.
    Repeat the following four steps until $i_1 > i_3$:
      Let $i_2$ be the largest index with $i_1 \le i_2 \le i_3$ and $|C_{i_1}| + \cdots + |C_{i_2}| \le \ell_k^4$.
      Let $p = p + 1$.
      Let $\mathit{hook}_p = v_0$ and $V_p = C_{i_1} \cup \cdots \cup C_{i_2}$.
      Let $i_1 = i_2 + 1$.
Output $V_1, \ldots, V_p$ and $\mathit{hook}_1, \ldots, \mathit{hook}_p$.
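The inner repeat-loop of Algorithm 1 is a greedy packing of consecutive components under a size cap. The Python sketch below isolates that step only; it is not the full algorithm (no marking, hooks, or clockwise ordering), the function name `pack_components` is hypothetical, components are represented just by their positive sizes, and the parameter `cap` plays the role of the size bound used in the inner loop.

```python
def pack_components(sizes, cap):
    """Greedily group consecutive components so that each group has total
    size at most cap. Returns a list of index ranges (i1, i2), inclusive.

    Assumes every size is positive and at most cap; otherwise the step
    "largest index i2 >= i1" would be undefined.
    """
    if any(s <= 0 or s > cap for s in sizes):
        raise ValueError("each component size must lie in 1..cap")
    groups = []
    i1, i3 = 0, len(sizes) - 1
    while i1 <= i3:
        total, i2 = 0, i1 - 1
        # Extend the group while the next component still fits.
        while i2 + 1 <= i3 and total + sizes[i2 + 1] <= cap:
            i2 += 1
            total += sizes[i2]
        groups.append((i1, i2))
        i1 = i2 + 1
    return groups


groups = pack_components([3, 1, 3, 2], cap=7)
```

With the hypothetical sizes above, the first three components fill the cap exactly and the fourth starts a new group, so two groups are produced.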

Fig. 5. *An illustration for Algorithm 1.*

*Proof of Lemma 3.1.* Suppose that $[U_0, \ldots, U_q]$ is the given $(k-1)$-separation $\mathbb{S}_{k-1}$. Let $V'_0$ be the $O(n)$-time computable subset of $\mathit{Node}(G)$ ensured by Lemma 3.2. We have $|V'_0| = o(n/\ell_k)$. Let $V_0 = U_0 \cup V'_0$. Let $\mathbb{C}$ consist of the connected components of $G \setminus V_0$. By $V'_0 \subseteq V_0$, each element of $\mathbb{C}$ has at most $\ell_k^4$ nodes. By $U_0 \subseteq V_0$ and Properties S1 and S2 of $\mathbb{S}_{k-1}$, each element of $\mathbb{C}$ is contained by some $U_j$ with $1 \le j \le q$. For each $j = 1, \ldots, q$, let $\mathbb{C}_j$ consist of the elements $C$ of $\mathbb{C}$ with $C \subseteq U_j$. We run Algorithm 1 to obtain (a) a disjoint partition $V_1, \ldots, V_p$ of $G \setminus V_0$ and (b) $p$ nodes $\mathit{hook}_1, \ldots, \mathit{hook}_p$ of $V_0$, which may not be distinct. Let $\mathbb{S}_k = [V_0, \ldots, V_p]$. Since $G$ is connected, each element of $\mathbb{C}$ is adjacent to $V_0$. The first statement of the outer repeat-loop is well defined. Since each element of $\mathbb{C}$ has at most $\ell_k^4$ nodes, the first statement of the inner repeat-loop is well defined. See Figure 5 for an illustration: Suppose that all nodes are in $U_1$. All nodes are initially unmarked. Let $V_0$ consist of the nine unlabeled nodes, including the three gray nodes. For each $i = 1, \ldots, 6$, let $C_i$ consist of the nodes with label $i$. That is, $C_1, \ldots, C_6$ are the six connected components of $G \setminus V_0$. Suppose that $\ell_k^4 = 7$ and the first two iterations of the outer repeat-loop obtain $V_1 = C_1$ and $V_2 = C_2$. In the third iteration of the outer repeat-loop, $C_3, \ldots, C_6$ are the unmarked elements of $\mathbb{C}$ that are adjacent to $\mathit{hook}_3$ in clockwise order around $\mathit{hook}_3$. By $|C_3| + |C_4| + |C_5| = 7$, the two iterations of the inner repeat-loop obtain $V_3 = C_3 \cup C_4 \cup C_5$ and $V_4 = C_6$.

By definition of Algorithm 1, one can verify that Properties R1, R2, and R3 hold for $\mathbb{S}_{k-1}$ and $\mathbb{S}_k$ (that is, $\mathbb{S}_k$ is a refinement of $\mathbb{S}_{k-1}$) and Properties S1 and S2 hold for $\mathbb{S}_k$. By Property S3 of $\mathbb{S}_{k-1}$, we have $|U_0| = o(n/\ell_{k-1}) = o(n/\ell_k)$. By $|V'_0| = o(n/\ell_k)$, we have $|V_0| \le |U_0| + |V'_0| = o(n/\ell_k)$. Let $I_{\mathrm{small}}$ consist of the indices $i$ with $1 \le i \le p$ and $|V_i| \le \frac{1}{2} \cdot \ell_k^4$. Let $I_{\mathrm{large}}$ consist of the indices $i$ with $1 \le i \le p$ and $|V_i| > \frac{1}{2} \cdot \ell_k^4$. We show that $p = |I_{\mathrm{small}}| + |I_{\mathrm{large}}| = o(n/\ell_k)$ as follows. By Property S1 of $\mathbb{S}_k$, we have $|I_{\mathrm{large}}| = o(n/\ell_k)$. To show that $|I_{\mathrm{small}}| = o(n/\ell_k)$, we categorize the indices $i$ in $I_{\mathrm{small}}$ with $1 \le i < p$ into the following types, where $j$ is the index with $V_i \subseteq U_j$:

*Type 1:* $i \in I_{\mathrm{small}}$ and $i + 1 \in I_{\mathrm{large}}$. The number of such indices $i$ is no more than $|I_{\mathrm{large}}| = o(n/\ell_k)$.

*Type 2:* $i \in I_{\mathrm{small}}$ and $i + 1 \in I_{\mathrm{small}}$.

*Type 2a:* $V_{i+1} \subseteq U_{j+1}$. The number of such indices $i$ is no more than $q = o(n/\ell_{k-1}) = o(n/\ell_k)$.

*Type 2b:* $V_{i+1} \subseteq U_j$ and $\mathit{hook}_i \in V_0 \setminus U_0$. By Properties S1 and S2 of $\mathbb{S}_{k-1}$, we know that $\mathit{hook}_i \in U_j$. By definition of Algorithm 1, $\mathit{hook}_i \ne \mathit{hook}_{i'}$ holds for all indices $i'$ with $i < i' \le p$. The number of such indices $i$ is no more than $|V_0 \setminus U_0| \le |V_0| = o(n/\ell_k)$.

*Type 2c:* $V_{i+1} \subseteq U_j$ and $\mathit{hook}_i \in U_0$. We have $\mathit{hook}_i \in \mathit{Nbr}_G(U_j)$. By definition of Algorithm 1, $\mathit{hook}_i \ne \mathit{hook}_{i'}$ holds for all indices $i' > i$ with $V_{i'} \subseteq U_j$. By Property S5 of $\mathbb{S}_{k-1}$, the number of such indices $i$ is no more than $\sum_{j=1}^{q} \mathit{nbr}_G(U_j) = o(n/\ell_{k-1}) = o(n/\ell_k)$.

We have $p = o(n/\ell_k)$. Property S3 holds for $\mathbb{S}_k$. By definition of Algorithm 1, $|V_i| \le \ell_k^4$ holds for each $i = 1, \ldots, p$. By $V'_0 \subseteq V_0$, each node of $\mathit{Node}(G) \setminus V_0$ has degree at most $\ell_k^2$. Property S4 holds for $\mathbb{S}_k$.

Fig. 6. *The operation that contracts all nodes of $V_i$ into a node $v_i$, which takes over some neighbors of $\mathit{hook}_i$.*

To see Property S5 ofS*k**, we obtain a contracted graph from G by performing the*
*following two steps for each i = 1, . . . , p.*^{6} *Step 1: Let C*_{i}_{1}*, . . . , C*_{i}_{2} be the elements
of *C with V**i* *= C*_{i}_{1} *∪ C**i*1+1*∪ · · · ∪ C**i*2 *in clockwise order around hook*_{i}*in G. Split*
*hook*_{i}*into two adjacent nodes hook*_{i}*and v*_{i}*, and let v*_{i}*take over the neighbors of hook*_{i}*in clockwise order around hook*_{i}*from the ﬁrst neighbor of hook*_{i}*in C*_{i}_{1} to the ﬁrst
*neighbor of hook*_{i}*in C*_{i}_{2}*. Step 2: Contract all nodes of V*_{i}*into node v** _{i}*, and delete

multiple edges and self-loops. See Figure 6 for an illustration: For each $i = 3, \ldots, 6$, let $C_i$ consist of the nodes with labels $i$ in Figure 6(a). Suppose that $i_1 = 3$, $i_2 = 5$, and $V_i = C_3 \cup C_4 \cup C_5$. The unlabeled circle nodes belong to $V_0$. The square nodes are two previously contracted nodes $v_{i'}$ and $v_{i''}$ from $V_{i'}$ and $V_{i''}$ for some indices $i'$ and $i''$ with $1 \le i' \ne i'' < i$. Figure 6(b) shows the result of Step 1. Figure 6(c) shows the result of Step 2.

^{6} The contraction procedure is only for proving Property S5 of $S_k$; it is not needed for computing $S_k$.

Fig. 7. (a) Graph $G$ with a labeling. (b) Subgraphs $G(V_1)$, $G(V_2)$, and $G(V_3)$ of $G$ with labelings. (c) Subgraphs $G(U_1)$ and $G(U_2)$ of $G$ with labelings.

Observe that each node that is adjacent to $V_i$ becomes a neighbor of $v_i$ after applying Steps 1 and 2. Also, each neighbor of $\mathit{hook}_i$ that is not in $V_i$ either remains a neighbor of $\mathit{hook}_i$ or becomes a neighbor of $v_i$ after applying Steps 1 and 2. Therefore, for each $i = 1, \ldots, p$ and each node $v_0 \in \mathit{Nbr}_G(V_i)$, there is either an edge $(v_0, v_i)$ or an edge $(v_{i'}, v_i)$ for some index $i'$ with $i' > i$ and $\mathit{hook}_{i'} = v_0$. Thus, $\sum_{i=1}^{p} \mathit{nbr}_G(V_i)$ is no more than the number of edges in the resulting contracted simple graph, which has $|V_0| + p = o(n/\ell_k)$ nodes. Observe that Step 1 does not increase the genus of the embedding. Since the subgraph induced by $V_i \cup \{v_i\}$ is connected, Step 2 does not increase the genus of the embedding either. The number of edges in the resulting contracted simple genus-$o(n/\ell_k^2)$ graph is $o(n/\ell_k)$. Property S5 holds for $S_k$. The lemma is proved.
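The effect of Step 2 on the underlying simple graph can be illustrated in code. The following is a minimal sketch, assuming the graph is stored as a symmetric dictionary from each node to its set of neighbors. The names `contract` and `edge_count` are hypothetical and do not come from the paper, and the sketch ignores the embedding, so it says nothing about genus.

```python
def contract(adj, group, new_node):
    """Contract every node of `group` into `new_node`.

    `adj` maps each node to the set of its neighbors (undirected, symmetric).
    The result is again a simple graph: self-loops are skipped and
    multiple edges collapse because neighbors are kept in sets.
    """
    group = set(group)
    result = {}
    for u, nbrs in adj.items():
        cu = new_node if u in group else u
        bucket = result.setdefault(cu, set())
        for w in nbrs:
            cw = new_node if w in group else w
            if cw != cu:        # an edge inside the group would be a self-loop
                bucket.add(cw)  # set insertion removes multiple edges
    return result


def edge_count(adj):
    """Number of undirected edges of a symmetric adjacency dictionary."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


if __name__ == "__main__":
    # Triangle a-b-c with a pendant node d attached to c (assumed toy input).
    g = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
    h = contract(g, {"b", "c"}, "x")
    print(h, edge_count(h))
```

Counting the edges of the contracted graph, as `edge_count` does, mirrors how the proof bounds the sum of the neighborhood sizes by the edge count of a simple graph on few nodes.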

**4. Our compression scheme.** This section proves Theorem 1.1.

**4.1. Recovery string.** A *labeling* of graph $G$ is a one-to-one mapping from $\mathit{Node}(G)$ to $\{0, 1, \ldots, \mathit{node}(G) - 1\}$. For instance, Figure 7(a) shows a labeling for graph $G$. Let $G$ be a graph embedded on a surface. We say that a graph $\Delta$ embedded on the same surface is a *triangulation* of $G$ if $G$ is a subgraph of $\Delta$ with $\mathit{Node}(\Delta) = \mathit{Node}(G)$ such that each face of $\Delta$ has three nodes. The following lemma shows an $o(n)$-bit string with which the larger embedded labeled subgraphs of $G$ can be recovered from smaller embedded labeled subgraphs of $G$ in $O(n)$ time.

Lemma 4.1. Let $k$ be a positive integer. Let $G$ be an $n$-node graph embedded on a genus-$o(n/\ell_k)$ surface. Let $\Delta$ be a triangulation of $G$. Let $S_k = [V_0, \ldots, V_p]$ be a given $k$-separation of $\Delta$ and $S_{k-1} = [U_0, \ldots, U_q]$ be a given $(k-1)$-separation of $\Delta$ such that $S_k$ is a refinement of $S_{k-1}$. For any given labeling $L_{k,i}$ of $G(V_i)$ for each $i = 1, \ldots, p$, the following statements hold:

(1) It takes overall $O(n)$ time to compute a labeling $L_{k-1,j}$ of subgraph $G(U_j)$ for each $j = 1, \ldots, q$.

(2) Given the above labelings $L_{k-1,j}$ of subgraphs $G(U_j)$ with $1 \le j \le q$, it takes $O(n)$ time to compute an $o(n)$-bit string $\mathit{Rec}_k$ such that $G(U_j)$ and $L_{k-1,j}$ for all $j = 1, \ldots, q$ can be recovered in overall $O(n)$ time from $\mathit{Rec}_k$ and $G(V_i)$ and $L_{k,i}$ for all $i = 1, \ldots, p$.

*Proof.* Since $G$ is a subgraph of $\Delta$ with $\mathit{Node}(\Delta) = \mathit{Node}(G)$, one can easily verify that $S_{k-1}$ (respectively, $S_k$) is also a $(k-1)$-separation (respectively, $k$-separation) of $G$. For each $j = 1, \ldots, q$, let $I_j$ consist of the indices $i$ with $V_i \subseteq U_j$. Let $W_j$ consist of the nodes of $G(U_j)$ that are not in any $V_i$ with $i \in I_j$. By Properties S1 and S2 of $S_k$, $W_j \subseteq V_0$. For instance, if $G$ is as shown in Figure 7(a), where $v_t$ with $0 \le t \le 8$ denotes the node with label $t$, we have $I_1 = \{1\}$, $I_2 = \{2, 3\}$, $W_1 = \{v_2, v_3\}$, and $W_2 = \{v_0, v_1, v_2, v_6\}$. Let the labeling $L_{k-1,j}$ for $G(U_j)$ be defined as follows:

• For the nodes of $G(U_j)$ in $W_j$, let $L_{k-1,j}$ be an arbitrary one-to-one mapping from $W_j$ to $\{0, 1, \ldots, |W_j| - 1\}$. In Figure 7(c), we have $L_{k-1,1}(v_2) = 1$, $L_{k-1,1}(v_3) = 0$, $L_{k-1,2}(v_0) = 2$, $L_{k-1,2}(v_1) = 3$, $L_{k-1,2}(v_2) = 0$, and $L_{k-1,2}(v_6) = 1$.

• For the nodes of $G(U_j)$ not in $W_j$, let $L_{k-1,j}$ be the one-to-one mapping from $\bigcup_{i \in I_j} V_i$ to $\{|W_j|, |W_j| + 1, \ldots, \mathit{node}(G(U_j)) - 1\}$ obtained by sorting $(i, L_{k,i}(v))$ for all indices $i \in I_j$ and all nodes $v \in V_i$ such that $L_{k-1,j}(v) < L_{k-1,j}(v')$ holds for a node $v$ of $V_i$ and a node $v'$ of $V_{i'}$ if and only if (a) $i < i'$ or (b) $i = i'$ and $L_{k,i}(v) < L_{k,i}(v')$. For instance, if $L_{k,1}$, $L_{k,2}$, and $L_{k,3}$ are as shown in Figure 7(b), then $L_{k-1,1}$ and $L_{k-1,2}$ can be as shown in Figure 7(c) and $L_{k-2,1}$ can be as shown in Figure 7(a).

It takes $O(\mathit{node}(G(U_j))) = O(|U_j| + \mathit{nbr}_G(U_j))$ time to compute $L_{k-1,j}$ from all $L_{k,i}$ with $i \in I_j$. By Property S5 of $S_{k-1}$, it takes overall $O(n)$ time to compute all $L_{k-1,j}$ with $1 \le j \le q$ from all $L_{k,i}$ with $1 \le i \le p$. Statement (1) is proved.
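The two-part definition of $L_{k-1,j}$ above can be sketched as follows. This is a minimal illustration under stated assumptions: nodes are hashable, mutually comparable Python values, each $L_{k,i}$ is a dictionary, and the function name `merge_labelings` and its parameters are hypothetical. The sketch uses Python's comparison sort, so it does not achieve the linear time of statement (1); a linear-time version would presumably bucket on the bounded label values instead.

```python
def merge_labelings(W_j, I_j, parts, labels_k):
    """Build a labeling of G(U_j) from labelings of the finer pieces.

    W_j      : nodes of G(U_j) that lie in no V_i with i in I_j
    I_j      : indices i with V_i contained in U_j
    parts    : dict i -> iterable of the nodes of V_i
    labels_k : dict i -> dict mapping a node of G(V_i) to its label L_{k,i}
    Returns a dict mapping each node of G(U_j) to its new label.
    """
    label = {}
    # Part 1: any one-to-one map from W_j onto 0 .. |W_j|-1 is allowed;
    # sorted order is used here only to make the result deterministic.
    for pos, w in enumerate(sorted(W_j)):
        label[w] = pos
    # Part 2: order the remaining nodes by the pair (i, L_{k,i}(v)).
    # Labels are distinct within one piece, so the node itself is never
    # needed to break a tie.
    keyed = sorted((i, labels_k[i][v], v) for i in I_j for v in parts[i])
    base = len(label)
    for rank, (_, _, v) in enumerate(keyed):
        label[v] = base + rank
    return label


if __name__ == "__main__":
    # Assumed toy data, not taken from Figure 7.
    result = merge_labelings(
        W_j=["a", "b"],
        I_j=[2, 3],
        parts={2: ["x", "y"], 3: ["z"]},
        labels_k={2: {"x": 1, "y": 0, "a": 2}, 3: {"z": 0, "b": 1}},
    )
    print(result)
```

Because the order of the non-$W_j$ nodes is determined entirely by the pairs $(i, L_{k,i}(v))$, a decoder that knows the finer labelings can reproduce this part of the labeling without any stored information.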

By Property S4 of $S_{k-1}$, the label of each node of $G(U_j)$ assigned by $L_{k-1,j}$ can be represented by $O(\log \mathrm{poly}(\ell_{k-1})) = O(\ell_k)$ bits. By Property S4 of $S_k$, the label of each node of $G(V_i)$ assigned by $L_{k,i}$ can be represented by $O(\log \mathrm{poly}(\ell_k)) = O(\ell_{k+1})$ bits. For each index $j = 1, \ldots, q$,

• string $\mathit{Rec}'_{k,j}$ stores the adjacency list of the embedded subgraph of $G(V_j)$ induced by $W_j$ via the labeling $L_{k-1,j}$ of $W_j$,

• string $\mathit{Rec}''_{k,j}$ stores the information required to recover $L_{k-1,j}$ from all $L_{k,i}$ with $i \in I_j$, and

• string $\mathit{Rec}'''_{k,j}$ stores the information required to recover the embedding of $G(U_j)$ from the embeddings of all $G(V_i)$ with $i \in I_j$ and the embedding of the subgraph of $G(U_j)$ induced by $W_j$.

By definition of $W_j$, we have $|W_j| = |V_0 \cap U_j| + \mathit{nbr}_G(U_j)$. It follows from Property S3 of $S_k$ and Property S5 of $S_{k-1}$ that

$$\sum_{j=1}^{q} |W_j| \le |V_0| + \sum_{j=1}^{q} \mathit{nbr}_G(U_j) = o\!\left(\frac{n}{\ell_k}\right) + o\!\left(\frac{n}{\ell_{k-1}}\right) = o\!\left(\frac{n}{\ell_k}\right).$$

Let $W = \bigcup_{j=1}^{q} W_j$. Since $G[V_0], G(V_1), \ldots, G(V_p)$ form a disjoint partition of the edges of $G$, the overall number of edges in the subgraphs of $G(V_j)$ induced by $W_j$ for all $j = 1, \ldots, q$ is no more than the number of edges in $G[W]$, which is $O(|W| + o(n/\ell_k)) \le O\bigl(\sum_{j=1}^{q} |W_j|\bigr) + o(n/\ell_k) = o(n/\ell_k)$. Therefore,

$$\sum_{j=1}^{q} \lVert \mathit{Rec}'_{k,j} \rVert = o\!\left(\frac{n}{\ell_k}\right) \cdot O(\ell_k) = o(n). \tag{1}$$
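As a reading aid, the bound (1) can be unpacked step by step. The display below is a sketch under the assumption that every entry of the adjacency list costs $O(\ell_k)$ bits, which is the label length under $L_{k-1,j}$ stated earlier in the proof. The symbol $m_j$ is introduced here only for illustration and denotes the number of edges of the subgraph induced by $W_j$.

```latex
\begin{align*}
\sum_{j=1}^{q} \lVert \mathit{Rec}'_{k,j} \rVert
  &= \sum_{j=1}^{q} O\bigl((|W_j| + m_j) \cdot \ell_k\bigr) \\
  &= O(\ell_k) \cdot \Bigl(\sum_{j=1}^{q} |W_j| + \sum_{j=1}^{q} m_j\Bigr) \\
  &= O(\ell_k) \cdot o\!\left(\frac{n}{\ell_k}\right) = o(n).
\end{align*}
```

Both sums in the second line are $o(n/\ell_k)$ by the two bounds derived just before (1), which is what makes the factor $\ell_k$ cancel.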

It suffices for $\mathit{Rec}''_{k,j}$ to store the list of $(i, L_{k,i}(v), L_{k-1,j}(v))$ for all $i \in I_j$ and all $v \in \mathit{Nbr}_G(V_i)$. By Property R3 of $S_{k-1}$ and $S_k$ and Property S4 of $S_{k-1}$, index $i$ can be represented by an $O(\ell_k)$-bit offset $t$ such that $i$ is the $t$-th smallest index in $I_j$. Thus, $\lVert \mathit{Rec}''_{k,j} \rVert = \sum_{i \in I_j} \mathit{nbr}_G(V_i) \cdot O(\ell_k)$. By Property S5 of $S_k$, we have $\sum_{j=1}^{q} \sum_{i \in I_j} \mathit{nbr}_G(V_i) = \sum_{i=1}^{p} \mathit{nbr}_G(V_i) = o(n/\ell_k)$. Therefore,

$$\sum_{j=1}^{q} \lVert \mathit{Rec}''_{k,j} \rVert = o\!\left(\frac{n}{\ell_k}\right) \cdot O(\ell_k) = o(n). \tag{2}$$
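The list of triples described above, with each index $i$ replaced by its offset $t$ within $I_j$, can be sketched as follows. This is only an illustration under stated assumptions: the function name `recovery_triples`, the dictionary-based inputs, and the use of Python tuples instead of a packed bit string are all hypothetical choices, not the paper's encoding.

```python
def recovery_triples(I_j, nbrs, labels_k, label_km1):
    """List the triples (t, L_{k,i}(v), L_{k-1,j}(v)) for one index j.

    I_j       : indices i with V_i contained in U_j
    nbrs      : dict i -> iterable of the nodes in Nbr_G(V_i)
    labels_k  : dict i -> dict mapping a node of G(V_i) to its label L_{k,i}
    label_km1 : dict mapping a node of G(U_j) to its label L_{k-1,j}
    The index i is stored as its 1-based rank t among the sorted
    indices of I_j, which is the short offset used in the text.
    """
    order = sorted(I_j)
    offset = {i: t for t, i in enumerate(order, start=1)}
    triples = []
    for i in order:
        for v in nbrs[i]:
            triples.append((offset[i], labels_k[i][v], label_km1[v]))
    return triples


if __name__ == "__main__":
    # Assumed toy data: two pieces with indices 4 and 7 inside one U_j.
    triples = recovery_triples(
        I_j=[7, 4],
        nbrs={4: ["a"], 7: ["a", "b"]},
        labels_k={4: {"a": 3}, 7: {"a": 5, "b": 6}},
        label_km1={"a": 0, "b": 1},
    )
    print(triples)
```

The number of triples equals the sum of the neighborhood sizes over $i \in I_j$, so the total length is governed by the same sum that appears in (2).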