
Hsueh-I Lu

National Taiwan University and

Chia-Chi Yeh

National Taiwan University

An ordinal tree is an arbitrary rooted tree where the children of each node are ordered. Succinct representations for ordinal trees with efficient query support have been extensively studied. The best previously known result is due to Geary, Raman, and Raman [SODA 2004, pages 1–10].

The number of bits required by their representation for an n-node ordinal tree T is 2n + o(n), whose first-order term is information-theoretically optimal. Their representation supports a large set of O(1)-time queries on T. Based upon a balanced string of 2n parentheses, we give an improved 2n + o(n)-bit representation for T. Our improvement is twofold. First, the set of O(1)-time queries supported by our representation is a proper superset of that supported by the representation of Geary, Raman, and Raman. Second, it is also much easier for our representation to support new queries by simply adding new auxiliary strings.

Categories and Subject Descriptors: E.1 [Data]: Trees; E.4 [Coding and Information Theory]: Data compaction and compression; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems—Computations on discrete structures; G.2.2 [Discrete Mathematics]: Graph Theory—Graph algorithms, Trees; H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing—Dictionaries, Indexing methods

General Terms: Algorithm, Design, Theory

Additional Key Words and Phrases: Succinct data structures, XML document representation

1. INTRODUCTION

An ordinal tree (see, e.g., [Geary et al. 2004; Benoit et al. 2005]) is an arbitrary rooted tree where the children of each node are ordered. All trees in the paper are ordinal. The number of distinct n-node trees is 2^{2n − Θ(log n)} [Graham et al. 1989], so the information-theoretically minimum number of bits to differentiate these trees is 2n − Θ(log n). There are three major types of 2n-bit representations for an n-node tree T:

Authors’ Address: Department of Computer Science and Information Engineering, National Taiwan University. 1 Roosevelt Road, Section 4, Taipei 106, Taiwan, Republic of China. Emails: [email protected], [email protected]. Web: www.csie.ntu.edu.tw/~hil/. This research is supported in part by NSC Grants 94-2213-E-002-126 and 95-2221-E-002-077.

The first author is the corresponding author, who is also affiliated with the Graduate Institute of Networking and Multimedia and the Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

© 2007 ACM 0000-0000/2007/0000-0001 $5.00

ACM Journal Name, Vol. V, No. N, June 2007, Pages 1–13.

[Tree drawing omitted: a nine-node ordinal tree with nodes v₁, ..., v₉.]

Balanced parentheses: (((()()))(()()())).

LOUDS: 11010111011000000.

DFUDS: 11010110001110000.

Fig. 1. Three representations for the same tree.

—Balanced parentheses [Munro and Raman 2001; Chuang et al. 1998; He et al. 1999; Chiang et al. 2005; Munro and Rao 2004; Bonichon et al. 2006], a folklore encoding consisting of a balanced string of parentheses representing the counterclockwise depth-first traversal of T, where an open (respectively, closed) parenthesis denotes a descending (respectively, ascending) edge traversal. For technical reasons, one usually adds a pair of enclosing parentheses to the above 2n − 2 parentheses, resulting in a representation consisting of 2n parentheses.

—Level order unary degree sequence (LOUDS) [Jacobson 1989], representing a node of degree d as a string of d copies of 1-bits followed by a 0-bit, where these nodes are represented in a level-order traversal of T .

—Depth first unary degree sequence (DFUDS) [Benoit et al. 2005], representing a node of degree d as a string of d copies of 1-bits followed by a 0-bit, where these nodes are represented in a depth-first traversal of T .

An example is shown in Figure 1.
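The three encodings of Figure 1 can be reproduced with a short Python sketch. The child lists below are an assumption: they are inferred from the parenthesis string of Figure 1, with nodes numbered in pre-order. The names CHILDREN, balanced_parentheses, louds, and dfuds are illustrative and do not come from the paper.

```python
from collections import deque

# Child lists inferred from the parenthesis string in Figure 1
# (pre-order numbering; an assumption, not stated explicitly in the text).
CHILDREN = {1: [2, 6], 2: [3], 3: [4, 5], 4: [], 5: [],
            6: [7, 8, 9], 7: [], 8: [], 9: []}

def balanced_parentheses(node=1):
    # "(" when a node is entered, ")" when it is left; the root
    # contributes the enclosing pair, giving 2n parentheses in total.
    return "(" + "".join(balanced_parentheses(c) for c in CHILDREN[node]) + ")"

def unary_degree(node):
    # A node of degree d is written as d one-bits followed by a zero-bit.
    return "1" * len(CHILDREN[node]) + "0"

def louds(root=1):
    # Unary degrees listed in level order (breadth-first).
    out, queue = [], deque([root])
    while queue:
        v = queue.popleft()
        out.append(unary_degree(v))
        queue.extend(CHILDREN[v])
    return "".join(out)

def dfuds(node=1):
    # Unary degrees listed in depth-first (pre-order) order.
    return unary_degree(node) + "".join(dfuds(c) for c in CHILDREN[node])
```

Under the stated assumption about the tree, the three functions return exactly the strings listed in Figure 1.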

Initiated by Jacobson [Jacobson 1989], succinct representations for trees with efficient query support have been extensively studied in the literature. Jacobson [Jacobson 1989] extended the LOUDS representation into a Θ(n)-bit encoding to support the parent query and the rank and select queries for nodes in level-order traversal of T in Θ(log n) time. Clark and Munro [Clark 1996; Clark and Munro 1996] squeezed Jacobson’s encoding into a 3n + o(n)-bit representation, from which the above queries and the subtree-size query can be supported in O(1) time. Later succinct representations, all of which have 2n + o(n) bits, form the following trade-off between the choices of base representations and the sets of supported O(1)-time queries:

—Based upon balanced parentheses, Munro and Raman [Munro and Raman 2001] showed that an o(n)-bit auxiliary string suffices to support the following queries in O(1) time: parent, depth, subtree-size, and the rank and select queries for nodes in pre-order and post-order traversal of T. Munro, Raman, and Rao [Munro et al. 2001] showed an o(n)-bit auxiliary string to support O(1)-time queries for leaf-rank, leaf-select, and leaf-size. Chiang, Lin, and Lu [Chiang et al. 2005] showed an o(n)-bit auxiliary string to support the O(1)-time degree query. Munro and Rao [Munro and Rao 2004] further gave an o(n)-bit auxiliary string to support the O(1)-time level-ancestor query.

—Based upon the DFUDS representation, Benoit et al. [Benoit et al. 2005] gave an o(n)-bit auxiliary string that supports the following queries in O(1) time: child-rank, child-select, degree, subtree-size, and node-rank and node-select in the pre-order traversal of T. However, such a choice of the base representation still does not provide O(1)-time support for the depth and level-ancestor queries, the node-rank and node-select queries in the post-order traversal of T, and the rank, select, and size queries for leaves.

Query                                  parentheses  DFUDS  Geary et al.  new
pre-order select and rank              ✓            ✓      ✓             ✓
post-order select and rank             ✓            -      ✓             ✓
child-select and child-rank            -            ✓      ✓             ✓
leaf-select, leaf-rank, and leaf-size  ✓            -      -             ✓
lowest common ancestor                 -            -      -             ✓
subtree height                         -            -      -             ✓
subtree size                           ✓            ✓      ✓             ✓
level ancestor                         ✓            -      ✓             ✓
distance                               -            -      -             ✓
degree                                 ✓            ✓      ✓             ✓
depth                                  ✓            -      ✓             ✓

Table I. A summary of current 2n + o(n)-bit encodings for an n-node tree: Parentheses [Munro and Raman 2001; Chiang et al. 2005; Munro and Rao 2004; Munro et al. 2001], DFUDS [Benoit et al. 2005], Geary et al. [Geary et al. 2004].

Recently, Geary, Raman, and Raman [Geary et al. 2004] almost resolved the above trade-off by giving a 2n + o(n)-bit encoding for T that supports in O(1) time the aforementioned queries except those leaf-related ones [Munro et al. 2001]. Their approach differs from all previous work achieving 2n + o(n) bits in that their encoding does not consist of a 2n-bit base representation for the topology of T plus an o(n)-bit auxiliary string. Instead, they decomposed T into several types of subtrees, whose topologies are represented in a hierarchical way, where different levels are composed of mixtures of different base representations and auxiliary strings. Such an involved structure seriously complicates the possibility of supporting additional queries using other stand-alone auxiliary strings. An implementation based upon a similar concept is studied in [Geary et al. 2004]. Very recently, Delpratt, Rahman, and Raman [Delpratt et al. 2006] showed that LOUDS-based representation can also be implemented to have competitive practical performance.

In the present paper, we give new o(n)-bit auxiliary strings for the 2n-bit balanced string of parentheses representing T. Together with previous o(n)-bit auxiliary strings for balanced parentheses [Munro and Raman 2001; Chiang et al. 2005; Munro and Rao 2004], our 2n + o(n)-bit encoding for T supports all of Geary et al.’s queries in O(1) time. Consisting of a base representation plus o(n)-bit auxiliary strings, our encoding is better in the ease of supporting new queries by adding new o(n)-bit auxiliary strings. To demonstrate such an advantage, we also show how to handle O(1)-time queries currently unsupported by Geary et al.’s encoding, including (a) lowest common ancestor, (b) distance, and (c) subtree height. Table I summarizes the above discussion.

We follow the convention of the unit-cost RAM model of computation with Θ(log n)-bit word size [van Emde Boas 1990], which is assumed in all the previous work except that of Jacobson [Jacobson 1989]. The rest of the paper is organized as follows. Section 2 gives the preliminaries. Section 3 shows our auxiliary strings for distance, subtree height, and lowest common ancestor. Section 4 shows our auxiliary strings for child-rank and child-select.

2. PRELIMINARIES

Let T be the input n-node tree. Let v_i denote the i-th node of T in the pre-order traversal of T. Let S be the balanced string of 2n parentheses for T. Let S[i, j] denote the substring of S from index i to index j. Let S[i] = S[i, i]. Let ℓ_i be the index such that S[ℓ_i] is the i-th open parenthesis in S. Let r_i be the index such that S[r_i] is the closed parenthesis that matches S[ℓ_i] in S. One can easily see the correspondence between v_i and the matched parentheses S[ℓ_i] and S[r_i]: v_i is the parent of v_j if and only if S[ℓ_i] and S[r_i] form the closest parenthesis pair that encloses S[ℓ_j] and S[r_j]. Let w(i, j) = j − i + 1. For the rest of the paper, all logarithms are of base 2. Let B = ⌈log³ n⌉, b = ⌈(log log n)³⌉, n_B = ⌈2n/B⌉, and n_b = ⌈2n/b⌉.
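As a concrete illustration of this notation, the following Python sketch computes the indices ℓ_i and r_i for the parenthesis string of Figure 1 and recovers the parent relation from the closest enclosing pair. It is a linear-time reference, not the O(1)-time structure of the paper; the function names open_close_indices and parent are hypothetical.

```python
def open_close_indices(S):
    """Return 1-based lists ell and r, where ell[i] and r[i] are the positions
    of the open and closed parentheses of node v_i (index 0 is unused)."""
    ell, r, stack = [0], [0], []
    for pos, ch in enumerate(S, start=1):
        if ch == "(":
            ell.append(pos)
            r.append(0)
            stack.append(len(ell) - 1)   # pre-order number of the node just opened
        else:
            r[stack.pop()] = pos         # closes the most recently opened node
    return ell, r

def parent(ell, r, j):
    """Pre-order number of the parent of v_j, or None for the root.
    The parent is the closest pair enclosing (ell[j], r[j])."""
    best = None
    for i in range(1, len(ell)):
        if ell[i] < ell[j] and r[j] < r[i]:
            best = i                     # enclosing pairs are nested, so a later i is tighter
    return best
```

For S = "(((()()))(()()()))" this gives, for example, ℓ_6 = 10 and r_6 = 17, and the parent of v_5 is v_3.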

Lemma 2.1 (see [Bell et al. 1990; Elias 1975]). For any O(n)-bit strings S1, S2, . . . , Sk with k = O(1), there is an O(log n)-bit auxiliary string αconcat such that, given the concatenation of αconcat, S1, S2, . . . , Sk as input, the index of the first symbol of any given Si in the concatenation is computable in O(1) time.

Let S1 ◦ S2 ◦ · · · ◦ Sk denote the concatenation of αconcat, S1, S2, . . . , S_k as in Lemma 2.1.

Lemma 2.2 (see [Munro and Raman 2001; Chiang et al. 2005]). Let S be a length-2n string of balanced parentheses that represents an n-node tree T . It takes O(n) time to compute an o(n)-bit string αaux such that the following queries for S can be determined from S and αaux in O(1) time: (a) the parent, degree, and depth of vi in T , (b) the parenthesis that matches S[i] in S, and (c) the rank and select queries for open and closed parentheses in S.

By Lemma 2.2, given S ◦ αaux, indices i, ℓi, and ri can be determined from one another in O(1) time. Our technique of dividing the input strings into multiple levels of blocks, which has been widely used in many succinct data structures, is inspired by Munro and Raman [Munro 1996; Munro and Raman 2001].

3. DISTANCE, SUBTREE HEIGHT, AND LOWEST COMMON ANCESTOR

Let L be the 2n-element array such that each L[i] is the number of open parentheses minus the number of closed parentheses in S[1, i]. Therefore, if S[j] is the i-th open parenthesis in S, then L[j] is the level of v_i in T. For any indices i and j with i ≤ j, let indexmin(L, i, j) (respectively, indexmax(L, i, j)) denote the smallest index k with i ≤ k ≤ j such that L[k] equals the minimum (respectively, maximum) of L[i], L[i + 1], . . . , L[j]. As observed by Gabow, Bentley, and Tarjan [Gabow et al. 1984], the lowest-common-ancestor query can be reduced to the above range-minima query indexmin. Similarly, our auxiliary string for supporting the queries of distance, subtree height, and lowest common ancestor is based on the lemma below.

Observe that each L[i] can be obtained from S in O(1) time using the auxiliary string αaux for the rank queries with respect to open and closed parentheses in S.

Therefore, the following lemma does not require L in the encoding.
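The definitions of L, indexmin, and indexmax can be made concrete with a small Python sketch. It materializes L explicitly and scans ranges linearly, so it only serves as a reference for the definitions; the names excess_array, indexmin, and indexmax as Python functions are illustrative.

```python
def excess_array(S):
    """L[i] = (number of open) - (number of closed) parentheses in S[1..i].
    The list is 1-based; L[0] = 0 is a sentinel."""
    L, opens = [0], 0
    for i, ch in enumerate(S, start=1):
        opens += ch == "("
        # closed(i) = i - opens, hence L[i] = opens - closed(i) = 2 * opens - i,
        # which is why one rank query on S suffices to obtain L[i].
        L.append(2 * opens - i)
    return L

def indexmin(L, i, j):
    """Smallest index k in [i, j] with L[k] minimum (linear scan)."""
    return min(range(i, j + 1), key=L.__getitem__)

def indexmax(L, i, j):
    """Smallest index k in [i, j] with L[k] maximum (linear scan)."""
    return max(range(i, j + 1), key=L.__getitem__)
```

On the string of Figure 1, the maximum of L over the whole string is 4, first attained at index 4, which is the open parenthesis of the deepest node v_4.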

Let I be an array of m indices. Let kmin(I, m, i, j) (respectively, kmax(I, m, i, j)) be the smallest index k with i ≤ k ≤ j that minimizes (respectively, maximizes) L[I[k]]. We first prove the following lemma using techniques extended from Section 3 of [Bender and Farach-Colton 2000].

Lemma 3.1. It takes O(m log m) time to compute an O(m log² m)-bit string αq(I, m) such that kmin(I, m, i, j) and kmax(I, m, i, j) for any indices i and j with 1 ≤ i ≤ j ≤ m can be determined from S, αaux, and αq(I, m) in O(1) time.

Proof. For each i = 1, 2, . . . , m and j = 0, 1, . . . , ⌈log m⌉, let Mmin[i][j] (respectively, Mmax[i][j]) be the smallest index k with i ≤ k < i + 2^j that minimizes (respectively, maximizes) L[I[k]]. Let αq(I, m) = Mmin ◦ Mmax. Observe that αq(I, m) takes O(m log² m) bits and can be computed from L and I in O(m log m) time using dynamic programming. If i = j, then kmin(I, m, i, j) = i. Otherwise, let k1 = Mmin[i][k] and k2 = Mmin[j − 2^k + 1][k], where k = ⌊log(j − i)⌋, so that the two length-2^k windows starting at i and ending at j cover [i, j]. It is not difficult to see that

kmin(I, m, i, j) = k1   if L[I[k1]] ≤ L[I[k2]];
                   k2   otherwise.

One can compute kmax(I, m, i, j) from Mmax, I, and L analogously in O(1) time.
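The doubling table behind Lemma 3.1 is the standard sparse-table technique for range-minimum queries. The sketch below is a minimal Python version under simplifying assumptions: it uses 0-based indices, takes a plain list of values standing in for L[I[k]], and picks the window size from the range length j − i + 1. The names build_sparse and range_argmin are hypothetical.

```python
def build_sparse(values):
    """table[j][i] = smallest index k in [i, i + 2**j) minimizing values[k].
    Uses O(m log m) entries and O(m log m) time."""
    m = len(values)
    table = [list(range(m))]             # windows of length 1
    j = 1
    while (1 << j) <= m:
        prev, half = table[-1], 1 << (j - 1)
        row = []
        for i in range(m - (1 << j) + 1):
            a, b = prev[i], prev[i + half]
            # "<=" keeps the leftmost index when the two minima tie.
            row.append(a if values[a] <= values[b] else b)
        table.append(row)
        j += 1
    return table

def range_argmin(values, table, i, j):
    """Smallest index k with i <= k <= j minimizing values[k] (0-based, inclusive)."""
    k = (j - i + 1).bit_length() - 1     # floor(log2(range length))
    a = table[k][i]                      # window starting at i
    b = table[k][j - (1 << k) + 1]       # window ending at j
    return a if values[a] <= values[b] else b
```

The two overlapping windows cover the whole range, so each query costs two table lookups and one comparison. The maximum version is symmetric.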

Lemma 3.2. It takes O(n) time to compute an o(n)-bit string αrmq such that indexmin(L, i, j) and indexmax(L, i, j) for any indices i and j can be computed from S, α_aux, and αrmq in O(1) time.

Proof. First let IB be the n_B-element array such that each IB[i] is the smallest index j with (i − 1)B < j ≤ iB that minimizes L[j]. IB takes O(n_B log B) = o(n) bits. Also, for each i = 1, 2, . . . , n_B, let Ib[i] be the ⌈B/b⌉-element array such that each Ib[i][j] is the smallest index t with (j − 1)b < t ≤ jb that minimizes L[(i − 1)B + t]. Ib takes O(n_B ⌈B/b⌉ log b) = o(n) bits. Let αq1 = αq(IB, n_B), and for each i = 1, 2, . . . , n_B, let αq2[i] = αq(Ib[i], ⌈B/b⌉). By Lemma 3.1, both αq1 and αq2 take o(n) bits and can be obtained in O(n) time. Finally, let αq3 be an O(n)-time obtainable table such that any indexmin(L, i, j) and indexmax(L, i, j) with w(i, j) ≤ 2b can be computed from S[i, j] and αq3 in O(1) time. That is, let αq3[S[i, i + 2b − 1]][j − i + 1] = (indexmin(L, i, j) − i, indexmax(L, i, j) − i) for any indices i and j with w(i, j) ≤ 2b. Since each entry takes O(log b) bits, the number of bits required by αq3 is O(2^{2b} · 2b · log b) = o(n). Let αrmq = αq1 ◦ αq2 ◦ αq3 ◦ IB ◦ Ib, which has o(n) bits and is obtainable in O(n) time.

To answer indexmin(L, i, j) from S, αaux, and αrmq, we can always decompose the interval [i, j] into two (not necessarily disjoint) subintervals [i1, j1] and [i2, j2] whose union is [i, j]. Clearly indexmin(L, i, j) can be determined from indexmin(L, i1, j1) and indexmin(L, i2, j2) in O(1) time. Consider the following cases.

—Case 1: w(i, j) ≤ 2b. We simply resort to S[i, j] and αq3.

—Case 2: w(i, j) > 2b and S[i, j] is in the same length-B block of S. Since indexmin(L, i, i + b − 1) and indexmin(L, j − b + 1, j) can be determined in O(1) time using Case 1, it suffices to determine indexmin(L, i′, j′), where (a) i′ is the smallest index with i ≤ i′ that is a starting index of a length-b block of S, and (b) j′ is the largest index with j′ ≤ j that is an ending index of a length-b block of S. Since i′ and j′ are in the same length-B block of S, indexmin(L, i′, j′) can be determined from S, αaux, and αq2 in O(1) time.

—Case 3: w(i, j) > 2b and S[i, j] belongs to two or more consecutive length-B blocks of S. Let i′ − 1 be the ending index of the length-B block of S that contains i. Let j′ + 1 be the starting index of the length-B block of S that contains j. Since indexmin(L, i, i′ − 1) and indexmin(L, j′ + 1, j) can be determined in O(1) time using Case 2, it suffices to determine indexmin(L, i′, j′) for the case that i′ ≤ j′. Since i′ is a starting index of a length-B block of S and j′ is an ending index of a length-B block of S, one can determine indexmin(L, i′, j′) from S, αaux, and αq1 in O(1) time.

It is not difficult to answer indexmax(L, i, j) from S, αaux, and αrmq analogously in O(1) time.

As pointed out by an anonymous reviewer, our data structure for lowest common ancestor is similar to that of Sadakane [Sadakane 2002] for suffix arrays.

Theorem 3.3. It takes O(n) time to compute an o(n)-bit string αnew1 such that the queries of distance, subtree height, and lowest common ancestor can be answered from S and αnew1 in O(1) time.

Proof. Let αnew1 = αaux ◦ αrmq. By Lemmas 2.2 and 3.2, αnew1 has o(n) bits and can be computed from S in O(n) time.

—The height of the subtree rooted at v_i is L[indexmax(L, ℓ_i, r_i)] minus the depth of v_i in T.

—The lowest common ancestor v_k of v_i and v_j with ℓ_i < ℓ_j can be determined as follows. If r_i > r_j, then v_k = v_i. Otherwise, indexmin(L, r_i, ℓ_j) has to be the index r_x of a closed parenthesis such that v_x is a child of v_k, as observed by Bender and Farach-Colton [Bender and Farach-Colton 2000]; hence v_k is the parent of v_x.

—The distance between v_i and v_j is exactly the depth of v_i plus the depth of v_j minus twice the depth of v_k, where v_k is the lowest common ancestor of v_i and v_j.

By Lemmas 2.2 and 3.2, the above queries can all be answered from S and αnew1 in O(1) time.
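The three reductions in the proof can be traced on the tree of Figure 1 with the following Python sketch. It is a reference implementation only: range minima and maxima are found by linear scans rather than by αrmq, the parent is read from a precomputed list rather than from αaux, and the class name ParenTree and its methods are hypothetical.

```python
class ParenTree:
    """Linear-scan reference for height, lowest common ancestor, and distance."""

    def __init__(self, S):
        self.ell, self.r, self.par, self.L = [0], [0], [0], [0]
        self.closer = {}                 # closing position -> pre-order number
        stack = []
        for pos, ch in enumerate(S, start=1):
            if ch == "(":
                self.ell.append(pos)
                self.r.append(0)
                self.par.append(stack[-1] if stack else 0)
                stack.append(len(self.ell) - 1)
                self.L.append(self.L[-1] + 1)
            else:
                v = stack.pop()
                self.r[v] = pos
                self.closer[pos] = v
                self.L.append(self.L[-1] - 1)

    def depth(self, i):
        return self.L[self.ell[i]]       # the root has depth 1 in this convention

    def height(self, i):
        deepest = max(self.L[self.ell[i]:self.r[i] + 1])
        return deepest - self.depth(i)

    def lca(self, i, j):
        if i == j:
            return i
        if self.ell[i] > self.ell[j]:
            i, j = j, i
        if self.r[i] > self.r[j]:        # v_i is an ancestor of v_j
            return i
        lo, hi = self.r[i], self.ell[j]
        k = min(range(lo, hi + 1), key=self.L.__getitem__)   # indexmin(L, r_i, l_j)
        return self.par[self.closer[k]]  # k closes a child v_x of the answer

    def distance(self, i, j):
        return self.depth(i) + self.depth(j) - 2 * self.depth(self.lca(i, j))
```

For example, on S = "(((()()))(()()()))" the lowest common ancestor of v_4 and v_5 is v_3, and the distance between v_4 and v_9 is 5.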

4. RANK AND SELECT FOR CHILDREN

Before solving rank and select for children, we introduce the following definition and its property. A non-root node vi is k-far if w(ℓp, ℓi) > k and w(ℓi, rp) > k, where vp is the parent of vi.

Lemma 4.1. If v_i and v_j are two k-far non-root nodes with |w(ℓ_i, ℓ_j)| ≤ k, then v_i and v_j are siblings.

Proof. Without loss of generality, we assume ℓ_i < ℓ_j. Since v_i and v_j are k-far non-root nodes with w(ℓ_i, ℓ_j) ≤ k, v_i cannot be an ancestor or descendant of v_j. Thus we have r_i < ℓ_j. Assume for a contradiction that v_p (respectively, v_q) is the parent of v_i (respectively, v_j) and v_p ≠ v_q. Observe that either r_i < ℓ_q or r_p < ℓ_j holds. Since v_j is k-far, r_i < ℓ_q implies w(ℓ_i, ℓ_j) > w(ℓ_q, ℓ_j) > k. Since v_i is k-far, r_p < ℓ_j implies w(ℓ_i, ℓ_j) > w(ℓ_i, r_p) > k. Either case leads to a contradiction, so the lemma is proved.


For presentational brevity, we classify non-root nodes into the following three disjoint classes: A node is

—narrow if it is not b-far;

—medium if it is b-far but not B-far; and

—wide if it is B-far.

4.1 Child rank

Let child_rank(S, v_k) denote the number c such that v_k is the c-th child of its parent.

We have the following theorem.

Theorem 4.2. It takes O(n) time to compute an o(n)-bit string αnew2 such that child_rank(S, v_k) for each node v_k can be answered from S and αnew2 in O(1) time.

Proof. Let v_p be the parent of v_k. If S[i, j] is a balanced string of parentheses, let sibling(S, i, j) be the number of non-enclosed parenthesis pairs in S[i, j]. Observe that

child_rank(S, v_k) = sibling(S, ℓ_p + 1, ℓ_k − 1) + 1
                   = degree(S, v_p) − sibling(S, ℓ_k, r_p − 1) + 1.

Therefore, it remains to support each query sibling(S, i, j) in O(1) time.
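The two expressions for the child rank can be checked against each other with a linear-time Python sketch. Here sibling counts top-level pairs by scanning, and the parent pair is found with a stack, so nothing below reflects the O(1)-time tables of the proof. The helper names parent_pairs, child_rank_left, and child_rank_right are hypothetical.

```python
def sibling(S, i, j):
    """Number of non-enclosed pairs in the balanced substring S[i..j]
    (1-based, inclusive); an empty range yields 0."""
    depth = count = 0
    for pos in range(i, j + 1):
        depth += 1 if S[pos - 1] == "(" else -1
        count += depth == 0              # a top-level pair has just been closed
    return count

def parent_pairs(S):
    """Map the open position of each non-root node to (l_p, r_p) of its parent."""
    stack, match, parent_open = [], {}, {}
    for pos, ch in enumerate(S, start=1):
        if ch == "(":
            if stack:
                parent_open[pos] = stack[-1]
            stack.append(pos)
        else:
            match[stack.pop()] = pos
    return {lk: (lp, match[lp]) for lk, lp in parent_open.items()}

def child_rank_left(S, lk):
    """Count the siblings to the left of the node opened at position lk."""
    lp, _ = parent_pairs(S)[lk]
    return sibling(S, lp + 1, lk - 1) + 1

def child_rank_right(S, lk):
    """Count the node itself and its right siblings, then subtract from the degree."""
    lp, rp = parent_pairs(S)[lk]
    degree = sibling(S, lp + 1, rp - 1)
    return degree - sibling(S, lk, rp - 1) + 1
```

On the string of Figure 1, both functions report that the node opened at position 13 (v_8) is the second child of its parent v_6.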

If v_k is narrow, we only need to answer sibling(S, i, j) with w(i, j) ≤ b. We simply build an O(n)-time obtainable table M1 to store the answers for any possible inputs. That is, let M1[S[i, i + b − 1]][j − i + 1] = sibling(S, i, j) for any indices i and j with w(i, j) ≤ b. Since sibling(S, i, j) ≤ w(i, j), each entry requires O(log b) bits and M1 takes O(2^b · b · log b) = o(n) bits.

If v_k is medium, we cannot afford to store all the answers of sibling(S, i, j) with w(i, j) ≤ B. We split S into length-b blocks. By Lemma 4.1, any two medium nodes v_i and v_j with |w(ℓ_i, ℓ_j)| ≤ b have the same parent, so for each block we save at most one medium node as a shortcut. Define tables M2 and M3 as follows. For each t = 1, 2, . . . , n_b,

—let M2[t] = (ℓ_i, sibling(S, ℓ_p + 1, ℓ_i − 1)), where ℓ_i is the smallest index, if any, with (t − 1)b < ℓ_i ≤ tb such that v_i is a medium child of v_p with w(ℓ_p, ℓ_i) ≤ B; and

—let M3[t] = (ℓ_i, sibling(S, ℓ_i, r_p − 1)), where ℓ_i is the smallest index, if any, with (t − 1)b < ℓ_i ≤ tb such that v_i is a medium child of v_p with w(ℓ_i, r_p) ≤ B.

Note that M2 and M3 have n_b entries, each requiring O(log B) bits, so both of them take O(n_b log B) = o(n) bits. Therefore, for any medium child v_k of v_p, if w(ℓ_p, ℓ_k) ≤ B, then

sibling(S, ℓ_p + 1, ℓ_k − 1) = sibling(S, ℓ_p + 1, ℓ_i − 1) + sibling(S, ℓ_i, ℓ_k − 1)
                             = m + M1[S[ℓ_i, ℓ_i + b − 1]][ℓ_k − ℓ_i],

where (ℓ_i, m) = M2[⌈ℓ_k/b⌉]. Similarly, if w(ℓ_k, r_p) ≤ B, then

sibling(S, ℓ_k, r_p − 1) = sibling(S, ℓ_i, r_p − 1) − sibling(S, ℓ_i, ℓ_k − 1)
                         = m − M1[S[ℓ_i, ℓ_i + b − 1]][ℓ_k − ℓ_i],

where (ℓ_i, m) = M3[⌈ℓ_k/b⌉].

function child_rank(S, v_k)
  let v_p be the parent of v_k;
  if w(ℓ_p, ℓ_k) ≤ b, then return M1[S[ℓ_p + 1, ℓ_p + b]][ℓ_k − ℓ_p − 1] + 1;
  if w(ℓ_k, r_p) ≤ b, then return degree(S, v_p) − M1[S[ℓ_k, ℓ_k + b − 1]][r_p − ℓ_k] + 1;
  if w(ℓ_p, ℓ_k) ≤ B, then let (ℓ_i, m) = M2[⌈ℓ_k/b⌉], and return m + M1[S[ℓ_i, ℓ_i + b − 1]][ℓ_k − ℓ_i] + 1;
  if w(ℓ_k, r_p) ≤ B, then let (ℓ_i, m) = M3[⌈ℓ_k/b⌉], and return degree(S, v_p) − m + M1[S[ℓ_i, ℓ_i + b − 1]][ℓ_k − ℓ_i] + 1;
  let (ℓ_j, m) = M5[⌈ℓ_k/B⌉][⌈(ℓ_k mod B)/b⌉], and return M4[⌈ℓ_k/B⌉] + m + M1[S[ℓ_j, ℓ_j + b − 1]][ℓ_k − ℓ_j] + 1;

Fig. 2. An O(1)-time algorithm that computes child_rank(S, v_k).

Similar tricks work for wide nodes, but they have to be applied in two levels. We first split S into length-B blocks. For each t = 1, 2, . . . , n_B, let M4[t] = sibling(S, ℓ_p + 1, ℓ_i − 1), where ℓ_i is the smallest index, if any, with (t − 1)B < ℓ_i ≤ tB such that v_i is a wide child of v_p. We further split each length-B block into length-b blocks. For each t = 1, 2, . . . , n_B and u = 1, 2, . . . , ⌈B/b⌉, let M5[t][u] = (ℓ_j, sibling(S, ℓ_p + 1, ℓ_j − 1) − M4[t]), where ℓ_j is the smallest index, if any, with (u − 1)b < ℓ_j − (t − 1)B ≤ ub such that v_j is a wide child of v_p. Note that sibling(S, ℓ_p + 1, ℓ_j − 1) − M4[t] ≤ B. One can easily verify that the number of bits required by M4 is O(n_B log n) = o(n) and the number of bits required by M5 is O(n_B ⌈B/b⌉ log B) = o(n). Thus, for any wide child v_k of v_p, we have

sibling(S, ℓ_p + 1, ℓ_k − 1) = sibling(S, ℓ_p + 1, ℓ_j − 1) + sibling(S, ℓ_j, ℓ_k − 1)
                             = M4[⌈ℓ_k/B⌉] + m + M1[S[ℓ_j, ℓ_j + b − 1]][ℓ_k − ℓ_j],

where (ℓ_j, m) = M5[⌈ℓ_k/B⌉][⌈(ℓ_k mod B)/b⌉].

Finally, let αnew2 = αaux ◦ M1 ◦ M2 ◦ M3 ◦ M4 ◦ M5, which is an o(n)-bit string obtainable from S in O(n) time. The O(1)-time algorithm for computing child_rank(S, v_k) is shown in Figure 2.

4.2 Child select

First we need the following lemmas to handle the select query for children. For any node v_i, let indexc(S, ℓ_i, m, c) = ℓ_j − ℓ_i, where v_j is a sibling of v_i with w(ℓ_i, ℓ_j) ≤ m such that child_rank(S, v_j) = child_rank(S, v_i) + c. If such a v_j does not exist, indexc(S, ℓ_i, m, c) = φ.

Lemma 4.3. It takes O(n) time to compute an o(n)-bit string αb such that indexc(S, ℓ_i, b², c) for any node v_i and index c can be computed from S and αb in O(1) time.

Proof. We simply build an O(n)-time obtainable table αb to store the answers for any possible inputs. That is, let αb[S[ℓ_i, ℓ_i + b² − 1]][c] = indexc(S, ℓ_i, b², c) for any node v_i and index c. Since each entry takes O(log b) bits, αb requires O(2^{b²} · b² · log b) = o(n) bits.

Lemma 4.4. Given a node v_i, it takes O(B) time to compute an o(B)-bit string αB(ℓ_i) such that indexc(S, ℓ_i, B, c) for any index c can be computed from S, αb, and αB(ℓ_i) in O(1) time.

Proof. For each t = 0, 1, . . . , ⌈B/b⌉ − 1, let W1[t] = indexc(S, ℓ_i, B, tb). W1 takes O(⌈B/b⌉ log B) = o(B) bits. If w(W1[t], W1[t + 1]) > b², we save the answers of indexc(S, ℓ_i, B, tb + z) for each z = 0, 1, . . . , b − 1 in W2. W2 takes at most O(⌈B/b²⌉ · b · log B) = o(B) bits. Otherwise, by Lemma 4.3, indexc(S, ℓ_i, B, tb + z) can be computed in O(1) time using W1[t] + indexc(S, ℓ_i + W1[t], b², z). Let αB(ℓ_i) = W1 ◦ W2, which has o(B) bits and is obtainable in O(B) time.

Given an array A of ⌈m/u⌉ positive ⌈log u⌉-bit integers with m ≤ n and u = ⌈log³ m⌉, let indexsum(A, x) denote the largest index y with Σ_{t=1}^{y} A[t] < x.

Lemma 4.5. It takes O(m) time to compute an o(m)-bit string αA(A, m) such that indexsum(A, x) for any index x can be determined from A and αA(A, m) in O(1) time.

Proof. This is a special case of the search query of the searchable partial sums problem [Raman et al. 2001; Hon et al. 2003]. Theorem 3 of [Hon et al. 2003] gave an o(m)-bit auxiliary string to support this query in O(1) time, but it is unclear whether the preprocessing time is O(m). Let us briefly prove this lemma as follows.

Let d(x1, x2) denote indexsum(A, x2) − indexsum(A, x1). For each t = 0, . . . , ⌈m/u⌉ − 1, let W3[t] = indexsum(A, tu). W3 needs O(⌈m/u⌉ log m) = o(m) bits. If d(tu, (t + 1)u) > ⌈log² u⌉, for each z = 0, 1, . . . , u − 1 we save the values of d(tu, tu + z) in W4. Because A is an array of positive integers, we have d(tu, tu + z) ≤ z and W4 needs at most O(⌈m/(u log² u)⌉ · u · log u) = o(m) bits. Otherwise, let

W5[A[indexsum(A, tu), indexsum(A, tu) + ⌈log² u⌉ − 1]][z] = d(tu, tu + z)

for each z = 0, 1, . . . , u − 1. W5 takes O(2^{log³ u} · u · log log u) = o(m) bits and is obtainable in O(m) time. Now, let αA(A, m) = W3 ◦ W4 ◦ W5, which requires o(m) bits and can be obtained in O(m) time. To answer indexsum(A, x) in O(1) time, first let t and z be the integers with x = tu + z and 0 ≤ z < u, and then find the values of indexsum(A, tu) and d(tu, tu + z) from αA(A, m). The answer is indexsum(A, tu) + d(tu, tu + z).
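To pin down what indexsum returns, here is a small Python reference that answers the query by binary search over prefix sums. This swaps in an O(log)-time search for the O(1)-time tables W3, W4, and W5 of the proof, so it only documents the semantics. The name make_indexsum and the sample array are illustrative.

```python
from bisect import bisect_left
from itertools import accumulate

def make_indexsum(A):
    """Return a function indexsum(x) giving the largest y with
    A[1] + ... + A[y] < x, for a list A of positive integers."""
    prefix = [0] + list(accumulate(A))   # prefix[y] = A[1] + ... + A[y]

    def indexsum(x):
        # prefix is strictly increasing because every A[t] is positive, so
        # bisect_left finds the first y with prefix[y] >= x; step back by one.
        return bisect_left(prefix, x) - 1

    return indexsum
```

With A = [3, 1, 4, 2] the prefix sums are 3, 4, 8, 10, so indexsum(4) = 1 and indexsum(9) = 3.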

Let child_select(S, v_p, c) denote the index ℓ_k such that v_k is the c-th child of v_p. We have the following theorem.

Theorem 4.6. It takes O(n) time to compute an o(n)-bit string αnew3 such that child_select(S, v_p, c) for each node v_p and c can be answered from S and αnew3 in O(1) time.

Proof. We say that nodes in a set D are d-disjoint [Chiang et al. 2005] if

—w(ℓ_i, r_i) > d holds for any node v_i in D; and

—any two nodes v_i and v_j in D satisfy at least one of |w(ℓ_i, ℓ_j)| > d and |w(r_i, r_j)| > d.

Let X be a 2⌈2n/d⌉-element array. For each t = 1, 2, . . . , ⌈2n/d⌉, we store v_i in X[2t − 1], where ℓ_i is the smallest index, if any, with (t − 1)d < ℓ_i ≤ td such that v_i is in D; and also store v_j in X[2t], where r_j is the largest index, if any, with (t − 1)d < r_j ≤ td such that v_j is in D. Then, every node v_i in D takes at least one slot in X, which can be easily verified using ℓ_i and r_i. We simply say that X has v_i if and only if v_i takes at least one of X[2⌈ℓ_i/d⌉ − 1] or X[2⌈r_i/d⌉]. For notational brevity, let X[v_i] denote the element taken by v_i.

The preprocessing follows this traversal procedure: first traverse each node v_p of T in pre-order, and for each v_p traverse every child v_i of v_p in counterclockwise order. Since selecting and matching a parenthesis on S takes O(1) time, and each node is traversed at most twice, once as v_p and once as v_i, the whole procedure takes O(n) time. The discussion below focuses on nodes v_p and v_i in each iteration of the aforementioned traversal.

—Case 1: v_i is a wide child of v_p. Let counter denote the number of wide nodes discovered before each iteration. It is not difficult to see that the parents of wide nodes are B-disjoint. Let X1 be the 2n_B-element array with X1[v_p] = (before_p, first, last), where before_p is the value of counter before we get v_p, and first (respectively, last) is the rank of the first (respectively, last) wide child of v_p. Then we partition S into length-B blocks. Let Y1 be the n_B-element array with Y1[t] = (before_i, ℓ_i), where ℓ_i is the smallest index in a block such that v_i is wide, before_i is the value of counter before we get v_i, and t is the first empty entry of Y1. Both X1 and Y1 take O(n_B log n) = o(n) bits.

—Case 2: v_i is a medium child of v_p. First we partition S into length-B blocks. If w(ℓ_p, ℓ_i) ≤ B, we say that v_i belongs to the ⌈ℓ_p/B⌉-th block, otherwise the ⌈r_p/B⌉-th block. For each t = 1, 2, . . . , n_B, let counter[t] denote the number of medium nodes belonging to the t-th block before each iteration. Note that at most B medium nodes belong to a block. Similarly, one can verify that the parents of medium nodes are b-disjoint. Let X2 be the 2n_b-element array with X2[v_p] = (before_L, first_L, last_L, before_R, first_R, last_R), where

—before_L (respectively, before_R) is the value of counter[⌈ℓ_p/B⌉] (respectively, the value of counter[⌈r_p/B⌉]) before we get v_p,

—first_L (respectively, first_R) is the rank of the first medium child of v_p belonging to the ⌈ℓ_p/B⌉-th (respectively, ⌈r_p/B⌉-th) block, and

—last_L (respectively, last_R) is the rank of the last medium child of v_p belonging to the ⌈ℓ_p/B⌉-th (respectively, ⌈r_p/B⌉-th) block.

Note that 1 ≤ first_L ≤ last_L ≤ B and degree(S, v_p) − B ≤ first_R ≤ last_R ≤ degree(S, v_p). We further partition each length-B block into length-b blocks. For each t = 1, 2, . . . , n_B, let Y2[t] be the ⌈B/b⌉-element array with Y2[t][u] = (before_i, ℓ_i), where ℓ_i is the smallest index in a length-b block such that v_i is a medium node belonging to the t-th length-B block, before_i is the value of counter[t] before we get v_i, and u is the first empty entry of Y2[t]. Observe that X2 needs O(n_b log B) = o(n) bits and Y2 needs O(n_B ⌈B/b⌉ log B) = o(n) bits.

For each t = 1, 2, . . . , n_B, let αB1[t] = αB(ℓ_i) with (before_i, ℓ_i) = Y1[t]. By Lemma 4.4, αB1 takes o(n) bits and is obtainable in O(n) time. Let A1 be the n_B-element array such that Σ_{t=1}^{u} A1[t] = before_i with (before_i, ℓ_i) = Y1[u] holds for each u = 1, 2, . . . , n_B. Note that 0 < A1[t] ≤ B holds for any index t, so A1 takes O(n_B log B) = o(n) bits. Also, for each t = 1, 2, . . . , n_B, let A2[t] be the ⌈B/b⌉-element array such that Σ_{u=1}^{x} A2[t][u] = before_i with (before_i, ℓ_i) = Y2[t][x] holds for each x = 1, 2, . . . , ⌈B/b⌉. Observe that 0 < A2[t][u] ≤ b holds for any indices t and u, so A2 takes O(n_B ⌈B/b⌉ log b) = o(n) bits. Let αA1 = αA(A1, n), and for each t = 1, 2, . . . , n_B, let αA2[t] = αA(A2[t], B). By Lemma 4.5, both αA1 and αA2 take o(n) bits and are obtainable in O(n) time. At last, we construct an O(n)-time obtainable table F with F[S[r_p − b + 1, r_p]][degree(S, v_p) − c] = r_p − ℓ_i, where v_i is the c-th child of v_p with w(ℓ_i, r_p) ≤ b. Note that degree(S, v_p) − c ≤ b, so F takes O(2^b · b · log b) = o(n) bits.

function child_select(S, v_p, c)
  if X1 has v_p then
    let (before_p, first, last) = X1[v_p];
    if first ≤ c ≤ last then
      let z = before_p + c − first + 1 and (before_i, ℓ_i) = Y1[indexsum(A1, z)];
      return ℓ_i + indexc(S, ℓ_i, B, z − before_i);
    end if
  end if
  if X2 has v_p then
    let (before_L, first_L, last_L, before_R, first_R, last_R) = X2[v_p];
    if first_L ≤ c ≤ last_L then
      let t = ⌈ℓ_p/B⌉, z = before_L + c − first_L + 1, and (before_i, ℓ_i) = Y2[t][indexsum(A2[t], z)];
      return ℓ_i + indexc(S, ℓ_i, b², z − before_i);
    end if
    if first_R ≤ c ≤ last_R then
      let t = ⌈r_p/B⌉, z = before_R + c − first_R + 1, and (before_i, ℓ_i) = Y2[t][indexsum(A2[t], z)];
      return ℓ_i + indexc(S, ℓ_i, b², z − before_i);
    end if
  end if
  if indexc(S, ℓ_p + 1, b², c) ≠ φ, then return ℓ_p + 1 + indexc(S, ℓ_p + 1, b², c);
  else return r_p − F[S[r_p − b + 1, r_p]][degree(S, v_p) − c];

Fig. 3. An O(1)-time algorithm that computes child_select(S, v_p, c).

To implement child select in O(1) time, let ℓk= child select (S, vp, c). vk is wide if and only if X1 has vp and first ≤ c ≤ last , where (beforep, first, last) = X1[vp].

Moreover, letting z = beforep+c−first+1, vkis the z-th wide node discovered during the traversal procedure. Let (beforei, ℓ_i) = Y1[indexsum(A1, z)], so vk is a sibling of v_i with w(ℓi, ℓ_k) ≤ B such that child rank (S, vk) = child rank (S, vi) + z − beforei. By Lemma 4.4, we can locate vk using ℓk = ℓi+ indexc(S, ℓi, B, z− beforei).

v_k is medium if and only if X2 has vp and at least one of firstL≤ c ≤ lastL and firstR ≤ c ≤ lastR is satisfied, where (beforeL, firstL, lastL, beforeR, firstR, lastR) = X2[vp]. If firstL ≤ c ≤ lastL, let t = ⌈^ℓ_B^p⌉ and z = beforeL+ c − firstL+ 1. If firstR ≤ c ≤ lastR, let t = ⌈^r_B^p⌉ and z = beforeR + c − firstR+ 1. Then, vk is the z-th medium node belonging to the t-th length-B block discovered during the traversal procedure. Let (beforei, ℓ_i) = Y2[t][indexsum(A2[t], z)], so vk is a sibling of vi with w(ℓi, ℓk) ≤ b such that child rank (S, vk) = child rank (S, vi) + z − beforei. By Lemma 4.3, we can locate vk using ℓk = ℓi+ indexc(S, ℓi, b², z− beforei).

If v_k is neither wide nor medium, it must be narrow. If indexc(S, ℓ_p + 1, b², c) ≠ φ, then we have ℓ_k = ℓ_p + 1 + indexc(S, ℓ_p + 1, b², c). Otherwise, ℓ_k = r_p − F[S[r_p − b + 1, r_p]][degree(S, v_p) − c].
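The three-way case analysis above can be sketched in Python as follows. This is only an illustration of the dispatch step: it assumes that X_1 and X_2 are available as dictionaries keyed by a node identifier p, that positions are 1-based integers, and that the function name classify_child and its return format are free choices of this sketch. The lookups through Y_1, Y_2, indexsum, and indexc that follow the dispatch are not modeled.

```python
def classify_child(X1, X2, p, l_p, r_p, c, B):
    """Classify the c-th child of v_p as wide, medium, or narrow.

    X1 maps p to (before_p, first, last); X2 maps p to
    (before_L, first_L, last_L, before_R, first_R, last_R).
    Returns (kind, t, z), where t is the block index used in the
    medium case and z is the rank among wide or medium nodes.
    t and z are None when they do not apply.
    """
    if p in X1:
        before_p, first, last = X1[p]
        if first <= c <= last:
            return ("wide", None, before_p + c - first + 1)
    if p in X2:
        before_L, first_L, last_L, before_R, first_R, last_R = X2[p]
        if first_L <= c <= last_L:
            # -(-a // B) is the integer ceiling of a / B for positive a.
            return ("medium", -(-l_p // B), before_L + c - first_L + 1)
        if first_R <= c <= last_R:
            return ("medium", -(-r_p // B), before_R + c - first_R + 1)
    return ("narrow", None, None)
```

For example, with the hypothetical entry X1[p] = (10, 3, 7) and c = 5, the sketch reports a wide child with z = 10 + 5 − 3 + 1 = 13, matching the formula z = before_p + c − first + 1.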

Finally, let α_new3 = α_aux ◦ α_b ◦ α_B1 ◦ X_1 ◦ Y_1 ◦ X_2 ◦ Y_2 ◦ A_1 ◦ α_A1 ◦ A_2 ◦ α_A2 ◦ F, which takes o(n) bits and can be computed from S in O(n) time. The O(1)-time algorithm for computing child select(S, v_p, c) is shown in Figure 3.
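For checking an implementation of child select against small inputs, a naive linear-time reference can be useful. The Python sketch below is such a baseline and is not the O(1)-time algorithm of Figure 3: it scans the balanced parenthesis string directly. It assumes 0-based positions, a 1-based child index c, and a well-formed (balanced) input S; the function name child_select_naive and the use of None for a missing child are choices made for this sketch.

```python
def child_select_naive(S, l_p, c):
    """Position of the open parenthesis of the c-th child of v_p.

    S is a balanced parenthesis string, l_p is the 0-based position of
    the open parenthesis of v_p, and c >= 1.  Returns None if v_p has
    fewer than c children.  Runs in time linear in the subtree size.
    """
    if c < 1 or S[l_p] != '(':
        return None
    i = l_p + 1
    seen = 0
    # Every '(' met at this level opens the next child of v_p.
    while i < len(S) and S[i] == '(':
        seen += 1
        if seen == c:
            return i
        # Skip the whole subtree of the current child.
        depth = 0
        while True:
            depth += 1 if S[i] == '(' else -1
            i += 1
            if depth == 0:
                break
    return None
```

On the 7-node tree S = "((()())()(()))", the root at position 0 has three children whose open parentheses are at positions 1, 7, and 9, so the calls with c = 1, 2, 3 return those positions and c = 4 returns None.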

Acknowledgments

We thank Kai-min Chung for helpful discussion. We also thank the anonymous reviewers for their helpful comments.

REFERENCES

Bell, T. C., Cleary, J. G., and Witten, I. H. 1990. Text Compression. Prentice-Hall, Englewood Cliffs, NJ.

Bender, M. A. and Farach-Colton, M. 2000. The LCA problem revisited. In Proceedings of the 4th Latin American Symposium on Theoretical Informatics, G. H. Gonnet, D. Panario, and A. Viola, Eds. Lecture Notes in Computer Science 1776. Springer, Punta del Este, Uruguay, 88–94.

Benoit, D., Demaine, E. D., Munro, J. I., Raman, R., Raman, V., and Rao, S. S. 2005. Representing trees of higher degree. Algorithmica 43, 4, 275–292.

Bonichon, N., Gavoille, C., Hanusse, N., Poulalhon, D., and Schaeffer, G. 2006. Planar graphs, via well-orderly maps and trees. Graphs and Combinatorics 22, 1–18.

Chiang, Y.-T., Lin, C.-C., and Lu, H.-I. 2005. Orderly spanning trees with applications. SIAM Journal on Computing 34, 4, 924–945.

Chuang, R. C.-N., Garg, A., He, X., Kao, M.-Y., and Lu, H.-I. 1998. Compact encodings of planar graphs via canonical ordering and multiple parentheses. In Proceedings of the 25th International Colloquium on Automata, Languages, and Programming, K. G. Larsen, S. Skyum, and G. Winskel, Eds. Lecture Notes in Computer Science 1443. Springer-Verlag, Aalborg, Denmark, 118–129.

Clark, D. R. 1996. Compact PAT trees. Ph.D. thesis, University of Waterloo.

Clark, D. R. and Munro, J. I. 1996. Efficient suffix trees on secondary storage. In Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms. ACM/SIAM, Atlanta, Georgia, 383–391.

Delpratt, O., Rahman, N., and Raman, R. 2006. Engineering the LOUDS succinct tree representation. In Proceedings of the 5th International Workshop on Experimental Algorithms. Lecture Notes in Computer Science 4007. Springer-Verlag, Cala Galdana, Menorca, Spain, 134–145.

Elias, P. 1975. Universal codeword sets and representations of the integers. IEEE Transactions on Information Theory IT-21, 194–203.

Gabow, H. N., Bentley, J. L., and Tarjan, R. E. 1984. Scaling and related techniques for geometry problems. In Proceedings of the Sixteenth Annual ACM Symposium on Theory of Computing. ACM Press, New York, NY, USA, 135–143.

Geary, R. F., Rahman, N., Raman, R., and Raman, V. 2004. A simple optimal representation for balanced parentheses. In Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching, S. C. Sahinalp, S. Muthukrishnan, and U. Dogrusöz, Eds. Lecture Notes in Computer Science 3109. Springer-Verlag, Istanbul, Turkey, 159–172.

Geary, R. F., Raman, R., and Raman, V. 2004. Succinct ordinal trees with level-ancestor queries. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, J. I. Munro, Ed. SIAM, New Orleans, Louisiana, USA, 1–10.

Graham, R. L., Knuth, D. E., and Patashnik, O. 1989. Concrete Mathematics. Addison-Wesley, Reading, Massachusetts.

He, X., Kao, M.-Y., and Lu, H.-I. 1999. Linear-time succinct encodings of planar graphs via canonical orderings. SIAM Journal on Discrete Mathematics 12, 3, 317–325.

Hon, W.-K., Sadakane, K., and Sung, W.-K. 2003. Succinct data structures for searchable partial sums. In Proceedings of the 14th Symposium on Algorithms and Computation, T. Ibaraki, N. Katoh, and H. Ono, Eds. Lecture Notes in Computer Science 2906. Springer-Verlag, Kyoto, Japan, 505–516.

Jacobson, G. 1989. Space-efficient static trees and graphs. In Proceedings of the 30th Annual Symposium on Foundations of Computer Science. IEEE, Research Triangle Park, North Carolina, 549–554.

Munro, J. I. 1996. Tables. In Proceedings of the 16th Conference on Foundations of Software Technology and Theoretical Computer Science. Lecture Notes in Computer Science 1180. Springer-Verlag, Hyderabad, India, 37–42.

Munro, J. I. and Raman, V. 2001. Succinct representation of balanced parentheses, static trees and planar graphs. SIAM Journal on Computing 31, 3, 762–776.

Munro, J. I., Raman, V., and Rao, S. S. 2001. Space efficient suffix trees. Journal of Algorithms 39, 2, 205–222.

Munro, J. I. and Rao, S. S. 2004. Succinct representations of functions. In Proceedings of the 31st International Colloquium on Automata, Languages and Programming. Lecture Notes in Computer Science 3142. Springer-Verlag, Turku, Finland, 1006–1015.

Raman, R., Raman, V., and Rao, S. S. 2001. Succinct dynamic data structures. In Proceedings of the 7th International Workshop on Algorithms and Data Structures, F. K. H. A. Dehne, J.-R. Sack, and R. Tamassia, Eds. Lecture Notes in Computer Science 2125. Springer-Verlag, Providence, RI, USA, 426–437.

Sadakane, K. 2002. Succinct representations of lcp information and improvements in the compressed suffix arrays. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms. ACM/SIAM, San Francisco, 225–232.

van Emde Boas, P. 1990. Machine models and simulations. In Handbook of Theoretical Computer Science, J. van Leeuwen, Ed. Vol. A. Elsevier, Amsterdam, Chapter 1, 1–60.
