## Algorithms for Finding the Weight-Constrained k Longest Paths in a Tree and the Length-Constrained k Maximum-Sum Segments of a Sequence

### Hsiao-Fei Liu^{1} and Kun-Mao Chao^{1,2,3,∗}

### ^{1}Department of Computer Science and Information Engineering
### ^{2}Graduate Institute of Biomedical Electronics and Bioinformatics
### ^{3}Graduate Institute of Networking and Multimedia

### National Taiwan University, Taipei, Taiwan 106

### June 20, 2008

Abstract: In this work, we obtain the following new results:

– Given a tree T = (V, E) with a length function ℓ : E → R and a weight function w : E → R, a positive integer k, and an interval [L, U], the Weight-Constrained k Longest Paths problem is to find the k longest paths among all paths in T with weights in the interval [L, U]. We show that the Weight-Constrained k Longest Paths problem has a lower bound of Ω(V log V + k) in the algebraic computation tree model and give an O(V log V + k)-time algorithm for it.

– Given a sequence A = (a_1, a_2, . . . , a_n) of numbers and an interval [L, U], we define the sum and length of a segment A[i, j] to be a_i + a_{i+1} + · · · + a_j and j − i + 1, respectively. The Length-Constrained k Maximum-Sum Segments problem is to find the k maximum-sum segments among all segments of A with lengths in the interval [L, U]. We show that the Length-Constrained k Maximum-Sum Segments problem can be solved in O(n + k) time.

∗Corresponding author, kmchao@csie.ntu.edu.tw

### 1 Introduction

Optimization is one of the most basic types of algorithmic problems. In an optimization problem, the goal is to find the best feasible solution. In practice, however, it is often not enough to find only the best feasible solution; we may be required to enumerate, for example, the top ten or top twenty feasible solutions. We call a problem of this kind, where the goal is to find the top k feasible solutions for a given k, an enumeration problem. In this paper, we study some enumeration problems on trees and sequences.

We start by considering problems on trees. Let T = (V, E) be a tree with a length function ℓ : E → R and a weight function w : E → R. Define the length and weight of a path P = (v_1, v_2, . . . , v_n) in T to be Σ_{1≤i≤n−1} ℓ(v_i v_{i+1}) and Σ_{1≤i≤n−1} w(v_i v_{i+1}), respectively. Given T, the Tree Longest Path problem (also known as the Tree Diameter problem) is to find the longest path in T. The Tree Longest Path problem is a fundamental problem in dealing with trees and is solvable in O(V) time [37]. In the following, we introduce two generalizations of the Tree Longest Path problem, which are closely related to our study in this paper.

One is the Tree k Longest Paths problem. Given T and a positive integer k, the Tree k Longest Paths problem is to find the k longest paths among all paths in T. Megiddo et al. [33] proposed an O(V log^2 V)-time algorithm for finding the k-th longest path. Later, Frederickson and Johnson [21] improved the time complexity to O(V log V). After finding the k-th longest path, the k longest paths can be constructed with additional O(k) time from the computed information. Hence, the Tree k Longest Paths problem is solvable in O(V log V + k) time.

The other is the Weight-Constrained Longest Path problem. Given T and an interval [L, U], the Weight-Constrained Longest Path problem is to find the longest path among all paths in T with weights in the interval [L, U]. The Weight-Constrained Longest Path problem was formulated by Wu et al. [36] and motivated as follows. Given a tree network with a length and a weight on each edge, we want to maintain the network by choosing a path and renewing the old and shabby edges of this path. The length and weight of an edge measure the traffic load and the update cost of this edge, respectively. Since we also have budget constraints which limit the weight of the path to be updated, the goal is to find the longest path subject to the weight constraints. Wu et al. [36] proposed an O(V log^2 V)-time algorithm for the case where the edge weight lower bound is ineffective, i.e., L = −∞. Kim [28] gave an O(V log V)-time algorithm to cope with the case where the tree has constant degree and a uniform edge weight and the edge weight lower bound is ineffective.

In this paper, we study the Weight-Constrained k Longest Paths problem, which is a combination of the Tree k Longest Paths problem and the Weight-Constrained Longest Path problem. Given T, a positive integer k, and an interval [L, U], the Weight-Constrained k Longest Paths problem is to find the k longest paths among all paths in T with weights in the interval [L, U]. We give an O(V log V + k)-time algorithm for the Weight-Constrained k Longest Paths problem and prove that it has an Ω(V log V + k) lower bound in the algebraic computation tree model.

Next, we consider problems on sequences. Let A = (a_1, a_2, . . . , a_n) be a sequence of numbers. Define the sum and length of a segment A[i, j] to be a_i + a_{i+1} + · · · + a_j and j − i + 1, respectively. The Maximum-Sum Segment problem, given A, is to find a segment of A that maximizes the sum. The Maximum-Sum Segment problem was first presented by Grenander [23] and finds applications in pattern recognition [23, 34], biological sequence analysis [1], and data mining [22]. The Maximum-Sum Segment problem is solvable in linear time using Kadane's algorithm [7]. A variety of generalizations of the Maximum-Sum Segment problem have been proposed to fulfill more requirements. In the following, we introduce two of them, which are closely related to our study in this paper.

*One is the k Maximum-Sum Segments problem. Given A and a positive integer k, the k*
*Maximum-Sum Segments problem is to locate the k segments whose sums are the k largest*
*among all possible sums. The k Maximum-Sum Segments problem was first presented by*
Bae and Takaoka [2]. Since then, this problem has drawn a lot of attention [3, 6, 10, 13, 31, 32],
*and recently an optimal O(n + k)-time algorithm was given by Brodal and Jørgensen [10].*

The other is the Length-Constrained Maximum-Sum Segment problem. Given A and two integers L, U with 1 ≤ L ≤ U ≤ n, the Length-Constrained Maximum-Sum Segment problem is to find the maximum-sum segment among all segments of A with lengths in the interval [L, U]; it is solvable in O(n) time [17, 31]. The Length-Constrained Maximum-Sum Segment problem was formulated by Huang [26] and motivated by its application to finding GC-rich segments of a DNA sequence. A DNA sequence is composed of four letters: A, C, G, and T. Given a DNA sequence, biologists often need to identify the GC-rich segments
satisfying some length constraints. By giving each of the letters C and G a reward of 1 − p and each of the letters A and T a penalty of −p, where p is a positive constant ratio, the problem is reformulated as finding the length-constrained maximum-sum segment.
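The scoring reformulation above can be sketched in a few lines; the value p = 0.5 below is an arbitrary illustrative choice, not one taken from the text:

```python
def dna_to_scores(seq, p=0.5):
    """Map each base to a reward of 1 - p (C or G) or a penalty of -p (A or T),
    so a segment's sum equals (#GC letters) - p * (segment length)."""
    return [(1 - p) if b in "CG" else -p for b in seq.upper()]

scores = dna_to_scores("ATGCGCGTA", p=0.5)
# Running a length-constrained maximum-sum segment algorithm on `scores`
# now returns the most GC-rich segment with length in [L, U].
```
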

In this paper, we study the Length-Constrained k Maximum-Sum Segments problem, which is a combination of the k Maximum-Sum Segments problem and the Length-Constrained Maximum-Sum Segment problem. Given A, a positive integer k, and two integers L, U with 1 ≤ L ≤ U ≤ n, the Length-Constrained k Maximum-Sum Segments problem is to find the k maximum-sum segments among all segments of A with lengths in the interval [L, U]. Note that the Length-Constrained k Maximum-Sum Segments problem can also be considered as a specialization of the Weight-Constrained k Longest Paths problem if we treat the given sequence as a chain of edges whose lengths are given by the numbers in the sequence and whose weights are all equal to one. After giving an O(V log V + k)-time algorithm for the Weight-Constrained k Longest Paths problem, we give an O(n + k)-time algorithm for the Length-Constrained k Maximum-Sum Segments problem (or equivalently, an O(V + k)-time algorithm for a specialization of the Weight-Constrained k Longest Paths problem where the input tree is a chain of edges with
a uniform weight). It should be noted that our basic approach for solving the Length-Constrained k Maximum-Sum Segments problem was discovered independently by Brodal and Jørgensen [10] in solving the k Maximum-Sum Segments problem. Both approaches construct in O(n) time a heap that implicitly stores all feasible solutions and then run Frederickson's [18] heap selection algorithm on this heap to find the k best feasible solutions in O(k) time.

As a byproduct, we show that our algorithms can be used as a basis for delivering more efficient algorithms for some related enumeration problems, such as finding the weight-constrained k largest elements of X + Y, finding the sum-constrained k longest segments, finding k length-constrained segments satisfying a density lower bound, and finding area-constrained k maximum-sum subarrays.

### 2 O(V log V + k)-Time Algorithm for the Weight-Constrained k Longest Paths Problem

*In this section, we prove that the Weight-Constrained k Longest Paths problem can be*
*solved in O(V log V + k) time.*

### 2.1 Preliminaries

*To achieve the time bound of O(V log V + k), we make use of Frederickson and Johnson’s [21]*

representation of intervertex distances of a tree, range maxima query (RMQ) [5, 19, 25], and
*Frederickson’s [18] algorithm for finding the maximum k elements in a heap-ordered tree. In*
the following, we briefly review these data structures and algorithms.

*Definition 1: Let T = (V, E) be a tree. A node v ∈ V is said to be the centroid of T if and*
*only if after removing v from T , each resulting connected component contains at most |V |/2*
nodes.

Definition 2: Let T = (V, E) be a tree. A triplet (c, T_1 = (V_1, E_1), T_2 = (V_2, E_2)) is called a centroid decomposition of T if it satisfies the following properties: (1) c is a centroid of T; (2) T_1 and T_2 are two subtrees of T such that V_1 ∩ V_2 = {c}, (|V| + 2)/3 ≤ |V_1| ≤ (2|V| + 1)/3, and E_1 ∪ E_2 = E.

Notation 1: Let T = (V, E) be a tree with a length function ℓ : E → R and a weight function w : E → R. We slightly overload the notation by letting ℓ(u, v) and w(u, v) also denote the length and weight of the path from u to v, even if there is no edge between u and v.

Definition 3: Let T = (V, E) be a tree with a length function ℓ : E → R and a weight function w : E → R. A rooted ordered binary tree T′ = (V′, E′, r) in which each node contains fields cent, list_1, and list_2 is called a centroid decomposition tree of T rooted at r if it satisfies the following recursive properties: (1) If |V| = 1, then |V′| = 1, r.cent is the only vertex in V, and r.list_1 = r.list_2 = NIL; (2) if |V| = 2, then |V′| = 1, r.cent is one of the vertices in V, r.list_1 = ((v, ℓ(r.cent, v), w(r.cent, v))), and r.list_2 = ((r.cent, 0, 0)), where v ∈ V \ {r.cent}; (3) if |V| > 2, then there exists a centroid decomposition (c, T_1 = (V_1, E_1), T_2 = (V_2, E_2)) of T such that the left subtree and right subtree of r are centroid decomposition trees of T_1 and T_2, respectively, r.cent = c, and r.list_j, j ∈ {1, 2}, is a list of triplets ((v_i, ℓ(c, v_i), w(c, v_i)) : v_i ∈ V_j − {c}) sorted on w(c, v_i).

As an illustration, a tree T and its centroid decomposition tree T′ are shown in Figure 1 and Figure 2, respectively.

Theorem 1: [Frederickson and Johnson [21]] Given a tree T = (V, E) with a length function ℓ : E → R and a weight function w : E → R, we can construct a centroid decomposition tree of T in O(V log V) time.

Now we describe the Range Maxima Query (RMQ) problem. In the RMQ problem, a list A = (a_1, a_2, . . . , a_n) of n real numbers is given to be preprocessed such that any range maxima query can be answered quickly. A range maxima query specifies an interval [i, j], and the goal is to find the index k with i ≤ k ≤ j such that a_k achieves the maximum.

We first describe a simple algorithm for solving the RMQ problem in O(n log n) preprocessing time and O(1) time per query. For each 1 ≤ i ≤ n and each 0 ≤ j ≤ ⌊log n⌋, we precompute M[i][j] = arg max_{k=i,...,i+2^j−1} a_k, i.e., the index of the maximum element in A[i, i + 2^j − 1]. This can be done in O(n log n) time by dynamic programming because

M[i][j] = M[i][j − 1] if A[M[i][j − 1]] ≥ A[M[i + 2^{j−1}][j − 1]], and M[i][j] = M[i + 2^{j−1}][j − 1] otherwise.

Given a query interval [i, j], let t = ⌊log(j − i + 1)⌋. Because both [i, i + 2^t − 1] and [j − 2^t + 1, j] are subintervals of [i, j] and [i, i + 2^t − 1] ∪ [j − 2^t + 1, j] = [i, j], the index of the maximum element in A[i, j] is arg max_{m ∈ {M[i][t], M[j−2^t+1][t]}} A[m].
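This sparse-table scheme can be transcribed directly (a sketch of the folklore structure with 0-based indexing, computing ⌊log_2⌋ via `int.bit_length`, not code from the paper):

```python
def build_sparse_table(a):
    """M[j][i] = index of the maximum of a[i .. i + 2^j - 1] (0-based)."""
    n = len(a)
    M = [list(range(n))]                      # level 0: intervals of length 1
    j = 1
    while (1 << j) <= n:
        prev = M[j - 1]
        half = 1 << (j - 1)
        M.append([prev[i] if a[prev[i]] >= a[prev[i + half]] else prev[i + half]
                  for i in range(n - (1 << j) + 1)])
        j += 1
    return M

def rmq(a, M, i, j):
    """Index of the maximum of a[i .. j] (inclusive) in O(1) time."""
    t = (j - i + 1).bit_length() - 1          # t = floor(log2(j - i + 1))
    left, right = M[t][i], M[t][j - (1 << t) + 1]
    return left if a[left] >= a[right] else right

a = [3, 1, 4, 1, 5, 9, 2, 6]
M = build_sparse_table(a)
assert a[rmq(a, M, 2, 6)] == 9
```

The two overlapping intervals queried by `rmq` are exactly the pair described above; overlap is harmless because taking a maximum is idempotent.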

We now sketch an algorithm for solving the RMQ problem in O(n) preprocessing time and O(1) time per query. This algorithm was given by Bender and Farach-Colton [5], who showed that the RMQ problem is linearly equivalent to the RMQ±1 problem, which is the same as the RMQ problem except that adjacent elements of the input list differ by exactly one. Thus, in the following we focus on the RMQ±1 problem. Let A = (a_1, a_2, . . . , a_n) be an instance of the RMQ±1 problem.^1

The algorithm starts by dividing the list A into 2n/log n shorter sublists A[1, (log n)/2], A[(log n)/2 + 1, log n], . . . , A[n − (log n)/2 + 1, n], each of length (log n)/2. Each sublist A[((i − 1) log n)/2 + 1, (i log n)/2] is represented by the maximum element r_i in it. They then run the simple RMQ algorithm described above on these O(n/log n) representatives in O((n/log n) log(n/log n)) = O(n) preprocessing time. By the property that adjacent elements in the list A differ by exactly one, they use a table-lookup technique to precompute the indices of the maximum elements in all sublists of A with lengths ≤ (log n)/2 in O(n) time.

Given a query interval [i, j], let i′ = ⌈2i/log n⌉ and j′ = ⌊2j/log n⌋. Let r_k be the maximum of {r_{i′+1}, r_{i′+2}, . . . , r_{j′−1}}, let a_{i∗} be the maximum element in A[i, i′(log n)/2], and let a_{j∗} be the maximum element in A[j′(log n)/2, j]. Because we have run the simple RMQ algorithm on (r_1, r_2, . . . , r_{2n/log n}), k can be found in constant time given [i′ + 1, j′ − 1]. Because both A[i, i′(log n)/2] and A[j′(log n)/2, j] have lengths ≤ (log n)/2, we can also find a_{i∗} and a_{j∗} in constant time. Note that the maximum of {r_k, a_{i∗}, a_{j∗}} is also the maximum element in A[i, j]. Thus, if a_{i∗} is the maximum of {r_k, a_{i∗}, a_{j∗}}, then we can directly return i∗. Similarly, if a_{j∗} is the maximum of {r_k, a_{i∗}, a_{j∗}}, then return j∗. Otherwise, if r_k is the maximum of {r_k, a_{i∗}, a_{j∗}}, then find and return the index of the maximum element in A[((k − 1) log n)/2 + 1, (k log n)/2], which can be done in constant time because this sublist has length equal to (log n)/2.

^1 For simplicity, we assume n is a power of two.

*Theorem 2: [RMQ [5, 19, 25]] The RMQ problem can be solved in O(n) preprocessing time*

*and O(1) time per query.*

For our purposes, a D-heap is a rooted degree-D tree in which each node contains a field value, satisfying the restriction that the value of any node is larger than or equal to the values of its children. Note that we do not require the tree to be balanced. Frederickson [18] proposed an algorithm for finding the k largest elements in a D-heap in O(k) time. When Frederickson's algorithm traverses the heap to find the k largest nodes, it does not access a node unless it has already accessed the node's parent. This property makes it possible to run Frederickson's algorithm without first explicitly building the entire heap in memory, as long as we have a way to obtain the information of a node given the information of its parent.

We sketch an O(k log log k)-time algorithm [18] for enumerating the k largest-value nodes in a heap as follows. For simplicity, we assume all nodes in the heap have distinct values. A node is said to be of rank i if it is the i-th largest node. The algorithm runs by first finding a node u in the heap in O(k log log k) time such that the rank of u is between k and ck for some constant c. Then the algorithm identifies all nodes in the heap not smaller than u in O(ck) = O(k) time and returns the k largest nodes among them. To find u, we form at most 2⌈k/⌊log k⌋⌉ + 1 groups of nodes, called clans. Each clan is of size at most ⌊log k⌋ and is represented by the smallest node in it; representatives are managed in an auxiliary heap. We form the first clan C_1 in O(log k log log k) time by grouping the largest ⌊log k⌋ nodes in the original heap and initialize the auxiliary heap with the representative of C_1. Set the offspring os(C_1) of C_1 to the set of nodes in the original heap which are children of C_1 but not in C_1,


*Figure 1: A tree T associated with an edge length function ` and an edge weight function w.*

and set the poor relations pr(C_1) of C_1 to the empty set. Then for i from 1 to ⌈k/⌊log k⌋⌉, do the following. Extract the largest element in the auxiliary heap and let C_j be the clan represented by the element extracted. If os(C_j) is not empty, then form a new clan C_{i+1} in O(log k log log k) time by grouping the ⌊log k⌋ largest nodes from the subheaps rooted at os(C_j) in the original heap. Insert the representative of C_{i+1} into the auxiliary heap. Set os(C_{i+1}) to the group of nodes in the original heap which are children of C_{i+1} but not in C_{i+1}, and set pr(C_{i+1}) to the group of nodes which are members of os(C_j) but not included in C_{i+1}. If pr(C_j) is not empty, then form a new clan C_{i+2} in O(log k log log k) time by grouping the ⌊log k⌋ largest nodes from the subheaps rooted at pr(C_j) in the original heap. Insert the representative of C_{i+2} into the auxiliary heap. Set os(C_{i+2}) to the group of nodes in the original heap which are children of C_{i+2} but not in C_{i+2}, and set pr(C_{i+2}) to the group of nodes which are members of pr(C_j) but not included in C_{i+2}. When the loop terminates, set u to the last element extracted from the auxiliary heap. Since at most 2⌈k/⌊log k⌋⌉ + 1 clans are formed and each clan can be formed in O(log k log log k) time, the total time is O(k log log k).

By applying the above approach recursively, plus some speed-up techniques, Frederickson [18] obtained an O(k)-time algorithm.

Theorem 3: [Frederickson [18]] For any constant D, we can find the k largest-value nodes in any D-heap in O(k) time.
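Frederickson's linear-time selection is intricate; as a point of reference, the following simpler best-first search finds the k largest values of a heap-ordered tree in O(k log k) time, and it also touches a node only after its parent, which is the access pattern that allows the heap to stay implicit. This is a stand-in sketch, not Frederickson's algorithm:

```python
import heapq

def k_largest_in_heap(value, children, root, k):
    """Return the k largest values in a heap-ordered tree, visiting a node
    only after its parent has been visited. O(k log k) time.
    value(u): the value stored at node u; children(u): u's children."""
    frontier = [(-value(root), root)]         # max-heap via negated keys
    out = []
    while frontier and len(out) < k:
        neg, u = heapq.heappop(frontier)
        out.append(-neg)
        for c in children(u):                 # children are <= their parent
            heapq.heappush(frontier, (-value(c), c))
    return out

# Example: a complete binary heap stored in an array.
h = [15, 10, 12, 7, 9, 4, 11]
val = lambda i: h[i]
kids = lambda i: [j for j in (2*i + 1, 2*i + 2) if j < len(h)]
assert k_largest_in_heap(val, kids, 0, 3) == [15, 12, 11]
```

Because only nodes reachable from popped nodes are ever materialized, at most O(k) nodes of the (possibly huge) heap are touched.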

### 2.2 *Finding the Weight-Constrained k Longest Paths*

For simplicity, we only consider paths with at least two distinct vertices, and we do not distin-
*guish between the path from u to v and the path from v to u, i.e., the path from u to v and*

Figure 2: A centroid decomposition tree T′ of the tree in Figure 1.

*the path from v to u are considered the same. Thus each path is uniquely determined by the*
*unordered pair of its end vertices. We define the length and weight of an unordered pair {u, v}*

*to be the length and weight of the path from u to v, respectively. We say an unordered pair*
*{u, v} of vertices is feasible if and only if its weight is in the interval [L, U ]. Our task is to find*
*the k longest feasible unordered pairs of vertices in T .*

Before moving on to the details of the algorithm, let us pause here to sketch our main idea. First, we divide T into two subtrees T_1 and T_2 of roughly the same size and find all the feasible unordered pairs {u, v} satisfying u ∈ V(T_1) and v ∈ V(T_2). Next, we recursively compute all feasible unordered pairs of vertices in T_1 and all feasible unordered pairs of vertices in T_2, respectively. After finishing this recursive process, we have all feasible unordered pairs of vertices in T. We then build a heap consisting of all these unordered pairs and find the k longest unordered pairs in this heap by applying Frederickson's algorithm [18]. The major difficulty is that the number of feasible unordered pairs of vertices in T may be much larger than |V| log |V| + k. Thus, we have to represent the set of all feasible unordered pairs of vertices in T in a succinct way such that we are still able to build an implicit representation of the heap stated above and run Frederickson's algorithm [18] on this implicitly represented heap without loss of efficiency.

We now describe our algorithm in detail. First, we construct a centroid decomposition tree T′ = (V′, E′, r) of T in O(V log V) time by Theorem 1. For each v ∈ V′ and i ∈ {1, 2}, let (v_{i,j}, ℓ(v.cent, v_{i,j}), w(v.cent, v_{i,j})) be the j-th element of v.list_i if it exists. Note that since Σ_{v∈V′} (|v.list_1| + |v.list_2| + 1) = O(V log V), we can find ℓ(v.cent, v_{i,j}) and w(v.cent, v_{i,j}) for all v ∈ V′, i ∈ {1, 2}, and 1 ≤ j ≤ |v.list_i| in total O(V log V) time. By the next lemma, in total O(V log V) time, for all v ∈ V′ and 1 ≤ i ≤ |v.list_1|, we can find an interval [p^v_i, q^v_i] such that

1. w(v.cent, v_{1,i}) + w(v.cent, v_{2,j}) = w(v_{1,i}, v_{2,j}) ∈ [L, U] for all j ∈ [p^v_i, q^v_i];

2. w(v_{1,i}, v_{2,j}) ∉ [L, U] for all j ∉ [p^v_i, q^v_i].

It follows that the set of all feasible unordered pairs of vertices in T is equal to the set ⋃_{v∈V′} ⋃_{i=1}^{|v.list_1|} {{v_{1,i}, v_{2,j}} : j ∈ [p^v_i, q^v_i]}.

Lemma 1: Let T′ = (V′, E′, r) be a centroid decomposition tree of T = (V, E). In total O(V log V) time, for all v ∈ V′ and 1 ≤ i ≤ |v.list_1|, we can find an interval [p^v_i, q^v_i] such that (1) w(v_{1,i}, v_{2,j}) ∈ [L, U] for all j ∈ [p^v_i, q^v_i] and (2) w(v_{1,i}, v_{2,j}) ∉ [L, U] for all j ∉ [p^v_i, q^v_i].

Proof: Since Σ_{v∈V′} (|v.list_1| + |v.list_2| + 1) = O(V log V), we only have to show that for each v ∈ V′, we can compute [p^v_i, q^v_i] for all 1 ≤ i ≤ |v.list_1| in total O(|v.list_1| + |v.list_2| + 1) time. Given v ∈ V′, we claim the following procedure computes [p^v_i, q^v_i] for all 1 ≤ i ≤ |v.list_1| in total O(|v.list_1| + |v.list_2| + 1) time.

1. Let n′ = |v.list_1| and m′ = |v.list_2|.

2. If n′ = 0 or m′ = 0, then stop.

3. Set p and q to m′.

4. For i ← 1 to n′ do

   (a) While w(v_{1,i}, v_{2,p−1}) ≥ L and p − 1 ≥ 1, do p ← p − 1.

   (b) While w(v_{1,i}, v_{2,q}) > U and q ≥ p, do q ← q − 1.

   (c) Set p^v_i ← p and q^v_i ← q.

It is not hard to see that the running time of this procedure is O(|v.list_1| + |v.list_2| + 1) since the values of p and q are nonincreasing. To verify the correctness, it suffices to note that since each list v.list_i, i ∈ {1, 2}, is sorted on w(v.cent, v_{i,j}), the sequences (p^v_1, . . . , p^v_{|v.list_1|}) and (q^v_1, . . . , q^v_{|v.list_1|}) must be nonincreasing.
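The procedure in the proof can be transcribed almost verbatim. In this sketch, w1[i] and w2[j] stand for w(v.cent, v_{1,i}) and w(v.cent, v_{2,j}) (both lists sorted in nondecreasing order, 0-based), and an explicit guard is added for iterations in which no index j qualifies, a boundary case the pseudocode above leaves implicit:

```python
def feasible_windows(w1, w2, L, U):
    """For each i, compute the (0-based, inclusive) window [p, q] of indices j
    with L <= w1[i] + w2[j] <= U. Both lists must be sorted in nondecreasing
    order; p and q only move left, so the total time is O(len(w1) + len(w2)).
    Returns one (p, q) pair per i; an empty window is encoded as q < p."""
    if not w1 or not w2:
        return []
    p = q = len(w2) - 1
    out = []
    for i in range(len(w1)):
        while p - 1 >= 0 and w1[i] + w2[p - 1] >= L:   # extend left while >= L
            p -= 1
        while q >= p and w1[i] + w2[q] > U:            # shrink right while > U
            q -= 1
        if q >= p and w1[i] + w2[q] >= L:
            out.append((p, q))
        else:
            out.append((p, p - 1))                     # guard: no j qualifies
    return out
```

For example, `feasible_windows([0, 1], [1, 2, 3, 4, 5], 3, 4)` yields `[(2, 3), (1, 2)]`, and the windows indeed move left as w1 grows.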

Next, for each v ∈ V′, we preprocess v.list_2 so that given any interval [i, j], we can find the index k, denoted RMQ(v.list_2, i, j), in [i, j] such that ℓ(v.cent, v_{2,k}) achieves the maximum, in O(1) time. By Theorem 2, this preprocessing can be done in O(Σ_{v∈V′} |v.list_2|) = O(V log V) time.

Before going on to the next point, we would like to define some data structures. For each v ∈ V′ and 1 ≤ i ≤ |v.list_1|, define H(v_{1,i}) to be a rooted ordered binary tree which consists of nodes with fields pair, value, and interval and satisfies the following properties.

1. There are in total q^v_i − p^v_i + 1 nodes in H(v_{1,i}), and the interval of the root of H(v_{1,i}) is [p^v_i, q^v_i].

2. For each node u of H(v_{1,i}), if p < k then u's left child has interval [p, k − 1], and if k < q then u's right child has interval [k + 1, q], where [p, q] = u.interval and k = RMQ(v.list_2, p, q).

3. For each node u of H(v_{1,i}), if u.interval = [p, q] then u.pair = {v_{1,i}, v_{2,k}} and u.value = ℓ(v_{1,i}, v_{2,k}), where k = RMQ(v.list_2, p, q).

Let us now return to describing our algorithm. Denote by V(H(v_{1,i})) the set of nodes in H(v_{1,i}). It should be noticed that the set of all feasible unordered pairs of vertices in T is equal to the set

⋃_{v∈V′} ⋃_{i=1}^{|v.list_1|} {{v_{1,i}, v_{2,j}} : j ∈ [p^v_i, q^v_i]} = ⋃_{v∈V′} ⋃_{i=1}^{|v.list_1|} {u.pair : u ∈ V(H(v_{1,i}))}.

Therefore, the remaining work is to find the k largest-value nodes in ⋃_{v∈V′} ⋃_{i=1}^{|v.list_1|} V(H(v_{1,i})). Clearly, we cannot afford to construct H(v_{1,i}) explicitly for each v_{1,i}. But notice that given any node u of H(v_{1,i}), we can always construct u's children in O(1) time since we have done the RMQ preprocessing on the list v.list_2. Thus we shall only construct the root of H(v_{1,i}) in the first instance and expand the tree as needed. Since we know p^v_i and q^v_i for each v ∈ V′ and 1 ≤ i ≤ |v.list_1| and have done the RMQ preprocessing on the list v.list_2 for each v ∈ V′, we can construct, in total O(V log V) time, the root of H(v_{1,i}) for all v_{1,i}. Then we place these roots into a balanced 2-heap of size up to O(V log V) by the heapify operation [15] in linear time, i.e., in O(V log V) time. Note that each H(v_{1,i}) is a 2-heap, so we have conceptually built a 4-heap for the set ⋃_{v∈V′} ⋃_{i=1}^{|v.list_1|} V(H(v_{1,i})). Now by Theorem 3, we can apply Frederickson's algorithm [18] to find the k largest-value nodes in that 4-heap in O(k) time. Of course, except for the roots of all H(v_{1,i}), the nodes in that 4-heap are not physically created until they are needed in running Frederickson's [18] algorithm. We summarize the results of this section in the following theorem.

Theorem 4: Let T = (V, E) be a tree with a length function ℓ : E → R and a weight function w : E → R. Given T, a positive integer k, and an interval [L, U], we can find the k longest paths among all paths in T with weights in the interval [L, U] in O(V log V + k) time.

### 3 Ω(V log V + k) Lower Bound for the Weight-Constrained k Longest Paths Problem

We prove that the Weight-Constrained Longest Path problem has an Ω(V log V) lower bound in the algebraic computation tree model. It follows that the Weight-Constrained k Longest Paths problem has an Ω(V log V + k) lower bound in the algebraic computation tree model since extra Ω(k) time is necessary for outputting the answer.

Definition 4: [Set Intersection Problem] Given two sets {x_1, x_2, . . . , x_n} and {y_1, y_2, . . . , y_n}, the Set Intersection problem asks whether there exist indices i and j such that x_i = y_j.

Lemma 2: [Ben-Or [8]] The Set Intersection problem has an Ω(n log n) lower bound in the algebraic computation tree model.

*Theorem 5: The Weight-Constrained Longest Path problem has an Ω(V log V ) lower*
bound in the algebraic computation tree model.

Proof: We reduce the Set Intersection problem to the Weight-Constrained Longest Path problem. Given two sets {x_1, x_2, . . . , x_n} and {y_1, y_2, . . . , y_n}, we construct, in O(n) time, a problem instance of the Weight-Constrained Longest Path problem as follows. We first construct a tree T = (V, E), where V = {x′_1, . . . , x′_n} ∪ {y′_1, . . . , y′_n} ∪ {c_1, c_2} and E = {x′_1 c_1, . . . , x′_n c_1} ∪ {y′_1 c_2, . . . , y′_n c_2} ∪ {c_1 c_2}. Define the length function ℓ : E → R by letting ℓ(e) = 1 for all e ∈ E. Define the weight function w : E → R by letting w(x′_i c_1) = x_i and w(y′_i c_2) = −y_i for all i = 1, . . . , n, and w(c_1 c_2) = 0. Set both the weight lower bound L and the weight upper bound U of paths to 0. It can be verified that the longest path in T with weight 0 has length 3 if and only if there exist indices i and j such that x_i = y_j. Since in this reduction we have |V| = 2n + 2 and the Set Intersection problem has an Ω(n log n) lower bound in the algebraic computation tree model by Lemma 2, we conclude that the Weight-Constrained Longest Path problem has an Ω(V log V) lower bound in the algebraic computation tree model.

*Corollary 1: The Weight-Constrained k Longest Paths problem has an Ω(V log V +k)*
lower bound in the algebraic computation tree model.

### 4 O(n + k)-Time Algorithm for the Length-Constrained k Maximum-Sum Segments Problem

Given a sequence A = (a_1, a_2, . . . , a_n) of numbers, we define the sum and length of a segment A[i, j] to be a_i + a_{i+1} + · · · + a_j and j − i + 1, respectively. The Length-Constrained k Maximum-Sum Segments problem is to find the k maximum-sum segments among all segments with lengths in a specified interval [L, U]. In the following, we show how to solve the Length-Constrained k Maximum-Sum Segments problem in O(n + k) time.

### 4.1 Preliminaries

Let P denote the prefix-sum array of the input sequence A, i.e., P[0] = 0 and P[i] = a_1 + a_2 + · · · + a_i for i = 1, . . . , n. P can be computed in linear time by setting P[0] to 0 and P[i] to P[i − 1] + a_i for i = 1, 2, . . . , n. Let S[i, j] denote the sum of A[i, j]. Since S[i, j] = P[j] − P[i − 1], the sum of any segment can be computed in constant time after the prefix-sum array is constructed.
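For instance, using Python's `itertools.accumulate` for the linear-time construction:

```python
import itertools

a = [2, -1, 3, -2, 4]
# P[i] = a_1 + ... + a_i, with P[0] = 0 (1-based indexing, as in the text).
P = [0] + list(itertools.accumulate(a))

def S(i, j):
    """Sum of A[i, j] in O(1) time via S[i, j] = P[j] - P[i - 1]."""
    return P[j] - P[i - 1]

assert S(2, 4) == -1 + 3 + -2
assert S(1, 5) == sum(a)
```
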

Now we describe the Range Maximum-Sum Segment Query (RMSQ) problem. In the RMSQ problem, a sequence A = (a_1, a_2, . . . , a_n) of n numbers is given to be preprocessed so that any range maximum-sum segment query can be answered quickly. A range maximum-sum segment query specifies two intervals [i, j] and [k, l], and the goal is to find a pair of indices (x, y) with i ≤ x ≤ j and k ≤ y ≤ l that maximizes S[x, y].

Chen and Chao [11] showed that RMSQ is linearly equivalent to RMQ. For ease of explanation, in the following description of the algorithm we use RMSQ instead of RMQ.

*Theorem 6: [Chen and Chao [11]] The RMSQ problem can be solved in O(n) preprocessing*
*time and O(1) time per query.*
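The only RMSQ queries used in this section have their two intervals in order (j ≤ k), in which case the query decomposes over the prefix sums: maximizing S[x, y] = P[y] − P[x−1] amounts to minimizing P[x−1] over x ∈ [i, j] and maximizing P[y] over y ∈ [k, l]. A naive sketch of this special case (our own illustration, answering each query in time linear in the interval lengths rather than the O(1) of Theorem 6):

```python
def rmsq_disjoint(P, i, j, k, l):
    """RMSQ(i, j, k, l) over prefix sums P, assuming j <= k.
    Returns (x, y) with i <= x <= j and k <= y <= l maximizing
    S[x, y] = P[y] - P[x-1]."""
    x = min(range(i, j + 1), key=lambda h: P[h - 1])  # minimize P[x-1]
    y = max(range(k, l + 1), key=lambda h: P[h])      # maximize P[y]
    return x, y
```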

### 4.2 *Finding the Length-Constrained k Maximum-Sum Segments*

The algorithm is similar to the one in Section 2.2, but this time we can achieve linear running time. First we preprocess the input sequence A so that given any two intervals [i, j] and [k, l], we can find the pair (x, y), denoted RMSQ(i, j, k, l), with i ≤ x ≤ j and k ≤ y ≤ l that maximizes S[x, y]. By Theorem 6, this preprocessing can be done in O(n) time. In the following, we say a segment A[i, j] is feasible if and only if L ≤ j − i + 1 ≤ U. Set p_i = max{i − U + 1, 1} and q_i = i − L + 1 for all i = 1, . . . , n. For simplicity, we assume p_i ≤ q_i for all i = 1, . . . , n. Then ∪_{i=1}^{n} {A[h, i] : h ∈ [p_i, q_i]} is the set of all feasible segments. Our task is to find the k maximum-sum segments in this set.

*Before moving on to the algorithm, let us define some data structures. For each index i,*
*define H(i) to be a rooted ordered binary tree which consists of nodes with fields pair, value,*
*and interval and satisfies the following properties.*

1. There are in total q_i − p_i + 1 nodes in H(i), and the interval of the root of H(i) is [p_i, q_i].

*2. For each node u of H(i), if p < k then u’s left child has interval [p, k −1], and if k < q then*
*u’s right child has interval [k + 1, q], where [p, q] = u.interval and (k, i) =RMSQ(p, q, i, i).*

*3. For each node u of H(i), if u.interval = [p, q] then u.pair = (k, i) and u.value = S[k, i],*
*where (k, i)=RMSQ(p, q, i, i).*

We now describe our algorithm. Let V(H(i)) denote the set of nodes in H(i). It is clear that the k largest-value nodes in ∪_{i=1}^{n} V(H(i)) correspond to the k maximum-sum feasible segments. Thus the remaining work is to find the k largest-value nodes in ∪_{i=1}^{n} V(H(i)). Notice that given any node u of H(i), we can always construct u's children in O(1) time since we have done the RMSQ preprocessing on A[1..n]. Thus we only construct the root of H(i) in the first instance and expand the tree as needed. Since we know p_i and q_i for each index i and have done the RMSQ preprocessing on A[1..n], we can construct the root of H(i) for each index i in O(n) total time. Then we place these roots into a balanced 2-heap by the heapify operation [15] in O(n) time. Note that each H(i) is a 2-heap, so we have conceptually built a 4-heap for the set ∪_{i=1}^{n} V(H(i)). Now by Theorem 3, we can apply Frederickson's algorithm [18] to find the k largest-value nodes in that 4-heap in O(k) time. As before, except for the roots of the H(i)'s, no node of that 4-heap is physically created until it is needed in running Frederickson's algorithm [18]. The following theorem summarizes the results of this section.

Theorem 7: *Given a sequence A = (a*_{1}*, . . . , a*_{n}*) of numbers, a positive integer k, and an*
*interval [L, U ], we can find, in O(n + k) time, the k maximum-sum segments of A with lengths*
*in [L, U ].*
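The overall method can be sketched in Python with two simplifications, both of which we flag explicitly: a plain binary heap replaces Frederickson's O(k) heap-selection (so this sketch runs in O(n + k log(n + k)) rather than O(n + k)), and the RMSQ queries of the form RMSQ(p, q, i, i) are answered by a range-minimum sparse table over the prefix sums. The tuples pushed on the heap play the role of the lazily expanded nodes of the trees H(i):

```python
import heapq

def k_max_sum_segments(a, L, U, k):
    """Return up to k triples (start, end, sum), 1-based, best first,
    over all segments of a with length in [L, U]."""
    n = len(a)
    P = [0] * (n + 1)
    for i in range(1, n + 1):
        P[i] = P[i - 1] + a[i - 1]
    # Sparse table of argmins of P[0..n] for O(1) range-min queries.
    sp = [list(range(n + 1))]
    j = 1
    while (1 << j) <= n + 1:
        prev, half = sp[-1], 1 << (j - 1)
        sp.append([prev[s] if P[prev[s]] <= P[prev[s + half]] else prev[s + half]
                   for s in range(n + 2 - (1 << j))])
        j += 1
    def argmin(lo, hi):  # index of a minimum of P on [lo, hi]
        t = (hi - lo + 1).bit_length() - 1
        l, r = sp[t][lo], sp[t][hi - (1 << t) + 1]
        return l if P[l] <= P[r] else r
    # One entry per end index i: the best start h in [p_i, q_i],
    # i.e., the root of the conceptual tree H(i).
    heap = []
    for i in range(1, n + 1):
        p, q = max(i - U + 1, 1), i - L + 1
        if p > q:
            continue
        h = argmin(p - 1, q - 1) + 1
        heap.append((P[h - 1] - P[i], h, i, p, q))
    heapq.heapify(heap)
    out = []
    while heap and len(out) < k:
        # Pop the best remaining candidate; A[h..i] has sum P[i]-P[h-1].
        _, h, i, p, q = heapq.heappop(heap)
        out.append((h, i, P[i] - P[h - 1]))
        # Expand the two children of this node of H(i): the best start
        # in [p, h-1] and the best start in [h+1, q].
        if p < h:
            h2 = argmin(p - 1, h - 2) + 1
            heapq.heappush(heap, (P[h2 - 1] - P[i], h2, i, p, h - 1))
        if h < q:
            h2 = argmin(h, q - 1) + 1
            heapq.heappush(heap, (P[h2 - 1] - P[i], h2, i, h + 1, q))
    return out
```

Replacing the standard heap by Frederickson's heap-selection procedure recovers the O(n + k) bound of Theorem 7.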

Definition 5: Let A = ((a_1, ℓ_1), . . . , (a_n, ℓ_n)) be a sequence of pairs of numbers, where ℓ_i > 0 for all i = 1, . . . , n. We define the sum, length, and density of a segment A[i, j] to be Σ_{i≤h≤j} a_h, Σ_{i≤h≤j} ℓ_h, and (Σ_{i≤h≤j} a_h) / (Σ_{i≤h≤j} ℓ_h), respectively.

We prove the following stronger theorem by slightly modifying the above algorithm.

*Theorem 8: Given a sequence of pairs of numbers A = ((a*1*, `*1*), . . . , (a**n**, `**n**)), where `**i* *> 0*
*for i = 1, . . . , n, a positive integer k, and an interval [L, U ], we can find, in O(n + k) time, the*
*k maximum-sum segments of A with lengths in [L, U ].*

Proof: We show how to modify the above algorithm to achieve this theorem. In fact, we only need to change the settings of the p_i's and q_i's; the remaining parts are the same. Let ℓ[h, i] denote the length of A[h, i]. For all i = 1, . . . , n, we redefine p_i to be the minimum index 1 ≤ h ≤ i such that ℓ[h, i] ≤ U and q_i to be the maximum index 1 ≤ h′ ≤ i such that ℓ[h′, i] ≥ L. For simplicity, we assume p_i and q_i exist for all i = 1, . . . , n. Since ℓ_i is positive for all i = 1, . . . , n, the sequences (p_1, . . . , p_n) and (q_1, . . . , q_n) must be nondecreasing. Thus we can compute p_i and q_i for all i = 1, . . . , n by the following procedure in O(n) time.

1. Set p = 1 and q = 1.

2. For i ← 1 to n do

(a) While ℓ[p, i] > U and p ≤ i do p ← p + 1.

(b) While ℓ[q + 1, i] ≥ L and q + 1 ≤ i do q ← q + 1.

(c) p_i ← p and q_i ← q.

3. Output (p_1, . . . , p_n) and (q_1, . . . , q_n).
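The two-pointer procedure above can be sketched as follows (our own illustration; as in the proof, we assume p_i and q_i exist for every i, and the positivity of the ℓ_i's is what makes both pointers advance monotonically, giving linear time overall):

```python
def length_bounds(lengths, L, U):
    """For each 1-based end index i, compute p_i, the minimum h with
    len(A[h..i]) <= U, and q_i, the maximum h with len(A[h..i]) >= L.
    Returns the lists (p_1..p_n) and (q_1..q_n)."""
    n = len(lengths)
    pre = [0]  # prefix sums of the lengths
    for x in lengths:
        pre.append(pre[-1] + x)
    def seg_len(h, i):  # total length of A[h..i]
        return pre[i] - pre[h - 1]
    p = q = 1
    ps, qs = [], []
    for i in range(1, n + 1):
        while p <= i and seg_len(p, i) > U:
            p += 1
        while q + 1 <= i and seg_len(q + 1, i) >= L:
            q += 1
        ps.append(p)
        qs.append(q)
    return ps, qs
```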

### 5 Applications

In this section, we give some applications of our algorithms.

### 5.1 *Finding the Weight-Constrained k Largest Elements of X + Y*

*Let X and Y be two sets associated with value functions V*_{X}*: X → R and V*_{Y}*: Y → R,*
*respectively. The Cartesian sum X + Y is the set {(x, y) : (x, y) ∈ X × Y } associated with a*
*value function V : X ×Y → R defined by letting V (x, y) = V*_{X}*(x)+V*_{Y}*(y) for all (x, y) ∈ X ×Y .*
*For convenience, we just use x + y to denote V*_{X}*(x) + V*_{Y}*(y), and we call a set associated with*
*a value function a valued set. Frederickson and Johnson [20] gave an optimal algorithm for*
*finding the k*^{th}*largest element in X + Y in O(m + p log(k/p)) time, where m = |X| ≤ |Y | = n*
*and p = min{k, m}. Recently Bae and Takaoka proposed an efficient O(n + k log k)-time*
*algorithm [4] for finding the k largest elements of X + Y . In the following, we first show how*
*to find the k largest elements of X + Y in O(n + k) time by using Eppstein’s algorithm [16],*
*and then we show how to cope with the weight-constrained case in O(n log n + k) time by using*
our algorithm.

*Lemma 3: [Eppstein [16]] Given a directed acyclic graph G = (V, E) with a length function*

*` : E → R and two distinguished vertices s and t, we can find, in O(V + E + k) time, an*
*implicit representation of the k longest paths connecting s and t in G. And by using the*
*implicit representation, we can list the edges of any path P in the set of the k longest paths in*
*time proportional to the number of edges in P .*

*Theorem 9: Given two valued sets X = {x*1*, . . . , x**n**} and Y = {y*1*, . . . , y**n**}, we can find the*
*k largest elements of X + Y in O(n + k) time.*

Proof: We describe an O(n + k) algorithm for finding the k largest elements of X + Y as follows. We first construct, in O(n) time, a directed acyclic graph G = (V, E) where V = {s, t, c} ∪ {x′_1, . . . , x′_n} ∪ {y′_1, . . . , y′_n} and E = {(s, x′_i) : 1 ≤ i ≤ n} ∪ {(x′_i, c) : 1 ≤ i ≤ n} ∪ {(c, y′_i) : 1 ≤ i ≤ n} ∪ {(y′_i, t) : 1 ≤ i ≤ n}. Define ℓ : E → R by letting ℓ(s, x′_i) = 0, ℓ(x′_i, c) = x_i, ℓ(c, y′_i) = y_i, and ℓ(y′_i, t) = 0 for all i = 1, . . . , n. It can be verified that (x_i, y_j) is the k-th largest element of X + Y if and only if (s, x′_i, c, y′_j, t) is the k-th longest path connecting s and t in G. Thus, by Lemma 3, we can first find the k longest paths connecting s and t in G in O(V + E + k) = O(n + k) time and then find the corresponding k largest elements of X + Y in O(k) time.
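For readers who do not need the optimal bound, the same k largest elements of X + Y can also be produced with a standard lazy best-first search over the sorted arrays, avoiding Eppstein's machinery at the cost of an O(n log n + k log k) running time. This is our own hedged sketch, not the paper's algorithm:

```python
import heapq

def k_largest_cartesian_sums(xs, ys, k):
    """Return the k largest values of {x + y : x in xs, y in ys},
    largest first, by lazily exploring the sorted index grid."""
    xs = sorted(xs, reverse=True)
    ys = sorted(ys, reverse=True)
    if not xs or not ys:
        return []
    # Max-heap via negated sums; start from the overall maximum.
    heap = [(-(xs[0] + ys[0]), 0, 0)]
    seen = {(0, 0)}
    out = []
    while heap and len(out) < k:
        neg, i, j = heapq.heappop(heap)
        out.append(-neg)
        # The next candidates dominated only by (i, j).
        for i2, j2 in ((i + 1, j), (i, j + 1)):
            if i2 < len(xs) and j2 < len(ys) and (i2, j2) not in seen:
                seen.add((i2, j2))
                heapq.heappush(heap, (-(xs[i2] + ys[j2]), i2, j2))
    return out
```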

*Now we show how to cope with the weight-constrained case. Let X and Y be two valued*
*sets associated with weight functions W**X* *: X → R and W**Y* *: Y → R, respectively. Then for*
*each (x, y) ∈ X + Y , we define the weight of (x, y) to be W**X**(x) + W**Y**(y).*

*Theorem 10: Let X = {x*_{1}*, . . . , x*_{n}*} and Y = {y*_{1}*, . . . , y*_{n}*} be two valued sets associated with*
*weight functions W*_{X}*: X → R and W*_{Y}*: Y → R, respectively. Given a positive integer k and*
*an interval [L, U ], we can find, in O(n log n + k) time, the k largest elements of X + Y with*
*weights in the interval [L, U ].*

Proof: We construct, in O(n) time, a tree T = (V, E) where V = {x′_1, . . . , x′_n} ∪ {y′_1, . . . , y′_n} ∪ {c} and E = {x′_1c, . . . , x′_nc} ∪ {y′_1c, . . . , y′_nc}. Let δ be a large enough positive number, say, greater than max{|U|, |L|} + max{max_{1≤i≤n} |W_X(x_i)|, max_{1≤i≤n} |W_Y(y_i)|}. Define the weight function w : E → R by letting w(x′_ic) = W_X(x_i) + δ and w(y′_ic) = W_Y(y_i) − δ for all i = 1, . . . , n. Define the length function ℓ : E → R by letting ℓ(x′_ic) = V_X(x_i) and ℓ(y′_ic) = V_Y(y_i) for all i = 1, . . . , n.

Let P be a path of T. Consider the following cases. First, if P has both of its end vertices in {x′_1, . . . , x′_n}, i.e., P = (x′_i, c, x′_j) for some i and j, then we have w(P) = W_X(x_i) + W_X(x_j) + 2δ > U. Second, if P has one end vertex in {x′_1, . . . , x′_n} and the other end vertex being c, i.e., P = (x′_i, c) for some i, then we also have w(P) = W_X(x_i) + δ > U. Similarly, if P has both of its end vertices in {y′_1, . . . , y′_n}, or P has one end vertex in {y′_1, . . . , y′_n} and the other end vertex being c, then we have w(P) < L. Finally, if P has one end vertex in {x′_1, . . . , x′_n} and the other in {y′_1, . . . , y′_n}, i.e., P = (x′_i, c, y′_j) for some i and j, then we have w(P) = W_X(x_i) + W_Y(y_j) and ℓ(P) = V_X(x_i) + V_Y(y_j).

From the above discussion, we conclude that (x_i, y_j) is the k-th largest element of X + Y with weight in [L, U] if and only if (x′_i, c, y′_j) is the k-th longest path of T with weight in [L, U].

*Thus, by Theorem 4, we can first find the k longest paths of T with weights in [L, U ] in*
*O(V log V + k) = O(n log n + k) time and then find the corresponding k largest elements of*
*X + Y with weights in [L, U ] in O(k) time.*

### 5.2 *Finding the Sum-Constrained k Longest Segments*

In biological sequence analysis, several researchers have studied the problem of finding the longest segment whose sum is not less than a specified lower bound L [1, 12, 35]. Allison [1] gave an algorithm which runs in linear time if the input sequence is a 0-1 sequence and L is a rational number. For real-number sequences and a real-number lower bound, Wang and Xu [35] provided the first linear-time algorithm, and Chen and Chao [12] gave an alternative linear-time algorithm which runs in an online manner. We consider a more general problem in which both a lower bound L and an upper bound U on the sums of the segments are given, and we want to find the k longest segments whose sums satisfy both the lower bound condition and the upper bound condition.

*Theorem 11: Given a sequence A = (a*_{1}*, a*_{2}*, . . . , a*_{n}*) of real numbers and an interval [L, U ],*
*we can find, in O(n log n+k) time, the k longest segments whose sums are in the interval [L, U ].*

Proof: Directly from Theorem 4.

### 5.3 Finding k Length-Constrained Maximum-Density Segments Satisfying a Density Lower Bound

Given a sequence of pairs of numbers A = ((a_1, ℓ_1), . . . , (a_n, ℓ_n)), where ℓ_i > 0 for all i = 1, . . . , n, a positive integer k, an interval [L, U], and a number δ, let k_out = min{k, n_δ}, where n_δ is the total number of segments of A with lengths in [L, U] and densities ≥ δ. We show how to find k_out segments of A with lengths in [L, U] and densities ≥ δ in O(n + k_out) time. A segment A[i, j] is called a feasible segment if and only if the length of A[i, j] is in [L, U]. Let δ_max be the maximum density among all feasible segments. The Length-Constrained Maximum-Density Segment problem is to find a feasible segment with density equal to δ_max. The Length-Constrained Maximum-Density Segment problem is well studied in [14, 24, 26, 27, 29, 30] and can be solved in linear time by [14, 24].

Let n_{δ_max} be the total number of feasible segments with density equal to δ_max. If we are not satisfied with finding only one feasible segment of density δ_max, then by first computing δ_max with the O(n)-time algorithms in [14, 24] and setting δ to δ_max, our algorithm can list k_out = min{k, n_{δ_max}} feasible segments with density equal to δ_max in O(n + k_out) time.

*Theorem 12: Given a sequence of pairs of numbers A = ((a*_{1}*, `*_{1}*), . . . , (a*_{n}*, `*_{n}*)), where `*_{i}*> 0*
*for all i = 1, . . . , n, a positive integer k, an interval [L, U ], and a number δ, let k*_{out}*= min{k, n*_{δ}*},*
*where n*_{δ}*is the total number of segments of A with lengths in [L, U ] and densities ≥ δ. Then*
*we can find, in O(n + k*_{out}*) time, k*_{out}*segments of A with lengths in [L, U ] and densities ≥ δ.*