· $2^{1-\binom{k}{2}} < 1$, then $R(k, k) > n$. Thus $R(k, k) > \lfloor 2^{k/2} \rfloor$ for all $k \ge 3$.
Consider a random two-coloring of the edges of $K_n$ obtained by coloring each edge independently either red or blue, where each color is equally likely. For any fixed set $R$ of $k$ vertices, let $A_R$ be the event that the induced subgraph of $K_n$ on $R$ is monochromatic (i.e., that either all its edges are red or they are all blue).
Clearly, $\Pr(A_R) = 2^{1-\binom{k}{2}}$. Since there are $\binom{n}{k}$ possible choices for $R$, the probability that at least one of the events $A_R$ occurs is at most $\binom{n}{k} \cdot 2^{1-\binom{k}{2}} < 1$. Thus, with positive probability, no event $A_R$ occurs and there is a two-coloring of $K_n$ without a monochromatic $K_k$, i.e., $R(k, k) > n$. Note that if $k \ge 3$ and we take $n = \lfloor 2^{k/2} \rfloor$, then
$$\binom{n}{k} \cdot 2^{1-\binom{k}{2}} < \frac{2^{1+k/2}}{k!} \cdot \frac{n^{k}}{2^{k^{2}/2}} < 1$$
and hence $R(k, k) > \lfloor 2^{k/2} \rfloor$ for all $k \ge 3$.
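The union-bound condition in this argument can be checked numerically for small $k$. The following is a minimal sketch in Python; the function names `union_bound` and `floor_pow2_half` are illustrative and not part of the source. Exact rational arithmetic is used so that no rounding affects the comparison with 1.

```python
from fractions import Fraction
from math import comb, isqrt


def union_bound(n: int, k: int) -> Fraction:
    """Exact value of C(n, k) * 2^(1 - C(k, 2)), the union bound on the
    probability that some k-set of vertices is monochromatic."""
    return comb(n, k) * Fraction(2) ** (1 - comb(k, 2))


def floor_pow2_half(k: int) -> int:
    """floor(2^(k/2)), computed exactly as the integer square root of 2^k."""
    return isqrt(2 ** k)


if __name__ == "__main__":
    # For each k, the bound at n = floor(2^(k/2)) should be below 1.
    for k in range(3, 11):
        n = floor_pow2_half(k)
        print(k, n, float(union_bound(n, k)))
```

Whenever `union_bound(n, k) < 1`, the argument above gives $R(k, k) > n$; the sketch only confirms the inequality for the values tested and is not a substitute for the proof.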
2.4 The Lovász Local Lemma
There is a trivial case in which one can show that a certain event holds with positive, though small, probability. Indeed, if we have $n$ mutually independent events and each of them holds with probability at least $p > 0$, then the probability that all events hold simultaneously is at least $p^n$, which is positive, although it may be exponentially small in $n$. It is natural to expect that the case of mutual independence can be generalized to that of rare dependencies, providing a more general way of proving that certain events hold with positive, though small, probability. Such a generalization is indeed possible and is stated in the Lovász Local Lemma.
Next, we review the main ideas of the Lovász Local Lemma, following the treatment in Alon and Spencer [1].
Definition 2.4.1. Let $A_1, A_2, \dots, A_n$ be events in an arbitrary probability space. A graph $G = (V, E)$ on the set of vertices $V = \{1, 2, \dots, n\}$ is said to be a dependency graph for the events $A_1, A_2, \dots, A_n$ if for each $i$, $1 \le i \le n$, the event $A_i$ is mutually independent of the set of all the other events $A_j$ except for those with $\{i, j\} \in E$.
We are now in a position to state the Lovász Local Lemma; we omit its proof.
Theorem 2.4.2. (The Lovász Local Lemma; General Case)
Let $A_1, A_2, \dots, A_n$ be events in an arbitrary probability space and let $G = (V, E)$ be a dependency graph for them. Suppose there are real numbers $x_1, \dots, x_n$ such that $0 \le x_i < 1$ and $\Pr(A_i) \le x_i \cdot \prod_{\{i,j\} \in E} (1 - x_j)$ for all $1 \le i \le n$. Then $\Pr\left(\bigcap_{i=1}^{n} \overline{A_i}\right) \ge \prod_{i=1}^{n} (1 - x_i)$. In particular, with positive probability no event $A_i$ holds.
The next corollary establishes a result that holds when all events have probability at most p, for some constant p. In this corollary and elsewhere, e denotes the base of natural logarithms (i.e., e ≈ 2.71828).
Corollary 2.4.3. (The Lovász Local Lemma; Symmetric Case)
Let $A_1, A_2, \dots, A_n$ be events in an arbitrary probability space. Suppose that each event $A_i$ is mutually independent of the set of all the other events $A_j$ except for at most $\mu$ of them, and that $\Pr(A_i) \le p$ for all $1 \le i \le n$. If $e \cdot p \cdot (\mu + 1) \le 1$, then $\Pr\left(\bigcap_{i=1}^{n} \overline{A_i}\right) > 0$.
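The hypotheses of the two statements above are easy to test mechanically on small instances. The following Python sketch is illustrative only: the function names, the representation of the dependency graph as a mapping from each index to its list of neighbours, and the 4-cycle example are assumptions made here, not part of the source.

```python
from math import e, prod


def general_lll_bound(probs, xs, neighbors):
    """Check the hypothesis of the general Local Lemma.

    probs[i] is an upper bound on Pr(A_i), xs[i] is the real number x_i,
    and neighbors[i] lists the j with {i, j} in E.  Returns the lower bound
    prod(1 - x_i) on the probability that no A_i holds, or None if the
    hypothesis fails for some i.
    """
    for i, (p_i, x_i) in enumerate(zip(probs, xs)):
        if not 0 <= x_i < 1:
            return None
        if p_i > x_i * prod(1 - xs[j] for j in neighbors[i]):
            return None
    return prod(1 - x for x in xs)


def symmetric_lll_applies(p, mu):
    """Condition of the symmetric case: e * p * (mu + 1) <= 1."""
    return e * p * (mu + 1) <= 1


if __name__ == "__main__":
    # Hypothetical example: four events on a 4-cycle, each of probability 0.1.
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
    print(general_lll_bound([0.1] * 4, [1 / 3] * 4, cycle))
    print(symmetric_lll_applies(0.1, 2))
```

In the example, each event has two neighbours, so $\mu = 2$ and the choice $x_i = 1/(\mu + 1)$ satisfies the general hypothesis as well.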
In the remainder of this thesis, our goal is to use Corollary 2.4.3 to prove that matrices with certain desired properties exist under suitable conditions, e.g., when the number of rows is large enough. This yields an upper bound on the minimum size of such a matrix.
Chapter 3
the collection of all subsets of $[n]$ with cardinality $d$. Let $t(d, n)$ denote the minimum number of rows for a $d$-disjunct matrix with $n$ columns. Yeh [17] proves the following theorem by using Corollary 2.4.3. For completeness, we include his proof in what follows, with a minor adjustment.
Theorem 3.1.1. [17]
q for 1 ≤ k ≤ q, and the entries mij are mutually independent.
Let $M^*$ be a $t \times n$ random $\{0, 1\}$-matrix obtained from $M$ by replacing each $q$-ary symbol by a unique binary column vector of length $q$ with unit weight. For example, when $q = 3$, the replacement can be
1 →
event that the union of columns Cj, j ∈ J , contains column Cs. For i ∈ t
Note that $A_{J,s}$ is mutually independent of all the other events $A_{J',s'}$ except for those with $(J' \cup \{s'\}) \cap (J \cup \{s\}) \neq \emptyset$. There are exactly
such events. According to Corollary 2.4.3, a t × n d-disjunct matrix exists whenever
e ·
holds. Taking the natural logarithm of both sides yields the equivalent inequality
t ≥ q ·
3.2 (k, m, n)-Selectors
We begin this section with the definition of a (k, m, n)-selector.
Definition 3.2.1. Given integers k, m, and n, with 1 ≤ m ≤ k ≤ n, we say that a t × n binary matrix M is a (k, m, n)-selector if any submatrix of M obtained by choosing k out of n arbitrary columns of M contains at least m distinct rows of the identity matrix Ik. The integer t is the size of the (k, m, n)-selector.
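For small parameters, the selector property can be verified by brute force directly from the definition. The sketch below is an illustration in Python; the name `is_selector` and the representation of $M$ as a non-empty list of rows of 0/1 integers are assumptions made here. The running time is exponential in $k$, so it is only meant for toy examples.

```python
from itertools import combinations


def is_selector(M, k, m):
    """Return True if the binary matrix M (a non-empty list of rows) is a
    (k, m, n)-selector, where n is the number of columns of M."""
    n = len(M[0])
    for cols in combinations(range(n), k):
        # Distinct rows of the identity matrix I_k appearing in the
        # submatrix restricted to the chosen k columns.
        unit_rows = set()
        for row in M:
            sub = tuple(row[c] for c in cols)
            if sum(sub) == 1:
                unit_rows.add(sub)
        if len(unit_rows) < m:
            return False
    return True


if __name__ == "__main__":
    identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(is_selector(identity3, 2, 2))
```

For instance, the $3 \times 3$ identity matrix is a $(2, 2, 3)$-selector, since any two of its columns contain both rows of $I_2$.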
Regarding the relationship between $(k, m, n)$-selectors and group testing, De Bonis, Gąsieniec, and Vaccaro [5] proved that there exists a two-stage group testing algorithm that finds up to $d$ positives out of $n$ items using $t + k - 1$ tests, where $t$ is the size of a $(k, d + 1, n)$-selector.
Let $t_s(k, m, n)$ denote the minimum size of a $(k, m, n)$-selector. De Bonis, Gąsieniec, and Vaccaro [5] obtain upper bounds for $t_s(k, m, n)$ by translating the problem into the language of hypergraphs. Again for completeness, we include their proof in what follows. Given a finite set $X$ and a family $F$ of subsets of $X$, a hypergraph is a pair $H = (X, F)$. Elements of $X$ will be called vertices of $H$, and elements of $F$ will be called hyperedges of $H$. A cover of $H$ is a subset $T \subseteq X$ such that for any hyperedge $E \in F$ we have $T \cap E \neq \emptyset$. The minimum size of a cover of $H$ will be denoted by $\tau(H)$. A fundamental result by Lovász [13] implies that
$$\tau(H) < \frac{|X|}{\min_{E \in F} |E|} (1 + \ln \Delta), \quad (3.3)$$
where $\Delta = \max_{x \in X} |\{E : x \in E \in F\}|$.
Essentially, Lovász proves that, by greedily choosing vertices in $X$ that intersect the maximum number of yet nonintersected hyperedges of $H$, one obtains a cover of a size smaller than the R.H.S. of (3.3). Our aim is to show that $(k, m, n)$-selectors are covers of properly defined hypergraphs. Lovász's result (3.3) will then provide us with the desired upper bound on the minimum selector size.
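The greedy procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the names `greedy_cover` and `lovasz_bound` are hypothetical, hyperedges are given as Python sets, and ties are broken by taking the first vertex with the maximum count, which the source does not specify.

```python
from math import log


def greedy_cover(vertices, hyperedges):
    """Repeatedly pick a vertex that meets the largest number of hyperedges
    not yet intersected; return the list of chosen vertices."""
    uncovered = [set(E) for E in hyperedges]
    cover = []
    while uncovered:
        best = max(vertices, key=lambda v: sum(v in E for E in uncovered))
        if all(best not in E for E in uncovered):
            raise ValueError("some hyperedge contains no vertex of X")
        cover.append(best)
        uncovered = [E for E in uncovered if best not in E]
    return cover


def lovasz_bound(vertices, hyperedges):
    """R.H.S. of (3.3): |X| / min|E| * (1 + ln Delta)."""
    min_edge = min(len(E) for E in hyperedges)
    delta = max(sum(v in E for E in hyperedges) for v in vertices)
    return len(vertices) / min_edge * (1 + log(delta))


if __name__ == "__main__":
    X = [1, 2, 3, 4]
    F = [{1, 2}, {2, 3}, {3, 4}]
    print(greedy_cover(X, F), lovasz_bound(X, F))
```

On this toy hypergraph the greedy cover has two vertices, which is below the value of the R.H.S. of (3.3) computed by `lovasz_bound`.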
We shall proceed as follows. Let $X$ be the set of all binary vectors $\mathbf{x} = (x_1, \dots, x_n)$ of length $n$ containing $n/k$ 1's (the value $n/k$ is a consequence of an optimized choice whose justification can be skipped here). For any integer $i$, $1 \le i \le k$, denote by $\mathbf{a}_i$ the binary vector of length $k$ having all components equal to zero with the exception of the component in position $i$. Moreover, for any set of indices $S = \{i_1, \dots, i_k\}$, with $1 \le i_1 < i_2 < \dots < i_k \le n$, and for any binary vector $\mathbf{a} = (a_1, \dots, a_k) \in \{\mathbf{a}_1, \dots, \mathbf{a}_k\}$, define the set of binary vectors $E_{\mathbf{a},S} = \{\mathbf{x} = (x_1, \dots, x_n) \in X : x_{i_1} = a_1, \dots, x_{i_k} = a_k\}$. For any set $A \subseteq \{\mathbf{a}_1, \dots, \mathbf{a}_k\}$ of size $r$, $r = 1, \dots, k$, and any set $S \subseteq \{1, \dots, n\}$ with $|S| = k$, define $E_{A,S} = \bigcup_{\mathbf{a} \in A} E_{\mathbf{a},S}$. For any $r = 1, \dots, k$ we define $F_r = \{E_{A,S} : A \subset \{\mathbf{a}_1, \dots, \mathbf{a}_k\}, |A| = r, S \subseteq \{1, \dots, n\}, |S| = k\}$ and the hypergraph $H_r = (X, F_r)$.
We claim that any cover $T$ of $H_{k-m+1}$ is a $(k, m, n)$-selector; i.e., any submatrix of $k$ arbitrary columns of $T$ contains at least $m$ distinct rows of the identity matrix $I_k$. The proof is by contradiction. Assume that there exists a set of indices $S = \{i_1, \dots, i_k\}$ such that the submatrix of $T$ obtained by considering only the columns of $T$ with indices $i_1, \dots, i_k$ contains at most $m - 1$ distinct rows of $I_k$. Let such rows be $\mathbf{a}_{j_1}, \dots, \mathbf{a}_{j_s}$, with $s \le m - 1$; let $A$ be any subset of $\{\mathbf{a}_1, \dots, \mathbf{a}_k\} \setminus \{\mathbf{a}_{j_1}, \dots, \mathbf{a}_{j_s}\}$ of cardinality $|A| = k - m + 1$; and let $E_{A,S}$ be the corresponding hyperedge of $H_{k-m+1}$. By construction we have that $T \cap E_{A,S} = \emptyset$, contradicting the fact that $T$ is a cover for $H_{k-m+1}$.
The above proof that $(k, m, n)$-selectors coincide with the covers of $H_{k-m+1}$ allows us to use Lovász's result (3.3) to give upper bounds for $t_s(k, m, n)$.
Theorem 3.2.2. [5]
where e = 2.71828... is the base of the natural logarithm.
Proof. We need only evaluate the quantities $|X|$, $\min\{|E| : E \in F_{k-m+1}\}$, and $\Delta$ for the hypergraph $H_{k-m+1}$. By definition $|X| = \binom{n}{n/k}$. Moreover, each hyperedge $E_{A,S}$ of $H_{k-m+1}$ is the union of $k - m + 1$ disjoint sets $E_{\mathbf{a},S}$; therefore it has cardinality
To compute ∆, observe that each x ∈ X belongs to n/k 1
distinct hyperedges EA,S. Therefore, for Hk−m+1 we have
For k ∈ {1, 2}, it is
Moreover, using the well-known inequality $\binom{a}{b} \le \left(\frac{ea}{b}\right)^{b}$, one can conclude
k − 1
The theorem now follows from (3.5) and the above inequalities.
Chapter 4
Main Results
4.1 (d, r]-Disjunct Matrices
To generalize Theorem 3.1.1, we start by giving a more general definition.
Definition 4.1.1. A t × n binary matrix M is called (d, r]-disjunct if the union of any d columns does not contain the intersection of any other r columns in M . Clearly, (d, 1]-disjunctness is precisely d-disjunctness.
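Definition 4.1.1 can be checked exhaustively on small matrices. The following Python sketch is illustrative; the function name `is_dr_bracket_disjunct` and the representation of each column by the set of row indices where it has a 1 are assumptions made here. The search is exponential in $d + r$ and is intended only for toy examples.

```python
from itertools import combinations


def is_dr_bracket_disjunct(M, d, r):
    """Return True if the binary matrix M (a non-empty list of rows) is
    (d, r]-disjunct: no union of d columns contains the intersection of
    any other r columns."""
    t, n = len(M), len(M[0])
    # Column j is identified with the set of rows in which it has a 1.
    supports = [frozenset(i for i in range(t) if M[i][j]) for j in range(n)]
    for R in combinations(range(n), r):
        inter = frozenset.intersection(*(supports[j] for j in R))
        others = [j for j in range(n) if j not in R]
        for D in combinations(others, d):
            union = frozenset().union(*(supports[j] for j in D))
            if inter <= union:  # the union contains the intersection
                return False
    return True


if __name__ == "__main__":
    identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(is_dr_bracket_disjunct(identity3, 2, 1))
```

With $r = 1$ the function tests ordinary $d$-disjunctness; for example, the $3 \times 3$ identity matrix is $2$-disjunct but not $(1, 2]$-disjunct, because any two of its columns have empty intersection.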
Regarding the relationship between $(d, r]$-disjunct matrices and nonadaptive group testing, Chen, Du and Hwang [2] proved that a $(d, r]$-disjunct matrix can identify up to $d$ positives in the complex model.
Let t(n, d, r] denote the minimum number of rows for a (d, r]-disjunct matrix with n columns. We have the following generalization of Theorem 3.1.1, followed by the proof using the same approach used in the proof of Theorem 3.1.1.
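Theorem 4.1.2.
$$t(n, d, r] \le \left(1 + \frac{d}{r}\right)^{r} \cdot \left(1 + \frac{r}{d}\right)^{d} \cdot \left\{ 1 + \ln\left[ \binom{n}{d}\binom{n-d}{r} - \binom{n-(d+r)}{d}\binom{n-(d+r)-d}{r} \right] \right\}. \quad (4.1)$$
Proof. Let $M$ and $M^*$ be as in the proof of Theorem 3.1.1. Again let $C_1, \dots, C_n$ be the columns of $M^*$. For $D \in \binom{[n]}{d}$ and $R \in \binom{[n]}{r}$ with $D \cap R = \emptyset$, let $A_{D,R}$ be the event that the union of columns $C_j$, $j \in D$, contains the intersection of columns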
Note that $A_{D,R}$ is mutually independent of all the other events $A_{D',R'}$ except for those with $(D' \cup R') \cap (D \cup R) \neq \emptyset$. There are exactly
such events. According to Corollary 2.4.3, a t×n (d, r]-disjunct matrix exists whenever
e ·
holds. Taking the natural logarithm of both sides yields the equivalent inequality
t ≥ q ·
holds, (4.2) holds. To minimize the R.H.S. of (4.3), we let $q = \frac{d}{r} + 1$ and complete the proof.
In the above proof of Theorem 4.1.2, some small problems may occur. For example, since the number of rows of $M$ and $q$ must be positive integers, what happens if $q$ does not divide $t$ or $r$ does not divide $d$? For this reason, we provide another proof of Theorem 4.1.2 that omits the conversion of $M$ into $M^*$ and lets $M$ be a random $\{0, 1\}$-matrix directly. (Note that in the remaining sections of this chapter, we adopt this technique.) However, if $q$ divides $t$ and $r$ divides $d$, the above proof says more: the column sum of the desired matrix equals the constant $t/q$. The following is our second proof of Theorem 4.1.2.
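Proof. Let $M = (m_{ij})$ be a $t \times n$ random $\{0, 1\}$-matrix with $\Pr(m_{ij} = 1) = p$, where the entries $m_{ij}$ are mutually independent, and let $C_1, \dots, C_n$ be the columns of $M$. For $D \in \binom{[n]}{d}$ and $R \in \binom{[n]}{r}$ with $D \cap R = \emptyset$, let $A_{D,R}$ be the event that the union of columns $C_j$, $j \in D$, contains the intersection of columns $C_k$, $k \in R$. Then
$$\Pr(A_{D,R}) = \left[ 1 - p^{r} \cdot (1 - p)^{d} \right]^{t}.$$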
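The role of the per-row probability $p^r (1-p)^d$ can be explored numerically. The Python sketch below is illustrative and rests on two assumptions that should be checked against the full text: that the choice $p = r/(d+r)$ corresponds to $q = d/r + 1$ in the first proof via $p = 1/q$, and that (4.1) reads as the product of $(1 + d/r)^r (1 + r/d)^d$ with $1 + \ln$ of the number of events sharing a column index with a given one. All function names are hypothetical.

```python
from math import comb, log


def witness_probability(d, r, p):
    """Probability that a fixed row has 1 in all r columns of R and 0 in
    all d columns of D, for independent entries with Pr(entry = 1) = p."""
    return p ** r * (1 - p) ** d


def leading_factor(d, r):
    """(1 + d/r)^r * (1 + r/d)^d, the leading factor in (4.1)."""
    return (1 + d / r) ** r * (1 + r / d) ** d


def _comb(a, b):
    """Binomial coefficient, taken to be 0 when the upper index is negative."""
    return comb(a, b) if a >= 0 else 0


def dependent_event_count(n, d, r):
    """Number of pairs (D', R') meeting a fixed D union R (the pair itself
    included): C(n,d)C(n-d,r) - C(n-(d+r),d)C(n-(d+r)-d,r)."""
    return comb(n, d) * comb(n - d, r) - _comb(n - d - r, d) * _comb(n - 2 * d - r, r)


def bound_4_1(n, d, r):
    """Value of the R.H.S. of (4.1) under the reading stated above."""
    return leading_factor(d, r) * (1 + log(dependent_event_count(n, d, r)))


if __name__ == "__main__":
    d, r = 2, 1
    p = r / (d + r)
    print(leading_factor(d, r) * witness_probability(d, r, p))
    print(bound_4_1(100, d, r))
```

At $p = r/(d+r)$ the reciprocal of the per-row probability equals the leading factor of (4.1), which is why the first printed value is 1 up to rounding.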
Similar to the first proof, a t × n (d, r]-disjunct matrix exists whenever
e ·h
holds, which is equivalent to
t ≥
Using the fact that − ln(1 − x) ≥ x for 0 ≤ x < 1, we conclude that whenever
Chen, Fu and Hwang [3] also provided an upper bound for t(n, d, r]:
t(n, d, r] <
Note that Stinson and Wei [16] provided two asymptotic upper bounds for t(n, d, r]
by using two other structures. One bound is Od + r r
. Also note that their bounds are asymptotic and our bound in (4.1) is non-asymptotic.
4.2 (d, r)-Disjunct Matrices
We present another generalization of Theorem 3.1.1 in this section.
Definition 4.2.1. A t × n binary matrix M is called (d, r)-disjunct if the union of any d columns does not contain the union of any other r columns in M . Clearly, (d, 1)-disjunctness is precisely d-disjunctness.
Regarding the relationship between $(d, r)$-disjunct matrices and nonadaptive group testing, De Bonis and Vaccaro [6] proved that $(h, d)$-disjunctness is a necessary condition for identifying $P$ in the $(d, h)$-inhibitor model.
Let t(n, d, r) denote the minimum number of rows for a (d, r)-disjunct matrix with n columns. We have the following generalization of Theorem 3.1.1.
Theorem 4.2.2.
be the event that the union of columns $C_j$, $j \in D$, contains the union of columns $C_k$, $k \in R$. Then
$$\Pr(A_{D,R}) = \left\{ 1 - (1 - p)^{d} \cdot \left[ 1 - (1 - p)^{r} \right] \right\}^{t}.$$
Similar to the proof of Theorem 4.1.2, a t × n (d, r)-disjunct matrix exists whenever e ·n
holds, which is equivalent to
t ≥
Du and Hwang [9] proved that a $(k, m, n)$-selector is $(m - 1, k - m + 1)$-disjunct, which implies that a $(d + r, d + 1, n)$-selector is $(d, r)$-disjunct. By Theorem 3.2.2, we have
$$t(n, d, r) < \frac{e(d + r)^{2}}{r} \ln \frac{n}{d + r} + \frac{e(d + r)[2(d + r) - 1]}{r}. \quad (4.10)$$
Note that the bound in (4.10) is $O((d + r) \ln n)$ and the bound in (4.7) is $O((d + r - 1) \ln n)$, which is slightly better.
4.3 (d, s out of r]-Disjunct Matrices
In Sections 4.1 and 4.2, two generalizations of Theorem 3.1.1 are given. However, there is a more general category containing both of them, which is presented in this section.
Definition 4.3.1. For 1 ≤ s ≤ r, a t × n binary matrix M is called (d, s out of r]-disjunct if for any d columns and any other r columns of M , there exists a row index in which none of the d columns appear and at least s of the r columns do.
Clearly, $(d, 1$ out of $r]$-disjunctness is precisely $(d, r)$-disjunctness and $(d, r$ out of $r]$-disjunctness is precisely $(d, r]$-disjunctness.
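The definition and its two special cases can be illustrated by an exhaustive check on small matrices. The Python sketch below is an illustration; the name `is_d_s_out_of_r_disjunct` and the list-of-rows representation of $M$ are assumptions made here, and the search is exponential in $d + r$.

```python
from itertools import combinations


def is_d_s_out_of_r_disjunct(M, d, r, s):
    """Return True if for any d columns D and any other r columns R of the
    binary matrix M there is a row with 0 in every column of D and 1 in at
    least s of the columns of R."""
    n = len(M[0])
    for D in combinations(range(n), d):
        rest = [j for j in range(n) if j not in D]
        for R in combinations(rest, r):
            has_witness = any(
                all(row[j] == 0 for j in D) and sum(row[j] for j in R) >= s
                for row in M
            )
            if not has_witness:
                return False
    return True


if __name__ == "__main__":
    identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    # s = 1 tests (d, r)-disjunctness; s = r tests (d, r]-disjunctness.
    print(is_d_s_out_of_r_disjunct(identity3, 1, 2, 1))
    print(is_d_s_out_of_r_disjunct(identity3, 1, 2, 2))
```

The identity matrix separates the two extremes: with $d = 1$ and $r = 2$ it passes the test for $s = 1$ but fails for $s = 2$, since no row has two 1's.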
Let t(n, d, r, s] denote the minimum number of rows for a (d, s out of r]-disjunct matrix with n columns. We have the following theorem:
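Theorem 4.3.2.
$$t(n, d, r, s] \le \frac{1 + \ln\left[ \binom{n}{d}\binom{n-d}{r} - \binom{n-(d+r)}{d}\binom{n-(d+r)-d}{r} \right]}{f_{d,r,s}(p)} \quad (4.11)$$
for all $0 < p < 1$, where
$$f_{d,r,s}(p) = (1 - p)^{d} \cdot \left[ 1 - \sum_{i=0}^{s-1} \binom{r}{i} p^{i} (1 - p)^{r-i} \right].$$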
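Since (4.11) holds for every $0 < p < 1$, the best bound comes from the $p$ that maximizes $f_{d,r,s}(p)$. The following Python sketch evaluates $f_{d,r,s}$ and locates an approximate maximizer on a uniform grid; the grid search and the names `f` and `argmax_on_grid` are assumptions made here for illustration and do not give a closed form.

```python
from math import comb


def f(d, r, s, p):
    """f_{d,r,s}(p) = (1-p)^d * [1 - sum_{i<s} C(r,i) p^i (1-p)^(r-i)]."""
    head = sum(comb(r, i) * p ** i * (1 - p) ** (r - i) for i in range(s))
    return (1 - p) ** d * (1 - head)


def argmax_on_grid(d, r, s, steps=10000):
    """Grid point in (0, 1) with the largest value of f_{d,r,s}."""
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda p: f(d, r, s, p))


if __name__ == "__main__":
    print(argmax_on_grid(2, 1, 1))  # close to 1/3 when s = r = 1
    print(argmax_on_grid(2, 3, 2))
```

For $s = r$ the function reduces to $p^r (1-p)^d$ and for $s = 1$ to $(1-p)^d [1 - (1-p)^r]$, matching the per-row probabilities used in Sections 4.1 and 4.2.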
Proof. Let $M = (m_{ij})$ be a $t \times n$ random $\{0, 1\}$-matrix with $\Pr(m_{ij} = 1) = p$, the event that there exists a row index in which none of the columns $C_j$, $j \in D$, appear and at least $s$ of the columns $C_k$, $k \in R$, do. Then
Similar to the proof of Theorem 4.1.2, a t × n (d, s out of r]-disjunct matrix exists whenever
holds, which is equivalent to
t ≥ holds, (4.13) holds, completing the proof.
4.4 (k, m, n)-Selectors
We can use similar approaches to obtain an upper bound for the minimum size of a (k, m, n)-selector.
Theorem 4.4.1.
, define $A_K$ to be the event that the $t \times k$ submatrix of $M^*$ corresponding to $K$ contains at most $m - 1$ rows of $I_k$, and $A_{K,M}$ to be the event that the $m \times k$ submatrix of $M^*$ corresponding to $K$ and $M$ does not consist of $m$ distinct rows of $I_k$. Observe that
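Note that $A_K$ is mutually independent of all the other events $A_{K'}$ except for those with $K \cap K' \neq \emptyset$. There are exactly
$$\binom{n}{k} - \binom{n-k}{k} - 1$$
such events. According to Corollary 2.4.3, a $t \times n$ $(k, m, n)$-selector exists whenever
$$e \cdot \left[ 1 - \binom{k}{m} \cdot m! \cdot p^{m} \cdot (1 - p)^{m(k-1)} \right]^{t/m} \cdot \left[ \binom{n}{k} - \binom{n-k}{k} \right] \le 1$$
holds. Taking the natural logarithm of both sides yields the equivalent inequality
$$t \ge m \cdot \frac{1 + \ln\left[ \binom{n}{k} - \binom{n-k}{k} \right]}{-\ln\left[ 1 - \binom{k}{m} \cdot m! \cdot p^{m} \cdot (1 - p)^{m(k-1)} \right]}. \quad (4.15)$$
Using the fact that $-\ln(1 - x) \ge x$ for $0 \le x < 1$, we conclude that whenever
$$t \ge m \cdot \frac{1 + \ln\left[ \binom{n}{k} - \binom{n-k}{k} \right]}{\binom{k}{m} \cdot m! \cdot p^{m} \cdot (1 - p)^{m(k-1)}} \quad (4.16)$$
holds, (4.15) holds. To minimize the R.H.S. of (4.16), we let $p = \frac{1}{k}$ and complete the proof.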
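The R.H.S. of (4.16) is straightforward to evaluate for concrete parameters. The Python sketch below is illustrative: the name `selector_bound` is hypothetical, and the formula is the reading of (4.16) as $m \cdot (1 + \ln[\binom{n}{k} - \binom{n-k}{k}])$ divided by $\binom{k}{m} \, m! \, p^m (1-p)^{m(k-1)}$, with $p = 1/k$ as the default.

```python
from math import comb, factorial, log


def selector_bound(n, k, m, p=None):
    """R.H.S. of (4.16) for 1 <= m <= k <= n; p defaults to 1/k."""
    if p is None:
        p = 1 / k
    # Probability that m fixed rows, restricted to k fixed columns,
    # form m distinct rows of the identity matrix I_k.
    success = comb(k, m) * factorial(m) * p ** m * (1 - p) ** (m * (k - 1))
    # Number of k-subsets K' meeting a fixed k-subset K (K itself included).
    dependent = comb(n, k) - comb(n - k, k)
    return m * (1 + log(dependent)) / success


if __name__ == "__main__":
    print(selector_bound(100, 5, 1))
    print(selector_bound(100, 5, 3))
```

Any integer $t$ that is at least the returned value and for which the partition of the rows into $t/m$ parts makes sense satisfies (4.16); comparing a few values of `p` shows that $p = 1/k$ gives the smallest R.H.S., as used in the proof.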
When $m = 1$, the bound in (3.4) is $O(k \ln n)$ and the bound in (4.14) is $O((k - 1) \ln n)$, which is slightly better.
Chapter 5 Conclusion
In Theorem 4.3.2, we define a function $f_{d,r,s}(p)$ of $p$. To minimize the R.H.S. of the inequality (4.11), we must maximize $f_{d,r,s}(p)$, which is a difficult task. However, the maximum does exist, since $t(n, d, r, s]$ is a positive integer for fixed $n$, $d$, $r$, and $s$. We leave this as an open problem. Also, in our proof of Theorem 4.4.1, we partition the row indices into $t/m$ parts of equal size to obtain an approximation of $\Pr(A_K)$. However, when $m \ge 2$, this approximation is not as good as we would like. Finally, we point out that all bounds in Chapter 4 are obtained by the probabilistic method. We hope that deterministic constructions will be discovered in the near future.
Bibliography
[1] Noga Alon, Joel H. Spencer, The Probabilistic Method, 2nd ed., John Wiley and Sons, Inc., (2000).
[2] H. B. Chen, D. Z. Du and F. K. Hwang, An unexpected meeting of four seemingly unrelated problems: graph testing, DNA complex screening, superimposed codes and secure key distribution, J. Combin. Opt., to appear.
[3] H. B. Chen, H. L. Fu and F. K. Hwang, An upper bound of the number of tests in pooling designs for the error-tolerant complex model, Opt. Letters, to appear.
[4] P. Damaschke, Threshold Group Testing, Electronic Notes in Discrete Mathematics 21, (2005), 265-271.
[5] A. De Bonis, Leszek Gąsieniec, U. Vaccaro, Optimal Two-Stage Algorithms for Group Testing Problems, SIAM J. Comput. Vol. 34, No. 5, (2005), 1253-1270.
[6] A. De Bonis and U. Vaccaro, Improved algorithms for group testing with inhibitors, Inform. Process Lett. 67, (1998), 57-64.
[7] D. Deng, D. R. Stinson, R. Wei, The Lovász Local Lemma and Its Applications to some Combinatorial Arrays, Designs, Codes and Cryptography, 32, (2004), 121-134.
[8] Ding-Zhu Du, Frank K. Hwang, Combinatorial Group Testing and Its Applications, 2nd ed., World Scientific, (2000).
[9] Ding-Zhu Du, Frank K. Hwang, Pooling Designs and Nonadaptive Group Testing - Important Tools for DNA Sequencing, World Scientific, (2006).
[10] P. Erdős, L. Lovász, Problems and Results on 3-chromatic Hypergraphs and Some Related Questions, in: Infinite and Finite Sets (A. Hajnal et al., eds.), North-Holland, Amsterdam, (1975), 609-628.
[11] M. Farach, S. Kannan, E. Knill, S. Muthukrishnan, Group Testing Problem with Sequences in Experimental Molecular Biology, Proc. Compression and Complexity of Sequences, (1997), 357-367.
[12] C. H. Li, A sequential method for screening experimental variables, J. Amer. Statist. Assoc. 57, (1962), 455-477.
[13] L. Lovász, On the Ratio of Optimal Integral and Fractional Covers, Discrete Math., 13, (1975), 383-390.
[14] F. P. Ramsey, On A Problem of Formal Logic, Proc. London Math. Soc. 30(2), (1929), 264-286.
[15] Joel Spencer, Probabilistic Methods, Graphs and Combinatorics 1, (1985), 357-382.
[16] D. R. Stinson and R. Wei, Generalized cover-free families, Discrete Math. 279, (2004), 463-477.
[17] Hong-Gwa Yeh, d-Disjunct matrices: bounds and Lovász Local Lemma, Discrete Mathematics 253, (2002), 97-107.