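National Science Council, Executive Yuan: Research Project Interim Progress Report
A Study of Strongly Regular Multigraphs and Their Related Finite Geometries (1/3)
Project type: Individual project
Project number: NSC94-2115-M-009-015-
Project period: August 1, 2005 to July 31, 2006
Institution: Department of Applied Mathematics, National Chiao Tung University
Principal investigator: Tayuan Huang
Report type: Concise report
Attachments: Report on attending an international conference, and published papers
Handling: This project is open to public inquiry
May 30, 2006
Interim Report
Tayuan Huang
As a generalization of the pooling designs that Ding-Zhu Du constructed on perfect matchings of various sizes in complete graphs, we give two families of pooling designs with error-correcting capability, based on the incidence structures formed by cliques of various sizes in Johnson graphs and Grassmann graphs.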
Some Error-Correcting Pooling Designs
Associated with Johnson Graphs and
Grassmann Graphs
(Preliminary Version 3.1)
Yujuan Bai, Tayuan Huang and Kaishun Wang∗
May 3, 2006
Abstract
Based on the inclusion matrices of $t$-cliques of various sizes in the Johnson graphs $J(n, t)$ and the Grassmann graphs $J_q(n, t)$, respectively, two families of error-correcting pooling designs are given. Some of their properties, including their error-correcting capability and the two parameters $e_d$ and $e_{\le d}$, are studied. With an interpretation of matchings of $K_{2m}$ as 2-cliques of the Johnson graph $J(n, 2)$, this gives a $q$-analogue of the pooling designs defined over matchings of $K_{2m}$ given by Ngo and Du.
1 Introduction
Suppose there are at most $d$ defective (positive) items among $n$ items to be tested, and assume that some testing mechanism exists which, when applied to an arbitrary subset of the population, gives a negative outcome if the subset contains no positive item and a positive outcome otherwise. A group testing algorithm is non-adaptive if all tests must be specified without knowing the outcomes of other tests. Non-adaptive algorithms are useful in many areas, such as DNA library screening.
The notion of $d^e$-disjunct matrices (defined in Section 2) provides a mathematical model for error-correcting pooling designs. Macula [5] constructed $d^e$-disjunct matrices for certain values of $e$ by the containment relation of subsets in a finite set. The $q$-analogue of Macula's construction is given by Ngo and Du in [7]. Moreover, the notion of pooling spaces was introduced by Huang and Weng [4], which provides a general framework for $d^e$-disjunct matrices. They showed in [3] that a $d^{2e}$-disjunct matrix is $e$-error-correcting.
Recently, Ngo and Du [7] constructed a class of disjunct matrices over the incidence matrices of matchings of various sizes in the complete graph $K_{2m}$, and asked for its $q$-analogue. With an interpretation of matchings as 2-cliques of the Johnson graph $J(n, 2)$, we generalize Ngo and Du's construction to the incidence matrices of $t$-cliques of various sizes in the Johnson graphs $J(n, t)$ and the Grassmann graphs $J_q(n, t)$, respectively. We show that our pooling designs have the same error-detecting and error-correcting capability as those of Ngo and Du, while the test-to-item ratio of ours is much smaller. Moreover, the parameters $e_d$ and $e_{\le d}$ of these pooling designs are also considered.
An overview of up-to-date results on combinatorial group testing algorithms was given by Ngo and Du [8]. They pointed out that this is a young and interesting field with deep connections to coding theory and design theory, and they strongly believe that the theory of association schemes, and in particular distance-regular graphs, should play an important role in improving pooling designs.
We will recall some known results regarding pooling designs in the framework of two families of distance-regular graphs, the Johnson graphs and the Grassmann graphs. We first recall some pooling designs associated with the Johnson graphs and the Grassmann graphs in Section 2. Some basic definitions of $t$-cliques and $\{1, 2, \ldots, t\}$-cliques of Johnson graphs are also given in Section 2. Two new families of pooling designs, together with their error-correcting capability, are given in Section 3. Moreover, the two parameters $e_d$ and $e_{\le d}$ over the Johnson graphs are studied in Section 4.
2 Preliminaries
The notion of $d^e$-disjunct matrices provides a mathematical model for error-correcting pooling designs.
Definition 2.1 A binary matrix $M$ is said to be $d^e$-disjunct if, given any $d + 1$ columns of $M$ with one designated, there are $e + 1$ rows with a 1 in the designated column and 0 in each of the other $d$ columns.
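Definition 2.1 can be checked by brute force on small matrices. The following is a minimal sketch, not an efficient test: the function name `is_de_disjunct` and the list-of-rows matrix encoding are illustrative assumptions, not notation from the paper.

```python
from itertools import combinations

def is_de_disjunct(matrix, d, e=0):
    """Brute-force check of Definition 2.1 for a 0/1 matrix given as a list of rows."""
    n_cols = len(matrix[0])
    # Support of each column: the set of row indices holding a 1.
    supports = [{i for i, row in enumerate(matrix) if row[j]} for j in range(n_cols)]
    for designated in range(n_cols):
        others = [j for j in range(n_cols) if j != designated]
        for group in combinations(others, d):
            covered = set().union(*(supports[j] for j in group))
            # Rows with a 1 in the designated column and 0 in the d other columns.
            if len(supports[designated] - covered) < e + 1:
                return False
    return True

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
doubled = identity + identity  # every row repeated twice

print(is_de_disjunct(identity, 2, 0))
print(is_de_disjunct(identity, 2, 1))
print(is_de_disjunct(doubled, 2, 1))
```

The 3 × 3 identity matrix has exactly one private row per column, so it is 2-disjunct but not $2^1$-disjunct; repeating every row gives two private rows per column.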
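A $d^e$-disjunct matrix with $e = 0$ is said to be a $d$-disjunct matrix. Let $q$ be a positive integer (a prime power in applications). Given integers $1 \le i \le n$, the Gaussian binomial coefficient with basis $q$ is defined by
$$\begin{bmatrix} n \\ i \end{bmatrix}_q = \begin{cases} \displaystyle\prod_{j=0}^{i-1} \frac{n-j}{i-j}, & \text{if } q = 1, \\[2ex] \displaystyle\prod_{j=0}^{i-1} \frac{q^n-q^j}{q^i-q^j}, & \text{if } q \ne 1. \end{cases}$$
In the case $q = 1$, we write $\binom{n}{i}$ instead of $\begin{bmatrix} n \\ i \end{bmatrix}_1$ for convenience.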
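The definition translates directly into code. The sketch below follows the two cases of the formula; the function name `gaussian_binomial` is an illustrative choice, and returning 0 outside the range $0 \le i \le n$ is a convention assumed here rather than stated in the text.

```python
from math import comb

def gaussian_binomial(n, i, q):
    """Gaussian binomial coefficient [n, i]_q, following the product formula above."""
    if i < 0 or i > n:
        return 0  # assumed convention outside the range covered by the definition
    if q == 1:
        return comb(n, i)
    numerator, denominator = 1, 1
    for j in range(i):
        numerator *= q**n - q**j
        denominator *= q**i - q**j
    # The quotient is always an integer: it counts i-subspaces of GF(q)^n.
    return numerator // denominator

print(gaussian_binomial(4, 2, 2))  # number of 2-subspaces of GF(2)^4
print(gaussian_binomial(4, 2, 1))  # ordinary binomial coefficient
```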
For any positive integer $n$ we use $[n]$ to denote the set $\{1, 2, \ldots, n\}$. For any positive integer $k$, $\binom{[n]}{k}$ denotes the collection of all $k$-subsets of $[n]$, and $\begin{bmatrix} GF(q)^n \\ k \end{bmatrix}_q$ denotes the collection of all $k$-subspaces of $GF(q)^n$.
Definition 2.2
1. The Johnson graph $J(n, t)$ is the graph defined on $\binom{[n]}{t}$ such that $A$ and $B$ are adjacent if $|A \cap B| = t - 1$.
2. The Grassmann graph $J_q(n, t)$ is the graph defined on $\begin{bmatrix} GF(q)^n \\ t \end{bmatrix}_q$ such that $A$ and $B$ are adjacent if $\dim(A \cap B) = t - 1$.
Definition 2.3 A clique $C$ of $J(n, 2)$ is a subfamily of $\binom{[n]}{2}$ such that $|A \cap B| = 1$ for any two distinct $A, B \in C$.
Note that $J(n, 2)$ is a strongly regular graph, i.e., a distance-regular graph of diameter 2. Both Johnson graphs and Grassmann graphs are distance-regular; refer to [1] for details.
Hence an $l$-matching in [7] is a 2-clique of $J(n, 2)$ with size $l$. With this interpretation, its $q$-analogue extensions are available.
Definition 2.4
1. A $t$-clique of $J(n, t)$ with size $l$ is a subfamily $\{A_1, A_2, \ldots, A_l\}$ of $\binom{[n]}{t}$ such that $|A_1 \cup A_2 \cup \cdots \cup A_l| = tl$, i.e., $A_i \cap A_j = \emptyset$ for any two distinct $i$ and $j$.
2. A $t$-clique of $J_q(n, t)$ with size $l$ is a subfamily $\{A_1, A_2, \ldots, A_l\}$ of $\begin{bmatrix} GF(q)^n \\ t \end{bmatrix}_q$ such that $\dim(A_1 + A_2 + \cdots + A_l) = tl$.
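For small parameters, the $t$-cliques of $J(n, t)$ in Definition 2.4(1) can be listed explicitly as families of pairwise disjoint $t$-subsets. The sketch below does this by backtracking and compares the count with the closed form $\binom{tm}{tl}(tl)!/((t!)^l\, l!)$ stated in Lemma 2.3; the helper names `t_cliques` and `u` are illustrative assumptions.

```python
from itertools import combinations
from math import comb, factorial

def t_cliques(n, t, size):
    """Yield each family of `size` pairwise disjoint t-subsets of {1..n} exactly once."""
    blocks = [frozenset(c) for c in combinations(range(1, n + 1), t)]

    def extend(start, chosen, used):
        if len(chosen) == size:
            yield frozenset(chosen)
            return
        # Blocks are added in increasing index order, so no family is repeated.
        for idx in range(start, len(blocks)):
            if not (blocks[idx] & used):
                yield from extend(idx + 1, chosen + [blocks[idx]], used | blocks[idx])

    yield from extend(0, [], frozenset())

def u(m, t, l):
    """Closed-form count of t-cliques of size l in J(tm, t)."""
    return comb(t * m, t * l) * factorial(t * l) // (factorial(t) ** l * factorial(l))

print(len(list(t_cliques(6, 2, 2))), u(3, 2, 2))  # 2-matchings of K_6
print(len(list(t_cliques(6, 3, 2))), u(2, 3, 2))  # splits of [6] into two triples
```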
Definition 2.5 A family $\mathcal{K}$ of $k$-subsets of $[n]$ with $|K \cap K'| \le k - t$ for all distinct $K$ and $K'$ in $\mathcal{K}$ is called a $\{1, 2, \ldots, t\}$-clique of $J(n, k)$.
Notation for disjunct matrices. Let $d < k < n$.
$J(n, d, k)$: the incidence matrix of the system $\left(\binom{[n]}{d}, \binom{[n]}{k}; \subseteq\right)$ ($J$ is for Johnson schemes).
$G_q(n, d, k)$: the incidence matrix of the system $\left(\begin{bmatrix} GF(q)^n \\ d \end{bmatrix}_q, \begin{bmatrix} GF(q)^n \\ k \end{bmatrix}_q; \subseteq\right)$ ($G$ is for Grassmann schemes).
$M(2n, d, k)$: the incidence matrix of the system of $d$-matchings and $k$-matchings of $K_{2n}$ under inclusion ($M$ is for matchings).
$M_q(2n, d, k)$: the incidence matrix of the system of $q$-analogues of $d$-matchings and $k$-matchings ($M_q$ is for $q$-analogues of matchings).
The error-correcting capability of $d^e$-disjunct matrices is summarized in the following.
Theorem 2.1 Suppose a $d^e$-disjunct matrix $M$ of order $N \times t$ is used for a pooling design, and $P$ is the positive set to be identified.
1. If it is known that $|P| = d$, then $M$ can correct $e$ errors.
2. If it is known that $|P| \le d$, then $M$ can correct $\lfloor e/2 \rfloor$ errors; moreover, $M$ can correct $e$ errors if $d$ additional confirmation tests are performed.
Moreover, the q-analogue of G(m, t, k, r) can be obtained naturally by Definition 2.2.
Definition 2.6 Given positive integers $m \ge k > r \ge 1$:
1. Let $G(m, t, k, r)$ be the binary matrix $M$ with rows (resp. columns) indexed by the $t$-cliques of size $r$ (resp. $k$) of $J(tm, t)$ such that $M(A, B) = 1$ if $A \subseteq B$ and 0 otherwise.
2. Let $G_q(m, t, k, r)$ be the binary matrix $M$ with rows (resp. columns) indexed by the $t$-cliques of size $r$ (resp. $k$) of $J_q(tm, t)$ such that $M(A, B) = 1$ if $A \subseteq B$ and 0 otherwise.
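Lemma 2.2 Let $W$ be a $k$-subspace of $V = \mathbb{F}_q^n$. Then the number of $d$-subspaces of $\mathbb{F}_q^n$ intersecting $W$ trivially is
$$\begin{bmatrix} n-k \\ d \end{bmatrix}_q q^{dk}.$$
Proof. Let $D = \left\{A \mid A \in \begin{bmatrix} V \\ d \end{bmatrix}_q,\ A \cap W = 0\right\}$. Counting the set $\{(v_1, v_2, \ldots, v_d) \mid v_i \notin \langle W, v_1, v_2, \ldots, v_{i-1}\rangle\}$ in two ways, we have
$$(q^n - q^k)(q^n - q^{k+1}) \cdots (q^n - q^{k+d-1}) = |D| \cdot (q^d - 1)(q^d - q) \cdots (q^d - q^{d-1}).$$
Hence $|D| = \begin{bmatrix} n-k \\ d \end{bmatrix}_q q^{dk}$, as required. □
Lemma 2.3 Let $1 \le l \le m$.
1. The number $u[m, l]_1 = u(m, l)$ of $t$-cliques of $J(tm, t)$ with size $l$ is
$$u[m, l]_1 = u(m, l) = \binom{tm}{tl} \frac{(tl)!}{(t!)^l\, l!}.$$
2. The number $u[m, l]_q$ of $t$-cliques of $J_q(tm, t)$ with size $l$ is
$$u[m, l]_q = \begin{bmatrix} tm \\ tl \end{bmatrix}_q \prod_{i=1}^{l} \begin{bmatrix} it \\ t \end{bmatrix}_q \cdot \frac{q^{t^2 l(l-1)/2}}{l!}.$$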
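Proof. By Definition 2.4, $\{A_1, A_2, \ldots, A_l\}$ is a $t$-clique of $J_q(tm, t)$ with size $l$ if and only if $A_1 + A_2 + \cdots + A_l$ is a $tl$-subspace of $GF(q)^{tm}$.
Let $L(m, l)$ be the number of ordered tuples $(A_1, A_2, \ldots, A_l)$ of $t$-subspaces of $GF(q)^{tm}$ such that $\dim(A_1 + A_2 + \cdots + A_l) = tl$. Notice that the number of $tl$-subspaces of $GF(q)^{tm}$ is $\begin{bmatrix} tm \\ tl \end{bmatrix}_q$. Counting $L(m, l)$ directly, inside a fixed $tl$-subspace there are $\begin{bmatrix} tl \\ t \end{bmatrix}_q$ ways to choose $A_1$, then $\begin{bmatrix} tl-t \\ t \end{bmatrix}_q q^{t^2}$ ways to choose $A_2$ by Lemma 2.2, and so on. Thus
$$L(m, l) = \begin{bmatrix} tm \\ tl \end{bmatrix}_q \begin{bmatrix} tl \\ t \end{bmatrix}_q q^{t \cdot 0} \begin{bmatrix} tl-t \\ t \end{bmatrix}_q q^{t \cdot t} \cdots \begin{bmatrix} 2t \\ t \end{bmatrix}_q q^{t \cdot (l-2)t} \begin{bmatrix} t \\ t \end{bmatrix}_q q^{t \cdot (l-1)t}. \tag{1}$$
On the other hand, $(A_1, A_2, \ldots, A_l)$ may be obtained by first picking a $t$-clique of $J_q(tm, t)$ with size $l$ in $u[m, l]_q$ ways; then, for each $t$-clique, there are $l!$ ways to order it into a tuple $(A_1, A_2, \ldots, A_l)$. This yields
$$L(m, l) = u[m, l]_q\, l!. \tag{2}$$
Combining (1) and (2) gives the stated formula for $u[m, l]_q$. □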
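Both counting results can be sanity-checked by exhaustive enumeration over $GF(2)$ for very small parameters. The sketch below encodes vectors of $GF(2)^n$ as integer bitmasks (an encoding chosen here for illustration, as are all function names) and compares brute-force counts with the formulas of Lemma 2.2 and Lemma 2.3(2).

```python
from itertools import combinations
from math import factorial

def gaussian_binomial(n, i, q):
    """Gaussian binomial coefficient [n, i]_q for q >= 2 and 0 <= i <= n."""
    num, den = 1, 1
    for j in range(i):
        num *= q**n - q**j
        den *= q**i - q**j
    return num // den

def span(vectors):
    """Span over GF(2) of vectors encoded as bitmasks (XOR is vector addition)."""
    space = {0}
    for v in vectors:
        space |= {w ^ v for w in space}
    return frozenset(space)

def subspaces(n, dim):
    """All dim-dimensional subspaces of GF(2)^n, each as a frozenset of bitmasks."""
    found = set()
    for vectors in combinations(range(1, 2**n), dim):
        s = span(vectors)
        if len(s) == 2**dim:  # the chosen vectors are linearly independent
            found.add(s)
    return found

def count_trivially_meeting(n, k, d):
    """Brute-force count for Lemma 2.2 with q = 2 and W spanned by the first k unit vectors."""
    w = span([1 << i for i in range(k)])
    return sum(1 for a in subspaces(n, d) if len(a & w) == 1)

def count_cliques_gf2(m, t, l):
    """Brute-force count of t-cliques of size l in J_2(tm, t)."""
    spaces = list(subspaces(t * m, t))
    return sum(1 for fam in combinations(spaces, l)
               if len(span([v for a in fam for v in a])) == 2 ** (t * l))

def u_q(m, t, l, q):
    """Closed form of Lemma 2.3(2)."""
    value = gaussian_binomial(t * m, t * l, q)
    for i in range(1, l + 1):
        value *= gaussian_binomial(i * t, t, q)
    return value * q ** (t * t * l * (l - 1) // 2) // factorial(l)

print(count_trivially_meeting(4, 2, 2), gaussian_binomial(2, 2, 2) * 2 ** (2 * 2))
print(count_cliques_gf2(2, 2, 2), u_q(2, 2, 2, 2))
```

Only $q = 2$ is covered, since the bitmask trick relies on XOR; other prime powers would need genuine finite-field arithmetic.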
Theorem 2.4 ([6, Theorem 2]) Let $\mathcal{K}$ be a family of $k$-subsets of $[n]$ and let $\alpha_d = \min(td, k - d)$. If the minimum Hamming distance $d_H(\mathcal{K})$ between any pair of $k$-sets in $\mathcal{K}$ is at least $2t$, i.e., $\mathcal{K}$ is a $\{1, 2, \ldots, t\}$-clique of $J(n, k)$, then $J(n, d, \mathcal{K})$ is $d^{\alpha_d - 1}$-disjunct.
Theorem 2.5 $J(n, d, k)$ is $s^e$-disjunct for $1 \le s \le d$, where $e$ is the function of $s$ defined by $e = \binom{k-s}{d-s} - 1$.
Proof. For columns of $J(n, d, k)$ indexed by $K_0, K_1, \ldots, K_s \in \binom{[n]}{k}$, let $x_i \in K_0 - K_i$, $1 \le i \le s$, and let $S$ be an $s$-subset of $K_0$ containing $\{x_i \mid 1 \le i \le s\}$. Then each row indexed by $D \in \binom{[n]}{d}$ with $S \subseteq D \subseteq K_0$ has entry 1 in column $K_0$ and entry 0 in each of the columns $K_1, \ldots, K_s$. Indeed, there are $\binom{k-s}{d-s}$ such choices of $D$, as required. □
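The bound in Theorem 2.5 can be observed numerically on a small instance. The sketch below represents each column of $J(n, d, k)$ by its support (the $d$-subsets contained in a $k$-subset) and computes the smallest number of rows that separate a designated column from $s$ others; the function names are illustrative assumptions, and the search is exhaustive, so it is only practical for small $n$.

```python
from itertools import combinations
from math import comb

def johnson_supports(n, d, k):
    """Column supports of J(n, d, k): for each k-subset, the set of d-subsets it contains."""
    return [frozenset(combinations(K, d)) for K in combinations(range(1, n + 1), k)]

def min_private_rows(supports, s):
    """Smallest number of rows with a 1 in a designated column and 0 in s other columns."""
    best = None
    for j0, designated in enumerate(supports):
        others = supports[:j0] + supports[j0 + 1:]
        for group in combinations(others, s):
            private = len(designated.difference(*group))
            if best is None or private < best:
                best = private
    return best

n, d, k = 6, 2, 3
supports = johnson_supports(n, d, k)
for s in (1, 2):
    # Theorem 2.5 guarantees at least comb(k - s, d - s) = e + 1 private rows.
    print(s, min_private_rows(supports, s), comb(k - s, d - s))
```

In this instance the minimum equals $\binom{k-s}{d-s}$ for both values of $s$, so the value of $e$ cannot be increased here.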
3 New families of $d^e$-disjunct matrices
Given positive integers $m \ge k > d \ge 1$, let $M(2m, d, k)$ be the binary matrix $M$ with rows (resp. columns) indexed by the $d$-matchings (resp. $k$-matchings) of $K_{2m}$ such that $M(A, B) = 1$ if $A \subseteq B$ and 0 otherwise. In [7], Ngo and Du proved the following results:
Theorem 3.1 ([7, Theorem 11]) Let $g(m, l) = \binom{2m}{2l} \frac{(2l)!}{2^l\, l!}$, $v = g(m, d)$ and $n = g(m, k)$. For $m \ge k > d \ge 1$, $M(m, k, d)$ is a $v \times n$ $d$-disjunct matrix with row weight $g(m - d, k - d)$ and column weight $\binom{k}{d}$.
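The parameters in Theorem 3.1 are easy to tabulate. The sketch below evaluates $g(m, l)$ and checks the double-counting identity $v \cdot (\text{row weight}) = n \cdot (\text{column weight})$, which any matrix with constant row and column weights must satisfy; the function `design_parameters` is a hypothetical helper introduced only for this illustration.

```python
from math import comb, factorial

def g(m, l):
    """Number of l-matchings of the complete graph K_{2m}."""
    return comb(2 * m, 2 * l) * factorial(2 * l) // (2**l * factorial(l))

def design_parameters(m, k, d):
    """Dimensions and weights of the matching design, as listed in Theorem 3.1."""
    return {
        "rows": g(m, d),
        "columns": g(m, k),
        "row_weight": g(m - d, k - d),
        "column_weight": comb(k, d),
    }

p = design_parameters(4, 3, 2)
print(p)
# Counting the ones of the matrix by rows and by columns must agree.
print(p["rows"] * p["row_weight"] == p["columns"] * p["column_weight"])
```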
Theorem 3.2 ([7, Corollary 12]) Given integers $m > d \ge 1$, the following hold:
(1) $M(m, m, d)$ is $d$-error-detecting and $\lfloor d/2 \rfloor$-error-correcting.
(2) Moreover, if the number of positives is known to be exactly $d$, then $M(m, m, d)$ is
With an interpretation of matchings as 2-cliques of the Johnson graph J(n, 2), we will give some generalizations of Ngo and Du’s construction.
Open questions: find examples of $\Gamma \subseteq \binom{[n]}{k}$ with $d_H(\Gamma) \ge 2r$, and study their properties.
Theorem 3.3 Let $m \ge k > r \ge d \ge 1$. Then the matrix $G(m, t, k, r)$ is a $d^e$-disjunct matrix of order $v \times n$, where $(v, n) = (u(m, r), u(m, k))$ and $e = \binom{k-d}{r-d} - 1$, with constant row weight $u(m - r, k - r)$ and constant column weight $\binom{k}{r}$.
Proof. By Lemma 2.3, $G(m, t, k, r)$ is a $v \times n$ matrix with row weight $u(m - r, k - r)$ and column weight $\binom{k}{r}$.
Let $C_{j_0}, C_{j_1}, \ldots, C_{j_d}$ be any $d + 1$ distinct columns of $G(m, t, k, r)$. For each $i \in [d]$, there is a $t$-subset $V_i$ of $[tm]$ such that $V_i \in C_{j_0} \setminus C_{j_i}$. Let $E = \{V_i \mid i \in [d]\}$. Then $|E| \le d$ and $E \subseteq C_{j_0}$, but $E \not\subseteq C_{j_i}$ for each $i \in [d]$. If $|E| = s$, the number of $r$-subsets of $C_{j_0}$ containing $E$ is $\binom{k-s}{r-s}$. Since $\binom{k-s}{r-s} \ge \binom{k-d}{r-d}$ for $s \le d$, the number of $t$-cliques of size $r$ contained in $C_{j_0}$ but not contained in $C_{j_i}$ for each $i \in [d]$ is at least $\binom{k-d}{r-d}$. □
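A small instance of $G(m, t, k, r)$ can be built explicitly to observe the weights and the disjunctness bound of Theorem 3.3. The sketch below uses $t = 2$, $m = 4$, $k = 3$, $r = 2$ and $d = 1$ (parameters chosen here only to keep the exhaustive search fast); every function name is an illustrative assumption.

```python
from collections import Counter
from itertools import combinations
from math import comb, factorial

def t_cliques(n, t, size):
    """All families of `size` pairwise disjoint t-subsets of {1..n}."""
    blocks = [frozenset(c) for c in combinations(range(1, n + 1), t)]

    def extend(start, chosen, used):
        if len(chosen) == size:
            yield frozenset(chosen)
            return
        for idx in range(start, len(blocks)):
            if not (blocks[idx] & used):
                yield from extend(idx + 1, chosen + [blocks[idx]], used | blocks[idx])

    return list(extend(0, [], frozenset()))

def u(m, t, l):
    """Closed-form number of t-cliques of size l in J(tm, t)."""
    return comb(t * m, t * l) * factorial(t * l) // (factorial(t) ** l * factorial(l))

def column_supports(m, t, k, r):
    """For each column B (a clique of size k), the set of rows A (cliques of size r) with A ⊆ B."""
    return [frozenset(frozenset(R) for R in combinations(B, r))
            for B in t_cliques(t * m, t, k)]

def min_private_rows(supports, d):
    """Smallest number of rows with a 1 in a designated column and 0 in d other columns."""
    best = None
    for j0, designated in enumerate(supports):
        others = supports[:j0] + supports[j0 + 1:]
        for group in combinations(others, d):
            private = len(designated.difference(*group))
            if best is None or private < best:
                best = private
    return best

m, t, k, r, d = 4, 2, 3, 2, 1
cols = column_supports(m, t, k, r)
row_weights = Counter(R for col in cols for R in col)

print(len(row_weights), len(cols))                 # order v x n
print(set(row_weights.values()), u(m - r, t, k - r))  # constant row weight
print(min_private_rows(cols, d), comb(k - d, r - d))  # e + 1 and its predicted value
```

Here $m > k$, and the observed minimum matches $\binom{k-d}{r-d}$ exactly, in line with the optimality statement of Corollary 3.4.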
The following corollary shows the above e is optimal if m > k.
Corollary 3.4 Let $m > k > r \ge d \ge 1$. Then the matrix $G(m, t, k, r)$ is $d^e$-disjunct, but not $d^{e+1}$-disjunct, with $e = \binom{k-d}{r-d} - 1$.
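Proof. In order to prove that $G(m, t, k, r)$ is not a $d^{e+1}$-disjunct matrix, we need only show that the maximum size $|E| = d$ in the proof of Theorem 3.3 is attained. Since $m > k$, there exists a $t$-clique $T = \{A_1, A_2, \ldots, A_{k+1}\}$ with size $k + 1$. Let $C_{j_0} = T \setminus \{A_{d+1}\}$ and $C_{j_i} = T \setminus \{A_i\}$ for each $i \in [d]$. Then $C_{j_0} \setminus C_{j_i} = \{A_i\}$, so $E = \{A_i \mid i \in [d]\}$ and $|E| = d$; hence exactly $\binom{k-d}{r-d}$ rows have a 1 in column $C_{j_0}$ and 0 in each of the other $d$ columns. □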
The results in Theorem 3.3 and Corollary 3.4 also hold for their $q$-analogues, as shown below. Their proofs are similar and will be omitted.
Theorem 3.5 Let $m \ge k > r \ge d \ge 1$. Then the matrix $G_q(m, t, k, r)$ is a $d^e$-disjunct matrix of order $v \times n$, where $(v, n) = (u[m, r]_q, u[m, k]_q)$ and $e = \binom{k-d}{r-d} - 1$, with constant row weight $u[m - d, k - d]_q$ and constant column weight $\binom{k}{r}$.
Corollary 3.6 Let $m > k > r \ge d \ge 1$. Then the matrix $G_q(m, t, k, r)$ is $d^e$-disjunct, but not $d^{e+1}$-disjunct, with $e = \binom{k-d}{r-d} - 1$.
A $d$-matching of $K_{2m}$ is simply a family of $d$ pairwise disjoint 2-subsets of $[2m]$. A 2-clique of $J_q(2m, 2)$ of size $l$ is the $q$-analogue of an $l$-matching of $K_{2m}$.
Similar to Corollary 12 in [7], $G(m, t, m, d)$ is $d$-error-detecting and $\lfloor d/2 \rfloor$-error-correcting.
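For fixed integers $m \ge k > r$, the test-to-item ratio $v/n$ of $G(m, t, k, r)$ (resp. $G_q(m, t, k, r)$) is a strictly decreasing function of $t$. For $G(m, t, k, r)$ this requires $m > k$ or $k > r + 1$: when $m = k = r + 1$ the ratio $u(m, r)/u(m, k)$ equals $k$ for every $t$.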
Some more examples of de-disjunct matrices.
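Theorem 3.7 Let $1 \le s \le d \le k \le n$, let $q \ge 1$ and $e = \binom{k-s}{d-s} - 1$. Then $J(n, d, k)$ is $s^e$-disjunct.
Proof. Note that $\binom{k-s}{d-s} = \binom{k-s}{k-d}$, which is a decreasing sequence in $s$.
Theorem 3.8
1. $G_q(n, d, k)$ is $s^e$-disjunct.
2. $I_q(n, d, k)$ is $s^e$-disjunct for $1 \le s \le p$, where
$$p = \left(\begin{bmatrix} k \\ d \end{bmatrix}_q - \begin{bmatrix} k-1 \\ d \end{bmatrix}_q\right)\left(\begin{bmatrix} k-1 \\ d \end{bmatrix}_q - \begin{bmatrix} k-2 \\ d \end{bmatrix}_q\right)^{-1}$$
and
$$e = \begin{bmatrix} k \\ d \end{bmatrix}_q - \begin{bmatrix} k-1 \\ d \end{bmatrix}_q - (s-1)\left(\begin{bmatrix} k-1 \\ d \end{bmatrix}_q - \begin{bmatrix} k-2 \\ d \end{bmatrix}_q\right) - 1.$$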
Theorem 3.9 ([],[]) For $1 \le d \le k \le n$ and $1 \le r \le k$, let $\mathcal{K}$ be a family of $k$-subsets of $[n]$ such that the minimum Hamming distance $d_H(\mathcal{K})$ between any pair of $k$-sets in $\mathcal{K}$ is at least $2r$. Then
1. $J(n, d, \mathcal{K})$ is $d^{\alpha_d - 1}$-disjunct, where $\alpha_d = \min(rd, k - d)$ (Theorem 2).
2. $J(n, d, k, \mathcal{K}, r)$ is $s^e$-disjunct if $1 \le s \le p$, where
$$p = \left\lceil \left(\binom{k}{d} - \binom{k-r}{d}\right)\left(\binom{k-r}{d} - \binom{k-2r}{d}\right)^{-1} \right\rceil$$
and
$$e = \binom{k}{d} - \binom{k-r}{d} - (s-1)\left(\binom{k-r}{d} - \binom{k-2r}{d}\right) - 1.$$
The following lemma is used in the proof of the following theorem.
Lemma 3.10 Let $\mathcal{K}$ be a family of $k$-subsets of $[n]$ with $|K \cap K'| \le k - t$ for all distinct $K$ and $K'$ in $\mathcal{K}$. Let $d \ge 1$ with $t \ge 1 + t/(k - d)$ and set $\alpha_d = \min(td, k - d)$. Then, given $d + 1$ $k$-sets $\{K_i\}_{i=0}^{d} \subset \mathcal{K}$, there are $\alpha_d$ $d$-sets $\{D_j\}_{j=1}^{\alpha_d}$ in $[n]$ such that each $D_j$ is contained in $K_0$ and no $D_j$ is contained in $K_i$ for $1 \le i \le d$.
4 Parameters $e_d$ and $e_{\le d}$ for error-correcting
For a binary matrix $M$ of order $t \times n$, let $B(D)$ denote the Boolean sum of those columns indexed by elements of $D \subseteq [n]$, and let $d_H(B(D), B(D'))$ denote the Hamming distance between $B(D)$ and $B(D')$ whenever $D$ and $D'$ are two distinct subsets of $[n]$. Suppose $B_d(M)$ is the binary matrix consisting of the columns $B(S)$ for all $S \subseteq [n]$ with $|S| \le d$. Let $d_H(B_d(M))$ be the minimum Hamming distance over all pairs of columns of $B_d(M)$. The minimum Hamming distance $d_H(B_d(M))$ is of interest for error tolerance; for example, Macula proved the following result: Let
$$e_d = \min_{\substack{|D| = |D'| = d \\ D \ne D'}} d_H(B(D), B(D'))$$
and
$$e_{\le d} = \min_{\substack{|D| = |D'| \le d,\ D \ne D' \\ D, D' \text{ are antichains}}} d_H(B(D), B(D')).$$
The larger the parameter $e_{\le d}$ is, the better the error-correcting capacity is. The values of these parameters for the matrices $G(m, t, k, r)$ and $G_q(m, t, k, r)$ will be considered in this section. We first treat the case of $G(m, t, k, r)$ by giving a specific example.
Example 4.1 Let $m > k$, and let $T = \{A_1, A_2, \ldots, A_{k+1}\}$ be a $t$-clique of $J(tm, t)$ with size $k + 1$. For each $i \in [d + 1]$, let $B_i = T \setminus \{A_i\}$. Then each $B_i$ is a $t$-clique of $J(tm, t)$ with size $k$. Let
$$D = \{B_1, B_2, \ldots, B_{d-1}, B_d\} \quad \text{and} \quad D' = \{B_1, B_2, \ldots, B_{d-1}, B_{d+1}\}.$$
Then
$$\begin{aligned} d_H(B(D), B(D')) &= \left|\left\{R \mid R \in \tbinom{B_d}{r},\ R \not\subseteq B_1, B_2, \ldots, B_{d-1}, B_{d+1}\right\}\right| + \left|\left\{R \mid R \in \tbinom{B_{d+1}}{r},\ R \not\subseteq B_1, B_2, \ldots, B_{d-1}, B_d\right\}\right| \\ &= \left|\{R \mid \{A_1, A_2, \ldots, A_{d-1}, A_{d+1}\} \subseteq R \subseteq B_d\}\right| + \left|\{R \mid \{A_1, A_2, \ldots, A_{d-1}, A_d\} \subseteq R \subseteq B_{d+1}\}\right| \\ &= 2\binom{k-d}{r-d}. \end{aligned}$$
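The computation in Example 4.1 can be replayed on a concrete clique. The sketch below takes $t = 2$, $k = 3$, $r = 2$ and $T$ equal to four disjoint edges on $\{1, \ldots, 8\}$ (a choice made here only for illustration), forms the Boolean sums $B(D)$ and $B(D')$ as sets of rows, and measures their Hamming distance as the size of the symmetric difference.

```python
from itertools import combinations
from math import comb

def rows_of(column, r):
    """Rows (size-r subfamilies) having a 1 in the given column."""
    return {frozenset(R) for R in combinations(column, r)}

def boolean_sum(columns, r):
    """Support of the Boolean sum of the given columns."""
    covered = set()
    for column in columns:
        covered |= rows_of(column, r)
    return covered

def example_distance(T, r, d):
    """Hamming distance between B(D) and B(D') for D, D' as in Example 4.1."""
    B = [frozenset(T) - {A} for A in T]   # B[i] plays the role of B_{i+1}
    D = B[:d]                             # B_1, ..., B_d
    D_prime = B[:d - 1] + [B[d]]          # B_1, ..., B_{d-1}, B_{d+1}
    return len(boolean_sum(D, r) ^ boolean_sum(D_prime, r))

T = [frozenset(pair) for pair in [(1, 2), (3, 4), (5, 6), (7, 8)]]
k, r = len(T) - 1, 2
for d in (1, 2):
    print(d, example_distance(T, r, d), 2 * comb(k - d, r - d))
```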
Theorem 4.1 Let $m > k > r \ge d \ge 1$. Then $e_d = e_{\le d} = 2\binom{k-d}{r-d}$ for $M = G(m, t, k, r)$.
Proof. Writing $D = \{A_1, \ldots, A_d\}$ and $D' = \{A'_1, \ldots, A'_d\}$, we have
$$\begin{aligned} e_d &= \min_{|D| = |D'| = d} d_H(B(D), B(D')) \\ &\ge \min \left|\{R \mid R \subseteq A_i \text{ for some } i \in [d] \text{ and } R \not\subseteq A'_j \text{ for } j \in [d]\}\right| \\ &\quad + \min \left|\{R \mid R \subseteq A'_i \text{ for some } i \in [d] \text{ and } R \not\subseteq A_j \text{ for } j \in [d]\}\right| \\ &\ge 2\binom{k-d}{r-d} \end{aligned}$$
by Theorem 3.3. On the other hand, Example 4.1 shows $e_d \le 2\binom{k-d}{r-d}$. Hence $e_d = 2\binom{k-d}{r-d}$, as required.
To show $e_{\le d} = 2\binom{k-d}{r-d}$, we consider two antichains $D = \{A_1, A_2, \ldots, A_u\}$ and $D' = \{A'_1, A'_2, \ldots, A'_v\}$, where $u, v \le d$. Without loss of generality, we may assume that $A_u \notin D'$ and $A'_v \notin D$. By Theorem 3.3 there exist at least $\binom{k-v}{r-v}$ $t$-cliques with size $r$ contained in $A_u$ but not in $A'_j$ for each $j \in [v]$. By symmetry, we have
$$e_{\le d} \ge \binom{k-v}{r-v} + \binom{k-u}{r-u}.$$
Note that $\binom{k-s}{r-s} \ge \binom{k-d}{r-d}$ if $s \le d$. Hence $e_{\le d} \ge 2\binom{k-d}{r-d}$. On the other hand, by definition, $e_{\le d} \le e_d = 2\binom{k-d}{r-d}$. This yields $e_{\le d} = 2\binom{k-d}{r-d}$. □
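For $d = 1$ the parameter $e_d$ is simply the minimum Hamming distance between two distinct columns, which can be computed exhaustively on a small instance. The sketch below does this for $G(4, 2, 3, 2)$ (an instance chosen here because it has only 420 columns); larger $d$ would need a far more expensive search, and the function names are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def t_cliques(n, t, size):
    """All families of `size` pairwise disjoint t-subsets of {1..n}."""
    blocks = [frozenset(c) for c in combinations(range(1, n + 1), t)]

    def extend(start, chosen, used):
        if len(chosen) == size:
            yield frozenset(chosen)
            return
        for idx in range(start, len(blocks)):
            if not (blocks[idx] & used):
                yield from extend(idx + 1, chosen + [blocks[idx]], used | blocks[idx])

    return list(extend(0, [], frozenset()))

def column_supports(m, t, k, r):
    """Support of each column of G(m, t, k, r) as a set of size-r subfamilies."""
    return [frozenset(frozenset(R) for R in combinations(B, r))
            for B in t_cliques(t * m, t, k)]

def e_one(supports):
    """e_1: minimum Hamming distance between two distinct columns."""
    return min(len(a ^ b) for a, b in combinations(supports, 2))

m, t, k, r = 4, 2, 3, 2
cols = column_supports(m, t, k, r)
print(e_one(cols), 2 * comb(k - 1, r - 1))  # observed e_1 and the value in Theorem 4.1
```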
A similar result holds for $G_q(m, t, k, r)$. The proof is similar to that of Theorem 4.1 and will be omitted.
Theorem 4.2 Let $m > k > r \ge d \ge 1$. Then $e_d = e_{\le d} = 2\binom{k-d}{r-d}$ for $M = G_q(m, t, k, r)$.
References
[1] A. E. Brouwer, A. M. Cohen and A. Neumaier, Distance-Regular Graphs, Springer Verlag, Berlin, Heidelberg, 1989.
[2] A.G. D'yachkov, F.K. Hwang, A.J. Macula, P.A. Vilenkin and C. Weng, A construction of pooling designs with some happy surprises, J. Computational Biology 12 (2005), 1129-1136.
[3] T. Huang and C. Weng, A note on decoding of superimposed codes, J. Comb. Optim. 7 (2003), no. 4, 381-384.
[4] T. Huang and C. Weng, Pooling spaces and non-adaptive pooling designs, Discrete Math. 282 (2004), 163-169.
[5] A.J. Macula, A simple construction of d-disjunct matrices with certain constant weights, Discrete Math. 162 (1996), 311-312.
[6] A.J. Macula, Error-correcting nonadaptive group testing with $d^e$-disjunct matrices, Discrete Appl. Math. 80 (1997), 217-222.
[7] H. Ngo and D. Du, New constructions of non-adaptive and error-tolerance pooling designs, Discrete Math. 243 (2002), 161-170.
[8] H. Ngo and D. Du, A survey on combinatorial group testing algorithms with applications to DNA library screening, DIMACS Ser. Discrete Math. Theoretical Comp. Sci. 55 (2000), 171-182.
[9] A. D'yachkov, P. Vilenkin and A. Macula, Nonadaptive Group Testing with Error-Correcting $d^e$-Disjunct Inclusion Matrices, Bolyai Society Mathematical