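A Study of Strongly Regular Multigraphs and Their Related Finite Geometries (I)

National Science Council, Executive Yuan: Research Project Interim Progress Report

A Study of Strongly Regular Multigraphs and Their Related Finite Geometries (1/3)

Project type: Individual project

Project number: NSC94-2115-M-009-015-

Project period: August 1, 2005 to July 31, 2006

Host institution: Department of Applied Mathematics, National Chiao Tung University

Principal investigator: Tayuan Huang (黃大原)

Report type: Concise report

Report attachments: Report on attendance at an international conference, and published papers

Availability: This project is open to public inquiry

May 30, 2006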

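Interim Report

Tayuan Huang

As a generalization of the pooling designs that Ding-Zhu Du constructed on matchings of various sizes in complete graphs, we give two families of error-correcting pooling designs based on the inclusion relations among cliques of various sizes in Johnson graphs and Grassmann graphs.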

Some Error-Correcting Pooling Designs

Associated with Johnson Graphs and

Grassmann Graphs

(Preliminary Version 3.1)

Yujuan Bai, Tayuan Huang and Kaishun Wang∗

May 3, 2006

Abstract

Based on the inclusion matrices of t-cliques of various sizes in the Johnson graphs J(n, t) and the Grassmann graphs $J_q(n, t)$, respectively, two families of error-correcting pooling designs are given. Some of their properties are studied, including their error-correcting capability and the two parameters $e_d$ and $e_{\le d}$. With an interpretation of matchings of $K_{2m}$ as 2-cliques of the Johnson graph J(n, 2), this gives a q-analogue of the pooling designs defined over matchings of $K_{2m}$ given by Ngo and Du.

1 Introduction

Suppose there are at most d defective items among n items to be tested, and assume that some testing mechanism exists which, when applied to an arbitrary subset of the population, gives a negative outcome if the subset contains no positive item and a positive outcome otherwise. A group testing algorithm is non-adaptive if all tests must be specified without knowing the outcomes of other tests; such algorithms are useful in many areas, such as DNA library screening.

The notion of $d^e$-disjunct matrices (defined in Section 2) provides a mathematical model for error-correcting pooling designs. Macula [5] constructed $d^e$-disjunct matrices for certain values of e by the containment relation of subsets in a finite set. The q-analogue of Macula's construction is given by Ngo and Du in [7]. Moreover, the notion of pooling spaces was introduced by Huang and Weng [4], which provides a general framework for $d^e$-disjunct matrices. They showed in [3] that a $d^{2e}$-disjunct matrix is e-error-correcting.

Recently, Ngo and Du [7] constructed a class of disjunct matrices from the incidence matrices of matchings of various sizes in the complete graph $K_{2m}$, and asked for its q-analogue. With an interpretation of matchings as 2-cliques of Johnson graphs J(n, 2), we generalize Ngo and Du's construction to the incidence matrices of t-cliques of various sizes in Johnson graphs J(n, t) and Grassmann graphs $J_q(n, t)$, respectively. We show that our pooling designs have the same error-detecting and error-correcting capability as those of Ngo and Du, while their test-to-item ratio is much smaller. Moreover, the parameters $e_d$ and $e_{\le d}$ of these pooling designs are also considered.

An overview of up-to-date results on combinatorial group testing algorithms was given by Du and Ngo [8]. They pointed out that this is a young field with deep connections to coding theory and design theory, and they strongly believe that the theory of association schemes, and in particular distance-regular graphs, should play an important role in improving pooling designs.

We will recall some known results regarding pooling designs in the framework of two families of distance-regular graphs, the Johnson graphs and the Grassmann graphs. We first recall some pooling designs associated with the Johnson graphs and the Grassmann graphs in Section 2. Some basic definitions on t-cliques and {1, 2, ..., t}-cliques of Johnson graphs are also given in Section 2. Two new families of pooling designs, together with their error-correcting capability, are given in Section 3. Moreover, the two parameters $e_d$ and $e_{\le d}$ over the Johnson graphs are studied in Section 4.

2 Preliminaries

The notion of $d^e$-disjunct matrices provides a mathematical model for error-correcting pooling designs.

Definition 2.1 A binary matrix M is said to be $d^e$-disjunct if, given any d + 1 columns of M with one designated, there are at least e + 1 rows with a 1 in the designated column and 0 in each of the other d columns.
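Definition 2.1 can be checked directly on small matrices. The following is a minimal brute-force sketch, not part of the original text; the function name `is_de_disjunct` and the toy matrices are illustrative assumptions, and the search is exponential in d, so it is only meant for tiny examples.

```python
from itertools import combinations

def is_de_disjunct(matrix, d, e=0):
    """Return True if the 0/1 matrix (given as a list of rows) is d^e-disjunct."""
    n_cols = len(matrix[0])
    # Support of each column: the set of row indices holding a 1.
    supports = [{r for r, row in enumerate(matrix) if row[c]} for c in range(n_cols)]
    for designated in range(n_cols):
        others = [c for c in range(n_cols) if c != designated]
        for group in combinations(others, d):
            covered = set().union(*(supports[c] for c in group))
            # Rows with a 1 in the designated column and 0 in the d other columns.
            if len(supports[designated] - covered) < e + 1:
                return False
    return True

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
doubled = identity + identity  # every test is performed twice

print(is_de_disjunct(identity, d=2, e=0))  # True
print(is_de_disjunct(doubled, d=2, e=1))   # True
print(is_de_disjunct(identity, d=2, e=1))  # False
```

Repeating every row of a d-disjunct matrix is the simplest way to raise e, at the cost of doubling the number of tests; the constructions in this paper aim at better trade-offs.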

A $d^e$-disjunct matrix with e = 0 is said to be a d-disjunct matrix. Let q be a positive integer, in practice a prime power. Given positive integers 1 ≤ i ≤ n, the Gaussian binomial coefficient with basis q is defined by

$$\begin{bmatrix} n \\ i \end{bmatrix}_q = \begin{cases} \displaystyle\prod_{j=0}^{i-1} \frac{n-j}{i-j}, & \text{if } q = 1, \\[2ex] \displaystyle\prod_{j=0}^{i-1} \frac{q^n - q^j}{q^i - q^j}, & \text{if } q \ne 1. \end{cases}$$

In the case q = 1, we write $\binom{n}{i}$ instead of $\begin{bmatrix} n \\ i \end{bmatrix}_1$ for convenience.
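The definition above translates directly into integer arithmetic. The sketch below is only an illustration; the function name `gaussian_binomial` is an assumption and does not appear in the original.

```python
from math import comb

def gaussian_binomial(n, i, q):
    """Gaussian binomial coefficient [n, i]_q for integers n >= i >= 0 and q >= 1."""
    if i < 0 or i > n:
        return 0
    if q == 1:
        return comb(n, i)  # the ordinary binomial coefficient
    num = den = 1
    for j in range(i):
        num *= q**n - q**j
        den *= q**i - q**j
    return num // den  # the quotient is always an integer

print(gaussian_binomial(4, 2, 1))  # 6, the number of 2-subsets of a 4-set
print(gaussian_binomial(4, 2, 2))  # 35, the number of 2-subspaces of GF(2)^4
```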

For any positive integer n we use [n] to denote the set {1, 2, . . . , n}. For any positive integer k, $\binom{[n]}{k}$ denotes the collection of all k-subsets of [n], and $\begin{bmatrix} GF(q)^n \\ k \end{bmatrix}_q$ denotes the collection of all k-subspaces of $GF(q)^n$.

Definition 2.2

1. The Johnson graph J(n, t) is the graph defined on $\binom{[n]}{t}$ such that A and B are adjacent if |A ∩ B| = t − 1.

2. The Grassmann graph $J_q(n, t)$ is the graph defined on $\begin{bmatrix} GF(q)^n \\ t \end{bmatrix}_q$ such that A and B are adjacent if dim(A ∩ B) = t − 1.

Definition 2.3 A clique C of J(n, 2) is a subfamily of $\binom{[n]}{2}$ such that |A ∩ B| = 1 for any two distinct A, B ∈ C.

Note that J(n, 2) is a strongly regular graph, i.e., a distance-regular graph of diameter 2. Both Johnson graphs and Grassmann graphs are distance-regular; refer to [1] for details.
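The strong regularity of J(n, 2) can be observed on a small instance. The sketch below builds J(n, t) from Definition 2.2 and reads off the parameters (v, k, λ, μ) by brute force; the helper names `johnson_graph` and `srg_parameters` are illustrative assumptions, and the parameter values in the comment are the standard ones for the triangular graph on 5 points rather than values stated in this report.

```python
from itertools import combinations

def johnson_graph(n, t):
    """Vertices: t-subsets of {1..n}; two are adjacent when they share t - 1 elements."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), t)]
    adj = {v: {w for w in vertices if len(v & w) == t - 1} for v in vertices}
    return vertices, adj

def srg_parameters(vertices, adj):
    """Return (v, k, lambda, mu) if the graph is strongly regular, else None."""
    degrees = {len(adj[v]) for v in vertices}
    lambdas, mus = set(), set()
    for v, w in combinations(vertices, 2):
        common = len(adj[v] & adj[w])  # number of common neighbours
        (lambdas if w in adj[v] else mus).add(common)
    if len(degrees) == 1 and len(lambdas) == 1 and len(mus) == 1:
        return len(vertices), degrees.pop(), lambdas.pop(), mus.pop()
    return None

print(srg_parameters(*johnson_graph(5, 2)))  # (10, 6, 3, 4)
```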

In the terminology of Definition 2.4 below, an l-matching in [7] is a 2-clique of J(n, 2) with size l. With this interpretation, its q-analogue extensions are available.

Definition 2.4

1. A t-clique of J(n, t) with size l is a subfamily $\{A_1, A_2, \ldots, A_l\}$ of $\binom{[n]}{t}$ such that $|A_1 \cup A_2 \cup \cdots \cup A_l| = tl$, i.e., $A_i \cap A_j = \emptyset$ for any two distinct i and j.

2. A t-clique of $J_q(n, t)$ with size l is a subfamily $\{A_1, A_2, \ldots, A_l\}$ of $\begin{bmatrix} GF(q)^n \\ t \end{bmatrix}_q$ such that $\dim(A_1 + A_2 + \cdots + A_l) = tl$.

Definition 2.5 A family $\mathcal{K}$ of k-subsets of [n] with $|K \cap K'| \le k - t$ for all distinct K and K' in $\mathcal{K}$ is called a {1, 2, ..., t}-clique of J(n, k).

The notations for disjunct matrices: Let d < k < n.

J(n, d, k): the incidence matrix of the system $\left(\binom{[n]}{d}, \binom{[n]}{k}; \subseteq\right)$ (J is for Johnson schemes).

$G_q(n, d, k)$: the incidence matrix of the system $\left(\begin{bmatrix} GF(q)^n \\ d \end{bmatrix}_q, \begin{bmatrix} GF(q)^n \\ k \end{bmatrix}_q; \subseteq\right)$ (G is for Grassmann schemes).

M(2n, d, k): the incidence matrix of the system (d-matchings of $K_{2n}$, k-matchings of $K_{2n}$; ⊆) (M is for matchings).

$M_q(2n, d, k)$: the incidence matrix of the q-analogue of the preceding system ($M_q$ is for q-analogues of matchings).

The error-correcting capability of $d^e$-disjunct matrices is summarized in the following.

Theorem 2.1 Suppose a $d^e$-disjunct matrix M of order N × t is used for a pooling design, and P is the positive set to be identified.

1. If it is known that |P| = d, then M can correct e errors.

2. If it is known that |P| ≤ d, then M can correct $\lfloor e/2 \rfloor$ errors; moreover, M can correct e errors with d additional confirmation tests.
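To see the second statement of Theorem 2.1 at work, one can decode noisy outcomes by a threshold rule: keep an item if at most $\lfloor e/2 \rfloor$ of the tests containing it came back negative. This rule is a standard argument supplied here as an assumption, not a decoder taken from this report, and the names `inclusion_matrix_supports` and `decode` are hypothetical. The matrix used is J(6, 2, 4), which is $1^2$-disjunct by Theorem 2.5 (s = 1, e = 2).

```python
from itertools import combinations

def inclusion_matrix_supports(n, d, k):
    """Columns of J(n, d, k) as sets of row indices (rows: d-subsets, columns: k-subsets)."""
    rows = {frozenset(r): i for i, r in enumerate(combinations(range(n), d))}
    return [{rows[frozenset(r)] for r in combinations(col, d)}
            for col in combinations(range(n), k)]

def decode(supports, positive_tests, e):
    """Keep column j if at most floor(e/2) of the tests in its support are negative."""
    return {j for j, s in enumerate(supports)
            if sum(1 for r in s if r not in positive_tests) <= e // 2}

supports = inclusion_matrix_supports(6, 2, 4)  # 15 tests, 15 items
positives = {7}                                # |P| <= d = 1
clean = set().union(*(supports[j] for j in positives))  # error-free positive tests

# Flip each of the 15 test outcomes in turn and decode the corrupted result.
results = [decode(supports, clean ^ {flipped}, e=2) == positives for flipped in range(15)]
print(all(results))  # True
```

With e = 2 the rule tolerates one wrong outcome: a positive item sees at most one negative test, while any other item has at least e + 1 = 3 tests outside the positive column, of which at most one can be flipped.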

Moreover, the q-analogue of G(m, t, k, r) can be obtained naturally by Definition 2.2.

Definition 2.6 Given positive integers m ≥ k > r ≥ 1:

1. G(m, t, k, r) is the binary matrix M whose rows (resp. columns) are indexed by the t-cliques of size r (resp. k) of J(tm, t), such that M(A, B) = 1 if A ⊆ B and 0 otherwise.

2. $G_q(m, t, k, r)$ is the binary matrix M whose rows (resp. columns) are indexed by the t-cliques of size r (resp. k) of $J_q(tm, t)$, such that M(A, B) = 1 if A ⊆ B and 0 otherwise.
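For small parameters the matrix G(m, t, k, r) can be written out explicitly. The following sketch enumerates the t-cliques of Definition 2.4 by brute force; the names `t_cliques` and `build_G` are illustrative assumptions, and the ground set is taken to be {0, ..., tm − 1} instead of [tm] for convenience.

```python
from itertools import combinations

def t_cliques(m, t, size):
    """All t-cliques of J(tm, t) with the given size: families of pairwise disjoint t-subsets."""
    blocks = [frozenset(b) for b in combinations(range(t * m), t)]
    return [frozenset(fam) for fam in combinations(blocks, size)
            if len(frozenset().union(*fam)) == t * size]

def build_G(m, t, k, r):
    """Rows: t-cliques of size r; columns: t-cliques of size k; entry 1 iff row is contained in column."""
    rows, cols = t_cliques(m, t, r), t_cliques(m, t, k)
    return [[1 if a <= b else 0 for b in cols] for a in rows]

G = build_G(m=3, t=2, k=2, r=1)          # edges versus 2-matchings of K_6
print(len(G), len(G[0]))                 # 15 45
print({sum(row) for row in G})           # {6}: every edge lies in 6 of the 2-matchings
print({sum(col) for col in zip(*G)})     # {2}: every 2-matching contains 2 edges
```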

Lemma 2.2 Let W be a k-subspace of $F_q^n$. Then the number of d-subspaces of $F_q^n$ intersecting W trivially is

$$\begin{bmatrix} n-k \\ d \end{bmatrix}_q q^{dk}.$$

Proof. Let $V = F_q^n$ and let $D = \{A \mid A \in \begin{bmatrix} V \\ d \end{bmatrix}_q,\ A \cap W = 0\}$. Counting the set $\{(v_1, v_2, \ldots, v_d) \mid v_i \notin \langle W, v_1, v_2, \ldots, v_{i-1} \rangle\}$ in two ways, we have

$$(q^n - q^k)(q^n - q^{k+1}) \cdots (q^n - q^{k+d-1}) = |D| \cdot (q^d - 1)(q^d - q) \cdots (q^d - q^{d-1}).$$

Hence $|D| = \begin{bmatrix} n-k \\ d \end{bmatrix}_q q^{dk}$ as required. □

Lemma 2.3 Let 1 ≤ l ≤ m.

1. The number $u[m, l]_1 = u(m, l)$ of t-cliques of J(tm, t) with size l is

$$u[m, l]_1 = u(m, l) = \binom{tm}{tl} \frac{(tl)!}{(t!)^l \, l!}.$$

2. The number $u[m, l]_q = u_q(m, l)$ of t-cliques of $J_q(tm, t)$ with size l is

$$u_q(m, l) = \begin{bmatrix} tm \\ tl \end{bmatrix}_q \prod_{i=1}^{l} \begin{bmatrix} it \\ t \end{bmatrix}_q \cdot \frac{q^{t^2 l(l-1)/2}}{l!}.$$

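Proof. By Definition 2.4, $\{A_1, A_2, \ldots, A_l\}$ is a t-clique of $J_q(tm, t)$ with size l if and only if $A_1 + A_2 + \cdots + A_l$ is a tl-subspace of $GF(q)^{tm}$.

Let L(m, l) be the number of ordered tuples $(A_1, A_2, \ldots, A_l)$ of t-subspaces of $GF(q)^{tm}$ such that $\dim(A_1 + A_2 + \cdots + A_l) = tl$. Notice that the number of tl-subspaces of $GF(q)^{tm}$ is $\begin{bmatrix} tm \\ tl \end{bmatrix}_q$. Counting L(m, l) directly, inside a fixed tl-subspace there are $\begin{bmatrix} tl \\ t \end{bmatrix}_q$ ways to choose $A_1$, then $\begin{bmatrix} tl-t \\ t \end{bmatrix}_q q^{t^2}$ ways to choose $A_2$ by Lemma 2.2, and so on. Thus

$$L(m, l) = \begin{bmatrix} tm \\ tl \end{bmatrix}_q \begin{bmatrix} tl \\ t \end{bmatrix}_q q^{t \cdot 0} \begin{bmatrix} tl-t \\ t \end{bmatrix}_q q^{t \cdot t} \cdots \begin{bmatrix} 2t \\ t \end{bmatrix}_q q^{t \cdot (l-2)t} \begin{bmatrix} t \\ t \end{bmatrix}_q q^{t \cdot (l-1)t}. \quad (1)$$

On the other hand, $(A_1, A_2, \ldots, A_l)$ may be obtained by first picking a t-clique of $J_q(tm, t)$ with size l in $u[m, l]_q$ ways; then, for each t-clique, there are l! ways to order it into a tuple $(A_1, A_2, \ldots, A_l)$. This yields

$$L(m, l) = u[m, l]_q \, l!. \quad (2)$$

Comparing (1) and (2) gives the formula for $u[m, l]_q$.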
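The two counting lemmas above can be sanity-checked by exhaustive enumeration in the smallest interesting case, $GF(2)^4$ with t = 2. The sketch below is illustrative only: the function names are assumptions, vectors of $GF(2)^4$ are encoded as the integers 1..15 with XOR as addition, and the product in `u_q` is taken in the form in which it appears in equation (1) of the proof.

```python
from itertools import combinations
from math import comb, factorial

def gauss(n, i, q):
    """Gaussian binomial coefficient [n, i]_q for q >= 2."""
    num = den = 1
    for j in range(i):
        num *= q**n - q**j
        den *= q**i - q**j
    return num // den

def u(m, l, t):
    """Lemma 2.3(1): number of t-cliques of J(tm, t) with size l."""
    return comb(t * m, t * l) * factorial(t * l) // (factorial(t) ** l * factorial(l))

def u_q(m, l, t, q):
    """Lemma 2.3(2), obtained by dividing equation (1) by l!."""
    prod = 1
    for i in range(1, l + 1):
        prod *= gauss(i * t, t, q)
    return gauss(t * m, t * l, q) * prod * q ** (t * t * l * (l - 1) // 2) // factorial(l)

# All 2-subspaces of GF(2)^4, stored as their sets of nonzero vectors.
planes = {frozenset({a, b, a ^ b}) for a, b in combinations(range(1, 16), 2)}
W = frozenset({1, 2, 3})                                    # one fixed 2-subspace
trivial = [P for P in planes if not (P & W)]                # meet W only in 0
q_cliques = [1 for P, Q in combinations(planes, 2) if not (P & Q)]
set_cliques = [1 for A, B in combinations(list(combinations(range(6), 2)), 2)
               if not set(A) & set(B)]

print(len(trivial), gauss(4 - 2, 2, 2) * 2 ** (2 * 2))      # 16 16  (Lemma 2.2)
print(len(set_cliques), u(3, 2, 2))                         # 45 45  (Lemma 2.3, part 1)
print(len(q_cliques), u_q(2, 2, 2, 2))                      # 280 280 (Lemma 2.3, part 2)
```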

Theorem 2.4 ([6, Theorem 2]) Let $\mathcal{K}$ be a family of k-subsets of [n] and let $\alpha_d = \min(td, k - d)$. If the minimum Hamming distance $d_H(\mathcal{K})$ between any pair of k-sets in $\mathcal{K}$ is at least 2t, i.e., $\mathcal{K}$ is a {1, 2, ..., t}-clique of J(n, k), then $J(n, d, \mathcal{K})$ is $d^{\alpha_d - 1}$-disjunct.

Theorem 2.5 J(n, d, k) is $s^e$-disjunct for 1 ≤ s ≤ d, where e is the function of s defined by $e = \binom{k-s}{d-s} - 1$.

Proof. For those columns of J(n, d, k) indexed by $K_0, K_1, \ldots, K_s \in \binom{[n]}{k}$, let $x_i \in K_0 - K_i$, 1 ≤ i ≤ s, and let S be an s-subset of $K_0$ containing $\{x_i \mid 1 \le i \le s\}$. Then each row indexed by $D \in \binom{[n]}{d}$ with $S \subseteq D \subseteq K_0$ has entry 1 in the column $K_0$ and entry 0 in each of the columns $K_1, \ldots, K_s$. Indeed, there are $\binom{k-s}{d-s}$ such choices of D, as required.
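The bound in Theorem 2.5 can be compared with an exhaustive count on small inclusion matrices. The sketch below computes, over all choices of a designated column and s other columns, the minimum number of rows that are 1 in the designated column and 0 elsewhere; the function name `min_private_rows` is an illustrative assumption.

```python
from itertools import combinations
from math import comb

def min_private_rows(n, d, k, s):
    """Minimum number of rows of J(n, d, k) that are 1 in a designated column
    and 0 in s other columns, over all such choices of columns."""
    cols = [{frozenset(r) for r in combinations(c, d)} for c in combinations(range(n), k)]
    best = None
    for j0, col0 in enumerate(cols):
        others = cols[:j0] + cols[j0 + 1:]
        for group in combinations(others, s):
            private = len(col0 - set().union(*group))
            best = private if best is None else min(best, private)
    return best

n, d, k = 6, 2, 4
for s in (1, 2):
    # The theorem guarantees at least e + 1 = C(k - s, d - s) such rows.
    print(s, min_private_rows(n, d, k, s), comb(k - s, d - s))
# prints "1 3 3" and "2 1 1"
```

In these cases the minimum equals $\binom{k-s}{d-s}$ exactly, so the value of e cannot be increased for J(6, 2, 4).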

3 New families of $d^e$-disjunct matrices

Given positive integers m ≥ k > d ≥ 1, let M(2m, d, k) be the binary matrix whose rows (resp. columns) are indexed by the d-matchings (resp. k-matchings) of $K_{2m}$, such that M(A, B) = 1 if A ⊆ B and 0 otherwise. In [7], Ngo and Du proved the following results:

Theorem 3.1 ([7, Theorem 11]) Let $g(m, l) = \binom{2m}{2l} \frac{(2l)!}{2^l \, l!}$, v = g(m, d) and n = g(m, k). For m ≥ k > d ≥ 1, M(2m, d, k) is a v × n d-disjunct matrix with row weight g(m − d, k − d) and column weight $\binom{k}{d}$.

Theorem 3.2 ([7, Corollary 12]) Given integers m > d ≥ 1, the following hold:

(1) M(2m, d, m) is d-error-detecting and $\lfloor d/2 \rfloor$-error-correcting.

(2) If the number of positives is known to be exactly d, then M(2m, d, m) is

With an interpretation of matchings as 2-cliques of the Johnson graph J(n, 2), we will give some generalizations of Ngo and Du’s construction.

Find examples of $\Gamma \subseteq \binom{[n]}{k}$ with $d_H(\Gamma) \ge 2r$? Study their properties?

Theorem 3.3 Let m ≥ k > r ≥ d ≥ 1. Then the matrix G(m, t, k, r) is a $d^e$-disjunct matrix of order v × n, where (v, n) = (u(m, r), u(m, k)) and $e = \binom{k-d}{r-d} - 1$, with constant row weight u(m − r, k − r) and constant column weight $\binom{k}{r}$.

Proof. By Lemma 2.3, G(m, t, k, r) is a v × n matrix with row weight u(m − r, k − r) and column weight $\binom{k}{r}$.

Let $C_{j_0}, C_{j_1}, \ldots, C_{j_d}$ be any d + 1 distinct columns of G(m, t, k, r). For each i ∈ [d], there is a t-subset $V_i$ of [tm] such that $V_i \in C_{j_0} \setminus C_{j_i}$. Let $E = \{V_i \mid i \in [d]\}$. Then |E| ≤ d and $E \subset C_{j_0}$, but $E \not\subseteq C_{j_i}$ for each i ∈ [d]. If |E| = s, the number of r-subsets of $C_{j_0}$ containing E is $\binom{k-s}{r-s}$. Since $\binom{k-s}{r-s} \ge \binom{k-d}{r-d}$ for s ≤ d, the number of t-cliques of size r contained in $C_{j_0}$ but not contained in $C_{j_i}$ for each i ∈ [d] is at least $\binom{k-d}{r-d}$. □

The following corollary shows the above e is optimal if m > k.

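Corollary 3.4 Let m > k > r ≥ d ≥ 1. Then the matrix G(m, t, k, r) is $d^e$-disjunct, but not $d^{e+1}$-disjunct, with $e = \binom{k-d}{r-d} - 1$.

Proof. In order to prove that G(m, t, k, r) is not a $d^{e+1}$-disjunct matrix, we need only show that the maximum size of E is attained. Since m > k, there exists a t-clique $T = \{A_1, A_2, \ldots, A_{k+1}\}$ with size k + 1. Let $C_{j_i} = T \setminus \{A_i\}$ for each i ∈ [d + 1], and designate the column $C_{j_{d+1}}$. Then $C_{j_{d+1}} \setminus C_{j_i} = \{A_i\}$ for each i ∈ [d], so $E = \{A_i \mid i \in [d]\}$ and |E| = d. Hence exactly $\binom{k-d}{r-d} = e + 1$ rows have a 1 in the designated column and 0 in the other d columns. □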
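Theorem 3.3 and Corollary 3.4 can be observed numerically for very small parameters. The sketch below computes the exact minimum number of rows that separate a designated column of G(m, t, k, r) from d other columns; the names `t_cliques` and `min_private_rows` are illustrative assumptions, the ground set is {0, ..., tm − 1}, and the search is brute force.

```python
from itertools import combinations
from math import comb

def t_cliques(m, t, size):
    """All families of `size` pairwise disjoint t-subsets of a tm-set."""
    blocks = [frozenset(b) for b in combinations(range(t * m), t)]
    return [frozenset(fam) for fam in combinations(blocks, size)
            if len(frozenset().union(*fam)) == t * size]

def min_private_rows(m, t, k, r, d):
    """Minimum number of rows of G(m, t, k, r) that are 1 in a designated column
    and 0 in d other columns. Each column is stored as its set of rows,
    i.e. the r-subsets of the corresponding t-clique of size k."""
    cols = [{frozenset(sub) for sub in combinations(col, r)} for col in t_cliques(m, t, k)]
    best = None
    for j0, col0 in enumerate(cols):
        others = cols[:j0] + cols[j0 + 1:]
        for group in combinations(others, d):
            private = len(col0 - set().union(*group))
            best = private if best is None else min(best, private)
    return best

# m > k: the bound e + 1 = C(k - d, r - d) is attained, as in Corollary 3.4.
print(min_private_rows(3, 2, 2, 1, 1), comb(2 - 1, 1 - 1))  # 1 1
# m = k: the columns are perfect matchings of K_6 and the bound is not attained.
print(min_private_rows(3, 2, 3, 2, 1), comb(3 - 1, 2 - 1))  # 3 2
```

The second line illustrates why the optimality statement is restricted to m > k: two distinct perfect matchings of $K_6$ share at most one edge, so no row of the designated column is covered.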

The results in Theorem 3.3 and Corollary 3.4 also hold for their q-analogues, as shown below; the proofs are similar and are omitted.

Theorem 3.5 Let m ≥ k > r ≥ d ≥ 1. Then the matrix $G_q(m, t, k, r)$ is a $d^e$-disjunct matrix of order v × n, where $(v, n) = (u[m, r]_q, u[m, k]_q)$ and $e = \binom{k-d}{r-d} - 1$, with constant row weight $u[m - d, k - d]_q$ and constant column weight $\binom{k}{r}$.

Corollary 3.6 Let m > k > r ≥ d ≥ 1. Then the matrix $G_q(m, t, k, r)$ is $d^e$-disjunct, but not $d^{e+1}$-disjunct, with $e = \binom{k-d}{r-d} - 1$.

A d-matching of $K_{2m}$ is simply a family of d pairwise disjoint 2-subsets of [2m]. A 2-clique of $J_q(2m, 2)$ of size l is the q-analogue of an l-matching of $K_{2m}$.

Similar to Corollary 12 in [7], G(m, t, m, d) is d-error-detecting and $\lfloor d/2 \rfloor$-error-correcting.

For fixed integers m ≥ k > r, the test-to-item ratio v/n of G(m, t, k, r) (resp. $G_q(m, t, k, r)$) is a strictly decreasing function of t.

Some more examples of $d^e$-disjunct matrices.

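Theorem 3.7 Let 1 ≤ s ≤ d ≤ k ≤ n. Let 1 ≤ q and $e = \binom{k-s}{d-s} - 1$. Then J(n, d, k) is $s^e$-disjunct.

Proof. Note that $\binom{k-s}{d-s} = \binom{k-s}{k-d}$; it is a decreasing sequence in s.

Theorem 3.8

1. $G_q(n, d, k)$ is $s^e$-disjunct.

2. $I_q(n, d, k)$ is $s^e$-disjunct for 1 ≤ s ≤ p, where

$$p = \left\lceil \left( \begin{bmatrix} k \\ d \end{bmatrix}_q - \begin{bmatrix} k-1 \\ d \end{bmatrix}_q \right) \left( \begin{bmatrix} k-1 \\ d \end{bmatrix}_q - \begin{bmatrix} k-2 \\ d \end{bmatrix}_q \right)^{-1} \right\rceil$$

and

$$e = \begin{bmatrix} k \\ d \end{bmatrix}_q - \begin{bmatrix} k-1 \\ d \end{bmatrix}_q - (s - 1) \left( \begin{bmatrix} k-1 \\ d \end{bmatrix}_q - \begin{bmatrix} k-2 \\ d \end{bmatrix}_q \right) - 1.$$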

Theorem 3.9 ([ ], [ ]) For 1 ≤ d ≤ k ≤ n and 1 ≤ r ≤ k, let $\mathcal{K}$ be a family of k-subsets of [n] such that the minimum Hamming distance $d_H(\mathcal{K})$ between any pair of k-sets in $\mathcal{K}$ is at least 2r. Then

1. $J(n, d, \mathcal{K})$ is $d^{\alpha_d - 1}$-disjunct, where $\alpha_d = \min(rd, k - d)$ (Theorem 2).

2. $J(n, d, k, \mathcal{K}, r)$ is $s^e$-disjunct if 1 ≤ s ≤ p, where

$$p = \left\lceil \left( \binom{k}{d} - \binom{k-r}{d} \right) \left( \binom{k-r}{d} - \binom{k-2r}{d} \right)^{-1} \right\rceil$$

and

$$e = \binom{k}{d} - \binom{k-r}{d} - (s - 1) \left( \binom{k-r}{d} - \binom{k-2r}{d} \right) - 1.$$

The following lemma is used in the proof of the following theorem.

Lemma 3.10 Let $\mathcal{K}$ be a family of k-subsets of [n] with $|K \cap K'| \le k - t$ for all K and K' in $\mathcal{K}$. Let d ≥ 1 with t ≥ 1 + t/(k − d) and set $\alpha_d = \min(td, k - d)$. Then, given d + 1 k-sets $\{K_i\}_{i=0}^{d} \subset \mathcal{K}$, there are $\alpha_d$ d-sets $\{D_j\}_{j=1}^{\alpha_d}$ in [n] such that each $D_j$ is contained in $K_0$ and no $D_j$ is contained in $K_i$ for 1 ≤ i ≤ d.

4 Parameters $e_d$ and $e_{\le d}$ for error-correcting

For a binary matrix M of order t × n, let B(D) denote the Boolean sum of those columns indexed by elements of D ⊆ [n], and let $d_H(B(D), B(D'))$ denote the Hamming distance between B(D) and B(D') whenever D and D' are two distinct subsets of [n]. Suppose $B_d(M)$ is the binary matrix consisting of the columns B(S) for all S ⊆ [n] with |S| ≤ d. Let $d_H(B_d(M))$ be the minimum Hamming distance over all pairs of columns of $B_d(M)$. The minimum Hamming distance $d_H(B_d(M))$ is of interest for error tolerance; it has been studied, for example, by Macula. Let

$$e_d = \min_{|D| = |D'| = d,\ D \ne D'} d_H(B(D), B(D'))$$

and

$$e_{\le d} = \min_{\substack{|D| = |D'| \le d,\ D \ne D' \\ D, D' \text{ are antichains}}} d_H(B(D), B(D')).$$

The larger the parameter $e_{\le d}$ is, the better the error-correcting capacity.

The values of these parameters for the matrices G(m, t, k, r) and $G_q(m, t, k, r)$ will be considered in this section. We first treat the case of G(m, t, k, r) by giving a specific example.

Example 4.1 Let m > k, and let $T = \{A_1, A_2, \ldots, A_{k+1}\}$ be a t-clique of J(tm, t) with size k + 1. For each i ∈ [d + 1], let $B_i = T \setminus \{A_i\}$. Then each $B_i$ is a t-clique of J(tm, t) with size k. Let

$$D = \{B_1, B_2, \ldots, B_{d-1}, B_d\} \quad \text{and} \quad D' = \{B_1, B_2, \ldots, B_{d-1}, B_{d+1}\}.$$

Then

$$\begin{aligned} d_H(B(D), B(D')) &= \left|\left\{R \mid R \in \tbinom{B_d}{r},\ R \not\subseteq B_1, B_2, \ldots, B_{d-1}, B_{d+1}\right\}\right| + \left|\left\{R \mid R \in \tbinom{B_{d+1}}{r},\ R \not\subseteq B_1, B_2, \ldots, B_{d-1}, B_d\right\}\right| \\ &= |\{R \mid \{A_1, A_2, \ldots, A_{d-1}, A_{d+1}\} \subseteq R \subseteq B_d\}| + |\{R \mid \{A_1, A_2, \ldots, A_{d-1}, A_d\} \subseteq R \subseteq B_{d+1}\}| \\ &= 2\binom{k-d}{r-d}. \end{aligned}$$

Theorem 4.1 Let m > k > r ≥ d ≥ 1. Then $e_d = e_{\le d} = 2\binom{k-d}{r-d}$ for M = G(m, t, k, r).

Proof. Write $D = \{A_1, \ldots, A_d\}$ and $D' = \{A'_1, \ldots, A'_d\}$. We have

$$\begin{aligned} e_d &= \min_{|D| = |D'| = d} d_H(B(D), B(D')) \\ &\ge \min |\{R \mid R \subseteq A_i \text{ for some } i \in [d] \text{ and } R \not\subseteq A'_j \text{ for } j \in [d]\}| \\ &\quad + \min |\{R \mid R \subseteq A'_i \text{ for some } i \in [d] \text{ and } R \not\subseteq A_j \text{ for } j \in [d]\}| \\ &\ge 2\binom{k-d}{r-d} \end{aligned}$$

by Theorem 3.3. On the other hand, Example 4.1 shows $e_d \le 2\binom{k-d}{r-d}$. Hence $e_d = 2\binom{k-d}{r-d}$ as required.

To show $e_{\le d} = 2\binom{k-d}{r-d}$, we consider two antichains $D = \{A_1, A_2, \ldots, A_u\}$ and $D' = \{A'_1, A'_2, \ldots, A'_v\}$ where u, v ≤ d. Without loss of generality, we may assume that $A_u \notin D'$ and $A'_v \notin D$. By Theorem 3.3 there exist at least $\binom{k-v}{r-v}$ t-cliques with size r contained in $A_u$ but not in $A'_j$ for each j ∈ [v]. By symmetry, we have

$$e_{\le d} \ge \binom{k-v}{r-v} + \binom{k-u}{r-u}.$$

Note that $\binom{k-s}{r-s} \ge \binom{k-d}{r-d}$ if s ≤ d. Hence $e_{\le d} \ge 2\binom{k-d}{r-d}$. On the other hand, by definition, $e_{\le d} \le e_d = 2\binom{k-d}{r-d}$. This yields $e_{\le d} = 2\binom{k-d}{r-d}$. □
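The value of $e_d$ in Theorem 4.1 can be confirmed by exhaustive search when the matrix is small. The sketch below forms the Boolean sum of every d-set of columns of G(m, t, k, r) and takes the minimum Hamming distance over all pairs; the names `t_cliques` and `e_exact` are illustrative assumptions, and the choice t = 1 in the first example (where G(m, 1, k, r) is just the inclusion matrix of r-subsets in k-subsets of an m-set) is made only to keep the search small.

```python
from itertools import combinations
from math import comb

def t_cliques(m, t, size):
    """All families of `size` pairwise disjoint t-subsets of a tm-set."""
    blocks = [frozenset(b) for b in combinations(range(t * m), t)]
    return [frozenset(fam) for fam in combinations(blocks, size)
            if len(frozenset().union(*fam)) == t * size]

def e_exact(m, t, k, r, d):
    """Minimum Hamming distance between the Boolean sums of two distinct
    d-sets of columns of G(m, t, k, r), found by brute force."""
    # A column is represented by its support: the r-subsets of a t-clique of size k.
    cols = [frozenset(frozenset(sub) for sub in combinations(col, r))
            for col in t_cliques(m, t, k)]
    sums = [frozenset().union(*D) for D in combinations(cols, d)]
    # The Hamming distance of two 0/1 vectors is the size of the symmetric
    # difference of their supports.
    return min(len(a ^ b) for a, b in combinations(sums, 2))

print(e_exact(5, 1, 3, 2, 2), 2 * comb(3 - 2, 2 - 2))  # 2 2
print(e_exact(5, 1, 3, 2, 1), 2 * comb(3 - 1, 2 - 1))  # 4 4
print(e_exact(3, 2, 2, 1, 1), 2 * comb(2 - 1, 1 - 1))  # 2 2
```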

A similar result holds for $G_q(m, t, k, r)$. The proof is similar to that of Theorem 4.1 and will be omitted.

Theorem 4.2 Let m > k > r ≥ d ≥ 1. Then $e_d = e_{\le d} = 2\binom{k-d}{r-d}$ for M = $G_q(m, t, k, r)$.

References

[1] A. E. Brouwer, A. M. Cohen and A. Neumaier, Distance-Regular Graphs, Springer-Verlag, Berlin, Heidelberg, 1989.

[2] A. G. D'yachkov, F. K. Hwang, A. J. Macula, P. A. Vilenkin and C. Weng, A construction of pooling designs with some happy surprises, J. Computational Biology 12 (2005), 1129-1136.

[3] T. Huang and C. Weng, A note on decoding of superimposed codes, J. Comb. Optim. 7 (2003), no. 4, 381-384.

[4] T. Huang and C. Weng, Pooling spaces and non-adaptive pooling designs, Discrete Math. 282 (2004), 163-169.

[5] A. J. Macula, A simple construction of d-disjunct matrices with certain constant weights, Discrete Math. 162 (1996), 311-312.

[6] A. J. Macula, Error-correcting nonadaptive group testing with $d^e$-disjunct matrices, Discrete Appl. Math. 80 (1997), 217-222.

[7] H. Ngo and D. Du, New constructions of non-adaptive and error-tolerance pooling designs, Discrete Math. 243 (2002), 161-170.

[8] H. Ngo and D. Du, A survey on combinatorial group testing algorithms with applications to DNA library screening, DIMACS Ser. Discrete Math. Theoretical Comp. Sci. 55 (2000), 171-182.

[9] A. D'yachkov, P. Vilenkin and A. Macula, Nonadaptive Group Testing with Error-Correcting $d^e$-Disjunct Inclusion Matrices, Bolyai Society Mathematical
