Pooling spaces and non-adaptive pooling designs


Discrete Mathematics 282 (2004) 163–169



Tayuan Huang, Chih-wen Weng

Department of Applied Mathematics, National Chiao Tung University, Hsinchu, Taiwan China

Received 19 March 2002; received in revised form 29 October 2003; accepted 10 November 2003

Abstract

A pooling space is defined to be a ranked partially ordered set with atomic intervals. We show how to construct non-adaptive pooling designs from a pooling space. Our pooling designs are e-error detecting for some e; moreover, e can be chosen to be very large compared with the maximal number of defective items. Eight new classes of non-adaptive pooling designs are given, which are related to the Hamming matroid, the attenuated space, and six classical polar spaces. We show how to construct a new pooling space from one or two given pooling spaces.


Keywords: Pooling space; Pooling design; Ranked partially ordered set; Atomic interval

1. Introduction

The basic problem of group testing is to identify the set of defective items in a large population of items. A group testing algorithm is non-adaptive if all tests must be specified without knowing the outcomes of other tests. Non-adaptive group testing algorithms are useful in many areas; one example is the problem of DNA library screening. Suppose we have n items to be tested and that there are at most d defective items among them. Each test (or pool) is a subset of items. The output of a pool is positive if and only if it contains at least one of the defective items, and the goal is to determine all of the defectives in t tests. A mathematical model of the non-adaptive group testing design for this problem is a t × n d-disjunct matrix (see Section 2).

In this paper, we define a pooling space to be a ranked partially ordered set which has atomic intervals. We show how to construct d-disjunct matrices from a pooling space. These d-disjunct matrices have a special property: if we view them as (d − 1)-disjunct matrices, then they detect e errors for some positive integer e. As our examples show, the number e is very large compared to d. Macula [7,8] gave a construction of d-disjunct matrices from the poset consisting of the subsets of a finite set. Ngo and Du [10] gave a construction of d-disjunct matrices from the poset consisting of the subspaces of a vector space. Our construction is a generalization of their results. This type of generalization was initially proposed by Ngo and Du [11, p. 177].

2. Preliminaries

Let M be a t × n matrix over {0, 1}. In this paper we frequently associate each row i (resp. column j) with the set of all column indices j (resp. row indices i) such that M_ij = 1. M is said to be d-disjunct if the union of any d columns does not contain another column. A d-disjunct t × n matrix M can be used to design a non-adaptive group testing algorithm on n items by associating the column indices with the items and the row indices with the tests. If M_ij = 1 then item j is contained in test i. Let M be a d-disjunct matrix. The weight wt(u) of a column vector or a row vector u of M is the number of 1s in u.

Example 2.1. We can easily check that

\[
M = \begin{pmatrix}
1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1
\end{pmatrix}
\]

is 2-disjunct, since the union of any two columns of M does not contain any one of the remaining two columns. Each column of M has weight 3 and each row of M has weight 2.

Let M be a t × n d-disjunct matrix. For a set S ⊆ {1, 2, …, n} with |S| ≤ d, S represents the set of defective items, and the output o(S) of S in M is the union of those columns indexed by S. For example, o({2, 3}) = (1, 1, 1, 0, 1, 1)^t with M as above (Example 2.1). Kautz and Singleton [6] gave a simple algorithm to identify the set S from its test result u = o(S). In set notation, the algorithm can be written as

S = {j | C_j ⊆ u},    (2.1)

where C_1, C_2, …, C_n are the columns of M. The design of a d-disjunct matrix is also called a non-adaptive pooling design.
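The d-disjunct condition and the decoding rule (2.1) can be checked mechanically on the matrix of Example 2.1. The following is a minimal sketch, not part of the original construction: the function names `column_sets`, `is_d_disjunct`, `output` and `decode` are illustrative choices, and indices are 0-based, so the items 2 and 3 of the text appear as 1 and 2.

```python
from itertools import combinations

# Matrix of Example 2.1: rows are tests, columns are items.
M = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

def column_sets(matrix):
    """Return each column as the set of row indices that hold a 1."""
    n = len(matrix[0])
    return [{i for i, row in enumerate(matrix) if row[j]} for j in range(n)]

def is_d_disjunct(matrix, d):
    """Check that no union of d columns contains another column."""
    cols = column_sets(matrix)
    for chosen in combinations(range(len(cols)), d):
        union = set().union(*(cols[j] for j in chosen))
        for j, col in enumerate(cols):
            if j not in chosen and col <= union:
                return False
    return True

def output(matrix, defectives):
    """Test outcome o(S): the union of the columns indexed by S."""
    cols = column_sets(matrix)
    return set().union(*(cols[j] for j in defectives))

def decode(matrix, outcome):
    """Rule (2.1): item j is reported iff its column lies inside the outcome."""
    return {j for j, col in enumerate(column_sets(matrix)) if col <= outcome}
```

With these definitions, `decode(M, output(M, S))` returns `S` for every set `S` of at most two items, which is the behaviour rule (2.1) promises for a 2-disjunct matrix.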

A t × n matrix M over {0, 1} is (d, e)-disjunct if for any d + 1 columns C_0, C_1, …, C_d of M there are at least e + 1 elements in

\[
C_0 - \bigcup_{i=1}^{d} C_i.
\]

In particular, (d, 0)-disjunct is d-disjunct. In Example 2.1, M is (2, 0)-disjunct and (1, 1)-disjunct, but M is not (2, 1)-disjunct. From a coding theory point of view, a (d, e)-disjunct matrix is equivalent to a superimposed distance code with strength d and distance e + 1. See [3,4] for details.

We show that a (d, e)-disjunct matrix can be used to construct a non-adaptive pooling design that can detect e errors and correct ⌊e/2⌋ errors. Let M be a (d, e)-disjunct t × n matrix. Let S, T ⊆ {1, 2, …, n} be two distinct subsets, each with at most d elements. We show the Hamming distance of the test results o(S) and o(T) is at least e + 1. At least one of S − T, T − S is nonempty, so assume S − T ≠ ∅. Pick j ∈ S − T. We can find e + 1 positions i such that M_ij = 1 and M_ik = 0 for all k ∈ T. Hence o(S) and o(T) have Hamming distance at least e + 1.
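The claims about Example 2.1 and the distance argument above can be verified by brute force. The sketch below is illustrative only: `disjunct_margin` and `min_output_distance` are hypothetical helper names, and `disjunct_margin` assumes d is smaller than the number of columns.

```python
from itertools import combinations

# Matrix of Example 2.1: rows are tests, columns are items.
M = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]

def column_sets(matrix):
    """Return each column as the set of row indices that hold a 1."""
    n = len(matrix[0])
    return [{i for i, row in enumerate(matrix) if row[j]} for j in range(n)]

def disjunct_margin(matrix, d):
    """Largest e such that the matrix is (d, e)-disjunct (assumes d < #columns)."""
    cols = column_sets(matrix)
    n = len(cols)
    worst = len(matrix)
    for c0 in range(n):
        others = [j for j in range(n) if j != c0]
        for chosen in combinations(others, d):
            covered = set().union(*(cols[j] for j in chosen))
            # Rows of C_0 left uncovered by the other d columns.
            worst = min(worst, len(cols[c0] - covered))
    return worst - 1

def min_output_distance(matrix, d):
    """Smallest Hamming distance between o(S) and o(T), S != T, |S|, |T| <= d."""
    cols = column_sets(matrix)
    subsets = [s for r in range(d + 1) for s in combinations(range(len(cols)), r)]
    outs = [set().union(*(cols[j] for j in s)) for s in subsets]
    return min(len(a ^ b) for a, b in combinations(outs, 2))
```

For the matrix of Example 2.1, `disjunct_margin(M, 2)` evaluates to 0 and `disjunct_margin(M, 1)` to 1, in agreement with the statement that M is (2, 0)-disjunct and (1, 1)-disjunct but not (2, 1)-disjunct.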

We now give the basic definitions and properties of a partially ordered set. The expert may want to skip the remainder of this section and go to the next section.

Let P denote a finite set. By a partial order on P, we mean a binary relation ≤ on P such that

(i) x ≤ x (∀x ∈ P),

(ii) x ≤ y and y ≤ z → x ≤ z (∀x, y, z ∈ P),

(iii) x ≤ y and y ≤ x → x = y (∀x, y ∈ P).

By a partially ordered set (or poset, for short), we mean a pair (P, ≤), where P is a finite set, and where ≤ is a partial order on P. By abusing notation, we will suppress reference to ≤, and just write P instead of (P, ≤).

Let P denote a poset, with partial order ≤, and let x and y denote any elements in P. As usual, we write x < y whenever x ≤ y and x ≠ y. We say y covers x whenever x < y, and there is no z ∈ P such that x < z < y. An element x ∈ P is said to be minimal whenever there is no y ∈ P such that y < x. Let min(P) denote the set of all minimal elements in P. Whenever min(P) consists of a single element, we denote it by 0, and we say P has the least element 0.

Throughout the paper we assume P is a poset with the least element 0. By an atom in P, we mean an element in P that covers 0. We let A_P denote the set of atoms in P. By a rank function on P, we mean a function

rank : P → N ∪ {0}

such that rank(0) = 0, and such that for all x, y ∈ P, y covers x implies rank(y) − rank(x) = 1. Observe the rank function is unique if it exists. P is said to be ranked whenever P has a rank function. In this case, we set

rank(P) := max{rank(x) | x ∈ P},

P_i := {x | x ∈ P, rank(x) = i} (i ∈ N ∪ {0}),

and observe P_0 = {0}, P_1 = A_P.

Let P denote any finite poset, and let S denote any subset of P. Then there is a unique partial order on S such that for all x, y ∈ S, x ≤ y in S if and only if x ≤ y in P. This partial order is said to be induced from P. By a subposet of P, we mean a subset of P, together with the partial order induced from P. Pick any x, y ∈ P such that x ≤ y. By the interval [x, y], we mean the subposet

[x, y] := {z | z ∈ P, x ≤ z ≤ y} of P.

Let P denote any poset, and let S be a subset of P. Fix z ∈ P. Then z is said to be an upper bound of S if z ≥ x for all x ∈ S. Suppose the subposet of upper bounds of S has a unique minimal element. In this case we call this element the least upper bound of S.

Suppose P is ranked. Then P is said to be atomic whenever for each element x of P, x is the least upper bound of [0, x] ∩ P_1.

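Let q be a positive integer. Fix a positive integer N. The Gaussian binomial coefficient with basis q is defined by

\[
\begin{bmatrix} N \\ i \end{bmatrix}_q =
\begin{cases}
\displaystyle\prod_{j=0}^{i-1} \frac{N - j}{i - j} & \text{if } q = 1, \\[2ex]
\displaystyle\prod_{j=0}^{i-1} \frac{q^N - q^j}{q^i - q^j} & \text{if } q \neq 1.
\end{cases}
\]

In the case q = 1, for convenience, we write \binom{N}{i} instead of \begin{bmatrix} N \\ i \end{bmatrix}_1. Now assume q = 1, or a prime power. Set

\[
L_q(N) =
\begin{cases}
\text{all subsets of } \{1, 2, \ldots, N\} & \text{if } q = 1, \\
\text{all subspaces of } GF(q)^N & \text{if } q \text{ is a prime power},
\end{cases}
\]

where GF(q) is the finite field of q elements. Let P = L_q(N) be a poset with the usual set inclusion order. Note that

\[
\begin{bmatrix} N \\ i \end{bmatrix}_q = |P_i|.
\]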
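The two-case definition of the Gaussian binomial coefficient translates directly into a short routine. This is a sketch for small parameters only; the name `gaussian_binomial` and the convention of returning 0 when i is outside 0..N are choices made here, not taken from the paper.

```python
from math import comb

def gaussian_binomial(N, i, q):
    """Gaussian binomial coefficient [N choose i]_q; q = 1 gives the ordinary binomial."""
    if i < 0 or i > N:
        return 0  # convention assumed for this sketch
    if q == 1:
        return comb(N, i)
    num = den = 1
    for j in range(i):
        num *= q**N - q**j   # numerator factors  q^N - q^j
        den *= q**i - q**j   # denominator factors q^i - q^j
    return num // den        # the quotient is always an integer
```

For instance, `gaussian_binomial(4, 2, 2)` counts the 2-dimensional subspaces of GF(2)^4, of which there are 35.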

3. Construct (d, e)-disjunct matrices

Let P be a poset. For any w ∈ P, define

w^+ = {y ≥ w | y ∈ P}.

A pooling space is a ranked poset P such that w^+ is atomic for all w ∈ P. In particular a pooling space is atomic. If P is a pooling space, then so is w^+ for any w ∈ P. We show how to construct d-disjunct matrices from a pooling space in this section.

Theorem 3.1. Let P be a pooling space with rank D ≥ 1. Fix an element x ∈ P_D and fix an integer d (1 ≤ d ≤ D). Let T ⊆ P_D be a subset such that |T| ≤ d and x ∉ T. Then there exists an element y ∈ [0, x] ∩ P_d such that y ≰ z for all z ∈ T.

Proof. We prove the theorem by induction on D. If D = 1 then d = 1 and the theorem holds by setting y = x. In general, pick an element z ∈ T. Then x ≠ z by assumption. Since x is the least upper bound of [0, x] ∩ P_1 and x ≰ z, z is not an upper bound of [0, x] ∩ P_1. Hence we can pick an element w ∈ [0, x] ∩ P_1 such that w ≰ z. Then T ∩ w^+ has at most d − 1 elements. In the pooling space w^+, the element x and the elements of T ∩ w^+ all have rank D − 1, and the elements of w^+ ∩ P_d have rank d − 1. Hence by induction, we can choose y ∈ [w, x] ∩ P_d such that y ≰ u for all u ∈ T ∩ w^+. Note that y ≰ u for all u ∈ T − w^+ as well, since w ≤ y and w ≰ u; hence y ≰ z for all z ∈ T.

With notation in Theorem 3.1, observe for any integer ℓ (d ≤ ℓ ≤ D), each element w ∈ [y, x] ∩ P_ℓ satisfies w ≤ x and w ≰ z for all z ∈ T. Hence the characteristic matrix of the binary relation ≤ induced on the subposet P_ℓ ∪ P_D of a pooling space P is in fact (d, e)-disjunct, where the number e + 1 is the minimal number in counting such w. More precisely, we state this as the following corollary.

Corollary 3.2. Let P be a pooling space with rank D. Fix an integer ℓ (1 ≤ ℓ ≤ D). Let M = M(D, ℓ) be the matrix over {0, 1} whose rows (resp. columns) are indexed by P_ℓ (resp. P_D) such that M_uv = 1 iff u ≤ v. Then for each integer d (1 ≤ d ≤ ℓ), M is (d, e)-disjunct, where

\[
e = \min \Bigl| \bigcup_{y} \bigl( [y, x] \cap P_\ell \bigr) \Bigr| - 1,
\]

with the minimum taken over all pairs (x, T) such that x ∈ P_D, T ⊆ P_D, x ∉ T, |T| ≤ d, and with the union taken over all y such that y ∈ P_d, y ≤ x, y ≰ z for all z ∈ T.

Note that the truncation of a pooling space is a pooling space. That is, if P is a pooling space with rank D, then P_0 ∪ P_1 ∪ ⋯ ∪ P_k is a pooling space with rank k for each k (0 ≤ k ≤ D). Hence in the above construction of M we can choose any k (ℓ ≤ k ≤ D) and use P_k to replace P_D. The definition of e in Corollary 3.2 seems complicated. However, in our examples in the next section the number |[y, x] ∩ P_ℓ| is a constant.

4. Examples

In this section, we give some examples of pooling spaces P with rank D. All of these examples are quantum matroids with the base q [13], where q is 1 or a prime power. The number |P_i| can be computed from results given in [13]; we omit the details of the computation. For integers 1 ≤ d ≤ ℓ ≤ k ≤ D, the examples produce (d, e)-disjunct matrices M = M(k, ℓ) of size t × n, where t = |P_ℓ|, n = |P_k| and

\[
e = \begin{bmatrix} k - d \\ \ell - d \end{bmatrix}_q - 1.
\]

The weight of each column of M is

\[
\begin{bmatrix} k \\ \ell \end{bmatrix}_q,
\]

and the weight of each row of M is

\[
\frac{|P_k|}{|P_\ell|} \begin{bmatrix} k \\ \ell \end{bmatrix}_q.
\]

4.1. The Hamming matroid H(D, N) (2 ≤ N) [2,12]

Set A = A_1 ∪ A_2 ∪ ⋯ ∪ A_D (disjoint union), where |A_i| = N (1 ≤ i ≤ D).

P = {x | x ⊆ A, |x ∩ A_i| ≤ 1 for all i (1 ≤ i ≤ D)},

x ≤ y whenever x is a subset of y (x, y ∈ P),

rank(x) = |x| (x ∈ P),

\[
|P_i| = \binom{D}{i} N^i.
\]
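For small parameters the Hamming matroid can be enumerated directly, which allows the rank sizes and the weights and error-detection number stated at the start of this section to be checked numerically in the case q = 1. The sketch below is a brute-force illustration under that assumption; the helper names `hamming_matroid`, `design_matrix` and `disjunct_margin`, and the choice D = 3, N = 2, k = 3, ℓ = 2, are made here for the example only.

```python
from itertools import combinations
from math import comb

def hamming_matroid(D, N):
    """Elements of H(D, N) grouped by rank: subsets of A with at most one point per block A_i."""
    ground = [(block, item) for block in range(D) for item in range(N)]
    by_rank = {i: [] for i in range(D + 1)}
    for size in range(D + 1):
        for x in combinations(ground, size):
            blocks = {b for b, _ in x}
            if len(blocks) == size:  # no block A_i is hit twice
                by_rank[size].append(frozenset(x))
    return by_rank

def design_matrix(by_rank, k, l):
    """M(k, l): rows indexed by P_l, columns by P_k, entry 1 iff row element <= column element."""
    return [[1 if u <= v else 0 for v in by_rank[k]] for u in by_rank[l]]

def disjunct_margin(matrix, d):
    """Largest e such that the matrix is (d, e)-disjunct (brute force, assumes d < #columns)."""
    n = len(matrix[0])
    cols = [{i for i, row in enumerate(matrix) if row[j]} for j in range(n)]
    worst = len(matrix)
    for c0 in range(n):
        others = [j for j in range(n) if j != c0]
        for chosen in combinations(others, d):
            covered = set().union(*(cols[j] for j in chosen))
            worst = min(worst, len(cols[c0] - covered))
    return worst - 1

D, N, k, l = 3, 2, 3, 2
P = hamming_matroid(D, N)
M = design_matrix(P, k, l)  # a 12 x 8 matrix for these parameters
```

Comparing `disjunct_margin(M, d)` with `comb(k - d, l - d) - 1` for d = 1, 2 gives a quick consistency check of the stated value of e; the search is exponential in d, so it is only practical for very small D and N.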


4.2. The attenuated space A_q(D, N) (D ≤ N) [2,5]

Let V denote a vector space of dimension N over the field GF(q), and fix a subspace w ⊆ V of dimension N − D.

P = {x | x is a subspace of V, x ∩ w = 0},

x ≤ y whenever x is a subspace of y (x, y ∈ P),

rank(x) = dim(x) (x ∈ P),

\[
|P_i| = \begin{bmatrix} D \\ i \end{bmatrix}_q q^{i(N-D)}.
\]

4.3. The classical polar spaces of rank D over GF(q) [1]

Let V denote a vector space over the field GF(q), and assume V possesses a given non-degenerate form. We call a subspace of V isotropic whenever the form vanishes completely on that subspace. The maximal isotropic subspaces have the same dimension, denoted by D.

P = {x | x is an isotropic subspace of V},

x ≤ y whenever x is a subspace of y (x, y ∈ P),

rank(x) = dim(x) (x ∈ P),

\[
\begin{array}{llll}
\text{Name} & \dim V & \text{Form} & |P_i| \\
\hline
B_D(q) & 2D + 1 & \text{Quadratic} & \begin{bmatrix} D \\ i \end{bmatrix}_q (1 + q^{D})(1 + q^{D-1}) \cdots (1 + q^{D-i+1}) \\
C_D(q) & 2D & \text{Alternating} & \begin{bmatrix} D \\ i \end{bmatrix}_q (1 + q^{D})(1 + q^{D-1}) \cdots (1 + q^{D-i+1}) \\
D_D(q) & 2D & \text{Quadratic (Witt index } D) & \begin{bmatrix} D \\ i \end{bmatrix}_q (1 + q^{D-1})(1 + q^{D-2}) \cdots (1 + q^{D-i}) \\
{}^2D_{D+1}(q) & 2D + 2 & \text{Quadratic (Witt index } D) & \begin{bmatrix} D \\ i \end{bmatrix}_q (1 + q^{D+1})(1 + q^{D}) \cdots (1 + q^{D-i+2}) \\
{}^2A_{2D}(r) & 2D + 1 & \text{Hermitian } (q = r^2) & \begin{bmatrix} D \\ i \end{bmatrix}_q (1 + q^{D+1/2})(1 + q^{D-1/2}) \cdots (1 + q^{D-i+3/2}) \\
{}^2A_{2D-1}(r) & 2D & \text{Hermitian } (q = r^2) & \begin{bmatrix} D \\ i \end{bmatrix}_q (1 + q^{D-1/2})(1 + q^{D-3/2}) \cdots (1 + q^{D-i+1/2})
\end{array}
\]

5. Pooling polynomials

Let P be a pooling space with rank D. The ratio |P_ℓ|/|P_k| is the main concern of the construction of pooling designs, and the structure of P is less important. With this motivation, we give the following definition.

Definition 5.1. Let P be a pooling space with rank D. The pooling polynomial of P is

\[
f_P(x) := \sum_{i=0}^{D} |P_i| x^i.
\]

Note that the constant term of a pooling polynomial is always 1. With lexicographical order, 1 and 1 + x are the first two pooling polynomials.

Let P′, P″ be pooling spaces with rank D′, D″, respectively. We define the direct sum P′ + P″ of P′ and P″ as follows. The element set of P′ + P″ is the disjoint union of P′ and P″ except that the 0 of P′ and the 0 of P″ are identified. Hence P′ + P″ has |P′| + |P″| − 1 elements. The partial order of P′ + P″ is naturally inherited from P′ and P″. It is easy to see P′ + P″ is a pooling space with rank max{D′, D″}.

We define the product P′ ⊗ P″ of P′ and P″ as follows. The element set of P = P′ ⊗ P″ is

{(a, b) | a ∈ P′, b ∈ P″}.

The partial order in P′ ⊗ P″ is defined by

(a, b) ≤ (c, d) iff a ≤ c and b ≤ d,

for any a, c ∈ P′ and any b, d ∈ P″. It is easy to see that for any a, c ∈ P′ and b, d ∈ P″, the following (i)–(iii) hold.

(i) rank((a, b)) = rank(a) + rank(b);

(ii) [0, (a, b)] ∩ P_1 = {(a_1, 0), …, (a_r, 0), (0, b_1), …, (0, b_s)}, where {a_1, …, a_r} = [0, a] ∩ P′_1 and {b_1, …, b_s} = [0, b] ∩ P″_1;

(iii) [(a, b), (c, d)] = [a, c] ⊗ [b, d].

We conclude from (i)–(iii) above that P′ ⊗ P″ is a pooling space with rank D′ + D″.

Note that if P is a pooling space then so is P \ w^+ for any w ∈ P. Let f be a pooling polynomial. By a reduction of f, we mean a polynomial obtained by replacing the leading coefficient of f by a smaller non-negative integer. We immediately have the following theorem.

Theorem 5.2. Let F be the set of pooling polynomials. Suppose f_1(x), f_2(x) ∈ F. Then the following (i)–(iii) hold.

(i) A reduction of f1(x) is in F;

(ii) f1(x) + f2(x) − 1 ∈ F;

(iii) f1(x)f2(x) ∈ F.

Theorem 5.2 provides us a few ways to construct more pooling polynomials and corresponding pooling designs.

Example 5.3. (1 + 3x + 2x^2)^m is a pooling polynomial, since it can be obtained from the pooling polynomial 1 + x by using products and reductions as shown in the equation

\[
(1 + 3x + 2x^2)^m = \bigl( ((1 + x)^3 - x^3) - x^2 \bigr)^m.
\]
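The three operations of Theorem 5.2 act on coefficient lists, so the derivation in Example 5.3 can be replayed step by step. The following is a small sketch in which a polynomial is stored as its list of coefficients `[c_0, c_1, ...]`; the helper names `poly_mul`, `poly_pow`, `reduce_leading` and `direct_sum` are illustrative and not from the paper.

```python
def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists (Theorem 5.2 (iii))."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def poly_pow(f, m):
    """m-th power of f by repeated multiplication."""
    out = [1]
    for _ in range(m):
        out = poly_mul(out, f)
    return out

def reduce_leading(f, new_coeff):
    """Reduction (Theorem 5.2 (i)): lower the leading coefficient to new_coeff >= 0."""
    if not 0 <= new_coeff < f[-1]:
        raise ValueError("new coefficient must be non-negative and smaller than the old one")
    g = f[:-1] + [new_coeff]
    while len(g) > 1 and g[-1] == 0:  # drop a vanished leading term
        g.pop()
    return g

def direct_sum(f, g):
    """f + g - 1 (Theorem 5.2 (ii)), the polynomial of a direct sum."""
    out = [0] * max(len(f), len(g))
    for i, a in enumerate(f):
        out[i] += a
    for i, b in enumerate(g):
        out[i] += b
    out[0] -= 1
    return out

cube = poly_pow([1, 1], 3)        # (1 + x)^3 = 1 + 3x + 3x^2 + x^3
step1 = reduce_leading(cube, 0)   # remove the x^3 term
step2 = reduce_leading(step1, 2)  # lower 3x^2 to 2x^2, giving 1 + 3x + 2x^2
```

Raising `step2` to any power m with `poly_pow` then gives the coefficient list of the pooling polynomial of Example 5.3.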

6. Concluding remarks

We construct (d, e)-disjunct matrices from a pooling space in Section 3. Some examples of pooling spaces are given in Section 4. By checking these examples, the ratio t/n = |P_ℓ|/|P_k| is small and the error-tolerance number e is large if ℓ, k are well chosen. However, it seems that d is too small compared to n in all these examples. We show how to construct a new pooling space from given pooling spaces in Section 5. This can be used to obtain a pooling space with a desired |P_i| range.

Of course, our list of pooling spaces is not exhaustive. We expect that many pooling spaces remain unknown and that a complete list is unlikely to be obtained. We give another class to show that this line of study may involve number theory. Fix a positive integer m, and set

P = {i | 2 ≤ i ≤ m, and i is an integer which contains no square factors}.

The partial order in P is defined by

i ≤ j iff i divides j.

By identifying an element in P with a subset of primes, the poset P can be obtained from the infinite poset consisting of all the subsets of primes and then deleting each subposet w^+ for each integer w > m (in the natural ordering of the integers). It can be easily checked that P is a pooling space. However, |P_i| is not likely to be expressible by a nice formula.
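Although no closed formula is expected, the rank sizes of this poset are easy to tabulate for a given m, since the rank of a square-free integer is its number of prime factors. The sketch below uses trial division; the function names are illustrative, and the integer 1 (the empty set of primes) is included as the rank-0 least element, which is an assumption of this sketch because the displayed definition starts at 2.

```python
from collections import Counter

def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

def squarefree_rank_sizes(m):
    """List [|P_0|, |P_1|, ...] for square-free integers up to m, ranked by number of primes.

    The integer 1 is counted as the single rank-0 element (assumption of this sketch).
    """
    sizes = Counter()
    for i in range(1, m + 1):
        f = prime_factors(i)
        if len(set(f)) == len(f):  # square-free: no repeated prime
            sizes[len(f)] += 1
    return [sizes[r] for r in range(max(sizes) + 1)]
```

For m = 30 this gives the list `[1, 10, 7, 1]`: the integer 1, the ten primes up to 30, seven products of two distinct primes, and the single product 30 = 2 · 3 · 5.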

Another interesting problem is to find an effective decoding algorithm for the set S ⊆ {1, 2, …, n} of defective items from its output u with at most ⌊e/2⌋ errors in a (d, e)-disjunct matrix M. This will be a generalization of the well-known decoding algorithm in the d-disjunct case. See [6] for details.

A class of pooling spaces related to the Hermitian form graphs is constructed in [14]. All examples of the pooling spaces we mentioned in this paper have the additional property of being a (meet) semi-lattice; this means that any two elements have a greatest lower bound. To close the paper, we propose the following problem: find a pooling space which is not a semi-lattice.

Acknowledgements

The authors wish to express their sincere thanks to Frank Hwang and anonymous referees for very helpful comments and suggestions that led to a considerable improvement of this paper.

References

[1] P. Cameron, Projective and polar spaces, QMW Math. Notes, Vol. 13, University of London, London, 1992.

[2] P. Delsarte, Association schemes and t-designs in regular semi-lattice, J. Combin. Theory Ser. A 20 (1976) 230–243.

[3] A.G. D’yachkov, A.J. Macula, P.A. Vilenkin, Nonadaptive group testing with error-correction d^e-disjunct inclusion matrices, preprint.

[4] A.G. D’yachkov, V. Rykov, Superimposed distance codes, Probl. Control Inform. Theory 18 (4) (1989) 237–250.

[5] T. Huang, A characterization of the association schemes of bilinear forms, European J. Combin. 8 (1987) 159–173.

[6] W.H. Kautz, R.R. Singleton, Nonadaptive binary superimposed codes, IEEE Trans. Inform. Theory 10 (1964) 363–377.

[7] A.J. Macula, A simple construction of d-disjunct matrices with certain constant weights, Discrete Math. 162 (1996) 311–312.

[8] A.J. Macula, Probabilistic nonadaptive group testing in the presence of errors and DNA library screening, Ann. Combin. 3 (1999) 61–69.

[9] A.J. Macula, P.A. Vilenkin, Constructions of superimposed codes based on incidence structures, IEEE ISIT, Sorrento, Italy, June 25–30, 2000.

[10] H. Ngo, D. Du, New constructions of non-adaptive and error-tolerance pooling designs, Discrete Math. 243 (2002) 161–170.

[11] H. Ngo, D. Du, A survey on combinatorial group testing algorithms with applications to DNA library screening, DIMACS Ser. Discrete Math. Theoretical Comp. Sci. 55 (2000) 171–182.

[12] P. Terwilliger, The incidence algebra of a uniform poset, coding theory and design theory, Part I: Coding Theory, IMA Volumes in Mathematics and its Applications, Vol. 20, Springer, New York, 1990, pp. 193–212.

[13] P. Terwilliger, Quantum matroids, progress in algebraic combinatorics, Fukuoka, 1993, pp. 323–441; Adv. Stud. Pure Math., Vol. 24, Mathematical Society of Japan, Tokyo, 1996.