離散條件機率分配之相容性研究 (On Compatibility of Discrete Conditional Distributions) - 政大學術集成 (NCCU Academic Hub)

National Chengchi University, Department of Statistics, Doctoral Dissertation

離散條件機率分配之相容性研究
On compatibility of discrete conditional distributions

Advisor: Dr. Yi-Ching Yao (姚怡慶)
Student: Shih-Chieh Chen (陳世傑)

July 2015 (Republic of China year 104)

Acknowledgments

Studying under Professor Yao, I learned mathematics as much as statistics, and more than mathematics I learned an attitude toward work: he approaches every task the way he works a mathematics problem, methodically and in good order. I am rather undisciplined by nature, and since neither my undergraduate nor my graduate training was in mathematics or statistics, I found it very hard when I began writing this thesis under his guidance. Through his patient instruction over many years he changed me, and only then could this thesis come into being.

I was fortunate that Professors 鄭宗記, 翁久幸, 洪英超, 陳宏 and 程毅豪 agreed to serve on my oral examination committee; their valuable comments greatly improved the presentation of this thesis.

My junior classmates 泰期 and 宇翔 were indispensable: they helped me enormously with the programming, and without them this thesis could not have been completed.

While I was writing, my wife 月霞 and my son 昶廷 helped with the typing, which allowed the thesis to be finished on schedule. I am also indebted to Professor 劉惠美, whose encouragement brought me into the field of statistics.

My doctoral studies at National Chengchi University took ten years. The process was hard, but the sword has finally been forged.

Abstract

For two discrete random variables X1 and X2 taking values in {1, . . . , I} and {1, . . . , J}, respectively, a putative conditional model for the joint distribution of X1 and X2 consists of two I × J matrices representing the conditional distributions of X1 given X2 and of X2 given X1. We say that two conditional distributions (matrices) A and B are compatible if there exists a joint distribution of X1 and X2 whose two conditional distributions are exactly A and B. We present new versions of necessary and sufficient conditions for compatibility of discrete conditional distributions via a graphical representation. Moreover, we show that there is a unique joint distribution for two given compatible conditional distributions if and only if the corresponding graph is connected. Markov chain characterizations are also presented.

Keywords: compatibility of conditional distributions, graph theory, connectedness, spanning tree, Gibbs sampler, MCMC.

Contents

1 Introduction
2 Compatible conditional distributions
  2.1 Compatibility
  2.2 Review of the ratio matrix approach for compatibility between two conditional distributions
3 Graphical representation approach
  3.1 Graphical representation
  3.2 Compatibility of a ratio set R and characterization of probability distributions satisfying R
  3.3 The relation between the ratio matrix approach and graphical representation approach
4 Markov chain characterizations
  4.1 Compatibility by the Gibbs sampler
  4.2 Simulations
5 Conclusions
References

1 Introduction

The problem of characterizing a joint distribution by conditional distributions has been extensively studied in the last few decades. Applications may be found in the classical construction of joint distributions and in the elicitation of Bayesian multiparameter prior distributions. We first consider the simple case of two random variables (X, Y) whose joint distribution is to be determined, given the two conditional distributions P_{X|Y} and P_{Y|X}. For example, Toffoli et al. (2006) investigated the relationship between adverse reaction to drug treatment and genotype in the gene region UGT1A1*28 for metastatic colorectal cancer patients. Drug treatment with irinotecan has been found to be effective for metastatic colorectal cancer patients. However, irinotecan is toxic, and individual reactions to the toxicity are known to differ. Genotype in the gene region UGT1A1*28 is known to be associated with adverse reaction. Let X be the genotype with three kinds: TA7/TA7, TA7/TA6 and TA6/TA6. Let Y be the adverse reaction with four levels: severe, moderate, mild and nil. Two conditional models are commonly used by clinicians: the diagnostic conditional model P_{X|Y} and the treatment conditional model P_{Y|X}. To obtain full information about the probability model, it is necessary to construct a joint distribution from P_{X|Y} and P_{Y|X}. Since in practice P_{X|Y} and P_{Y|X} are either estimated or hypothesized, there may not exist a joint distribution which has the given P_{X|Y} and P_{Y|X} as its conditional distributions. If there is a joint distribution which has P_{X|Y} and P_{Y|X} as its conditional distributions, then P_{X|Y} and P_{Y|X} are said to be compatible. Therefore, it is desired (i) to determine whether the given P_{X|Y} and P_{Y|X} are compatible, and (ii) to find all such (compatible) joint distributions when the given conditional distributions are compatible.
The compatibility problem is most easily visualized in the finite discrete case. A convenient introduction to the topic may be found in Arnold and Press (1989). They proposed the "ratio matrix" of two discrete conditional distributions to check compatibility, and showed that two discrete conditional distributions are compatible if and only if the ratio matrix is of rank one. However, this version can be used only for discrete conditional distributions without zero elements; if the conditional distributions contain zero elements, the method cannot be applied. Song, Li, Chen, Jiang and Kuo (2010) proposed the "positive extension ratio matrix" of two discrete conditional distributions to solve this problem. "Positive extension" means inserting positive values for the undefined elements in a ratio matrix. They showed that two discrete conditional distributions are compatible if and only if there exists a positive extension ratio matrix of rank one. They also found that there is a unique joint distribution for two given compatible conditional distributions if and only if the corresponding ratio matrix is irreducible. However, the positive extension ratio matrix may not be unique and may be difficult to find. Moreover, this approach cannot be naturally extended to the higher-dimensional case.

A discrete conditional distribution can be considered as a set of linear constraints on the (unknown) joint distribution. Thus, the linear algebra approach is a natural way to deal with the compatibility problem: a given set of discrete conditional distributions is compatible if and only if there is a joint distribution satisfying the conditional distribution constraints. Arnold, Castillo and Sarabia (2002) proposed three methods for solving constrained linear equations to obtain the compatible joint distributions. The first method is to find a joint distribution satisfying the given set of discrete conditional distributions directly. The second method is to find the marginal distributions of X and Y which, combined with P_{X|Y} and P_{Y|X}, will determine the joint distribution.
The third method is to find a marginal distribution of X, knowing that a compatible marginal distribution of X, combined with P_{Y|X}, will determine the joint distribution. (Both the second and third methods are based on Theorem 2.2.1.) Alternatively, Tian, Tan, Ng and Tang (2009) used the Euclidean distance measure to transform the compatibility problem into a quadratic optimization problem with unit cube constraints. This method is related to the third method of Arnold, Castillo and Sarabia (2002), and requires numerically solving a quadratic optimization problem. While the above linear algebra approach can be naturally extended to the higher-dimensional case, it can be very time consuming for three or more random variables. Therefore, the linear algebra approach is mostly applied to bivariate distributions in which each random variable takes values in a small set.

A discrete conditional distribution can also be treated as a stochastic matrix. Arnold, Castillo and Sarabia (1999) applied Markov chains to check the compatibility of discrete conditional distributions. This approach needs not only to solve for a set of stationary marginal distributions, but also to determine whether the set of stationary marginal distributions matches the given set of conditional distributions.

Ip and Wang (2009) proposed using the canonical representation to deal with the compatibility problem. The method does not require solving constrained linear equations, but requires that there be no zero elements. Wang and Kuo (2010) proposed an odds-oriented approach to deal with the case involving zero elements; this approach is computationally demanding. The algorithm proposed by Kuo and Wang (2011) shares some features with our approach in Chapter 3, although it did not use the graphical representation and no underlying theory was provided.

In this thesis we reformulate the compatibility problem in terms of a graphical representation where each vertex corresponds to a possible sample point, and an edge connects two vertices if and only if the probability ratio of the two corresponding possible sample points is specified through one of the given conditional distributions. A sequence of connected vertices is called a path. A path determines the probability ratio of any two vertices in the path. However, two vertices may be connected by more than one path.
We show that a given set of conditional distributions is compatible if and only if the probability ratio of any two vertices in a path does not depend on the chosen path, and that there is a unique joint distribution for a given set of compatible conditional distributions if and only if the corresponding graph is connected.

In addition, we use the spanning tree to determine all the compatible joint distributions when a given set of conditional distributions is compatible.

The rest of the thesis is organized as follows. In Chapter 2, we review the ratio matrix approach for checking compatibility and uniqueness of two discrete conditional distributions. In Chapter 3, we first use bivariate discrete conditional distributions to illustrate the graphical representation approach (cf. Yao, Chen and Wang (2014)). Then, we extend the graphical representation approach to the higher-dimensional case. We also discuss the relationship between the ratio matrix approach and the graphical representation approach. In Chapter 4, we discuss Markov chain characterizations, which help to understand the connection between compatibility and the Gibbs sampler. The Gibbs sampler is a Markov chain Monte Carlo sampling algorithm for generating a joint distribution via individual conditional distributions. For example, the bivariate Gibbs sampler generates X samples from P_{X|Y} and Y samples from P_{Y|X}, resulting in two sequences {(X^(t), Y^(t))}_{t=1}^∞ and {(Y^(t), X^(t))}_{t=1}^∞. If P_{X|Y} and P_{Y|X} are compatible, the two sequences have the same limiting distribution. If P_{X|Y} and P_{Y|X} are incompatible, Liu (1996) observed that the two limiting distributions are different and also derived some results. However, he only considered the two-dimensional case. We will extend his discussion to the higher-dimensional case.
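The bivariate Gibbs sampler just described is easy to simulate. The following sketch (ours, not code from the thesis; function and variable names are our own) alternates draws from the two conditionals of Example 2.1.2 in Chapter 2, whose compatible joint distribution is p = [[1/11, 3/11], [2/11, 5/11]], and estimates the limiting distribution from the visited states:

```python
import numpy as np

rng = np.random.default_rng(0)

# Conditional matrices of Example 2.1.2 (a compatible pair).
A = np.array([[1/3, 3/8], [2/3, 5/8]])   # A[i, j] = P(X = i | Y = j); columns sum to 1
B = np.array([[1/4, 3/4], [2/7, 5/7]])   # B[i, j] = P(Y = j | X = i); rows sum to 1

def gibbs_joint(A, B, n_iter=100_000, burn=1_000):
    """Estimate the limiting joint distribution of the bivariate Gibbs sampler."""
    I, J = A.shape
    counts = np.zeros((I, J))
    x, y = 0, 0
    for t in range(n_iter):
        x = rng.choice(I, p=A[:, y])     # draw X from P(X | Y = y)
        y = rng.choice(J, p=B[x, :])     # draw Y from P(Y | X = x)
        if t >= burn:
            counts[x, y] += 1
    return counts / counts.sum()

p_hat = gibbs_joint(A, B)
# For a compatible pair, p_hat approaches the unique joint distribution.
```

Because this pair is compatible, the empirical distribution of the visited pairs converges to the unique joint distribution, in line with the discussion above.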

2 Compatible conditional distributions

2.1 Compatibility

Consider two discrete random variables X and Y taking values in {x1, . . . , xI} and {y1, . . . , yJ}, respectively. Let A = (Aij) and B = (Bij) be two I × J matrices with nonnegative elements such that each column sum of A equals 1 and each row sum of B equals 1.

Definition 2.1.1. A and B are said to be compatible if there exists a joint distribution of X and Y such that P(X = xi | Y = yj) = Aij and P(Y = yj | X = xi) = Bij for i = 1, . . . , I, j = 1, . . . , J.

Example 2.1.1. Consider two conditional distribution matrices:

A = [ 1/3  3/5 ]      B = [ 1/4  3/4 ]
    [ 2/3  2/5 ],         [ 1/3  2/3 ].

Let

p = [ a  c ]
    [ b  d ],    a + b + c + d = 1,  a, b, c, d ≥ 0.

Suppose that the joint distribution p has A and B as its two conditional distributions, i.e., p_{X|Y}(X = xi | Y = yj) = Aij and p_{Y|X}(Y = yj | X = xi) = Bij, i, j = 1, 2. Then a, b, c, d satisfy the following constraints:

a/b = A11/A21 = (1/3)/(2/3) = 1/2,
c/d = A12/A22 = (3/5)/(2/5) = 3/2,
a/c = B11/B12 = (1/4)/(3/4) = 1/3,

b/d = B21/B22 = (1/3)/(2/3) = 1/2.

However, there are no solutions to the above linear equations. It follows that there does not exist a joint distribution whose two conditional distributions are exactly A and B. Therefore, A and B are incompatible.

Example 2.1.2. Consider two conditional distribution matrices:

A = [ 1/3  3/8 ]      B = [ 1/4  3/4 ]
    [ 2/3  5/8 ],         [ 2/7  5/7 ].

A and B are compatible, since we can find a joint distribution

p = [ 1/11  3/11 ]
    [ 2/11  5/11 ]

such that p_{X|Y}(X = xi | Y = yj) = Aij and p_{Y|X}(Y = yj | X = xi) = Bij, i, j = 1, 2.

If A and B contain zero elements, there is a necessary condition for the compatibility of A and B.

Definition 2.1.2. The set N_A = {(i, j) : Aij > 0} is called the incidence set of matrix A, the set of locations of non-zero elements in matrix A.

Clearly N_A = N_B (i.e., A and B share a common incidence set) is a necessary condition for compatibility. We assume N_A = N_B and denote the common incidence set by N = N_A = N_B = {(i, j) | Aij > 0, Bij > 0}. Note that N_A = N_B is trivially satisfied if A and B contain only positive elements.

Example 2.1.3. Consider two conditional distribution matrices:

A = [ 1/4  0 ]      B = [ 1/3  2/3 ]
    [ 3/4  1 ],         [ 1    0   ].

Since A and B do not share a common incidence set, A and B are incompatible.
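Claims like those in Examples 2.1.1 and 2.1.2 are easy to verify numerically. As a small sketch (ours, using NumPy), normalizing the columns and rows of the joint distribution in Example 2.1.2 recovers A and B exactly:

```python
import numpy as np

# Joint distribution from Example 2.1.2.
p = np.array([[1/11, 3/11],
              [2/11, 5/11]])

# P(X = i | Y = j): divide each column of p by the Y-marginal (column sum).
A = p / p.sum(axis=0, keepdims=True)
# P(Y = j | X = i): divide each row of p by the X-marginal (row sum).
B = p / p.sum(axis=1, keepdims=True)
# A has columns (1/3, 2/3) and (3/8, 5/8); B has rows (1/4, 3/4) and (2/7, 5/7).
```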

Example 2.1.4. Consider two conditional distribution matrices:

A = [ 2/5  0  0 ]      B = [ 1  0    0   ]
    [ 0    1  1 ]          [ 0  1/2  1/2 ]
    [ 3/5  0  0 ],         [ 1  0    0   ].

A and B are compatible, since N_A = N_B and we can find infinitely many compatible joint distributions

P = [ 2a  0  0 ]
    [ 0   b  b ]
    [ 3a  0  0 ]

with a, b > 0 and 5a + 2b = 1.

2.2 Review of the ratio matrix approach for compatibility between two conditional distributions

Suppose that A and B are compatible. Then there exists a joint distribution p such that p(xi, yj) = p_X(xi) Bij = p_Y(yj) Aij for all i, j. With this simple observation, Arnold and Press (1989) obtained the following theorem.

Theorem 2.2.1. Assume N_A = N_B. Then A and B are compatible if and only if there exist two probability vectors τ = (τ1, . . . , τI) and η = (η1, . . . , ηJ) such that τi Bij = ηj Aij for all i, j.

The vectors τ and η can be readily interpreted as the X- and Y-marginal distributions, respectively. From Theorem 2.2.1, Arnold and Press (1989) derived the following theorem.

Theorem 2.2.2. Assume N_A = N_B = N. Then A and B are compatible if and only if there exist two vectors u = (u1, . . . , uI) and υ = (υ1, . . . , υJ) such that Cij = Aij/Bij = ui υj, (i, j) ∈ N.

Definition 2.2.1. Let

Cij = Aij/Bij   if Aij, Bij > 0,
Cij = ∗         if Aij = Bij = 0,

where the symbol ∗ denotes an undefined element. Then the matrix C = (Cij) is called the ratio matrix of A and B.

When A and B contain only positive elements, Arnold and Press (1989) applied the ratio matrix of A and B to obtain the following necessary and sufficient condition for the compatibility of A and B.

Theorem 2.2.3. Suppose that A and B contain only positive elements. Then the following statements are equivalent.

(i) A and B are compatible.
(ii) The ratio matrix C is of rank one.
(iii) The compatible joint distribution is unique.

Example 2.2.1. Consider two conditional distribution matrices:

A = [ 1/6  3/7  1/7 ]      B = [ 1/5  3/5  1/5 ]
    [ 1/3  2/7  3/7 ]          [ 2/7  2/7  3/7 ]
    [ 1/2  2/7  3/7 ],         [ 3/8  1/4  3/8 ].

The corresponding ratio matrix is

C = [ 5/6  5/7  5/7 ]
    [ 7/6  1    1   ]
    [ 8/6  8/7  8/7 ].

Since the ratio matrix C is of rank one, A and B are compatible and the compatible joint distribution is unique. Let p be the compatible joint distribution. We can use a convenient method to find this p. We begin with a 3 × 3 positive matrix Q = (Qij), where Qij represents the probability ratio of pij

to p11. Note that Q11 = p11 : p11 = 1. The remaining elements can be derived from A and B. If we start from the first column of A, then we have

Q21/Q11 = p21/p11 = A21/A11 = (1/3)/(1/6) = 2/1,
Q31/Q11 = p31/p11 = A31/A11 = (1/2)/(1/6) = 3/1,

which yield Q21 = 2, Q31 = 3. Next, from the first row of B, we have

Q12/Q11 = p12/p11 = B12/B11 = (3/5)/(1/5) = 3/1,
Q13/Q11 = p13/p11 = B13/B11 = (1/5)/(1/5) = 1/1,

which yield Q12 = 3, Q13 = 1. Similarly, the second and third rows of B yield Q22 = 2, Q23 = 3, Q32 = 2, Q33 = 3. We thus obtain the matrix

Q = [ 1  3  1 ]
    [ 2  2  3 ]
    [ 3  2  3 ].

Finally, normalize Q to form the compatible joint distribution

p = (1/20) [ 1  3  1 ]
           [ 2  2  3 ]
           [ 3  2  3 ].

If the conditional distribution matrices contain zero elements, Theorem 2.2.3 cannot be applied. Song, Li, Chen, Jiang and Kuo (2010) proposed the "positive extension" of the ratio matrix C to deal with this problem. They extended the ratio matrix by properly assigning positive numbers to all undefined elements.

Definition 2.2.2. A matrix C̄ = (C̄ij) is called a positive extension of ratio matrix C if all C̄ij > 0 and C̄ij = Cij when Cij > 0.
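For strictly positive A and B, both the rank-one test of Theorem 2.2.3 and the construction of Q above can be mechanized. A sketch (ours, not code from the thesis) applied to Example 2.2.1:

```python
import numpy as np

A = np.array([[1/6, 3/7, 1/7],
              [1/3, 2/7, 3/7],
              [1/2, 2/7, 3/7]])
B = np.array([[1/5, 3/5, 1/5],
              [2/7, 2/7, 3/7],
              [3/8, 1/4, 3/8]])

C = A / B                                      # ratio matrix (no zero elements here)
is_compatible = np.linalg.matrix_rank(C) == 1  # Theorem 2.2.3 criterion

# Build Q as in the text: first column from A, then each row from B.
I, J = A.shape
Q = np.empty((I, J))
Q[:, 0] = A[:, 0] / A[0, 0]                    # Q_i1 = A_i1 / A_11
for i in range(I):
    Q[i, 1:] = Q[i, 0] * B[i, 1:] / B[i, 0]    # Q_ij = Q_i1 * B_ij / B_i1
p = Q / Q.sum()                                # normalize; here Q.sum() = 20
```

For these matrices is_compatible is True and p reproduces the joint distribution Q/20 found above.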

If the ratio matrix C contains only positive elements, then the positive extension C̄ = C. Otherwise, the positive extension C̄ is not unique.

Theorem 2.2.4. Assume N_A = N_B. Then A and B are compatible if and only if there exists a rank one positive extension C̄ of ratio matrix C.

Example 2.2.2. Consider two conditional distribution matrices:

A = [ 1/3  3/4  0   ]      B = [ 1/2  1/2  0   ]
    [ 0    1/4  3/5 ]          [ 0    1/4  3/4 ]
    [ 2/3  0    2/5 ],         [ 1/3  0    2/3 ].

The corresponding ratio matrix is

C = [ 2/3  3/2  ∗   ]
    [ ∗    1    4/5 ]
    [ 2    ∗    3/5 ].

In order to make the first and second rows proportional in matrix C, we have to assign C13 = 6/5 and C21 = 4/9, and the matrix turns out to be

[ 2/3  3/2  6/5 ]
[ 4/9  1    4/5 ]
[ 2    ∗    3/5 ].

However, no matter what the value of C32 is, the last two rows will never be proportional, i.e., there does not exist a rank one positive extension ratio matrix. Therefore, A and B are incompatible.

Definition 2.2.3. A ratio matrix C is said to be reducible if, after interchanging some rows and/or columns, it can be rearranged as

[ T1  ∗  ]
[ ∗   T2 ],

where entries off the diagonal block matrices T1 and T2 are all ∗. The matrix C is irreducible if it is not reducible.

If the ratio matrix C is irreducible, Song, Li, Chen, Jiang and Kuo (2010) obtained the following theorem.

Theorem 2.2.5. Assume that N_A = N_B and the ratio matrix C is irreducible. Then the following statements are equivalent.

(i) A and B are compatible.
(ii) C has a unique rank one positive extension C̄.
(iii) The compatible joint distribution is unique.

Example 2.2.3. Consider two conditional distribution matrices:

A = [ 1/3  3/4  0   ]      B = [ 1/4  3/4  0   ]
    [ 0    1/4  3/5 ]          [ 0    1/4  3/4 ]
    [ 2/3  0    2/5 ],         [ 1/2  0    1/2 ].

The corresponding ratio matrix is

C = [ 4/3  1  ∗   ]
    [ ∗    1  4/5 ]
    [ 4/3  ∗  4/5 ].

The ratio matrix C is obviously irreducible. We can set C13 = 4/5, C21 = 4/3 and C32 = 1, and the extension ratio matrix turns out to be

[ 4/3  1  4/5 ]
[ 4/3  1  4/5 ]
[ 4/3  1  4/5 ].

Since the rank one positive extension ratio matrix is unique, the compatible joint distribution is unique and easily found to be

p = [ 1/12  1/4   0   ]
    [ 0     1/12  1/4 ]
    [ 1/6   0     1/6 ].

When the ratio matrix C is reducible, Song, Li, Chen, Jiang and Kuo (2010) used the following lemma to rearrange a reducible ratio matrix C as an irreducible block diagonal matrix.

Lemma 2.2.1. For any ratio matrix C, by interchanging some rows and/or columns, it can be rearranged as an irreducible block diagonal matrix, denoted by

T(C) = [ T1  ∗   ...  ∗  ]
       [ ∗   T2  ...  ∗  ]
       [ ...  ...  ...   ]
       [ ∗   ∗   ...  Tk ],

where the diagonal block matrices T1, . . . , Tk (k ≥ 1) are irreducible and elements off these diagonal block matrices are all ∗. When k = 1, C itself is irreducible. For simplicity, let T(C) = Diag(T1, . . . , Tk) for k ≥ 1.

Song, Li, Chen, Jiang and Kuo (2010) applied the irreducible block diagonal matrix to present another version of necessary and sufficient conditions for compatibility in the following theorem.

Theorem 2.2.6. Assume N_A = N_B. Let T(C) = Diag(T1, . . . , Tk) be an irreducible block diagonal matrix of ratio matrix C. Then A and B are compatible if and only if each Ti has a unique rank one positive extension T̄i, i = 1, . . . , k.

Example 2.2.4. Consider two conditional distribution matrices:

A = [ 3/4  0  0    1 ]      B = [ 3/5  0    0    2/5 ]
    [ 0    1  2/3  0 ]          [ 0    1/3  2/3  0   ]
    [ 0    0  1/3  0 ]          [ 0    0    1    0   ]
    [ 1/4  0  0    0 ],         [ 1    0    0    0   ].

The corresponding ratio matrix is

C = [ 5/4  ∗  ∗    5/2 ]
    [ ∗    3  1    ∗   ]
    [ ∗    ∗  1/3  ∗   ]
    [ 1/4  ∗  ∗    ∗   ].

The reducible ratio matrix C can be rearranged as an irreducible block diagonal matrix

T(C) = [ 5/4  5/2  ∗  ∗   ]
       [ 1/4  ∗    ∗  ∗   ]
       [ ∗    ∗    3  1   ]
       [ ∗    ∗    ∗  1/3 ].

Let T(C) = Diag(T1, T2), where

T1 = [ 5/4  5/2 ]      T2 = [ 3  1   ]
     [ 1/4  ∗   ],          [ ∗  1/3 ].

Two positive extensions of T1 and T2 are

T̄1 = [ 5/4  5/2 ]      T̄2 = [ 3  1   ]
     [ 1/4  1/2 ],           [ 1  1/3 ].

Since both T̄1 and T̄2 are of rank one, A and B are compatible.

Theorem 2.2.6 provides an effective method to check compatibility when the ratio matrix C is reducible. However, Song, Li, Chen, Jiang and Kuo (2010) did not characterize all compatible joint distributions when the ratio matrix C is reducible. In the next chapter, we propose a graphical representation approach to deal with this case. Moreover, the ratio matrix approach cannot be naturally extended to the higher-dimensional case; we will use the graphical representation approach to solve that problem as well.
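Whether C is reducible, and into how many irreducible diagonal blocks it splits, can be read off from the incidence set N alone: connect row i to column j whenever (i, j) ∈ N, and count the connected components of this bipartite graph. A union-find sketch (ours; it assumes every row and column meets N, as in Example 2.2.4):

```python
def count_blocks(N, I, J):
    """Number k of irreducible diagonal blocks in T(C) = Diag(T1, ..., Tk),
    computed as the connected components of the bipartite row-column graph."""
    parent = list(range(I + J))            # labels: rows 0..I-1, columns I..I+J-1

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in N:                         # each support point merges row i with column j
        parent[find(i)] = find(I + j)

    return len({find(a) for a in range(I + J)})

# Incidence set of Example 2.2.4 (0-indexed): rows {1, 4} pair with columns {1, 4},
# rows {2, 3} with columns {2, 3} (1-indexed), giving k = 2 blocks as in T(C) above.
N = [(0, 0), (0, 3), (1, 1), (1, 2), (2, 2), (3, 0)]
print(count_blocks(N, 4, 4))  # 2
```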

3 Graphical representation approach

3.1 Graphical representation

Consider a graph with vertex set V and edge set E, where an edge connecting vertices u, υ ∈ V is denoted by {u, υ} (so that E is identified with a subset of the collection of all 2-element subsets of V). For each edge {u, υ}, there is a specified ratio r(u, υ), where r(u, υ) represents the probability ratio of vertex u to vertex υ. Let R = R(E) denote the collection of all the specified ratios (to be called the ratio set), and refer to (V, E, R) as a graphical representation. It is desired (i) to determine whether R is compatible in the sense that there is a (positive) probability distribution (p(υ))_{υ∈V} on the vertex set V such that p(u) : p(υ) = r(u, υ) for all {u, υ} ∈ E, and (ii) to find all (compatible) probability distributions satisfying R when R is compatible.

In this chapter, instead of X and Y, we use X1 and X2 to denote two random variables taking values in {1, . . . , I} and {1, . . . , J}, respectively. Let ξ = {p_{1|2}, p_{2|1}} be a given set of conditional distributions for the random vector X = (X1, X2). Let A = (Aij) = (p_{1|2}{X1 = i | X2 = j}) and B = (Bij) = (p_{2|1}{X2 = j | X1 = i}). Let N_A = {(i, j) | Aij > 0} and N_B = {(i, j) | Bij > 0}. We assume N_A = N_B and let N = N_A = N_B = {(i, j) | Aij > 0, Bij > 0}.

We now define the graphical representation (Vξ, Eξ, Rξ) for ξ. Let Vξ = N be the vertex set. Write x = (x1, x2) ∈ N and x′ = (x′1, x′2) ∈ N. An edge {x, x′} connects two vertices x, x′ ∈ Vξ if and only if either x1 = x′1 or x2 = x′2. The ratio associated with this edge is given by r(x, x′) = p_{1|2}(x1|x2) : p_{1|2}(x′1|x′2) if x2 = x′2, or r(x, x′) = p_{2|1}(x2|x1) : p_{2|1}(x′2|x′1) if x1 = x′1. The resulting graphical representation is denoted by (Vξ, Eξ, Rξ). Since each edge connects two vertices x, x′ ∈ Vξ through only one conditional distribution

in ξ = {p_{1|2}, p_{2|1}}, the ratio associated with each edge is uniquely defined. The following theorem shows that compatibility of ξ is equivalent to compatibility of Rξ.

Theorem 3.1.1. Assume that ξ satisfies N_A = N_B. Then a joint distribution with support Vξ satisfies ξ if and only if it satisfies Rξ. Consequently, ξ is compatible if and only if Rξ is compatible.

Proof. To prove the "only if" part, suppose that a joint distribution p (with support Vξ) satisfies ξ, i.e., under p the conditional distribution of X_S given X_T agrees with p_{S|T} for (S, T) = (1, 2) and (2, 1). Consider an (arbitrary) edge {x, x′} ∈ Eξ with an associated ratio given by r(x, x′) = p_{S|T}(x_S|x_T) : p_{S|T}(x′_S|x′_T), where (S, T) = (1, 2) or (2, 1). By definition, we have x_T = x′_T. Then p satisfies the associated ratio specification since

p(x)/p(x′) = p_{S|T}(x_S|x_T) / p_{S|T}(x′_S|x′_T) = r(x, x′),

from which it follows that p satisfies Rξ.

To prove the "if" part of the theorem, suppose that a joint distribution p (with support Vξ) satisfies Rξ. Consider a conditional p_{S|T} ∈ ξ. Fix an (arbitrary) x′ ∈ Vξ. For every x ∈ Vξ \ {x′} with x_T = x′_T, we have (by definition) an edge {x′, x} ∈ Eξ and r(x′, x) = p_{S|T}(x′_S|x′_T) : p_{S|T}(x_S|x_T). It follows that

p(x)/p(x′) = r(x, x′) = p_{S|T}(x_S|x_T) / p_{S|T}(x′_S|x′_T) = p_{S|T}(x_S|x′_T) / p_{S|T}(x′_S|x′_T).

Since this holds for all x ∈ Vξ \ {x′} with x_T = x′_T, under p the conditional distribution of X_S given X_T = x′_T agrees with p_{S|T}. As x′ (and x′_T) is arbitrary, p_{S|T} is indeed the conditional distribution of X_S given X_T under p. This shows that p satisfies ξ. The proof of the theorem is complete.

To illustrate, consider, in the following examples, two random variables X1 and X2 both taking values in {1, 2, 3}, and let ξ = {p_{1|2}, p_{2|1}}. Let

A = (Aij) and B = (Bij), where Aij := p_{1|2}(i|j) = p(X1 = i | X2 = j) and Bij := p_{2|1}(j|i) = p(X2 = j | X1 = i), i, j = 1, 2, 3.

Example 3.1.1. For

A = [ 1/5  2/7  3/8 ]      B = [ 1/6  1/3  1/2 ]
    [ 3/5  2/7  1/8 ]          [ 1/2  1/3  1/6 ]
    [ 1/5  3/7  1/2 ],         [ 1/8  3/8  1/2 ],

a graphical representation is given in Fig. 3.1.1, where a vertex labeled (i, j) corresponds to a configuration (i, j) of (X1, X2), and ratios attached to vertical edges are derived from A while ratios attached to horizontal edges are derived from B. Note that the six edges {(i, 1), (i, 3)}, {(1, j), (3, j)}, i, j = 1, 2, 3, and the associated ratios are not shown. These six ratios can be derived from those shown in the figure. For example, the ratio r((1, 1), (1, 3)) = B11 : B13 = 1 : 3 can be derived from r((1, 1), (1, 2)) = 1 : 2 and r((1, 2), (1, 3)) = 2 : 3.

[Fig. 3.1.1. Graphical representation for Example 3.1.1: the nine vertices (i, j) arranged in a 3 × 3 grid, with horizontal edges carrying ratios from B and vertical edges carrying ratios from A.]

Remark 3.1.1. For a path υ0 υ1 . . . υl, we have

∏_{i=0}^{l−1} r(υ_{i+1}, υ_i) = r(υ_l, υ_0).

For example, consider the path (1, 1) → (1, 2) → (1, 3) in Fig. 3.1.1: r((1, 2), (1, 1)) × r((1, 3), (1, 2)) = 2/1 × 3/2 = 3 : 1 = B13 : B11 = r((1, 3), (1, 1)).

Definition 3.1.1. Two graphical representations (V, E, R) and (V, E′, R′) with the same vertex set V are said to be equivalent if (i) R and R′ agree on E ∩ E′, (ii) for {u, υ} ∈ E \ E′, there exists an E′-path υ0 υ1 . . . υk with υ0 = u, υk = υ and {υl, υl+1} ∈ E′, l = 0, 1, . . . , k − 1, such that

r(υ, u) = ∏_{l=0}^{k−1} r′(υ_{l+1}, υ_l),

and (iii) for {u, υ} ∈ E′ \ E, a condition similar to (ii) is satisfied with the roles of (E, R) and (E′, R′) interchanged.

In words, two graphical representations (V, E, R) and (V, E′, R′) are equivalent if the two ratio sets R and R′ agree on all common edges and a ratio in only one of R and R′ can be derived from ratios in the other set. Strictly speaking, Fig. 3.1.1 is a simplified, but equivalent, version of the graphical representation for the given matrices A and B in Example 3.1.1. Another simplified, but equivalent, version of the graphical representation is given in Fig. 3.1.1-1. Lemma 3.1.1 states a simple result on equivalent graphical representations, whose proof is straightforward and omitted.

[Fig. 3.1.1-1. Another equivalent graphical representation for Example 3.1.1.]

Lemma 3.1.1. Suppose (V, E, R) and (V, E′, R′) are equivalent. Then a

positive probability distribution p on V satisfies R if and only if p satisfies R′. Consequently, R is compatible if and only if R′ is compatible.

Example 3.1.2. For

A = [ 1/2  1/2  0   ]      B = [ 1/3  2/3  0   ]
    [ 0    1/2  1/2 ]          [ 0    1/3  2/3 ]
    [ 1/2  0    1/2 ],         [ 1/3  0    2/3 ],

a graphical representation is given in Fig. 3.1.2, where there are only six vertices corresponding to the six (possible) configurations of (X1, X2). Note that the underlying graph has exactly one cycle.

[Fig. 3.1.2. Graphical representation for Example 3.1.2: the six support points (1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 1) form a single cycle.]

Definition 3.1.2. A connected graph with no cycles is called a tree. A tree containing every vertex of a graph is called a spanning tree of the graph.

Example 3.1.3. For

A = [ 1/5  2/7  3/8 ]      B = [ 1/6  1/3  1/2 ]
    [ 3/5  2/7  1/8 ]          [ ?    ?    ?   ]
    [ 1/5  3/7  1/2 ],         [ ?    ?    ?   ],

a (simplified but equivalent) graphical representation is given in Fig. 3.1.3, where the four edges {(1, j), (3, j)}, j = 1, 2, 3, and {(1, 1), (1, 3)}, and the

associated ratios are not shown. This example admits a graphical representation even though the conditional probabilities p_{2|1}(j|i), j = 1, 2, 3, i = 2, 3, are unavailable. Note that the underlying graph is a tree.

[Fig. 3.1.3. Graphical representation for Example 3.1.3; the underlying graph is a tree.]

3.2 Compatibility of a ratio set R and characterization of probability distributions satisfying R

A graph (V, E) is connected if every pair of vertices is connected by a path. If (V, E) is not connected, it can be decomposed into some k > 1 components (disjoint connected subgraphs), written (V, E) = ∪_{i=1}^{k} (Vi, Ei), where each (Vi, Ei) is a connected subgraph and where the symbol ∪ denotes disjoint union (implying that Vi ∩ Vj = ∅ and Ei ∩ Ej = ∅ for i ≠ j).

(24) vertices are the same), we have l Y. r (υi+1 , υi ) = 1,. (3.2). i=0. where υl+1 := υ0 . Proof. The equivalence of (ii) and (iii) is obvious. To show that (i) implies (ii), suppose R is compatible and let p be a (positive) probability distribution on V satisfying R. For two paths υ0 υ1 . . . υl and w0 w1 . . . wm with υ0 = w0 , υl = wm , we have l−1. 政 治 大 l−1. p (υl ) Y p (υi+1 ) Y = = r (υi+1 , υi ) , p (υ0 ) i=0 p (υi ) i=0. 立. ‧. ‧ 國. 學. m−1 Y p (wi+1 ) m−1 Y p (wm ) p (υl ) = = = r (wi+1 , wi ) . p (υ0 ) p (w0 ) p (w ) i i=0 i=0. This shows that (3.1) holds.. n. er. io. sit. y. Nat. To show that (ii) implies (i), fix a vertex υ0 ∈ V . Since the graph is connected, for every υ ∈ V \{υ0 }, there exists a path υ0 υ1 . . . υl connecting υ0 and υ = υl . Define l−1 a lq (υ) := Y r (υi+1 , υi ) .n i v. Ch. ei=0 ngchi U. By condition (ii), the definition of q(υ) does not depend on the chosen path. Letting q(υ0 ) := 1, define X p (υ) := q (υ) / q (υ) , υ ∈ V. υ∈V. which is a positive probability distribution on V satisfying R. So R is compatible. The proof is complete. Remark 3.2.1. As a simple application of Theorem 3.2.1, consider Example 3.1.2 for which the graphical representation in Fig. 3.1.2 has only one cycle. There are two methods to check the compatibility for Example 3.1.2. One 20.

method is to check condition (ii) in Theorem 3.2.1. Take (1, 1) as the initial vertex and (3, 3) as the terminal vertex. There are two paths from vertex (1, 1) to vertex (3, 3). For the path (1, 1) → (1, 2) → (2, 2) → (2, 3) → (3, 3), the product of the edge ratios is 2/1 × 1/1 × 2/1 × 1/1 = 4. For the other path (1, 1) → (3, 1) → (3, 3), the product is 1/1 × 2/1 = 2. Since the product depends on the chosen path, A and B are incompatible. Another method is to check condition (iii) in Theorem 3.2.1. For the cycle (1, 1) → (3, 1) → (3, 3) → (2, 3) → (2, 2) → (1, 2) → (1, 1), the left-hand side of (3.2) equals 1/1 × 2/1 × 1/1 × 1/2 × 1/1 × 1/2 = 1/2 ≠ 1, implying incompatibility.

Remark 3.2.2. It follows from Theorem 3.2.1 and its proof that when (V, E) is connected and R is compatible, there is a unique probability distribution satisfying R.

Remark 3.2.3. If (V, E) is a tree (i.e., a connected graph with no cycles), then any ratio set R is compatible, since condition (iii) in Theorem 3.2.1 is trivially satisfied. By Remark 3.2.2, there is then a unique probability distribution on V satisfying R. Example 3.1.3 is such a case, so it is compatible and has a unique compatible joint distribution. In a graphical representation (V, E, R) where (V, E) is connected, for every spanning tree of the graph (V, E) there is a unique probability distribution which satisfies R restricted to the spanning tree. Thus, R is compatible if and only if all spanning trees give rise to the same probability distribution. Alternatively, to check compatibility of R, it may be easier to first choose a convenient spanning tree, find the unique probability distribution p which satisfies R restricted to the spanning tree, and then check whether this p satisfies R.
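The cycle check in condition (iii) of Theorem 3.2.1 is mechanical to carry out. The following sketch (not part of the thesis) multiplies the edge ratios along the cycle examined in Remark 3.2.1 in exact rational arithmetic; the edge-ratio table is transcribed from that remark.

```python
from fractions import Fraction as F

# Edge ratios r(v, u) along the cycle of Remark 3.2.1 (Example 3.1.2),
# keyed by the directed edge (u, v).
r = {((1, 1), (3, 1)): F(1, 1),
     ((3, 1), (3, 3)): F(2, 1),
     ((3, 3), (2, 3)): F(1, 1),
     ((2, 3), (2, 2)): F(1, 2),
     ((2, 2), (1, 2)): F(1, 1),
     ((1, 2), (1, 1)): F(1, 2)}

cycle = [(1, 1), (3, 1), (3, 3), (2, 3), (2, 2), (1, 2), (1, 1)]
prod = F(1)
for u, v in zip(cycle, cycle[1:]):
    prod *= r[(u, v)]

# The product is 1/2, not 1, so condition (iii) fails: the ratio set
# of Example 3.1.2 is incompatible.
assert prod == F(1, 2)
```

Exact fractions avoid any floating-point ambiguity in deciding whether the product equals 1.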
As an illustration, consider Example 3.1.1 and note that Example 3.1.3 is derived from Example 3.1.1 with the second and third rows of matrix B removed. As a result, the graphical representation for Example 3.1.3 shown in Fig. 3.1.3 is a spanning tree of the graphical representation for Example 3.1.1 shown in Fig. 3.1.1. The

unique compatible joint distribution for Example 3.1.3 is easily found to be

p = ( 1/20  1/10  3/20
      3/20  1/10  1/20
      1/20  3/20  1/5  ).

For this p, it is readily shown that p2|1 agrees with B in Example 3.1.1, implying compatibility.

If a graph (V, E) is not connected, we have the following theorem.

Theorem 3.2.2. Consider a graphical representation (V, E, R) = ∪_{i=1}^k (V_i, E_i, R_i), where each (V_i, E_i) is a connected subgraph of (V, E) and R_i is R restricted to E_i. Then R is compatible if and only if R_i is compatible for i = 1, . . . , k.

Proof. To show the "only if" part, suppose R is compatible. Let p be a (positive) probability distribution on V satisfying R. Let p_i be the probability distribution on V_i defined by p_i(υ) := p(υ) / ∑_{w ∈ V_i} p(w), υ ∈ V_i. It is easily verified that p_i satisfies R_i, implying that R_i is compatible. To show the "if" part, suppose R_i is compatible for i = 1, . . . , k. Let p_i be a (positive) probability distribution on V_i satisfying R_i, i = 1, . . . , k. For any given positive numbers λ_1, . . . , λ_k with ∑_{i=1}^k λ_i = 1, define

p_{λ_1,...,λ_k}(υ) := ∑_{i=1}^k λ_i p_i(υ) 1_{V_i}(υ),  υ ∈ V.

It is easily verified that p_{λ_1,...,λ_k} is a probability distribution on V satisfying R, implying that R is compatible.

Remark 3.2.4. In Theorem 3.2.2, a graphical representation (V, E, R) is written as a disjoint union of (V_i, E_i, R_i), i = 1, . . . , k, where each (V_i, E_i) is a connected subgraph of (V, E). To show that R is compatible, it suffices to show that each R_i is compatible. Suppose now R is compatible. We want to characterize all positive probability distributions on V satisfying R. Since

each R_i is compatible, there is a unique positive probability distribution p_i on V_i satisfying R_i (cf. Theorem 3.2.1). For any positive numbers λ_1, . . . , λ_k with ∑_{i=1}^k λ_i = 1, define

p_{λ_1,...,λ_k}(υ) := ∑_{i=1}^k λ_i p_i(υ) 1_{V_i}(υ),  υ ∈ V,    (3.3)

which is a positive probability distribution on V satisfying R (cf. the proof of Theorem 3.2.2). On the other hand, let p be a positive probability distribution on V satisfying R. Define p*_i(υ) := p(υ) / ∑_{w ∈ V_i} p(w), υ ∈ V_i, which is a positive probability distribution on V_i satisfying R_i. By uniqueness, we have p*_i = p_i. Letting λ_i = ∑_{υ ∈ V_i} p(υ), it follows that

p(υ) = ∑_{i=1}^k λ_i p*_i(υ) 1_{V_i}(υ) = ∑_{i=1}^k λ_i p_i(υ) 1_{V_i}(υ) = p_{λ_1,...,λ_k}(υ).

Thus the set of all positive probability distributions on V satisfying R is {p_{λ_1,...,λ_k} : λ_i > 0, ∑_{i=1}^k λ_i = 1}, the set of all convex combinations of p_1, . . . , p_k with positive coefficients. We summarize this result in Theorem 3.2.3.

Theorem 3.2.3. Consider a graphical representation (V, E, R) = ∪_{i=1}^k (V_i, E_i, R_i), where each (V_i, E_i) is a connected subgraph of (V, E). Suppose R is compatible. Let p_i be the unique probability distribution on V_i satisfying R_i, i = 1, . . . , k. Then the set of all positive probability distributions on V satisfying R is {p_{λ_1,...,λ_k} : λ_i > 0, ∑_{i=1}^k λ_i = 1}, where p_{λ_1,...,λ_k} is given in (3.3).

By Theorem 3.2.3, we can easily check the compatibility and find all compatible joint distributions for Example 2.2.4.

Example 3.2.1. (Example 2.2.4 continued) For

A = ( 3/4  0  0    1        B = ( 3/5  0    0    2/5
      0    1  2/3  0              0    1/3  2/3  0
      0    0  1/3  0              0    0    1    0
      1/4  0  0    0 ),           1    0    0    0  ),

a graphical representation (V, E, R) is given in Fig. 3.2.1. There are two components in the figure. Let (V_1, E_1, R_1) be the first component, containing the vertices (1, 1), (1, 4), (4, 1), and (V_2, E_2, R_2) be the second component, containing the vertices (2, 2), (2, 3), (3, 3). Since (V_1, E_1) and (V_2, E_2) are trees, the corresponding ratio sets R_1 and R_2 are compatible. Therefore, R is compatible. The unique distribution p_1 with support V_1 satisfying R_1 is given by

p_1 = ( 1/2  0  0  1/3
        0    0  0  0
        0    0  0  0
        1/6  0  0  0  ).

The unique distribution p_2 with support V_2 satisfying R_2 is given by

p_2 = ( 0  0    0    0
        0  1/4  1/2  0
        0  0    1/4  0
        0  0    0    0 ).

The set of positive probability distributions on V satisfying R is

{λ_1 p_1 + λ_2 p_2 : λ_1 > 0, λ_2 > 0, λ_1 + λ_2 = 1}.

[Figure: two components, with edge ratios 3:2, 3:1 on the component {(1,1), (1,4), (4,1)} and 1:2, 2:1 on the component {(2,2), (2,3), (3,3)}.]
Fig. 3.2.1. Graphical representation for Example 3.2.1.
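Because (V, E) is disconnected here, the compatible joint distribution is not unique. A small sketch (assuming nothing beyond the tables above) verifies in exact arithmetic that every mixture λp_1 + (1 − λ)p_2 reproduces the conditionals A and B on its support:

```python
from fractions import Fraction as F

# Component distributions and conditionals from Example 3.2.1 (4x4 tables).
p1 = [[F(1,2),0,0,F(1,3)],[0,0,0,0],[0,0,0,0],[F(1,6),0,0,0]]
p2 = [[0,0,0,0],[0,F(1,4),F(1,2),0],[0,0,F(1,4),0],[0,0,0,0]]
A  = [[F(3,4),0,0,1],[0,1,F(2,3),0],[0,0,F(1,3),0],[F(1,4),0,0,0]]
B  = [[F(3,5),0,0,F(2,5)],[0,F(1,3),F(2,3),0],[0,0,1,0],[1,0,0,0]]

def mix(lam):
    """Joint distribution lam*p1 + (1-lam)*p2."""
    return [[lam*p1[i][j] + (1-lam)*p2[i][j] for j in range(4)] for i in range(4)]

def conditionals_match(p):
    """Check p(x1 | x2) = A and p(x2 | x1) = B wherever p puts mass."""
    for i in range(4):
        for j in range(4):
            if p[i][j] == 0:
                continue
            col = sum(p[r][j] for r in range(4))   # marginal of x2 = j
            row = sum(p[i][c] for c in range(4))   # marginal of x1 = i
            if p[i][j]/col != A[i][j] or p[i][j]/row != B[i][j]:
                return False
    return True

# Any positive mixing weight gives a joint distribution compatible with A and B.
for lam in (F(1, 4), F(1, 2), F(9, 10)):
    assert conditionals_match(mix(lam))
```

The λ-dependence cancels inside each component, which is exactly why Theorem 3.2.3 yields a whole family of compatible joints.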

The graphical representation approach can be readily extended to the higher-dimensional case. To illustrate, consider, in the following example, three random variables X1, X2 and X3, all taking values in {1, 2}, and let ξ = {p1|23, p2|31, p3|12}. Let A = (A_{ijk}), B = (B_{ijk}) and C = (C_{ijk}), where

A_{ijk} = p1|23(X1 = i | X2 = j, X3 = k),
B_{ijk} = p2|31(X2 = j | X3 = k, X1 = i),
C_{ijk} = p3|12(X3 = k | X1 = i, X2 = j),  i, j, k = 1, 2.

Example 3.2.2. For

                      X1 = 1  X1 = 2
A:  X2 = 1, X3 = 1      0.1     0.9
    X2 = 1, X3 = 2      0.9     0.1
    X2 = 2, X3 = 1      0.2     0.8
    X2 = 2, X3 = 2      0.8     0.2

                      X2 = 1  X2 = 2
B:  X3 = 1, X1 = 1      0.3     0.7
    X3 = 1, X1 = 2      0.7     0.3
    X3 = 2, X1 = 1      0.4     0.6
    X3 = 2, X1 = 2      0.6     0.4

                      X3 = 1  X3 = 2
C:  X1 = 1, X2 = 1      0.4     0.6
    X1 = 1, X2 = 2      0.6     0.4
    X1 = 2, X2 = 1      0.5     0.5
    X1 = 2, X2 = 2      0.5     0.5

a graphical representation is given in Fig. 3.2.2. For the cycle (1, 1, 1) → (1, 1, 2) → (2, 1, 2) → (2, 1, 1) → (1, 1, 1), the left-hand side of (3.2) equals 6/4 × 1/9 × 5/5 × 1/9 = 1/54 ≠ 1, implying incompatibility.
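The cycle product above can be coded directly. A minimal sketch for this particular cycle follows; only A and C are needed, because no step of the chosen cycle changes the second coordinate (B never enters).

```python
# Conditional tables from Example 3.2.2; all variables take values in {1, 2}.
# A[(j, k)][i] = p(X1 = i | X2 = j, X3 = k); C[(i, j)][k] = p(X3 = k | X1 = i, X2 = j).
A = {(1, 1): {1: 0.1, 2: 0.9}, (1, 2): {1: 0.9, 2: 0.1},
     (2, 1): {1: 0.2, 2: 0.8}, (2, 2): {1: 0.8, 2: 0.2}}
C = {(1, 1): {1: 0.4, 2: 0.6}, (1, 2): {1: 0.6, 2: 0.4},
     (2, 1): {1: 0.5, 2: 0.5}, (2, 2): {1: 0.5, 2: 0.5}}

def r(v_new, v_old):
    """Edge ratio r(v_new, v_old) for vertices differing in one coordinate."""
    (i1, j1, k1), (i0, j0, k0) = v_new, v_old
    if j1 == j0 and k1 == k0:      # first coordinate changes: use A
        return A[(j0, k0)][i1] / A[(j0, k0)][i0]
    if i1 == i0 and j1 == j0:      # third coordinate changes: use C
        return C[(i0, j0)][k1] / C[(i0, j0)][k0]
    raise ValueError("vertices must differ in exactly one coordinate")

cycle = [(1, 1, 1), (1, 1, 2), (2, 1, 2), (2, 1, 1), (1, 1, 1)]
prod = 1.0
for v0, v1 in zip(cycle, cycle[1:]):
    prod *= r(v1, v0)

assert abs(prod - 1/54) < 1e-12    # != 1, so A, B, C are incompatible
```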

[Figure: the eight vertices (i, j, k), i, j, k = 1, 2, with edge ratios read off A, B and C, including 3:7, 4:6, 6:4, 1:9, 2:8, 9:1, 8:2, 5:5 and 7:3.]
Fig. 3.2.2. Graphical representation for Example 3.2.2.

3.3. The relation between the ratio matrix approach and the graphical representation approach

Restricting attention to the case of two random variables, we have the following result.

Theorem 3.3.1. Suppose that A and B contain only positive elements. Then the following statements are equivalent:

(i) The ratio matrix C is of rank one.

(ii) For every cycle υ0 υ1 . . . υl υ0, we have

∏_{i=0}^{l} r(υ_{i+1}, υ_i) = 1,

where υ_{l+1} := υ0.

Proof. To show (i) implies (ii), consider a cycle (i_0, j_0) → (i_1, j_0) → (i_1, j_1) → (i_2, j_1) → . . . → (i_l, j_l) → (i_{l+1}, j_l) → (i_{l+1}, j_{l+1}), where i_{l+1} = i_0 and j_{l+1} = j_0. Then the left-hand side of (3.2) equals

(A_{i_1,j_0}/A_{i_0,j_0}) × (B_{i_1,j_1}/B_{i_1,j_0}) × (A_{i_2,j_1}/A_{i_1,j_1}) × . . . × (A_{i_{l+1},j_l}/A_{i_l,j_l}) × (B_{i_{l+1},j_{l+1}}/B_{i_{l+1},j_l})

= (A_{i_1,j_0}/B_{i_1,j_0}) × (B_{i_1,j_1}/A_{i_1,j_1}) × (A_{i_2,j_1}/B_{i_2,j_1}) × . . . × (A_{i_{l+1},j_l}/B_{i_{l+1},j_l}) × (B_{i_{l+1},j_{l+1}}/A_{i_0,j_0})

= (C_{i_1,j_0}/C_{i_1,j_1}) × (C_{i_2,j_1}/C_{i_2,j_2}) × . . . × (C_{i_l,j_{l−1}}/C_{i_l,j_l}) × (C_{i_{l+1},j_l}/C_{i_{l+1},j_{l+1}}).

Since the ratio matrix C is of rank one, there exist two vectors τ = (τ_1, . . . , τ_I) and η = (η_1, . . . , η_J) such that C_{ij} = τ_i η_j for all i, j. So

(C_{i_1,j_0}/C_{i_1,j_1}) × (C_{i_2,j_1}/C_{i_2,j_2}) × . . . × (C_{i_{l+1},j_l}/C_{i_{l+1},j_{l+1}})
= (τ_{i_1} η_{j_0}/τ_{i_1} η_{j_1}) × (τ_{i_2} η_{j_1}/τ_{i_2} η_{j_2}) × . . . × (τ_{i_{l+1}} η_{j_l}/τ_{i_{l+1}} η_{j_{l+1}})
= η_{j_0}/η_{j_{l+1}}
= 1,

since the product telescopes and j_{l+1} = j_0.

To show (ii) implies (i), consider a cycle (i_0, j_0) → (i_1, j_0) → (i_1, j_1) → (i_2, j_1) → . . . → (i_l, j_l) → (i_{l+1}, j_l) → (i_{l+1}, j_{l+1}), where i_{l+1} = i_0 and j_{l+1} = j_0. Then by (3.2),

(A_{i_1,j_0}/A_{i_0,j_0}) × (B_{i_1,j_1}/B_{i_1,j_0}) × (A_{i_2,j_1}/A_{i_1,j_1}) × . . . × (A_{i_{l+1},j_l}/A_{i_l,j_l}) × (B_{i_{l+1},j_{l+1}}/B_{i_{l+1},j_l}) = 1,

which implies that

1 = (A_{i_1,j_0}/B_{i_1,j_0}) × (B_{i_1,j_1}/A_{i_1,j_1}) × . . . × (A_{i_{l+1},j_l}/B_{i_{l+1},j_l}) × (B_{i_{l+1},j_{l+1}}/A_{i_0,j_0})
= (C_{i_1,j_0}/C_{i_1,j_1}) × (C_{i_2,j_1}/C_{i_2,j_2}) × . . . × (C_{i_l,j_{l−1}}/C_{i_l,j_l}) × (C_{i_{l+1},j_l}/C_{i_{l+1},j_{l+1}}).

Therefore, for every cycle (i_a, j_b) → (i_{a+1}, j_b) → (i_{a+1}, j_{b+1}) → (i_a, j_{b+1}) → (i_a, j_b), we have

(C_{i_{a+1},j_b}/C_{i_{a+1},j_{b+1}}) × (C_{i_a,j_{b+1}}/C_{i_a,j_b}) = 1,

that is,

C_{i_a,j_b}/C_{i_{a+1},j_b} = C_{i_a,j_{b+1}}/C_{i_{a+1},j_{b+1}}.

So the ratio matrix C is of rank one. The proof is complete.

Example 3.3.1. (Example 2.1.2 continued) For

A = ( 1/3  3/8        B = ( 1/4  3/4
      2/3  5/8 ),           2/7  5/7 ),

a graphical representation is given in Fig. 3.3.1.

[Figure: the four vertices (1,1), (1,2), (2,1), (2,2) form a single cycle.]
Fig. 3.3.1. Graphical representation for Example 3.3.1.

The corresponding ratio matrix is

C = ( 4/3  4/8
      7/3  7/8 ).

Then, we have

(4/3)/(7/3) = (4/8)/(7/8),

((3/4)/(1/4)) × ((5/8)/(3/8)) × ((2/7)/(5/7)) × ((1/3)/(2/3)) = 1,

(3/1) × (5/3) × (2/5) × (1/2) = 1,

r((1, 2), (1, 1)) × r((2, 2), (1, 2)) × r((2, 1), (2, 2)) × r((1, 1), (2, 1)) = 1.

Theorem 3.3.2. The following statements are equivalent:

(i) The ratio matrix C is irreducible.

(ii) The graph (V, E) is connected.

Proof. To show (i) implies (ii), suppose that the graph (V, E) is not connected. Then it can be decomposed into some k > 1 components (disjoint

connected subgraphs), written (V, E) = ∪_{i=1}^k (V_i, E_i), where each (V_i, E_i) is a connected subgraph and the symbol ∪ denotes disjoint union. Obviously, if a ≠ b, then for two vertices (i_a, j_a) ∈ V_a and (i_b, j_b) ∈ V_b we have i_a ≠ i_b and j_a ≠ j_b. So if a vertex (i_a, j_a) is in V_a, then all vertices on the same row or the same column are in V_a. Let α_i ⊂ {1, . . . , I} be the set of row indices in V_i and β_i ⊂ {1, . . . , J} be the set of column indices in V_i, i = 1, . . . , k. Then

α_a ∩ α_b = ∅ and β_a ∩ β_b = ∅ for a ≠ b,

∪_{i=1}^k α_i = {1, . . . , I} and ∪_{i=1}^k β_i = {1, . . . , J}.

Therefore, by interchanging some rows and/or columns, the ratio matrix C can be rearranged as a block diagonal matrix

( T_1  ∗    ∗    ∗
  ∗    T_2  ∗    ∗
  ∗    ∗    ...  ∗
  ∗    ∗    ∗    T_k ),

where T_i corresponds to the rows in α_i and the columns in β_i. By Definition 2.2.3, the ratio matrix C is not irreducible. So (i) implies (ii).

To show (ii) implies (i), suppose that the ratio matrix C is not irreducible. By Lemma 2.2.1, after interchanging some rows and/or columns, the ratio matrix C can be rearranged as a block diagonal matrix

( T_1  ∗    ∗    ∗
  ∗    T_2  ∗    ∗
  ∗    ∗    ...  ∗
  ∗    ∗    ∗    T_k ),

where the diagonal blocks T_1, . . . , T_k are irreducible and the elements off these diagonal blocks are all ∗. Let V_i denote the vertices in T_i.

Clearly, V_1, . . . , V_k are not connected. So the graph (V, E) is not connected, and (ii) implies (i). The proof is complete.

Since the ratio matrix C is irreducible if and only if the graph (V, E) is connected, we can combine Theorem 2.2.5 with Theorem 3.2.1 into the following theorem.

Theorem 3.3.3. Suppose that the graph (V, E) is connected. Then the following statements are equivalent.

(i) C has a unique rank-one positive extension C̄.

(ii) For every cycle υ0 υ1 . . . υl υ0, we have

∏_{i=0}^{l} r(υ_{i+1}, υ_i) = 1,

where υ_{l+1} := υ0.
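Theorem 3.3.1 is easy to check numerically on Example 3.3.1. A sketch in exact fractions follows (not from the thesis); for a 2 × 2 ratio matrix, the rank-one condition reduces to a single vanishing minor, and the cycle condition to the product around the unique 4-cycle.

```python
from fractions import Fraction as F

# Example 3.3.1: column-stochastic A = P(X1 | X2), row-stochastic B = P(X2 | X1).
A = [[F(1, 3), F(3, 8)], [F(2, 3), F(5, 8)]]
B = [[F(1, 4), F(3, 4)], [F(2, 7), F(5, 7)]]

# Ratio matrix C_ij = A_ij / B_ij; note F(4, 8) is the same as 1/2.
C = [[A[i][j] / B[i][j] for j in range(2)] for i in range(2)]
assert C == [[F(4, 3), F(1, 2)], [F(7, 3), F(7, 8)]]

# (i) C has rank one: the 2x2 minor vanishes.
assert C[0][0] * C[1][1] == C[0][1] * C[1][0]

# (ii) product of ratios around the cycle (1,1)->(1,2)->(2,2)->(2,1)->(1,1) is 1.
cycle_product = (B[0][1]/B[0][0]) * (A[1][1]/A[0][1]) * (B[1][0]/B[1][1]) * (A[0][0]/A[1][0])
assert cycle_product == 1
```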

4. Markov chain characterizations

4.1. Compatibility by the Gibbs sampler

Suppose that X and Y are two random variables taking values in {x_1, . . . , x_I} and {y_1, . . . , y_J}, respectively. Consider two conditional probability matrices A = (A_{ij}) = (P{X = x_i | Y = y_j}) and B = (B_{ij}) = (P{Y = y_j | X = x_i}). Arnold, Castillo and Sarabia (1999) treated the matrix A′ (the transpose of A) as a transition matrix from Y to X and the matrix B as a transition matrix from X to Y, and then applied the Gibbs sampler to obtain stationary distributions. We describe the method as follows. For ease of discussion, we assume A_{ij} > 0 and B_{ij} > 0 for all i, j.

We begin with an initial X(1). Conditioning on X(1), draw Y(1) from B. Next, conditioning on Y(1), draw X(2) from A′. So we have the following transitions:

X(1) →(B) Y(1) →(A′) X(2) →(B) Y(2) →(A′) X(3) →(B) Y(3) → · · ·

This is a Markov chain, but it is not homogeneous. We then combine two transitions into a single one, so that we have the following two homogeneous chains:

X(1) → X(2) → X(3) → · · ·
Y(1) → Y(2) → Y(3) → · · ·

The transition matrix of the first chain is BA′, and the transition matrix of the second chain is A′B. Each chain determines a stationary distribution, say τ = (τ_i) and η = (η_j), where τ_i = P(X = x_i) and η_j = P(Y = y_j). That is, τ and η are solutions of the following systems:

τBA′ = τ,    (4.1)
ηA′B = η.    (4.2)

Note that both transition matrices BA′ and A′B are irreducible, so that the respective stationary distributions τ and η are unique.

τ and B together determine a joint distribution f(x_i, y_j) = τ_i B_{ij}, and η and A together determine a joint distribution g(x_i, y_j) = η_j A_{ij}.

Let f(x_i, +) = ∑_{y_j} f(x_i, y_j) and f(+, y_j) = ∑_{x_i} f(x_i, y_j), so that f(x_i, +) and f(+, y_j) are the marginal distributions of f. Arnold, Castillo and Sarabia (1999) obtained the following theorem.

Theorem 4.1.1.

(i) Whether A and B are compatible or not, the two joint distributions f and g have the same marginal distributions. That is, f(x_i, +) = g(x_i, +) and f(+, y_j) = g(+, y_j) for all i, j.

(ii) A and B are compatible if and only if the stationary distributions τ and η of the respective transition matrices BA′ and A′B satisfy τ_i B_{ij} = η_j A_{ij} for all i, j, i.e., f(x_i, y_j) = g(x_i, y_j) for all i, j.

Proof:

(i) Note that

f(+, y_j) = ∑_{x_i} f(x_i, y_j) = ∑_i τ_i B_{ij} = (τB)_j.

So the row vector τB corresponds to the Y-marginal distribution of f. Similarly, ηA′ corresponds to the X-marginal distribution of g. Multiplying equation (4.1) on the right by B yields

(τB)A′B = (τB),

which together with (4.2) implies τB = η. So the Y-marginal distribution of f = τB = η = the Y-marginal distribution of g.

Multiplying equation (4.2) on the right by A′ yields

(ηA′)BA′ = (ηA′).

From equation (4.1), we have

ηA′ = τ.

So the X-marginal distribution of g = ηA′ = τ = the X-marginal distribution of f. This proves that both joint distributions have the same marginal distributions.

(ii) Suppose that A and B are compatible, implying that there exists a joint distribution h(x_i, y_j) such that

h(x_i, y_j)/h(+, y_j) = A_{ij} and h(x_i, y_j)/h(x_i, +) = B_{ij}.

Let

h_X = (h(x_1, +), . . . , h(x_I, +)) and h_Y = (h(+, y_1), . . . , h(+, y_J)),

which correspond to the X- and Y-marginal distributions of h. So

h_X B = h_Y,    (4.3)
h_Y A′ = h_X.    (4.4)

Multiplying equation (4.3) on the right by A′ and equation (4.4) on the right by B yields

h_X BA′ = h_Y A′ = h_X,
h_Y A′B = h_X B = h_Y.

By the uniqueness of the stationary distributions in (4.1) and (4.2), we have τ = h_X and η = h_Y. It follows that f(x_i, y_j) = g(x_i, y_j) = h(x_i, y_j) for all i, j.

Conversely, suppose that f(x_i, y_j) = g(x_i, y_j) for all i, j. Since A is the conditional distribution of X given Y under g and B is the conditional distribution of Y given X under f, it follows that f = g has A and B as its two conditional distributions. This proves that A and B are compatible.

Example 4.1.1. Consider two conditional distribution matrices:

A = ( 0.4  0.2        B = ( 0.9  0.1
      0.6  0.8 ),           0.3  0.7 ).

Then, we have

BA′ = ( 0.38  0.62        A′B = ( 0.54  0.46
        0.26  0.74 ),             0.42  0.58 ).

Solving

τBA′ = τ and ηA′B = η

yields

τ = (0.29546, 0.70454),  η = (0.47727, 0.52273).

τ and B together determine a joint distribution

(f(x_i, y_j)) = (τ_i B_{ij}) = ( 0.26591  0.02955
                                 0.21136  0.49318 ),

while η and A together determine a joint distribution

(g(x_i, y_j)) = (η_j A_{ij}) = ( 0.19091  0.10455
                                 0.28636  0.41818 ).

The two joint distributions are different, so A and B are incompatible. However, they have the same marginal distributions, τ and η.
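The computations in Example 4.1.1 are straightforward to reproduce. A sketch using numpy follows, with the stationary distributions obtained by power iteration (any linear solver would do as well):

```python
import numpy as np

# Conditional matrices from Example 4.1.1:
# A[i, j] = P(X = x_i | Y = y_j), B[i, j] = P(Y = y_j | X = x_i).
A = np.array([[0.4, 0.2],
              [0.6, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])

def stationary(P, n=200):
    """Stationary row vector of an irreducible transition matrix P (power iteration)."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(n):
        pi = pi @ P
    return pi

tau = stationary(B @ A.T)   # X-chain, transition matrix BA'
eta = stationary(A.T @ B)   # Y-chain, transition matrix A'B
assert np.allclose(tau, [0.29546, 0.70454], atol=1e-4)
assert np.allclose(eta, [0.47727, 0.52273], atol=1e-4)

f = tau[:, None] * B        # f(x_i, y_j) = tau_i * B_ij
g = eta[None, :] * A        # g(x_i, y_j) = eta_j * A_ij

# Theorem 4.1.1(i): identical marginals; (ii): f != g, hence incompatible.
assert np.allclose(f.sum(axis=1), g.sum(axis=1))
assert np.allclose(f.sum(axis=0), g.sum(axis=0))
assert not np.allclose(f, g)
```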

Arnold, Castillo and Sarabia (1999) only considered Markov chain characterizations involving two random variables. We now consider the three-dimensional case, where X, Y and Z are discrete random variables with I, J and K possible values, respectively. Three conditional distributions are given by

A_{ijk} = P(X = x_i | Y = y_j, Z = z_k),
B_{ijk} = P(Y = y_j | X = x_i, Z = z_k),
C_{ijk} = P(Z = z_k | X = x_i, Y = y_j).

Again, for ease of discussion, we assume A_{ijk}, B_{ijk} and C_{ijk} are all positive.

We generate a Markov chain X(1), Y(1), Z(1), X(2), Y(2), Z(2), . . . as follows. We start with (X(1), Y(1)). Then we generate Z(1) using C together with (X(1), Y(1)). Thus we move from (X(1), Y(1)) to (Y(1), Z(1)). Next, we generate X(2) using A together with (Y(1), Z(1)), resulting in a movement from (Y(1), Z(1)) to (Z(1), X(2)). Note that in each transition, one of the two components remains the same. So we have the following transitions:

(X(1), Y(1)) → (Y(1), Z(1)) → (Z(1), X(2)) → (X(2), Y(2)) → (Y(2), Z(2)) → · · ·

This is a Markov chain, but it is not homogeneous. We then combine three transitions into a single one, so that we have three homogeneous chains:

(X(1), Y(1)) → (X(2), Y(2)) → (X(3), Y(3)) → · · ·
(Y(1), Z(1)) → (Y(2), Z(2)) → (Y(3), Z(3)) → · · ·
(Z(1), X(2)) → (Z(2), X(3)) → (Z(3), X(4)) → · · ·

Let Ā be the transition matrix from (Y, Z) to (Z, X):

Ā((j, k), (h, i)) = P(Z = z_h, X = x_i | Y = y_j, Z = z_k) = A_{ijk} if h = k, and 0 if h ≠ k;

B̄ the transition matrix from (Z, X) to (X, Y):

B̄((k, i), (h, j)) = P(X = x_h, Y = y_j | Z = z_k, X = x_i) = B_{ijk} if h = i, and 0 if h ≠ i;

and C̄ the transition matrix from (X, Y) to (Y, Z):

C̄((i, j), (h, k)) = P(Y = y_h, Z = z_k | X = x_i, Y = y_j) = C_{ijk} if h = j, and 0 if h ≠ j.

The transition matrix of the first chain is C̄ĀB̄, the transition matrix of the second chain is ĀB̄C̄, and the transition matrix of the third chain is B̄C̄Ā. Each chain has a unique stationary distribution, say τ = (τ(i, j)) of dimension IJ, η = (η(j, k)) of dimension JK and θ = (θ(k, i)) of dimension KI. That is, τ, η and θ satisfy

τC̄ĀB̄ = τ,    (4.5)
ηĀB̄C̄ = η,    (4.6)
θB̄C̄Ā = θ.    (4.7)

τ and C together determine a joint distribution, f(x_i, y_j, z_k) = τ(i, j)C_{ijk}; η and A together determine a joint distribution, g(x_i, y_j, z_k) = η(j, k)A_{ijk}; and θ and B together determine a joint distribution, h(x_i, y_j, z_k) = θ(k, i)B_{ijk}.

We have the following result.

Theorem 4.1.2.

(i) The (Y, Z)-distribution under f is the same as that under g, the (X, Z)-distribution under g is the same as that under h, and the (X, Y)-distribution under h is the same as that under f. That is,

f(+, y_j, z_k) = g(+, y_j, z_k) for all j, k,
g(x_i, +, z_k) = h(x_i, +, z_k) for all i, k,
h(x_i, y_j, +) = f(x_i, y_j, +) for all i, j.

Consequently, f, g and h have the same X-, Y- and Z-marginal distributions.

(ii) A, B and C are compatible if and only if the stationary distributions τ, η and θ of the respective transition matrices C̄ĀB̄, ĀB̄C̄ and

B̄C̄Ā satisfy τ(i, j)C_{ijk} = η(j, k)A_{ijk} = θ(k, i)B_{ijk} for all i, j, k, i.e., f(x_i, y_j, z_k) = g(x_i, y_j, z_k) = h(x_i, y_j, z_k) for all i, j, k.

Proof:

(i) The distribution of (Y, Z) under f is

f(+, y_j, z_k) = ∑_i τ(i, j)C_{ijk} = ∑_{i,h} τ(i, h)C̄((i, h), (j, k)) = the (j, k) component of τC̄.

So τC̄ corresponds to the (Y, Z)-distribution under f. Similarly, ηĀ and θB̄ correspond respectively to the (Z, X)- and (X, Y)-distributions under g and h. Multiplying equation (4.5) on the right by C̄ yields

(τC̄)ĀB̄C̄ = (τC̄),

which together with (4.6) implies τC̄ = η. So the (Y, Z)-distribution under f = τC̄ = η = the (Y, Z)-distribution under g. Multiplying equation (4.6) on the right by Ā yields

(ηĀ)B̄C̄Ā = (ηĀ).

From equation (4.7), we have ηĀ = θ. So the (Z, X)-distribution under g = ηĀ = θ = the (Z, X)-distribution under h. Multiplying equation (4.7) on the right by B̄ yields

(θB̄)C̄ĀB̄ = (θB̄).

From equation (4.5), we have

θB̄ = τ.

So the (X, Y)-distribution under h = θB̄ = τ = the (X, Y)-distribution under f. We have shown that the (Y, Z)-distribution under f is the same as that under g, the (X, Z)-distribution under g is the same as that under h, and the (X, Y)-distribution under h is the same as that under f. That is,

f(+, y_j, z_k) = g(+, y_j, z_k) for all j, k,
g(x_i, +, z_k) = h(x_i, +, z_k) for all i, k,
h(x_i, y_j, +) = f(x_i, y_j, +) for all i, j.

Consequently, f, g and h have the same X-, Y- and Z-marginal distributions.

(ii) Suppose that A, B and C are compatible, implying that there exists a joint distribution d(x_i, y_j, z_k) such that

A_{ijk} = d(x_i, y_j, z_k)/d(+, y_j, z_k),
B_{ijk} = d(x_i, y_j, z_k)/d(x_i, +, z_k),
C_{ijk} = d(x_i, y_j, z_k)/d(x_i, y_j, +).

Let

d_{X,Y} = (d(x_1, y_1, +), . . . , d(x_I, y_J, +)),
d_{Y,Z} = (d(+, y_1, z_1), . . . , d(+, y_J, z_K)),
d_{Z,X} = (d(x_1, +, z_1), . . . , d(x_I, +, z_K)).

So

d_{X,Y} C̄ = d_{Y,Z},    (4.8)

d_{Y,Z} Ā = d_{Z,X},    (4.9)
d_{Z,X} B̄ = d_{X,Y}.    (4.10)

Multiplying equation (4.8) on the right by ĀB̄, equation (4.9) by B̄C̄ and equation (4.10) by C̄Ā yields

d_{X,Y} C̄ĀB̄ = d_{X,Y},
d_{Y,Z} ĀB̄C̄ = d_{Y,Z},
d_{Z,X} B̄C̄Ā = d_{Z,X}.

By the uniqueness of the stationary distributions in (4.5), (4.6) and (4.7), we have

τ = d_{X,Y},  η = d_{Y,Z},  θ = d_{Z,X}.

It follows that f(x_i, y_j, z_k) = g(x_i, y_j, z_k) = h(x_i, y_j, z_k) = d(x_i, y_j, z_k) for all i, j, k.

Conversely, suppose f(x_i, y_j, z_k) = g(x_i, y_j, z_k) = h(x_i, y_j, z_k) for all i, j, k. Since A is the conditional distribution of X given (Y, Z) under g, B is the conditional distribution of Y given (Z, X) under h and C is the conditional distribution of Z given (X, Y) under f, it follows that f = g = h has A, B and C as its three conditional distributions. This proves that A, B and C are compatible.

Example 4.1.2. (Example 3.2.2 continued) Consider three random variables X, Y and Z with possible values (x1, x2), (y1, y2) and (z1, z2), and three matrices A, B and C:

                x1    x2
A:  y1, z1      0.1   0.9
    y1, z2      0.9   0.1
    y2, z1      0.2   0.8
    y2, z2      0.8   0.2

                y1    y2
B:  z1, x1      0.3   0.7
    z1, x2      0.7   0.3
    z2, x1      0.4   0.6
    z2, x2      0.6   0.4

                z1    z2
C:  x1, y1      0.4   0.6
    x1, y2      0.6   0.4
    x2, y1      0.5   0.5
    x2, y2      0.5   0.5

Suppose that our generation sequence is X(1), Y(1), Z(1), X(2), Y(2), Z(2), . . . Then C̄ is the following transition matrix from (X, Y) to (Y, Z):

            y1,z1  y1,z2  y2,z1  y2,z2
x1, y1       0.4    0.6    0      0
x1, y2       0      0      0.6    0.4
x2, y1       0.5    0.5    0      0
x2, y2       0      0      0.5    0.5

Ā is the following transition matrix from (Y, Z) to (Z, X):

            z1,x1  z1,x2  z2,x1  z2,x2
y1, z1       0.1    0.9    0      0
y1, z2       0      0      0.9    0.1
y2, z1       0.2    0.8    0      0
y2, z2       0      0      0.8    0.2

B̄ is the following transition matrix from (Z, X) to (X, Y):

            x1,y1  x1,y2  x2,y1  x2,y2
z1, x1       0.3    0.7    0      0
z1, x2       0      0      0.7    0.3
z2, x1       0.4    0.6    0      0
z2, x2       0      0      0.6    0.4

Then C̄ĀB̄ is the following transition matrix from (X, Y) to (X, Y):

            x1,y1  x1,y2  x2,y1  x2,y2
x1, y1      0.228  0.352  0.288  0.132
x1, y2      0.164  0.276  0.384  0.176
x2, y1      0.195  0.305  0.345  0.155
x2, y2      0.190  0.310  0.340  0.160

ĀB̄C̄ is the following transition matrix from (Y, Z) to (Y, Z):

            y1,z1  y1,z2  y2,z1  y2,z2
y1, z1      0.327  0.333  0.177  0.163
y1, z2      0.174  0.246  0.344  0.236
y2, z1      0.304  0.316  0.204  0.176
y2, z2      0.188  0.252  0.328  0.232

B̄C̄Ā is the following transition matrix from (Z, X) to (Z, X):

            z1,x1  z1,x2  z2,x1  z2,x2
z1, x1      0.096  0.444  0.386  0.074
z1, x2      0.065  0.435  0.435  0.065
z2, x1      0.088  0.432  0.408  0.072
z2, x2      0.070  0.430  0.430  0.070

Suppose that τ, η and θ satisfy the following systems:

τC̄ĀB̄ = τ,
ηĀB̄C̄ = η,
θB̄C̄Ā = θ.

We find

τ = (0.1910322, 0.3058966, 0.3452520, 0.1578192),
η = (0.2490389, 0.2872453, 0.2624476, 0.2012682),
θ = (0.0773934, 0.4340930, 0.4195354, 0.0689782).
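These stationary distributions can be reproduced numerically. The sketch below (numpy assumed; not part of the thesis) builds C̄, Ā and B̄ from the tables for A, B and C and power-iterates the three combined chains:

```python
import numpy as np

# Conditionals from Example 4.1.2, 0-indexed so that
# A[i, j, k] = P(X = x_{i+1} | Y = y_{j+1}, Z = z_{k+1}), and similarly for B, C.
A = np.zeros((2, 2, 2)); B = np.zeros((2, 2, 2)); C = np.zeros((2, 2, 2))
A[:, 0, 0] = [0.1, 0.9]; A[:, 0, 1] = [0.9, 0.1]
A[:, 1, 0] = [0.2, 0.8]; A[:, 1, 1] = [0.8, 0.2]
B[0, :, 0] = [0.3, 0.7]; B[1, :, 0] = [0.7, 0.3]
B[0, :, 1] = [0.4, 0.6]; B[1, :, 1] = [0.6, 0.4]
C[0, 0, :] = [0.4, 0.6]; C[0, 1, :] = [0.6, 0.4]
C[1, 0, :] = [0.5, 0.5]; C[1, 1, :] = [0.5, 0.5]

idx = [(0, 0), (0, 1), (1, 0), (1, 1)]   # pair-state order used in the tables above

# Cbar: (X,Y)->(Y,Z); Abar: (Y,Z)->(Z,X); Bbar: (Z,X)->(X,Y).
Cbar = np.array([[C[i, j, k] if h == j else 0.0 for (h, k) in idx] for (i, j) in idx])
Abar = np.array([[A[i, j, k] if h == k else 0.0 for (h, i) in idx] for (j, k) in idx])
Bbar = np.array([[B[i, j, k] if h == i else 0.0 for (h, j) in idx] for (k, i) in idx])

def stationary(P, n=500):
    pi = np.full(len(P), 1.0 / len(P))
    for _ in range(n):
        pi = pi @ P
    return pi

tau = stationary(Cbar @ Abar @ Bbar)
eta = stationary(Abar @ Bbar @ Cbar)
theta = stationary(Bbar @ Cbar @ Abar)

assert np.allclose(tau, [0.1910322, 0.3058966, 0.3452520, 0.1578192], atol=1e-6)
assert np.allclose(eta, [0.2490389, 0.2872453, 0.2624476, 0.2012682], atol=1e-6)
assert np.allclose(theta, [0.0773934, 0.4340930, 0.4195354, 0.0689782], atol=1e-6)
assert np.allclose(tau @ Cbar, eta)   # the relation tau*Cbar = eta from the proof
```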

From τ and C, we can determine a joint distribution (f(x_i, y_j, z_k)) = (τ(i, j)C_{ijk}):

(f(x1, y1, z1), f(x1, y1, z2), f(x1, y2, z1), f(x1, y2, z2), f(x2, y1, z1), f(x2, y1, z2), f(x2, y2, z1), f(x2, y2, z2))
= (0.07641288, 0.1146193, 0.183538, 0.1223586, 0.172626, 0.172626, 0.0789096, 0.0789096).

Then

the X-marginal distribution of f = (f(x1, +, +), f(x2, +, +)) = (0.4969288, 0.5030712),
the Y-marginal distribution of f = (f(+, y1, +), f(+, y2, +)) = (0.5362842, 0.4637158),
the Z-marginal distribution of f = (f(+, +, z1), f(+, +, z2)) = (0.5114865, 0.4885135).

From η and A, we can determine a joint distribution (g(x_i, y_j, z_k)) = (η(j, k)A_{ijk}):

(g(x1, y1, z1), g(x1, y1, z2), g(x1, y2, z1), g(x1, y2, z2), g(x2, y1, z1), g(x2, y1, z2), g(x2, y2, z1), g(x2, y2, z2))
= (0.02490389, 0.2585208, 0.05248952, 0.1610146, 0.224135, 0.02872453, 0.2099581, 0.04025364).

Then

the X-marginal distribution of g = (g(x1, +, +), g(x2, +, +)) = (0.4969288, 0.5030712),
the Y-marginal distribution of g = (g(+, y1, +), g(+, y2, +)) = (0.5362842, 0.4637158),
the Z-marginal distribution of g = (g(+, +, z1), g(+, +, z2)) = (0.5114865, 0.4885135).

From θ and B, we can determine a joint distribution (h(x_i, y_j, z_k)) = (θ(k, i)B_{ijk}):

(h(x1, y1, z1), h(x1, y1, z2), h(x1, y2, z1), h(x1, y2, z2), h(x2, y1, z1), h(x2, y1, z2), h(x2, y2, z1), h(x2, y2, z2))
= (0.02321802, 0.1678142, 0.05417538, 0.2517212, 0.3038651, 0.04138691, 0.1302279, 0.02759127).

Then

the X-marginal distribution of h = (h(x1, +, +), h(x2, +, +)) = (0.4969288, 0.5030712),
the Y-marginal distribution of h = (h(+, y1, +), h(+, y2, +)) = (0.5362842, 0.4637158),
the Z-marginal distribution of h = (h(+, +, z1), h(+, +, z2)) = (0.5114864, 0.4885136).

So

f(x_i, +, +) = g(x_i, +, +) = h(x_i, +, +) for all i,
f(+, y_j, +) = g(+, y_j, +) = h(+, y_j, +) for all j,
f(+, +, z_k) = g(+, +, z_k) = h(+, +, z_k) for all k.

All three joint distributions are different, so A, B and C are incompatible. However, they have the same marginal distributions.

In fact, we can consider an alternative Markov chain X(1), Z(1), Y(1), X(2), Z(2), Y(2), . . . Specifically, start with (X(1), Z(1)); then we generate Y(1) using B. Thus we move from (X(1), Z(1)) to (Z(1), Y(1)). Next, we generate X(2) using A together with (Z(1), Y(1)), resulting in a movement from (Z(1), Y(1)) to (Y(1), X(2)). Note that in each transition, one of the two components remains the same. So we have the following transitions:

(X(1), Z(1)) → (Z(1), Y(1)) → (Y(1), X(2)) → (X(2), Z(2)) → (Z(2), Y(2)) → · · ·

This is a Markov chain, but it is not homogeneous. We then combine three transitions into a single one, so that we have three homogeneous chains:

(X(1), Z(1)) → (X(2), Z(2)) → (X(3), Z(3)) → · · ·
(Z(1), Y(1)) → (Z(2), Y(2)) → (Z(3), Y(3)) → · · ·
(Y(1), X(2)) → (Y(2), X(3)) → (Y(3), X(4)) → · · ·

Let Ã be the following transition matrix from (Z, Y) to (Y, X):

Ã((k, j), (h, i)) = P(Y = y_h, X = x_i | Z = z_k, Y = y_j) = A_{ijk} if h = j, and 0 if h ≠ j;

B̃ the following transition matrix from (X, Z) to (Z, Y):

B̃((i, k), (h, j)) = P(Z = z_h, Y = y_j | X = x_i, Z = z_k) = B_{ijk} if h = k, and 0 if h ≠ k;

and C̃ the following transition matrix from (Y, X) to (X, Z):

C̃((j, i), (h, k)) = P(X = x_h, Z = z_k | Y = y_j, X = x_i) = C_{ijk} if h = i, and 0 if h ≠ i.

The transition matrix of the first chain is B̃ÃC̃, the transition matrix of the second chain is ÃC̃B̃, and the transition matrix of the third chain is C̃B̃Ã. Each chain has a unique stationary distribution, say τ̃ = (τ̃(i, k)) of dimension IK, η̃ = (η̃(k, j)) of dimension KJ and θ̃ = (θ̃(j, i)) of dimension JI. That is, τ̃, η̃ and θ̃ are solutions of the following systems:

τ̃B̃ÃC̃ = τ̃,    (4.11)
η̃ÃC̃B̃ = η̃,    (4.12)
θ̃C̃B̃Ã = θ̃.    (4.13)

τ̃ and B together determine a joint distribution, f̃(x_i, y_j, z_k) = τ̃(i, k)B_{ijk}; η̃ and A together determine a joint distribution, g̃(x_i, y_j, z_k) = η̃(k, j)A_{ijk}; and θ̃ and C together determine a joint distribution, h̃(x_i, y_j, z_k) = θ̃(j, i)C_{ijk}.

Following the proof of Theorem 4.1.2, we obtain the following theorem.

Theorem 4.1.3.

(i) The (Y, Z)-distribution under f̃ is the same as that under g̃, the (X, Y)-distribution under g̃ is the same as that under h̃, and the (X, Z)-distribution under h̃ is the same as that under f̃. That is,

f̃(+, y_j, z_k) = g̃(+, y_j, z_k) for all j, k,
g̃(x_i, y_j, +) = h̃(x_i, y_j, +) for all i, j,
h̃(x_i, +, z_k) = f̃(x_i, +, z_k) for all i, k.

Consequently, f̃, g̃ and h̃ have the same X-, Y- and Z-marginal distributions.

(ii) A, B and C are compatible if and only if the stationary distributions τ̃, η̃ and θ̃ of the respective transition matrices B̃ÃC̃, ÃC̃B̃ and C̃B̃Ã satisfy τ̃(i, k)B_{ijk} = η̃(k, j)A_{ijk} = θ̃(j, i)C_{ijk} for all i, j, k. That is, f̃(x_i, y_j, z_k) = g̃(x_i, y_j, z_k) = h̃(x_i, y_j, z_k) for all i, j, k.

Example 4.1.3. (Example 4.1.2 continued)

                x1    x2
A:  z1, y1      0.1   0.9
    z1, y2      0.2   0.8
    z2, y1      0.9   0.1
    z2, y2      0.8   0.2

                y1    y2
B:  x1, z1      0.3   0.7
    x1, z2      0.4   0.6
    x2, z1      0.7   0.3
    x2, z2      0.6   0.4

    C:           z1    z2
      y1,x1     0.4   0.6
      y1,x2     0.5   0.5
      y2,x1     0.6   0.4
      y2,x2     0.5   0.5

Suppose that our generation sequence is X^(1), Z^(1), Y^(1), X^(2), Z^(2), Y^(2), ... Then C̃ is the following transition matrix from (Y, X) to (X, Z):

             x1,z1   x1,z2   x2,z1   x2,z2
    y1,x1     0.4     0.6     0       0
    y1,x2     0       0       0.5     0.5
    y2,x1     0.6     0.4     0       0
    y2,x2     0       0       0.5     0.5

Ã is the following transition matrix from (Z, Y) to (Y, X):

             y1,x1   y1,x2   y2,x1   y2,x2
    z1,y1     0.1     0.9     0       0
    z1,y2     0       0       0.2     0.8
    z2,y1     0.9     0.1     0       0
    z2,y2     0       0       0.8     0.2

B̃ is the following transition matrix from (X, Z) to (Z, Y):

             z1,y1   z1,y2   z2,y1   z2,y2
    x1,z1     0.3     0.7     0       0
    x1,z2     0       0       0.4     0.6
    x2,z1     0.7     0.3     0       0
    x2,z2     0       0       0.6     0.4

Then B̃ÃC̃ is the following transition matrix from (X, Z) to (X, Z):

             x1,z1   x1,z2   x2,z1   x2,z2
    x1,z1    0.096   0.074   0.415   0.415
    x1,z2    0.432   0.408   0.080   0.080
    x2,z1    0.064   0.066   0.435   0.435
    x2,z2    0.408   0.452   0.070   0.070

ÃC̃B̃ is the following transition matrix from (Z, Y) to (Z, Y):

             z1,y1   z1,y2   z2,y1   z2,y2
    z1,y1    0.327   0.163   0.294   0.216
    z1,y2    0.316   0.204   0.272   0.208
    z2,y1    0.143   0.267   0.246   0.344
    z2,y2    0.214   0.366   0.188   0.232

C̃B̃Ã is the following transition matrix from (Y, X) to (Y, X):

             y1,x1   y1,x2   y2,x1   y2,x2
    y1,x1    0.228   0.132   0.344   0.296
    y1,x2    0.305   0.345   0.190   0.160
    y2,x1    0.162   0.178   0.276   0.384
    y2,x2    0.305   0.345   0.190   0.160

Suppose that τ̃, η̃ and θ̃ satisfy the following systems:

    τ̃ B̃ÃC̃ = τ̃,
    η̃ ÃC̃B̃ = η̃,
    θ̃ C̃B̃Ã = θ̃.

We find

    τ̃ = (0.25, 0.25, 0.25, 0.25),
    η̃ = (0.25, 0.25, 0.25, 0.25),
    θ̃ = (0.25, 0.25, 0.25, 0.25).
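These computations can be reproduced in a few lines (a sketch; array names are mine, not the thesis's). The code builds Ã, B̃, C̃ from the example's conditionals, forms the three composite transition matrices, and checks that the uniform vector is stationary for each, as found above.

```python
import numpy as np

# Conditionals of Example 4.1.3, stored as 3-arrays with
# A[i,j,k] = P(X=x_i|Y=y_j,Z=z_k), B[i,j,k] = P(Y=y_j|X=x_i,Z=z_k),
# C[i,j,k] = P(Z=z_k|X=x_i,Y=y_j).
A = np.zeros((2, 2, 2)); B = np.zeros((2, 2, 2)); C = np.zeros((2, 2, 2))
A[:, 0, 0] = [0.1, 0.9]; A[:, 1, 0] = [0.2, 0.8]
A[:, 0, 1] = [0.9, 0.1]; A[:, 1, 1] = [0.8, 0.2]
B[0, :, 0] = [0.3, 0.7]; B[0, :, 1] = [0.4, 0.6]
B[1, :, 0] = [0.7, 0.3]; B[1, :, 1] = [0.6, 0.4]
C[0, 0, :] = [0.4, 0.6]; C[0, 1, :] = [0.6, 0.4]
C[1, 0, :] = [0.5, 0.5]; C[1, 1, :] = [0.5, 0.5]

# Block transition matrices, rows/columns ordered as in the tables above.
Atil = np.zeros((4, 4)); Btil = np.zeros((4, 4)); Ctil = np.zeros((4, 4))
for i in range(2):
    for j in range(2):
        for k in range(2):
            Atil[2 * k + j, 2 * j + i] = A[i, j, k]  # (Z,Y) -> (Y,X)
            Btil[2 * i + k, 2 * k + j] = B[i, j, k]  # (X,Z) -> (Z,Y)
            Ctil[2 * j + i, 2 * i + k] = C[i, j, k]  # (Y,X) -> (X,Z)

BAC = Btil @ Atil @ Ctil  # transition matrix of the (X,Z)-chain
assert np.allclose(BAC[0], [0.096, 0.074, 0.415, 0.415])  # first row as listed

# In this example every composite matrix is doubly stochastic (rows and
# columns both sum to 1), so the uniform vector is stationary for all three
# chains: tau = eta = theta = (0.25, 0.25, 0.25, 0.25).
u = np.full(4, 0.25)
for P in (BAC, Atil @ Ctil @ Btil, Ctil @ Btil @ Atil):
    assert np.allclose(u @ P, u)
```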

From τ̃ and B, we can determine a joint distribution (f̃(x_i, y_j, z_k)) = (τ̃(i, k)B_ijk):

    (f̃(x1, y1, z1), f̃(x1, y1, z2), f̃(x1, y2, z1), f̃(x1, y2, z2),
     f̃(x2, y1, z1), f̃(x2, y1, z2), f̃(x2, y2, z1), f̃(x2, y2, z2))
    = (0.075, 0.100, 0.175, 0.150, 0.175, 0.150, 0.075, 0.100).

Then
the X-marginal distribution of f̃ = (f̃(x1, +, +), f̃(x2, +, +)) = (0.5, 0.5),
the Y-marginal distribution of f̃ = (f̃(+, y1, +), f̃(+, y2, +)) = (0.5, 0.5),
the Z-marginal distribution of f̃ = (f̃(+, +, z1), f̃(+, +, z2)) = (0.5, 0.5).

From η̃ and A, we can determine a joint distribution (g̃(x_i, y_j, z_k)) = (η̃(k, j)A_ijk):

    (g̃(x1, y1, z1), g̃(x1, y1, z2), g̃(x1, y2, z1), g̃(x1, y2, z2),
     g̃(x2, y1, z1), g̃(x2, y1, z2), g̃(x2, y2, z1), g̃(x2, y2, z2))
    = (0.025, 0.225, 0.050, 0.200, 0.225, 0.025, 0.200, 0.050).

Then
the X-marginal distribution of g̃ = (g̃(x1, +, +), g̃(x2, +, +)) = (0.5, 0.5),
the Y-marginal distribution of g̃ = (g̃(+, y1, +), g̃(+, y2, +)) = (0.5, 0.5),
the Z-marginal distribution of g̃ = (g̃(+, +, z1), g̃(+, +, z2)) = (0.5, 0.5).

From θ̃ and C, we can determine a joint distribution (h̃(x_i, y_j, z_k)) = (θ̃(j, i)C_ijk):

    (h̃(x1, y1, z1), h̃(x1, y1, z2), h̃(x1, y2, z1), h̃(x1, y2, z2),
     h̃(x2, y1, z1), h̃(x2, y1, z2), h̃(x2, y2, z1), h̃(x2, y2, z2))
    = (0.100, 0.150, 0.150, 0.100, 0.125, 0.125, 0.125, 0.125).

Then
the X-marginal distribution of h̃ = (h̃(x1, +, +), h̃(x2, +, +)) = (0.5, 0.5),
the Y-marginal distribution of h̃ = (h̃(+, y1, +), h̃(+, y2, +)) = (0.5, 0.5),
the Z-marginal distribution of h̃ = (h̃(+, +, z1), h̃(+, +, z2)) = (0.5, 0.5).
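Continuing the numerical sketch (array names mine), the three joint distributions and their marginals can be checked directly. As Theorem 4.1.3(i) guarantees, f̃, g̃ and h̃ share the same uniform marginals; yet the joints themselves differ, so by the criterion of Theorem 4.1.3(ii) the conditionals A, B and C of this example are not compatible.

```python
import numpy as np

# Conditionals of Example 4.1.3 (A[i,j,k] = P(X=x_i|Y=y_j,Z=z_k), etc.).
A = np.zeros((2, 2, 2)); B = np.zeros((2, 2, 2)); C = np.zeros((2, 2, 2))
A[:, 0, 0] = [0.1, 0.9]; A[:, 1, 0] = [0.2, 0.8]
A[:, 0, 1] = [0.9, 0.1]; A[:, 1, 1] = [0.8, 0.2]
B[0, :, 0] = [0.3, 0.7]; B[0, :, 1] = [0.4, 0.6]
B[1, :, 0] = [0.7, 0.3]; B[1, :, 1] = [0.6, 0.4]
C[0, 0, :] = [0.4, 0.6]; C[0, 1, :] = [0.6, 0.4]
C[1, 0, :] = [0.5, 0.5]; C[1, 1, :] = [0.5, 0.5]

# Stationary distributions found above are all uniform.
tau = np.full((2, 2), 0.25)    # tau(i, k)
eta = np.full((2, 2), 0.25)    # eta(k, j)
theta = np.full((2, 2), 0.25)  # theta(j, i)

f = np.einsum('ik,ijk->ijk', tau, B)    # f(x_i,y_j,z_k) = tau(i,k) B_ijk
g = np.einsum('kj,ijk->ijk', eta, A)    # g(x_i,y_j,z_k) = eta(k,j) A_ijk
h = np.einsum('ji,ijk->ijk', theta, C)  # h(x_i,y_j,z_k) = theta(j,i) C_ijk

assert np.allclose(f.ravel(),
                   [0.075, 0.100, 0.175, 0.150, 0.175, 0.150, 0.075, 0.100])

# All three joints share the same X-, Y- and Z-marginals (Theorem 4.1.3(i)) ...
for p in (f, g, h):
    assert np.allclose(p.sum(axis=(1, 2)), [0.5, 0.5])  # X-marginal
    assert np.allclose(p.sum(axis=(0, 2)), [0.5, 0.5])  # Y-marginal
    assert np.allclose(p.sum(axis=(0, 1)), [0.5, 0.5])  # Z-marginal

# ... but the joints are not all equal, so by Theorem 4.1.3(ii)
# A, B and C are not compatible.
assert not np.allclose(f, g)
```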
