

Signal Detection Using a Multiple Hypothesis Test

3.2 Multiple Hypothesis Test

3.2.3 Control of the global level

There are often a number of interesting parameters to be estimated or a number of interesting hypotheses to be tested. In some cases the problems may be treated separately without any connection to each other. In most cases, however, the problems are connected and thus we face a multiple statistical inference problem where we need to consider the different problems simultaneously.

Multiple test algorithms have been designed and interpreted from several angles. Some tests are based on decision-theoretic concepts, while others are based on probabilities of making wrong decisions. In the following we use the common approach of protecting against errors of the first kind, i.e., of avoiding the rejection of any true hypotheses.

Let the hypotheses in a multiple test problem be denoted by Hi and the alternatives to those by Ai. A multiple test procedure is a rule assigning to each outcome a set of rejected hypotheses. Recall that for a single null hypothesis H1 against an alternative A1, the size of the test is defined as the supremum of the probability of the critical region C1 when the hypothesis H1 is true. This probability of error of the first kind is always kept at a small level α. If the null hypothesis H1 is true, the probability of accepting it is at least 1 − α. When we have made a "discovery" by rejecting the null hypothesis, we can therefore be quite confident that the null hypothesis is not true. Accepting a null hypothesis, by contrast, does not constitute a "discovery", because we have no comparable protection against errors of the second kind. In a multiple test there are many possible combinations of the null hypotheses H1, H2, · · · , HM. If the "discoveries" in the form of rejected null hypotheses are to be claimed safely, the probability of rejecting any true null hypothesis must be kept small.

3.2.3.1 The Bonferroni Procedure

Before discussing the Bonferroni procedure, we need to define the familywise error rate (FWER). In multiple comparison procedures (MCPs), the occurrence of one or more false rejections is called a familywise error (FWE). MCPs aim to control the probability of committing any type I error in families of comparisons under simultaneous consideration. The FWER is the probability of a familywise error:

FWER = P(FWE) = P(one or more Hi | H0) = P(max Hi | H0),   (3.22)

where P(·) is a probability function, H0 is the null hypothesis, and the Hi, i ≠ 0, are the alternative hypotheses.

Controlling the type I error is an important problem in multiple inference. The traditional concern in multiple hypothesis testing focuses on the probability of erroneously rejecting any of the true null hypotheses, i.e., the FWER. Given a significance level α, control of the FWER requires each of the M tests to be conducted at a lower level. In the Bonferroni procedure, each test is conducted at level α/M. Reference [17] has more discussion about this. Figure 3.3 shows the flow of the procedure.
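The procedure of Fig. 3.3 amounts to a single comparison per hypothesis. A minimal sketch (the p-values are hypothetical):

```python
def bonferroni(p_values, alpha=0.05):
    """Classic Bonferroni procedure: reject H_i whenever p_i <= alpha / M,
    which keeps the FWER at or below alpha by the Boole inequality."""
    M = len(p_values)
    return [p <= alpha / M for p in p_values]

# Hypothetical attained levels for M = 4 separate tests; threshold is 0.05/4 = 0.0125.
p = [0.001, 0.012, 0.020, 0.900]
print(bonferroni(p, alpha=0.05))  # → [True, True, False, False]
```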

3.2.3.2 A Simple Sequentially Rejective Test

Holm proposed a technique based on the Boole inequality, which was developed in multiple inference theory as the "Bonferroni technique" of Section 3.2.3.1. For this reason the method is called the "sequentially rejective Bonferroni test" [1].

When there are M hypotheses H1, H2, · · · , HM, each tested at the level α/M, the probability of rejecting any true hypothesis is smaller than or equal to α according to the Boole inequality. This constitutes the classical Bonferroni multiple test procedure with the multiple level of significance α.

In the procedure we need test statistics Y1, Y2, · · · , YM to test the separate hypotheses. Denote by y the outcome of these test statistics. The critical level α̂k(y) for the test statistic Yk is equal to the supremum of the probability P(Yk ≤ y) when the hypothesis Hk is true. We denote Rk = α̂k(Yk), so that the Bonferroni procedure amounts to comparing the levels R1, R2, · · · , RM with α/M (see Fig. 3.3).

The sequentially rejective Bonferroni test is a slightly modified version of the classic one. Let R(i) be the sorted representation of the Ri, i.e., the R(i) are the ordered set of the original Ri in the sense that R(1) ≤ R(2) ≤ · · · ≤ R(M). In the procedure, we compare the obtained levels with the numbers α/M, α/(M − 1), · · · , α/1 instead of comparing every level with α/M. As a consequence, the probability of rejecting any set of hypotheses using the classical Bonferroni test is smaller than or equal to that using the sequentially rejective Bonferroni test based on the same test statistics. The classical Bonferroni test has been applied mainly in cases where no other multiple test procedure is available. We can always replace the classic Bonferroni test by the corresponding sequentially rejective Bonferroni test without loss.
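The step-down rule can be sketched as follows: sort the attained levels, compare the i-th smallest with α/(M − i + 1), and stop at the first comparison that fails (hypothetical p-values):

```python
def holm(p_values, alpha=0.05):
    """Sequentially rejective Bonferroni (Holm) test.

    Compare the ordered levels R_(1) <= ... <= R_(M) with
    alpha/M, alpha/(M-1), ..., alpha/1; stop at the first failure
    and accept that hypothesis and all remaining ones.
    Returns a rejection decision per original hypothesis.
    """
    M = len(p_values)
    order = sorted(range(M), key=lambda i: p_values[i])
    reject = [False] * M
    for step, idx in enumerate(order):  # step = 0 compares with alpha/M
        if p_values[idx] <= alpha / (M - step):
            reject[idx] = True
        else:
            break
    return reject

p = [0.012, 0.500, 0.001, 0.030]
print(holm(p, alpha=0.05))  # → [True, False, True, False]
```

Note that every hypothesis rejected by the classic Bonferroni comparison with α/M is also rejected here, since α/M is the smallest of the compared numbers.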

Figure 3.3: Classic Bonferroni scheme.

Figure 3.4 shows the sequentially rejective Bonferroni procedure. Note that the judgement of accepting or rejecting the null hypotheses depends on the sorted R(i) corresponding to H(i). Once we reject or accept some H(k), with k between 1 and M, the decision actually applies to the original hypothesis corresponding to H(k).

The great advantage of the sequentially rejective Bonferroni test is its flexibility. There are no restrictions on the type of test, except that each separate test must attain its required level.

3.2.3.3 False Discovery Rate

Though the above-described classical multiple comparison procedures (MCPs) have been used since the early 1950s, they have not been widely adopted. Despite their use by many institutions, such as the FDA of the USA, and by some journals, these schemes overlook

Figure 3.4: The sequentially rejective multiple test procedure.

various kinds of multiplicity and result in exaggerated treatment differences. Classical MCPs aim to control the probability of committing any type I error in families of comparisons under simultaneous consideration. The control of the familywise error rate (FWER) is usually required in a strong sense, i.e., under all configurations of the true and false hypotheses tested.

Classical MCPs have some difficulties that constrain research. First of all, classical procedures controlling the FWER in the strong sense tend to have substantially less power than per-comparison procedures of the same levels. Secondly, control of the FWER is not required in every kind of application. It is essential when a decision combining the various individual inferences is erroneous as soon as at least one of them is. For example, when several new medical treatments are compared against a standard, a single treatment is selected from the set of treatments declared significantly better than the standard. However, a treatment group and a control group are often compared by testing various aspects of the effect, and the overall conclusion that the treatment is superior need not be erroneous even if some of the null hypotheses are falsely rejected. Finally, the FWER controlling MCPs concern comparisons of multiple treatments and families whose test statistics are multivariate normal. In practice, problems are often not of the multiple-treatments type and test statistics are not multivariate normal; indeed, the families often combine statistics of different types.

The last issue has been partially addressed by the advanced Bonferroni-type procedures mentioned previously, which adopt the observed individual p-values but still follow the concept of FWER control. The first two problems remain serious. Some approaches based on the per comparison error rate (PCER), which neglect the multiplicity problem altogether, are still recommended by others. A brief introduction to the p-value is provided in the appendix.

To keep the balance between type I error control and power, Benjamini and Hochberg [2] proposed a new point of view on the problem of multiplicity. The traditional concern in multiple hypothesis testing has been to control the probability of erroneously rejecting even one of the true null hypotheses, i.e., the FWER mentioned previously, and the power of detecting a specific hypothesis declines greatly as the number of hypotheses in the family increases. The new point of view is more powerful than FWER controlling procedures and can be used for many multiple testing problems, while the new procedures remain as flexible as the Bonferroni one. Here we consider the new procedures.

The Benjamini-Hochberg Procedure

Let H1, H2, · · · , HM be the hypotheses (or treatments in a medical test), and let P1, P2, · · · , PM be the p-values corresponding to them. Sort these p-values into an ordered set P(1) ≤ P(2) ≤ · · · ≤ P(M), where each P(i) corresponds to a specific hypothesis H(i). The Benjamini-Hochberg multiple testing procedure is as follows: let k be the largest i for which P(i) ≤ (i/M)γ, and reject H(1), H(2), · · · , H(k). This procedure controls the false discovery rate (FDR) at γ.
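A sketch of this step-up rule, assuming the standard Benjamini-Hochberg formulation (hypothetical p-values):

```python
def benjamini_hochberg(p_values, gamma=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at gamma.

    Let k be the largest i with P_(i) <= (i / M) * gamma;
    reject the hypotheses behind P_(1), ..., P_(k).
    """
    M = len(p_values)
    order = sorted(range(M), key=lambda i: p_values[i])
    k = 0
    for i, idx in enumerate(order, start=1):
        if p_values[idx] <= (i / M) * gamma:
            k = i  # largest index so far satisfying the condition
    reject = [False] * M
    for idx in order[:k]:
        reject[idx] = True
    return reject

p = [0.010, 0.013, 0.700, 0.035]
print(benjamini_hochberg(p, gamma=0.05))  # → [True, True, False, True]
```

Note that 0.035 is rejected here although it exceeds the Bonferroni threshold 0.0125: everything below the largest satisfied threshold is rejected, even entries that fail their own comparison.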

This rule was first mentioned by Simes [18] as an extension of his procedure for rejecting the intersection hypothesis that all null hypotheses are true whenever P(i) ≤ (i/M)α for some i. Though Simes showed that this extension controls the FWER under the intersection null hypothesis, Hommel [19] demonstrated its invalidity for individual hypotheses, i.e., the probability of erroneous rejections can be greater than α. Afterwards, Hochberg [20] offered another procedure to control the FWER in the above invalid circumstance:

k = max{i : P(i) ≤ α/(M + 1 − i)},   (3.24)

and H(1), · · · , H(k) are rejected. Starting from P(M): if P(M) is smaller than α, all hypotheses are rejected; otherwise, the procedure proceeds to P(M−1) and compares again, continuing until it finds a p-value that satisfies the condition; see Fig. 3.5. At the two ends, i = 1 and i = M, the compared values of Hochberg's procedure and the FDR controlling procedure coincide (α/M and α, respectively), so the two procedures agree there. Between the two ends, the FDR controlling procedure compares P(i) with (i/M)α instead of α/(M + 1 − i) as in (3.24). Because the FDR thresholds vary linearly between the two ends while those of Hochberg's procedure follow a hyperbolic curve lying below them, the FDR controlling procedure rejects at least as many hypotheses as Hochberg's procedure, and it also has greater power than other FWER controlling methods.
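The geometric comparison can be checked numerically: with the p-values in increasing order, the FDR thresholds (i/M)γ are never below Hochberg's α/(M + 1 − i), and the two sequences meet at both ends. An illustrative check with both procedures run at the same level:

```python
M, alpha = 10, 0.05
fdr_thr = [(i / M) * alpha for i in range(1, M + 1)]           # linear in i
hochberg_thr = [alpha / (M + 1 - i) for i in range(1, M + 1)]  # hyperbolic in i

# Equal at both ends (alpha/M and alpha) ...
assert abs(fdr_thr[0] - hochberg_thr[0]) < 1e-12
assert abs(fdr_thr[-1] - hochberg_thr[-1]) < 1e-12
# ... and the FDR thresholds dominate in between, so the FDR procedure
# rejects at least as many hypotheses on any fixed set of p-values.
assert all(f >= h for f, h in zip(fdr_thr, hochberg_thr))
```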

Figure 3.5: The Benjamini Hochberg procedure.

Chapter 4

Simulation

In this chapter, we demonstrate the advantages of the approach by simulation. From the algorithm in chapter 3, we know how to decide the number of signals. Take Fig. 3.5 for example: if the comparison at the current step does not meet the corresponding criterion, we move to the next step and test again; otherwise, we accept the corresponding set of hypotheses. From Fig. 4.1, it is clear that the sorted p-values climb from 0 to 1, while the compared criterion values follow a linear model. Under our setting, the sorted p-value first exceeds the compared value at the point corresponding to 3 sources; that is, when the compared value is smaller than the sorted value, we reject the corresponding hypotheses, and the detection result is thus 3.

The whole simulation procedure is demonstrated in Fig. 4.2. First of all, we generate the scenario needed to bring out the advantages of the approach: does it outperform under closely located sources, a small number of sensors, or weak sources? The source generation step decides what kind of scenario we test. Then we collect data and apply eigenanalysis to obtain the sample eigenvalues. These eigenvalues are substituted into the test statistics, according to the formulas discussed in chapter 2. After computing the test statistics, we use the asymptotic chi-square distribution to obtain the p-values. (The appendix explains the concept of the p-value and why it is used in statistics.) Finally, we adopt the Benjamini-Hochberg procedure to control the error rate.

Figure 4.1: From the algorithm, the rejected hypothesis set is the largest set meeting the criterion of the procedure; that is, the number of signals is decided from the smallest of the sorted p-values that is greater than the corresponding compared value. Curves: sorted p-value and critical value, over the number of sensors.
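The eigenvalue-to-p-value stage of this flow can be sketched as follows. The statistic below is the classical sphericity likelihood-ratio form (ratio of arithmetic to geometric mean of the M − k smallest eigenvalues, asymptotically chi-square); it stands in for the chapter 2 statistics, which are not reproduced here, so the function names and the exact statistic are illustrative assumptions:

```python
import math

def chi2_sf(x, df):
    """Chi-square survival function 1 - P(df/2, x/2), computed via the
    series expansion of the regularized lower incomplete gamma function."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    term = 1.0 / a
    total = term
    n = 0
    while True:
        n += 1
        term *= z / (a + n)
        total += term
        if term < total * 1e-15 or n > 10_000:
            break
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def sphericity_pvalues(eigenvalues, T):
    """For each candidate number of sources k, test whether the M - k smallest
    sample eigenvalues (all assumed positive) are equal, i.e., sphericity of
    the noise subspace, using T snapshots."""
    M = len(eigenvalues)
    lam = sorted(eigenvalues, reverse=True)
    pvals = []
    for k in range(M - 1):
        tail = lam[k:]                       # the M - k smallest eigenvalues
        m = len(tail)
        arith = sum(tail) / m
        geo = math.exp(sum(math.log(v) for v in tail) / m)
        stat = T * m * math.log(arith / geo)  # likelihood-ratio statistic
        df = m * (m + 1) // 2 - 1             # degrees of freedom
        pvals.append(chi2_sf(stat, df))
    return pvals
```

The resulting p-values are then fed to the Benjamini-Hochberg step to decide the number of sources; a dominant eigenvalue yields a tiny p-value for k = 0, while equal trailing eigenvalues yield p-values near one.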

Consider data collected from different scenarios. First, the data is collected from an array of 7 sensors with inter-element spacing of half a wavelength. Three sources are generated from different directions θ1, θ2 and θ3, respectively. We perform 100 Monte-Carlo runs and record the number of signals detected by the different approaches. Considering 7 sensors with spatial distance d, the directions of arrival (DOA) are 10°, 50° and 80°, and the number of snapshots is T = 50. The significance level α is 0.01. The SNR varies from −5 dB to 0 dB in 0.5 dB steps. This setting of arrival angles is a general case, in which all the sources are separated by a sufficient distance. If we use another set of DOAs, such as

Figure 4.2: Simulation flow diagram.

10° and 50°, we still get the same result as long as the sources are not located too close to each other. (Generally, closely located sources interfere with the clarity of the collected data. To avoid this interference, we separate the sources by a sufficient angle, such as 10° or wider, in this work.)

Fig. 4.3 shows that when the SNR is below −2 dB, the performance of the FDR controlling procedure is better than that of the sphericity test without control of the global level of the multiple test.

In this figure we add the conventional approach, MDL (∅ line), for comparison, to reinforce the advantage of the proposed approach. When the probability of correct detection approaches one, the slight differences between the curves can be attributed to the setting of the significance level and the random parameters of the simulation. The significance level is the largest type I error probability we can accept; once we set the level, the error probability is controlled below it. A smaller significance level comes with smaller power. Generally, it is suggested to set the level between 0.01 and 0.2, depending on the trade-off. Under the same SNR, the proposed procedure has a higher correct detection probability (around 5% to 15% higher) than the classical one.

Since one benefit of the sphericity test arises for closely located sources, we next collect data from sources at 20° and 30° with 10 sensors, with the other parameters unchanged. Figure

Figure 4.3: Three sources are generated from 10°, 50° and 80° directions of arrival, respectively. The array consists of 7 sensors with T = 50 snapshots; α = 0.01 and SNR = [−5:0.5:0] dB. Curves: FWE, MDL, FDR.

4.4 shows the advantage of using the multiple hypothesis control procedure with the sphericity test. The only difference between the two simulations in Fig. 4.4 is the number of sensors: the lower plot is simulated with 5 sensors and the upper with 10. Even with few sensors, the performance of the proposed procedure remains better. If one moves the source locations closer, such as 25° and 30°, the difference between the two curves shrinks; however, if we maintain the same angular separation, such as 40° and 50° (the difference remains 10°), the gap between the two curves does not change greatly with the absolute angles.

Figure 4.4: Two sources are closely located, at 20° and 30° directions of arrival, respectively. The upper simulation contains 10 sensors and the lower one contains only 5. SNR = [−6:0.5:3] dB; other parameters remain unchanged. Curves: FWE, FDR.

Figure 4.5: Under the same scenario as Fig. 4.4, the only difference is that the sources are at 40° and 50°. It can be observed that if the difference in DOA (direction of arrival) does not change, the result remains almost unchanged. Curves: FWE, FDR.

Chapter 5
