4.3 Results
4.3.1 Simulation
In this section, we report the simulation results and the performance of OGA+HDHQ and OGA+HDBIC on three types of models: multiple linear regression, autoregressive time-series, and threshold spiking neuron models. The MATLAB code used for this study is available to interested readers upon request.
Multiple linear regression model

Consider the regression model
$$
y_t = \sum_{j=1}^{q} \beta_j x_{tj} + \sum_{j=q+1}^{p} \beta_j x_{tj} + \epsilon_t, \qquad t = 1, \ldots, n, \tag{4.10}
$$
where β_{q+1} = · · · = β_p = 0, p ≫ n, the ε_t are i.i.d. N(0, σ²) and are independent of the x_{tj}. The regressors are designed as
$$
x_{tj} = d_{tj} + \eta w_t, \qquad j = 1, \ldots, p, \tag{4.11}
$$
where η ≥ 0 tunes the strength of the correlation between input variables, and (d_{t1}, . . . , d_{tp}, w_t)^⊤, 1 ≤ t ≤ n, are i.i.d. normal with zero mean and identity covariance matrix I. A simple calculation gives Corr(x_{tj}, x_{tk}) = η²/(1 + η²) for j ≠ k, which increases with η > 0. The input variables are mutually independent when η = 0.
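As a concrete illustration, the design (4.11) can be sketched as follows; the function name and the seed are illustrative choices, not part of the paper's code.

```python
import numpy as np

# Sketch of the regressor design (4.11): x_tj = d_tj + eta * w_t, where
# (d_t1, ..., d_tp, w_t) are i.i.d. standard normal.
def make_regressors(n, p, eta, rng):
    d = rng.standard_normal((n, p))   # independent components d_tj
    w = rng.standard_normal((n, 1))   # shared component w_t
    return d + eta * w                # x_tj = d_tj + eta * w_t

rng = np.random.default_rng(0)
x = make_regressors(100_000, 2, eta=2.0, rng=rng)
# The empirical correlation should be close to eta^2 / (1 + eta^2) = 0.8.
print(np.corrcoef(x[:, 0], x[:, 1])[0, 1])
```

With a large n the empirical correlation closely matches the theoretical value η²/(1 + η²).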
Consider (4.10) with q = 5, (β_1, . . . , β_5) = (3, −3.5, 4, −3.6, 3.2), and σ = 1. We choose K = D(n/log p)^{1/2} with D = 5 and c = 2.01 for HDHQ. We have also allowed D to vary between 3 and 10 and c to vary among 2.01, 2.51, 3.01, 3.51, and 4.01, but the results are quite similar across these choices. The results are shown in Table 4.1.
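For the settings in Table 4.1, the iteration bound K = D(n/log p)^{1/2} works out as follows; taking the integer part is our assumption, since the text does not specify how K is rounded.

```python
import math

# Iteration bound K = D * (n / log p)^{1/2} with D = 5. Taking the floor
# is an assumption; the text does not specify the rounding convention.
def num_iterations(n, p, D=5):
    return math.floor(D * math.sqrt(n / math.log(p)))

for n, p in [(50, 1000), (100, 2000), (200, 4000)]:   # settings in Table 4.1
    print(n, p, num_iterations(n, p))
```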
For sample sizes n ≥ 100, OGA includes all 5 relevant regressors within K iterations, and in about 99% of the 1000 simulations, HDHQ and HDBIC successfully identify the smallest correct model, irrespective of whether the candidate regressors x_j are uncorrelated (η = 0) or highly correlated (η = 2) with one another. OGA+HDBIC+Trim has the best performance, choosing the smallest correct model in all simulations. For comparison, we have also included the performance of OGA+BIC. Table 4.1 shows that for n ≥ 100, OGA+BIC always chooses the largest model Ĵ_K along the OGA path.
Define the mean squared prediction error as follows:
$$
\text{MSPE} = \frac{1}{1000} \sum_{l=1}^{1000} \Bigl( \sum_{j=1}^{p} \beta_j x_{n+1,j}^{(l)} - \hat{y}_{n+1}^{(l)} \Bigr)^2, \tag{4.12}
$$
where x^{(l)}_{n+1,1}, . . . , x^{(l)}_{n+1,p} are the regressors associated with y^{(l)}_{n+1}, the new outcome in the l-th simulation run, and ŷ^{(l)}_{n+1} denotes the predictor of y^{(l)}_{n+1}. The MSPEs of OGA+BIC are much larger than those of the other methods when n ≥ 100, due to serious overfitting.
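A scaled-down sketch of the prediction experiment behind (4.12): here we use an oracle least-squares fit on the five relevant regressors (the paper's methods select the variables first; the oracle fit, the run count, and the seed are illustrative assumptions).

```python
import numpy as np

# Generate data from model (4.10) with eta = 0 and sigma = 1, fit least
# squares on the five relevant regressors, and average the squared error
# between the true mean beta' x_{n+1} and the fitted prediction over runs.
rng = np.random.default_rng(1)
beta = np.array([3.0, -3.5, 4.0, -3.6, 3.2])
n, runs = 100, 200   # fewer runs than the paper's 1000, for speed

errors = []
for _ in range(runs):
    X = rng.standard_normal((n + 1, 5))        # independent regressors
    y = X[:n] @ beta + rng.standard_normal(n)  # responses with N(0, 1) noise
    bhat, *_ = np.linalg.lstsq(X[:n], y, rcond=None)
    true_mean = X[n] @ beta                    # noiseless target in (4.12)
    errors.append((true_mean - X[n] @ bhat) ** 2)

print(np.mean(errors))  # typically near q * sigma^2 / n = 0.05 in this oracle setting
```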
The traditional BIC can include all relevant variables, but it also retains the irrelevant variables selected along the OGA path. When p is large, many regression coefficients appear spuriously significant if one does not adjust for multiple testing [6]. The factor used in (4.4) of HDIC can be regarded as such an adjustment.
Autoregressive time-series model

Consider the autoregressive model
$$
y_t = 0.95\, y_{t-r_0} + \sum_{j=1}^{q} \beta_j x_{t-r_j,\,j} + \sum_{j=q+1}^{N-1} \sum_{r=1}^{m} \beta_{r,j} x_{t-r,\,j} + \epsilon_t, \qquad t = 1, \ldots, n, \tag{4.13}
$$
where β_{1,q+1} = · · · = β_{m,N−1} = 0, the ε_t are i.i.d. N(0, σ²) and are independent of the x_{tj}. Here, for simplicity, we assume that {y, x_j, j = 1, . . . , q} contribute to y only through the single lags {r_j, j = 0, . . . , q}, respectively. The regressors x_j are designed in the same way as in (4.11). After converting to multiple linear regression form, the total number of candidate variables is p := Nm ≫ n.
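The conversion to regression form can be sketched as follows; the helper name and the convention of placing the response series in column 0 are our illustrative choices.

```python
import numpy as np

# Sketch of converting the AR model (4.13) to multiple linear regression:
# each of the N candidate series (including y itself) contributes m lagged
# copies as regressors, giving p = N * m columns in the design matrix.
def build_lag_matrix(series, m):
    """series: (T, N) array of candidate series, response in column 0.
    Returns the (T - m, N * m) design matrix of lags 1..m and the aligned y."""
    T, N = series.shape
    cols = [series[m - r : T - r, :] for r in range(1, m + 1)]  # lag-r block
    X = np.hstack(cols)            # shape (T - m, N * m)
    y = series[m:, 0]              # y_t aligned with its lagged regressors
    return X, y

series = np.arange(12.0).reshape(6, 2)   # tiny example: T = 6, N = 2, m = 2
X, y = build_lag_matrix(series, m=2)
print(X.shape)  # (4, 4)
```

Each row of X pairs y_t with the lag-1 and lag-2 values of both series, which is the format OGA operates on.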
Table 4.1: Frequency, in 1000 simulations, of including all five relevant variables (Correct), of selecting exactly the relevant variables (E), of selecting all relevant variables and i irrelevant variables (Ei), and of selecting the largest model which includes all relevant variables (E∗).
η n p Method E E1 E2 E3∼5 E∗ Correct MSPE
0 50 1000 OGA+HDHQ 818 78 23 8 1 928 5.416
OGA+HDHQ+Trim 873 45 8 2 0 928 5.408
OGA+HDBIC 864 44 16 0 0 924 5.533
OGA+HDBIC+Trim 923 1 0 0 0 924 5.526
OGA+BIC 0 0 0 0 928 928 10.598
100 2000 OGA+HDHQ 991 9 0 0 0 1000 0.057
OGA+HDHQ+Trim 991 9 0 0 0 1000 0.057
OGA+HDBIC 999 1 0 0 0 1000 0.053
OGA+HDBIC+Trim 1000 0 0 0 0 1000 0.053
OGA+BIC 0 0 0 0 1000 1000 1.214
200 4000 OGA+HDHQ 1000 0 0 0 0 1000 0.023
OGA+HDHQ+Trim 1000 0 0 0 0 1000 0.023
OGA+HDBIC 1000 0 0 0 0 1000 0.023
OGA+HDBIC+Trim 1000 0 0 0 0 1000 0.023
OGA+BIC 0 0 0 0 1000 1000 0.942
2 50 1000 OGA+HDHQ 592 132 55 38 2 819 10.908
OGA+HDHQ+Trim 780 37 1 0 1 819 10.884
OGA+HDBIC 614 116 51 20 0 801 12.247
OGA+HDBIC+Trim 800 1 0 0 0 801 12.230
OGA+BIC 0 0 0 0 819 819 14.682
100 2000 OGA+HDHQ 989 11 0 0 0 1000 0.062
OGA+HDHQ+Trim 995 5 0 0 0 1000 0.062
OGA+HDBIC 994 6 0 0 0 1000 0.059
OGA+HDBIC+Trim 1000 0 0 0 0 1000 0.059
OGA+BIC 0 0 0 0 1000 1000 1.152
200 4000 OGA+HDHQ 1000 0 0 0 0 1000 0.027
OGA+HDHQ+Trim 1000 0 0 0 0 1000 0.027
OGA+HDBIC 1000 0 0 0 0 1000 0.027
OGA+HDBIC+Trim 1000 0 0 0 0 1000 0.027
OGA+BIC 0 0 0 0 1000 1000 1.022
Consider (4.13) with q = 5, (β_1, . . . , β_5) = (3, −3.5, 4, −3.6, 3.2), m = 3, (r_0, r_1, . . . , r_5) = (3, 1, 2, 3, 2, 1), and σ = 1. We choose K = D(n/log p)^{1/2} with D = 5 and c = 2.01 for HDHQ. Define the mean squared prediction error
$$
\text{MSPE} = \frac{1}{1000} \sum_{l=1}^{1000} \bigl( y_{n+1}^{(l)} - \hat{y}_{n+1}^{(l)} \bigr)^2, \tag{4.14}
$$
where y^{(l)}_{n+1} is the new outcome in the l-th simulation run, and ŷ^{(l)}_{n+1} denotes the predictor of y^{(l)}_{n+1}. This version of the MSPE is also used in the next subsection because in practice we only observe the data y and the true model coefficients are unknown. The simulation results are shown in Table 4.2. They are quite similar to those in Table 4.1, except that more samples are needed for the methods to perform well when η > 0. This is mainly due to the structural difference between the two models: the autoregressive model has temporal correlation among the lagged variables, whereas the linear regression model has i.i.d. samples.
Table 4.2: Simulation results for the autoregressive time-series model (see notation in Table 4.1).
η n p Method E E1 E2 E3∼5 E∗ Correct MSPE
0 50 750 OGA+HDHQ 793 94 22 9 1 919 5.491
OGA+HDHQ+Trim 859 44 11 4 1 919 5.490
OGA+HDBIC 847 54 13 0 0 914 6.047
OGA+HDBIC+Trim 912 2 0 0 0 914 6.046
OGA+BIC 0 0 0 0 919 919 10.964
100 1500 OGA+HDHQ 995 5 0 0 0 1000 1.017
OGA+HDHQ+Trim 995 5 0 0 0 1000 1.017
OGA+HDBIC 1000 0 0 0 0 1000 1.019
OGA+HDBIC+Trim 1000 0 0 0 0 1000 1.019
OGA+BIC 0 0 0 0 1000 1000 2.278
1 100 750 OGA+HDHQ 776 178 33 13 0 1000 1.142
OGA+HDHQ+Trim 983 16 1 0 0 1000 1.120
OGA+HDBIC 790 167 30 13 0 1000 1.139
OGA+HDBIC+Trim 999 1 0 0 0 1000 1.118
OGA+BIC 0 0 0 0 1000 1000 2.300
200 1500 OGA+HDHQ 991 9 0 0 0 1000 1.120
OGA+HDHQ+Trim 995 5 0 0 0 1000 1.120
OGA+HDBIC 996 4 0 0 0 1000 1.119
OGA+HDBIC+Trim 1000 0 0 0 0 1000 1.119
OGA+BIC 0 0 0 0 1000 1000 2.077
Threshold spiking neuron model
To illustrate the variable selection method in a neural spiking context, we simulate a large-scale sparse feedforward integrate-and-fire (I&F) neuron network (depicted in Figure 4.1). Briefly, excluding the target neuron, there are N − 1 neurons in total, but only q = 5 neurons provide inputs to the target. Three of the five have excitatory effects on the target and the other two have inhibitory effects. The target neuron is implemented as a direct discrete-time summation of the synaptic inputs {β_1, . . . , β_5} with propagation delays {r_1, . . . , r_5}, respectively. In addition, the background noise input is i.i.d. N(µ, σ) at every time step, with time resolution 1 ms. The neuron produces a spike when the internal potential reaches the threshold V_th = −55 mV, after which the potential is reset to V_reset = −80 mV. During the refractory period, the potential linearly recovers from V_reset to the resting potential V_rest = −70 mV. To model the diffusion of ions, the potential moves 2 mV toward V_rest per unit time, decreasing during depolarization and increasing during hyperpolarization, respectively. The internal potential is forced to lie in the range [E_{K+}, E_{Na+}] = [−90, 60] mV, the equilibrium potentials of K+ and Na+.
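A minimal sketch of the target neuron's update rule follows; the refractory recovery is omitted for brevity, and the update order (input, then drift, then threshold check) is our assumption, since the text leaves it open.

```python
import numpy as np

# Minimal sketch of the threshold (integrate-and-fire) target neuron at
# 1 ms resolution. Parameter values follow the text; refractory recovery
# is omitted, and the update order is an assumption.
V_RESET, V_TH, V_REST = -80.0, -55.0, -70.0   # mV
E_K, E_NA = -90.0, 60.0                       # allowed potential range (mV)
DECAY = 2.0                                   # mV drift toward V_rest per step

def simulate_target(syn_input, mu=0.0, sigma=0.0, rng=None):
    """syn_input[t]: total synaptic input (mV) at step t. Returns spike times."""
    rng = rng or np.random.default_rng()
    v, spikes = V_REST, []
    for t, s in enumerate(syn_input):
        noise = mu + sigma * rng.standard_normal() if sigma > 0 else mu
        v += s + noise
        # drift 2 mV toward the resting potential
        if v > V_REST:
            v = max(v - DECAY, V_REST)
        elif v < V_REST:
            v = min(v + DECAY, V_REST)
        v = min(max(v, E_K), E_NA)            # clip to [E_K+, E_Na+]
        if v >= V_TH:
            spikes.append(t)
            v = V_RESET                       # reset after a spike
    return spikes

# A single strong 20 mV input pulse crosses threshold (-70 + 20 - 2 = -52 >= -55).
print(simulate_target([20.0] + [0.0] * 5))  # [0]
```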
Figure 4.1: The simulated feedforward neuronal network. Red circles represent excitation and green squares represent inhibition. Source neurons 1–5 have synaptic inputs β_1, . . . , β_5 with propagation delays r_1, . . . , r_5, respectively. The target neuron N is implemented as a direct discrete-time summation of the synaptic inputs. The background noise input is i.i.d. N(µ, σ) at every time step, with time resolution 1 ms.
The remaining N − 1 − q neurons are independent of the target neuron, but all N − 1 neurons are modeled as Poisson point processes (X_{tj}) with the same firing rate:
$$
X_{tj} = D_{tj} + W_t, \qquad j = 1, \ldots, N-1, \tag{4.15}
$$
where (D_{t1}, . . . , D_{t,N−1}, W_t) are independent Poisson processes with firing rates (λ, . . . , λ, η) Hz. Thus {X_{tj}, j = 1, . . . , N − 1} have the same firing rate λ + η and are correlated with one another when η ≠ 0. The synaptic weights are fixed at (β_1, . . . , β_5) = (3, −3.5, 4, −3.6, 3.2) × 3 mV, and the associated propagation delays are fixed at (r_1, . . . , r_5) = (1, 2, 3, 2, 1) × 0.5 sec. Let n denote the data length after binning with bin size 0.5 sec, and let p = Nm denote the total number of candidate variables, where m is the chosen AR order (m = 3 here).
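The input design (4.15) together with the binning step can be sketched as follows; the function names are illustrative, and for a homogeneous Poisson process we generate the binned counts directly as Poisson draws.

```python
import numpy as np

# Sketch of the candidate-input design (4.15): each X_j is the superposition
# of an independent Poisson train D_j (rate lambda) and a shared train W
# (rate eta), binned with bin size 0.5 s. Function names are illustrative.
def poisson_counts(rate_hz, duration_s, bin_s, rng):
    """For a homogeneous Poisson process, binned counts are Poisson(rate*bin)."""
    n_bins = int(duration_s / bin_s)
    return rng.poisson(rate_hz * bin_s, size=n_bins)

def make_inputs(n_neurons, lam, eta, duration_s, bin_s, rng):
    shared = poisson_counts(eta, duration_s, bin_s, rng)               # W_t
    own = [poisson_counts(lam, duration_s, bin_s, rng) for _ in range(n_neurons)]
    return [d + shared for d in own]                                   # X_tj = D_tj + W_t

rng = np.random.default_rng(2)
xs = make_inputs(n_neurons=3, lam=30.0, eta=10.0, duration_s=500.0, bin_s=0.5, rng=rng)
# Each binned series has mean count (lam + eta) * bin = 20 per bin.
print(float(np.mean(xs[0])))
```

The shared train W makes the inputs correlated whenever η ≠ 0, mirroring the η = 0 versus η > 0 scenarios in the simulations below.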
We begin with Simulation 1, in which the parameter settings are λ = 40 Hz, η = 0 Hz, and µ = σ = 0; that is, the candidate input processes are uncorrelated with each other, have common firing rate 40 Hz, and there is no background noise input to the target neuron. The simulated firing trajectories and the simulation results are shown in Figure 4.2 and Table 4.3, respectively. In about 99% of the 1000 simulations, both HDHQ and HDBIC successfully identify the smallest correct model as the sample size n increases. For small sample sizes, however, HDBIC is too conservative and does not perform well, whereas HDHQ+Trim identifies the smallest correct set of input neurons in 948 simulations and includes 1–2 extra irrelevant neurons in only 16 simulations. The traditional BIC always chooses the largest set of neurons along the OGA path, causing serious overfitting (largest MSPE).
Figure 4.2: The firing behaviors of the source and target neurons in Simulation 1.
The associated parameter settings are: (β1, . . . , β5) = (3, −3.5, 4, −3.6, 3.2)× 3 mV, (r1, . . . , r5) = (1, 2, 3, 2, 1)× 0.5 sec., N = 250, n = 100, m = 3, p = N m = 750, λ = 40, η = µ = σ = 0.
Table 4.3: Results of Simulation 1 in threshold spiking neuron model (see notation in Table 4.1). The associated parameter settings are: λ = 40 Hz, η = 0 Hz, µ = 0, σ = 0.
n p Method E E1 E2 E3∼5 E∗ Correct MSPE
100 750 OGA+HDHQ 900 57 6 1 0 964 1.221
OGA+HDHQ+Trim 948 15 1 0 0 964 1.222
OGA+HDBIC 760 4 0 0 0 764 1.680
OGA+HDBIC+Trim 761 0 0 0 0 761 1.680
OGA+BIC 0 0 0 0 993 993 2.455
200 1500 OGA+HDHQ 995 5 0 0 0 1000 1.188
OGA+HDHQ+Trim 995 5 0 0 0 1000 1.188
OGA+HDBIC 999 1 0 0 0 1000 1.185
OGA+HDBIC+Trim 999 1 0 0 0 1000 1.185
OGA+BIC 0 0 0 0 1000 1000 2.041
400 2250 OGA+HDHQ 997 3 0 0 0 1000 1.215
OGA+HDHQ+Trim 997 3 0 0 0 1000 1.215
OGA+HDBIC 1000 0 0 0 0 1000 1.217
OGA+HDBIC+Trim 1000 0 0 0 0 1000 1.217
OGA+BIC 0 0 0 0 1000 1000 1.920
In Simulation 2, background noise is present with mean and standard deviation (µ, σ) = (0.15, 0.1). Results are shown in Table 4.4. HDBIC still converges to the smallest set of input neurons as n becomes large enough, but it performs poorly when n is small. HDHQ+Trim performs relatively well when n is small, selecting the correct set of neurons in 969 simulations. However, it tends to include 1–2 irrelevant neurons at all n, since the target neuron in this simulation is driven not only by the five input neurons but also by the unmeasurable background noise. Nevertheless, compared to BIC, including 1–2 irrelevant neurons out of a large candidate set is still acceptable and good enough for practical use.
Table 4.4: Results of Simulation 2 in threshold spiking neuron model (see notation in Table 4.1). The associated parameter settings are: λ = 40 Hz, η = 0 Hz, µ = 0.15, σ = 0.1.
n p Method E E1 E2 E3∼5 E∗ Correct MSPE
100 750 OGA+HDHQ 770 176 22 1 0 969 1.285
OGA+HDHQ+Trim 811 153 5 0 0 969 1.279
OGA+HDBIC 649 14 0 0 0 663 1.937
OGA+HDBIC+Trim 656 3 0 0 0 663 1.934
OGA+BIC 0 0 0 0 997 997 2.469
200 1500 OGA+HDHQ 728 263 9 0 0 1000 1.241
OGA+HDHQ+Trim 728 264 8 0 0 1000 1.241
OGA+HDBIC 997 3 0 0 0 1000 1.227
OGA+HDBIC+Trim 997 3 0 0 0 1000 1.227
OGA+BIC 0 0 0 0 1000 1000 2.128
400 2250 OGA+HDHQ 409 567 24 0 0 1000 1.331
OGA+HDHQ+Trim 409 567 18 0 0 1000 1.330
OGA+HDBIC 994 6 0 0 0 1000 1.299
OGA+HDBIC+Trim 994 6 0 0 0 1000 1.299
OGA+BIC 0 0 0 0 1000 1000 2.023
In Simulation 3, we further set (λ, η) = (30, 10) Hz, making the firing behaviors of the candidate input neurons similar; in other words, the candidate input variables are correlated with each other. The background noise with (µ, σ) = (0.15, 0.1) is retained.
Table 4.5 shows that, compared to Simulation 2, more sample data are needed to ensure good performance. Because the input variables behave similarly, more information (i.e., a larger sample size) is needed for the method to distinguish them using the associated behavior of the target neuron. In practice, if neurons from some brain area behave very similarly, longer recordings will be needed to correctly identify the relevant input neurons among them.
Finally, in Simulation 4, we consider two extra indirect inputs to the original network (Figure 4.3). Neurons 6 and 7 are simulated as independent Poisson processes with firing rate 40 Hz. The point processes of Neurons 1 and 2 are designed as time-shifted versions of Neurons 6 and 7, respectively, with i.i.d. N(0.5, 0.1) propagation delay on each spike
Table 4.5: Results of Simulation 3 in threshold spiking neuron model (see notation in Table 4.1). The associated parameter settings are: λ = 30 Hz, η = 10 Hz, µ = 0.15, σ = 0.1.
n p Method E E1 E2 E3∼5 E∗ Correct MSPE
200 750 OGA+HDHQ 768 225 4 0 0 997 1.347
OGA+HDHQ+Trim 769 226 2 0 0 997 1.347
OGA+HDBIC 918 3 0 0 0 921 1.359
OGA+HDBIC+Trim 915 2 0 0 0 917 1.358
OGA+BIC 0 0 0 0 1000 1000 2.188
400 1500 OGA+HDHQ 473 508 19 0 0 1000 1.376
OGA+HDHQ+Trim 473 512 15 0 0 1000 1.376
OGA+HDBIC 989 11 0 0 0 1000 1.363
OGA+HDBIC+Trim 989 11 0 0 0 1000 1.363
OGA+BIC 0 0 0 0 1000 1000 2.080
event [61]. The two indirect inputs are considered very strong and excitatory, so that Neurons 1 and 2 have spiking behavior very similar to that of Neurons 6 and 7. The simulated firing trajectories of Neurons 1 and 6 are shown in Figure 4.4, and the simulation results are shown in Table 4.6. Comparing Table 4.6 with Table 4.4, we find that, for a small sample size (n = 100), many more mistakes are made in Simulation 4, since the method does not have enough information to distinguish Neuron 1 from Neuron 6 and Neuron 2 from Neuron 7. However, increasing the sample size n rapidly reduces these mistakes, and the selected set converges nearly to the true set of input neurons.
Figure 4.3: Two extra indirect inputs added to the original network. Neurons 6 and 7 are simulated as independent Poisson processes with firing rate 40 Hz. The point processes of Neurons 1 and 2 are designed as time-shifted versions of Neurons 6 and 7, respectively, with i.i.d. N(0.5, 0.1) propagation delay on each spike event.
Figure 4.4: The firing behaviors of Neurons 1 and 6. The indirect input (Neuron 6) is considered very strong and excitatory, so that Neuron 1 has spiking behavior very similar to that of Neuron 6.
Table 4.6: Results of Simulation 4 in threshold spiking neuron model with two extra indirect inputs (see notation in Table 4.1). The associated parameter settings are: λ = 40 Hz, η = 0 Hz, µ = 0.15, σ = 0.1.
n p Method E E1 E2 E3∼5 E∗ Correct MSPE
100 750 OGA+HDHQ 661 144 11 1 0 817 1.427
OGA+HDHQ+Trim 684 130 3 0 0 817 1.434
OGA+HDBIC 555 10 0 0 0 565 1.955
OGA+HDBIC+Trim 558 0 0 0 0 558 1.980
OGA+BIC 0 0 0 0 875 854 2.616
200 1500 OGA+HDHQ 736 210 3 1 0 950 1.301
OGA+HDHQ+Trim 737 210 3 0 0 950 1.301
OGA+HDBIC 946 1 0 0 0 947 1.290
OGA+HDBIC+Trim 946 1 0 0 0 947 1.290
OGA+BIC 0 0 0 0 957 957 2.191
400 2250 OGA+HDHQ 451 527 20 0 0 998 1.233
OGA+HDHQ+Trim 452 531 15 0 0 998 1.234
OGA+HDBIC 993 4 0 0 0 997 1.241
OGA+HDBIC+Trim 993 4 0 0 0 997 1.241
OGA+BIC 0 0 0 0 998 998 1.950