Figure 4.8: Average activity changes of all recipient units in the ACC, InC, and MI when the rat opened the gate with a toy rat or nothing inside the box. Values on the Y-axis are normalized Z-scores calculated from ACC neurons that received projections from InC (A, n = 8), MI (B, n = 6), and ACC (C, n = 8) units; InC neurons that received projections from ACC (D, n = 7), MI (E, n = 2), and InC (F, n = 14) units; and MI neurons that received projections from InC (G, n = 4), ACC (H, n = 9), and MI (I, n = 12) units. The red line represents the 99% confidence interval. Bin size = 100 ms.

Evidence from human studies indicates that the ACC and InC are involved in empathy [23, 36, 46, 64, 70]. According to our previous findings, neurons in the ACC and InC showed activity that increased specifically prior to rescuing acts and was related to empathy-driven prosocial behaviors (Wu et al., 2016). However, the direction of neuronal information flow and the causal relationships remain unclear. Therefore, we applied the proposed method to address this issue. With this method, we identified causal relationships among the ACC, InC, and MI during the empathy-like behavior task. The results show that the activity of the projecting neurons in the InC connected with the MI increased significantly and specifically prior to the execution of prosocial rescuing behaviors. These findings provide important evidence that the empathy-related neurons in the InC may convey information to the MI to trigger prosocial behaviors.

Understanding the dynamics of neural circuitry during prosocial acts is crucial for elucidating the underlying empathic mechanisms. Our data demonstrate that the means by which observing another in a trap engages empathically motivated helping behaviors may relate to the increased activity of InC neurons that project to the MI. Given that the InC integrates endogenous and exogenous information about the self [17], our results imply a greater role of the InC in perceiving the emotions of others without confusion between self and others, which is an important characteristic of empathy. In light of our findings, the InC may serve the affective-motivational component, i.e., the perspective and evaluation of subjective discomfort and the response that triggers specific prosocial acts through neurons in the MI.

We now return to a mathematical aspect of the proposed method that deserves attention.

For greedy-type algorithms, the most crucial question is when to stop the iterative procedure. As Tables 4.3 to 4.6 in Section 4.3.1 show, HDHQ performs quite satisfactorily even when the sample size is small, and it includes only 1–2 irrelevant neurons when the behavior of the target neurons cannot be fully explained because of noise.

On the other hand, HDBIC performs well only when the sample size is large enough and is too conservative when the sample size is small. For practical use, we suggest HDHQ as the first stopping criterion to try. One can also resort to HDBIC if the available information is rich (e.g., the sample size n > 200). However, HDHQ is likely to play a more important role than HDBIC, since the sample size is often limited in practice.
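The comparison of the two stopping criteria can be sketched in a few lines. This is a minimal illustration, assuming the general high-dimensional information criterion of Ing and Lai [34], HDIC(k) = n log σ̂²_k + k w_n log p, with w_n = log n for HDBIC and w_n = c log log n for HDHQ. The function names, the constant c = 2.01, and the residual-variance path below are hypothetical and are not taken from the simulations in Section 4.3.1.

```python
import math

def hdic(sigma2, k, n, p, kind="HDHQ", c=2.01):
    """Information criterion after k greedy steps (assumed Ing-Lai form).

    sigma2: residual variance of the model with k selected neurons
    n: sample size, p: number of candidate neurons
    """
    if kind == "HDBIC":
        w_n = math.log(n)
    elif kind == "HDHQ":
        w_n = c * math.log(math.log(n))
    else:
        raise ValueError("kind must be 'HDBIC' or 'HDHQ'")
    return n * math.log(sigma2) + k * w_n * math.log(p)

def stopping_step(sigma2_path, n, p, kind="HDHQ", c=2.01):
    """Return the step k (1-based) that minimizes the criterion along the path."""
    scores = [hdic(s2, k, n, p, kind, c)
              for k, s2 in enumerate(sigma2_path, start=1)]
    return 1 + min(range(len(scores)), key=scores.__getitem__)

# Hypothetical residual variances after 1, 2, ..., 5 greedy steps.
sigma2_path = [1.0, 0.5, 0.2, 0.17, 0.169]
n, p = 100, 50

k_hq = stopping_step(sigma2_path, n, p, kind="HDHQ")
k_bic = stopping_step(sigma2_path, n, p, kind="HDBIC")
print("HDHQ stops at step", k_hq, "and HDBIC stops at step", k_bic)
```

On this made-up path the heavier HDBIC penalty stops one step earlier than HDHQ, which is the kind of conservativeness at small n described above.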

## Appendix A

## Derivation of the Explicit Formula

We first denote $x_k = x(n-k)$ and $y_k = y(n-k)$ for convenience. For the model in (2.1), we compute the matrices $A$ and $\Sigma$ by the Yule-Walker method [54]. Since $x$ and $y$ are stationary, multiplying (2.1) from the right by the row vector $\begin{pmatrix} x(n-1) & y(n-1) \end{pmatrix}$ and then taking the expectation $E$ gives $R(-1) = A R(0)$, where

$$
R(0) = \begin{pmatrix} E\,x_1^2 & E\,x_1 y_1 \\ E\,x_1 y_1 & E\,y_1^2 \end{pmatrix}
\quad\text{and}\quad
R(-1) = \begin{pmatrix} E\,x_1 x_2 & E\,x_1 y_2 \\ E\,x_2 y_1 & E\,y_1 y_2 \end{pmatrix}.
$$

Thus, we get $A = R(-1)R^{-1}(0)$. In addition, $\Sigma$ can be obtained from $\Sigma = R(0) - A R^{\top}(-1)$ [68]. Substituting $A$ into $\Sigma$ gives

$$
\Sigma = R(0) - R(-1)R^{-1}(0)R^{\top}(-1). \tag{A.1}
$$
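The Yule-Walker step can be checked numerically. The sketch below is illustrative only: it assumes that the model in (2.1) is a bivariate first-order autoregression $z(n) = A z(n-1) + \text{noise}$ with $z = (x, y)^{\top}$, and the coefficient matrix, noise covariance, sample size, and function names are hypothetical choices made for the example.

```python
import numpy as np

def simulate_var1(A, Sigma, n, rng):
    """Simulate z_t = A z_{t-1} + e_t with e_t ~ N(0, Sigma)."""
    chol = np.linalg.cholesky(Sigma)
    noise = rng.standard_normal((n, 2)) @ chol.T
    z = np.zeros((n, 2))
    for t in range(1, n):
        z[t] = A @ z[t - 1] + noise[t]
    return z

def yule_walker_var1(z):
    """Estimate A and Sigma from sample versions of R(0) and R(-1)."""
    z = z - z.mean(axis=0)
    z_now, z_prev = z[1:], z[:-1]
    m = len(z_now)
    R0 = z_prev.T @ z_prev / m      # sample E[z_{t-1} z_{t-1}^T]
    Rm1 = z_now.T @ z_prev / m      # sample E[z_t z_{t-1}^T]
    A_hat = Rm1 @ np.linalg.inv(R0)
    Sigma_hat = R0 - A_hat @ Rm1.T  # same form as (A.1)
    return A_hat, Sigma_hat

# Hypothetical parameters, chosen only so that the process is stationary.
A_true = np.array([[0.5, 0.2],
                   [0.0, 0.3]])
Sigma_true = np.array([[1.0, 0.3],
                       [0.3, 1.0]])

rng = np.random.default_rng(0)
z = simulate_var1(A_true, Sigma_true, 50_000, rng)
A_hat, Sigma_hat = yule_walker_var1(z)
print(np.round(A_hat, 2))
print(np.round(Sigma_hat, 2))
```

With a long simulated record, the estimates should be close to the matrices used in the simulation, up to sampling error.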
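Using the same computation, we have $\tilde{A} = \tilde{R}(-1)\tilde{R}^{-1}(0)$ and $\tilde{\Sigma} = \tilde{R}(0) - \tilde{A}\tilde{R}^{\top}(-1)$ for the perturbed model in (2.4), where

$$
\tilde{R}(0) = \begin{pmatrix} E\,(x_1^2 + \delta x_1^2 + 2 x_1 \delta x_1) & E\,(x_1 y_1 + y_1 \delta x_1) \\ E\,(x_1 y_1 + y_1 \delta x_1) & E\,y_1^2 \end{pmatrix}
$$

and

$$
\tilde{R}(-1) = \begin{pmatrix} E\,(x_1 + \delta x_1)(x_2 + \delta x_2) & E\,(x_1 y_2 + y_2 \delta x_1) \\ E\,(x_2 y_1 + y_1 \delta x_2) & E\,y_1 y_2 \end{pmatrix}.
$$

Substituting $\tilde{A}$ into $\tilde{\Sigma}$ also gives

$$
\tilde{\Sigma} = \tilde{R}(0) - \tilde{R}(-1)\tilde{R}^{-1}(0)\tilde{R}^{\top}(-1). \tag{A.2}
$$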
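Using (A.1) and (A.2), and denoting $\delta R(0) := \tilde{R}(0) - R(0)$ and $\delta R(-1) := \tilde{R}(-1) - R(-1)$, it follows that

$$
\begin{aligned}
\Delta &:= \tilde{\Sigma} - \Sigma \\
&= \delta R(0) - \delta R(-1)R^{-1}(0)R(1) - \tilde{R}(-1)R^{-1}(0)\,\delta R(1) \\
&\quad + \tilde{R}(-1)R^{-1}(0)\,\delta R(0)\,\tilde{R}^{-1}(0)\tilde{R}(1),
\end{aligned} \tag{A.3}
$$

where $R(1) = R^{\top}(-1)$, $\tilde{R}(1) = \tilde{R}^{\top}(-1)$, and $\delta R(1) = \delta R^{\top}(-1)$.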

By the definitions of $S$ and $\tilde{S}$ in (2.5), we know that $\tilde{S} - S = \Delta_{2,2}$, the (2,2)-element of the matrix $\Delta$. Hence, we can decompose $\tilde{S}$ into $S + \Delta_{2,2}$. A lengthy algebraic computation from (A.3) gives $\Delta_{2,2} = (S_y - S)I$, where $I$ is defined in (2.7), and the formula in (2.6) is obtained by denoting $\Theta = (S_y - S)I$.
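The expansion in (A.3) is a purely algebraic identity, so it can be sanity-checked on arbitrary matrices. The sketch below is an illustration under stated assumptions: the matrices are random stand-ins rather than covariances estimated from data, $R(1)$ is taken to be the transpose of $R(-1)$, and all names are hypothetical. It compares the right-hand side of (A.3) with the difference between the perturbed residual covariance from (A.2) and the unperturbed one from (A.1).

```python
import numpy as np

def residual_cov(R0, Rm1):
    """R(0) - R(-1) R^{-1}(0) R^T(-1), as in (A.1) and (A.2)."""
    return R0 - Rm1 @ np.linalg.inv(R0) @ Rm1.T

def delta_rhs(R0, Rm1, R0_t, Rm1_t):
    """Right-hand side of (A.3), with R(1) read as the transpose of R(-1)."""
    dR0 = R0_t - R0
    dRm1 = Rm1_t - Rm1
    R0_inv = np.linalg.inv(R0)
    R0_t_inv = np.linalg.inv(R0_t)
    return (dR0
            - dRm1 @ R0_inv @ Rm1.T
            - Rm1_t @ R0_inv @ dRm1.T
            + Rm1_t @ R0_inv @ dR0 @ R0_t_inv @ Rm1_t.T)

rng = np.random.default_rng(0)

# Random symmetric positive definite stand-in for R(0), arbitrary R(-1).
M = rng.standard_normal((2, 2))
R0 = M @ M.T + 2.0 * np.eye(2)
Rm1 = rng.standard_normal((2, 2))

# Small perturbations playing the roles of delta R(0) and delta R(-1).
P = 0.1 * rng.standard_normal((2, 2))
R0_t = R0 + P @ P.T
Rm1_t = Rm1 + 0.1 * rng.standard_normal((2, 2))

lhs = residual_cov(R0_t, Rm1_t) - residual_cov(R0, Rm1)
rhs = delta_rhs(R0, Rm1, R0_t, Rm1_t)
print(np.allclose(lhs, rhs))
```

The (2,2) entry of either side is the quantity that enters the decomposition of the perturbed residual variance.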

## Appendix B

## Derivation of the NSI using Simple Network

Here, we reformulate the NSI using the simple network (Figure 3.1). Let $u = \alpha x + \beta y + \gamma z$ form the BLP of $w$. Then there exist $p$, $\{f_r, r = 1, 2, \ldots, p\}$, and $\{d_r, r = 1, 2, \ldots, p\}$ such that $w_t = \sum_{r=1}^{p}[f_r u_{t-r} + d_r w_{t-r}] + \varepsilon_t$, where $\varepsilon$ is a stationary white noise possessing the smallest variance among $G = \mathrm{span}(\{x, y, z, v_1, v_2, v_3\})$. Replacing $u$ with the weighted trajectory, we obtain

$$
\begin{aligned}
w_t &= \sum_{r=1}^{p}\left[f_r u_{t-r} + d_r w_{t-r}\right] + \varepsilon_t \\
&= \sum_{r=1}^{p}\left[f_r(\alpha x_{t-r} + \beta y_{t-r} + \gamma z_{t-r}) + d_r w_{t-r}\right] + \varepsilon_t \\
&= \sum_{r=1}^{p}\left[\alpha f_r x_{t-r} + \beta f_r y_{t-r} + \gamma f_r z_{t-r} + d_r w_{t-r}\right] + \varepsilon_t.
\end{aligned} \tag{B.1}
$$

On the other hand, we fit the following empirical regression to the data:

$$
w_t = \sum_{r=1}^{p}\left[a_r x_{t-r} + b_r y_{t-r} + c_r z_{t-r} + g_r v_{t-r} + d_r w_{t-r}\right] + \tilde{\varepsilon}_t, \tag{B.2}
$$

where $g_r v_{t-r} := \sum_{k=1}^{3} g_{k,r} v_{k,t-r}$ for convenience.

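• If $v$ is stochastically independent of $x, y, z, w$, then we have $g_r \equiv 0$. Since $\{a_r\}$, $\{b_r\}$, $\{c_r\}$ can be obtained by the least-squares method, comparing (B.2) with (B.1) gives

$$
\sum_{r=1}^{p} a_r = \alpha \sum_{r=1}^{p} f_r, \qquad
\sum_{r=1}^{p} b_r = \beta \sum_{r=1}^{p} f_r, \qquad
\sum_{r=1}^{p} c_r = \gamma \sum_{r=1}^{p} f_r, \tag{B.3}
$$

and hence

$$
\alpha : \beta : \gamma = \sum_{r=1}^{p} a_r : \sum_{r=1}^{p} b_r : \sum_{r=1}^{p} c_r, \quad \text{provided } \sum_{r=1}^{p} f_r > 0, \tag{B.4}
$$

where $\mathrm{sgn}(\alpha) = \mathrm{sgn}\left(\sum_{r=1}^{p} a_r\right)$, $\mathrm{sgn}(\beta) = \mathrm{sgn}\left(\sum_{r=1}^{p} b_r\right)$, and $\mathrm{sgn}(\gamma) = \mathrm{sgn}\left(\sum_{r=1}^{p} c_r\right)$.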

• If $v$ is linearly dependent on $x, y, z, w$, then $g_r \not\equiv 0$ and $\{a_r\}$, $\{b_r\}$, $\{c_r\}$ will be affected. However, since the noise term in (B.1) possesses the smallest variance among $G$, taking out $v$ does not increase the variance of the noise term in (B.2); therefore, we can still correct the model coefficients by ruling out the useless information $v$.
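The first case above (with $g_r \equiv 0$) can be illustrated on simulated data. The sketch below is a minimal example under stated assumptions: white-noise inputs, lag order p = 2, and hypothetical values for the weights and for the coefficients $f_r$ and $d_r$. The regression (B.2) is fitted without the $v$ terms by ordinary least squares, and the sums of the fitted lag coefficients are normalized to recover the ratio in (B.4).

```python
import numpy as np

def lagged_design(series, p):
    """Columns are lags 1..p of each series; rows correspond to t = p..n-1."""
    n = len(series[0])
    cols = [s[p - r:n - r] for s in series for r in range(1, p + 1)]
    return np.column_stack(cols)

def weight_ratios(x, y, z, w, p):
    """Least-squares fit of (B.2) without v, then normalized coefficient sums."""
    X = lagged_design([x, y, z, w], p)
    coef, *_ = np.linalg.lstsq(X, w[p:], rcond=None)
    sums = coef.reshape(4, p).sum(axis=1)[:3]  # sums over r of a_r, b_r, c_r
    return sums / np.abs(sums).sum()

# Hypothetical ground truth used only for this simulation.
rng = np.random.default_rng(1)
n, p = 20_000, 2
alpha, beta, gamma = 2.0, -1.0, 1.0
f = np.array([0.5, 0.3])    # sum of f_r is positive, as (B.4) requires
d = np.array([0.2, -0.1])

x, y, z = rng.standard_normal((3, n))
u = alpha * x + beta * y + gamma * z
eps = rng.standard_normal(n)

w = np.zeros(n)
for t in range(p, n):
    w[t] = sum(f[r - 1] * u[t - r] + d[r - 1] * w[t - r]
               for r in range(1, p + 1)) + eps[t]

ratios = weight_ratios(x, y, z, w, p)
print(np.round(ratios, 2))
```

Because the true weights are 2 : -1 : 1, the normalized sums should be close to 0.5, -0.25, and 0.25, with the signs matching those of the weights.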

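Finally, the neuron synaptic indices from $x, y, z$ to $w$ are defined respectively as

$$
\begin{aligned}
N_{x \to w} &:= \frac{\alpha}{|\alpha| + |\beta| + |\gamma|}\, F_{u \to w}, \\
N_{y \to w} &:= \frac{\beta}{|\alpha| + |\beta| + |\gamma|}\, F_{u \to w}, \\
N_{z \to w} &:= \frac{\gamma}{|\alpha| + |\beta| + |\gamma|}\, F_{u \to w},
\end{aligned} \tag{B.5}
$$

where $|N_{x \to w}| + |N_{y \to w}| + |N_{z \to w}| = F_{u \to w}$ is the GC index from the weighted trajectory $u = \alpha x + \beta y + \gamma z$ to the target trajectory $w$.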
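As a worked example of (B.5), the sketch below splits a given GC index among three source neurons according to the summed regression coefficients. It is illustrative only: the function name and the coefficient values are hypothetical, the GC index is passed in as a number rather than computed, and the sum of the $f_r$ is assumed to be positive so that the coefficient sums carry the same ratio and signs as the weights, as in (B.4).

```python
import numpy as np

def neuron_synaptic_indices(a, b, c, F_uw):
    """Split the GC index F_uw among x, y, z following (B.4) and (B.5).

    a, b, c: fitted lag coefficients of x, y, z in the regression (B.2)
    F_uw: GC index from the weighted trajectory u to the target w
    """
    sums = np.array([np.sum(a), np.sum(b), np.sum(c)], dtype=float)
    total = np.abs(sums).sum()
    if total == 0.0:
        raise ValueError("all coefficient sums are zero; ratios are undefined")
    return sums / total * F_uw

# Hypothetical fitted coefficients (p = 2) and a hypothetical GC index.
a = [0.4, 0.2]
b = [-0.2, -0.1]
c = [0.1, 0.2]
N_x, N_y, N_z = neuron_synaptic_indices(a, b, c, F_uw=0.8)
print(round(N_x, 3), round(N_y, 3), round(N_z, 3))
```

The absolute values of the three indices add up to the GC index that was passed in, and a negative index marks a source whose weight is negative.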

## Bibliography

[1] Arnold, A., Liu, Y., and Abe, N. (2007). Temporal causal modeling with graphical Granger methods. Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, 7, 66–75.

[2] Astolfi, L., Cincotti, F., Mattia, D., Marciani, M. G., Baccala, L. A., De Vico Fallani, F., Salinari, S., Ursino, M., Zavaglia, M., Ding, L., Edgar, J. C., Miller, G. A., He, B., and Babiloni, F. (2007). Comparison of different cortical connectivity estimators for high-resolution EEG recordings. Human Brain Mapping, 28, 143–157.

[3] Baccala, L. A. and Sameshima, K. (2001). Partial directed coherence: a new concept in neural structure determination. Biological Cybernetics, 84, 463–474.

[4] Barnett, L. and Seth, A. K. (2014). The MVGC multivariate Granger causality toolbox: A new approach to Granger-causal inference. Journal of Neuroscience Methods, 223, 50–68.

[5] Barrett, A. B. and Barnett, L. (2013). Granger causality is designed to measure effect, not mechanism. Frontiers in Neuroinformatics, 7, 1–2.

[6] Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, 57, 289–300.

[7] Bressler, S. L., Richter, C. G., Chen, Y., and Ding, M. (2007). Cortical functional network organization from autoregressive modeling of local field potential oscillations. Statistics in Medicine, 26, 3875–3885.

[8] Bressler, S. L. and Seth, A. K. (2011). Wiener-Granger Causality: A well established methodology. NeuroImage, 58, 323–329.

[9] Brown, E. N., Kass, R. E., and Mitra, P. P. (2004). Multiple neural spike train data analysis: state-of-the-art and future challenges. Nature Neuroscience, 7, 456–461.

[10] Brunel, N. (2000). Dynamics of Sparsely Connected Networks of Excitatory and Inhibitory Spiking Neurons. Journal of Computational Neuroscience, 8, 183–208.

[11] Bush, P. and Sejnowski, T. (1996). Inhibition synchronizes sparsely connected cortical neurons within and between columns in realistic network models. Journal of Computational Neuroscience, 3, 91–110.

[12] Cadotte, A. J., DeMarse, T. B., He, P., and Ding, M. (2008). Causal measures of structure and plasticity in simulated and living neural networks. PLoS Computational Biology, 3, 1–14.

[13] Cadotte, A. J., DeMarse, T. B., Mareci, T. H., Parekh, M. B., Talathi, S. S., Hwang, D. U., Ditto, W. L., Ding, M., and Carney, P. R. (2010). Granger causality relationships between local field potentials in an animal model of temporal lobe epilepsy. Journal of Neuroscience Methods, 189, 121–129.

[14] Cao, Y., Maran, S. K., Dhamala, M., Jaeger, D., and Heck, D. H. (2012). Behavior-related pauses in simple-spike activity of mouse purkinje cells are linked to spike rate modulation. Journal of Neuroscience, 25, 8678–8685.

[15] Cheng, D., Bahadori, M. T., and Liu, Y. (2014). FBLG: A simple and effective approach for temporal dependence discovery from time series data. Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 14, 382–391.

[16] Churchland, P. S. and Sejnowski, T. J. (1992). The computational brain. The MIT press.

[17] Craig, A. D. (2009). How do you feel–now? The anterior insula and human awareness. Nat Rev Neurosci, 10(1), 59–70.

[18] Dhamala, M., Rangarajan, G., and Ding, M. (2008). Analyzing information flow in brain networks with nonparametric Granger causality. NeuroImage, 41, 354–362.

[19] Deborah, S., Won, D. Y., and Patrick, D. W. (2003). Effects of spike sorting error on information content in multi-neuron recordings. Proceedings of the 1st International IEEE EMBS Conference on Neural Engineering, 3, 618–621.

[20] Ding, M., Chen, Y., and Bressler, S. L. (2006). Granger Causality: Basic Theory and Application to Neuroscience. Handbook of Time Series Analysis: Recent Theoretical Developments and Applications. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim, Germany. 437–460.

[21] Fallani, F. De Vico, Corazzol, M., Sternberg, J., Wyart, C., and Chavez, M. (2015). Hierarchy of Neural Organization in the Embryonic Spinal Cord: Granger-Causality Graph Analysis of Calcium Imaging Data. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 23, 333–341.

[22] Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360.

[23] Fan, Y., Duncan, N. W., de Greck, M., and Northoff, G. (2011). Is there a core neural network in empathy? An fMRI based quantitative meta-analysis. Neurosci Biobehav Rev, 35(3), 903–911.

[24] Ge, T., Kendrick, K. M., and Feng, J. (2009). A Novel Extended Granger Causal Model Approach Demonstrates Brain Hemispheric Differences during Face Recognition Learning. PLoS Computational Biology, 5, 11.

[25] Gomez, L., Budelli, R., Saa, R., Stiber, M., and Segundo, J. P. (2005). Pooled spike trains of correlated presynaptic inputs as realizations of cluster point processes. Biological Cybernetics, 92, 110–127.

[26] Gow, D. W., Segawa, J. A., Ahlfors, S. P., and Lin, F. H. (2008). Lexical influences on speech perception: a Granger causality analysis of MEG and EEG source estimates. Neuroimage, 43, 614–623.

[27] Granger, C. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37, 424–438.

[28] Granger, C. (1980). Testing for causality: A personal viewpoint. Journal of Economic Dynamics and Control, 2, 329–352.

[29] Greene, W. H. (2002). Econometric Analysis, fifth ed. Prentice-Hall, Upper Saddle River, NJ.

[30] Guo, S., Seth, A. K., Kendrick, K. M., Zhou, C., and Feng, J. (2008). Partial Granger causality – Eliminating exogenous inputs and latent variables. Journal of Neuroscience Methods, 172, 79–93.

[31] Guo, S., Ladroue, C., and Feng, J. (2010). Granger Causality: Theory and Applications. Frontiers in Computational and Systems Biology. Springer, London, 83–111.

[32] Hu, S., Dai, G., Worrel, G. A., Dai, Q., and Liang, H. (2011). Causality analysis of neural connectivity: critical examination of existing methods and advances of new methods. IEEE Transactions on Neural Networks, 22, 829–844.

[33] Huang, J. J., Yen, C. T., Liu, T. L., Tsao, H. W., Hsu, J. W., and Tsai, M. L. (2013). Effects of dopamine D2 agonist quinpirole on neuronal activity of anterior cingulate cortex and striatum in rats. Psychopharmacology, 227, 459–466.

[34] Ing, C. K. and Lai, T. L. (2011). A stepwise regression method and consistent model selection for high-dimensional sparse linear models. Statistica Sinica, 21, 1473–1513.

[35] Izhikevich, E. M. (2003). Simple models of spiking neurons. IEEE Transactions on Neural Networks, 14, 1569–1572.

[36] Jackson, P. L., Rainville, P., and Decety, J. (2006). To what extent do we share the pain of others? Insight from the neural bases of pain empathy. Pain, 125(1-2), 5–9.

[37] Kaminski, M., Ding, M., Truccolo, W. A., and Bressler, S. L. (2001). Evaluating causal relations in neural systems: Granger causality, directed transfer function and statistical assessment of significance. Biological Cybernetics, 85, 145–157.

[38] Kim, S., Putrino, D., Ghosh, S., and Brown, E. N. (2011). A Granger causality measure for point process models of ensemble neural spiking activity. PLoS Computational Biology, 7, 3.

[39] Kitagawa, G. (2010). Introduction to Time Series Modeling. Chapman & Hall/CRC Monographs on Statistics & Applied Probability.

[40] Krueger, F., Landgraf, S., van der Meer, E., Deshpande, G., and Hu, X. (2011). Effective connectivity of the multiplication network: a functional MRI and multivariate Granger Causality Mapping study. Human Brain Mapping, 32, 1419–1431.

[41] Krumin, M. and Shoham, S. (2010). Multivariate autoregressive modeling and Granger causality analysis of multiple spike trains. Computational Intelligence and Neuroscience, 752428.

[42] Lehky, S. R. (2010). Decoding poisson spike trains by gaussian filtering. Neural Computation, 22, 1245–1271.

[43] Lewicki, M. S. (1998). A review of methods for spike sorting: the detection and classification of neural action potentials. Network: Computation in Neural Systems, 9, R53–R78.

[44] Luo, Q., Lu, W., Cheng, W., Valdes-Sosa, P. A., Wen, X., Ding, M., and Feng, J. (2013). Spatio-temporal Granger causality: A new framework. NeuroImage, 79, 241–263.

[45] Luo, Q., Ge, T., Grabenhorst, F., Feng, J., and Rolls, E. T. (2013). Attention-Dependent Modulation of Cortical Taste Circuits Revealed by Granger Causality with Signal-Dependent Noise. PLoS Computational Biology, 9, 10.

[46] Morrison, I., Lloyd, D., di Pellegrino, G., and Roberts, N. (2004). Vicarious responses to pain in anterior cingulate cortex: is empathy a multisensory issue? Cogn Affect Behav Neurosci, 4(2), 270–278.

[47] Lutkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Springer.

[48] Marinazzo, D., Liao, W., Chen, H., and Stramaglia, S. (2011). Nonlinear connectivity by Granger causality. NeuroImage, 58, 330–338.

[49] Michailidis, G. and d’Alche-Buc, F. (2013). Autoregressive models for gene regulatory network inference: Sparsity, stability and causality issues. Mathematical Biosciences, 246, 326–334.

[50] Nageswaran, J. M., Dutt, N., Krichmar, J. L., Nicolau, A., and Veidenbaum, A. V. (2009). A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors. Neural Networks, 22, 791–800.

[51] Nedungadi, A. G., Rangarajan, G., Jain, N., and Ding, M. (2009). Analyzing multiple spike trains with nonparametric granger causality. Journal of Computational Neuroscience, 27, 55–64.

[52] Ostojic, S. (2014). Two types of asynchronous activity in networks of excitatory and inhibitory spiking neurons. Nature Neuroscience, 17, 594–600.

[53] Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., and Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454, 995–999.

[54] Priestley, M. B. (1994). Spectral Analysis and Time Series. London: Academic Press.

[55] Quinn, C. J., Coleman, T. P., Kiyavash, N., and Hatsopoulos, N. G. (2011). Estimating the directed information to infer causal relationships in ensemble neural spike train recordings. Journal of Computational Neuroscience, 30, 17–44.

[56] Rosenbaum, R., Trousdale, J., and Josic, K. (2011). The effects of pooling on spike train correlations. Frontiers in Neuroscience, 5, 1–10.

[57] Sameshima, K. and Baccala, L. A. (1999). Using partial directed coherence to describe neuronal ensemble interactions. Journal of Neuroscience Methods, 94, 93–103.

[58] Seth, A. K. (2005). Causal connectivity of evolved neural networks during behavior. Network, 16, 35–54.

[59] Seth, A. K. and Edelman, G. M. (2007). Distinguishing causal interactions in neural populations. Neural Computation, 19, 910–933.

[60] Seth, A. K. (2010). A MATLAB toolbox for Granger causal connectivity analysis. Journal of Neuroscience Methods, 186, 262–273.

[61] Shao, P. C., Tseng, W. T., Kuo, C. C., Shann, W. C., Tsai, M. L., and Yen, C. C. (2013). Effects of spike sorting error on the Granger causality index. Neural Networks, 46, 249–259.

[62] Shao, P. C., Huang, J. J., Shann, W. C., Yen, C. T., Tsai, M. L., and Yen, C. C. (2015). Granger causality-based synaptic weights estimation for analyzing neuronal networks. Journal of Computational Neuroscience, 38, 483–497.

[63] Shimazaki, H. and Shinomoto, S. (2007). A method for selecting the bin size of a time histogram. Neural Computation, 19, 1503–1527.

[64] Singer, T., Seymour, B., O’Doherty, J., Kaube, H., Dolan, R. J., and Frith, C. D. (2004). Empathy for pain involves the affective but not sensory components of pain. Science, 303(5661), 1157–1162.

[65] Takeda, M. and Matsumoto, S. (1997). Discharge patterns of dorsal and ventral respiratory group neurons during spontaneous augmented breaths observed in pentobarbital anesthetized rats. Brain Res., 749(1), 95–100.

[66] Tripp, B. P. and Orchard, J. (2012). Population coding in sparsely connected networks of noisy neurons. Frontiers in Computational Neuroscience, 6, 23.

[67] Tseng, W. T., Tsai, M. L., Iwata, K., and Yen, C. T. (2012). Long-Term Changes in Trigeminal Ganglionic and Thalamic Neuronal Activities following Inferior Alveolar Nerve Transection in Behaving Rats. Journal of Neuroscience, 32, 16051–16063.

[68] Wei, William W. S. (2006). Time Series Analysis: Univariate and Multivariate Methods, second edition. California: Addison-Wesley. 391–392.