Chapter 4: Discussion
7. Future directions
Based on previous findings and our current results, we propose two studies as future directions.
1. Because our results support the idea of hierarchical reinforcement learning in the cortico-striatal circuits, it would be of great interest to use electrophysiological recording in the DMS and NA (including the NA core and NA shell) to determine whether neural activity changes in relation to specific actions or specific steps of the animals’ choice process in the 2-choice dynamic foraging task.
2. As described previously, the NA core appears to promote a flexible approach toward reward-related locations, whereas the NA shell has been implicated in the suppression of non-rewarded actions and in learning to ignore irrelevant stimuli. Specific optogenetic modulation of striatonigral and striatopallidal MSNs in the NA core and NA shell is therefore worth exploring for its effect on decision making. For example, striatonigral or striatopallidal MSNs in the NA core could be activated as the animal approaches the reward, to test whether the manipulation disrupts the animal’s value representation of the reward; likewise, striatonigral or striatopallidal MSNs in the NA shell could be activated when the animal is about to make a perseverative error, to test whether the manipulation disrupts the animal’s suppression of non-rewarded actions and its learning to ignore irrelevant stimuli.
References
Albin, R. L., Young, A. B., & Penney, J. B. (1989). The functional anatomy of basal ganglia disorders. Trends in Neurosciences, 12(10), 366–375. doi:
10.1016/0166-2236(89)90074-X
Alexander, G. E., & Crutcher, M. D. (1990). Functional architecture of basal ganglia circuits: neural substrates of parallel processing. Trends in Neurosciences, 13(7), 266–271. doi:10.1016/0166-2236(90)90107-L
Alexander, G. E., DeLong, M. R., & Strick, P. L. (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience, 9, 357–381. doi:10.1146/annurev.ne.09.030186.002041
Ambroggi, F., Ghazizadeh, A., Nicola, S. M., & Fields, H. L. (2011). Roles of nucleus accumbens core and shell in incentive-cue responding and behavioral inhibition.
The Journal of Neuroscience, 31(18), 6820–6830. doi:
10.1523/JNEUROSCI.6491-10.2011
Ambroggi, F., Ishikawa, A., Fields, H. L., & Nicola, S. M. (2008). Basolateral amygdala neurons facilitate reward-seeking behavior by exciting nucleus
accumbens neurons. Neuron, 59(4), 648–661. doi:10.1016/j.neuron.2008.07.004
Annett, L. E., McGregor, A., & Robbins, T. W. (1989). The effects of ibotenic acid lesions of the nucleus accumbens on spatial learning and extinction in the rat.
Behavioural Brain Research, 31(3), 231–242. doi:
10.1016/0166-4328(89)90005-3
Asaad, W. F., & Eskandar, E. N. (2011). Encoding of both positive and negative reward prediction errors by neurons of the primate lateral prefrontal cortex and caudate nucleus. The Journal of Neuroscience, 31(49), 17772–17787. doi:
10.1523/JNEUROSCI.3793-11.2011
Balleine, B., & Killcross, S. (1994). Effects of ibotenic acid lesions of the nucleus accumbens on instrumental action. Behavioural Brain Research, 65(2), 181–193.
doi:10.1016/0166-4328(94)90104-X
Balleine, B. W., & O’Doherty, J. P. (2010). Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action.
Neuropsychopharmacology, 35(1), 48–69. doi:10.1038/npp.2009.131
Barraclough, D. J., Conroy, M. L., & Lee, D. (2004). Prefrontal cortex and decision making in a mixed-strategy game. Nature Neuroscience, 7(4), 404–410.
doi:10.1038/nn1209
Baum, W. M. (1974). On two types of deviation from the matching law: bias and undermatching. Journal of the Experimental Analysis of Behavior, 22(1), 231–242. doi:10.1901/jeab.1974.22-231
Bayer, H. M., & Glimcher, P. W. (2005). Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron, 47(1), 129–141.
doi:10.1016/j.neuron.2005.05.020
Belin, D., Jonkman, S., Dickinson, A., Robbins, T. W., & Everitt, B. J. (2009).
Parallel and interactive learning processes within the basal ganglia: relevance for
the understanding of addiction. Behavioural Brain Research, 199(1), 89–102.
doi:10.1016/j.bbr.2008.09.027
Belova, M. A., Paton, J. J., & Salzman, C. D. (2008). Moment-to-moment tracking of state value in the amygdala. The Journal of Neuroscience, 28(40), 10023–10030.
doi:10.1523/JNEUROSCI.1400-08.2008
Bentivoglio, M., & Morelli, M. (2005). The organization and circuits of mesencephalic dopaminergic neurons and the distribution of dopamine receptors in the brain. In S. B. Dunnett, M. Bentivoglio, A. Björklund, & T. Hökfelt (Eds.), Dopamine (Vol. 21, pp. 1–107). Elsevier. doi:10.1016/S0924-8196(05)80005-3
Björklund, A., & Dunnett, S. B. (2007). Dopamine neuron systems in the brain: an update. Trends in Neurosciences, 30(5), 194–202. doi:10.1016/j.tins.2007.03.006
Blaiss, C. A., & Janak, P. H. (2009). The nucleus accumbens core and shell are critical for the expression, but not the consolidation, of Pavlovian conditioned approach.
Behavioural Brain Research, 200(1), 22–32. doi:10.1016/j.bbr.2008.12.024
Braun, S., & Hauber, W. (2011). The dorsomedial striatum mediates flexible choice behavior in spatial tasks. Behavioural Brain Research, 220(2), 288–293.
doi:10.1016/j.bbr.2011.02.008
Burk, J. A., & Mair, R. G. (2001). Effects of dorsal and ventral striatal lesions on delayed matching trained with retractable levers. Behavioural Brain Research, 122(1), 67–78. doi:10.1016/S0166-4328(01)00169-3
Bush, R. R., & Mosteller, F. (1955). Stochastic models for learning. John Wiley &
Sons, Inc.
Cai, X., Kim, S., & Lee, D. (2011). Heterogeneous coding of temporally discounted values in the dorsal and ventral striatum during intertemporal choice. Neuron, 69(1), 170–182. doi:10.1016/j.neuron.2010.11.041
Chang, J.-Y., Chen, L., Luo, F., Shi, L.-H., & Woodward, D. J. (2002). Neuronal responses in the frontal cortico-basal ganglia system during delayed
matching-to-sample task: ensemble recording in freely moving rats.
Experimental Brain Research, 142(1), 67–80. doi:10.1007/s00221-001-0918-3
Dalton, G. L., Phillips, A. G., & Floresco, S. B. (2014). Preferential involvement by nucleus accumbens shell in mediating probabilistic learning and reversal shifts.
The Journal of Neuroscience, 34(13), 4618–4626. doi:
10.1523/JNEUROSCI.5058-13.2014
Day, J. J., & Carelli, R. M. (2007). The nucleus accumbens and Pavlovian reward learning. The Neuroscientist, 13(2), 148–159. doi:10.1177/1073858406295854
DeLong, M. R. (1990). Primate models of movement disorders of basal ganglia origin.
Trends in Neurosciences, 13(7), 281–285. doi:10.1016/0166-2236(90)90110-V
Ding, L., & Gold, J. I. (2012). Neural correlates of perceptual decision making before, during, and after decision commitment in monkey frontal eye field. Cerebral Cortex, 22(5), 1052–1067. doi:10.1093/cercor/bhr178
Dorris, M. C., & Glimcher, P. W. (2004). Activity in posterior parietal cortex is correlated with the relative subjective desirability of action. Neuron, 44(2), 365–378. doi:10.1016/j.neuron.2004.09.009
Doya, K. (2008). Modulators of decision making. Nature Neuroscience, 11(4), 410–416. doi:10.1038/nn2077
Fallon, J. H., & Moore, R. Y. (1978). Catecholamine innervation of the basal
forebrain. IV. Topography of the dopamine projection to the basal forebrain and neostriatum. The Journal of Comparative Neurology, 180(3), 545–580.
doi:10.1002/cne.901800310
Fellows, L. K., & Farah, M. J. (2003). Ventromedial frontal cortex mediates affective shifting in humans: evidence from a reversal learning paradigm. Brain, 126(8), 1830–1837. doi:10.1093/brain/awg180
Floresco, S. B., Ghods-Sharifi, S., Vexelman, C., & Magyar, O. (2006). Dissociable roles for the nucleus accumbens core and shell in regulating set shifting. The Journal of Neuroscience, 26(9), 2449–2457. doi:
10.1523/JNEUROSCI.4431-05.2006
Floresco, S. B., McLaughlin, R. J., & Haluk, D. M. (2008). Opposing roles for the nucleus accumbens core and shell in cue-induced reinstatement of food-seeking behavior. Neuroscience, 154(3), 877–884. doi:
10.1016/j.neuroscience.2008.04.004
Fridberg, D. J., Queller, S., Ahn, W.-Y., Kim, W., Bishara, A. J., Busemeyer, J. R., … Stout, J. C. (2010). Cognitive mechanisms underlying risky decision-making in chronic cannabis users. Journal of Mathematical Psychology, 54(1), 28–38.
doi:10.1016/j.jmp.2009.10.002
Gerfen, C. R. (1992). The neostriatal mosaic: multiple levels of compartmental organization in the basal ganglia. Annual Review of Neuroscience, 15, 285–320.
doi:10.1146/annurev.ne.15.030192.001441
Gerfen, C. R., & Young, W. S. (1988). Distribution of striatonigral and
striatopallidal peptidergic neurons in both patch and matrix compartments: an in situ hybridization histochemistry and fluorescent retrograde tracing study. Brain Research, 460(1), 161–167. doi:10.1016/0006-8993(88)91217-6
Giménez-Amaya, J. M., & Graybiel, A. M. (1990). Compartmental origins of the striatopallidal projection in the primate. Neuroscience, 34(1), 111–126.
doi:10.1016/0306-4522(90)90306-O
Gremel, C. M., & Costa, R. M. (2013). Orbitofrontal and striatal circuits dynamically encode the shift between goal-directed and habitual actions. Nature
Communications, 4. doi:10.1038/ncomms3264
Groenewegen, H. J., Berendse, H. W., Wolters, J. G., & Lohman, A. H. M. (1991).
The Prefrontal Cortex: Its Structure, Function and Pathology. Progress in Brain Research (Vol. 85, pp. 95–118). Elsevier. doi:10.1016/S0079-6123(08)62677-1
Haber, S. N., Fudge, J. L., & McFarland, N. R. (2000). Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum.
The Journal of Neuroscience, 20(6), 2369–2382.
Haluk, D. M., & Floresco, S. B. (2009). Ventral striatal dopamine modulation of different forms of behavioral flexibility. Neuropsychopharmacology, 34(8), 2041–2052. doi:10.1038/npp.2009.21
Hare, T. A., Camerer, C. F., & Rangel, A. (2009). Self-control in decision-making involves modulation of the vmPFC valuation system. Science, 324(5927), 646–648. doi:10.1126/science.1168450
Haruno, M., & Kawato, M. (2006). Heterarchical reinforcement-learning model for integration of multiple cortico-striatal loops: fMRI examination in
stimulus-action-reward association learning. Neural Networks, 19(8), 1242–1254.
doi:10.1016/j.neunet.2006.06.007
Hikosaka, O., Nakahara, H., Rand, M. K., Sakai, K., Lu, X., Nakamura, K., … Doya, K. (1999). Parallel neural networks for learning sequential procedures. Trends in Neurosciences, 22(10), 464–471. doi:10.1016/S0166-2236(99)01439-3
Hollerman, J. R., & Schultz, W. (1998). Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neuroscience, 1(4), 304–309. doi:10.1038/1124
Hong, S., & Hikosaka, O. (2008). The globus pallidus sends reward-related signals to the lateral habenula. Neuron, 60(4), 720–729. doi:10.1016/j.neuron.2008.09.035
Horwitz, G. D., & Newsome, W. T. (2001). Target selection for saccadic eye movements: prelude activity in the superior colliculus during a
direction-discrimination task. Journal of Neurophysiology, 86(5), 2543–2558.
Ito, M., & Doya, K. (2011). Multiple representations and algorithms for reinforcement learning in the cortico-basal ganglia circuit. Current Opinion in Neurobiology, 21(3), 368–373. doi:10.1016/j.conb.2011.04.001
Kaelbling, L. P., Littman, M. L., & Moore, A. W. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 237–285.
Kass, R. E., & Raftery, A. E. (1995). Bayes Factors. Journal of the American Statistical Association, 90(430), 773–795. doi:
10.1080/01621459.1995.10476572
Kawaguchi, Y., Wilson, C. J., Augood, S. J., & Emson, P. C. (1995). Striatal interneurones: chemical, physiological and morphological characterization.
Trends in Neurosciences, 18(12), 527–535. doi:10.1016/0166-2236(95)98374-8
Kennerley, S. W., Walton, M. E., Behrens, T. E. J., Buckley, M. J., & Rushworth, M.
F. S. (2006). Optimal decision making and the anterior cingulate cortex. Nature Neuroscience, 9(7), 940–947. doi:10.1038/nn1724
Kersten, D., Mamassian, P., & Yuille, A. (2004). Object perception as Bayesian inference. Annual Review of Psychology, 55, 271–304. doi:
10.1146/annurev.psych.55.090902.142005
Kim, H., Sul, J. H., Huh, N., Lee, D., & Jung, M. W. (2009). Role of striatum in updating values of chosen actions. The Journal of Neuroscience, 29(47), 14701–14712. doi:10.1523/JNEUROSCI.2728-09.2009
Kim, S., Hwang, J., & Lee, D. (2008). Prefrontal coding of temporally discounted values during intertemporal choice. Neuron, 59(1), 161–172. doi:
10.1016/j.neuron.2008.05.010
Kirkby, R. J. (1969). Caudate nucleus lesions and perseverative behavior. Physiology
& Behavior, 4(4), 451–454. doi:10.1016/0031-9384(69)90135-8
Kolb, B. (1977). Studies on the caudate-putamen and the dorsomedial thalamic nucleus of the rat: Implications for mammalian frontal-lobe functions.
Physiology & Behavior, 18(2), 237–244. doi:10.1016/0031-9384(77)90128-7
Kreitzer, A. C. (2009). Physiology and pharmacology of striatal neurons. Annual Review of Neuroscience, 32, 127–147. doi:
10.1146/annurev.neuro.051508.135422
Kreitzer, A. C., & Malenka, R. C. (2007). Endocannabinoid-mediated rescue of striatal LTD and motor deficits in Parkinson’s disease models. Nature, 445(7128), 643–647.
Kreitzer, A. C., & Malenka, R. C. (2008). Striatal plasticity and basal ganglia circuit function. Neuron, 60(4), 543–554. doi:10.1016/j.neuron.2008.11.005
Lau, B., & Glimcher, P. W. (2008). Value representations in the primate striatum during matching behavior. Neuron, 58(3), 451–463.
doi:10.1016/j.neuron.2008.02.021
Lauwereyns, J., Watanabe, K., & Coe, B. (2002). A neural correlate of response bias in monkey caudate nucleus. Nature, 418(6896), 413–417. doi:10.1038/nature00844
Lee, D., Rushworth, M. F. S., Walton, M. E., Watanabe, M., & Sakagami, M. (2007).
Functional specialization of the primate frontal cortex during decision making.
The Journal of Neuroscience, 27(31), 8170–8173. doi:
10.1523/JNEUROSCI.1561-07.2007
Lee, D., Seo, H., & Jung, M. W. (2012). Neural basis of reinforcement learning and decision making. Annual Review of Neuroscience, 35, 287–308.
doi:10.1146/annurev-neuro-062111-150512
Lee, M. D., & Wagenmakers, E. J. (2014). Bayesian cognitive modeling: A practical course. Cambridge University Press.
Lindvall, O., & Björklund, A. (1974). The organization of the ascending
catecholamine neuron systems in the rat brain as revealed by the glyoxylic acid fluorescence method. Acta Physiologica Scandinavica. Supplementum, 412, 1–48.
Lindvall, O., Björklund, A., & Divac, I. (1977). Organization of mesencephalic dopamine neurons projecting to neocortex and septum. Advances in Biochemical Psychopharmacology, 16, 39–46.
Lovinger, D. M. (2010). Neurotransmitter roles in synaptic modulation, plasticity and learning in the dorsal striatum. Neuropharmacology, 58(7), 951–961.
doi:10.1016/j.neuropharm.2010.01.008
Lunn, D. J., Thomas, A., Best, N., & Spiegelhalter, D. (2000). WinBUGS – A
Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing, 10, 325–337. doi:10.1023/A:1008929526011
Matsumoto, M., & Hikosaka, O. (2007). Lateral habenula as a source of negative reward signals in dopamine neurons. Nature, 447(7148), 1111–1115.
doi:10.1038/nature05860
Mayer, M. L., & Westbrook, G. L. (1987). Cellular mechanisms underlying excitotoxicity. Trends in Neurosciences. doi:10.1016/0166-2236(87)90023-3
Montague, P. R., Hyman, S. E., & Cohen, J. D. (2004). Computational roles for dopamine in behavioural control. Nature, 431(7010), 760–767. doi:
10.1038/nature03015
Moussa, R., Poucet, B., Amalric, M., & Sargolini, F. (2011). Contributions of dorsal striatal subregions to spatial alternation behavior. Learning & Memory, 18(7), 444–451. doi:10.1101/lm.2123811
Murray, E. A., O’Doherty, J. P., & Schoenbaum, G. (2007). What we know and do not know about the functions of the orbitofrontal cortex after 20 years of
cross-species studies. The Journal of Neuroscience, 27(31), 8166–8169.
doi:10.1523/JNEUROSCI.1556-07.2007
Nicola, S. M. (2010). The flexible approach hypothesis: unification of effort and cue-responding hypotheses for the role of nucleus accumbens dopamine in the activation of reward-seeking behavior. The Journal of Neuroscience, 30(49), 16585–16600. doi:10.1523/JNEUROSCI.3958-10.2010
Niv, Y. (2009). Reinforcement learning in the brain. Journal of Mathematical Psychology, 53(3), 139–154. doi:10.1016/j.jmp.2008.12.005
Okano, K., & Tanji, J. (1987). Neuronal activities in the primate motor fields of the agranular frontal cortex preceding visually triggered and self-paced movement.
Experimental Brain Research, 66(1), 155–166. doi:10.1007/BF00236211
Oyama, K., Hernádi, I., Iijima, T., & Tsutsui, K.-I. (2010). Reward prediction error coding in dorsal striatal neurons. The Journal of Neuroscience, 30(34),
11447–11457. doi:10.1523/JNEUROSCI.1719-10.2010
Padoa-Schioppa, C. (2011). Neurobiology of economic choice: a good-based model.
Annual Review of Neuroscience, 34, 333–359. doi:
10.1146/annurev-neuro-061010-113648
Padoa-Schioppa, C., & Assad, J. A. (2006). Neurons in the orbitofrontal cortex encode economic value. Nature, 441(7090), 223–226. doi:10.1038/nature04676
Pastor-Bernier, A., & Cisek, P. (2011). Neural correlates of biased competition in premotor cortex. The Journal of Neuroscience, 31(19), 7083–7088. doi:
10.1523/JNEUROSCI.5681-10.2011
Platt, M. L., & Glimcher, P. W. (1999). Neural correlates of decision variables in parietal cortex. Nature, 400(6741), 233–238. doi:10.1038/22268
Pothuizen, H. H. J., Jongen-Rêlo, A. L., Feldon, J., & Yee, B. K. (2005). Double dissociation of the effects of selective nucleus accumbens core and shell lesions on impulsive-choice behaviour and salience learning in rats. The European Journal of Neuroscience, 22(10), 2605–2616. doi:10.1111/j.1460-9568.2005.04388.x
Quilodran, R., Rothé, M., & Procyk, E. (2008). Behavioral shifts and action valuation in the anterior cingulate cortex. Neuron, 57(2), 314–325. doi:
10.1016/j.neuron.2007.11.031
Raftery, A. E. (1995). Bayesian model selection in social research. Sociological Methodology, 25, 111–164.
Ragozzino, M. E. (2007). The contribution of the medial prefrontal cortex,
orbitofrontal cortex, and dorsomedial striatum to behavioral flexibility. Annals of the New York Academy of Sciences, 1121, 355–375. doi:
10.1196/annals.1401.013
Ragozzino, M. E., Jih, J., & Tzavos, A. (2002). Involvement of the dorsomedial striatum in behavioral flexibility: role of muscarinic cholinergic receptors. Brain Research, 953(1-2), 205–214. doi:10.1016/S0006-8993(02)03287-0
Ragozzino, M. E., Ragozzino, K. E., Mizumori, S. J. Y., & Kesner, R. P. (2002). Role of the dorsomedial striatum in behavioral flexibility for response and visual cue discrimination learning. Behavioral Neuroscience, 116(1), 105–115. doi:
10.1037//0735-7044.116.1.105
Roitman, J. D., & Shadlen, M. N. (2002). Response of neurons in the lateral
intraparietal area during a combined visual discrimination reaction time task. The Journal of Neuroscience, 22(21), 9475–9489.
Rorie, A. E., Gao, J., McClelland, J. L., & Newsome, W. T. (2010). Integration of sensory and reward information during perceptual decision-making in lateral intraparietal cortex (LIP) of the macaque monkey. PloS One, 5(2), e9308.
doi:10.1371/journal.pone.0009308
Rushworth, M. F. S., Walton, M. E., Kennerley, S. W., & Bannerman, D. M. (2004).
Action sets and decisions in the medial frontal cortex. Trends in Cognitive Sciences, 8(9), 410–417. doi:10.1016/j.tics.2004.07.009
Rutledge, R. B., Lazzaro, S. C., Lau, B., Myers, C. E., Gluck, M. A., & Glimcher, P.
W. (2009). Dopaminergic drugs modulate learning rates and perseveration in Parkinson’s patients in a dynamic foraging task. The Journal of Neuroscience, 29(48), 15104–15114. doi:10.1523/JNEUROSCI.3524-09.2009
Samejima, K., Ueda, Y., Doya, K., & Kimura, M. (2005). Representation of action-specific reward values in the striatum. Science, 310(5752), 1337–1340.
doi:10.1126/science.1115270
Schoenbaum, G., Nugent, S. L., Saddoris, M. P., & Setlow, B. (2002). Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor
discriminations. Neuroreport, 13(6), 885–890. doi:
10.1097/00001756-200205070-00030
Schoenbaum, G., & Setlow, B. (2003). Lesions of nucleus accumbens disrupt learning about aversive outcomes. The Journal of Neuroscience, 23(30), 9833–9841.
Schultz, W. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593–1599. doi:10.1126/science.275.5306.1593
Schultz, W. (2006). Behavioral theories and the neurophysiology of reward. Annual Review of Psychology, 57, 87–115. doi:
10.1146/annurev.psych.56.091103.070229
Schultz, W. (2010). Dopamine signals for reward value and risk: basic and recent data.
Behavioral and Brain Functions, 6, 24. doi:10.1186/1744-9081-6-24
Seo, H., Barraclough, D. J., & Lee, D. (2009). Lateral intraparietal cortex and reinforcement learning during a mixed-strategy game. The Journal of Neuroscience, 29(22), 7278–7289. doi:10.1523/JNEUROSCI.1479-09.2009
Seo, H., & Lee, D. (2007). Temporal filtering of reward signals in the dorsal anterior cingulate cortex during a mixed-strategy game. The Journal of Neuroscience, 27(31), 8366–8377. doi:10.1523/JNEUROSCI.2369-07.2007
Seo, H., & Lee, D. (2009). Behavioral and neural changes after gains and losses of conditioned reinforcers. The Journal of Neuroscience, 29(11), 3627–3641.
doi:10.1523/JNEUROSCI.4726-08.2009
Shen, W., Flajolet, M., Greengard, P., & Surmeier, D. J. (2008). Dichotomous
dopaminergic control of striatal synaptic plasticity. Science, 321(5890), 848–851.
doi:10.1126/science.1160575
Shidara, M., & Richmond, B. J. (2002). Anterior cingulate: single neuronal signals related to degree of reward expectancy. Science, 296(5573), 1709–1711.
doi:10.1126/science.1069504
Shiflett, M. W., & Balleine, B. W. (2011). Molecular substrates of action control in cortico-striatal circuits. Progress in Neurobiology, 95(1), 1–13. doi:
10.1016/j.pneurobio.2011.05.007
Smith, C. A. B. (1961). Consistency in statistical inference and decision. Journal of the Royal Statistical Society. Series B (Methodological), 23(1), 1–37. doi:10.2307/2983842
So, N.-Y., & Stuphorn, V. (2010). Supplementary eye field encodes option and action value for saccades with variable reward. Journal of Neurophysiology, 104(5), 2634–2653. doi:10.1152/jn.00430.2010
Sohn, J.-W., & Lee, D. (2007). Order-dependent modulation of directional signals in the supplementary and presupplementary motor areas. The Journal of
Neuroscience, 27(50), 13655–13666. doi:10.1523/JNEUROSCI.2982-07.2007
Soon, C. S., Brass, M., Heinze, H.-J., & Haynes, J.-D. (2008). Unconscious
determinants of free decisions in the human brain. Nature Neuroscience, 11(5), 543–545. doi:10.1038/nn.2112
Sugrue, L. P., Corrado, G. S., & Newsome, W. T. (2004). Matching behavior and the representation of value in the parietal cortex. Science, 304(5678), 1782–1787.
doi:10.1126/science.1094765
Sul, J. H., Jo, S., Lee, D., & Jung, M. W. (2011). Role of rodent secondary motor cortex in value-based action selection. Nature Neuroscience, 14(9), 1202–1208.
doi:10.1038/nn.2881
Sul, J. H., Kim, H., Huh, N., Lee, D., & Jung, M. W. (2010). Distinct roles of rodent orbitofrontal and medial prefrontal cortex in decision making. Neuron, 66(3), 449–460. doi:10.1016/j.neuron.2010.03.033
Surmeier, D. J., Ding, J., Day, M., Wang, Z., & Shen, W. (2007). D1 and D2 dopamine-receptor modulation of striatal glutamatergic signaling in striatal medium spiny neurons. Trends in Neurosciences, 30(5), 228–235.
doi:10.1016/j.tins.2007.03.008
Sutton, R. S., & Barto, A. G. (1990). Time-derivative models of Pavlovian reinforcement. In M. Gabriel & J. Moore (Eds.), Learning and Computational Neuroscience: Foundations of Adaptive Networks (pp. 497–537). Cambridge, MA: MIT Press.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction.
Cambridge, MA: MIT Press.
Taghzouti, K., Louilot, A., Herman, J. P., Le Moal, M., & Simon, H. (1985).
Alternation behavior, spatial discrimination, and reversal disturbances following 6-hydroxydopamine lesions in the nucleus accumbens of the rat. Behavioral and Neural Biology, 44(3), 354–363. doi:10.1016/S0163-1047(85)90640-5
Tai, L.-H., Lee, A. M., Benavidez, N., Bonci, A., & Wilbrecht, L. (2012). Transient stimulation of distinct subpopulations of striatal neurons mimics changes in action value. Nature Neuroscience, 15(9), 1281–1289. doi:10.1038/nn.3188
Tanaka, S. C., Samejima, K., Okada, G., Ueda, K., Okamoto, Y., Yamawaki, S., &
Doya, K. (2006). Brain mechanism of reward prediction under predictable and
unpredictable environmental dynamics. Neural Networks, 19(8), 1233–1241.
doi:10.1016/j.neunet.2006.05.039
Thorn, C. A., Atallah, H., Howe, M., & Graybiel, A. M. (2010). Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning.
Neuron, 66(5), 781–795. doi:10.1016/j.neuron.2010.04.036
Wagenmakers, E.-J., Lodewyckx, T., Kuriyal, H., & Grasman, R. (2010). Bayesian hypothesis testing for psychologists: a tutorial on the Savage-Dickey method.
Cognitive Psychology, 60(3), 158–189. doi:10.1016/j.cogpsych.2009.12.001
Wagner, A. R., & Rescorla, R. A. (1972). Inhibition in Pavlovian conditioning:
Application of a theory. In R. A. Boakes & M. S. Halliday (Eds.), Inhibition and Learning (pp. 301–336). London: Academic Press.
Walton, M. E., Behrens, T. E. J., Buckley, M. J., Rudebeck, P. H., & Rushworth, M. F.
S. (2010). Separable learning systems in the macaque brain and the role of orbitofrontal cortex in contingent learning. Neuron, 65(6), 927–939.
doi:10.1016/j.neuron.2010.02.027
Watkins, C. J. C. H. (1989). Learning from delayed rewards. University of Cambridge.
Watkins, C. J. C. H., & Dayan, P. (1992). Q-learning. Machine Learning, 8(3-4), 279–292. doi:10.1007/BF00992698
Weiner, I. (2003). The “two-headed” latent inhibition model of schizophrenia:
modeling positive and negative symptoms and their treatment.
Psychopharmacology, 169(3-4), 257–297. doi:10.1007/s00213-002-1313-x
Wetzels, R., Lee, M. D., & Wagenmakers, E. J. (2010). Bayesian inference using WBDev: A tutorial for social scientists. Behavior Research Methods, 42(3), 884–897.
Yang, T., & Shadlen, M. N. (2007). Probabilistic reasoning by neurons. Nature, 447(7148), 1075–1080. doi:10.1038/nature05852
Yin, H. H. (2010). The sensorimotor striatum is necessary for serial order learning.
The Journal of Neuroscience, 30(44), 14719–14723. doi:
10.1523/JNEUROSCI.3989-10.2010
Yin, H. H., & Knowlton, B. J. (2004). Contributions of striatal subregions to place and response learning. Learning & Memory, 11(4), 459–463. doi:
10.1101/lm.81004
Yin, H. H., & Knowlton, B. J. (2006). The role of the basal ganglia in habit formation.
Nature Reviews Neuroscience, 7(6), 464–476. doi:10.1038/nrn1919
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2004). Lesions of dorsolateral
striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. The European Journal of Neuroscience, 19(1), 181–189.
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2005). Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental
conditioning. The European Journal of Neuroscience, 22(2), 505–512.
doi:10.1111/j.1460-9568.2005.04219.x
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2006). Inactivation of dorsolateral striatum enhances sensitivity to changes in the action-outcome contingency in instrumental conditioning. Behavioural Brain Research, 166(2), 189–196.
doi:10.1016/j.bbr.2005.07.012
Yin, H. H., Mulcare, S. P., Hilário, M. R. F., Clouse, E., Holloway, T., Davis, M.
I., … Costa, R. M. (2009). Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nature Neuroscience, 12(3), 333–341.
doi:10.1038/nn.2261
Yin, H. H., Ostlund, S. B., & Balleine, B. W. (2008). Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of cortico-basal ganglia networks. The European Journal of Neuroscience, 28(8), 1437–1448.
doi:10.1111/j.1460-9568.2008.06422.x
Yin, H. H., Ostlund, S. B., Knowlton, B. J., & Balleine, B. W. (2005). The role of the dorsomedial striatum in instrumental conditioning. The European Journal of Neuroscience, 22(2), 513–523. doi:10.1111/j.1460-9568.2005.04218.x
Zahm, D. S. (2000). An integrative neuroanatomical perspective on some subcortical substrates of adaptive responding with emphasis on the nucleus accumbens.
Neuroscience & Biobehavioral Reviews, 24(1), 85–105.
doi:10.1016/S0149-7634(99)00065-2
Table 2.1
The grades of evidence corresponding to values of the Bayes factor.
Bayes factor    Evidence
<1              Negative (supports H0)
1–3             Weak
3–20            Positive
20–150          Strong
>150            Very strong
Note. The table lists values of the Bayes factor and the corresponding grades of evidence. Adapted from “Bayesian model selection in social research,” by A. E. Raftery, 1995, Sociological Methodology, 25, 111–164.
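As an illustration of how the grades in Table 2.1 might be applied to a computed Bayes factor, the following Python sketch maps a value to its grade. The function name `evidence_grade` is hypothetical and is not part of the analysis reported here. The treatment of the boundary values 1, 3, 20 and 150, which the table lists in adjacent ranges, is also an assumption: each boundary is assigned to the higher grade, except 150, which is kept in the Strong range because the top grade is defined as >150.

```python
import math


def evidence_grade(bayes_factor: float) -> str:
    """Map a Bayes factor to the evidence grade listed in Table 2.1.

    Boundary handling is an assumption (the table does not specify it):
    1, 3 and 20 fall in the higher grade; 150 stays in "Strong".
    """
    # A Bayes factor is a ratio of marginal likelihoods, so it must be positive.
    if math.isnan(bayes_factor) or bayes_factor <= 0:
        raise ValueError("A Bayes factor must be a positive number.")
    if bayes_factor < 1:
        return "Negative (supports H0)"
    if bayes_factor < 3:
        return "Weak"
    if bayes_factor < 20:
        return "Positive"
    if bayes_factor <= 150:
        return "Strong"
    return "Very strong"
```

Under these assumptions, a Bayes factor of 25 would be graded as Strong.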
Figure 2.1. Schematic diagram of drug injection site.
Note. Black bar: NA group; blue bar: DLS group; red bar: DMS group.
NA:  AP +1.8 mm, ML ±1.2 mm, DV −4.7 mm
DLS: AP +0.5 mm, ML ±2.5 mm, DV −3 mm
DMS: AP +0.5 mm, ML ±1.5 mm, DV −3 mm
Figure 2.2. The procedure of the 2-choice dynamic foraging task.
Note. Mice had to nose poke into the food magazine to initiate a trial. A 5-s intertrial interval (ITI) then preceded the illumination of the stimulus-response apertures, after which a light stimulus was presented in both apertures. Mice were required to nose poke into one of the illuminated apertures. Each nose poke into an illuminated aperture was followed by either the delivery of a reward or no reward, and both of