National Chengchi University


neighbors.

However, in the second phase things became less predictable: in addition to the original three core levels, the target number could also fall outside that interval.16 The frequency with which the target number visits these non-core intervals rises from nil to one fourth, accompanied by a decline in the visiting frequency of all three core levels.

In particular, the most dramatic change occurs in level 2, where this frequency declines from the original 50% to only 30%. Despite this change, level 2 remains the most likely region to host the target number, followed, almost equally, by levels 1 and 3. This may help explain why the Markov transition matrix shows a strong tendency to switch to the core levels, in particular level 2, but never degenerates to it. Therefore, from the perspective of distributional behavior, cognitive hierarchies echo guessing performance.

4.4 WMC and EWA Learning

If we consider the level of reasoning as a choice variable, we can go further and examine the dynamics between the choices made by the subjects and the aggregate result obtained by pooling those choices. This allows us to formally build a learning model, such as reinforcement learning or its generalization known as experience-weighted attraction (EWA) learning, to account for the observed correspondence between the empirical level distribution and the target-d distribution. We calibrated the EWA learning model separately with the choices made by high and low WMC subjects, expecting subjects endowed with high or low WMC to exhibit different learning parameters. In particular, the imagination factor δ distinguishes belief learning from reinforcement learning, which in essence require different degrees of cognitive ability; we therefore hypothesize that the estimates of δ correlate positively with cognitive capacity.
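For reference, the updating rule behind the parameters discussed here (φ, ρ, δ, λ, N(0) and the initial attractions Aj(0)) can be sketched as follows. This is a minimal illustration of the standard EWA recursion of Camerer and Ho (1999), not the estimation code used in this study; the function names and the payoff numbers in the usage lines are hypothetical.

```python
import math


def ewa_update(attractions, n_prev, chosen, payoffs, phi, rho, delta):
    """One EWA step: returns the updated attractions A_j(t) and N(t).

    attractions -- list of A_j(t-1)
    n_prev      -- experience weight N(t-1)
    chosen      -- index of the strategy actually played in period t
    payoffs     -- payoffs[j] is what strategy j would have earned in period t
    """
    n_new = rho * n_prev + 1.0
    updated = []
    for j, (a, pay) in enumerate(zip(attractions, payoffs)):
        # The chosen strategy gets full weight; foregone payoffs get weight delta.
        weight = 1.0 if j == chosen else delta
        updated.append((phi * n_prev * a + weight * pay) / n_new)
    return updated, n_new


def choice_probabilities(attractions, lam):
    """Logit response: P_j is proportional to exp(lam * A_j)."""
    peak = max(attractions)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (a - peak)) for a in attractions]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-strategy example: strategy 0 was played.
reinforcement, n1 = ewa_update([0.0, 0.0, 0.0], 1.0, 0, [10.0, 20.0, 0.0],
                               phi=0.9, rho=0.5, delta=0.0)
belief_like, _ = ewa_update([0.0, 0.0, 0.0], 1.0, 0, [10.0, 20.0, 0.0],
                            phi=0.9, rho=0.5, delta=1.0)
```

With δ = 0 only the chosen strategy is reinforced, whereas with δ = 1 foregone payoffs count as much as realized ones, which is why δ separates reinforcement-type from belief-type learning.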

4.4.1 Camerer’s EWA Learning

We estimated two versions of Camerer’s EWA learning model. The parameter estimates following Camerer and Ho (1999) and Camerer, Ho and Chong (2002) are given in Table 4.5 and Table 4.6 respectively.

As the first column of Table 4.5 shows, we calibrated Model I with the data of all subjects (sample size M = 1080). The δ estimate of this representative agent is zero, whereas Camerer and Ho (1999) reported δ = 0.232

16 This may also be caused by the insensitivity issue arising from narrower and narrower intervals. See footnote 14.


Table 4.5: Model Parameter Estimates of EWA Following Camerer and Ho (1999)

              Model I                                                        Camerer and Ho (1999)
Parameters    All Subjects  WMC > P67  WMC > mean  WMC < mean  WMC < P33    All Subjects

Initial values
A1(0)         1000.000      650.995    999.653     1000        1000         3.348
A2(0)         843.453       603.302    853.518     850.305     827.940      3.311
A3(0)         609.908       556.747    721.618     499.077     485.753      3.301
A4(0)         602.262       533.672    635.722     607.027     595.728      3.269
A5(0)         409.08        467.274    492.692     378.091     327.518      3.227
A6(0)         385.661       491.440    464.772     357.925     312.050      3.180
A7(0)         293.776       412.806    374.657     279.607     292.980      3.052
A8(0)         0.000         335.744    0.346       0.000       0.000        2.192
A9(0)         137.497       319.480    333.387     25.0297     28.356       2.871
A10(0)        392.352       463.211    547.117     272.644     123.005      3.060
N(0)          12.890        2.773      2.485       13.578      15.165       16.815

Decay parameters
φ             1.236         1.231      1.222       1.255       1.257        1.330
ρ             0.922         0.639      0.598       0.926       0.934        0.941

Imagination factor
δ             0.000         0.000      0.000       0.000       0.000        0.232

Payoff sensitivity
λ             0.003         0.010      0.003       0.002       0.002        2.579

Log-likelihood
-LL           3707.17       1227       1747.83     1949.78     1507.71      5878.197

Sample size


Table 4.6: Model Parameter Estimates of EWA Following Camerer, Ho and Chong (2002)

              Model II (a)              Model III (b)             Camerer, Ho and Chong (2002)
Parameters    WMC > mean  WMC < mean    WMC > mean  WMC < mean    Inexperienced  Experienced
N(0)          0           0             0           0.4716        – (c)          – (c)
φ             0.7009      0.6829        0.7347      0.6216        0.00           0.22
ρ             0           0             0.5361      0             0.00           0.00
δ             0.4360      0.5976        0.4244      0.5690        0.90           0.99
λ             0.0401      0.0300        0.0463      0.0206        – (c)          – (c)
d             –           –             0.7571      0.4367        0.13           0.11
-LL           2333.17     2526.92       1942.68     2059.27       2155.09        2128.88
M             370         520           560         430           1372           1372

(a) In Model II, initial attractions Aj(0) are initialized by first period data.

(b) In Model III, initial attractions Aj(0) are initialized by first period data and an additional parameter d is introduced to replace the unrealistic assumption.

(c) Camerer, Ho and Chong (2002) didn’t report the parameter estimates of N(0) and λ in their paper.

Figure 4.8: Guess distribution and predicted guess distribution. Panels: (a) Guess distribution: WMC > Mean; (b) Guess distribution: WMC < Mean.

(the last column of Table 4.5). However, we found no correspondence of the desired kind between cognitive capacity and the parameter δ. Even when we calibrated the EWA learning model separately for groups of subjects with different levels of WMC (above P67, above the mean, below the mean and below P33), we obtained a homogeneous δ estimate of zero. We then considered two modifications of the original EWA learning model. In Model II, we initialized Aj(0) by the observed probability of each strategy in the first period. In Model III, we additionally introduced a parameter d to replace the unrealistic assumption about knowledge of the winning number. The parameter estimates of Models II and III are given in Table 4.6, which also shows the results of Camerer, Ho and Chong (2002), where these modifications were first considered. In both models we observed heterogeneous estimates of δ, yet their relation to cognitive capacity is not consistent with our expectation: in both models the low WMC groups show the higher estimate of δ.
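One way to read the Model II initialization is that Aj(0) is chosen so that the logit choice probabilities implied by the initial attractions reproduce the first-period choice frequencies. The sketch below illustrates that reading only; it is an assumption, since the exact mapping is not spelled out in this section, and `initial_attractions`, `logit` and the `floor` guard for strategies never observed in the first period are hypothetical names introduced for illustration.

```python
import math


def logit(attractions, lam):
    """Logit choice probabilities: P_j proportional to exp(lam * A_j)."""
    peak = max(attractions)
    weights = [math.exp(lam * (a - peak)) for a in attractions]
    total = sum(weights)
    return [w / total for w in weights]


def initial_attractions(first_period_freq, lam, floor=1e-6):
    """Attractions whose logit probabilities match the given frequencies.

    Assumed mapping (not taken from the thesis): A_j(0) = ln(f_j) / lam,
    shifted so that the smallest attraction is zero. Frequencies of zero
    are replaced by `floor` to keep the logarithm finite.
    """
    logs = [math.log(max(f, floor)) / lam for f in first_period_freq]
    base = min(logs)
    return [x - base for x in logs]


# Hypothetical first-period frequencies over three strategies.
freq = [0.5, 0.3, 0.2]
a0 = initial_attractions(freq, lam=0.04)
implied = logit(a0, lam=0.04)
```

Because logit probabilities depend only on differences between attractions, the shift by the minimum does not change the implied probabilities.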


It should be noted that none of these models captures the nature of learning well. Panels (a) and (b) of Figure 4.8 show the relative frequencies of guesses made by subjects with WMCs higher and lower than the mean, respectively, and panels (c) and (d) show the relative frequencies of guesses predicted by Model II. Although EWA learning seems to predict a tendency to converge to equilibrium, its speed of convergence is too slow to mimic the experimental data.17 In fact, Camerer and Ho (1999) ran a horse race among various learning models, including choice reinforcement, belief-based and EWA learning, and also found that the EWA model explained the beauty contest game less well than the other two noncooperative games. One of the remedies suggested by Camerer and Ho (1999) is to consider learning when players are sophisticated enough to realize that other players are learning as well. Sophistication is central in the BCG for producing level-k reasoning, and it has been put into practice in Camerer, Ho and Chong (2002). We also considered level-k reasoning during the learning process, yet from an alternative perspective.

4.4.2 EWA Rule Learning

We redefined the choice variable from 101 guessing numbers to 6 level rules. We presumed that it is more plausible for subjects to “reinforce” a few strategies than many. In this case, levels of reasoning are calculated independently and their attractions are reinforced directly. Two versions of EWA rule learning were estimated; the parameter estimates are given in Table 4.7 for Model IV and Table 4.8 for Model V. Note that the parameter estimates are not sensitive to whether Aj(0) is estimated or not. The relative frequencies of levels predicted by Model V are shown in Figure 4.9. In both models, we found that EWA rule learning reasonably captures the attractiveness of d = 2 through the learning process.

Our EWA rule learning seems to be more descriptive than Camerer’s original EWA learning, at least in beauty contest experiments. This implies that the experience-weighted attraction framework still works, provided that the strategies we assume subjects learn over are themselves descriptive.
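The -LL values reported for the rule-learning models are negative log-likelihoods. A minimal sketch of how such a value could be computed for a sequence of rule choices is given below, assuming the standard EWA recursion with logit choice probabilities over the 6 rules. The per-rule payoffs are hypothetical inputs, since the payoff assigned to each level rule is not restated in this section, and the function name is illustrative rather than the estimation code used here.

```python
import math


def negative_log_likelihood(choices, payoff_rows, a0, n0, phi, rho, delta, lam):
    """Sum of -ln P(observed rule) over periods under EWA with logit choice.

    choices     -- index of the rule chosen in each period
    payoff_rows -- payoff_rows[t][j] is the (hypothetical) payoff of rule j in period t
    a0, n0      -- initial attractions A_j(0) and initial experience weight N(0)
    """
    attractions, n = list(a0), n0
    nll = 0.0
    for chosen, payoffs in zip(choices, payoff_rows):
        # Probability of the observed choice, based on attractions before updating.
        peak = max(attractions)
        weights = [math.exp(lam * (a - peak)) for a in attractions]
        nll -= math.log(weights[chosen] / sum(weights))
        # EWA update: chosen rule weighted by 1, unchosen rules by delta.
        n_new = rho * n + 1.0
        attractions = [
            (phi * n * a + (1.0 if j == chosen else delta) * pay) / n_new
            for j, (a, pay) in enumerate(zip(attractions, payoffs))
        ]
        n = n_new
    return nll


# Hypothetical data: three periods, six rules, payoff 1 for rule 2 and 0 otherwise.
rows = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0]] * 3
uniform = negative_log_likelihood([2, 2, 2], rows, [0.0] * 6, 1.0,
                                  phi=0.9, rho=0.5, delta=0.5, lam=0.0)
responsive = negative_log_likelihood([2, 2, 2], rows, [0.0] * 6, 1.0,
                                     phi=0.9, rho=0.5, delta=0.5, lam=5.0)
```

With λ = 0 every rule is chosen with probability 1/6 regardless of attractions, so the value reduces to the number of observations times ln 6; a positive λ lowers it when subjects keep choosing the rule that pays.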

As for the intelligence effect, high capacity corresponds to high imagination ability in general, although the correlation between WMC and δ is not linear. First, we compared the estimates of δ for high and low WMC groups based on three classification criteria: the mean, one-third and one-fourth. For all three criteria, high WMC groups exhibit higher estimates of δ than

17 We do not show the results of Model I, because its log-likelihood is too large to be true.


Table 4.7: Model Parameter Estimates of EWA Rule Learning: Model IV

Parameters   WMC > P75   WMC > P67   WMC > mean   WMC < mean   WMC < P33   WMC < P25
A1(0)        533.361     518.615     500.240      533.144      553.022     513.975
A2(0)        493.314     468.395     475.770      549.254      559.275     510.991
A3(0)        789.98      664.918     600.983      551.464      574.821     511.407
A4(0)        791.015     678.365     573.574      524.537      524.52      504.834
A5(0)        646.955     549.271     451.966      468.705      430.489     486.738
A6(0)        609.044     512.761     416.919      484.360      474.161     497.761
N(0)         0.490       0.584       0.2101       1.131        1.287       4.918
φ            0.889       0.914       0.861        0.848        0.665       0.820
ρ            0.000       0.000       0.000        0.711        0.223       0.949
δ            0.411       0.484       0.578        0.472        0.405       0.314
λ            0.010       0.009       0.011        0.027        0.017       0.090
-LL          480.352     562.985     815.472      927.286      712.276     566.625
M            320         370         520          560          430         340

In Model IV, initial attractions Aj(0) are estimated by all data.


Table 4.8: Model Parameter Estimates of EWA Rule Learning: Model V

Parameters   WMC > P75   WMC > P67   WMC > mean   WMC < mean   WMC < P33   WMC < P25
N(0)         0.333       0.178       0.083        0.706        14.068      9.171
φ            0.906       0.939       0.864        0.849        0.677       0.493
ρ            0.260       0.362       0.114        0.675        1.000       0.964
δ            0.464       0.558       0.588        0.489        0.400       0.277
λ            0.014       0.015       0.012        0.025        0.255       0.181
-LL          483.013     564.933     815.656      929.064      713.718     569.032
M            320         370         520          560          430         340

In Model V, initial attractions Aj(0) are initialized by first period data.

Figure 4.9: Level distribution and predicted level distribution. Panels: (a) Level distribution: WMC > Mean; (b) Level distribution: WMC < Mean.

low WMC groups (higher than mean vs. lower than mean, P67 vs. P33, P75 vs. P25). However, δ increases linearly with WMC only over the range from P25 to higher than the mean. Subjects with even higher WMCs (P67 and P75) instead exhibit decreasing δs.

We also applied the likelihood ratio (LR) test to study the statistical significance of differences in the parameter estimates of Model V.18 Let $\hat{\theta} = \{\widehat{N(0)}, \hat{\phi}, \hat{\rho}, \hat{\delta}, \hat{\lambda}\}$ denote the full set of parameter estimates, with $\hat{\theta}_h$ and $\hat{\theta}_l$ the estimates for the high and low WMC groups respectively, and let $LL_h(\cdot)$ denote the log-likelihood function for a high WMC group. With this notation, $LL_h(\hat{\theta}_h)$ is the value of the log-likelihood function for the high WMC group at its own estimates, which defines the log-likelihood of the unrestricted model. To test for a difference in general learning behavior between high and low WMC groups, we substituted $\hat{\theta}_l$ for $\hat{\theta}_h$ in $LL_h(\cdot)$, as if imposing the restriction $\theta = \hat{\theta}_l$, and obtained the corresponding $LL_h(\hat{\theta}_l)$, which defines the log-likelihood of the restricted model. The test statistic, known as the likelihood ratio, is twice the difference between these log-likelihoods, unrestricted minus restricted, so that it is non-negative:

\[ LR = 2\left( LL_h(\hat{\theta}_h) - LL_h(\hat{\theta}_l) \right) \]

18 We applied the statistical test only to Model V, because the parameter estimates in Model IV are similar to those in Model V.

Under the null hypothesis that the restrictions are true, the LR statistic is distributed as a chi-squared random variable with degrees of freedom equal to the number of restrictions, denoted m. In the LR test comparing general learning behavior, m equals 5. The upper part of Table 4.9 presents these results. To test for a difference in the single learning parameter δ between high and low WMC groups, we instead replaced $\hat{\delta}_h$ with $\hat{\delta}_l$ in $LL_h(\cdot)$ and obtained $LL_h(\widehat{N(0)}_h, \hat{\phi}_h, \hat{\rho}_h, \hat{\delta}_l, \hat{\lambda}_h)$. The likelihood ratio is then:

\[ LR = 2\left( LL_h(\hat{\theta}_h) - LL_h(\widehat{N(0)}_h, \hat{\phi}_h, \hat{\rho}_h, \hat{\delta}_l, \hat{\lambda}_h) \right) \]

This statistic follows a chi-squared distribution with m = 1 degree of freedom. The bottom part of Table 4.9 presents these results.
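The mechanics of these comparisons can be sketched as follows: take twice the gap between the unrestricted and restricted log-likelihoods and compare it with a chi-squared distribution with m degrees of freedom. This is an illustration only, not the code used for Table 4.9. The helper names are hypothetical, the log-likelihood values in the usage lines are made up, and the inputs are log-likelihoods (the negatives of the -LL values reported in the tables). To stay within the standard library, the tail probability uses the closed form that exists for odd degrees of freedom, which covers m = 1 and m = 5.

```python
import math


def chi2_sf_odd(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with odd df."""
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form only covers odd degrees of freedom")
    if x <= 0:
        return 1.0
    tail = math.erfc(math.sqrt(x / 2.0))
    # Successive correction terms: x^(j - 1/2) / (1 * 3 * ... * (2j - 1)).
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for j in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * j + 1)
    return tail


def lr_test(ll_unrestricted, ll_restricted, df):
    """Return the LR statistic and its p-value under the null hypothesis."""
    stat = 2.0 * (ll_unrestricted - ll_restricted)
    return stat, chi2_sf_odd(stat, df)


# Hypothetical log-likelihood values, for illustration only.
stat, p_value = lr_test(-100.0, -103.0, df=1)
```

As a sanity check, `chi2_sf_odd(11.071, 5)` is approximately 0.05, in line with the 5% critical value quoted for the general comparisons.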

When we compared all parameter estimates simultaneously, the mean of WMC separates subjects into two groups in terms of learning behavior. As the upper part of Table 4.9 shows, subjects differ between groups but not within groups. The only exception (lower than mean vs. P75) was close to significant, with a p-value of 0.07.

Returning to our main interest, we confirmed that the estimates of δ differ significantly between groups such as P67 vs. P33 and P75 vs. P25. However, there is no evidence that the estimates of δ differ between the groups above and below the mean. Recall that a deviation from the mean WMC, whether upward or downward, leads to a decline in δ. The results indicate that low WMC groups (P33 and P25) suffer a greater decline in δ than high WMC groups (P67 and P75). On the one hand, the decline of δ in high WMC groups is not statistically significant. Even though the P75 subjects exhibit the lowest δ within the high WMC groups (0.4635), they are not significantly different from subjects with WMC higher than the mean, who have the highest δ (0.5877). On the other hand, the decline of δ in low WMC groups is statistically significant. The P25 subjects, who exhibit the lowest δ among all groups, are significantly different from all other groups except the P33 subjects.

Table 4.9: LR Test for the Significance of Difference in Parameter Estimates: Model V

Columns: WMC > P67 ($\hat{\delta}$ = 0.5582), WMC > mean ($\hat{\delta}$ = 0.5877), WMC < mean ($\hat{\delta}$ = 0.4894), WMC < P33 ($\hat{\delta}$ = 0.3998), WMC < P25 ($\hat{\delta}$ = 0.2774)

Upper panel: General comparisons: $\hat{\theta}_l \to LL_h(\cdot)$

Lower panel: Single parameter comparisons: $\hat{\delta}_l \to LL_h(\cdot)$

The test statistics $\chi^2$ and their p-values (below in parentheses) are shown. The critical values for general comparisons are $\chi^2_{0.95}(5) = 11.071$,


In summary, we showed that WMC can reasonably separate the learning behaviors behind the beauty contest experiment. In particular, lower cognitive capacity leads to a shrinking share of belief learning, as characterized by the value of δ.


Chapter 5

Discussion and Conclusions

This study contributes to the burgeoning literature on the inquiry into the relation between cognitive capacity and cognitive hierarchy. There are three major research questions being posed in this line of study.

• First, would cognitive capacity have a positive effect on subjects’ revealed cognitive hierarchies?

• Second, would this effect endure when subjects become more experienced?

• Third, would cognitive capacity affect the way in which subjects learn?

Based on the detailed analysis of our beauty contest experiments, we provide some clues for all three questions. We first show that cognitive capacity can affect the revealed cognitive hierarchy, at least in the initial periods when all subjects are novices. We then show that, if the environment can be repeated like “the groundhog day”, a metaphor borrowed from Thaler (2000), the influence of cognitive capacity becomes weaker but does not disappear completely. Third, the analysis based on the Markov transition matrix indicates that subjects with different WMCs may have different underlying learning patterns, which is later confirmed by calibrating the EWA learning model to the data. In sum, we find that individual differences in working memory capacity predict performance even when learning is involved, the revealed reasoning levels, and the learning dynamics in beauty contest games. These results support the relevance of cognitive capability to strategic thinking in both a novel and perhaps a “familiar” environment.

Among the very few studies in this area, our answer to the first question is in line with Branas-Garza et al. (2012a) and Gill and Prowse (2012), but differs from Georganas et al. (2010). Our answer to the second question is, to some extent, in the same direction as that observed in Branas-Garza et al. (2012a) and Gill and Prowse (2012), although our endurance effect is much weaker, particularly compared to Gill and Prowse (2012).1 Schnusenberg and Gallo (2011), however, find that cognitive ability matters only in the initial periods of their repeated beauty contest games. This may be attributed to the experimental design: they conduct classroom experiments and repeat runs between classes. In such a design, it is more difficult to control the information upon which learning depends.2 In relation to the third question, our evidence, whether indirect or direct, points in the same direction as what Gill and Prowse (2012) found in their rule-learning model. Note that we share with Gill and Prowse (2012) the presumption that subjects learn from their experience with level-k rules rather than with guessing numbers.

While this study tries to contribute to the literature by connecting our results to existing work, such as that summarized in Table 2.1, the literature as a whole has not yet reached the point where the robustness of the obtained results can be established. This is mainly because these experiments have not been ‘standardized’ in a way that allows meaningful comparisons; on the contrary, the designs vary considerably in the number of rounds, the number of different values of p, the size or number of participants, the information available during the game, the measure of cognitive capacity and many other technical implementations. Given this variety, it will take more time to work out their differences, and until then the extent to which cognitive capacity and cognitive hierarchy are related remains an issue to be explored further. For example, future research should determine the minimal time horizon and information required for the learning effect to dominate the cognitive-ability effect.

1 It is hard to see where Branas-Garza et al. (2012a) stand on this point. They ran a number of regressions to examine the effect of cognitive capacity on the reasoning level. If all the data from their six p-beauty contest games are pooled and subjects are assumed to be able to learn in the Weber sense, i.e., learning without feedback (Weber, 2003), then the data as a whole will be dominated by the data generated by experienced subjects. If so, the effect of cognitive capacity should be minimal and the corresponding coefficient should be insignificant. Hence, their regression result of a significant coefficient on cognitive capability measured by CRT may be regarded as evidence that the gap in cognitive capacity remains even though agents are able to learn. However, since they did not separate the data into initial and later periods, as in our Table 4.1, this cannot be confirmed further.

2 In addition, they present subjects with the guess distribution of the previous round, which is not available in our design.

Leaving our specific BCG (multi-person and multiple rounds) aside, it is still interesting to ask in which general environments models of cognitive hierarchy may be applied and the hierarchy depends on cognitive capacity, whether or not that dependence goes away with repetition and learning. Our findings accord with previous studies that support the relevance of intelligence for a subject’s initial response to a novel encounter. Such evidence is found in a beauty contest (Burnham et al., 2009; Schnusenberg and Gallo, 2011), a simplified beauty contest (Rydval et al., 2009), an undercutting game (Georganas et al., 2010), and a normal-form game solvable by iterated dominance, dirty faces, and an extensive-form game solvable by backward induction (Devetag and Warglien, 2003). Besides using designs with novelty, these games also share a similar solution concept: iterated reasoning, which is also required when applying the level-k heuristic. Note that the beauty contest, simple matrix game, and undercutting game belong to the family that seems to trigger the level-k heuristic more frequently than the other family of games (Georganas et al., 2010). In addition, the three simple games selected by Devetag and Warglien (2003) are solvable by applying some form of iterated reasoning. In sum, what we consistently learn from the literature is that the initial response to a novel encounter and the need to apply iterated reasoning may turn out to be the (joint) necessary conditions defining where cognitive capacity plays a role.

To take the analysis one step further, a challenge ahead of us is to figure out an environment (experiment) where cognitive capacity can be positively related to cognitive hierarchy, and learning does not annihilate this relation. When the link between the two disappears, what can account for the distribution of cognitive hierarchy? For the latter, evolutionary models, such as the adaptive belief systems (Brock and Hommes, 1998), may play a role.

An additional interesting avenue of investigation might be to determine the maximal size of a strategy space that subjects can reasonably deal with. Recall that the desired connection between cognitive capacity and learning parameters is found only when the strategy space is reduced from 101 guessing numbers to 6 level-k rules. One might attribute this improvement to the explanatory power of the level-k model in this specific context. Another related but not identical interpretation would credit the limited number of level-k rules we use for calibration. More extensive research would be necessary to test this less-is-more effect.3

3 We borrow this term from Goldstein and Gigerenzer (2002). They believe that optimal statistical procedures are too complicated for the ordinary mind to carry out and consider heuristics, such as the recognition heuristic, to be imperfect but adaptive strategies. Goldstein and Gigerenzer (2002) specify the conditions under which “less” knowledge is better than “more” for making accurate inferences. We believe that the less-is-more effect is not only


Another extension of this study would be to explore some policy and ethical issues. Recall that the beauty contest is an analogy originally used by Keynes to capture the strategic nature of investment behavior. In financial markets, if more cognitively able traders can take advantage of less able ones or, more severely, spend money to ensure that they interact with the less cognitively able (Gill and Prowse, 2012), income inequality would worsen. Future research may focus on how intervention such as
