Method - The Donor-Recipient Game Simulated with an Agent-Based Model

Chapter 1. Introduction

1.3 Method

17th International Conference on Computing in Economics and Finance. 2011, obtained analytic solutions for the cooperative evolutionary stable state (CESS) and the non-cooperative evolutionary stable state (NESS) of the Donor-Recipient Game under the three social norms. Comparing the Mean-Field approach with the Agent-Based Model in terms of the evolutionary stable states, the areas of the attraction basins, and the trajectories under the three social norms reveals some interesting new results. The Agent-Based Model simulation results are virtually the same as those from the Mean-Field approach, but slight differences remain.

The simulation setup of the Agent-Based Model of the Donor-Recipient Game resembles a complex system: the interactions among the agents have characteristics that give complex, self-organizing systems their adaptive and evolutionary ability. Agents actively change and learn from what happens, turning it to their own advantage. It is in the nature of species in a changing environment to evolve constantly in order to live better.

Based on the original setting of the Donor-Recipient Game, agents within a society interact with each other and have two or three possible actions: cooperation, defection, or punishment. The different choices form a series of social norms. The individual on the active side takes an action according to the opponent’s reputation; the opponent on the passive side can only accept the action. At the same time, based on the applied social norm, a new reputation is assigned to the active individual. This new reputation determines what actions others will take toward him in subsequent games.
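The interaction described above can be sketched in a few lines. This is only an illustration under stated assumptions: a strategy is encoded as two letters (the action toward a Good recipient, then the action toward a Bad recipient), and `EXAMPLE_NORM` is a hypothetical reputation table, not one of the three social norms studied in this work.

```python
from typing import Dict, Tuple

# Assumed encoding: strategy[0] is the action toward a Good ("G") recipient,
# strategy[1] is the action toward a Bad ("B") recipient, e.g. "CD".
def choose_action(strategy: str, recipient_reputation: str) -> str:
    return strategy[0] if recipient_reputation == "G" else strategy[1]

# Hypothetical norm table (illustration only):
# (donor action, recipient reputation) -> donor's new reputation.
EXAMPLE_NORM: Dict[Tuple[str, str], str] = {
    ("C", "G"): "G", ("C", "B"): "G",
    ("D", "G"): "B", ("D", "B"): "G",
    ("P", "G"): "B", ("P", "B"): "G",
}

def play_once(strategy: str, recipient_reputation: str,
              norm: Dict[Tuple[str, str], str]) -> Tuple[str, str]:
    """Return the donor's action and the reputation the norm assigns to the donor."""
    action = choose_action(strategy, recipient_reputation)
    return action, norm[(action, recipient_reputation)]
```

Passing the norm as a lookup table keeps the game logic unchanged when a different social norm is substituted.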

Individuals in each society interact with each other under the applied social norm, and they learn and update their strategies to obtain higher individual payoffs. This within-society competition determines how often different strategies are used in the society.

In order to combine the two disciplines of Economics and Physics, we use the Maxwell-Boltzmann distribution, one of the most commonly applied probability distributions in statistical mechanics. For any macroscopic physical system, temperature is a parameter arising from the motions of molecules or atoms. The molecules or atoms collide with each other so that, when the system is in equilibrium, their velocities follow the Maxwell-Boltzmann distribution, with a width that increases with temperature (the variance of each velocity component is proportional to temperature). In simulation, a dynamic process is often driven to equilibrium by imposing a transition rule that leads the simulated system to states described by such distributions. The Metropolis algorithm has been widely used as a multi-state transition rule in simulation to impose Maxwell-Boltzmann distributions on various complex systems beyond systems of molecules, such as the Potts model.
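As a minimal sketch of the Metropolis transition rule mentioned above, the following applies the standard acceptance probability min(1, exp(-ΔE/T)) to a toy two-level system. The function names, the two-level setup, and the units (Boltzmann constant set to 1) are assumptions chosen for illustration.

```python
import math
import random

def metropolis_accept(delta_energy: float, temperature: float,
                      rng: random.Random) -> bool:
    """Accept a proposed move with probability min(1, exp(-dE / T))."""
    if delta_energy <= 0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)

def sample_two_level(gap: float, temperature: float,
                     steps: int, seed: int = 0) -> float:
    """Fraction of steps spent in the upper level of a two-level system."""
    rng = random.Random(seed)
    state = 0  # 0 = lower level, 1 = upper level
    steps_in_upper = 0
    for _ in range(steps):
        proposed = 1 - state
        delta = gap if proposed == 1 else -gap
        if metropolis_accept(delta, temperature, rng):
            state = proposed
        steps_in_upper += state
    return steps_in_upper / steps
```

In the long run the occupation of the upper level should approach the Boltzmann weight exp(-gap/T) / (1 + exp(-gap/T)), which is how the rule imposes the equilibrium distribution.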

In the agent-based model, the Donor-Recipient Game induces a non-equilibrium dynamic process. The driving force for evolution comes from the strategy-updating mechanism. We impose a social learning procedure to produce such a mechanism.

The flux of wealth is analogous to the velocity in systems of molecules. At a given temperature, we apply the Metropolis algorithm to the transition probabilities among strategies. The decision whether to switch strategies is then controlled by a mechanism that imposes the probability function governing molecular velocities on the distribution of the individuals' fluxes of wealth. In the simulation model, the microstate of an individual player is labeled by the player's strategy rather than by the flux of wealth, as in systems described by the Potts model.

The mean flux of wealth for every strategy is calculated and compared. The chance of adopting a new strategy is then determined by the transition probability. We call this a ‘social learning’ procedure.

We create an account for each agent from the game play. At each game step we record the agent’s flow income and stock wealth, the donor’s and recipient’s reputations, and the donor’s and recipient’s strategies, and we try to form some core hypotheses about the model: the agents’ strategy evolution dynamics, the trajectories, the attraction basins of CESS and NESS, and the convergence speed in the Donor-Recipient Game. At the present stage, our simulation outcomes show both consistencies and inconsistencies between the Agent-Based Model and the Mean-Field Equation Model.

The remainder of the paper is organized as follows. Chapter 2 presents a model of an Evolutionary Donor-Recipient Game. Chapter 3 compares the attraction basins of CESS and NESS under the three social norms, and the dynamics of strategy evolution, between the Agent-Based Model and the Mean-Field Equation Model. Chapter 4 gives the conclusions and discussion.

In this paper, we study the dynamics of the donor-recipient game using agent-based simulation.

In our agent-based model, agents are randomly matched in pairs and in time. At each point in time (step), two agents are randomly chosen as a pair to play the donor-recipient game. One of them plays the role of the donor, and the other plays the role of the recipient. These roles are also randomly determined. Based on the standard payoff matrix of the donor-recipient game, the payoffs of the two players are determined by their chosen or received actions: cooperate (C), defect (D), or punish (P). The payoffs are constantly updated and cumulatively attributed to each agent.
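One step of this random matching might look like the sketch below. The payoff numbers in `PAYOFFS` are hypothetical placeholders, not the values used in this work, and for brevity each agent's action is fixed rather than conditioned on the recipient's reputation.

```python
import random
from typing import List, Tuple

# Hypothetical payoff values: action -> (donor's change, recipient's change).
PAYOFFS = {
    "C": (-1.0, 3.0),   # donor pays a cost, recipient receives a benefit
    "D": (0.0, 0.0),    # nothing changes hands
    "P": (-1.0, -4.0),  # donor pays a cost, recipient suffers a penalty
}

def play_step(wealth: List[float], actions: List[str],
              rng: random.Random) -> Tuple[int, int, str]:
    """Draw two distinct agents at random; the first drawn acts as donor."""
    donor, recipient = rng.sample(range(len(wealth)), 2)
    action = actions[donor]
    donor_change, recipient_change = PAYOFFS[action]
    # Payoffs accumulate in each agent's wealth account.
    wealth[donor] += donor_change
    wealth[recipient] += recipient_change
    return donor, recipient, action
```

Because `random.Random.sample` returns the two agents in random order, both the pairing and the assignment of roles are random, as the text requires.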

The strategy (decision rule) that an agent uses to play the game evolves over time as he learns. In this article, we assume that agents are able to learn from other participants’ experience; hence, it is a style of social learning. We assume that each agent learns each time he has played the role of donor twice. The learning takes the form of a reconsideration of the choice of strategy.

Basically, the incumbent strategy will be challenged by the available alternative.

One of the two will be stochastically chosen as the new strategy.

This stochastic choice is characterized by the familiar logistic (Boltzmann-Gibbs) distribution, which is based on the gain in performance of the incumbent strategy compared with that of the alternative. The performance of each strategy is measured by its associated increments in wealth. Every time a strategy is applied by an agent in an encounter, we can observe the change that strategy brings to the agent's wealth. This information is accumulated so that the expected wealth increment from adopting each strategy is available to all members of the society in the form of an average over the records.
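A logistic choice between the incumbent and the alternative can be sketched as follows. The exact functional form and the `temperature` parameter are assumptions: the text names the logistic (Boltzmann-Gibbs) distribution but does not spell out its parameterization here.

```python
import math
import random

def switch_probability(incumbent_gain: float, alternative_gain: float,
                       temperature: float) -> float:
    """Logistic probability of adopting the alternative strategy."""
    x = (alternative_gain - incumbent_gain) / temperature
    # Numerically stable evaluation of 1 / (1 + exp(-x)).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def choose_strategy(incumbent: str, alternative: str,
                    incumbent_gain: float, alternative_gain: float,
                    temperature: float, rng: random.Random) -> str:
    """Stochastically keep the incumbent or switch to the alternative."""
    p = switch_probability(incumbent_gain, alternative_gain, temperature)
    return alternative if rng.random() < p else incumbent
```

With equal average gains the two strategies are chosen with equal probability; a lower temperature makes the choice more sensitive to the difference in gains.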

We use two averaging formulas, named “simple-average” (player-weighted) and “weighted-average” (event-weighted). The calculation method is similar to that in the article by Aoki, M. and Yoshikawa, H. (2012), “Non-self-averaging in macroeconomics models: a criticism of modern micro-founded macroeconomics.” That article shows the conditions under which the mean-field interaction can be a poor approximation of the whole complex web of interactions.
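The difference between the two averages can be illustrated with a short sketch. The definitions below are one plausible reading, not the formulas of this work: the event-weighted average pools every recorded encounter, while the player-weighted average first averages within each player and then across players. The record layout is likewise an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[int, str, float]  # (player id, strategy, wealth increment)

def event_weighted_mean(records: Iterable[Record]) -> Dict[str, float]:
    """Average wealth increment per strategy, one vote per encounter."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _player, strategy, increment in records:
        totals[strategy] += increment
        counts[strategy] += 1
    return {s: totals[s] / counts[s] for s in totals}

def player_weighted_mean(records: Iterable[Record]) -> Dict[str, float]:
    """Average wealth increment per strategy, one vote per player."""
    per_player = defaultdict(list)
    for player, strategy, increment in records:
        per_player[(strategy, player)].append(increment)
    per_strategy = defaultdict(list)
    for (strategy, _player), increments in per_player.items():
        per_strategy[strategy].append(sum(increments) / len(increments))
    return {s: sum(m) / len(m) for s, m in per_strategy.items()}
```

For example, if one player records three increments of 3 and another records a single increment of -1 under the same strategy, the event-weighted mean is 2 while the player-weighted mean is 1.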

Agent-Based Models retain fluctuations that are not included in the Mean-Field analysis. Agent-Based Models therefore produce situations closer to what happens in real society. We found that the attractors obtained from simulations of ABMs are in general the same as those from the mean-field analysis under all social norms, but the volumes of the attraction basins are adjusted.

We considered two different counting methods for producing the information content used in the social-learning procedure: player-weighted and event-weighted. We found that the three main solutions are consistent with the previous study by Tongkui Yu, Shu-heng Chen, and Honggang Li, "Social Norm, Costly Punishment and the Evolution to Cooperation" (2011). Namely, DD, CD and CP are the dominant attractors in the simple social norm, the weakly augmented social norm and the strongly augmented social norm, respectively.

In our simulation outcomes, the CESS attractors are CD, CD and CP in the simple norm, the weakly augmented norm and the strongly augmented norm, respectively. Another CESS attractor, CC, appears in all three social norms. In the strongly augmented norm in particular, CC is the second most dominant attractor in our simulations, while the attraction basin of CC was found to be larger than that of DD in the phase portraits of the mean-field analysis by Yu et al. (2011).

In the player-weighted society, unstable attractors appear in the early stage of evolution, when it is not easy to distinguish the strengths and weaknesses of the various strategies from the coarsened information. The agents have equal chances of taking each strategy, and the systems stay at the center of each phase portrait, where an unstable attractor appears, until enough data measuring the merits of all the strategies has accumulated. With this information, the systems evolve away from the center of the phase portrait and approach stable attractors. We find that the attractors may lose their positions one by one over time in the competition among strategies, which results in the final convergence of the systems along the edges of the phase portrait, indicating a two-sided competition. These observations also suggest that other unstable attractors may appear.

In the event-weighted society, the attractors are stable. Because the information is refined, all societies converge to stable attractors relatively quickly. Under the same setup, societies employing player-weighted social learning require 60,000 steps to stabilize, whereas those using event-weighted social learning need only about 1,000 steps.

There are two interesting new observations revealed in our agent-based simulations. One observation is that a minor stable attractor may survive over the time evolution, supported by harmonious societies in which all agents are reputed as “good”. In contrast, the agents in the societies residing at the major attractor are not inclined to be reputed mainly as “good” or as “bad”; the chances are 50-50. For instance, there is a tendency toward the CD strategy, which is a non-dominant attractor in the strongly augmented social norm, and the entire population of the societies adopting this strategy has a Good reputation. The other observation is that the competition between strategies may display dynamic orbits as the final domain of the time evolution.

Based on the framework of the donor-recipient game proposed by Tongkui Yu, Shu-heng Chen, and Honggang Li (Yu et al., 2011), we study a multi-agent model, with an emphasis on the asynchronous nature of the strategy-updating process for individual members of the model society.

The asynchronous decision making originates mainly from the diversity in the timing of events among individuals, who are chosen randomly to play the role of donor or recipient in a game. During a game, the donor acts in accord with her strategy in response to the reputation status of the recipient. The action leads to a renewed reputation status for the donor, regulated by the social norm of the society. At the end of the game, the wealth of the donor and that of the recipient may change in the form of a cost (for the donor) and a benefit or penalty (for the recipient). The knowledge of the wealth changes experienced by members of the whole society in adopting each of the strategies is used by a player who considers adjusting her strategy. This happens when she has just played the role of donor UD times since her last strategy update. The member who adjusts her strategy randomly picks a strategy and decides whether to change to it by expecting a better payoff, either as a gain or as a loss in her wealth.
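The asynchronous update trigger described above can be sketched with a per-agent counter. The `Agent` class, the `update_interval` parameter (standing in for UD), and the pluggable `accept` rule are hypothetical names introduced for illustration only.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Agent:
    strategy: str
    donor_plays_since_update: int = 0

def after_donor_play(agent: Agent, update_interval: int,
                     strategies: List[str], mean_gain: Dict[str, float],
                     accept: Callable[[float, float, random.Random], bool],
                     rng: random.Random) -> bool:
    """Count a donor play; reconsider the strategy every update_interval plays.

    Returns True if the agent reconsidered her strategy on this call.
    """
    agent.donor_plays_since_update += 1
    if agent.donor_plays_since_update < update_interval:
        return False
    agent.donor_plays_since_update = 0
    candidate = rng.choice(strategies)  # randomly picked alternative
    if candidate != agent.strategy and accept(
            mean_gain[agent.strategy], mean_gain[candidate], rng):
        agent.strategy = candidate
    return True
```

Since each agent's counter advances only when she happens to be drawn as donor, agents reach their update points at different times, which is the source of the asynchrony.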

The variety of fluctuations behind the dynamics of this heterogeneous model society, which underscores the diversity of the sources of randomness, tests the robustness of the simulation outcomes. These fluctuations include the asynchronous decision-making and the random selection of players. Moreover, the random selection of players in a game and the asynchronous occurrences of