
Shu-Heng CHEN, Tina YU

Agents learned, but do we? Knowledge discovery using the agent-based double auction markets

© Higher Education Press and Springer-Verlag Berlin Heidelberg 2011

Abstract  This paper demonstrates the potential role of autonomous agents in economic theory. We first dispatch autonomous agents, built by genetic programming, to double auction markets. We then study the bargaining strategies discovered by them, and from there, an autonomous-agent-inspired economic theory with regard to optimal procrastination is derived.

Keywords  agent-based double auction markets, autonomous agents, genetic programming, bargaining strategies, monopsony, procrastination strategy

1 Motivation and introduction

Economics is about the efficient use of resources, which very much relies on the ability of humans to discover chances and hidden patterns. However, what is lacking in current economic theory is a proper model of the chance-discovering agents. Recent developments with regard to autonomous agents have provided economists with an opportunity to fill this intellectual gap. This is particularly evident in the growing literature on agent-based computational economics [1]. The massive use of the tools of intelligent agents has placed various kinds of autonomous agents in economic environments so that they can explore their surroundings and make decisions without too much external supervision [2]. Models built using these autonomous agents can, therefore, evolve on their own, and changes are no longer imposed exogenously, but generated endogenously.

In addition, by studying what kinds of chances or patterns are being discovered by these agents, we, as model-builders, can also better learn about the intricate structure of the models. In this way, autonomous agents not only learn by themselves, but also "instruct" model-builders to learn. Nevertheless, current studies on agent-based economic models largely focus only on the macroscopic level. The microscopic analysis has not been advanced enough to gain insights into the discovery behavior of autonomous agents. In this paper, we use an agent-based double auction market, in which autonomous agents are built by genetic programming, to illustrate the challenges posed by knowledge discovery to the ecological market dynamics. In addition, through these familiar double auction markets, we shall see why this task can be difficult and how interdisciplinary research can help to make a breakthrough.

Received July 10, 2010; accepted December 16, 2010

Shu-Heng CHEN
Department of Economics, Chengchi University, Taipei, China
E-mail: chen.shuheng@gmail.com

Tina YU
Department of Computer Science, Memorial University of Newfoundland, Canada

The rest of this paper is organized as follows. Section 2 presents the experimental designs of the paper. We adopt four market designs (four different demand and supply schedules), each with a different equilibrium (or equilibria), so as to generalize what we may be able to learn from our dispatched autonomous agents. Section 3 provides the simulation results. After a short summary, we analyze the best strategies found in each market scenario and try to discover the rationale behind them. A generalization of what we learn from these four cases motivates the theory of optimal procrastination proposed in Sect. 4, followed by concluding remarks in Sect. 5.

2 Experimental designs

2.1 Environment

In this paper, we consider four different schedules of demand and supply, as shown in Figs. 1 and 2. Each market has four buyers and four sellers, numbered Buyers 1 to 4 and Sellers 1 to 4. The commodity traded in this market is called the token. Buyers value these tokens, and their maximum willingness to pay for each token is specified in the token-value table. The willingness to pay is non-increasing with the number of tokens already owned. For example, in Market 1 (Fig. 1(a)), Buyer 1's maximum willingness to pay is 23 for the first token, 22 for the second, 12 for the third, and 7 for the fourth. A similar structure holds for the other buyers. On the other hand, sellers would like to provide these tokens, and the minimum acceptable price (the reservation price of the seller) for each token is also specified in the token-value table. As the counterpart of the maximum willingness to pay, the minimum acceptable price is non-decreasing with the number of tokens already sold. Take Seller 1 in Market 1 (Fig. 1(a)) as an example: the minimum acceptable price is 2 for the first token, 7 for the second, 17 for the third, and 18 for the fourth.

Fig. 1 Market 1 (a) and Market 2 (b). The bottom panel is the token-value table, which specifies the reservation price of buyers and sellers for each additional token. The middle panel is a list of all reservation prices, which are arranged, from left to right, in descending order for buyers' reservation prices and ascending order for sellers' reservation prices. The corresponding demand and supply schedule is then given in the top panel

This structure of the token-value table is generated in light of the familiar behavior of marginal utility and marginal cost, and hence it fits well with the law of demand and supply. If we pool all the maximum willingness-to-pay and minimum acceptable prices together and arrange them in descending order and ascending order separately (as shown in the middle panel of each figure), then we can draw a downward-sloping demand schedule and an upward-sloping supply schedule, as shown in the top panel of each figure.
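As a concrete illustration of this construction, the short Python sketch below (our own illustration, not part of the original paper) pools the token values and sorts them; only the Market 1 values quoted in the text for Buyer 1 and Seller 1 are filled in, with the other traders omitted.

```python
# Illustrative sketch: building the middle/top panels of Figs. 1-2 from a
# token-value table. Only the Market 1 values quoted in the text are used.
buyer_values = {"Buyer 1": [23, 22, 12, 7]}     # maximum willingness to pay
seller_values = {"Seller 1": [2, 7, 17, 18]}    # minimum acceptable price

# Pool all reservation prices and sort: descending for demand, ascending for supply.
demand = sorted((v for vals in buyer_values.values() for v in vals), reverse=True)
supply = sorted(v for vals in seller_values.values() for v in vals)

print("demand schedule:", demand)   # step-down: [23, 22, 12, 7]
print("supply schedule:", supply)   # step-up:   [2, 7, 17, 18]
```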

Since different demand and supply schedules may have different effects on bargaining behavior, we make our demand-supply schedules different so as to explore this sensitivity. In the first two cases (Fig. 1), demand and supply cross, so the equilibrium price is unique. Their difference lies in the multiplicity of the equilibrium quantity: in the first case it is also unique, whereas the second has multiple equilibria. The last two cases (Fig. 2) differ from the first two in that demand and supply never cross, so the equilibrium price is not unique, although an equilibrium quantity exists. Between the two, the difference lies in the distance between the demand and supply curves. For Market 3, the demand and supply schedules are parallel to each other from the left end to the right end, and the distance between the two changes little, whereas for Market 4, the demand curve steps down piecewise, the supply curve steps up piecewise, and the distance between the two therefore gets smaller. It is not quite clear how these different topologies may impact the bargaining strategies. At the outset, we simply wonder, and let our simulation results enlighten us.

One autonomous agent will be placed in this environment, and it will play the role of Buyer 1 (see Sect. 2.4 for more)1). The trading behavior of this autonomous agent is the focus of this study. In particular, we ask how genetic programming enables these agents to discover profitable trading strategies and what they discover, a question already raised in Ref. [3]2). To better understand the nature of this question, consider the case of one autonomous agent. The four reservation prices (the maximum willingness to pay) of Buyer 1 are highlighted in each figure. From this mark, we can immediately see the original competitive advantage of Buyer 1 relative to the other competitors (buyers and sellers). The question is what the optimal bargaining strategy is, given this competitive position, be it superior or inferior, and how genetic programming can help the autonomous agent to discover this strategy.

Fig. 2 Market 3 (a) and Market 4 (b). The bottom panel is the token-value table, which specifies the reservation price of buyers and sellers for each additional token. The middle panel is a list of all reservation prices, which are arranged, from left to right, in descending order for buyers' reservation prices and ascending order for sellers' reservation prices. The corresponding demand and supply schedule is then given in the top panel

2.2 Bargaining strategies

By a bargaining strategy, we mean an algorithm that indicates the kind of information required for making a decision (the inputs) and how the decision is made based upon the information received (the outputs), or, alternatively, a process that connects inputs to outputs. Of course, that in turn depends on the inputs available to the artificial agents. In this paper, we follow the experimental design of the Santa Fe Double Auction Tournament [4,5] and make the information summarized in Table 1 available to traders. Basically, three types of information are available to our autonomous agents: price or quote information (indexes 1–9 and 16–17), time information (indexes 10, 11), and private information (indexes 12–14).

This information is rather concise compared to what human subjects normally have in laboratory experiments, where they may get access to all of the information on the previous trading "days" and "rounds". What we have here is only partial information on the previous day and the previous round. However, this condensed information may be sufficient given what we learned from Refs. [4,5]. In fact, some later developed artificial double auction systems are also built upon this "minimal" information set [7].

1) This fixed design is simple and makes our later analysis easier
2) In an experimental setting, the same kind of question has been asked by Refs. [4,5]. The difference is that they asked how human agents discover or design the trading strategies, but here we are addressing how software agents do that. For other subtle differences,

Table 1 The information set (terminal set) available to the traders

index  terminal  interpretation
1      PMax      the highest transaction price on the previous day
2      PMin      the lowest transaction price on the previous day
3      PAvg      the average transaction price on the previous day
4      PMaxBid   the highest bidding price on the previous day
5      PMinBid   the lowest bidding price on the previous day
6      PAvgBid   the average bidding price on the previous day
7      PMaxAsk   the highest asking price on the previous day
8      PMinAsk   the lowest asking price on the previous day
9      PAvgAsk   the average asking price on the previous day
10     Time1     the number of auction rounds left for today
11     Time2     the number of auction rounds that have no transaction
12     HT        the highest token value
13     NT        the second highest token value
14     LT        the lowest token value
15     Pass      pass the current auction round
16     CASK      the lowest asking price in the previous auction round
17     CBID      the highest bidding price in the previous auction round
18     Constant  randomly generated constant number

This information set defines part of the algorithms that may be discovered by the autonomous agents, but it is just a collection of primitive inputs. To process this raw data, some further operations are expected. Normally, this can be done by allowing agents to obtain access to some logical and mathematical operators, and Table 2 provides a list of these options.

Table 2 Logic and mathematical operators (function set)
+      –      *      %      min
>      exp    abs    log    max
sin    cos    if-then-else    if-bigger-then-else

Given the information (Table 1) and the way to operate on it (Table 2), various bargaining strategies can be formed. Two examples are given as follows1):

1) (Min PMinBid HT)

2) (If Bigger Then Else HT CASK CASK+1 Pass)

In the first example, to decide how much to bid, the buyer simply looks at the minimum bid on the previous day (PMinBid) and his current reservation price (HT), and bids at the minimum of the two. In the second example, the buyer first checks whether his reservation price (HT) is bigger than the lowest ask (CASK) in the previous round. If this condition is met, he will bid by adding one dollar to the current ask; otherwise, he will simply pass. Not all bargaining strategies are this simple. A little knowledge of combinatorics or context-free grammars will lead us to see that a bargaining algorithm can potentially become as complex as the following one:

(Min (If Bigger Then Else PMinBid PAvgBid CASK PAvgBid)
     (If Bigger Then Else HT PAvgBid PAvgBid CASK))
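To make the representation concrete, here is a minimal Python sketch of how such strategy expressions could be encoded and interpreted. It is our own illustration rather than the simulation code used in the paper; the tuple encoding, the operator spelling IfBiggerThenElse, and the example numbers are assumptions.

```python
# A minimal sketch (not the authors' implementation) of evaluating strategy
# trees against the terminals of Table 1.
PASS = "Pass"   # sentinel meaning "skip the current auction round"

def evaluate(expr, info):
    """Recursively evaluate a strategy tree; `info` maps terminal names to values."""
    if isinstance(expr, (int, float)):        # numeric constant (terminal 18)
        return expr
    if isinstance(expr, str):                 # named terminal, e.g. HT, CASK
        return PASS if expr == "Pass" else info[expr]
    op, *args = expr
    vals = [evaluate(a, info) for a in args]
    if op == "Min":
        return min(vals)
    if op == "+":
        return vals[0] + vals[1]
    if op == "IfBiggerThenElse":              # (arg1 arg2 then else)
        return vals[2] if vals[0] > vals[1] else vals[3]
    raise ValueError(f"unknown operator: {op}")

# The second simple example above: bid CASK + 1 whenever HT > CASK, else pass.
strategy = ("IfBiggerThenElse", "HT", "CASK", ("+", "CASK", 1), "Pass")
print(evaluate(strategy, {"HT": 23, "CASK": 18}))   # -> 19
```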

2.3 Institutional arrangements

One may question whether the syntax developed above can also be semantically meaningful; that depends on the kind of institutional arrangements. In a double auction (DA) market, both buyers and sellers can submit bids and asks. This contrasts with only buyers shouting bids (as in an English auction) or only sellers shouting asks (as in a Dutch auction). There are several variations of DA markets. One example is the clearinghouse DA of the Santa Fe token exchange (SFTE) [4], on which this work is based.

On the SFTE platform, time is discretized into alternating bid/ask (BA) and buy/sell (BS) steps. Initially, the DA market opens with a BA step in which all traders are allowed to simultaneously post bids and asks for one token only. After the clearinghouse informs the traders of each other's bids and asks, the holders of the highest bid and the lowest ask are matched and enter into a BS step. During the BS step, the two matched traders carry out the transaction using the mid-point between the highest bid and the lowest ask as the transaction price. Once the transaction is cleared, the market enters into a BA step for the next auction round. The DA market operation is thus a series of alternating BA and BS steps.
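The clearinghouse rule just described can be summarized in a few lines of code. The sketch below is a simplified illustration under our own assumptions (one unit per trader, ties ignored), not the SFTE implementation; the quote values in the example are hypothetical.

```python
# One bid/ask (BA) step followed by a buy/sell (BS) step under the rules above:
# match the highest bid with the lowest ask and trade at their mid-point.
def run_round(bids, asks):
    """bids/asks: dicts mapping trader id -> posted price for a single token."""
    buyer, best_bid = max(bids.items(), key=lambda kv: kv[1])
    seller, best_ask = min(asks.items(), key=lambda kv: kv[1])
    if best_bid < best_ask:          # quotes do not cross: no transaction
        return None
    return buyer, seller, (best_bid + best_ask) / 2

# Example BA step with illustrative quotes.
bids = {"B1": 14, "B2": 16, "B3": 11, "B4": 9}
asks = {"S1": 2, "S2": 3, "S3": 5, "S4": 8}
print(run_round(bids, asks))         # -> ('B2', 'S1', 9.0)
```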

The syntax of the bargaining strategy introduced above works well with the SFTE platform because, on this platform, a decision involves only a bid or an ask for a single unit of token. There is no involvement of market time, which certainly exists in the continuous-time double auction, and there is no involvement of units when each transaction allows for at most one unit.

1) The generation of these examples is based on the grammar of a formal language, in particular, a context-free grammar. The well-known Backus-Naur form (BNF) is extensively used in the literature and is also applied here

2.4 Opponents’ behavior

In this paper, to make our later analysis simple, we assume that all opponents are truth tellers except one autonomous agent (the first buyer) (see Fig. 3). Being a truth teller, a trader simply bids or asks at his current reservation price (HT for buyers and LT for sellers). This simplification makes it easier for us to make sense of the behavior of the autonomous agents and evaluate their novelty-discovery capability. We start with the simplest case, only one autonomous agent (the left panel of Fig. 3), to gain some basic understanding of the bargaining strategies discovered1).

Fig. 3 Composition of market participants

The inquiry above is addressed based on multiple runs of the designs summarized in Table 3. The designs differ according to the GP population size, which is set to either 10 or 50, and each is implemented with the four demand-supply schedules. There are 90 runs conducted for each market under each design; hence, a total of 720 (2 × 4 × 90) runs are completed. For the purpose of running GP, each run lasts for 300 generations (iterations).

1) Co-evolution will complicate the situation when one more autonomous agent is added to the market. While one may be interested in knowing whether the market co-evolves toward Nash equilibria when there are two autonomous agents, given the infinite number of possible bargaining strategies, it would be very difficult to identify Nash equilibria in a strategy space

2.5 Genetic programming

Autonomous agents in this paper are programmed by genetic programming. Genetic programming is a population-based stochastic search algorithm. The population is composed of a number of chromosomes. In different applications, these chromosomes represent different things; in our case, each chromosome simply represents a bargaining strategy, as described in Sect. 2.2. The grow method is used to generate an initial population of bargaining strategies. Each initial strategy can have a hierarchy with a depth of up to 5. The population size is set to 10 and 50. The population is time-variant; it evolves for a given duration (a given number of generations), which is 300 in our setting. To have effectively equal sampling when evaluating two populations with different sizes, each generation lasts 2 × pop size (population size) days.

The evolving population is driven by a sequence of genetic operators, which introduce a few more parameters. The selection mechanism is tournament selection with a tournament size of 5. To avoid disruption due to genetic variations, the elitism operator is triggered to automatically keep the best strategy discovered in the previous generation in the next one. Finally, new bargaining strategies are discovered through crossover (recombination) and mutation. The crossover rate is 100%. There are two styles of mutation, point mutation and subtree mutation; the mutation rate for each is 0.45 and 0.05, respectively. Table 4 gives the GP parameter values used to perform the simulation runs.
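For readers who want to reproduce the setup, the following Python sketch wires the parameter values listed in Table 4 into a generic GP loop. It is only an outline under our own assumptions; the helpers random_strategy, crossover, point_mut, subtree_mut, and daily_profit are hypothetical placeholders for the problem-specific operators described in Sects. 2.2–2.4.

```python
import random

# Parameter values taken from Table 4; the helper functions passed in are
# hypothetical placeholders, so this is an outline rather than the authors' code.
POP_SIZE, GENERATIONS, TOURNAMENT_SIZE = 10, 300, 5

def tournament_select(pop, fitness):
    contestants = random.sample(range(len(pop)), TOURNAMENT_SIZE)
    return pop[max(contestants, key=lambda i: fitness[i])]

def evolve(random_strategy, crossover, point_mut, subtree_mut, daily_profit):
    # grow method, initial maximum tree depth 5
    pop = [random_strategy(max_depth=5) for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # each generation lasts 2 * pop_size trading days
        fitness = [sum(daily_profit(s) for _ in range(2 * POP_SIZE)) for s in pop]
        elite = pop[max(range(POP_SIZE), key=lambda i: fitness[i])]
        next_pop = [elite]                                    # elitism size 1
        while len(next_pop) < POP_SIZE:
            child = crossover(tournament_select(pop, fitness),
                              tournament_select(pop, fitness))  # crossover rate 1.0
            if random.random() < 0.045:                       # point mutation rate
                child = point_mut(child)
            if random.random() < 0.005:                       # subtree mutation rate
                child = subtree_mut(child)
            next_pop.append(child)
        pop = next_pop
    return pop
```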

3 Novelty-discovering agents

To "appreciate" what our GP buyers have found, we need to be reminded that these GP buyers are not much different from ants in the sense that they are almost "blind", knowing nothing about the market structure on either the demand side or the supply side. The only things known to them are those given in Table 1. Hence, they are placed in a much more disadvantageous situation than the human agents in double auction market experiments. Nevertheless, their opponents (the programmed agents) follow, from the beginning to the end and round after round, the same trading rules, and have thereby created a very friendly environment for these GP agents, as in the movie Groundhog Day1).

Table 3 Experimental designs
# of autonomous agents (code)    markets (code)                     cognitive capacity (code)
One, Buyer 1 (B1)                1 (M1), 2 (M2), 3 (M3), 4 (M4)     10 (P10)
One, Buyer 1 (B1)                1 (M1), 2 (M2), 3 (M3), 4 (M4)     50 (P50)
Notes: Inside the brackets is the code of the corresponding design. Therefore, for example, Case B1M1-P10 refers to the design involving one GP buyer in Market 1, with the population size of the GP buyer set to 10. Case B1M4-P50 refers to the one involving one GP buyer in Market 4, with the population size set to 50


Table 4 GP parameters
parameter                value          parameter                     value
initialization method    grow           initial maximum tree depth    5
population size          10, 50         No. of generations            300
number of days           2 × pop size   tournament selection size     5
elitism size             1              crossover rate                1.0
point mutation rate      0.045          subtree mutation rate         0.005

Novelty-discovering agents will not always be rewarded for their constant attempts to discover hidden patterns, but when these "hidden" patterns are just there, it is only a matter of time before these agents discover them. With this remark, we hope that the following description of what we have learned from the GP agents will not be considered trivial due to human hindsight bias.

3.1 Data processing and analytical procedure

As mentioned earlier, the analysis below is not based on any single run, but on 90 runs for each market with each design. This leaves a huge amount of data in front of us. For example, for each single run, depending on the population size, one can observe 6000 (300 × 2 × 10) or 30000 (300 × 2 × 50) strategies in the auctions2). Multiplying these by 90, one has a total of 540000 or 2700000 strategies. This huge amount of data inevitably drives us to take some steps to make it tractable. The process is as follows. We drop the first 290 generations and focus only on the last 10 generations. The usual justification for this convenience is something related to ergodicity, which we found quite applicable to our environment once the simulation has run for such a long time. Restricting our analysis to the last 10 generations then reduces the set of observable strategies to a size of "only" 18000 (pop size = 10) or 90000 (pop size = 50), respectively. This reduced sample is still large enough to provide a valid answer to the questions: What did the autonomous agent learn, and what is the effect of a larger population size?

The second step that we take to deal with this large amount of data is to focus our attention on the most frequently used strategies. To do so, we first generate a profit distribution over the reduced dataset. For example, Table 5 gives the distribution of daily profit generated by the set of 18000 strategies in the case of B1M1-P10. As shown in the table, the strategies that generated the profits 21, 10.5, and 14.5 were used in a total of 93% of the auctions. It is therefore reasonable to assume that they represent the GP buyer's trading strategies. Table 6 presents these three strategies and their associated information.

The most highly used strategy is (PMinBid): the lowest bidding price on the previous day. The other highly used strategy is (HT): the highest token value, which is just the truth-telling strategy. In this way, we can have a general idea of what the autonomous agent discovered and why; we as experimenters can then also learn [3]3). Finally, the procedure suggested above is carried out on both the P-10 and P-50 cases, so that a comparison between the results of the two will enable us to analyze how the autonomous agents behave differently and thereby see the effect of the population size.

Table 5 Distribution of daily profit generated by 18000 strategies (pop size 10)

profit –0.5 0 8 10.5 12.5 14 14.5 17 18 18.5 21 total

count 18 429 346 5666 72 8 3829 16 4 335 7277 18000

Table 6 Three most frequently used strategies and the associated information (pop size 10)
strategy    profit    count    ratio (count/18000)
PMinBid     21        7277     0.4043
PMinBid     10.5      5666     0.3148
HT          14.5      3829     0.2127


total                 16772    0.9318
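The counts and ratios in Tables 5 and 6 can be produced with a few lines of standard tooling. The sketch below is our own illustration with hypothetical field names, not the original analysis script.

```python
from collections import Counter

def summarize(records):
    """records: (strategy, daily_profit) pairs logged over the last 10
    generations of all 90 runs, e.g. 10 * (2 * 10) * 90 = 18000 for B1M1-P10."""
    profit_counts = Counter(profit for _, profit in records)        # cf. Table 5
    top3 = Counter(records).most_common(3)                          # cf. Table 6
    total = len(records)
    table6 = [(strategy, profit, n, round(n / total, 4))
              for (strategy, profit), n in top3]
    return profit_counts, table6
```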

1) See Ref. [8] for the use of this metaphor

2) As mentioned in Sect. 2.5, each generation is composed of 2 × pop size (population size) trading days
3) This is a response to the criticism that the agents learned but we do not, i.e., the title of the paper


3.2 General results

Since there are a total of eight scenarios to be discussed, it is easier to give a general picture first and then to get into some specific results later. Table 7 provides such a summary of the effect of a larger population size. The column "benchmarks (P-10)" gives the major bargaining strategies discovered by the autonomous agents when the population size is set to 10 (pop size = 10). These major strategies are identified based on the procedure given in Sect. 3.1. For example, as we have seen earlier, the key strategy discovered by the autonomous agent in Case B1M1 is (PMinBid). The next column, "alternative (P-50)", then shows the essential differences after the population size is increased to 50 (pop size = 50).

As we summarize in this table, there are a number of things that we are looking at, namely, profitability, stability (or robustness), and complexity. These are our focuses because the fundamental question to be addressed is as follows: Does a larger population size lead to the discovery of better bargaining strategies, better in the sense of higher and more stable (robust) profits? If the answer is yes, we further ask what may cause this difference. The answer to the second question hinges upon complexity.

A larger population size makes it easier for autonomous agents to find profitable bargaining strategies that are complex and normally beyond the reach of agents with a smaller population size. Complex profitable bargaining strategies are observed in both Markets 1 and 4. For example, in Market 1, when the population size increases to 50, a new, better, and more complex strategy, (Min (HT PMinBid)), is discovered. In fact, this is not the only improvement discovered: a class of new strategies, called P-22, is also discovered1). Nevertheless, some of the better but more complex strategies discovered in the course of evolution did not become stabilized and remain to the end. Market 4 has several such examples. Hence, the best surviving strategy in both cases P10 and P50 is the same, i.e., (CASK) (see Table 7).

Stability (robustness) can be an issue because bargaining strategies can be context-dependent, or, more specifically, history-dependent. As we can see in Table 1, autonomous agents rely very much on historical data to develop their strategies. Hence, even though some strategies are good in one or a few runs, the use of these strategies may cause history to change and result in a new environment, which in turn leads to their deterioration. However, the stability issue is not the privilege of complex strategies only, for simple strategies can evoke the same problem. In fact, in Markets 1 and 3, we observe that a larger population size can enhance stability either by discovering more complex strategies (Market 1) or by intensifying the use of robust, not necessarily complex, strategies (Market 3). This is reflected in a higher frequency of using (CASK), which is more robust, and a lower frequency of using (PMaxAsk) and (PMin), which are more history-dependent. Among the four markets, only in Market 2 does the expansion of the population size not have much effect on any of the features mentioned above. In both cases, P-10 and P-50, the best strategy discovered is (NT).

3.3 Analysis of the best strategy found: what do we learn?

3.3.1 Market 1

One of the most competitive strategies found by the autonomous agent in Market 1 is (PMinBid). This strategy is very aggressive and attempts to maximize the possible profits from trading. Buyer 1 first realized that, in equilibrium, the market can have a trading volume of 9. His advantageous positions, determined by the reservation prices, ranked him as the fifth (his first unit) and the seventh (his second unit) trade, which means that these two units are surely tradable. The question is what the best strategy to trade them would be. From "experience", Buyer 1 also realized that it would not be wise to compete with those buyers with similar advantageous positions. Therefore, he decided to wait and let others with lower ranks trade first. He then gave concessions to Buyers 2 and 3 (Fig. 4(a)). While losing the opportunities of early trades with good offers, this strategy of procrastination allowed him to stand in a monopsony position after the first few trades, and hence enabled him to exploit the residual sellers much more when all those high bids were gone. This is what the strategy (PMinBid) did for him.

Table 7 Summary of results
case    benchmarks (P-10)               alternative (P-50)
B1M1    (PMinBid)                       (Min (HT PMinBid)), (P-22)
                                        profitability ↑, stability ↑, complexity ↑
B1M2    (NT)                            (NT)
                                        no effect
B1M3    (CASK), (PMaxAsk), (PMin)       (CASK) ↑, (PMaxAsk) ↓, (PMin) ↓
                                        stability
B1M4    (CASK)                          (CASK)
                                        complexity

Fig. 4 Trading processes of the GP buyer in Market 1. The two trading processes correspond to two different trading strategies: (PMinBid) (a) and (Min (HT PMinBid)) (b)

While (PMinBid) enabled a procrastination strategy for Buyer 1 to wield a "monopsony power", it also constantly led Buyer 1 to bid a price of 14, which was the minimum bid on the previous trading day. This bidding not only made Buyer 1 successfully trade the first two tokens, but also made him trade the third token, which nonetheless has a reservation price lower than 14 (Fig. 4(a)). Therefore, Buyer 1 suffered a loss from the trade of this last unit. This is equivalent to trading the first three tokens as a package, something like "42 for three"; Buyer 1 then used the profits gained from the first two tokens to compensate for the loss on the third. Now, the question is whether we can separate the first two from the third and make an even bigger profit. The answer is the strategy (Min (HT PMinBid)), which has a description length of 3 and hence is more complex than (PMinBid). This strategy guides the GP buyer to bid the minimum of PMinBid and HT. Therefore, after trading his first two tokens, the GP buyer will bid HT because it is the minimum of the two. Of course, in Fig. 1, this bid will not get matched, but neither will it incur an economic loss (Fig. 4(b)). With this improvement, the profit of the GP buyer increases by one accordingly.
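The difference between the two strategies can be seen with a tiny numerical check, using only the numbers quoted in the text (Buyer 1's reservation prices 23, 22, 12, 7 and a previous-day minimum bid of 14); this is our own illustration, not part of the original analysis.

```python
reservation_prices = [23, 22, 12, 7]      # Buyer 1's HT for successive tokens
PMinBid = 14                              # minimum bid on the previous day

for HT in reservation_prices:
    bid_pminbid = PMinBid                 # strategy (PMinBid): always bid 14
    bid_min = min(HT, PMinBid)            # strategy (Min (HT PMinBid))
    print(HT, bid_pminbid, bid_min)
# For the third token (HT = 12), (PMinBid) bids above the token's own value and
# trades at a loss, whereas (Min (HT PMinBid)) bids only 12, which is not
# matched and therefore incurs no loss (cf. Fig. 4).
```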

3.3.2 Market 2

In Market 2, the best strategy discovered by the GP buyer is (NT). With the help of Fig. 1(b), this strategy can again be interpreted as a procrastination strategy. The GP buyer gave up the privilege of an early trade. In fact, in this case, his advantageous position was ranked number one, but he made the other participants trade first. When the three other tradable tokens had been traded, he held the last possible tradable token and used this monopsony power to fully exploit the producer's surplus from the residual seller (Seller 4). After this trade, because the demand curve overlaps the supply curve, trades are still possible, but making profits is infeasible. After the population size increases to 50, the GP buyer cannot discover any better strategy. As we can see from this market (Fig. 1), there exists no better strategy given that all the other participants are truth tellers.

3.3.3 Market 3

The best strategies found in Market 3 are (CASK), (PMaxAsk), and (PMin). Given what we have learned from the previous two cases, it is not surprising to see that all three strategies are also kinds of procrastination strategies, which can also be seen in Fig. 2(a). Since the demand curve is parallel to the supply curve, all tokens are in principle tradable. The only question is how the created surplus should be divided. Using the procrastination strategy, the GP buyer simply waited for the other participants to trade first, and when all 12 other tokens had been traded, he acquired the monopsony power and completely exploited all the remaining producer's surplus. As we can see from the trading processes exhibited in Fig. 5, the three strategies led to the same trading pattern, and all four tokens of the GP buyer were traded at a price of 42 (zero producers' surplus). Nevertheless, this does not mean that the three strategies are the same. The subtle difference between (CASK) and the other two, (PMaxAsk) and (PMin), is that the former is history-independent, whereas the latter are not. They will work only when, on the previous trading day, the same kinds of strategies were played; if not, the history may not be the same, and there is no guarantee that (PMaxAsk) or (PMin) will still be 42. This is why these two strategies are not as robust (stable) as (CASK).

Fig. 5 Trading processes of the GP buyer in Market 3. The three trading processes correspond to three different trading strategies: (CASK) (a), (PMaxAsk) (b), and (PMin) (c)

When the population size increases to 50, no better strategy is discovered. However, something interesting still happens: the GP buyer increases his reliance on (CASK) and hence enhances the stability of his monopsony profits.

3.3.4 Market 4

As opposed to the other three market scenarios, Market 4 is more intriguing. In one respect, it is very similar to Market 3, in which the demand and supply schedules never intersect. Hence, all 16 tokens in the market can in principle be sold. Under these circumstances, one may expect that the GP buyer will develop a strategy that is very similar to the one used in Market 3. This is indeed the case: the most frequently seen strategy used by the GP buyer in this market is still (CASK).

Obviously, Buyer 1 already learned that all tokens are tradable, and he wanted to be patient and wait for the last few rounds so that he could completely exploit all of the remaining producers' surplus. However, Market 4 is different from Market 3. In Market 3, procrastination does not cause good trading opportunities to be missed since the supply curve is essentially flat; in Market 4, however, the supply curve steps up piecewise. The later the GP buyer gets into the market, the less likely he is to receive favorable offers. In other words, the cost of delayed trading becomes more significant in Market 4, and the pure procrastination strategy simply neglects this cost. Therefore, one may wonder whether the GP buyer can learn something even more intelligent, i.e., a strategy that can balance the gains from full monopsony against the loss due to missing favorable offers. The answer is yes.

In addition to (CASK), our GP buyer also learned the following strategies:

1) (* NT (% PAvgBid PMaxBid)) (profit = 15997)
2) (Min (If Bigger Then Else PMinBid PAvgBid CASK PAvgBid) (If Bigger Then Else HT PAvgBid PAvgBid CASK)) (profit = 16512)
3) (Min (If Bigger Then Else PAvgBid HT CASK HT) PAvgBid) (profit = 16512)

None of these strategies advises the GP buyer to wait until the very end of trading; instead, they use a more aggressive strategy (higher bidding) to compete with the other opponents (other buyers) and to obtain the favorable offers from the suppliers.

4 Theory of optimal procrastination

The theory of optimal procrastination means that the agent attempts to delay his participation in market transactions so as to avoid early competition and become a monopsonist in the later stage. Once he gets there, he will fully exercise the monopsony power by bidding with third-degree price discrimination. However, procrastination may also cause the agent to miss some good offers; there is therefore an opportunity cost of procrastination, and the agent will try to optimize the procrastination time by balancing his monopsony profits against these costs. Economic theory requires economists to have a good understanding of the structure of the problem and then to find a good solution to it. Sometimes, both of these tasks are demanding, and in that case, we can simply dispatch autonomous agents to the "complex world", see what they find, and get inspiration from there. The theory of optimal procrastination presented here is a perfect illustration of what we mean by an autonomous-agent-inspired economic theory.

More formally, this theory can be stated as follows.

Without loss of generality, let us use the simulated buyer (the autonomous agent) in this paper as an example. It is assumed that, at time t, the buyer can be the current holder, i.e., he can offer the highest bid, which is also greater than the current ask. Then the deal is made, and the transaction price P_t, by the Aurora rule, is the average of the two:

P_t = (bid_t + ask_t) / 2,   if bid_t ≥ ask_t.   (1)

Now, in order to gain better terms and conditions, the buyer chooses to trade at a delayed time, say, t + Δt. With this delay, we assume that he can reduce his bid by the amount Δbid_{t+Δt} (Δbid_{t+Δt} > 0); nevertheless, with this delay, the more favorable offers are gone, and the alternative ask may require an additional amount, say, Δask_{t+Δt} (Δask_{t+Δt} > 0). Hence, the consequence of this delay is to pay the price P_{t+Δt}:

P_{t+Δt} = [(bid_t − Δbid_{t+Δt}) + (ask_t + Δask_{t+Δt})] / 2,   if bid_t − Δbid_{t+Δt} ≥ ask_t + Δask_{t+Δt}.   (2)

The gain, G, from this delay is therefore the difference between the prices paid at the two times, which depends on the difference between Δbid_{t+Δt} and Δask_{t+Δt},

G_{t,Δt} = |Δbid_{t+Δt} − Δask_{t+Δt}|,   (3)

which is further determined by the shape of the demand schedule and the supply schedule. When the demand and supply schedules are flat and parallel to each other, as in Market 3, Δask_{t+Δt} is zero, so it pays the buyer to wait until he is the sole buyer in the market. However, when the demand and supply schedules are step functions, things can get complicated, as in Markets 1 and 4, but our GP buyer can still figure out a good time to get into the market.
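A small numerical illustration of Eqs. (1)–(3), with hypothetical quote values, is given below; it is our own example rather than part of the original analysis.

```python
def midpoint_price(bid, ask):
    """Aurora rule: trade at the mid-point whenever the bid crosses the ask."""
    return (bid + ask) / 2 if bid >= ask else None

bid_t, ask_t = 20.0, 10.0     # quotes available at time t
d_bid, d_ask = 6.0, 2.0       # concession gained / favorable offer lost by waiting

p_now = midpoint_price(bid_t, ask_t)                      # 15.0
p_later = midpoint_price(bid_t - d_bid, ask_t + d_ask)    # 13.0
print(p_now, p_later, p_now - p_later)
# Waiting pays off here because the bid reduction (6) exceeds the extra ask (2);
# when the schedules are parallel, as in Market 3, d_ask is zero and delay is costless.
```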

5 Conclusions

In this paper, through a large computer simulation, we simulate the evolution of the bargaining strategies of autonomous agents (buyers) in a competitive environment. These autonomous agents, by design, are purported to search for better deals from which to gain. At the very foundation of economics, we do need such agents to discover, exploit, and eventually destroy any hidden patterns and opportunities. The purpose of our simulation is then to have a clear picture of what the autonomous agents discover and what these discoveries mean, and from that to see whether we can also learn and construct a theory in this light. The theory of optimal procrastination is what we glean from the behavior of our autonomous agents. At the end of the paper, we would like to point out several directions for further study.

First, the choice of a highly static environment is not necessary, but it does make it easier for us to comprehend and make sense of the behavior of these chance-discovering agents. This learning can then help us to have a better idea of what these agents are doing, or attempting to do, when they are placed in a much more complex situation, such as the one defined in Ref. [6], where autonomous agents are placed in surroundings filled with SFI-style programmed agents. It may not surprise us to see that these autonomous agents eventually beat all these programmed agents, but it is difficult to perform an in-depth analysis of the discovered bargaining strategies given such complex surroundings. Perhaps a challenging task for the future would be to introduce novel data mining or text mining techniques to this large database so as to know more about the "mental process" of these autonomous agents.

Second, this paper and Ref. [6] only consider a single autonomous agent. It would thus be interesting and challenging to see whether we can develop a theory of multi-agent competition. The immediate next step is to expand the current single-agent version into a two-agent version, so that in the latter we can have a co-evolutionary game-theoretic situation, and the monopsony result observed in this paper can become that of a duopsony, as would be expected.

Third, it is always interesting to know whether human agents can also learn the intelligent trading strategies discovered by the autonomous agents, for example, the optimal procrastination strategy. We are now designing market experiments to see whether the trading patterns realized by our autonomous agents can also be replicated by human agents.

Acknowledgements  An early version of this paper was presented at the Sino-foreign-interchange Workshop on Intelligence Science and Intelligent Data Engineering (IScIDE'2010), Harbin, China, June 3-5. The authors are grateful to the participants and two anonymous referees for the comments received. The NSC grants NSC 98-2911-I-004-007 and NSC 98-2410-H-004-045-MY3 are also gratefully acknowledged.

References

1. Tesfatsion L, Judd K. Handbook of Computational Economics, Volume 2: Agent-Based Computational Economics. Amsterdam: North Holland, 2006
2. Chen S H. Computational intelligence in agent-based computational economics. In: Fulcher J, Jain L, eds. Computational Intelligence: A Compendium. Springer, 2008, 115: 517–594
3. Chen S H. Genetic programming and agent-based computational economics: from autonomous agents to product innovation. In: Terano T, Kita H, Takahashi S, Deguchi H, eds. Agent-Based Approaches in Economic and Social Complex Systems, 2009, 6: 3–14
4. Rust J, Miller J, Palmer R. Behavior of trading automata in a computerized double auction market. In: Friedman D, Rust J, eds. Double Auction Markets: Theory, Institutions, and Laboratory Evidence, 1993: 155–198
5. Rust J, Miller J, Palmer R. Characterizing effective trading strategies: insights from a computerized double auction tournament. Journal of Economic Dynamics and Control, 1994, 18(1): 61–96
6. Chen S H, Tai C C. The agent-based double auction markets: 15 years on. In: Takadama K, Cioffi-Revilla C, Deffuant G, eds. Simulating Interacting Agents and Social Phenomena: The Second World Congress, 2010, 7: 119–136
7. Andrews M, Prager R. Genetic programming for the acquisition of double auction market strategies. In: Kinnear K Jr, ed. Advances in Genetic Programming. Cambridge: MIT Press, 1994: 355–368
8. Thaler R. From Homo economicus to Homo sapiens. Journal of Economic Perspectives, 2000, 14(1): 133–141

Prof. Dr. Shu-Heng CHEN is a Distinguished Professor in the Department of Economics, Dean of the Office of International Cooperation, Director of the AI-ECON Research Center, and the organizer of the Experimental Economics Laboratory at Chengchi University. He also serves as Vice Chair of the IEEE Computational Finance & Economics Technical Committee, the editor-in-chief of the Journal of New Mathematics and Natural Computation (World Scientific), and an associate editor of the Journal of Economic Behaviour and Organization and the Journal of Economic Interaction and Coordination. Prof. CHEN holds an MA degree in mathematics and a Ph.D. in Economics from the University of California at Los Angeles. His current research interests include agent-based computational economics, computational intelligence, experimental economics, and computational and cognitive social sciences.

Tina YU is an associate professor in the Department of Computer Science at Memorial University of Newfoundland. Tina conducts research in machine learning and computational intelligence and applies them to a variety of areas such as energy, medicine, and economics. She serves on the editorial boards of the Springer journal Genetic Programming and Evolvable Machines and the MIT Press journal Evolutionary Computation.
