Chapter 6 Conclusions and Proposals for Future Work
6.3 Other Prospects
We believe adaptive playout is a rising star of computer Go. To solve the semeai problems completely, heavy knowledge implementation scarcely can be the right way, because there are really too many cases for a perfect algorithm to exist. Letting the
81
playout learn from itself, under the framework of the softmax policy, is of great interest to investigate. Also, feeding the information from the tree to the playout in order to solve the semeai problems, as one author (Yamato, the author of ZEN) suggested recently in the computer Go mailing list, also deserves considerable attention. We hope this research can be inspiring and encouraging to other computer Go researchers.
82
References
Abramson, B. (1990). Expected-Outcome: A General Model of Static Evaluation.
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 2, pp.
182-193.
Allis, V. (1994). Searching for Solutions in Games and Artificial Intelligence. Ph.D.
Thesis, University of Limburg, Masstricht, The NetherLands.
Anderson, D.A. (2009). Monte Carlo search in Games. Technical report, Worcester Polytechnic Institute.
Audouard, P., Chaslot, G., Hoock, J.-B., Rimmel, A., Perez, J., and Teytaud, O. (2009).
Grid coevolution for adaptive simulations; application to the building of opening books in the game of Go. In EvoGames, Tuebingen Allemagne. Springer.
Auer, P., Cesa-Bianchi, N., Freund, Y. and Schapire, R. E. (1995). Gambling in a rigged casino: the adversarial multi-armed bandit problem. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science, pp. 322-331, IEEE Computer Society Press, Los Alamitos, CA.
Auer, P., Cesa-Bianchi, N. and Fischer, P. (2002). Finite-time Analysis of the Multiarmed Bandit Problem. Machine Learning Journal, 47, pp. 235-256.
Baier, H. and Drake, P. (2010). The Power of Forgetting: Improving the Last-Good-Reply Policy in Monte Carlo Go. IEEE Transactions on Computational Intelligence and AI in Games, Vol. 2, pp. 303-309.
Baudis, P. (2011). Exploration formulas for UCT. Computer-Go mailing list, http://www.mail-archive.com/[email protected]/msg02154.html.
Retrieved at 2011-07-17 T19:45:45+08:00.
Baudis, P. and Gailly, J.-l. (2011). Pachi: Software for the Board Game of Go / Weiqi /
83
Baduk.
http://pachi.or.cz/.
Retrieved at 2011-07-14 T14:38:10+08:00.
Baum, E. B. and Smith, W. D. (1997). A Bayesian Approach to Relevance in Game Playing. Artificial Intelligence, Vol. 97, No.1–2, pp. 195-242.
Bourki, A., Chaslot, G., Coulm, M., Danjean, V., Doghmen, H., Hérault, T., Hoock, J.-B., Rimmel, A., Teytaud, F., Teytaud, O., Vayssière, P., and Yu, Z. (2010).
Scalability and Parallelization of Monte-Carlo Tree Search. Proceedings of the 5th International Conference on Computer and Games 2010.
Bouzy, B. (2003). Associating domain-dependent knowledge and Monte Carlo approaches within a Go program. Information Sciences, Vol. 175, pp. 247-257.
Bouzy, B. (2005a). Move Pruning Techniques for Monte-Carlo Go. In Joint Conference on Information Sciences. Proceedings of the 11th Advances in Computer Games conference (ACG-11), pp. 104-119.
Bouzy, B. (2005b). History and territory heuristics for Monte-Carlo Go. In Heuristic Search and Computer Game Playing Session, Joint Conference on Information Sciences, Salt Lake City.
Bouzy, B. and Cazenave, T. (2001). Computer Go: an AI-oriented Survey. Artificial Intelligence, Vol. 132, issues 1, pp. 39-103.
Bouzy, B. and Chaslot, G. (2005). Bayesian generation and integration of K-nearest-neighbor patterns for 19×19 Go. IEEE 2005 Symposium on Computational Intelligence in Games (eds. G. Kendall & Simon Lucas), pp. 176-181.
Bouzy, B. and Chaslot, G. (2006). Monte-Carlo Go Reinforcement Learning Experiments. 2006 IEEE Symposiumon Computational Intelligence and Games (eds.
G. Kendall and S. Louis), pp. 187-194, Reno, USA.
Bouzy, B. and Helmstetter, B. (2003). Monte-Carlo Go Developments. Proceedings of the 10th Advances in Computer Games conference (ACG-10), Graz 2003, Book edited by Kluwer, H. Jaap van den Herik, Hiroyuki Iida, Ernst A. Heinz, pp. 159-174.
84
Brügmann, B. (1993). Monte Carlo Go. Not formally published.
http://www.althofer.de/Bruegmann-MonteCarloGo.pdf.
Retrieved at 2011-07-18 T10:16:45+08:00.
Burmeister, J. and Wiles, J. (1995). The Challenge of Go as a Domain for AI Research:
a Comparison Between Go and Chess. Proceedings of the Third Australian and New Zealand Conference on Intelligent Information System, pp. 181-186.
Campbell, M., Hoane, A. J. and Hsu, F. H. (2002). Deep Blue. Artificial Intelligence, Vol. 134, issues 1-2, pp. 57-83.
Chaslot, G., Fiter, C., Hoock, J.-B., Rimmel, A., and Teytaud, O. (2009). Adding Expert Knowledge and Exploration in Monte-Carlo Tree Search. Proceedings of the Twelfth International Advances in Computer Games Conference, pp. 1-13, Pamplona, Spain.
Chaslot, G., J-B. Hoock, J.-B., Perez, J., Rimmel, A., Teytaud, O. and Winands, M.
(2009). Meta Monte-Carlo Tree Search for Automatic Opening Book Generation.
Proceedings of the IJCAI'09 Workshop on General Intelligence in Game Playing Agents, pp. 7-12.
Chaslot, G., Winands, M., Bouzy, B., Uiterwijk, J. W. H. M., and Herik, H. J. van den (2007). Progressive Strategies for Monte-Carlo Tree Search. Proceedings of the 10th Joint Conference on Information Sciences (ed.P. Wang), pp. 655–661, Salt Lake City, USA.
Chaslot, G., Winands, M. and Herik, H. J. van den (2008). Parallel Monte-Carlo Tree Search. Proceedings of the Conference on Computers and Games 2008, Vol. 5131 of Lecture Notes in Computer Science, pp. 60-71.
Chen, K. (1989). Group identification in Computer Go. Heuristic Programming in Artificial Intelligence, Levy & Beal ( Eds.), pp. 195-210.
Chen, K. and Chen, Z. (1999). Static analysis of life and death in the game of Go.
Information Science, Vol. 121, pp. 113-134.
Coquelin, P.-A. and Munos, R. (2007). Bandit Algorithm for Tree Search, Technical Report 6141, INRIA.
85
Coulom, R. (2006). Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search. Proceedings of the 5th International Conference on Computer and Games (eds. H. J. van den Herik, P. Ciancarini, and H. J. Donkers), Vol. 4630 of Lecture Notes in Computer Science, pp. 72-83, Springer, Turin, Italy.
Coulom, R. (2007). Computing Elo Ratings of Move Patterns in the Game of Go.
ICGA Journal, Vol. 30, No. 4, pp. 198-208.
Coulom, R. (2010). Bayesian Elo Rating.
http://remi.coulom.free.fr/Bayesian-Elo/#usage. Tree Search. IEEE Transactions on Computational Intelligence and AI in Games, Vol.
2, No. 4, pp. 259-270. Special issue on Monte Carlo Techniques and Computer Go.
Fotland, D. (2011). Exploration formulas for UCT.
http://www.mail-archive.com/[email protected]/msg02143.html.
Retrieved at 2011-07-17 T19:47:50+08:00.
Fotland, D. (2010). ERICA wins Go 19x19 Tournament. ICGA Journal, Vol. 33, pp.
174-178.
Fotland, D. (2002). Static Eye in "The Many Faces of Go". ICGA Journal, Vol. 25, No.
4, pp. 203-210.
86
Fotland, D. (1996). World Computer Go Championships.
http://www.smart-games.com/worldcompgo.html.
Retrieved at 2011-07-17 T19:44:35+08:00.
Fotland, D. (1993). Knowledge Representation in The Many Faces of Go.
http://www.smart-games.com/knowpap.txt.
Retrieved at 2011-07-17 T19:43:01+08:00.
Gelly, S., Hoock, J.-B., Rimmel, Arpad, Teytaud, O. and Kalemkarian, Y. (2008). The Parallelization of Monte-Carlo Planning. In International Conference on Informatics in Control, Automation and Robot, Madeira, Portugal.
Gelly, S. and Silver, D. (2007). Combining Online and Offline Knowledge in UCT.
Proceedings of the 24th International Conference on Machine Learning, pp. 273-280, Corvallis Oregon USA.
Gelly, S. and Silver, D. (2011). Monte-Carlo Tree Search and Rapid Action Value Estimation in Computer Go. Artificial Intelligence, Vol. 175, No. 11, pp. 1856-1875.
Gelly, S., Wang, Y., Munos, R. and Teytaud, O. (2006). Modifications of UCT with Patterns in Monte-Carlo Go. Technical Report 6062, INRIA.
Gaudel, R., Hoock, J.-B., Pérez, J., Sokolovska, N., Teytaud, O. (2010). A Principled Method for Exploiting Opening Books. Proceedings of the 5th International Conference on Computer and Games 2010, pp. 136-144.
Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. Readings in computer vision: issues, problems, principles, and paradigms, pp. 564-584.
Graepel, T., Goutrié, M., Krüger, M. and Herbrich R. (2001). Learning on Graphs in the Game of Go. Lecture Notes in Computer Science, Vol. 2130/2001, pp. 347-352.
Herik, H. J. van den, Uiterwijk, J. W. H. M., Rijswijck, J. van. (2002). Games solved:
Now and in the future. Artificial Intelligence, Vol. 134, pp. 277-311.
Helmbold, D. P. and Wood, A. P. (2009). All-Moves-As-First Heuristics in
87
Monte-Carlo Go. In Proceedings of the 2009 International Conference on Artificial Intelligence, ICAI’09/ISBN 1-60132-108-2, Editors: Arabnia, de la Fuente, Olivas, pp.
605-610, Los Vegas, USA.
Hendrik, B. (2010). Adaptive Playout Policies for Monte-Carlo Go. Master thesis, Institut für Kognitionswissenschaft, Universität Osnabrück.
Kocsis, L. and Szepesv´ari, C. (2006). Bandit Based Monte-Carlo Planning. In J.
Furnkranz, T. Scheffer and M. Spiliopoulou (eds.), Machine Learning: ECML 2006, Lecture Notes in Artificial Intelligence 4212, pp. 282-293.
Jasiek, R. (1997). Elementary Rules.
http://home.snafu.de/jasiek/element.html.
Retrieved at 2011-07-17 T19:42:20+08:00.
Huang, S. C., and Yen, S. J. (2010). Many Faces of Go Wins Computer 13×13 Go Tournament. ICGA Journal, Vol. 33, No. 3, pp. 172-173.
Hyatt, R. M. (1984). Using time wisely. ICCA Journal, Vol. 7, No. 1, pp. 4-9.
Kato, H. (2008). More UCT / Monte-Carlo questions (Effect of rave). Computer-Go mailing list.
http://www.mail-archive.com/[email protected]/msg07135.html.
Retrieved at 2011-07-17 T19:48:49+08:00.
Knuth, D. E., and Moore, R. W. (1975). An Analysis of Alpha-Beta Pruning, Artificial Intelligence, Vol. 6, No. 4, pp. 293-326.
Lew, L. (2010). Library of effective Go routines.
http://github.com/lukaszlew/libego.
88
Markovitch, S. and Sella, Y. (1996). Learning of resource allocation strategies for game playing. Computational Intelligence, vol. 12, pp. 88-105.
Müller, M. (2002). Computer Go. Artificial Intelligence, Vol. 134, issues 1-2, pp.
145-179.
Persson, M. (2010). Valkyria. Sensei’s Library.
http://senseis.xmp.net/?Valkyria.
Retrieved at 2011-07-14 T14:33:12+08:00.
Rimmel, A., Teytaud, F. and Teytaud, O. (2010). Biasing Monte-Carlo Simulations through RAVE Values. In Proceedings of the 5th International Conference on Computer and Games 2010, pp. 59-68.
Rosin, C. D. (2010). Multi-armed bandits with episode context. Proceedings of ISAIM 2010.
Schaeffer, J. (1989). The history heuristic and alpha-beta search enhancements in practice. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11, pp. 1203-1212.
Segal, R. (2010). On the Scalability of Parallel UCT. Proceedings of the 5th International Conference on Computer and Games 2010.
Shannon, C. E. (1950). Programming a computer for playing chess. Philosophical Magazine, 7th series, Vol. 41, No. 314, pp. 256-275.
Sheppard, B., Williams, M., Dailey, D., Hillis, D., Nentwich, D., Persson, M., Fotland, D. and Boon, M. (2009). Tweak to MCTS selection criterion. Computer-go mailing list,http://groups.google.com/group/computer-go-archive/browse_thread/thread/7531b ef033ca31f6, Jun. 2009.
Silver, D. (2009). Reinforcement learning and simulation-based search in computer Go. Ph.D. dissertation, University of Alberta.
Slate, D. J., and Atkin, L. R. (1977). Chess 4.5 - The Northwestern University Chess Program, Chess Skill in Man and Machine, P.W. Frey (ed.), Springer-Verlag, New
89
York, pp. 82-118.
Smith, A. (1908). The game of Go, the national game of Japan.
Šolak, R. and Vučković, R. (2009). Time management during a chess game. ICGA Journal, vol. 32, No. 4, pp. 206-220.
Stern, D., Herbrich, R. and Graepel, T. (2006). Bayesian Pattern Ranking for Move Prediction in the Game of Go. In Proceedings of the International Conference of Machine Learning.
Stogin, J., Chen, Y.-P., Drake, P., and Pellegrino, S. (2009). The Beta Distribution in the UCB Algorithm Applied to Monte-Carlo Go. In Proceedings of the 2009 International Conference on Artificial Intelligence, CSREA Press.
Tesauro, G., Rajan, V. T., Segal, R. (2010). Bayesian Inference in Monte-Carlo Tree Search , In Proceedings of the Conference on Uncertainties in Artificial Intelligence 2010.
Teytaud, O. (2011). News on Tromp-Cook.
http://www.mail-archive.com/[email protected]/msg02171.html.
Retrieved at 2011-07-17 T19:50:20+08:00.
Tromp, J. and Farnebäck, G. (2006). Combinatorics of Go. Proceedings of 5th international Conference on Computer and Games, pp. 241-249.
Xie, F., Liu, Z. (2009). Backpropagation Modification in Monte-Carlo Game Tree Search. Intelligent Information Technology Application, 2009 (IITA 2009), pp.
125-128.
Yajima, T., Hashimoto, T., Matsui, T., Hashimoto, J. and Spoerer, K. (2010). Node Expansion Operators for the UCT Algorithm. Proceedings of the 5th International Conference on Computer and Games 2010, pp. 116-123.
Yamashita, H. (2010). time settings. Computer-Go mailing list.
http://dvandva.org/pipermail/computer-go/2010-July/000687.html.
Retrieved at 2011-07-17 T19:50:56+08:00.
90
Yamashita, H. (2011). Exploration formulas for UCT. Computer-Go mailing list.
http://www.mail-archive.com/[email protected]/msg02155.html.
Retrieved at 2011-07-17 T19:51:22+08:00.
Yamato. (2011). News on Tromp-Cook.
http://www.mail-archive.com/[email protected]/msg02184.html.
Retrieved at 2011-07-17 T19:51:43+08:00.
Yen, S. J. (1999). Design and Implementation of Computer Go Program Jimmy 5.0.
Ph.D. dissertation, National Taiwan University.
Yoshimoto, H., Yoshizoe, K., Kaneko, T., Kishimoto, A. and Taura, K. (2006). Monte Carlo Go Has a Way to Go. In Proceedings of the 21st National Conference on Artificial Intelligence, AAAI Press, pp. 1070-1075.
Zobrist, A. (1969). A model of visual organization for the game of Go. Proceedings of AFIPS Spring Joint Computer Conference, Boston, AFIPS Press, Montvale, NJ, pp.
103-111.
Zobrist, A. (1970). Feature extractions and representation for pattern recognition and the game of Go. Ph.D. Thesis, Graduate School of the University of Wisconsin, Madison, WI.
91
Appendix A. Publication List
(a) Journal Papers
[1] Huang, S. C., Coulom, R. and Lin, S. S. (2010). Monte-Carlo Simulation Balancing Applied to 9×9 Go. ICGA Journal, Vol. 33, No. 4, pp. 191-201.
[2] Huang, L. T., Chen, S. T., Huang, S. C. and Lin, S. S. (2007). An Efficient Approach to Solve Mastermind Optimally. ICGA Journal, Vol. 30, No. 3, pp.
143-149.
(b) Conference Papers
[1] Huang, S. C., Coulom, R. and Lin, S. S. (2010). Monte-Carlo Simulation Balancing in Practice. In Proceedings of the International Conference on Computers and Games 2010 (CG2010), JAIST, Kanazawa, Japan, September 24-26.
[2] Huang, S. C., Coulom, R. and Lin, S. S. (2010). Time Management for Monte-Carlo Tree Search Applied to the Game of Go. International Workshop on Computer Games (IWCG 2010), Hsinchu, Taiwan, November 18-20.
[3] Lin, S. S., Chen, B. and Huang, S. C. (2004). Building a Multi-modal Multimedia Information Retrieval System for the National Museum of History.
In Proceedings of the 2004 International Conference on Digital Archive Technologies (ICDAT2004), pp. 121-130, Taipei, Taiwan.
(c) Tournament Report
[1] Huang, S. C. and Yen, S. J. (2010). Many Faces of Go wins Computer 13×13 Go Tournament. ICGA Journal, Vol. 33, No. 3, pp. 172-173.