[1] Bloembergen D, Tuyls K, Hennes D, et al. Evolutionary dynamics of multi-agent learning: A survey[J]. Journal of Artificial Intelligence Research, 2015, 53: 659-697
[2] Panait L, Luke S. Cooperative multi-agent learning: The state of the art[J]. Autonomous Agents and Multi-Agent Systems, 2005, 11(3): 387-434
[3] Sukhbaatar S, Szlam A, Fergus R. Learning multiagent communication with backpropagation[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2016: 2244-2252
[4] Banerjee B, Lyle J, Kraemer L, et al. Sample bounded distributed reinforcement learning for decentralized POMDPs[C]//Proc of the 26th AAAI Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2012
[5] Omidshafiei S, Agha-Mohammadi A A, Amato C, et al. Graph-based cross entropy method for solving multi-robot decentralized POMDPs[C]//Proc of IEEE Int Conf on Robotics and Automation (ICRA). Piscataway, NJ: IEEE, 2016: 5395-5402
[6] Cheney D L, Seyfarth R M. Constraints and preadaptations in the earliest stages of language evolution[J]. The Linguistic Review, 2005, 22(2/3/4): 135-159
[7] Cao Y, Yu W, Ren W, et al. An overview of recent progress in the study of distributed multi-agent coordination[J]. IEEE Trans on Industrial Informatics, 2012, 9(1): 427-438
[8] Matignon L, Jeanpierre L, Mouaddib A I. Coordinated multi-robot exploration under communication constraints using decentralized Markov decision processes[C]//Proc of the 26th AAAI Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2012
[9] Buşoniu L, Babuška R, De Schutter B. Multi-agent reinforcement learning: An overview[M]//Innovations in Multi-Agent Systems and Applications-1. Berlin: Springer, 2010: 183-221
[10] Foerster J N, Farquhar G, Afouras T, et al. Counterfactual multi-agent policy gradients[C]//Proc of the 32nd AAAI Conf on Artificial Intelligence. Menlo Park, CA: AAAI, 2018
[11] Lowe R, Wu Y, Tamar A, et al. Multi-agent actor-critic for mixed cooperative-competitive environments[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2017: 6379-6390
[12] Foerster J, Assael I A, De Freitas N, et al. Learning to communicate with deep multi-agent reinforcement learning[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2016: 2137-2145
[13] Peng P, Wen Y, Yang Y, et al. Multiagent bidirectionally-coordinated nets: Emergence of human-level coordination in learning to play StarCraft combat games[J]. arXiv preprint, arXiv:1703.10069, 2017
[14] Iqbal S, Sha F. Actor-attention-critic for multi-agent reinforcement learning[J]. arXiv preprint, arXiv:1810.02912, 2018
[15] Jiang J, Lu Z. Learning attentional communication for multi-agent cooperation[C]//Advances in Neural Information Processing Systems. Cambridge: MIT Press, 2018: 7254-7264