Journal of Information Security Research ›› 2020, Vol. 6 ›› Issue (4): 345-349.

• Special Topic on Remote Work Security •

An Efficient Communication Mechanism for Multi-Agent Cooperative Learning

Zhao Yuhang1, Ma Xiujun2

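  1. Key Laboratory of Machine Perception (Ministry of Education), Peking University
  2. School of Electronics Engineering and Computer Science, Peking University
  • Received: 2020-04-06 Online: 2020-04-03 Published: 2020-04-06
  • Corresponding author: Zhao Yuhang
  • About the authors: Zhao Yuhang is a master's student whose main research interests are multi-agent systems and reinforcement learning. zhaoyuhang@pku.edu.cn. Ma Xiujun is an associate professor whose main research interests are spatio-temporal data mining, intelligent agents, and intelligent systems. maxiujun@pku.edu.cn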

An Efficient Communication Framework in Multi-Agent Cooperating Learning Environment

Zhao Yuhang and Ma Xiujun

  • Received:2020-04-06 Online:2020-04-03 Published:2020-04-06

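Abstract (Chinese edition): Artificial intelligence is advancing rapidly, with notable breakthroughs in computer vision, natural language processing, and reinforcement learning. Most of this work, however, targets a single agent and aims to keep raising that agent's individual intelligence. Progress on multi-agent systems is better suited to complex problems such as the reproduction of animal populations or human teamwork. Even when no individual agent is especially intelligent, a group whose members communicate and cooperate efficiently can be highly intelligent as a whole. Multi-agent cooperative learning currently relies mostly on reinforcement learning frameworks, but most studies do not explicitly use a communication mechanism to improve overall model performance. This paper proposes an Actor-Critic algorithm framework based on communication filtering that lets agents in a multi-agent environment communicate efficiently; even in the execution phase, when no Critic is available for guidance, efficient communication still helps the agents cooperate. The framework uses a neural network to filter the information exchanged between agents, turning low-quality, redundant information into high-quality, low-dimensional information. Three experiments are designed to validate the model: two cooperative learning scenarios and a lane-changing task in autonomous driving. The results show that, in multi-agent cooperative learning with communication, the proposed model outperforms other comparable models.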

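Key words (Chinese edition): multi-agent systems, reinforcement learning, cooperative learning, artificial intelligence, autonomous driving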

Abstract: Reinforcement learning in cooperative multi-agent scenarios is important for real-world applications. While several earlier attempts tried to solve it without explicit communication, we present a communication-filtering actor-critic algorithm that trains decentralized policies that can exchange filtered information in multi-agent settings, using centrally computed critics. Communication is potentially an effective means of multi-agent cooperation. We hypothesize that, in the execution phase without central critics, high-quality communication between agents helps them perform better in cooperative situations. However, sharing information among all agents, or over the predefined communication architectures that existing methods adopt, can be problematic. We therefore use a neural network to filter the information exchanged between agents. Empirically, we show the strength of our model in two general cooperative settings and in vehicle lane-changing scenarios. Our approach outperforms several state-of-the-art models for multi-agent problems.

Key words: multi-agent system, reinforcement learning, cooperative learning, artificial intelligence, autonomous driving
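
The communication-filtering step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the network architecture, layer sizes, or training procedure, so everything below is an assumption made for illustration: the function names (`init_filter`, `filter_messages`, `actor_inputs`), the two-layer tanh network, and the 16-to-4 compression are hypothetical. Only the forward pass an agent might run at execution time is shown; the centrally computed critic and the training loop are omitted.

```python
import numpy as np

def init_filter(rng, in_dim, hidden_dim, out_dim):
    """Random weights for a small two-layer filter network (hypothetical sizes)."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(0.0, 0.1, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def filter_messages(params, messages):
    """Compress raw messages of shape (n_senders, msg_dim) into one low-dimensional vector."""
    flat = messages.reshape(-1)  # concatenate all incoming messages
    hidden = np.tanh(flat @ params["W1"] + params["b1"])
    return np.tanh(hidden @ params["W2"] + params["b2"])

def actor_inputs(params, observations, raw_messages):
    """Build each agent's actor input: own observation plus filtered messages from the others."""
    n_agents = observations.shape[0]
    inputs = []
    for i in range(n_agents):
        others = np.delete(raw_messages, i, axis=0)  # drop the agent's own message
        filtered = filter_messages(params, others)
        inputs.append(np.concatenate([observations[i], filtered]))
    return np.stack(inputs)

# Assumed toy dimensions: 3 agents, 8-dim observations, 16-dim raw messages, 4-dim filtered output.
rng = np.random.default_rng(0)
n_agents, obs_dim, msg_dim, filtered_dim = 3, 8, 16, 4
params = init_filter(rng, (n_agents - 1) * msg_dim, 32, filtered_dim)
observations = rng.normal(size=(n_agents, obs_dim))
raw_messages = rng.normal(size=(n_agents, msg_dim))
x = actor_inputs(params, observations, raw_messages)
print(x.shape)  # (3, 12)
```

In this sketch each agent receives 32 raw message values from its two peers and passes only a 4-dimensional summary to its policy, which mirrors the abstract's description of turning redundant information into low-dimensional information. In a trained system the filter weights would be learned rather than random.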