Cooperative Information Sharing to Improve Distributed Learning
Effective coordination in partially observable multi-agent systems (MAS) requires that agents base their actions on reliable estimates of non-local state. One way to obtain such estimates is to let agents share state information that other agents cannot observe directly. To this end, we propose a novel strategy: delayed distribution of state estimates. In empirical studies on a simulated network routing problem, individual reinforcement-learning agents using this mechanism achieve significantly better overall success, robustness, and efficiency of routing than the standard Q-routing algorithm.
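For orientation, the baseline named above can be sketched in a few lines. This is a minimal illustration of the standard Q-routing update only, not of the delayed-distribution mechanism proposed here. The class name `QRouter`, its method names, and the learning rate of 0.5 are assumptions made for the example. Each node keeps an estimate of the delivery time to every destination via each neighbor, and after forwarding a packet it moves that estimate toward the observed delays plus the neighbor's own best estimate.

```python
from collections import defaultdict


class QRouter:
    """One node's routing table for baseline Q-routing (illustrative sketch)."""

    def __init__(self, neighbors, learning_rate=0.5):
        self.neighbors = list(neighbors)
        self.learning_rate = learning_rate  # assumed value, not from the paper
        # q[dest][neighbor]: estimated delivery time to dest via that neighbor
        self.q = defaultdict(lambda: {n: 0.0 for n in self.neighbors})

    def best_estimate(self, dest):
        """Smallest estimated delivery time to dest; reported back to senders."""
        return min(self.q[dest].values())

    def choose_next_hop(self, dest):
        """Greedy policy: forward via the neighbor with the lowest estimate."""
        return min(self.q[dest], key=self.q[dest].get)

    def update(self, dest, neighbor, queue_delay, link_delay, neighbor_estimate):
        """Move the estimate toward the delays seen plus the neighbor's estimate."""
        target = queue_delay + link_delay + neighbor_estimate
        old = self.q[dest][neighbor]
        self.q[dest][neighbor] = old + self.learning_rate * (target - old)
        return self.q[dest][neighbor]
```

In this sketch a node learns about non-local state only through the single estimate returned by the neighbor it just used, which is the limitation that sharing state estimates among agents is meant to address.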