Many recent works on communication in multi-agent reinforcement
learning focus on learning what messages to send, when to send
them, and to whom to address them.
These works indicate that communication helps agents achieve higher
cumulative reward or task success. However, most of them overlook the need to strengthen
agents' ability to understand the information they receive. In this
paper, we observe that observation and communication signals
come from separate information sources. We therefore equip the
communicating agents with the capability to integrate crucial
information from different sources. Specifically, we propose a
multi-modal communication method, which treats agents'
observation and communication signals as different modalities
and performs multi-modal fusion so that knowledge can transfer
across modalities. We evaluate the proposed method
on a diverse set of cooperative multi-agent tasks with several
state-of-the-art algorithms. Results demonstrate the effectiveness of our method in incorporating knowledge and gaining a
deeper understanding from various information sources.
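
To make the idea of fusing observation and communication signals concrete, the following is a minimal sketch under stated assumptions, not the method proposed in this paper. The abstract does not specify the fusion mechanism, so scaled dot-product attention followed by concatenation is swapped in as one common choice. The names `fuse`, `softmax`, and `dot` are hypothetical, and the observation and message embeddings are assumed to share the same dimension.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(observation: Vector, messages: List[Vector]) -> Vector:
    """Hypothetical fusion step (an assumption, not the paper's design).

    The observation embedding acts as a query over the received message
    embeddings; the attention-weighted message summary is concatenated
    to the observation to form the fused representation.
    """
    dim = len(observation)
    if not messages:
        # No messages received: pad with zeros so the output size is fixed.
        return observation + [0.0] * dim
    scale = math.sqrt(dim)
    weights = softmax([dot(observation, m) / scale for m in messages])
    attended = [
        sum(w * m[i] for w, m in zip(weights, messages)) for i in range(dim)
    ]
    return observation + attended


if __name__ == "__main__":
    obs = [1.0, 0.0]
    msgs = [[1.0, 0.0], [0.0, 1.0]]
    print(fuse(obs, msgs))
```

In this sketch the fused vector has twice the observation dimension, and a message that aligns more closely with the observation receives a larger attention weight. A learned variant would replace the raw embeddings with trained projections.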