On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC)

arXiv.org, 2022-01 [Peer Reviewed Journal]

2022. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. EISSN: 2331-8422; DOI: 10.48550/arxiv.2109.04024

  • Title:
    On the Approximation of Cooperative Heterogeneous Multi-Agent Reinforcement Learning (MARL) using Mean Field Control (MFC)
  • Author: Mondal, Washim Uddin ; Agarwal, Mridul ; Aggarwal, Vaneet ; Ukkusuri, Satish V
  • Subjects: Algorithms ; Approximation ; Computer Science - Artificial Intelligence ; Computer Science - Computer Science and Game Theory ; Computer Science - Learning ; Computer Science - Multiagent Systems ; Learning ; Mathematical analysis ; Multiagent systems
  • Is Part Of: arXiv.org, 2022-01
  • Description: Mean field control (MFC) is an effective way to mitigate the curse of dimensionality of cooperative multi-agent reinforcement learning (MARL) problems. This work considers a collection of \(N_{\mathrm{pop}}\) heterogeneous agents that can be segregated into \(K\) classes such that the \(k\)-th class contains \(N_k\) homogeneous agents. We aim to prove approximation guarantees of the MARL problem for this heterogeneous system by its corresponding MFC problem. We consider three scenarios where the reward and transition dynamics of all agents are respectively taken to be functions of \((1)\) joint state and action distributions across all classes, \((2)\) individual distributions of each class, and \((3)\) marginal distributions of the entire population. We show that, in these cases, the \(K\)-class MARL problem can be approximated by MFC with errors given as \(e_1=\mathcal{O}(\frac{\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}}{N_{\mathrm{pop}}}\sum_{k}\sqrt{N_k})\), \(e_2=\mathcal{O}(\left[\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}\right]\sum_{k}\frac{1}{\sqrt{N_k}})\) and \(e_3=\mathcal{O}\left(\left[\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}\right]\left[\frac{A}{N_{\mathrm{pop}}}\sum_{k\in[K]}\sqrt{N_k}+\frac{B}{\sqrt{N_{\mathrm{pop}}}}\right]\right)\), respectively, where \(A, B\) are some constants and \(|\mathcal{X}|,|\mathcal{U}|\) are the sizes of state and action spaces of each agent. Finally, we design a Natural Policy Gradient (NPG) based algorithm that, in the three cases stated above, can converge to an optimal MARL policy within \(\mathcal{O}(e_j)\) error with a sample complexity of \(\mathcal{O}(e_j^{-3})\), \(j\in\{1,2,3\}\), respectively. (A brief numerical sketch of how these error bounds scale is given after this record.)
  • Publisher: Ithaca: Cornell University Library, arXiv.org
  • Language: English
  • Identifier: EISSN: 2331-8422
    DOI: 10.48550/arxiv.2109.04024
  • Source: Directory of Open Access Scholarly Resources (ROAD)
    arXiv.org
    Free E Journals
    ProQuest Central
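
The three error bounds quoted in the description determine how the MFC approximation behaves as the class sizes grow. The following Python sketch is not part of the paper; the constants A and B and the example population are placeholders. It simply evaluates the scaling of \(e_1\), \(e_2\), and \(e_3\) up to their hidden \(\mathcal{O}(\cdot)\) constants:

```python
import math

# Minimal worked example (assumed, not from the paper): evaluate the scaling
# of the three MFC approximation error bounds from the abstract, up to the
# hidden O(.) constants. A and B are unspecified in the abstract, so
# placeholder values of 1.0 are used here.

def error_bounds(class_sizes, x_size, u_size, A=1.0, B=1.0):
    """Return the (e1, e2, e3) scaling for class sizes N_k, |X|, |U|."""
    n_pop = sum(class_sizes)                      # total number of agents
    size_term = math.sqrt(x_size) + math.sqrt(u_size)
    sum_sqrt_nk = sum(math.sqrt(nk) for nk in class_sizes)

    # e1 = O((sqrt|X| + sqrt|U|) / N_pop * sum_k sqrt(N_k))
    e1 = size_term / n_pop * sum_sqrt_nk
    # e2 = O((sqrt|X| + sqrt|U|) * sum_k 1/sqrt(N_k))
    e2 = size_term * sum(1.0 / math.sqrt(nk) for nk in class_sizes)
    # e3 = O((sqrt|X| + sqrt|U|) * (A/N_pop * sum_k sqrt(N_k) + B/sqrt(N_pop)))
    e3 = size_term * (A / n_pop * sum_sqrt_nk + B / math.sqrt(n_pop))
    return e1, e2, e3


if __name__ == "__main__":
    # Example: two classes of 400 and 100 agents, |X| = 10 states, |U| = 5 actions.
    print(error_bounds([400, 100], x_size=10, u_size=5))
```

As a sanity check, with a single class (\(K=1\)) all three expressions reduce to the same \(\mathcal{O}\big(\frac{\sqrt{|\mathcal{X}|}+\sqrt{|\mathcal{U}|}}{\sqrt{N_{\mathrm{pop}}}}\big)\) rate, which the sketch above reproduces numerically.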
