We present recursive small-step multi-agent A* (RS-MAA*), an exact algorithm that optimizes the expected reward in decentralized partially observable Markov decision processes (Dec-POMDPs). RS-MAA* builds on multi-agent A* (MAA*), an algorithm that finds policies by exploring a search tree, but tackles two major scalability concerns. First, we employ a modified, small-step variant of the search tree that avoids the double exponential outdegree of the classical formulation. Second, we use a tight and recursive heuristic that we compute on-the-fly, thereby avoiding an expensive precomputation. The resulting algorithm is conceptually simple, yet it shows superior performance on a rich set of standard benchmarks.

Citation

  Koops, W., Jansen, N., Junges, S., & Simão, T. D. (2023). Recursive Small-Step Multi-Agent A* for Dec-POMDPs. IJCAI, 5402–5410.

@inproceedings{Koops2023recursive,
  author = {Koops, Wietze and Jansen, Nils and Junges, Sebastian and Sim{\~a}o, Thiago D.},
  title = {Recursive Small-Step Multi-Agent A* for Dec-POMDPs},
  booktitle = {IJCAI},
  year = {2023},
  pages = {5402--5410}
}