Automated curriculum generation for reinforcement learning (RL) aims to speed up learning by designing a sequence of tasks of increasing difficulty. Such tasks are usually drawn from probability distributions with exponentially bounded tails, such as uniform or Gaussian distributions. However, existing approaches overlook heavy-tailed task distributions. Under such distributions, current methods may fail to learn optimal policies in rare tasks, which fall under the tails, and in risky tasks, which yield the lowest returns. We address this challenge by proposing a risk-aware curriculum generation algorithm that simultaneously creates two curricula: 1) a primary curriculum that aims to maximize the expected discounted return with respect to a distribution over target tasks, and 2) an auxiliary curriculum that identifies and over-samples rare and risky tasks observed in the primary curriculum. Our empirical results show that the proposed algorithm attains significantly higher returns in both frequent and rare tasks compared to state-of-the-art methods.
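
The two-curriculum loop can be illustrated with a minimal sketch. Everything below is an illustrative assumption rather than the paper's implementation: the RiskAwareCurriculum class, its tail_quantile/return_quantile/aux_ratio thresholds, and the Pareto target distribution in the usage stub are hypothetical stand-ins for the algorithm's actual components.

import numpy as np

class RiskAwareCurriculum:
    """Hypothetical sketch: a primary stream of target tasks plus an
    auxiliary buffer that over-samples rare (tail) and risky (low-return)
    tasks. Not the authors' implementation."""

    def __init__(self, sample_target_task, tail_quantile=0.95,
                 return_quantile=0.10, aux_ratio=0.5, warmup=20, seed=0):
        self.sample_target_task = sample_target_task  # draws a task from the heavy-tailed target distribution
        self.tail_quantile = tail_quantile      # tasks beyond this quantile count as "rare"
        self.return_quantile = return_quantile  # returns below this quantile count as "risky"
        self.aux_ratio = aux_ratio              # fraction of training tasks taken from the auxiliary curriculum
        self.warmup = warmup                    # outcomes needed before the empirical quantiles are trusted
        self.rng = np.random.default_rng(seed)
        self.history = []                       # (task, return) pairs observed in the primary curriculum
        self.aux_buffer = []                    # rare/risky tasks to over-sample

    def record(self, task, ret):
        # Log a primary-curriculum outcome and flag rare or risky tasks.
        self.history.append((task, ret))
        if len(self.history) < self.warmup:
            return
        tasks = np.array([t for t, _ in self.history])
        rets = np.array([r for _, r in self.history])
        rare = task >= np.quantile(tasks, self.tail_quantile)    # falls under the tail
        risky = ret <= np.quantile(rets, self.return_quantile)   # yields one of the lowest returns
        if rare or risky:
            self.aux_buffer.append(task)

    def next_task(self):
        # Mix the primary curriculum with over-sampled rare/risky tasks.
        if self.aux_buffer and self.rng.random() < self.aux_ratio:
            return self.aux_buffer[self.rng.integers(len(self.aux_buffer))]
        return self.sample_target_task()

# Usage stub: a heavy-tailed (Pareto) target distribution and a fake agent
# whose return degrades on tail tasks.
rng = np.random.default_rng(1)
curriculum = RiskAwareCurriculum(lambda: float(rng.pareto(2.0)))
for _ in range(1000):
    task = curriculum.next_task()
    ret = -task + float(rng.normal())  # stand-in for the agent's return on this task
    curriculum.record(task, ret)

The fixed aux_ratio mixing constant is the simplest possible choice here; whatever mechanism the paper uses to balance the two curricula, the key idea is that flagged rare and risky tasks re-enter training more often than sampling the target distribution alone would produce them.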

Citation

  Koprulu, C., Simão, T. D., Jansen, N., & Topcu, U. (2023). Risk-aware Curriculum Generation for Heavy-tailed Task Distributions. UAI, 1132–1142.

@inproceedings{Koprulu2023risk-aware,
  title = {{Risk-aware Curriculum Generation for Heavy-tailed Task Distributions}},
  author = {Koprulu, Cevahir and Sim{\~a}o, Thiago D. and Jansen, Nils and Topcu, Ufuk},
  pages = {1132--1142},
  booktitle = {UAI},
  year = {2023}
}