Dynamic heuristic acceleration of linearly approximated SARSA(lambda): using ant colony optimization to learn heuristics dynamically

Stefano Bromuri*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Heuristically accelerated reinforcement learning (HARL) is a new family of algorithms that combines the advantages of reinforcement learning (RL) with those of heuristic algorithms. To achieve this, the action selection strategy of the standard RL algorithm is modified to take into account a heuristic running in parallel with the RL process. This paper presents two approximated HARL algorithms that use pheromone trails to improve the behaviour of linearly approximated SARSA(λ) by dynamically learning a heuristic function. The proposed dynamic algorithms are evaluated against linearly approximated SARSA(λ) and against heuristically accelerated SARSA(λ) with a static heuristic in three benchmark scenarios: the mountain car, the mountain car 3D and the maze scenarios.
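As a rough illustration of the modified action selection described in the abstract, the sketch below biases an epsilon-greedy choice with a heuristic term. It is not the paper's algorithm: the additive form Q(s, a) + xi * H(s, a), the function name `harl_select_action`, and the parameters `xi` and `epsilon` are assumptions made for this example, and the heuristic values are passed in by hand rather than learned from pheromone trails.

```python
import random


def harl_select_action(q_values, heuristic, xi=1.0, epsilon=0.1, rng=random):
    """Pick an action index for one state.

    q_values:  estimated action values Q(s, a) for the current state
    heuristic: heuristic values H(s, a) for the same actions (assumed given)
    xi:        weight of the heuristic (assumed additive combination)
    epsilon:   probability of a uniformly random exploratory action
    """
    if len(q_values) != len(heuristic):
        raise ValueError("q_values and heuristic must have the same length")

    # Explore: ignore both Q and H with probability epsilon.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))

    # Exploit: greedy with respect to the heuristically biased score.
    scores = [q + xi * h for q, h in zip(q_values, heuristic)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical values for a state with three actions.
q = [1.0, 0.9, 0.0]
h = [0.0, 0.5, 0.0]
greedy_with_heuristic = harl_select_action(q, h, xi=1.0, epsilon=0.0)
greedy_without_heuristic = harl_select_action(q, h, xi=0.0, epsilon=0.0)
```

With `xi=0.0` the choice reduces to ordinary greedy selection on Q alone, so the heuristic only changes behaviour when it is weighted in; in this example it shifts the greedy choice from the first action to the second.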
Original language: English
Pages (from-to): 901-932
Number of pages: 33
Journal: Journal of Heuristics
Volume: 25
Issue number: 6
Early online date: 3 May 2019
Publication status: Published - Dec 2019

Keywords

  • Dynamic heuristics
  • Reinforcement learning
  • Ant colony optimization
