Abstract
Heuristically accelerated reinforcement learning (HARL) is a new family of algorithms that combines the advantages of reinforcement learning (RL) with those of heuristic algorithms. To achieve this, the action selection strategy of the standard RL algorithm is modified to take into account a heuristic running in parallel with the RL process. This paper presents two approximated HARL algorithms that use pheromone trails to dynamically learn a heuristic function, which in turn improves the behaviour of linearly approximated SARSA(λ). The proposed dynamic algorithms are evaluated against linearly approximated SARSA(λ) and heuristically accelerated SARSA(λ) with a static heuristic in three benchmark scenarios: mountain car, mountain car 3D and maze.
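The heuristic-biased action selection described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `Q`, `H`, the heuristic weight `xi` and the exploration rate `epsilon` are all assumptions, with the heuristic `H` standing in for a value derived from pheromone trails.

```python
import random

def harl_select_action(Q, H, state, actions, xi=1.0, epsilon=0.1):
    """Epsilon-greedy selection biased by a heuristic function.

    Q: dict mapping (state, action) -> learned value estimate.
    H: dict mapping (state, action) -> heuristic value (e.g. from
       pheromone trails in the ant-colony-inspired variants).
    xi: weight controlling how strongly the heuristic biases selection.
    """
    if random.random() < epsilon:
        return random.choice(actions)  # explore uniformly
    # exploit: greedy over the value estimate plus the weighted heuristic
    return max(actions, key=lambda a: Q[(state, a)] + xi * H[(state, a)])
```

With `xi = 0` this reduces to ordinary epsilon-greedy selection over `Q`; increasing `xi` lets the heuristic override the current value estimates, which is the mechanism HARL uses to accelerate learning.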
Original language | English |
---|---|
Pages (from-to) | 901-932 |
Number of pages | 32 |
Journal | Journal of Heuristics |
Volume | 25 |
Issue number | 6 |
Early online date | 3 May 2019 |
DOIs | |
Publication status | Published - Dec 2019 |
Keywords
- Ant colony optimization
- Dynamic heuristics
- Reinforcement learning