Factored temporal difference learning in the New Ties environment
| Authors: | |
|---|---|
| Corporate author: | |
| Document type: | Article |
| Published: | 2008 |
| Series: | Acta Cybernetica 18 No. 4 |
| Keywords: | Computer science, Cybernetics |
| Subject headings: | |
| Online Access: | http://acta.bibl.u-szeged.hu/12840 |
| Abstract: | Although reinforcement learning is a popular method for training an agent to make decisions based on rewards, well-studied tabular methods are not applicable to large, realistic problems. In this paper, we experiment with a factored version of temporal difference learning, which boils down to a linear function approximation scheme utilising natural features coming from the structure of the task. We conducted experiments in the New Ties environment, which is a novel platform for multi-agent simulations. We show that learning with a factored representation is effective even in large state spaces; furthermore, it outperforms tabular methods even in smaller problems in both learning speed and stability, because of its generalisation capabilities. |
| Pages: | 651-668 |
| ISSN: | 0324-721X |
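
The abstract describes factored temporal difference learning as a linear function approximation scheme over features derived from the task structure. The following is a minimal sketch of linear TD(0) with a factored (feature-based) value function; the `env` interface, the feature map `phi`, and all parameter values are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def factored_td0(env, phi, n_features, episodes=500, alpha=0.05, gamma=0.95):
    """Linear TD(0): V(s) is approximated as w . phi(s), where phi maps a state
    to a (typically sparse) feature vector reflecting the task's factored
    structure. `env` is assumed to expose reset() -> state,
    sample_action() -> action, and step(action) -> (next_state, reward, done);
    this interface is hypothetical."""
    w = np.zeros(n_features)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            a = env.sample_action()                 # any behaviour policy
            s_next, r, done = env.step(a)
            v_s = w @ phi(s)
            v_next = 0.0 if done else w @ phi(s_next)
            td_error = r + gamma * v_next - v_s     # one-step TD error
            w += alpha * td_error * phi(s)          # semi-gradient update
            s = s_next
    return w
```

Because updates are made to the feature weights rather than to individual state entries, experience generalises across states that share features, which is one plausible reading of the speed and stability advantages over tabular methods reported in the abstract.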