A domain may differ in that changes

21 03 2009

A domain may differ in that changes in the environment are more complex than those investigated in this paper. At present, the system detects that the goal has moved by counting how often a reward is received at the old goal position. Not only is this a rather ad hoc approach, but it also does not account for other possible changes, such as paths becoming blocked or shortcuts becoming available. At present, when learning a new task, the system is restarted and is not required to determine that its present solution is no longer applicable. In future work, the system should decide when its model of the world is no longer correct. It should also decide what, if any, relationship there is to the existing task and how it might be best exploited. This will allow a more complex interaction of the function composition system with reinforcement learning. For instance, the learning of a new task for the robot navigation domain used the relatively simple situation of two rooms. The function composition system initialized the low-level algorithm once on detecting suitable features. In the future, to address more complex tasks, with many more rooms, an incremental approach will be used. When a new task is being learned, the system will progressively build up a solution by function composition as different features become apparent.
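The goal-moved check described above, counting how often a reward is received at the old goal position, might be sketched as follows. This is only an illustration and not the paper's implementation: the class name, the sliding window, and the threshold value are all assumptions, since the text does not say how the count is turned into a decision.

```python
from collections import deque


class GoalMovedDetector:
    """Hypothetical helper: flags a moved goal when visits to the
    old goal position stop being rewarded."""

    def __init__(self, window=5, min_reward_rate=0.5):
        # window and min_reward_rate are illustrative values,
        # not parameters taken from the paper.
        self.window = window
        self.min_reward_rate = min_reward_rate
        self.outcomes = deque(maxlen=window)  # keeps only the most recent visits

    def record_visit(self, rewarded):
        """Record one arrival at the old goal position and whether it was rewarded."""
        self.outcomes.append(bool(rewarded))

    def goal_moved(self):
        """Return True once a full window of visits shows too few rewards."""
        if len(self.outcomes) < self.window:
            return False  # not enough evidence yet
        rate = sum(self.outcomes) / len(self.outcomes)
        return rate < self.min_reward_rate


if __name__ == "__main__":
    detector = GoalMovedDetector(window=3)
    # Three rewarded visits, then three unrewarded ones.
    for rewarded in [True, True, True, False, False, False]:
        detector.record_visit(rewarded)
    print(detector.goal_moved())
```

A counter of this kind only notices a missing reward at one position, which is why the text calls it ad hoc: a blocked path or a new shortcut would leave the count unchanged.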

This approach should also handle any errors the system might make with feature extraction. In the experiments with these simple room configurations, the filtering discussed in Section 2.3 proved sufficient to prevent problems. But in more complex tasks, it is likely that false "doorways" will be detected, simply because the system has not explored that region of the state space. A composed function including that extra doorway will drive the system into that region. It should then quickly become apparent that the doorway does not exist, and a new function can be composed.

Taken From: Accelerating Reinforcement Learning

