Applying Reinforcement Learning to a Continuous Environment


Carl Burch


In this paper, we explore issues that arise when applying Temporal Difference (TD) learning, a reinforcement learning algorithm, to continuous environments. Specifically, we ask whether TD learning can be applied successfully to a continuous environment and whether any one implementation of TD learning is best suited to the task. This paper includes:

  • A detailed description of our implementation of Capture the Flag, which we used as the continuous environment.
  • An overview of the TD learning algorithm, as well as our Discrete, Nearest Neighbor, and Artificial Neural Network implementations.
  • A summary of experimental data with graphs and analysis contrasting the learning performance of the aforementioned implementations.

Finally, we show that reinforcement learning can be applied to a continuous environment, and that Artificial Neural Networks can learn quite successfully when configured correctly.
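
As a rough illustration of the kind of approach summarized above, the following minimal sketch shows a standard tabular TD(0) value update applied to a continuous state that has first been mapped to a discrete bin. It is not taken from the paper: the function names `discretize` and `td0_update`, the one-dimensional state, the bin count, and the values of the learning rate `alpha` and discount factor `gamma` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: names, state layout, and parameters are assumptions,
# not the implementation described in the paper.

def discretize(x, low, high, bins):
    """Map a continuous value to a bin index in [0, bins - 1], clamping out-of-range input."""
    ratio = (x - low) / (high - low)
    index = int(ratio * bins)
    return max(0, min(bins - 1, index))


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update: V(s) += alpha * (r + gamma * V(s') - V(s)).

    Returns the TD error used for the update.
    """
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# One value estimate per bin, all starting at zero.
values = [0.0] * 10

# A hypothetical transition between two continuous positions in [0, 1].
s = discretize(0.42, 0.0, 1.0, 10)
s_next = discretize(0.57, 0.0, 1.0, 10)
error = td0_update(values, s, reward=1.0, next_state=s_next)
```

A nearest neighbor or neural network variant would presumably replace the `values` table with a function approximator that generalizes across nearby states, while the TD error itself would be computed in the same way.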