
RL theory I [3 P]

Prove Corollary 1.1 (p. 7) from the lecture notes "Theory of Reinforcement Learning":

For every policy $\pi$ there exists a deterministic policy $\pi'$ such that $\pi' \geq \pi$. As a special case: if there exists a stochastic optimal policy $\pi$, then there also exists a deterministic optimal policy $\pi'$ such that $\pi' \geq \pi$.
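
Before attempting the proof, it can help to check the statement numerically on a small example. The sketch below is an illustration, not a proof, and it rests on assumptions that are not taken from the lecture notes: it reads $\pi' \geq \pi$ as $V^{\pi'}(s) \geq V^{\pi}(s)$ for every state $s$, it uses a discounted infinite-horizon criterion with discount factor `GAMMA`, and the two-state MDP (`P`, `R`) as well as the helper names `evaluate` and `greedy_deterministic` are hypothetical. It evaluates a stochastic policy, builds a deterministic policy that is greedy with respect to the resulting action values, and compares the two value functions.

```python
GAMMA = 0.9  # assumed discount factor (not from the lecture notes)

# Hypothetical toy MDP with 2 states and 2 actions.
# P[s][a][t] is the probability of moving from state s to state t under action a.
P = [
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.0, 1.0]],
]
# R[s][a] is the expected immediate reward for taking action a in state s.
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]


def q_values(v):
    """Action values Q(s, a) = R(s, a) + GAMMA * sum_t P(t | s, a) * v(t)."""
    n = len(P)
    return [
        [R[s][a] + GAMMA * sum(P[s][a][t] * v[t] for t in range(n))
         for a in range(len(R[s]))]
        for s in range(n)
    ]


def evaluate(policy, sweeps=2000):
    """Approximate V^policy by iterating the Bellman expectation backup.

    policy[s][a] is the probability of choosing action a in state s.
    """
    v = [0.0] * len(P)
    for _ in range(sweeps):
        q = q_values(v)
        v = [sum(policy[s][a] * q[s][a] for a in range(len(q[s])))
             for s in range(len(P))]
    return v


def greedy_deterministic(policy):
    """Deterministic policy that is greedy with respect to Q^policy."""
    q = q_values(evaluate(policy))
    result = []
    for s in range(len(P)):
        best = max(range(len(q[s])), key=lambda a: q[s][a])
        result.append([1.0 if a == best else 0.0 for a in range(len(q[s]))])
    return result


if __name__ == "__main__":
    stochastic = [[0.5, 0.5], [0.5, 0.5]]
    deterministic = greedy_deterministic(stochastic)
    print("V^pi :", evaluate(stochastic))
    print("V^pi':", evaluate(deterministic))
```

On this toy problem the deterministic policy should be at least as good as the stochastic one in every state. The construction (acting greedily with respect to $Q^{\pi}$) is one possible choice of $\pi'$; whether the lecture notes intend this route for the proof is an assumption.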

Haeusler Stefan 2011-01-25