Solve the cart-pole problem with the SARSA(λ) algorithm and linear function approximation. Download the Reinforcement Learning (RL) MATLAB Toolbox and the example files^{5}, and adapt the cart-pole demo example to solve the task.
Use the following learning parameters:
. Normalize the parameter vector of the linear function approximation so that the sum of its elements is 1.
Initialize the action values to zero (*optimistic initialization*). To evaluate the success of your learning algorithm, measure the number of steps needed to reach the goal. To verify whether the cart-pole has reached the goal, use the learned optimal policy, i.e. set
.
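
The update that the exercise asks for can be sketched as follows. This is a minimal illustration in plain Python, not the toolbox's API: the function names (`normalize`, `q_value`, `sarsa_lambda_step`), the per-action weight layout, and the use of accumulating eligibility traces are assumptions made for the example. The arguments `alpha`, `gamma`, and `lam` stand for the step size, discount factor, and trace-decay parameter.

```python
def normalize(phi):
    """Scale a non-negative feature vector so that its elements sum to 1."""
    total = sum(phi)
    return [x / total for x in phi] if total > 0 else list(phi)


def q_value(theta, phi, action):
    """Linear action value: dot product of the action's weights and the features."""
    return sum(w * x for w, x in zip(theta[action], phi))


def sarsa_lambda_step(theta, traces, phi, action, reward,
                      phi_next, action_next, alpha, gamma, lam,
                      terminal=False):
    """Apply one SARSA(lambda) update in place and return the TD error.

    theta and traces are lists of per-action weight/trace vectors
    (hypothetical layout chosen for this sketch).
    """
    q_next = 0.0 if terminal else q_value(theta, phi_next, action_next)
    delta = reward + gamma * q_next - q_value(theta, phi, action)
    for a in range(len(theta)):
        for i in range(len(phi)):
            # Decay every trace, then accumulate the features of the taken action.
            traces[a][i] *= gamma * lam
            if a == action:
                traces[a][i] += phi[i]
            theta[a][i] += alpha * delta * traces[a][i]
    return delta
```

With all weights initialized to zero, the first update moves only the weights of the action that was taken, in proportion to the normalized features. The traces should be reset to zero at the start of each episode.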

- a) Use grid-tilings of size to discretize the state space. Show in a plot how the number of steps needed to reach the goal evolves during learning.
- b) Use radial basis function (RBF) approximation with evenly spaced RBF centers located at the tile centers used in a) (i.e. 11025 centers in total). Set the widths in every dimension such that one RBF spans roughly 1-2 tiles.
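
The two feature constructions in a) and b) can be sketched side by side. The following plain-Python example is an illustration only: the helper names, the state bounds, and the small two-dimensional grid are hypothetical and are not the grid size or state ranges of the exercise. Using one tile width as the Gaussian width is likewise an assumption, chosen so that one RBF covers roughly one to two tiles.

```python
import math
from itertools import product


def tile_centers(lows, highs, bins):
    """Centers of an evenly spaced grid, in row-major order (last axis fastest)."""
    axes = []
    for lo, hi, n in zip(lows, highs, bins):
        width = (hi - lo) / n
        axes.append([lo + (k + 0.5) * width for k in range(n)])
    return [list(c) for c in product(*axes)]


def tile_features(state, lows, highs, bins):
    """One-hot vector: 1 for the tile that contains the state, 0 elsewhere."""
    index, total = 0, 1
    for s, lo, hi, n in zip(state, lows, highs, bins):
        k = int((s - lo) / (hi - lo) * n)
        k = min(max(k, 0), n - 1)  # clip states on or outside the bounds
        index = index * n + k
        total *= n
    phi = [0.0] * total
    phi[index] = 1.0
    return phi


def rbf_features(state, centers, widths):
    """Gaussian RBF activations, normalized so that they sum to 1."""
    acts = []
    for c in centers:
        d2 = sum(((s - ci) / w) ** 2 for s, ci, w in zip(state, c, widths))
        acts.append(math.exp(-0.5 * d2))
    total = sum(acts)
    return [a / total for a in acts] if total > 0 else acts


# Hypothetical 2-D example: 4 x 2 tiles, one RBF per tile center.
lows, highs, bins = [-1.0, -2.0], [1.0, 2.0], [4, 2]
centers = tile_centers(lows, highs, bins)
widths = [(hi - lo) / n for lo, hi, n in zip(lows, highs, bins)]
phi_tiles = tile_features([0.3, -0.5], lows, highs, bins)
phi_rbf = rbf_features([0.3, -0.5], centers, widths)
```

The tiling activates exactly one feature per state, whereas the RBF features vary smoothly and let neighbouring centers share credit for each update. Both vectors already sum to 1, which matches the normalization requested above.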