Tags

#reinforcement-learning

Question

What are the successor features of a state-action pair (s, a) under policy π?

Answer

The successor features (SFs) \(\boldsymbol{\psi}^{\pi}(s, a) \in \mathbb{R}^{d}\) of a state-action pair \((s, a)\) under policy \(\pi\) are given by \(\boldsymbol{\psi}^{\pi}(s, a) \equiv \mathrm{E}^{\pi}\left[\sum_{i=t}^{\infty} \gamma^{i-t} \boldsymbol{\phi}_{i+1} \mid S_{t}=s, A_{t}=a\right]\), where \(\gamma\) is the discount factor and each \(\boldsymbol{\phi}_{i+1} \in \mathbb{R}^{d}\) is the feature vector of the transition \((S_i, A_i, S_{i+1})\).
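
To make the definition concrete, here is a minimal Python sketch of a Monte Carlo estimate of the SFs. It assumes we already have finite rollouts that start from (s, a) and then follow π, each recorded as a list of feature vectors φ_{t+1}, φ_{t+2}, ...; the infinite sum is therefore truncated at the rollout length. The function names `discounted_feature_sum` and `monte_carlo_sf` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def discounted_feature_sum(phis: Sequence[Sequence[float]], gamma: float) -> List[float]:
    """Return sum_k gamma**k * phis[k], where phis[k] stands for phi_{t+1+k}.

    This is one sampled (and truncated) instance of the quantity inside
    the expectation in the definition of the successor features.
    """
    if not phis:
        raise ValueError("need at least one feature vector")
    d = len(phis[0])
    total = [0.0] * d
    discount = 1.0  # gamma**0 for the first feature vector, phi_{t+1}
    for phi in phis:
        for j in range(d):
            total[j] += discount * phi[j]
        discount *= gamma
    return total


def monte_carlo_sf(rollouts: Sequence[Sequence[Sequence[float]]], gamma: float) -> List[float]:
    """Average the discounted feature sums over several rollouts.

    Assumption: every rollout starts from the same (s, a) and then
    follows the same policy, so the average approximates the expectation.
    """
    sums = [discounted_feature_sum(r, gamma) for r in rollouts]
    d = len(sums[0])
    return [sum(s[j] for s in sums) / len(sums) for j in range(d)]


# One rollout with d = 2 features and three transitions.
example = discounted_feature_sum([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], gamma=0.5)
```

With `gamma=0.5`, the first component of `example` is 1 + 0.25 = 1.25 and the second is 0.5 + 0.25 = 0.75, which matches the weights γ^{i-t} in the sum above.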
