Tags
#reinforcement-learning
Question
What are the successor features of a state-action pair (s, a) under policy π?
The SFs $$\boldsymbol{\psi}^{\pi}(s, a) \in \mathbb{R}^{d}$$ of a state-action pair (s, a) under policy $$\pi$$ are given by $$\boldsymbol{\psi}^{\pi}(s, a) \equiv \mathrm{E}^{\pi}\left[\sum_{i=t}^{\infty} \gamma^{i-t} \boldsymbol{\phi}_{i+1} \,\middle|\, S_{t}=s, A_{t}=a\right],$$ where the $$\boldsymbol{\phi}_{i+1} \in \mathbb{R}^{d}$$ are features of the transition $$(S_i, A_i, S_{i+1})$$.
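The definition above can be made concrete with a minimal sketch. The toy MDP below (3 states, 2 actions, deterministic transitions, 2-d features) is an illustrative assumption, not from the source; it computes the SFs by iterating their Bellman equation, $$\boldsymbol{\psi}^{\pi}(s, a) = \boldsymbol{\phi}_{t+1} + \gamma \boldsymbol{\psi}^{\pi}(s', \pi(s'))$$, and checks that for any reward weights $$\mathbf{w}$$ with $$r = \boldsymbol{\phi} \cdot \mathbf{w}$$, the action-value satisfies $$Q^{\pi}(s, a) = \boldsymbol{\psi}^{\pi}(s, a) \cdot \mathbf{w}$$.

```python
import numpy as np

# Illustrative toy MDP (an assumption for this sketch):
# 3 states, 2 actions, deterministic transitions, 2-d features phi(s, a, s').
n_states, n_actions, d, gamma = 3, 2, 2, 0.9
next_state = np.array([[1, 2], [2, 0], [0, 1]])                  # s' = next_state[s, a]
phi = np.random.default_rng(0).random((n_states, n_actions, d))  # phi_{t+1} per (s, a)
policy = np.array([0, 1, 0])                                     # fixed policy pi(s)

# SFs satisfy psi(s, a) = phi(s, a, s') + gamma * psi(s', pi(s')),
# so fixed-point iteration converges exactly like policy evaluation.
psi = np.zeros((n_states, n_actions, d))
for _ in range(1000):
    for s in range(n_states):
        for a in range(n_actions):
            s2 = next_state[s, a]
            psi[s, a] = phi[s, a] + gamma * psi[s2, policy[s2]]

# Sanity check: with rewards r = phi . w, Q^pi(s, a) = psi^pi(s, a) . w
# should match Q^pi computed by ordinary policy evaluation on the rewards.
w = np.array([1.0, -0.5])
q = np.zeros((n_states, n_actions))
for _ in range(1000):
    for s in range(n_states):
        for a in range(n_actions):
            s2 = next_state[s, a]
            q[s, a] = phi[s, a] @ w + gamma * q[s2, policy[s2]]

assert np.allclose(psi @ w, q)
```

Because the toy transitions are deterministic, the expectation in the definition collapses to a single trajectory; in a stochastic MDP the same recursion holds in expectation over $$S_{t+1}$$.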

Source: Universal Successor Features Approximators, p. 3