Question
What is the form of a universal successor features approximator (USFA)?
Answer
Universal successor features (USFs) are defined as \(\boldsymbol{\psi}(s, a, \pi) \equiv \boldsymbol{\psi}^{\pi}(s, a) \equiv \mathrm{E}^{\pi}\left[\sum_{i=t}^{\infty} \gamma^{i-t} \boldsymbol{\phi}_{i+1} \mid S_{t}=s, A_{t}=a\right]\); that is, they differ from successor features (SFs) in taking the policy \(\pi\) as an additional argument. Based on this definition, a function \(\tilde{\boldsymbol{\psi}}(s, a, \pi) \approx \boldsymbol{\psi}(s, a, \pi)\) is called a universal successor features approximator (USFA).
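To make the role of the extra policy argument concrete, here is a minimal sketch that computes the exact target \(\boldsymbol{\psi}(s, a, \pi)\) which a USFA would approximate. Everything in it is an assumption made for illustration and does not come from the paper: the two-state deterministic MDP, the one-hot features attached to the next state, the discount \(\gamma = 0.5\), the representation of a policy as a tuple of actions, and the helper name `universal_sf`. It relies on the recursion \(\boldsymbol{\psi}^{\pi}(s, a) = \boldsymbol{\phi}(s') + \gamma \boldsymbol{\psi}^{\pi}(s', \pi(s'))\), which follows from the definition when transitions and the policy are deterministic.

```python
# Toy deterministic MDP (hypothetical, for illustration only):
# two states, two actions, and taking action a always leads to state a.
GAMMA = 0.5
NEXT_STATE = [[0, 1], [0, 1]]    # NEXT_STATE[s][a] -> next state s'
PHI = [[1.0, 0.0], [0.0, 1.0]]   # PHI[s'] = feature vector observed on arriving in s'


def universal_sf(s, a, policy, sweeps=200):
    """Return psi(s, a, pi) for a deterministic policy, where policy[s] is an action."""
    n_states, n_actions, dim = len(NEXT_STATE), len(NEXT_STATE[0]), len(PHI[0])
    psi = [[[0.0] * dim for _ in range(n_actions)] for _ in range(n_states)]
    for _ in range(sweeps):
        new_psi = [[None] * n_actions for _ in range(n_states)]
        for x in range(n_states):
            for b in range(n_actions):
                nxt = NEXT_STATE[x][b]
                # After the first step, actions are chosen by the policy.
                follow = psi[nxt][policy[nxt]]
                # Backup: psi(x, b) = phi(s') + gamma * psi(s', pi(s'))
                new_psi[x][b] = [PHI[nxt][k] + GAMMA * follow[k] for k in range(dim)]
        psi = new_psi
    return psi[s][a]


# Same state-action pair, two different policies, two different answers.
always_0 = (0, 0)   # always move to state 0
always_1 = (1, 1)   # always move to state 1
print(universal_sf(0, 0, always_0))   # approximately [2.0, 0.0]
print(universal_sf(0, 0, always_1))   # approximately [1.0, 1.0]
```

The same pair \((s, a) = (0, 0)\) yields different successor features under the two policies, which is why the policy has to appear as an argument. A USFA would replace this exact table lookup with a learned function of \((s, a, \pi)\).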
Tags
#reinforcement-learning
Source: Universal Successor Features Approximators, p. 4