


#reinforcement-learning

We consider a broader class of Bellman equations that are non-linear in the rewards and future values: \(v(s)=\mathbb{E}\left[f\left(R_{t+1}, v\left(S_{t+1}\right)\right) \mid S_{t}=s, A_{t} \sim \pi\left(S_{t}\right)\right]\).
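As a rough illustration of the equation above, the sketch below iterates \(v \leftarrow \mathbb{E}[f(R_{t+1}, v(S_{t+1}))]\) on a made-up two-state process with the policy already folded into the transition matrix. The names `evaluate`, `P`, `R`, `linear`, and `nonlinear` are hypothetical, as are the numbers. The choice \(f(r, v) = r + \gamma v\) gives the standard linear Bellman equation, while \(f(r, v) = r + \gamma \tanh(v)\) is only an assumed example of a non-linear \(f\), picked because it is a contraction in \(v\) so the iteration converges.

```python
import numpy as np


def evaluate(P, R, f, iters=1000, tol=1e-10):
    """Iterate v <- E[f(R_{t+1}, v(S_{t+1})) | S_t = s] until it stops changing.

    P[s, s2] : transition probabilities under a fixed policy (assumed given).
    R[s, s2] : reward received on the transition s -> s2 (assumed given).
    f        : elementwise function of (reward, next-state value).
    """
    v = np.zeros(P.shape[0])
    for _ in range(iters):
        # v[None, :] broadcasts the next-state values across each row s.
        v_new = np.sum(P * f(R, v[None, :]), axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


gamma = 0.9
P = np.array([[0.5, 0.5],
              [0.2, 0.8]])   # hypothetical transition matrix
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])   # hypothetical transition rewards


def linear(r, v):
    # Standard Bellman equation: f(r, v) = r + gamma * v.
    return r + gamma * v


def nonlinear(r, v):
    # Assumed non-linear example: squashes the future value with tanh.
    return r + gamma * np.tanh(v)


v_linear = evaluate(P, R, linear)
v_nonlinear = evaluate(P, R, nonlinear)
```

In the linear case the result can be checked against the closed form \(v = (I - \gamma P)^{-1} \bar{r}\), where \(\bar{r}(s)\) is the expected one-step reward from \(s\). No such closed form is available in general for a non-linear \(f\), and convergence of the iteration depends on \(f\) being a contraction in its second argument.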


Source: General non-linear Bellman equations, p. 1

