Among the algorithms investigated so far in this book, only the [...] methods are true SGD methods. These methods converge robustly under both on-policy and off-policy training as well as for general nonlinear (differentiable) function approximators, though they are often slower than semi-gradient methods with bootstrapping, which are not SGD methods.
Answer
Monte Carlo
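A minimal sketch of the distinction the card is testing, assuming linear value approximation v̂(s, w) = wᵀx(s); the feature function `features`, the step size `alpha`, and the discount `gamma` are illustrative choices, not from the original text. Gradient Monte Carlo is true SGD because its target, the full return G, does not depend on w; semi-gradient TD(0) bootstraps from a w-dependent target whose gradient is ignored, so it is not a true gradient step.

```python
import numpy as np

def features(state, dim=8):
    # Hypothetical fixed random features keyed by an integer state id.
    rng = np.random.default_rng(state)
    return rng.standard_normal(dim)

def gradient_mc_update(w, state, G, alpha=0.01):
    # True SGD: the Monte Carlo return G is independent of w, so
    # (G - v_hat(s, w)) * grad v_hat(s, w) is an unbiased sample of the
    # gradient of the squared value error.
    x = features(state)
    return w + alpha * (G - w @ x) * x

def semi_gradient_td0_update(w, state, reward, next_state,
                             gamma=0.99, alpha=0.01):
    # Semi-gradient: the bootstrapped target R + gamma * v_hat(s', w)
    # depends on w, but its gradient is ignored -- hence "semi-gradient"
    # and not a true SGD method.
    x, x_next = features(state), features(next_state)
    target = reward + gamma * (w @ x_next)
    return w + alpha * (target - w @ x) * x
```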