#computer-science #machine-learning #reinforcement-learning
Among the algorithms investigated so far in this book, only the Monte Carlo methods are true stochastic gradient descent (SGD) methods. They converge robustly under both on-policy and off-policy training, and with general nonlinear (differentiable) function approximators, though they are often slower than semi-gradient methods with bootstrapping. Those bootstrapping methods are not SGD methods, because their update target itself depends on the weights being learned.
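The distinction can be checked numerically. The following is a minimal sketch, assuming a linear value estimate `v(s) = w · x(s)` and squared-error objectives; the feature vectors, reward, return, and step size are made-up illustrative values, and the helper names (`mc_step`, `semi_gradient_td0_step`, `is_sgd_step`) are hypothetical rather than taken from the book. It compares each update with a finite-difference gradient of the squared error toward that update's target.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def mc_step(w, x, G, alpha):
    # Gradient Monte Carlo: the target G is a sampled return, independent of w.
    err = G - dot(w, x)
    return [alpha * err * xi for xi in x]

def semi_gradient_td0_step(w, x, r, x_next, gamma, alpha):
    # Semi-gradient TD(0): the bootstrapped target r + gamma * v(s') depends
    # on w, but the update treats it as a constant.
    delta = r + gamma * dot(w, x_next) - dot(w, x)
    return [alpha * delta * xi for xi in x]

def numerical_grad(f, w, eps=1e-6):
    # Central-difference gradient of a scalar function f at w.
    grad = []
    for i in range(len(w)):
        up, down = list(w), list(w)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad

def is_sgd_step(step, loss, w, alpha, tol=1e-6):
    # True if step equals -alpha * grad(loss)(w) up to tol.
    grad = numerical_grad(loss, w)
    return all(abs(s + alpha * g) < tol for s, g in zip(step, grad))

# Illustrative values only (assumptions, not from the source).
w, x, x_next = [0.5, -0.2], [1.0, 2.0], [0.0, 1.0]
G, r, gamma, alpha = 1.5, 1.0, 0.9, 0.1

mc_loss = lambda w_: 0.5 * (G - dot(w_, x)) ** 2
td_loss = lambda w_: 0.5 * (r + gamma * dot(w_, x_next) - dot(w_, x)) ** 2

mc_is_sgd = is_sgd_step(mc_step(w, x, G, alpha), mc_loss, w, alpha)
td_is_sgd = is_sgd_step(
    semi_gradient_td0_step(w, x, r, x_next, gamma, alpha), td_loss, w, alpha
)
print(mc_is_sgd, td_is_sgd)
```

Under these assumptions the Monte Carlo step matches the negative gradient of its squared error, while the semi-gradient TD(0) step does not match the gradient of the squared TD error, since it drops the `gamma * x_next` term that comes from differentiating the target.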