BuboFlash - helps with learning

Do you want BuboFlash to help you learning these things? Or do you want to add or correct something? Click here to log in or create user.

#computer-science #machine-learning #reinforcement-learning

The idea of storing some estimates separately and then combini n g them with samples is a good one and is also used in Gradient-TD methods. Gradient-TD methods estimate and store the product of the second two factors in (11.27) . These factors are a d ⇥ d matrix and a d -vector , so their pro du ct is j us t a d -vector , l ike w itself. We denote t h is second learned vector as v: v ⇡ E ⇥ x t x > t ⇤ 1 E[⇢ t t x t ]

If you want to change selection, open document below and click on "Move attachment"

pdf

cannot see any pdfs

Summary

status	not read	reprioritisations
last reprioritisation on		suggested re-reading day
started reading on		finished reading on

Details

Discussion

Do you want to join discussion? Click here to log in or create user.