BuboFlash - helps with learning


#computer-science #machine-learning #reinforcement-learning

If an objective cannot be learned, it does indeed draw its utility into question. In the case of the VE, however, there is a way out. Note that the same solution, w = 1, is optimal for both MRPs above (assuming µ is the same for the two indistinguishable states in the right MRP). Is this a coincidence, or could it be generally true that all MDPs with the same data distribution also have the same optimal parameter vector? If this is true (and we will show next that it is), then the VE remains a usable objective. The VE is not learnable, but the parameter that optimizes it is!
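The claim can be checked numerically with a minimal sketch. The two MRPs are not reproduced in this excerpt, so the setup below is an assumption: γ = 0, a single constant feature so that every state's approximate value is just w, a left MRP with one state of true value 1, and a right MRP with two indistinguishable states of true values 0 and 2, weighted equally by µ. The names `value_error` and `best_w` are hypothetical helpers, not taken from the source.

```python
# Assumed setup (not given in the excerpt): gamma = 0 and one constant
# feature x(s) = 1, so the approximate value of every state equals w.

def value_error(w, true_values, mu):
    """VE(w) = sum over states of mu(s) * (v(s) - w)^2."""
    return sum(m * (v - w) ** 2 for v, m in zip(true_values, mu))

def best_w(true_values, mu):
    """Minimizer of VE for a constant approximation: the mu-weighted mean of v."""
    return sum(m * v for v, m in zip(true_values, mu)) / sum(mu)

# Hypothetical MRPs chosen to match the text's statement that w = 1 is optimal for both.
left = {"true_values": [1.0], "mu": [1.0]}             # one state, value 1
right = {"true_values": [0.0, 2.0], "mu": [0.5, 0.5]}  # two states, equal mu

for name, mrp in (("left", left), ("right", right)):
    w_star = best_w(**mrp)
    print(name, "w* =", w_star, "VE(w*) =", value_error(w_star, **mrp))
```

Under these assumptions both MRPs give w* = 1, while the VE at that point is 0 for the left MRP and 1 for the right one. The value of the objective differs between the two, but its minimizer does not, which is the sense in which the optimizing parameter can be learnable even though the VE itself is not.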
