We are interested in understanding what end-to-end matrix $W$ emerges when we run GD on an LNN to minimize a general convex loss $L(W)$, and in particular the matrix completion loss given above. Note that $L(W)$ is convex, but the objective obtained by over-parameterizing with an LNN is not. We analyze the trajectories of $W$, and specifically the dynamics of its singular value decomposition. Denote the singular values by $\{\sigma_r\}_r$, and the corresponding left and right singular vectors by $\{\mathbf{u}_r\}_r$ and $\{\mathbf{v}_r\}_r$, respectively.
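As a minimal numerical sketch of this setup, the following simulates GD on a small depth-3 LNN (factors $W_3 W_2 W_1$ forming the end-to-end matrix $W$) against a toy matrix-completion loss, and inspects the singular values of $W$ at the end. The problem sizes, initialization scale, and step size here are illustrative assumptions, not values from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy matrix-completion instance (sizes are illustrative):
# rank-1 ground truth, roughly half the entries observed.
d = 5
W_star = np.outer(rng.standard_normal(d), rng.standard_normal(d))
mask = rng.random((d, d)) < 0.5

# Depth-3 LNN: the end-to-end matrix is W = W3 @ W2 @ W1, small init.
depth, lr, steps = 3, 0.01, 5000
Ws = [0.1 * rng.standard_normal((d, d)) for _ in range(depth)]

def end_to_end(factors):
    P = np.eye(d)
    for F in factors:
        P = F @ P  # accumulate the product F_k ... F_1
    return P

def loss(W):
    # Convex matrix-completion loss on the observed entries.
    return 0.5 * np.sum((mask * (W - W_star)) ** 2)

loss0 = loss(end_to_end(Ws))
for _ in range(steps):
    W = end_to_end(Ws)
    G = mask * (W - W_star)  # gradient of the convex loss w.r.t. W
    new_Ws = []
    for j in range(depth):
        below = end_to_end(Ws[:j])       # factors to the right of W_j
        above = end_to_end(Ws[j + 1:])   # factors to the left of W_j
        # Chain rule through the product: dL/dW_j = above^T G below^T
        new_Ws.append(Ws[j] - lr * above.T @ G @ below.T)
    Ws = new_Ws

sigma = np.linalg.svd(end_to_end(Ws), compute_uv=False)
print(sigma)  # the leading singular value tends to dominate the rest
```

Tracking `sigma` over the course of training (rather than only at the end, as here) is the discrete analogue of the singular-value dynamics analyzed in the text.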