To minimize the sum of squared prediction mistakes, take its derivative with respect to $m$ and set it to zero:

$$\frac{d}{dm}\sum_{i=1}^{n}(Y_i - m)^2 = -2\sum_{i=1}^{n}(Y_i - m) = -2\sum_{i=1}^{n}Y_i + 2nm = 0. \tag{3.27}$$

Solving the final equation for $m$ shows that $\sum_{i=1}^{n}(Y_i - m)^2$ is minimized when $m = \bar{Y}$.
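The result can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `sum_squared_errors` and `least_squares_estimate` and the three data values are illustrative assumptions. It solves the first-order condition for m, which gives the sample mean, and then confirms that moving m away from that value in either direction increases the sum of squared prediction mistakes.

```python
def sum_squared_errors(ys, m):
    """Sum of squared prediction mistakes when m is used to predict each Y_i."""
    return sum((y - m) ** 2 for y in ys)


def least_squares_estimate(ys):
    """Solve the first-order condition -2 * sum(Y_i) + 2 * n * m = 0 for m."""
    return sum(ys) / len(ys)


# Illustrative data (an assumption for this sketch, not taken from the text).
ys = [2.0, 4.0, 9.0]

# The solution of the first-order condition is the sample mean.
m_hat = least_squares_estimate(ys)

# Nudging m away from the sample mean in either direction increases the sum.
for delta in (-1.0, -0.1, 0.1, 1.0):
    assert sum_squared_errors(ys, m_hat + delta) > sum_squared_errors(ys, m_hat)
```

Because the sum of squared mistakes is a quadratic in m with a positive coefficient on the squared term, the point where the derivative is zero is a minimum rather than a maximum, which is what the loop above illustrates for these particular values.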
