After the success of CNNs in ILSVRC 2012 (Krizhevsky et al. (2012)), initialization with zero-mean Gaussian noise with standard deviation 0.01, together with a bias of one for some layers, became very popular.
Why is it not possible to train very deep networks from scratch with this initialization?
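A minimal sketch (not from the source) of what goes wrong: with weights drawn from a zero-mean Gaussian with standard deviation 0.01, the activation magnitudes shrink layer after layer, so in very deep networks the forward signal (and hence the gradient) effectively vanishes and training stalls. The layer width, depth, and ReLU nonlinearity below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth = 512, 30                # hypothetical layer width and depth
x = rng.standard_normal((width, 64))  # a batch of 64 random input vectors

for layer in range(1, depth + 1):
    W = rng.normal(0.0, 0.01, size=(width, width))  # Gaussian init, std = 0.01
    x = np.maximum(W @ x, 0.0)                      # linear layer + ReLU
    if layer % 5 == 0:
        print(f"layer {layer:2d}: activation std = {x.std():.3e}")

# The printed standard deviation collapses toward zero with depth, which is
# why very deep networks cannot be trained from scratch with this scheme
# without tricks such as careful pre-training or variance-preserving init.
```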