User:Minima2014

Stochastic gradient descent is called "stochastic" because each update evaluates only a randomly drawn portion (sample) of the inputs rather than all of them. The gradient computed from this sample is only a noisy estimate of the gradient over the full input set, so a single weight update will NOT necessarily move us towards the minimum of the cost.
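
To make this concrete, here is a minimal sketch in Python. The data, the one-weight model y ≈ w * x, the squared-error cost, and the names `xs`, `ys`, `gradient`, and `wrong_way` are all made up for illustration and do not come from any particular library. It compares the gradient over all inputs with the gradient from each single input at the same weight.

```python
import random

# Toy data (made up for illustration): fit y ≈ w * x with squared-error cost.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.5, 3.5, 6.5, 7.5]

def gradient(w, indices):
    """Gradient of the mean squared error over the chosen examples."""
    return sum(2 * xs[i] * (w * xs[i] - ys[i]) for i in indices) / len(indices)

w = 1.9  # current weight; the least-squares optimum is 59/30, about 1.967

full = gradient(w, range(len(xs)))                        # uses every input
per_sample = [gradient(w, [i]) for i in range(len(xs))]   # one input each

# A sample whose gradient has the opposite sign from the full gradient
# would push w away from the minimum on that step.
wrong_way = [i for i, g in enumerate(per_sample) if g * full < 0]

# What one SGD step would see: the gradient from one randomly drawn input.
i = random.randrange(len(xs))

print(f"full gradient: {full:.2f}")
print(f"per-sample gradients: {[round(g, 2) for g in per_sample]}")
print(f"samples pointing the wrong way: {wrong_way}")
print(f"random draw i={i}: gradient {per_sample[i]:.2f}")
```

With this toy data, two of the four single-input gradients have the opposite sign from the full gradient, so a step based on either of them alone would move w away from the minimum. The single-input gradients still average to the full gradient, which is why many small noisy steps can make progress even though any one of them may not.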