Understanding generalization in deep learning is an important but challenging problem. We establish generalization error bounds for stochastic gradient descent (SGD) by characterizing its algorithmic stability in terms of the population risk at initialization and from an information-theoretic perspective.