Title | Journal | Journal Categories | Citations | Publication Date |
---|---|---|---|---|
Optimal distributed online prediction using mini-batches | | | | 2012 |
Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude | | | | 2012 |
Adaptive subgradient methods for online learning and stochastic optimization | | | | 2011 |
A method of solving a convex programming problem with convergence rate $O(\frac{1}{k^{2}})$ | | | | 1983 |
Analytic insights into structure and rank of neural network hessian maps | | | | 2021 |