This talk is canceled.
The talk will be divided into two parts.
In the first part, I will talk about a new class of algorithms inspired by Adam (Kingma and Ba, 2014) that is well suited to nonconvex optimization problems with many saddle points. Compared to Adam, the new algorithm uses only one bit per dimension to represent the gradient, yet enjoys the same rate of convergence to a stationary point as SGD; moreover, experiments show that SignSGD performs comparably to SGD, and Signum (SIGn momentUM) significantly outperforms SGD and Adam on both MNIST and CIFAR-10. I will also talk about the ideas behind the proof, which amount to a general recipe for proving convergence of signed updates.
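The updates described above are simple to state: SignSGD replaces each gradient coordinate with its sign, and Signum applies the sign to a momentum (exponential moving average) of the gradients instead. Below is a minimal NumPy sketch of both update rules; the function names and the toy quadratic objective are my own illustration, not code from the talk.

```python
import numpy as np

def signsgd_step(w, grad, lr):
    # SignSGD: use only the sign of each gradient coordinate,
    # i.e. one bit of gradient information per dimension.
    return w - lr * np.sign(grad)

def signum_step(w, m, grad, lr, beta=0.9):
    # Signum: keep a momentum buffer m (an exponential moving
    # average of gradients) and step in the direction of its sign.
    m = beta * m + (1 - beta) * grad
    return w - lr * np.sign(m), m

# Toy example: minimize f(w) = ||w||^2 / 2, whose gradient is w itself.
w = np.array([3.0, -2.0])
for _ in range(100):
    w = signsgd_step(w, grad=w, lr=0.05)
# Each coordinate marches toward 0 in fixed steps of lr,
# then oscillates within one step size of the optimum.
```

Note the characteristic behavior of signed updates: the step size is decoupled from the gradient magnitude, so the iterates approach the stationary point at a constant rate and then hover within one learning-rate of it, which is why the analysis requires a decaying or carefully tuned learning rate.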
In the second part of the talk, I will discuss a new API of Apache MXNet (a CMU-born open-source deep learning framework) called Gluon. As I will illustrate, Gluon makes it extremely easy for people (like myself) who are more comfortable with high-level languages like MATLAB/R/NumPy to learn, experiment with, and apply modern neural network models. In particular, we only need to write NumPy-like code for the forward pass of the model, and that's about it. The underlying computation runs on a highly optimized C++ library, and we can choose between CPU and GPU with a single line of code. Gluon even takes care of differentiation automatically and can infer the correct dimensionality of the input data, which leaves little room for coding error. The bottom line is that Gluon makes deep learning more accessible to newcomers and helps researchers get rid of unnecessary technical hurdles, so that we can focus on the scientific aspects of deep learning research.
Bio: Dr. Yu-Xiang Wang is a scientist with Amazon AI in Palo Alto. He works with Alex Smola, whose team strives to make high-quality AI and ML technologies accessible to all developers through open-source code and AWS. Prior to joining Amazon, he was with the Machine Learning Department at Carnegie Mellon University. His research interests include trend filtering, differential privacy, subspace clustering, large-scale learning and optimization, and, more recently, sequential interactive learning problems. Dr. Wang will join CS@UCSB in Fall 2018.