In conjunction with the 7th Asian Conference on Machine Learning (ACML 2015)


The world has witnessed the resurgence of neural networks, or as it is now called, "Deep Learning", in recent years. This time, they came back not as promising methods that work on toy data, but as the driving force in the revolution of many application domains, starting with automatic speech recognition, then computer vision, and now natural language processing and perhaps symbolic artificial intelligence.

Deep learning has gone beyond just deep neural networks or particular approaches to classification and regression. It now stands for a new way of thinking about many learning problems. Indeed, we now have an end-to-end learning paradigm that directly connects vision and robot manipulation, and systems that aim to reason over complicated natural language facts, tasks which previously required heavy feature engineering and cumbersome models.

This workshop aims to bring together researchers in deep learning, particularly in Asia, to share their research, viewpoints, vision, and, perhaps more importantly, their questions and confusions. We are here to discuss not only how deep learning has reshaped machine learning, but also how to reshape deep learning--it is, after all, still young and, as people argued more than 20 years ago, laden with great expectations.

Invited Speakers

  •  Ruslan Salakhutdinov. University of Toronto.
  • TITLE: Deep Multimodal Learning

    ABSTRACT: I will describe a class of statistical models that are capable of extracting a unified representation that fuses together multiple data modalities. In particular, inspired by recent advances in machine translation, I will introduce an encoder-decoder model that learns a multimodal joint embedding space of images and text. The encoder can be used to rank images and sentences, while the decoder can generate novel descriptions of images from scratch. I will further describe a novel approach to unsupervised learning of a generic, distributed sentence encoder and show that on several tasks, including semantic relatedness, paraphrase detection, and image-sentence ranking, these models improve upon many of the existing techniques. Finally, I will introduce a model that can generate images of natural scenes from natural language descriptions by combining recurrent variational autoencoders with an attention mechanism over words.
  •  Wei Xu. Baidu.
  • TITLE: Deep learning for machine perception and language understanding: a progress report from Baidu

    ABSTRACT: Over the past few years, we have successfully applied deep learning to a wide variety of problems at Baidu, such as web search, online advertising, speech recognition, computer vision, natural language processing, smart data centers, and robotics. In this talk, I will report our recent progress in machine perception and language understanding. Finally, I will discuss some thoughts on the future directions of artificial intelligence.
  •  Xiaogang Wang. Chinese University of Hong Kong.
  • TITLE: DeepID: Deep Learning for Face Recognition

    ABSTRACT: In this talk, I will present our recent work on deep learning for face recognition. With a novel deep model and a moderate training set of 400,000 face images, 99.47% accuracy has been achieved on LFW, the most challenging and extensively studied face recognition dataset. Deep learning provides a powerful tool to separate intra-personal and inter-personal variations, whose distributions are complex and highly nonlinear, through hierarchical feature transforms. It is essential to learn effective face representations by using two supervisory signals simultaneously, i.e., the face identification and verification signals. Some people understand the success of deep learning as using a complex model with many parameters to fit a dataset. To clarify this misunderstanding, we further investigate the face recognition process in deep nets: what information is encoded in neurons, and how robust they are to data corruption. We discovered several interesting properties of deep nets, including sparseness, selectiveness, and robustness. In multi-view perception, a hybrid deep model is proposed to simultaneously accomplish the tasks of face recognition, pose estimation, and face reconstruction. It employs deterministic and random neurons to encode identity and pose information, respectively. Given a face image taken from an arbitrary view, it can untangle the identity and view features, while reconstructing the full spectrum of multi-view images of the same identity. It is also capable of interpolating and predicting images under viewpoints unobserved in the training data.
  •  Zheng Zhang. Shanghai New York University.
  •  Jun Zhu. Tsinghua University.
  • TITLE: Fast and Discriminative Training of Deep Generative Models

    ABSTRACT: Deep generative models are flexible tools for revealing the latent structures underlying complex data and performing top-down inference to generate observations. However, their bottom-up recognition ability is often not fully explored: a maximum-likelihood estimate often yields lower prediction accuracy than discriminative methods. In this talk, I will introduce max-margin learning for deep generative models under the general framework of regularized Bayesian inference (RegBayes), where a posterior regularization term is defined to encourage a large-margin separation between true categories and alternatives, without sacrificing the generative capability. I will also discuss how to perform efficient Bayesian inference for deep generative models by developing a doubly stochastic gradient MCMC algorithm with a neural adaptive importance sampler to estimate the intractable gradient.
  •  Zhengdong Lu. Noah's Ark Lab, Huawei.

Program (Nov 20, 2015)

09:00-09:05 Opening

09:05-10:00 Keynote talk by Prof. Ruslan Salakhutdinov

10:00-10:30 Invited talk by Dr. Wei Xu

10:30-11:00 Coffee break

11:00-11:30 Invited talk by Prof. Xiaogang Wang

11:30-12:00 Invited talk by Prof. Zheng Zhang

12:00-12:30 Invited talk by Dr. Zhengdong Lu

12:30-14:30 Lunch break

14:30-15:00 Invited talk by Prof. Jun Zhu

15:00-16:00 Top conference session

16:00-16:30 Coffee break

16:30-17:30 Panel discussion


Organizers

Zhengdong Lu, Noah's Ark Lab, Huawei, Hong Kong

Zheng Zhang, NYU Shanghai, China

Shuicheng Yan, National University of Singapore, Singapore

Contact Us