Understanding Representation and Learning Dynamics of Neural Networks in Function Space (2019-present). Starting from first principles, we show how each individual neuron contributes to the function represented by an NN. This involves Radon basis expansions and function-space theory. Furthermore, we show that an NN trained via gradient descent effectively fits a spline to the training data under a certain implicit regularizer. We use this theory to understand how different NN architectures approximate their target functions of interest.
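As a rough illustration of the spline view, the toy sketch below trains a small 1D ReLU network by gradient descent and checks that the learned input-output map is piecewise linear (a linear spline). The data, architecture, and weight-decay regularizer here are hypothetical stand-ins for illustration, not the setup from our papers.

```python
# Minimal sketch: a 1D ReLU network trained by gradient descent represents a
# continuous piecewise-linear function (a linear spline) of its input.
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 50).unsqueeze(1)           # 1D training inputs
y = torch.sin(3 * x) + 0.05 * torch.randn_like(x)    # noisy target function

net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)
# weight_decay plays the role of a (here explicit) regularizer on the weights
opt = torch.optim.SGD(net.parameters(), lr=1e-2, weight_decay=1e-4)

for step in range(5000):
    opt.zero_grad()
    loss = torch.mean((net(x) - y) ** 2)
    loss.backward()
    opt.step()

# The learned map is piecewise linear: its second finite difference vanishes
# everywhere except at the "knots" induced by the ReLU units.
grid = torch.linspace(-1, 1, 1000).unsqueeze(1)
with torch.no_grad():
    f = net(grid).squeeze()
second_diff = f[2:] - 2 * f[1:-1] + f[:-2]
print("fraction of near-zero second differences:",
      (second_diff.abs() < 1e-5).float().mean().item())
```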
Applying Function Space Theory of Neural Nets to Application Domains in Physics, Neuroscience, & Medicine (2019-present). We work with quantum physicists on approximating many-body wave functions; with biophysicists on simulating protein folding and on Raman scattering-based chemical recognition; with neuroscientists on understanding the senses of smell and vision in the mammalian brain; and with computer scientists and applied mathematicians on developing faster, smarter algorithms.
Artificial Neuroscience on Large-scale Trained Neural Nets. Using state-of-the-art trained networks for object recognition and character-level language modeling, we systematically probe these architectures to elucidate the low-level mechanisms by which they accomplish their tasks. A key element of this approach is the strong interaction between theory and (in silico) experiments.
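The sketch below shows the flavor of such an in-silico probing experiment: recording the responses of individual units in a pretrained object-recognition network via forward hooks. The choice of model (a torchvision ResNet-18), layer, and random stimulus is purely illustrative and not the specific networks or probes used in our studies.

```python
# Minimal "artificial neuroscience" probe: record single-unit responses
# from an intermediate layer of a pretrained object-recognition network.
import torch
import torchvision

# Downloads ImageNet-pretrained weights on first use.
model = torchvision.models.resnet18(weights="IMAGENET1K_V1").eval()

activations = {}
def record(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Probe the units of one intermediate layer, analogous to recording from a
# population of neurons.
model.layer3.register_forward_hook(record("layer3"))

stimulus = torch.randn(1, 3, 224, 224)   # stand-in for a real image batch
with torch.no_grad():
    model(stimulus)

resp = activations["layer3"]             # shape: (1, channels, H, W)
# Simple "tuning" summary: mean response of each channel to this stimulus.
print(resp.mean(dim=(0, 2, 3))[:10])
```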
Probabilistic Framework for Deep Learning. The recent success of deep learning systems is impressive: they now routinely yield pattern recognition systems with near- or super-human capabilities. But a fundamental question remains: why do they work? Intuitions abound, but a coherent framework for understanding, analyzing, and synthesizing deep learning architectures has remained elusive. We address this question by developing a new probabilistic framework for deep learning based on the Deep Rendering Model: a generative probabilistic model that explicitly captures variation due to latent nuisance variables. The graphical structure of the model enables it to be learned from data using the classical EM algorithm. Furthermore, by relaxing the generative model to a discriminative one, we recover two of the current leading deep learning systems: deep convolutional networks (DCNs) and random decision forests (RDFs). Using this model, we develop insights into their successes and shortcomings as well as a principled route to their improvement. Please check out our paper for more details.
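To give a feel for the generative-to-discriminative connection, the toy sketch below shows how max-marginalizing over latent nuisances in a rendering-style layer reduces to familiar DCN operations: template matching (convolution), a max over a latent translation (max-pooling), and a max over an on/off switch (ReLU). The tensor shapes and random templates are hypothetical; this is not the full Deep Rendering Model from the paper.

```python
# Minimal sketch: inference in one rendering-style layer, where
# max-marginalization over nuisance variables yields conv / max-pool / ReLU.
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)            # input image (batch, channels, H, W)
templates = torch.randn(16, 3, 5, 5)      # feature/class templates

# Template matching: a score for each template at each position.
scores = F.conv2d(x, templates, padding=2)

# Max-marginalize over a latent translation within a 2x2 neighborhood.
scores = F.max_pool2d(scores, kernel_size=2)

# Max-marginalize over the on/off (present/absent) switch, which reduces to
# a ReLU when the "absent" score is zero.
scores = F.relu(scores)

print(scores.shape)   # torch.Size([1, 16, 16, 16]): one map per template
```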
Cortically-Inspired Network Architectures for Vision & Reverse-Engineering Neural Plasticity Rules. Working with neuroscientists, we are reverse-engineering coarse-grained architectural motifs and the myriad neural plasticity rules that have substantial empirical support. We plan to use these architectures and modules in a deep network to learn to solve hard perceptual tasks such as action recognition from video.
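As one concrete, well-known example of a local plasticity rule, the toy sketch below simulates Oja's Hebbian rule, which drives a single linear neuron's weights toward the first principal component of its inputs. The covariance, learning rate, and single-neuron setup are illustrative only; the rules we study are drawn from the experimental literature and need not take this exact form.

```python
# Toy sketch of a local plasticity rule: Oja's Hebbian rule for one neuron.
import numpy as np

rng = np.random.default_rng(0)

# Correlated 2D inputs whose principal axis is along (1, 1)/sqrt(2).
cov = np.array([[2.0, 1.5], [1.5, 2.0]])
X = rng.multivariate_normal(mean=[0, 0], cov=cov, size=5000)

w = rng.normal(size=2)
eta = 1e-3
for x in X:
    y = w @ x                      # postsynaptic activity
    w += eta * y * (x - y * w)     # Hebbian term plus weight decay (Oja)

# Up to sign, w should align with the top eigenvector of the covariance.
w_unit = w / np.linalg.norm(w)
eigvals, eigvecs = np.linalg.eigh(cov)
print("learned direction:", w_unit)
print("top eigenvector:  ", eigvecs[:, -1])
```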
Event-driven Representations for RNNs. Inspired by the retina and cortex, we are developing a new class of representations and RNNs that learn events, defined as meaningful changes in the inputs. Event-driven representations have many computational and representational benefits, including higher throughput, lower latency, and sparsity.
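A simple way to convey the idea is a send-on-delta encoder, sketched below: an event is emitted only when the input changes by more than a threshold relative to the last transmitted value, in the spirit of the retina and event cameras. The threshold, test signal, and encoder are hypothetical illustrations, not our RNN architecture.

```python
# Minimal sketch of an event-driven (send-on-delta) encoding of a dense signal.
import numpy as np

def to_events(signal, threshold=0.1):
    """Return (time index, signed change) events for meaningful changes."""
    events = []
    reference = signal[0]
    for t, value in enumerate(signal[1:], start=1):
        delta = value - reference
        if abs(delta) >= threshold:
            events.append((t, np.sign(delta)))
            reference = value
    return events

t = np.linspace(0, 1, 200)
signal = np.sin(2 * np.pi * 2 * t)        # smooth input: mostly redundant samples
events = to_events(signal, threshold=0.2)

print(f"{len(signal)} dense samples -> {len(events)} sparse events")
```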