Probabilistic Framework for Deep Learning. The recent success of deep learning systems is impressive — they now routinely yield pattern recognition systems with near- or super-human capabilities — but a fundamental question remains: why do they work? Intuitions abound, but a coherent framework for understanding, analyzing, and synthesizing deep learning architectures has remained elusive. We answer this question by developing a new probabilistic framework for deep learning based on the Deep Rendering Model (DRM): a generative probabilistic model that explicitly captures variation due to latent nuisance variables. The graphical structure of the model enables it to be learned from data using the classical Expectation-Maximization (EM) algorithm. Furthermore, by relaxing the generative model to a discriminative one, we can recover two of the current leading deep learning systems: deep convolutional neural networks (DCNs) and random decision forests (RDFs). Using this model, we develop insights into their successes and shortcomings as well as a principled route to their improvement. Please check out our paper for more details.
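The EM alternation referred to above can be illustrated on a much simpler latent-variable model. The sketch below is not the DRM itself, just a minimal one-dimensional two-component Gaussian mixture showing the E-step/M-step structure; all names and initialization choices here are our own illustrative assumptions.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Minimal EM for a 1-D two-component Gaussian mixture.

    Illustrative only: the Deep Rendering Model has a deep hierarchy of
    latent nuisance variables, but the E-step/M-step alternation that
    learns it is the same principle shown here.
    """
    # Simple deterministic initialization: means at the data extremes.
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        lik = pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
              / np.sqrt(2 * np.pi * var)
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the soft assignments.
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return mu, var, pi
```

On well-separated data the estimated means converge to the true cluster centers; the DRM's EM updates generalize this alternation to a hierarchy of rendering variables.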
IARPA MICrONS Project (BCM+Rice, 2015-present, IARPA, co-PI). We have developed a probabilistic theory to explain the successes and shortcomings of modern deep learning architectures. The key element is a new generative model – the Deep Rendering Model (DRM) – that explicitly models variation due to latent nuisance factors over multiple levels of abstraction. We aim to show a direct mapping from the DRM to neurally plausible architectures and learning algorithms, with a focus on empirically testable predictions.
Cortically-Inspired Network Architectures for Vision & Reverse-Engineering Neural Plasticity Rules. Working with neuroscientists, we are reverse-engineering coarse-grained cortical architectural motifs and the myriad neural plasticity rules that have substantial empirical support. We plan to use these architectures and modules in a deep network in order to learn to solve hard perceptual tasks such as action recognition from video.
Artificial Neuroscience on Large-Scale Trained Neural Nets. Using state-of-the-art trained networks for object recognition and character-level language modeling, we are systematically probing these architectures to elucidate the low-level mechanisms by which they accomplish their tasks. A key element of this approach is the strong interaction between theory and (in silico) experiments.
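As a toy illustration of this kind of probing, the sketch below runs a small random ReLU network (a hypothetical stand-in, not one of the trained models above) and records every layer's activations, the in-silico analogue of an electrode recording, from which statistics such as sparsity or selectivity could then be computed.

```python
import numpy as np

def probe_activations(x, weights):
    """Run a tiny feed-forward ReLU net and record each layer's activations.

    Hypothetical sketch of the 'artificial neuroscience' workflow: in
    practice one instruments a large trained network (e.g. with framework
    hooks) rather than this toy stack of random weight matrices.
    """
    activations = [x]  # include the input as layer 0
    h = x
    for w in weights:
        h = np.maximum(h @ w, 0.0)  # ReLU layer
        activations.append(h)       # recorded probe for later analysis
    return activations
```

With the activations in hand, one can measure per-unit firing statistics across a stimulus set, mirroring how neuroscientists characterize biological neurons.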
Event-driven Representations for RNNs. Inspired by the retina and cortex, we are developing a new class of representations and RNNs that learn events, defined as meaningful changes in the inputs. Event-driven representations have many computational and representational benefits: higher throughput, lower latency, sparsity, etc.
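One simple event-driven scheme, sketched below for illustration, is send-on-delta encoding: an event is emitted only when the input has changed by more than a threshold since the last event. The scheme and threshold here are our own illustrative choices, not the lab's specific method, but they show where the sparsity and throughput benefits come from.

```python
def send_on_delta(signal, threshold):
    """Event-driven encoding of a sampled signal.

    Emit an (index, value) event only when the input changes by more than
    `threshold` since the last emitted event. Between events the signal is
    assumed unchanged, so a slowly varying input yields a sparse stream.
    """
    events = [(0, signal[0])]  # always emit the initial value
    last = signal[0]
    for i, v in enumerate(signal[1:], start=1):
        if abs(v - last) > threshold:
            events.append((i, v))
            last = v
    return events
```

A constant input produces a single event regardless of its length, which is the computational benefit: downstream recurrent processing only runs when something meaningful changes.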
Medical Event Prediction. Using our theoretical results and recent advances in deep learning, we are designing new neural network architectures that can learn to predict the risk of different kinds of catastrophic medical events (e.g., heart attack). This is a collaboration with Craig Rusin’s lab at BCM, Texas Children’s Hospital, and Medical Informatics Corp.
Deep Learning for High-Energy Particle Physics. Joint work with Paul Padley, Kuver Sinha, and Jamal Rorie. The Large Hadron Collider (LHC) generates torrents of particle collision event data every day. In order to test for New Physics in these data, we must use high-capacity machine learning models such as decision forests or neural networks. In collaboration with particle physicists at Rice and Syracuse, the Patel Lab is using the Theory of Deep Learning to shed light on why current deep architectures work and how they can be improved, in the hopes of accelerating and enabling the search for New Physics.