Teaching Compositionality to CNNs

Authors: Austin Stone, Huayan Wang, Michael Stark, Yi Liu, D. Scott Phoenix, Dileep George

Abstract: Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions. In this paper, we propose a method for augmenting and training CNNs so that their learned features are compositional. It encourages networks to form representations that disentangle objects from their surroundings and from each other, thereby promoting better generalization. Our method is agnostic to the specific details of the underlying CNN to which it is applied and can in principle be used with any CNN. As we show in our experiments, the learned representations lead to feature activations that are more localized and improve performance over non-compositional baselines in object recognition tasks.

Submitted 14 June, 2017; originally announced June 2017.

Comments: Preprint appearing in CVPR 2017

arXiv:1611.02788 [pdf, other]

Generative Shape Models: Joint Text Recognition and Segmentation with Very Little Training Data

Authors: Xinghua Lou, Ken Kansky, Wolfgang Lehrach, CC Laan, Bhaskara Marthi, D. Scott Phoenix, Dileep George

Abstract: We demonstrate that a generative model for object shapes can achieve state of the art results on challenging scene text recognition tasks, and with orders of magnitude fewer training images than required for competing discriminative methods. In addition to transcribing text from challenging images, our method performs fine-grained instance segmentation of characters. We show that our model is more robust to both affine transformations and non-affine deformations compared to previous approaches.

Submitted 8 November, 2016; originally announced November 2016.

Journal ref: Advances in Neural Information Processing Systems 2016

arXiv:1611.02767 [pdf, other]

A backward pass through a CNN using a generative model of its activations

Authors: Huayan Wang, Anna Chen, Yi Liu, Dileep George, D. Scott Phoenix

Abstract: Neural networks have shown to be a practical way of building a very complex mapping between a pre-specified input space and output space. For example, a convolutional neural network (CNN) mapping an image into one of a thousand object labels is approaching human performance in this particular task. However the mapping (neural network) does not automatically lend itself to other forms of queries, for example, to detect/reconstruct object instances, to enforce top-down signal on ambiguous inputs, or to recover object instances from occlusion. One way to address these queries is a backward pass through the network that fuses top-down and bottom-up information. In this paper, we show a way of building such a backward pass by defining a generative model of the neural network's activations. Approximate inference of the model would naturally take the form of a backward pass through the CNN layers, and it addresses the aforementioned queries in a unified framework.

Submitted 8 November, 2016; originally announced November 2016.

arXiv:1611.02252 [pdf, other]

Hierarchical compositional feature learning

Authors: Miguel Lázaro-Gredilla, Yi Liu, D. Scott Phoenix, Dileep George

Abstract: We introduce the hierarchical compositional network (HCN), a directed generative model able to discover and disentangle, without supervision, the building blocks of a set of binary images. The building blocks are binary features defined hierarchically as a composition of some of the features in the layer immediately below, arranged in a particular manner. At a high level, HCN is similar to a sigmoid belief network with pooling. Inference and learning in HCN are very challenging and existing variational approximations do not work satisfactorily. A main contribution of this work is to show that both can be addressed using max-product message passing (MPMP) with a particular schedule (no EM required). Also, using MPMP as an inference engine for HCN makes new tasks simple: adding supervision information, classifying images, or performing inpainting all correspond to clamping some variables of the model to their known values and running MPMP on the rest. When used for classification, fast inference with HCN has exactly the same functional form as a convolutional neural network (CNN) with linear activations and binary weights. However, HCN's features are qualitatively very different.

Submitted 25 October, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

Comments: Removed the "under review" header from every page, no changes to content

Showing 1–4 of 4 results for author: Phoenix, D S