-
Measuring Systematic Generalization in Neural Proof Generation with Transformers
Authors:
Nicolas Gontier,
Koustuv Sinha,
Siva Reddy,
Christopher Pal
Abstract:
We are interested in understanding how well Transformer language models (TLMs) can perform reasoning tasks when trained on knowledge encoded in the form of natural language. We investigate their systematic generalization abilities on a logical reasoning task in natural language, which involves reasoning over relationships between entities grounded in first-order logical proofs. Specifically, we perform soft theorem-proving by leveraging TLMs to generate natural language proofs. We test the generated proofs for logical consistency, along with the accuracy of the final inference. We observe length-generalization issues when evaluated on longer-than-trained sequences. However, we observe TLMs improve their generalization performance after being exposed to longer, exhaustive proofs. In addition, we discover that TLMs are able to generalize better using backward-chaining proofs compared to their forward-chaining counterparts, while they find it easier to generate forward chaining proofs. We observe that models that are not trained to generate proofs are better at generalizing to problems based on longer proofs. This suggests that Transformers have efficient internal reasoning strategies that are harder to interpret. These results highlight the systematic generalization behavior of TLMs in the context of logical reasoning, and we believe this work motivates deeper inspection of their underlying reasoning strategies.
Submitted 20 October, 2020; v1 submitted 30 September, 2020;
originally announced September 2020.
-
Neural Neighborhood Encoding for Classification
Authors:
Kaushik Sinha,
Parikshit Ram
Abstract:
Inspired by the fruit-fly olfactory circuit, the Fly Bloom Filter [Dasgupta et al., 2018] is able to efficiently summarize the data with a single pass and has been used for novelty detection. We propose a new classifier (for binary and multi-class classification) that effectively encodes the different local neighborhoods for each class with a per-class Fly Bloom Filter. The inference on test data requires an efficient FlyHash [Dasgupta et al., 2017] operation followed by a high-dimensional, but sparse, dot product with the per-class Bloom Filters. The learning is trivially parallelizable. On the theoretical side, we establish conditions under which the prediction of our proposed classifier on any test example agrees with the prediction of the nearest neighbor classifier with high probability. We extensively evaluate our proposed scheme with over 50 data sets of varied data dimensionality to demonstrate that the predictive performance of our proposed neuroscience-inspired classifier is competitive with the nearest-neighbor classifiers and other single-pass classifiers.
Submitted 19 August, 2020;
originally announced August 2020.
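The inference step described in this abstract (a FlyHash followed by a sparse dot product with per-class Bloom filters) can be illustrated with a small sketch. The details here are assumptions made for illustration and are not taken from the paper: a sparse binary random projection with winner-take-all as the hash, a binary filter that zeroes every bit a training example activates, and a prediction rule that picks the class whose filter overlaps least with the test hash. All function names are hypothetical.

```python
import random


def make_projection(dim, m, s, rng):
    """Sparse binary projection: each of the m output units reads s input coordinates."""
    return [rng.sample(range(dim), s) for _ in range(m)]


def fly_hash(x, projection, k):
    """Winner-take-all hash: indices of the k output units with the largest activation."""
    activations = [sum(x[i] for i in idx) for idx in projection]
    order = sorted(range(len(activations)), key=lambda j: activations[j], reverse=True)
    return set(order[:k])


def train(points, labels, projection, k):
    """One pass over the data: each class filter starts at all ones and
    every bit activated by an example of that class is set to zero (assumed rule)."""
    m = len(projection)
    filters = {c: [1.0] * m for c in set(labels)}
    for x, c in zip(points, labels):
        for j in fly_hash(x, projection, k):
            filters[c][j] = 0.0
    return filters


def predict(x, filters, projection, k):
    """Sparse dot product of the hash with each class filter; the lowest score wins."""
    active = fly_hash(x, projection, k)
    scores = {c: sum(w[j] for j in active) for c, w in filters.items()}
    return min(scores, key=scores.get)
```

A projection would typically be drawn once, for example `make_projection(dim=20, m=400, s=4, rng=random.Random(0))`, and reused for both `train` and `predict`. Because each class filter is updated independently, the training pass can be split across classes or data shards, which is consistent with the abstract's remark that learning is trivially parallelizable.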
-
Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program)
Authors:
Joelle Pineau,
Philippe Vincent-Lamarre,
Koustuv Sinha,
Vincent Larivière,
Alina Beygelzimer,
Florence d'Alché-Buc,
Emily Fox,
Hugo Larochelle
Abstract:
One of the challenges in machine learning research is to ensure that presented and published results are sound and reliable. Reproducibility, that is obtaining similar results as presented in a paper or talk, using the same code and data (when available), is a necessary step to verify the reliability of research findings. Reproducibility is also an important step to promote open and accessible research, thereby allowing the scientific community to quickly integrate new findings and convert ideas to practice. Reproducibility also promotes the use of robust experimental workflows, which potentially reduce unintentional errors. In 2019, the Neural Information Processing Systems (NeurIPS) conference, the premier international conference for research in machine learning, introduced a reproducibility program, designed to improve the standards across the community for how we conduct, communicate, and evaluate machine learning research. The program contained three components: a code submission policy, a community-wide reproducibility challenge, and the inclusion of the Machine Learning Reproducibility checklist as part of the paper submission process. In this paper, we describe each of these components, how it was deployed, as well as what we were able to learn from this initiative.
Submitted 30 December, 2020; v1 submitted 26 March, 2020;
originally announced March 2020.
-
Evaluating Logical Generalization in Graph Neural Networks
Authors:
Koustuv Sinha,
Shagun Sodhani,
Joelle Pineau,
William L. Hamilton
Abstract:
Recent research has highlighted the role of relational inductive biases in building learning agents that can generalize and reason in a compositional manner. However, while relational learning algorithms such as graph neural networks (GNNs) show promise, we do not understand how effectively these approaches can adapt to new tasks. In this work, we study the task of logical generalization using GNNs by designing a benchmark suite grounded in first-order logic. Our benchmark suite, GraphLog, requires that learning algorithms perform rule induction in different synthetic logics, represented as knowledge graphs. GraphLog consists of relation prediction tasks on 57 distinct logical domains. We use GraphLog to evaluate GNNs in three different setups: single-task supervised learning, multi-task pretraining, and continual learning. Unlike previous benchmarks, our approach allows us to precisely control the logical relationship between the different tasks. We find that the ability for models to generalize and adapt is strongly determined by the diversity of the logical rules they encounter during training, and our results highlight new challenges for the design of GNN models. We publicly release the dataset and code used to generate and interact with the dataset at https://www.cs.mcgill.ca/~ksinha4/graphlog.
Submitted 14 March, 2020;
originally announced March 2020.
-
New and Explicit Constructions of Unbalanced Ramanujan Bipartite Graphs
Authors:
Shantanu Prasad Burnwal,
Kaneenika Sinha,
Mathukumalli Vidyasagar
Abstract:
The objectives of this article are three-fold. Firstly, we present for the first time explicit constructions of an infinite family of unbalanced Ramanujan bigraphs. Secondly, we revisit some of the known methods for constructing Ramanujan graphs and discuss the computational work required in actually implementing the various construction methods. The third goal of this article is to address the following question: can we construct a bipartite Ramanujan graph with specified degrees, but with the restriction that the edge set of this graph must be distinct from a given set of "prohibited" edges? We provide an affirmative answer in many cases, as long as the set of prohibited edges is not too large.
Submitted 12 November, 2020; v1 submitted 8 October, 2019;
originally announced October 2019.
-
CLUTRR: A Diagnostic Benchmark for Inductive Reasoning from Text
Authors:
Koustuv Sinha,
Shagun Sodhani,
** Dong,
Joelle Pineau,
William L. Hamilton
Abstract:
The recent success of natural language understanding (NLU) systems has been troubled by results highlighting the failure of these models to generalize in a systematic and robust way. In this work, we introduce a diagnostic benchmark suite, named CLUTRR, to clarify some key issues related to the robustness and systematicity of NLU systems. Motivated by classic work on inductive logic programming, CLUTRR requires that an NLU system infer kinship relations between characters in short stories. Successful performance on this task requires both extracting relationships between entities, as well as inferring the logical rules governing these relationships. CLUTRR allows us to precisely measure a model's ability for systematic generalization by evaluating on held-out combinations of logical rules, and it allows us to evaluate a model's robustness by adding curated noise facts. Our empirical results highlight a substantial performance gap between state-of-the-art NLU models (e.g., BERT and MAC) and a graph neural network model that works directly with symbolic inputs, with the graph-based model exhibiting both stronger generalization and greater robustness.
Submitted 3 September, 2019; v1 submitted 16 August, 2019;
originally announced August 2019.
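The kind of rule composition this benchmark targets can be shown with a toy example. The sketch below is illustrative only: the rule table and the `resolve_chain` helper are hypothetical and are not drawn from the CLUTRR dataset or code. It reads a chain such as "A's mother is B, B's brother is C" and composes the relations step by step to infer how the last person is related to the first.

```python
# Hypothetical composition rules: (relation so far, next relation) -> combined relation.
# ("mother", "brother") means "the brother of my mother", i.e. an uncle.
RULES = {
    ("father", "father"): "grandfather",
    ("father", "mother"): "grandmother",
    ("mother", "father"): "grandfather",
    ("mother", "mother"): "grandmother",
    ("father", "brother"): "uncle",
    ("mother", "brother"): "uncle",
    ("father", "sister"): "aunt",
    ("mother", "sister"): "aunt",
    ("uncle", "daughter"): "cousin",
    ("aunt", "son"): "cousin",
}


def resolve_chain(chain):
    """Fold a list of kinship relations into a single relation, left to right."""
    if not chain:
        raise ValueError("chain must contain at least one relation")
    current = chain[0]
    for nxt in chain[1:]:
        key = (current, nxt)
        if key not in RULES:
            raise KeyError(f"no rule for composing {current!r} with {nxt!r}")
        current = RULES[key]
    return current
```

For instance, `resolve_chain(["mother", "brother", "daughter"])` first combines mother and brother into uncle, then uncle and daughter into cousin. Systematic generalization in the abstract's sense would correspond to answering correctly on chains whose rule combinations were never seen together during training.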
-
Adversarial Gain
Authors:
Peter Henderson,
Koustuv Sinha,
Rosemary Nan Ke,
Joelle Pineau
Abstract:
Adversarial examples can be defined as inputs to a model which induce a mistake - where the model output is different than that of an oracle, perhaps in surprising or malicious ways. Original models of adversarial attacks are primarily studied in the context of classification and computer vision tasks. While several attacks have been proposed in natural language processing (NLP) settings, they often vary in defining the parameters of an attack and what a successful attack would look like. The goal of this work is to propose a unifying model of adversarial examples suitable for NLP tasks in both generative and classification settings. We define the notion of adversarial gain: based in control theory, it is a measure of the change in the output of a system relative to the perturbation of the input (caused by the so-called adversary) presented to the learner. This definition, as we show, can be used under different feature spaces and distance conditions to determine attack or defense effectiveness across different intuitive manifolds. This notion of adversarial gain not only provides a useful way for evaluating adversaries and defenses, but can act as a building block for future work in robustness under adversaries due to its rooted nature in stability and manifold theory.
Submitted 3 November, 2018;
originally announced November 2018.
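The abstract characterizes adversarial gain as the change in a system's output relative to the perturbation of its input. A minimal sketch of that ratio follows, assuming Euclidean distance on both sides by default; the paper's exact definition, feature spaces, and distance functions may differ, and the function names are hypothetical.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adversarial_gain(model, x, x_adv, d_in=l2_distance, d_out=l2_distance):
    """Ratio of output change to input perturbation for one clean/perturbed pair.

    model: callable mapping an input vector to an output vector.
    d_in, d_out: distance functions on the input and output spaces (assumed choices).
    """
    perturbation = d_in(x, x_adv)
    if perturbation == 0:
        raise ValueError("x and x_adv are identical; the gain is undefined")
    return d_out(model(x), model(x_adv)) / perturbation
```

Passing different `d_in` and `d_out` functions (for example an edit distance on text inputs and a distance between output distributions) is one way to mirror the abstract's point that the measure can be used under different feature spaces and distance conditions.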
-
Estimation of firm-level productivity changes in the Indian power sector: Disentangling unobserved heterogeneity by a transformed fixed-effect stochastic frontier model
Authors:
Anish Sugathan,
Deepak Malghan,
S. Chandrashekar,
Deepak K. Sinha
Abstract:
We measure firm-level productivity changes in the Indian electricity sector during a period that witnessed several pro-market regulatory changes. Using information collected from multiple sources we construct a unique panel of generating firms and transmission and distribution utilities spanning the years 2000 to 2009. We employ a recently developed improvement in the Stochastic Frontier panel method that allows controlling for time-invariant unobserved heterogeneity. Using the method we jointly estimate inefficiency and exogenous determinants of inefficiency. We estimate a flexible translog production model and compute decomposition of productivity into components of changes in technology, efficiency, scale and price effect. During this period, especially post Electricity Act 2003, we observed a general decline in firm-level productivity at the mean rate of -1.6% per year. A positive and large technical change is observed in the sector at the rate of 8% per year, attributable possibly to newer capacity addition. Except for smaller gas based generators, inefficiency is observed to be increasing at the mean rate of 3.1% per year in the sector. Consistent with extant findings we also document no significant impact of un-bundling on firm-level efficiency.
Submitted 12 January, 2013;
originally announced January 2013.
-
Near-Optimal Algorithms for Differentially-Private Principal Components
Authors:
Kamalika Chaudhuri,
Anand D. Sarwate,
Kaushik Sinha
Abstract:
Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method.
Submitted 7 August, 2013; v1 submitted 11 July, 2012;
originally announced July 2012.