Search | arXiv e-print repository

arXiv:2004.04926 [pdf, ps, other]

Tensor Decompositions for temporal knowledge base completion

Authors: Timothée Lacroix, Guillaume Obozinski, Nicolas Usunier

Abstract: Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that… ▽ More Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries such as (US, has president, ?, 2012), we propose a solution inspired by the canonical decomposition of tensors of order 4. We introduce new regularization schemes and present an extension of ComplEx (Trouillon et al., 2016) that achieves state-of-the-art performance. Additionally, we propose a new dataset for knowledge base completion constructed from Wikidata, larger than previous benchmarks by an order of magnitude, as a new reference for evaluating temporal and non-temporal link prediction methods. △ Less

Submitted 10 April, 2020; originally announced April 2020.

arXiv:2001.00006 [pdf, ps, other]

Learning from Learning Machines: Optimisation, Rules, and Social Norms

Authors: Travis LaCroix, Yoshua Bengio

Abstract: There is an analogy between machine learning systems and economic entities in that they are both adaptive, and their behaviour is specified in a more-or-less explicit way. It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making, but it is an open question as to how precisely moral behaviour can be achieved in an AI system.… ▽ More There is an analogy between machine learning systems and economic entities in that they are both adaptive, and their behaviour is specified in a more-or-less explicit way. It appears that the area of AI that is most analogous to the behaviour of economic entities is that of morally good decision-making, but it is an open question as to how precisely moral behaviour can be achieved in an AI system. This paper explores the analogy between these two complex systems, and we suggest that a clearer understanding of this apparent analogy may help us forward in both the socio-economic domain and the AI domain: known results in economics may help inform feasible solutions in AI safety, but also known results in AI may inform economic policy. If this claim is correct, then the recent successes of deep learning for AI suggest that more implicit specifications work better than explicit ones for solving such problems. △ Less

Submitted 29 December, 2019; originally announced January 2020.

arXiv:1911.11668 [pdf, ps, other]

Biology and Compositionality: Empirical Considerations for Emergent-Communication Protocols

Authors: Travis LaCroix

Abstract: Significant advances have been made in artificial systems by using biological systems as a guide. However, there is often little interaction between computational models for emergent communication and biological models of the emergence of language. Many researchers in language origins and emergent communication take compositionality as their primary target for explaining how simple communication s… ▽ More Significant advances have been made in artificial systems by using biological systems as a guide. However, there is often little interaction between computational models for emergent communication and biological models of the emergence of language. Many researchers in language origins and emergent communication take compositionality as their primary target for explaining how simple communication systems can become more like natural language. However, there is reason to think that compositionality is the wrong target on the biological side, and so too the wrong target on the machine-learning side. As such, the purpose of this paper is to explore this claim. This has theoretical implications for language origins research more generally, but the focus here will be the implications for research on emergent communication in computer science and machine learning---specifically regarding the types of programmes that might be expected to work and those which will not. I further suggest an alternative approach for future research which focuses on reflexivity, rather than compositionality, as a target for explaining how simple communication systems may become more like natural language. I end by providing some reference to the language origins literature that may be of some use to researchers in machine learning. △ Less

Submitted 27 December, 2019; v1 submitted 26 November, 2019; originally announced November 2019.

Comments: Accepted for NeurIPS 2019 workshop Emergent Communication: Towards Natural Language

arXiv:1903.12287 [pdf, other]

PyTorch-BigGraph: A Large-scale Graph Embedding System

Authors: Adam Lerer, Ledell Wu, Jiajun Shen, Timothee Lacroix, Luca Wehrstedt, Abhijit Bose, Alex Peysakhovich

Abstract: Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to tr… ▽ More Graph embedding methods produce unsupervised node features from graphs that can then be used for a variety of machine learning tasks. Modern graphs, particularly in industrial applications, contain billions of nodes and trillions of edges, which exceeds the capability of existing embedding systems. We present PyTorch-BigGraph (PBG), an embedding system that incorporates several modifications to traditional multi-relation embedding systems that allow it to scale to graphs with billions of nodes and trillions of edges. PBG uses graph partitioning to train arbitrarily large embeddings on either a single machine or in a distributed environment. We demonstrate comparable performance with existing embedding systems on common benchmarks, while allowing for scaling to arbitrarily large graphs and parallelization on multiple machines. We train and evaluate embeddings on several large social network graphs as well as the full Freebase dataset, which contains over 100 million nodes and 2 billion edges. △ Less

Submitted 9 April, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

Journal ref: Proceedings of The Conference on Systems and Machine Learning, 2019

arXiv:1806.07297 [pdf, other]

Canonical Tensor Decomposition for Knowledge Base Completion

Authors: Timothée Lacroix, Nicolas Usunier, Guillaume Obozinski

Abstract: The problem of Knowledge Base Completion can be framed as a 3rd-order binary tensor completion problem. In this light, the Canonical Tensor Decomposition (CP) (Hitchcock, 1927) seems like a natural solution; however, current implementations of CP on standard Knowledge Base Completion benchmarks are lagging behind their competitors. In this work, we attempt to understand the limits of CP for knowle… ▽ More The problem of Knowledge Base Completion can be framed as a 3rd-order binary tensor completion problem. In this light, the Canonical Tensor Decomposition (CP) (Hitchcock, 1927) seems like a natural solution; however, current implementations of CP on standard Knowledge Base Completion benchmarks are lagging behind their competitors. In this work, we attempt to understand the limits of CP for knowledge base completion. First, we motivate and test a novel regularizer, based on tensor nuclear $p$-norms. Then, we present a reformulation of the problem that makes it invariant to arbitrary choices in the inclusion of predicates or their reciprocals in the dataset. These two methods combined allow us to beat the current state of the art on several datasets with a CP decomposition, and obtain even better results using the more advanced ComplEx model. △ Less

Submitted 19 June, 2018; originally announced June 2018.

Showing 1–5 of 5 results for author: LaCroix, T