Showing 1–50 of 55 results for author: Baroni, M

Searching in archive cs.
  1. arXiv:2405.15471  [pdf, other]

    cs.CL

    Emergence of a High-Dimensional Abstraction Phase in Language Transformers

    Authors: Emily Cheng, Diego Doimo, Corentin Kervadec, Iuri Macocco, Jade Yu, Alessandro Laio, Marco Baroni

    Abstract: A language model (LM) is a mapping from a linguistic context to an output token. However, much remains to be known about this mapping, including how its geometric properties relate to its function. We take a high-level geometric approach to its analysis, observing, across five pre-trained transformer-based LMs and three input datasets, a distinct phase characterized by high intrinsic dimensionalit…

    Submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2405.15454  [pdf, other]

    cs.CL eess.SY

    Linearly Controlled Language Generation with Performative Guarantees

    Authors: Emily Cheng, Marco Baroni, Carmen Amo Alonso

    Abstract: The increasing prevalence of Large Language Models (LMs) in critical applications highlights the need for controlled language generation strategies that are not only computationally efficient but that also enjoy performance guarantees. To achieve this, we use a common model of concept semantics as linearly represented in an LM's latent space. In particular, we take the view that natural language g…

    Submitted 24 May, 2024; originally announced May 2024.

  3. arXiv:2402.15268  [pdf, other]

    cs.CL cs.AI cs.LG

    MemoryPrompt: A Light Wrapper to Improve Context Tracking in Pre-trained Language Models

    Authors: Nathanaël Carraz Rakotonirina, Marco Baroni

    Abstract: Transformer-based language models (LMs) track contextual information through large, hard-coded input windows. We introduce MemoryPrompt, a leaner approach in which the LM is complemented by a small auxiliary recurrent network that passes information to the LM by prefixing its regular input with a sequence of vectors, akin to soft prompts, without requiring LM finetuning. Tested on a task designed…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Published as conference paper at LREC-COLING 2024

  4. Frost Prediction Using Machine Learning Methods in Fars Province

    Authors: Milad Barooni, Koorush Ziarati, Ali Barooni

    Abstract: One of the common hazards and issues in meteorology and agriculture is the problem of frost, chilling or freezing. This event occurs when the minimum ambient temperature falls below a certain value. This phenomenon causes a lot of damage to the country, especially Fars province. Solving this problem requires that, in addition to predicting the minimum temperature, we can provide enough time to imp…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Accepted by 28th International Computer Conference, Computer Society of Iran (CSICC)

  5. arXiv:2310.15829  [pdf, other]

    cs.CL

    Unnatural language processing: How do language models handle machine-generated prompts?

    Authors: Corentin Kervadec, Francesca Franzon, Marco Baroni

    Abstract: Language model prompt optimization research has shown that semantically and grammatically well-formed manually crafted prompts are routinely outperformed by automatically generated token sequences with no apparent meaning or syntactic structure, including sequences of vectors from a model's embedding space. We use machine-generated prompts to probe how models respond to input that is not composed…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023 Camera-Ready

  6. arXiv:2310.13620  [pdf, other]

    cs.CL

    Bridging Information-Theoretic and Geometric Compression in Language Models

    Authors: Emily Cheng, Corentin Kervadec, Marco Baroni

    Abstract: For a language model (LM) to faithfully model human language, it must compress vast, potentially infinite information into relatively few dimensions. We propose analyzing compression in (pre-trained) LMs from two points of view: geometric and information-theoretic. We demonstrate that the two views are highly correlated, such that the intrinsic geometric dimension of linguistic data predicts their…

    Submitted 9 November, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Camera-Ready

  7. arXiv:2304.01662  [pdf, other]

    cs.CV cs.AI cs.CL

    Cross-Domain Image Captioning with Discriminative Finetuning

    Authors: Roberto Dessì, Michele Bevilacqua, Eleonora Gualdoni, Nathanael Carraz Rakotonirina, Francesca Franzon, Marco Baroni

    Abstract: Neural captioners are typically trained to mimic human-generated references without optimizing for any specific communication goal, leading to problems such as the generation of vague captions. In this paper, we show that fine-tuning an out-of-the-box neural captioner with a self-supervised discriminative communication objective helps to recover a plain, visually descriptive language that is more…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  8. arXiv:2302.09865  [pdf, other]

    cs.CL cs.AI cs.LG

    Can discrete information extraction prompts generalize across language models?

    Authors: Nathanaël Carraz Rakotonirina, Roberto Dessì, Fabio Petroni, Sebastian Riedel, Marco Baroni

    Abstract: We study whether automatically-induced prompts that effectively extract information from a language model can also be used, out-of-the-box, to probe other language models for the same information. After confirming that discrete prompts induced with the AutoPrompt algorithm outperform manual and semi-manual prompts on the slot-filling task, we demonstrate a drop in performance for AutoPrompt prompt…

    Submitted 7 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Published as conference paper at ICLR 2023

  9. arXiv:2302.08913  [pdf, other]

    cs.CV cs.AI cs.LG

    Referential communication in heterogeneous communities of pre-trained visual deep networks

    Authors: Matéo Mahaut, Francesca Franzon, Roberto Dessì, Marco Baroni

    Abstract: As large pre-trained image-processing neural networks are being embedded in autonomous agents such as self-driving cars or robots, the question arises of how such systems can communicate with each other about the surrounding world, despite their different architectures and training regimes. As a first step in this direction, we systematically explore the task of referential communication…

    Submitted 13 March, 2024; v1 submitted 4 February, 2023; originally announced February 2023.

  10. arXiv:2210.11512  [pdf, other]

    cs.CL

    Communication breakdown: On the low mutual intelligibility between human and neural captioning

    Authors: Roberto Dessì, Eleonora Gualdoni, Francesca Franzon, Gemma Boleda, Marco Baroni

    Abstract: We compare the 0-shot performance of a neural caption-based image retriever when given as input either human-produced captions or captions generated by a neural captioner. We conduct this comparison on the recently introduced ImageCoDe data-set (Krojer et al., 2022) which contains hard distractors nearly identical to the images to be retrieved. We find that the neural retriever has much higher per…

    Submitted 27 April, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted as a short paper at EMNLP 2022

  11. arXiv:2110.02782  [pdf, other]

    cs.CL

    How BPE Affects Memorization in Transformers

    Authors: Eugene Kharitonov, Marco Baroni, Dieuwke Hupkes

    Abstract: Training data memorization in NLP can both be beneficial (e.g., closed-book QA) and undesirable (personal data extraction). In any case, successful model training requires a non-trivial amount of memorization to store word spellings, various linguistic idiosyncrasies and common knowledge. However, little is known about what affects the memorization behavior of NLP models, as the field tends to foc…

    Submitted 2 December, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

  12. arXiv:2106.08694  [pdf, ps, other]

    cs.CL

    On the proper role of linguistically-oriented deep net analysis in linguistic theorizing

    Authors: Marco Baroni

    Abstract: A lively research field has recently emerged that uses experimental methods to probe the linguistic behavior of modern deep networks. While work in this tradition often reports intriguing results about the grammatical skills of deep nets, it is not clear what their implications for linguistic theorizing should be. As a consequence, linguistically-oriented deep net analysis has had very little impa…

    Submitted 24 March, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

    Comments: To appear in collective volume on Algebraic Systems and the Representation of Linguistic Knowledge, editor: Shalom Lappin, Taylor & Francis

  13. arXiv:2106.04258  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA

    Interpretable agent communication from scratch (with a generic visual processor emerging on the side)

    Authors: Roberto Dessì, Eugene Kharitonov, Marco Baroni

    Abstract: As deep networks begin to be deployed as autonomous agents, the issue of how they can communicate with each other becomes important. Here, we train two deep nets from scratch to perform realistic referent identification through unsupervised emergent communication. We show that the largely interpretable emergent protocol allows the nets to successfully communicate even about object types they did n…

    Submitted 15 October, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: Accepted at NeurIPS 2021

  14. Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans

    Authors: Yair Lakretz, Dieuwke Hupkes, Alessandra Vergallito, Marco Marelli, Marco Baroni, Stanislas Dehaene

    Abstract: Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and…

    Submitted 3 May, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

    Journal ref: Lakretz et al. (2021), Cognition

  15. arXiv:2006.02419  [pdf, other]

    cs.CL cs.AI

    Emergent Multi-Agent Communication in the Deep Learning Era

    Authors: Angeliki Lazaridou, Marco Baroni

    Abstract: The ability to cooperate through language is a defining feature of humans. As the perceptual, motory and planning capabilities of deep artificial networks increase, researchers are studying whether they also can develop a shared language to interact. From a scientific perspective, understanding the conditions under which language evolves in communities of deep agents and its emergent features can…

    Submitted 14 July, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

    Comments: Added some more references and discussion

  16. Syntactic Structure from Deep Learning

    Authors: Tal Linzen, Marco Baroni

    Abstract: Modern deep neural networks achieve impressive performance in engineering applications that require extensive linguistic skills, such as machine translation. This success has sparked interest in probing whether these models are inducing human-like grammatical knowledge from the raw data they are exposed to, and, consequently, whether they can shed new light on long-standing debates concerning the…

    Submitted 22 April, 2020; originally announced April 2020.

    Comments: In press at Annual Reviews of Linguistics

  17. arXiv:2004.09124  [pdf, other]

    cs.CL cs.AI cs.LG

    Compositionality and Generalization in Emergent Languages

    Authors: Rahma Chaabouni, Eugene Kharitonov, Diane Bouchacourt, Emmanuel Dupoux, Marco Baroni

    Abstract: Natural language allows us to refer to novel composite concepts by combining expressions denoting their parts according to systematic rules, a property known as compositionality. In this paper, we study whether the language emerging in deep multi-agent simulations possesses a similar ability to refer to novel primitive combinations, and whether it accomplishes this feat by strategies akin t…

    Submitted 20 April, 2020; originally announced April 2020.

  18. arXiv:2004.03420  [pdf, other]

    cs.CL cs.LG

    Emergent Language Generalization and Acquisition Speed are not tied to Compositionality

    Authors: Eugene Kharitonov, Marco Baroni

    Abstract: Studies of discrete languages emerging when neural agents communicate to solve a joint task often look for evidence of compositional structure. This stems from the expectation that such a structure would allow languages to be acquired faster by the agents and enable them to generalize better. We argue that these beneficial properties are only loosely connected to compositionality. In two experiment…

    Submitted 25 April, 2020; v1 submitted 7 April, 2020; originally announced April 2020.

  19. arXiv:2003.11922  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Rat big, cat eaten! Ideas for a useful deep-agent protolanguage

    Authors: Marco Baroni

    Abstract: Deep-agent communities developing their own language-like communication protocol are a hot (or at least warm) topic in AI. Such agents could be very useful in machine-machine and human-machine interaction scenarios long before they have evolved a protocol as complex as human language. Here, I propose a small set of priorities we should focus on, if we want to get as fast as possible to a stage whe…

    Submitted 17 March, 2020; originally announced March 2020.

  20. arXiv:2003.05161  [pdf, other]

    cs.CL cs.AI cs.LG

    A Benchmark for Systematic Generalization in Grounded Language Understanding

    Authors: Laura Ruis, Jacob Andreas, Marco Baroni, Diane Bouchacourt, Brenden M. Lake

    Abstract: Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts ("greet the pink brontosaurus by the ferris wheel"). Modern neural networks, by contrast, struggle to interpret novel compositions. In this paper, we introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding. Going beyond a related benchmark th…

    Submitted 17 October, 2020; v1 submitted 11 March, 2020; originally announced March 2020.

    Comments: accepted at NeurIPS 2020

  21. arXiv:1911.01892  [pdf, ps, other]

    cs.CL cs.AI

    Focus on What's Informative and Ignore What's not: Communication Strategies in a Referential Game

    Authors: Roberto Dessì, Diane Bouchacourt, Davide Crepaldi, Marco Baroni

    Abstract: Research in multi-agent cooperation has shown that artificial agents are able to learn to play a simple referential game while developing a shared lexicon. This lexicon is not easy to analyze, as it does not show many properties of a natural language. In a simple referential game with two neural network-based agents, we analyze the object-symbol mapping trying to understand what kind of strategy w…

    Submitted 5 November, 2019; originally announced November 2019.

    Comments: 3rd NeurIPS Workshop on Emergent Communication

  22. arXiv:1907.00852  [pdf, other]

    cs.CL cs.AI

    EGG: a toolkit for research on Emergence of lanGuage in Games

    Authors: Eugene Kharitonov, Rahma Chaabouni, Diane Bouchacourt, Marco Baroni

    Abstract: There is renewed interest in simulating language emergence among deep neural agents that communicate to jointly solve a task, spurred by the practical aim to develop language-enabled interactive AIs, as well as by theoretical questions about the evolution of human language. However, optimizing deep architectures connected by a discrete communication channel (such as that in which language emerges)…

    Submitted 13 October, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: EMNLP 2019 Demo paper

  23. arXiv:1906.07285  [pdf, other]

    cs.CL

    Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text

    Authors: Michael Hahn, Marco Baroni

    Abstract: Recurrent neural networks (RNNs) have reached striking performance in many natural language processing tasks. This has renewed interest in whether these generic sequence processing devices are inducing genuine linguistic knowledge. Nearly all current analytical studies, however, initialize the RNNs with a vocabulary of known words, and feed them tokenized input during training. We present a multi-…

    Submitted 17 June, 2019; originally announced June 2019.

    Comments: Accepted by Transactions of the Association for Computational Linguistics

  24. arXiv:1905.13687  [pdf, other]

    cs.CL cs.LG

    Entropy Minimization In Emergent Languages

    Authors: Eugene Kharitonov, Rahma Chaabouni, Diane Bouchacourt, Marco Baroni

    Abstract: There is growing interest in studying the languages that emerge when neural agents are jointly trained to solve tasks requiring communication through a discrete channel. We investigate here the information-theoretic complexity of such languages, focusing on the basic two-agent, one-exchange setup. We find that, under common training procedures, the emergent languages are subject to an entropy mini…

    Submitted 26 June, 2020; v1 submitted 31 May, 2019; originally announced May 2019.

    Comments: Accepted at ICML 2020

  25. arXiv:1905.12561  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA

    Anti-efficient encoding in emergent communication

    Authors: Rahma Chaabouni, Eugene Kharitonov, Emmanuel Dupoux, Marco Baroni

    Abstract: Despite renewed interest in emergent language simulations with neural networks, little is known about the basic properties of the induced code, and how they compare to human language. One fundamental characteristic of the latter, known as Zipf's Law of Abbreviation (ZLA), is that more frequent words are efficiently associated to shorter strings. We study whether the same pattern emerges when two n…

    Submitted 15 October, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

  26. arXiv:1905.12330  [pdf, other]

    cs.CL cs.AI cs.LG

    Word-order biases in deep-agent emergent communication

    Authors: Rahma Chaabouni, Eugene Kharitonov, Alessandro Lazaric, Emmanuel Dupoux, Marco Baroni

    Abstract: Sequence-processing neural networks led to remarkable progress on many NLP tasks. As a consequence, there has been increasing interest in understanding to what extent they process language as humans do. We aim here to uncover which biases such models display with respect to "natural" word-order constraints. We train models to communicate about paths in a simple gridworld, using miniature languages…

    Submitted 14 June, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Conference: Association for Computational Linguistics (ACL)

  27. arXiv:1905.11871  [pdf, other]

    cs.CL cs.MA

    Miss Tools and Mr Fruit: Emergent communication in agents learning about object affordances

    Authors: Diane Bouchacourt, Marco Baroni

    Abstract: Recent research studies communication emergence in communities of deep network agents assigned a joint task, hoping to gain insights on human language evolution. We propose here a new task capturing crucial aspects of the human environment, such as natural object affordances, and of human conversation, such as full symmetry among the participants. By conducting a thorough pragmatic and semantic an…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: Association for Computational Linguistics

  28. arXiv:1905.08527  [pdf, other]

    cs.CL cs.AI cs.LG

    CNNs found to jump around more skillfully than RNNs: Compositional generalization in seq2seq convolutional networks

    Authors: Roberto Dessì, Marco Baroni

    Abstract: Lake and Baroni (2018) introduced the SCAN dataset probing the ability of seq2seq models to capture compositional generalizations, such as inferring the meaning of "jump around" 0-shot from the component words. Recurrent networks (RNNs) were found to completely fail the most challenging generalization cases. We test here a convolutional network (CNN) on these tasks, reporting hugely improved perfo…

    Submitted 21 May, 2019; originally announced May 2019.

    Comments: accepted as a short paper at ACL 2019

  29. Linguistic generalization and compositionality in modern artificial neural networks

    Authors: Marco Baroni

    Abstract: In the last decade, deep artificial neural networks have achieved astounding performance in many natural language processing tasks. Given the high productivity of language, these models must possess effective generalization abilities. It is widely assumed that humans handle linguistic productivity by means of algebraic compositional rules: Are deep networks similarly compositional? After reviewing…

    Submitted 26 June, 2019; v1 submitted 30 March, 2019; originally announced April 2019.

    Comments: Please cite as "to appear in the Philosophical Transactions of the Royal Society B"

  30. arXiv:1903.07435  [pdf, other]

    cs.CL

    The emergence of number and syntax units in LSTM language models

    Authors: Yair Lakretz, German Kruszewski, Theo Desbordes, Dieuwke Hupkes, Stanislas Dehaene, Marco Baroni

    Abstract: Recent work has shown that LSTMs trained on a generic language modeling objective capture syntax-sensitive generalizations such as long-distance number agreement. We have however no mechanistic understanding of how they accomplish this remarkable feat. Some have conjectured it depends on heuristics that do not truly take hierarchical structure into account. We present here a detailed study of the…

    Submitted 2 April, 2019; v1 submitted 18 March, 2019; originally announced March 2019.

    Comments: To appear in Proceedings of NAACL, Minneapolis, MN, 2019

  31. arXiv:1901.04587  [pdf, other]

    cs.CL

    Human few-shot learning of compositional instructions

    Authors: Brenden M. Lake, Tal Linzen, Marco Baroni

    Abstract: People learn in fast and flexible ways that have not been emulated by machines. Once a person learns a new verb "dax," he or she can effortlessly understand how to "dax twice," "walk and dax," or "dax vigorously." There have been striking recent improvements in machine learning for natural language processing, yet the best algorithms require vast amounts of experience and struggle to generalize ne…

    Submitted 10 May, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

    Comments: Please cite as: Lake, B. M., Linzen, T., and Baroni, M. (2019). Human few-shot learning of compositional instructions. In Proceedings of the 41st Annual Conference of the Cognitive Science Society

  32. arXiv:1809.04640  [pdf, other]

    cs.CL

    Jump to better conclusions: SCAN both left and right

    Authors: Jasmijn Bastings, Marco Baroni, Jason Weston, Kyunghyun Cho, Douwe Kiela

    Abstract: Lake and Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the strong generalization abilities of recurrent sequence-to-sequence models. Their initial experiments suggested that such models may fail because they lack the ability to extract systematic rules. Here, we take a closer look at SCAN and show that it…

    Submitted 18 June, 2020; v1 submitted 12 September, 2018; originally announced September 2018.

  33. arXiv:1808.10696  [pdf, other]

    cs.CL cs.LG

    How agents see things: On visual representations in an emergent language game

    Authors: Diane Bouchacourt, Marco Baroni

    Abstract: There is growing interest in the language developed by agents interacting in emergent-communication settings. Earlier studies have focused on the agents' symbol usage, rather than on their representation of visual input. In this paper, we consider the referential games of Lazaridou et al. (2017) and investigate the representations the agents develop during their evolving interaction. We find that…

    Submitted 13 September, 2018; v1 submitted 31 August, 2018; originally announced August 2018.

    Comments: 2018 Conference on Empirical Methods in Natural Language Processing

  34. arXiv:1807.07545  [pdf, other]

    cs.CL cs.AI cs.LG

    Rearranging the Familiar: Testing Compositional Generalization in Recurrent Networks

    Authors: João Loula, Marco Baroni, Brenden M. Lake

    Abstract: Systematic compositionality is the ability to recombine meaningful units with regular and predictable outcomes, and it's seen as key to humans' capacity for generalization in language. Recent work has studied systematic compositionality in modern seq2seq models using generalization to novel navigation instructions in a grounded environment as a probing tool, requiring models to quickly bootstrap t…

    Submitted 19 July, 2018; originally announced July 2018.

  35. arXiv:1805.01070  [pdf, other]

    cs.CL

    What you can cram into a single vector: Probing sentence embeddings for linguistic properties

    Authors: Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni

    Abstract: Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing. "Downstream" tasks, often based on sentence classification, are commonly used to evaluate the quality of sentence representations. The complexity of the tasks makes it however difficult to infer what kind of information is present in the repres…

    Submitted 8 July, 2018; v1 submitted 2 May, 2018; originally announced May 2018.

    Comments: ACL 2018

  36. arXiv:1803.11138  [pdf, other]

    cs.CL

    Colorless green recurrent networks dream hierarchically

    Authors: Kristina Gulordava, Piotr Bojanowski, Edouard Grave, Tal Linzen, Marco Baroni

    Abstract: Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here to what extent RNNs learn to track abstract hierarchical syntactic structure. We test whether RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russia…

    Submitted 29 March, 2018; originally announced March 2018.

    Comments: Accepted to NAACL 2018

  37. arXiv:1802.06467  [pdf, other]

    cs.AI cs.LG cs.NE

    Memorize or generalize? Searching for a compositional RNN in a haystack

    Authors: Adam Liška, Germán Kruszewski, Marco Baroni

    Abstract: Neural networks are very powerful learning systems, but they do not readily generalize from one task to the other. This is partly due to the fact that they do not learn in a compositional way, that is, by discovering skills that are shared by different tasks, and recombining them to solve new problems. In this paper, we explore the compositional generalization capabilities of recurrent neural netw…

    Submitted 25 July, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

    Comments: AEGAP Workshop (ICML 2018)

  38. arXiv:1711.00350  [pdf, other]

    cs.CL cs.AI cs.LG

    Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks

    Authors: Brenden M. Lake, Marco Baroni

    Abstract: Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We th…

    Submitted 6 June, 2018; v1 submitted 30 October, 2017; originally announced November 2017.

    Comments: Published at the 35th International Conference on Machine Learning (ICML 2018)

    Journal ref: Lake, B. M. and Baroni, M. (2018). Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. International Conference on Machine Learning (ICML)

  39. arXiv:1707.06556  [pdf, other]

    cs.CL cs.LG

    High-risk learning: acquiring new word vectors from tiny data

    Authors: Aurelie Herbelot, Marco Baroni

    Abstract: Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn 'a good vector' for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modificat…

    Submitted 20 July, 2017; originally announced July 2017.

    Comments: Accepted as short paper at EMNLP 2017

  40. arXiv:1702.07306  [pdf, other]

    stat.ML cs.LG

    Causal Discovery Using Proxy Variables

    Authors: Mateo Rojas-Carulla, Marco Baroni, David Lopez-Paz

    Abstract: Discovering causal relations is fundamental to reasoning and intelligence. In particular, observational causal discovery algorithms estimate the cause-effect relation between two random entities $X$ and $Y$, given $n$ samples from $P(X,Y)$. In this paper, we develop a framework to estimate the cause-effect relation between two static entities $x$ and $y$: for instance, an art masterpiece $x$ and…

    Submitted 23 February, 2017; originally announced February 2017.

  41. arXiv:1702.01815  [pdf, other]

    cs.CL

    Living a discrete life in a continuous world: Reference with distributed representations

    Authors: Gemma Boleda, Sebastian Padó, Nghia The Pham, Marco Baroni

    Abstract: Reference is a crucial property of language that allows us to connect linguistic expressions to the world. Modeling it requires handling both continuous and discrete aspects of meaning. Data-driven models excel at the former, but struggle with the latter, and the reverse is true for symbolic models. This paper (a) introduces a concrete referential task to test both aspects, called cross-modal en…

    Submitted 4 September, 2017; v1 submitted 6 February, 2017; originally announced February 2017.

    Comments: Accepted at IWCS 2017. Final version, 9 pages

  42. arXiv:1701.08954  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    CommAI: Evaluating the first steps towards a useful general AI

    Authors: Marco Baroni, Armand Joulin, Allan Jabri, Germàn Kruszewski, Angeliki Lazaridou, Klemen Simonic, Tomas Mikolov

    Abstract: With machine learning successfully applied to new daunting problems almost every day, general AI starts looking like an attainable goal. However, most current research focuses instead on important but narrow applications, such as image classification or machine translation. We believe this to be largely due to the lack of objective ways to measure progress towards broad machine intelligence. In or…

    Submitted 27 March, 2017; v1 submitted 31 January, 2017; originally announced January 2017.

    Comments: Published in ICLR 2017 Workshop Track

  43. arXiv:1612.07182  [pdf, other]

    cs.CL cs.CV cs.GT cs.LG cs.MA

    Multi-Agent Cooperation and the Emergence of (Natural) Language

    Authors: Angeliki Lazaridou, Alexander Peysakhovich, Marco Baroni

    Abstract: The current mainstream approach to train natural language systems is to expose them to large amounts of text. This passive learning is problematic if we are interested in developing interactive machines, such as conversational agents. We propose a framework for language learning that relies on multi-agent communication. We study this learning in the context of referential games. In these games, a…

    Submitted 5 March, 2017; v1 submitted 21 December, 2016; originally announced December 2016.

    Comments: Accepted at ICLR 2017

  44. "Show me the cup": Reference with Continuous Representations

    Authors: Gemma Boleda, Sebastian Padó, Marco Baroni

    Abstract: One of the most basic functions of language is to refer to objects in a shared scene. Modeling reference with continuous representations is challenging because it requires individuation, i.e., tracking and distinguishing an arbitrary number of referents. We introduce a neural network model that, given a definite description and a set of objects represented by natural images, points to the intended…

    Submitted 28 June, 2016; originally announced June 2016.

    Journal ref: In: Gelbukh A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2017. Lecture Notes in Computer Science, vol 10761. Springer, Cham

  45. arXiv:1606.06031  [pdf, other]

    cs.CL cs.AI cs.LG

    The LAMBADA dataset: Word prediction requiring a broad discourse context

    Authors: Denis Paperno, Germán Kruszewski, Angeliki Lazaridou, Quan Ngoc Pham, Raffaella Bernardi, Sandro Pezzelle, Marco Baroni, Gemma Boleda, Raquel Fernández

    Abstract: We introduce LAMBADA, a dataset to evaluate the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages sharing the characteristic that human subjects are able to guess their last word if they are exposed to the whole passage, but not if they only see the last sentence preceding the target word. To succeed on LAM…

    Submitted 20 June, 2016; originally announced June 2016.

    Comments: 10 pages, Accepted as a long paper for ACL 2016

  46. arXiv:1605.07133  [pdf, other]

    cs.CL cs.CV cs.LG

    Towards Multi-Agent Communication-Based Language Learning

    Authors: Angeliki Lazaridou, Nghia The Pham, Marco Baroni

    Abstract: We propose an interactive multimodal framework for language learning. Instead of being passively exposed to large amounts of natural text, our learners (implemented as feed-forward neural networks) engage in cooperative referential games starting from a tabula rasa setup, and thus develop their own language from the need to communicate in order to succeed at the game. Preliminary experiments provi…

    Submitted 23 May, 2016; originally announced May 2016.

    Comments: 9 pages, manuscript under submission

  47. arXiv:1603.02618  [pdf, other]

    cs.CL cs.CV

    The red one!: On learning to refer to things based on their discriminative properties

    Authors: Angeliki Lazaridou, Nghia The Pham, Marco Baroni

    Abstract: As a first step towards agents learning to communicate about their visual environment, we propose a system that, given visual representations of a referent (cat) and a context (sofa), identifies their discriminative attributes, i.e., properties that distinguish them (has_tail). Moreover, despite the lack of direct supervision at the attribute level, the model learns to assign plausible attributes…

    Submitted 23 May, 2016; v1 submitted 8 March, 2016; originally announced March 2016.

    Comments: Accepted as an ACL-short submission

  48. arXiv:1511.08130  [pdf, other]

    cs.AI cs.CL

    A Roadmap towards Machine Intelligence

    Authors: Tomas Mikolov, Armand Joulin, Marco Baroni

    Abstract: The development of intelligent machines is one of the biggest unsolved challenges in computer science. In this paper, we propose some fundamental properties these machines should have, focusing in particular on communication and learning. We discuss a simple environment that could be used to incrementally teach a machine the basics of natural-language-based communication, as a prerequisite to more…

    Submitted 26 February, 2016; v1 submitted 25 November, 2015; originally announced November 2015.

  49. arXiv:1506.03500  [pdf, other]

    cs.CV cs.CL

    Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation

    Authors: Angeliki Lazaridou, Dat Tien Nguyen, Raffaella Bernardi, Marco Baroni

    Abstract: We introduce language-driven image generation, the task of generating an image visualizing the semantic contents of a word embedding, e.g., given the word embedding of grasshopper, we generate a natural image of a grasshopper. We implement a simple method based on two mapping functions. The first takes as input a word embedding (as produced, e.g., by the word2vec toolkit) and maps it onto a high-l…

    Submitted 23 November, 2015; v1 submitted 10 June, 2015; originally announced June 2015.

    Comments: A 6-page version to appear at the Multimodal Machine Learning NIPS 2015 Workshop

  50. arXiv:1501.02714  [pdf, other]

    cs.CL cs.CV

    From Visual Attributes to Adjectives through Decompositional Distributional Semantics

    Authors: Angeliki Lazaridou, Georgiana Dinu, Adam Liska, Marco Baroni

    Abstract: As automated image analysis progresses, there is increasing interest in richer linguistic annotation of pictures, with attributes of objects (e.g., furry, brown...) attracting most attention. By building on the recent "zero-shot learning" approach, and paying attention to the linguistic nature of attributes as noun modifiers, and specifically adjectives, we show that it is possible to tag images w…

    Submitted 24 March, 2015; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: Accepted at Transactions of the Association for Computational Linguistics (TACL), 3/2015