Showing 1–16 of 16 results for author: Toneva, M

  1. arXiv:2311.07766  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Vision-Language Integration in Multimodal Video Transformers (Partially) Aligns with the Brain

    Authors: Dota Tianai Dong, Mariya Toneva

    Abstract: Integrating information from multiple modalities is arguably one of the essential prerequisites for grounding artificial intelligence systems with an understanding of the real world. Recent advances in video transformers that jointly learn from vision, text, and sound over time have made some progress toward this goal, but the degree to which these models integrate information from modalities stil…

    Submitted 13 November, 2023; originally announced November 2023.

  2. arXiv:2311.04664  [pdf, other]

    cs.CL cs.LG eess.AS q-bio.NC

    Speech language models lack important brain-relevant semantics

    Authors: Subba Reddy Oota, Emin Çelik, Fatma Deniz, Mariya Toneva

    Abstract: Despite known differences between reading and listening in the brain, recent work has shown that text-based language models predict both text-evoked and speech-evoked brain activity to an impressive degree. This poses the question of what types of information language models truly predict in the brain. We investigate this question via a direct approach, in which we systematically remove specific l…

    Submitted 16 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 26 pages, 20 figures, The 62nd Annual Meeting of the Association for Computational Linguistics, Long paper - Main

  3. arXiv:2311.04166  [pdf, other]

    cs.CL cs.LG

    Perturbed examples reveal invariances shared by language models

    Authors: Ruchit Rawal, Mariya Toneva

    Abstract: The rapid growth in natural language processing (NLP) research has led to numerous new models, outpacing our understanding of how they compare to established ones. One major reason for this difficulty is saturating benchmarks, which may not well reflect differences in model performance in the wild. In this work, we introduce a novel framework to compare two NLP models by revealing their shared inv…

    Submitted 14 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted at ACL 2024 (Findings)

  4. arXiv:2310.13018  [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE

    Getting aligned on representational alignment

    Authors: Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, et al. (5 additional authors not shown)

    Abstract: Biological and artificial information processing systems form representations that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the extent to which the representations formed by these diverse systems agree? Do similarities in representations then translate into similar behavior? How can a system's representations be modified to better match those of an…

    Submitted 2 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Working paper, changes to be made in upcoming revisions

  5. arXiv:2307.06006  [pdf, other]

    cs.CV cs.LG

    What Happens During Finetuning of Vision Transformers: An Invariance Based Investigation

    Authors: Gabriele Merlin, Vedant Nanda, Ruchit Rawal, Mariya Toneva

    Abstract: The pretrain-finetune paradigm usually improves downstream performance over training a model from scratch on the same task, becoming commonplace across many areas of machine learning. While pretraining is empirically observed to be beneficial for a range of tasks, there is not a clear understanding yet of the reasons for this effect. In this work, we examine the relationship between pretrained vis…

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: Accepted to CoLLAs 2023

  6. arXiv:2305.19294  [pdf, other]

    cs.LG

    Pointwise Representational Similarity

    Authors: Camila Kolling, Till Speicher, Vedant Nanda, Mariya Toneva, Krishna P. Gummadi

    Abstract: With the increasing reliance on deep neural networks, it is important to develop ways to better understand their learned representations. Representation similarity measures have emerged as a popular tool for examining learned representations. However, existing measures only provide aggregate estimates of similarity at a global level, i.e. over a set of representations for N input examples. As such,…

    Submitted 30 May, 2023; originally announced May 2023.

  7. arXiv:2301.10297  [pdf, other]

    cs.CL q-bio.NC

    Large language models can segment narrative events similarly to humans

    Authors: Sebastian Michelmann, Manoj Kumar, Kenneth A. Norman, Mariya Toneva

    Abstract: Humans perceive discrete events such as "restaurant visits" and "train rides" in their continuous experience. One important prerequisite for studying human event perception is the ability of researchers to quantify when one event ends and another begins. Typically, this information is derived by aggregating behavioral annotations from several observers. Here we present an alternative computational…

    Submitted 24 January, 2023; originally announced January 2023.

  8. arXiv:2212.10898  [pdf, other]

    cs.CL q-bio.NC

    Training language models to summarize narratives improves brain alignment

    Authors: Khai Loong Aw, Mariya Toneva

    Abstract: Building systems that achieve a deeper understanding of language is one of the central goals of natural language processing (NLP). Towards this goal, recent works have begun to train language models on narrative datasets which require extracting the most critical information by integrating across long contexts. However, it is still an open question whether these models are learning a deeper unders…

    Submitted 28 February, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: ICLR 2023 (notable top 25%)

  9. arXiv:2212.08094  [pdf, other]

    cs.CL q-bio.NC

    Joint processing of linguistic properties in brains and language models

    Authors: Subba Reddy Oota, Manish Gupta, Mariya Toneva

    Abstract: Language models have been shown to be very effective in predicting brain recordings of subjects experiencing complex language stimuli. For a deeper understanding of this alignment, it is important to understand the correspondence between the detailed processing of linguistic information by the human brain versus language models. We investigate this correspondence via a direct approach, in which we…

    Submitted 8 November, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: 22 pages, 12 figures, To be published in the proceedings of the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), New Orleans, USA

  10. arXiv:2212.00596  [pdf, other]

    cs.CL q-bio.NC

    Language models and brain alignment: beyond word-level semantics and prediction

    Authors: Gabriele Merlin, Mariya Toneva

    Abstract: Pretrained language models that have been trained to predict the next word over billions of text documents have been shown to also significantly predict brain recordings of people comprehending language. Understanding the reasons behind the observed similarities between language in machines and language in the brain can lead to more insight into both systems. Recent works suggest that the predicti…

    Submitted 1 December, 2022; originally announced December 2022.

  11. arXiv:2202.10376  [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE stat.AP

    Same Cause; Different Effects in the Brain

    Authors: Mariya Toneva, Jennifer Williams, Anand Bollu, Christoph Dann, Leila Wehbe

    Abstract: To study information processing in the brain, neuroscientists manipulate experimental stimuli while recording participant brain activity. They can then use encoding models to find out which brain "zone" (e.g. which region of interest, volume pixel or electrophysiology sensor) is predicted from the stimulus properties. Given the assumptions underlying this setup, when stimulus properties are predic…

    Submitted 21 February, 2022; originally announced February 2022.

    Comments: Accepted to CLeaR 2022

  12. arXiv:2101.12608  [pdf, other]

    cs.CL cs.AI

    Does injecting linguistic structure into language models lead to better alignment with brain recordings?

    Authors: Mostafa Abdou, Ana Valeria Gonzalez, Mariya Toneva, Daniel Hershcovich, Anders Søgaard

    Abstract: Neuroscientists evaluate deep neural networks for natural language processing as possible candidate models for how language is processed in the brain. These models are often trained without explicit linguistic supervision, but have been shown to learn some linguistic structure in the absence of such supervision (Manning et al., 2020), potentially questioning the relevance of symbolic linguistic th…

    Submitted 29 January, 2021; originally announced January 2021.

  13. arXiv:2009.08424  [pdf, other]

    cs.CL cs.AI cs.LG

    Modeling Task Effects on Meaning Representation in the Brain via Zero-Shot MEG Prediction

    Authors: Mariya Toneva, Otilia Stretcu, Barnabas Poczos, Leila Wehbe, Tom M. Mitchell

    Abstract: How meaning is represented in the brain is still one of the big open questions in neuroscience. Does a word (e.g., bird) always have the same representation, or does the task under which the word is processed alter its representation (answering "can you eat it?" versus "can it fly?")? The brain activity of subjects who read the same word while performing different semantic tasks has been shown to…

    Submitted 15 November, 2020; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: accepted at NeurIPS 2020

  14. arXiv:1911.03268  [pdf, other]

    q-bio.NC cs.CL cs.LG

    Inducing brain-relevant bias in natural language processing models

    Authors: Dan Schwartz, Mariya Toneva, Leila Wehbe

    Abstract: Progress in natural language processing (NLP) models that estimate representations of word sequences has recently been leveraged to improve the understanding of language processing in the brain. However, these models have not been specifically designed to capture the way the brain represents language meaning. We hypothesize that fine-tuning these models to predict recordings of brain activity of p…

    Submitted 29 October, 2019; originally announced November 2019.

    Comments: To be published in the proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  15. arXiv:1905.11833  [pdf, other]

    cs.CL cs.AI cs.LG q-bio.NC

    Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain)

    Authors: Mariya Toneva, Leila Wehbe

    Abstract: Neural networks models for NLP are typically implemented without the explicit encoding of language rules and yet they are able to break one performance record after another. This has generated a lot of research interest in interpreting the representations learned by these networks. We propose here a novel interpretation approach that relies on the only processing system we have that does understan…

    Submitted 13 November, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: NeurIPS 2019

  16. arXiv:1812.05159  [pdf, other]

    cs.LG stat.ML

    An Empirical Study of Example Forgetting during Deep Neural Network Learning

    Authors: Mariya Toneva, Alessandro Sordoni, Remi Tachet des Combes, Adam Trischler, Yoshua Bengio, Geoffrey J. Gordon

    Abstract: Inspired by the phenomenon of catastrophic forgetting, we investigate the learning dynamics of neural networks as they train on single classification tasks. Our goal is to understand whether a related phenomenon occurs when data does not undergo a clear distributional shift. We define a 'forgetting event' to have occurred when an individual training example transitions from being classified correc…

    Submitted 15 November, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: ICLR 2019