Showing 1–26 of 26 results for author: Zuidema, W

  1. arXiv:2407.03005

    cs.CL cs.AI cs.SD eess.AS

    Human-like Linguistic Biases in Neural Speech Models: Phonetic Categorization and Phonotactic Constraints in Wav2Vec2.0

    Authors: Marianne de Heer Kloots, Willem Zuidema

    Abstract: What do deep neural speech models know about phonology? Existing work has examined the encoding of individual linguistic units such as phonemes in these models. Here we investigate interactions between units. Inspired by classic experiments on human speech perception, we study how Wav2Vec2 resolves phonotactic constraints. We synthesize sounds on an acoustic continuum between /l/ and /r/ and embed…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted to Interspeech 2024. For code and materials, see https://github.com/mdhk/phonotactic-sensitivity

    Journal ref: Proc. INTERSPEECH 2024

  2. arXiv:2407.02136

    cs.CL

    Black Big Boxes: Do Language Models Hide a Theory of Adjective Order?

    Authors: Jaap Jumelet, Lisa Bylinina, Willem Zuidema, Jakub Szymanik

    Abstract: In English and other languages, multiple adjectives in a complex noun phrase show intricate ordering patterns that have been a target of much linguistic theory. These patterns offer an opportunity to assess the ability of language models (LMs) to learn subtle rules of language involving factors that cross the traditional divisions of syntax, semantics, and pragmatics. We review existing hypotheses…

    Submitted 2 July, 2024; originally announced July 2024.

  3. arXiv:2406.15265

    cs.CL

    Perception of Phonological Assimilation by Neural Speech Recognition Models

    Authors: Charlotte Pouw, Marianne de Heer Kloots, Afra Alishahi, Willem Zuidema

    Abstract: Human listeners effortlessly compensate for phonological changes during speech perception, often unconsciously inferring the intended sounds. For example, listeners infer the underlying /n/ when hearing an utterance such as "clea[m] pan", where [m] arises from place assimilation to the following labial [p]. This article explores how the neural speech recognition model Wav2Vec2 perceives assimilate…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted for publication in Computational Linguistics (Special Issue on Language Learning, Representation, and Processing in Humans and Machines)

  4. arXiv:2406.04847

    cs.CL

    Do Language Models Exhibit Human-like Structural Priming Effects?

    Authors: Jaap Jumelet, Willem Zuidema, Arabella Sinclair

    Abstract: We explore which linguistic factors -- at the sentence and token level -- play an important role in influencing language model predictions, and investigate whether these are reflective of results found in humans and human corpora (Gries and Kootstra, 2017). We make use of the structural priming paradigm, where recent exposure to a structure facilitates processing of the same structure. We don't on…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: ACL Findings 2024

  5. arXiv:2310.14840

    cs.CL

    Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution

    Authors: Jaap Jumelet, Willem Zuidema

    Abstract: We present a setup for training, evaluating and interpreting neural language models, that uses artificial, language-like data. The data is generated using a massive probabilistic grammar (based on state-split PCFGs), that is itself derived from a large natural language corpus, but also provides us complete control over the generative process. We describe and release both grammar and corpus, and te…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP Findings 2023

  6. arXiv:2310.12611

    cs.CL cs.AI

    Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model

    Authors: Abhijith Chintam, Rahel Beloch, Willem Zuidema, Michael Hanna, Oskar van der Wal

    Abstract: Language models (LMs) exhibit and amplify many types of undesirable biases learned from the training data, including gender bias. However, we lack tools for effectively and efficiently changing this behavior without hurting general language modeling performance. In this paper, we study three methods for identifying causal relations between LM components and particular output: causal mediation anal…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: Accepted at BlackboxNLP 2023

  7. arXiv:2310.09925

    cs.CL cs.AI cs.LG

    Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers

    Authors: Hosein Mohebbi, Grzegorz Chrupała, Willem Zuidema, Afra Alishahi

    Abstract: Transformers have become a key architecture in speech processing, but our understanding of how they build up representations of acoustic and linguistic structure is limited. In this study, we address this gap by investigating how measures of 'context-mixing' developed for text models can be adapted and applied to models of spoken language. We identify a linguistic phenomenon that is ideal for such…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (main)

  8. arXiv:2310.03686

    cs.CL

    DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers

    Authors: Anna Langedijk, Hosein Mohebbi, Gabriele Sarti, Willem Zuidema, Jaap Jumelet

    Abstract: In recent years, many interpretability methods have been proposed to help interpret the internal states of Transformer-models, at different levels of precision and complexity. Here, to analyze encoder-decoder Transformers, we propose a simple, new method: DecoderLens. Inspired by the LogitLens (for decoder-only Transformers), this method involves allowing the decoder to cross-attend representation…

    Submitted 3 April, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of NAACL 2024

  9. arXiv:2306.12181

    cs.CL

    Feature Interactions Reveal Linguistic Structure in Language Models

    Authors: Jaap Jumelet, Willem Zuidema

    Abstract: We study feature interactions in the context of feature attribution methods for post-hoc interpretability. In interpretability research, getting to grips with feature interactions is increasingly recognised as an important challenge, because interacting features are key to the success of neural networks. Feature interactions allow a model to build up hierarchical representations for its input, and…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: ACL Findings 2023

  10. arXiv:2301.12971

    cs.CL cs.LG

    Quantifying Context Mixing in Transformers

    Authors: Hosein Mohebbi, Willem Zuidema, Grzegorz Chrupała, Afra Alishahi

    Abstract: Self-attention weights and their transformed variants have been the main source of information for analyzing token-to-token interactions in Transformer-based models. But despite their ease of interpretation, these weights are not faithful to the models' decisions as they are only one part of an encoder, and other components in the encoder layer can have considerable impact on information mixing in…

    Submitted 8 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted to EACL 2023 (main)

  11. Undesirable Biases in NLP: Addressing Challenges of Measurement

    Authors: Oskar van der Wal, Dominik Bachmann, Alina Leidinger, Leendert van Maanen, Willem Zuidema, Katrin Schulz

    Abstract: As Large Language Models and Natural Language Processing (NLP) technology rapidly develop and spread into daily life, it becomes crucial to anticipate how their use could harm people. One problem that has received a lot of attention in recent years is that this technology has displayed harmful biases, from generating derogatory stereotypes to producing disparate outcomes for different social group…

    Submitted 14 January, 2024; v1 submitted 24 November, 2022; originally announced November 2022.

    Journal ref: Journal of Artificial Intelligence Research, 79, 1-40 (2024)

  12. arXiv:2207.10245

    cs.CL cs.AI

    The Birth of Bias: A case study on the evolution of gender bias in an English language model

    Authors: Oskar van der Wal, Jaap Jumelet, Katrin Schulz, Willem Zuidema

    Abstract: Detecting and mitigating harmful biases in modern language models are widely recognized as crucial, open problems. In this paper, we take a step back and investigate how language models come to be biased in the first place. We use a relatively small language model, using the LSTM architecture trained on an English Wikipedia corpus. With full access to the data and to the model parameters as they c…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted at the 4th Workshop on Gender Bias in Natural Language Processing (NAACL, 2022)

  13. arXiv:2109.14989

    cs.CL

    Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations

    Authors: Arabella Sinclair, Jaap Jumelet, Willem Zuidema, Raquel Fernández

    Abstract: We investigate the extent to which modern, neural language models are susceptible to structural priming, the phenomenon whereby the structure of a sentence makes the same structure more probable in a follow-up sentence. We explore how priming can be used to study the potential of these models to learn abstract structural information, which is a prerequisite for good performance on tasks that requi…

    Submitted 29 June, 2022; v1 submitted 30 September, 2021; originally announced September 2021.

    Comments: Published in TACL, MIT Press

  14. arXiv:2011.05295

    cs.CL

    DoLFIn: Distributions over Latent Features for Interpretability

    Authors: Phong Le, Willem Zuidema

    Abstract: Interpreting the inner workings of neural models is a key step in ensuring the robustness and trustworthiness of the models, but work on neural network interpretability typically faces a trade-off: either the models are too constrained to be very useful, or the solutions found by the models are too complex to interpret. We propose a novel strategy for achieving interpretability that -- in our expe…

    Submitted 10 November, 2020; originally announced November 2020.

    Journal ref: COLING 2020

  15. arXiv:2006.00555

    cs.LG cs.AI stat.ML

    Transferring Inductive Biases through Knowledge Distillation

    Authors: Samira Abnar, Mostafa Dehghani, Willem Zuidema

    Abstract: Having the right inductive biases can be crucial in many tasks or scenarios where data or computing resources are a limiting factor, or where training data is not perfectly representative of the conditions at test time. However, defining, designing and efficiently adapting inductive biases is not necessarily straightforward. In this paper, we explore the power of knowledge distillation for transfe…

    Submitted 4 October, 2020; v1 submitted 31 May, 2020; originally announced June 2020.

  16. arXiv:2005.00928

    cs.LG cs.AI cs.CL

    Quantifying Attention Flow in Transformers

    Authors: Samira Abnar, Willem Zuidema

    Abstract: In the Transformer model, "self-attention" combines information from attended embeddings into the representation of the focal embedding in the next layer. Thus, across layers of the Transformer, information originating from different tokens gets increasingly mixed. This makes attention weights unreliable as explanations probes. In this paper, we consider the problem of quantifying this flow of inf…

    Submitted 31 May, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

  17. arXiv:1909.08975

    cs.CL cs.AI stat.ML

    Analysing Neural Language Models: Contextual Decomposition Reveals Default Reasoning in Number and Gender Assignment

    Authors: Jaap Jumelet, Willem Zuidema, Dieuwke Hupkes

    Abstract: Extensive research has recently shown that recurrent neural language models are able to process a wide range of grammatical phenomena. How these models are able to perform these remarkable feats so well, however, is still an open question. To gain more insight into what information LSTMs base their decisions on, we propose a generalisation of Contextual Decomposition (GCD). In particular, this set…

    Submitted 19 September, 2019; originally announced September 2019.

    Comments: To appear at CoNLL2019

  18. arXiv:1906.01539

    cs.AI cs.CL q-bio.NC

    Blackbox meets blackbox: Representational Similarity and Stability Analysis of Neural Language Models and Brains

    Authors: Samira Abnar, Lisa Beinborn, Rochelle Choenni, Willem Zuidema

    Abstract: In this paper, we define and apply representational stability analysis (ReStA), an intuitive way of analyzing neural language models. ReStA is a variant of the popular representational similarity analysis (RSA) in cognitive neuroscience. While RSA can be used to compare representations in models, model components, and human brains, ReStA compares instances of the same model, while systematically v…

    Submitted 5 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Journal ref: 2nd BlackBoxNLP workshop @ACL2019

  19. arXiv:1906.00180

    cs.AI cs.CL cs.LG

    Siamese recurrent networks learn first-order logic reasoning and exhibit zero-shot compositional generalization

    Authors: Mathijs Mul, Willem Zuidema

    Abstract: Can neural nets learn logic? We approach this classic question with current methods, and demonstrate that recurrent neural networks can learn to recognize first order logical entailment relations between expressions. We define an artificial language in first-order predicate logic, generate a large dataset of sample 'sentences', and use an automatic theorem prover to infer the relation between rand…

    Submitted 1 June, 2019; originally announced June 2019.

    Comments: 12 pages, 5 figures

  20. arXiv:1901.05180

    cs.CL

    Formal models of Structure Building in Music, Language and Animal Songs

    Authors: Willem Zuidema, Dieuwke Hupkes, Geraint Wiggins, Constance Scharff, Martin Rohrmeier

    Abstract: Human language, music and a variety of animal vocalisations constitute ways of sonic communication that exhibit remarkable structural complexity. While the complexities of language and possible parallels in animal communication have been discussed intensively, reflections on the complexity of music and animal song, and their comparisons are underrepresented. In some ways, music and animal songs ar…

    Submitted 16 January, 2019; originally announced January 2019.

    Comments: Pre-edited version of Zuidema, W., Hupkes, D., Wiggins, G. A., Scharff, C., & Rohrmeier, M. (2018). Formal Models of Structure Building in Music, Language, and Animal Song. The Origins of Musicality, 253

  21. arXiv:1808.08079

    cs.CL cs.AI

    Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information

    Authors: Mario Giulianelli, Jacqueline Harding, Florian Mohnert, Dieuwke Hupkes, Willem Zuidema

    Abstract: How do neural language models keep track of number agreement between subject and verb? We show that 'diagnostic classifiers', trained to predict number from the internal states of a language model, provide a detailed understanding of how, when, and where this information is represented. Moreover, they give us insight into when and where number information is corrupted in cases where the language m…

    Submitted 18 November, 2021; v1 submitted 24 August, 2018; originally announced August 2018.

    Comments: Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

  22. arXiv:1711.10203

    cs.CL

    Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure

    Authors: Dieuwke Hupkes, Sara Veldhoen, Willem Zuidema

    Abstract: We investigate how neural networks can learn and process languages with hierarchical, compositional semantics. To this end, we define the artificial task of processing nested arithmetic expressions, and study whether different types of neural networks can learn to compute their meaning. We find that recursive neural networks can find a generalising solution to this problem, and we visualise this s…

    Submitted 20 April, 2018; v1 submitted 28 November, 2017; originally announced November 2017.

    Comments: 20 pages

    Journal ref: Journal of Artificial Intelligence Research 61 (2018) 907-926

  23. arXiv:1711.09285

    cs.CL

    Experiential, Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity

    Authors: Samira Abnar, Rasyan Ahmed, Max Mijnheer, Willem Zuidema

    Abstract: We evaluate 8 different word embedding models on their usefulness for predicting the neural activation patterns associated with concrete nouns. The models we consider include an experiential model, based on crowd-sourced association data, several popular neural and distributional models, and a model that reflects the syntactic context of words (based on dependency parses). Our goal is to assess th…

    Submitted 25 November, 2017; originally announced November 2017.

    Comments: accepted at Cognitive Modeling and Computational Linguistics 2018

  24. arXiv:1603.00423

    cs.AI cs.CL cs.NE

    Quantifying the vanishing gradient and long distance dependency problem in recursive neural networks and recursive LSTMs

    Authors: Phong Le, Willem Zuidema

    Abstract: Recursive neural networks (RNN) and their recently proposed extension recursive long short term memory networks (RLSTM) are models that compute representations for sentences, by recursively combining word embeddings according to an externally provided parse tree. Both models thus, unlike recurrent networks, explicitly make use of the hierarchical structure of a sentence. In this paper, we demonstr…

    Submitted 1 March, 2016; originally announced March 2016.

  25. arXiv:1504.04666

    cs.CL cs.LG

    Unsupervised Dependency Parsing: Let's Use Supervised Parsers

    Authors: Phong Le, Willem Zuidema

    Abstract: We present a self-training approach to unsupervised dependency parsing that reuses existing supervised and unsupervised parsing algorithms. Our approach, called 'iterated reranking' (IR), starts with dependency trees generated by an unsupervised parser, and iteratively improves these trees using the richer probability models used in supervised parsing that are in turn trained on these trees. Our s…

    Submitted 17 April, 2015; originally announced April 2015.

    Comments: 11 pages

  26. arXiv:1503.02510

    cs.CL cs.AI cs.LG

    Compositional Distributional Semantics with Long Short Term Memory

    Authors: Phong Le, Willem Zuidema

    Abstract: We are proposing an extension of the recursive neural network that makes use of a variant of the long short-term memory architecture. The extension allows information low in parse trees to be stored in a memory register (the 'memory cell') and used much later higher up in the parse tree. This provides a solution to the vanishing gradient problem and allows the network to capture long range depende…

    Submitted 17 April, 2015; v1 submitted 9 March, 2015; originally announced March 2015.

    Comments: 10 pages, 7 figures