Showing 1–16 of 16 results for author: Beguš, G

Searching in archive cs.
  1. arXiv:2309.07861  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    CiwaGAN: Articulatory information exchange

    Authors: Gašper Beguš, Thomas Lu, Alan Zhou, Peter Wu, Gopala K. Anumanchipalli

    Abstract: Humans encode information into sounds by controlling articulators and decode information from sounds using the auditory apparatus. This paper introduces CiwaGAN, a model of human spoken language acquisition that combines unsupervised articulatory modeling with an unsupervised model of information exchange through the auditory modality. While prior research includes unsupervised articulatory modeli…

    Submitted 14 September, 2023; originally announced September 2023.

  2. arXiv:2306.07195  [pdf]

    cs.CL cs.AI

    Large language models and (non-)linguistic recursion

    Authors: Maksymilian Dąbkowski, Gašper Beguš

    Abstract: Recursion is one of the hallmarks of human language. While many design features of language have been shown to exist in animal communication systems, recursion has not. Previous research shows that GPT-4 is the first large language model (LLM) to exhibit metalinguistic abilities (Beguš, Dąbkowski, and Rhodes 2023). Here, we propose several prompt designs aimed at eliciting and analyzing recursive…

    Submitted 12 June, 2023; originally announced June 2023.

  3. arXiv:2305.01626  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Basic syntax from speech: Spontaneous concatenation in unsupervised deep neural networks

    Authors: Gašper Beguš, Thomas Lu, Zili Wang

    Abstract: Computational models of syntax are predominantly text-based. Here we propose that basic syntax can be modeled directly from raw speech in a fully unsupervised way. We focus on one of the most ubiquitous and basic properties of syntax -- concatenation. We introduce spontaneous concatenation: a phenomenon where convolutional neural networks (CNNs) trained on acoustic recordings of individual words s…

    Submitted 2 May, 2023; originally announced May 2023.

  4. arXiv:2305.00948  [pdf]

    cs.CL cs.AI

    Large Linguistic Models: Analyzing theoretical linguistic abilities of LLMs

    Authors: Gašper Beguš, Maksymilian Dąbkowski, Ryan Rhodes

    Abstract: The performance of large language models (LLMs) has recently improved to the point where the models can perform well on many language tasks. We show here that for the first time, the models can also generate coherent and valid formal analyses of linguistic data and illustrate the vast potential of large language models for analyses of their metalinguistic abilities. LLMs are primarily trained on l…

    Submitted 21 August, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

  5. arXiv:2304.13187  [pdf, other]

    cs.AI cs.SE

    AI-assisted coding: Experiments with GPT-4

    Authors: Russell A Poldrack, Thomas Lu, Gašper Beguš

    Abstract: Artificial intelligence (AI) tools based on large language models have achieved human-level performance on some computer programming tasks. We report several experiments using GPT-4 to generate computer code. These experiments demonstrate that AI code generation using the current generation of tools, while powerful, requires substantial human validation to ensure accurate performance. We also demo…

    Submitted 25 April, 2023; originally announced April 2023.

  6. arXiv:2303.10931  [pdf, other]

    stat.ML cs.LG cs.SD eess.AS

    Approaching an unknown communication system by latent space exploration and causal inference

    Authors: Gašper Beguš, Andrej Leban, Shane Gero

    Abstract: This paper proposes a methodology for discovering meaningful properties in data by exploring the latent space of unsupervised deep generative models. We combine manipulation of individual latent variables to extreme values with methods inspired by causal inference into an approach we call causal disentanglement with extreme values (CDEV) and show that this method yields insights for model interpre…

    Submitted 6 February, 2024; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 25 pages, 23 figures; new format and section layout (moved some sections to the appendix), added replication experiments, updated references: to a subsequent experimental validation of the work, as well as to related methodological work

  7. arXiv:2210.15173  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    Articulation GAN: Unsupervised modeling of articulatory learning

    Authors: Gašper Beguš, Alan Zhou, Peter Wu, Gopala K Anumanchipalli

    Abstract: Generative deep neural networks are widely used for speech synthesis, but most existing models directly generate waveforms or spectral outputs. Humans, however, produce speech by controlling articulators, which results in the production of speech sounds through physical properties of sound propagation. We introduce the Articulatory Generator to the Generative Adversarial Network paradigm, a new un…

    Submitted 12 March, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: ICASSP 2023

    Journal ref: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing

  8. Modeling speech recognition and synthesis simultaneously: Encoding and decoding lexical and sublexical semantic information into speech with no direct access to speech data

    Authors: Gašper Beguš, Alan Zhou

    Abstract: Human speakers encode information into raw speech which is then decoded by the listeners. This complex relationship between encoding (production) and decoding (perception) is often modeled separately. Here, we test how encoding and decoding of lexical semantic information can emerge automatically from raw speech in unsupervised generative deep convolutional networks that combine the production and…

    Submitted 29 June, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Interspeech 2022

    Journal ref: Proc. Interspeech 2022

  9. Interpreting intermediate convolutional layers in unsupervised acoustic word classification

    Authors: Gašper Beguš, Alan Zhou

    Abstract: Understanding how deep convolutional neural networks classify data has been subject to extensive research. This paper proposes a technique to visualize and interpret intermediate layers of unsupervised deep convolutional networks by averaging over individual feature maps in each convolutional layer and inferring underlying distributions of words with non-linear regression techniques. A GAN-based a…

    Submitted 3 February, 2022; v1 submitted 5 October, 2021; originally announced October 2021.

    Comments: ICASSP 2022

    Journal ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing

  10. arXiv:2104.09489  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    Interpreting intermediate convolutional layers of generative CNNs trained on waveforms

    Authors: Gašper Beguš, Alan Zhou

    Abstract: This paper presents a technique to interpret and visualize intermediate layers in generative CNNs trained on raw speech data in an unsupervised manner. We argue that averaging over feature maps after ReLU activation in each transpose convolutional layer yields interpretable time-series data. This technique allows for acoustic analysis of intermediate layers that parallels the acoustic analysis of…

    Submitted 9 September, 2022; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: IEEE/ACM Transactions on Audio Speech and Language Processing

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 3214-3229, 2022

  11. arXiv:2104.08614  [pdf]

    cs.SD cs.AI cs.CL cs.LG cs.RO eess.AS

    Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales

    Authors: Jacob Andreas, Gašper Beguš, Michael M. Bronstein, Roee Diamant, Denley Delaney, Shane Gero, Shafi Goldwasser, David F. Gruber, Sarah de Haas, Peter Malkin, Roger Payne, Giovanni Petri, Daniela Rus, Pratyusha Sharma, Dan Tchernov, Pernille Tønnesen, Antonio Torralba, Daniel Vogt, Robert J. Wood

    Abstract: The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman speci…

    Submitted 17 April, 2021; originally announced April 2021.

  12. arXiv:2011.05463  [pdf, other]

    cs.CL cs.SD eess.AS

    Deep Sound Change: Deep and Iterative Learning, Convolutional Neural Networks, and Language Change

    Authors: Gašper Beguš

    Abstract: This paper proposes a framework for modeling sound change that combines deep learning and iterative learning. Acquisition and transmission of speech is modeled by training generations of Generative Adversarial Networks (GANs) on unannotated raw speech data. The paper argues that several properties of sound change emerge from the proposed architecture. GANs (Goodfellow et al. 2014 arXiv:1406.2661,…

    Submitted 22 September, 2021; v1 submitted 10 November, 2020; originally announced November 2020.

  13. Local and non-local dependency learning and emergence of rule-like representations in speech data by Deep Convolutional Generative Adversarial Networks

    Authors: Gašper Beguš

    Abstract: This paper argues that training GANs on local and non-local dependencies in speech data offers insights into how deep neural networks discretize continuous data and how symbolic-like rule-based morphophonological processes emerge in a deep convolutional architecture. Acquisition of speech has recently been modeled as a dependency between latent space and data generated by GANs in Beguš (2020b; arX…

    Submitted 28 July, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

    Comments: In press at Computer Speech & Language

    Journal ref: Computer Speech and Language 71 (2022): 101244

  14. Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication

    Authors: Gašper Beguš

    Abstract: This paper models unsupervised learning of an identity-based pattern (or copying) in speech called reduplication from raw continuous data with deep convolutional neural networks. We use the ciwGAN architecture Beguš (2021a; arXiv:2006.02951) in which learning of meaningful representations in speech emerges from a requirement that the CNNs generate informative data. We propose a technique to wug-te…

    Submitted 17 July, 2021; v1 submitted 13 September, 2020; originally announced September 2020.

    Comments: Paper accepted at TACL

    Journal ref: Transactions of the Association for Computational Linguistics 9 (2021): 1180-1196

  15. arXiv:2006.03965  [pdf, other]

    cs.CL cs.LG q-bio.NC

    Generative Adversarial Phonology: Modeling unsupervised phonetic and phonological learning with neural networks

    Authors: Gašper Beguš

    Abstract: Training deep neural networks on well-understood dependencies in speech data can provide new insights into how they learn internal representations. This paper argues that acquisition of speech can be modeled as a dependency between random space and generated speech data in the Generative Adversarial Network architecture and proposes a methodology to uncover the network's internal representations t…

    Submitted 6 June, 2020; originally announced June 2020.

    Comments: Provisionally accepted in Frontiers in Artificial Intelligence

    Journal ref: Frontiers in Artificial Intelligence 2020

  16. CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks

    Authors: Gašper Beguš

    Abstract: How can deep neural networks encode information that corresponds to words in human speech into raw acoustic data? This paper proposes two neural network architectures for modeling unsupervised lexical learning from raw acoustic inputs, ciwGAN (Categorical InfoWaveGAN) and fiwGAN (Featural InfoWaveGAN), that combine a Deep Convolutional GAN architecture for audio data (WaveGAN; arXiv:1705.07904) wi…

    Submitted 28 July, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: Published in Neural Networks

    Journal ref: Neural Networks 139 (2021), pp. 305-325