Showing 1–14 of 14 results for author: Ravishankar, V

  1. arXiv:2403.00876  [pdf, other]

    cs.CL cs.AI

    Word Order and World Knowledge

    Authors: Qinghua Zhao, Vinit Ravishankar, Nicolas Garneau, Anders Søgaard

    Abstract: Word order is an important concept in natural language, and in this work, we study how word order affects the induction of world knowledge from raw text using language models. We use word analogies to probe for such knowledge. Specifically, in addition to the natural word order, we first respectively extract texts of six fixed word orders from five languages and then pretrain the language models o…

    Submitted 1 March, 2024; originally announced March 2024.

  2. arXiv:2402.17016  [pdf, other]

    cs.CL cs.AI cs.IR

    Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings

    Authors: Isabelle Mohr, Markus Krimmel, Saba Sturua, Mohammad Kalim Akram, Andreas Koukounas, Michael Günther, Georgios Mastrapas, Vinit Ravishankar, Joan Fontanals Martínez, Feng Wang, Qi Liu, Ziniu Yu, Jie Fu, Saahil Ognawala, Susana Guzman, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao

    Abstract: We introduce a novel suite of state-of-the-art bilingual text embedding models that are designed to support English and another target language. These models are capable of processing lengthy text inputs with up to 8192 tokens, making them highly versatile for a range of natural language processing tasks such as text retrieval, clustering, and semantic textual similarity (STS) calculations. By f…

    Submitted 26 February, 2024; originally announced February 2024.

    MSC Class: 68T50; ACM Class: I.2.7

  3. arXiv:2209.10321  [pdf, other]

    cs.LO cs.CR cs.SC

    CryptoSolve: Towards a Tool for the Symbolic Analysis of Cryptographic Algorithms

    Authors: Dalton Chichester, Wei Du, Raymond Kauffman, Hai Lin, Christopher Lynch, Andrew M. Marshall, Catherine A. Meadows, Paliath Narendran, Veena Ravishankar, Luis Rovira, Brandon Rozek

    Abstract: Recently, interest has been emerging in the application of symbolic techniques to the specification and analysis of cryptosystems. These techniques, when accompanied by suitable proofs of soundness/completeness, can be used both to identify insecure cryptosystems and prove sound ones secure. But although a number of such symbolic algorithms have been developed and implemented, they remain scattere…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: In Proceedings GandALF 2022, arXiv:2209.09333

    Journal ref: EPTCS 370, 2022, pp. 147-161

  4. arXiv:2203.10995  [pdf, other]

    cs.CL

    Word Order Does Matter (And Shuffled Language Models Know It)

    Authors: Vinit Ravishankar, Mostafa Abdou, Artur Kulmizev, Anders Søgaard

    Abstract: Recent studies have shown that language models pretrained and/or fine-tuned on randomly permuted sentences exhibit competitive performance on GLUE, putting into question the importance of word order information. Somewhat counter-intuitively, some of these studies also report that position embeddings appear to be crucial for models' good performance with shuffled text. We probe these language model…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: To appear at ACL 2022; 9 pages

  5. arXiv:2109.05388  [pdf, other]

    cs.CL

    The Impact of Positional Encodings on Multilingual Compression

    Authors: Vinit Ravishankar, Anders Søgaard

    Abstract: In order to preserve word-order information in a non-autoregressive setting, transformer architectures tend to include positional knowledge, by (for instance) adding positional encodings to token embeddings. Several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture; these include, for instance, separating position encodings and…

    Submitted 11 September, 2021; originally announced September 2021.

  6. arXiv:2101.10927  [pdf, other]

    cs.CL

    Attention Can Reflect Syntactic Structure (If You Let It)

    Authors: Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre

    Abstract: Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of such work focused almost exclusively on English -- a language with rigid word order and a lack of inflectional morphology. In this study, we present decoding experiments for multilingual…

    Submitted 26 January, 2021; originally announced January 2021.

    Journal ref: EACL 2021

  7. arXiv:2005.01348  [pdf, other]

    cs.CL cs.LG

    The Sensitivity of Language Models and Humans to Winograd Schema Perturbations

    Authors: Mostafa Abdou, Vinit Ravishankar, Maria Barrett, Yonatan Belinkov, Desmond Elliott, Anders Søgaard

    Abstract: Large-scale pretrained language models are the major driving force behind recent improvements in performance on the Winograd Schema Challenge, a widely employed test of common sense reasoning ability. We show, however, with a new diagnostic dataset, that these models are sensitive to linguistic perturbations of the Winograd examples that minimally affect human understanding. Our results highlight…

    Submitted 7 May, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  8. arXiv:2005.00633  [pdf, other]

    cs.CL

    From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

    Authors: Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

    Abstract: Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance. Current downstream evaluations, however, verify their efficacy predominantly in transfer settings involving languages with sufficient amounts of pretraining data,…

    Submitted 1 May, 2020; originally announced May 2020.

  9. arXiv:2004.14096  [pdf, other]

    cs.CL cs.LG

    Do Neural Language Models Show Preferences for Syntactic Formalisms?

    Authors: Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou, Joakim Nivre

    Abstract: Recent work on the interpretability of deep neural language models has concluded that many properties of natural language syntax are encoded in their representational spaces. However, such studies often suffer from limited scope by focusing on a single language and a single linguistic formalism. In this study, we aim to investigate the extent to which the semblance of syntactic structure captured…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  10. arXiv:2002.08131  [pdf, other]

    cs.CL cs.LG

    A Systematic Comparison of Architectures for Document-Level Sentiment Classification

    Authors: Jeremy Barnes, Vinit Ravishankar, Lilja Øvrelid, Erik Velldal

    Abstract: Documents are composed of smaller pieces - paragraphs, sentences, and tokens - that have complex relationships between one another. Sentiment classification models that take into account the structure inherent in these documents have a theoretical advantage over those that do not. At the same time, transfer learning models based on language model pretraining have shown promise for document classif…

    Submitted 2 February, 2022; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: 5 pages, 2 figures

  11. arXiv:1907.00227  [pdf, ps, other]

    cs.CC cs.CR

    On Asymmetric Unification for the Theory of XOR with a Homomorphism

    Authors: Christopher Lynch, Andrew M. Marshall, Catherine Meadows, Paliath Narendran, Veena Ravishankar

    Abstract: Asymmetric unification, or unification with irreducibility constraints, is a newly developed paradigm that arose out of the automated analysis of cryptographic protocols. However, there are still relatively few asymmetric unification algorithms. In this paper we address this lack by exploring the application of automata-based unification methods. We examine the theory of xor with a homomorphism, A…

    Submitted 29 June, 2019; originally announced July 2019.

  12. arXiv:1906.05061  [pdf, other]

    cs.CL

    Probing Multilingual Sentence Representations With X-Probe

    Authors: Vinit Ravishankar, Lilja Øvrelid, Erik Velldal

    Abstract: This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain. In doing so, we make two contributions: first, we provide datasets for multilingual probing, derived from Wikipedia, in five languages, viz. English, French, German, Spanish and Russian. Second, we evaluate six sentence encoders for each language, each trained by mapping sentence repres…

    Submitted 12 June, 2019; originally announced June 2019.

    Comments: To appear at RepL4NLP '19

  13. arXiv:1808.09716  [pdf, other]

    cs.CL

    What can we learn from Semantic Tagging?

    Authors: Mostafa Abdou, Artur Kulmizev, Vinit Ravishankar, Lasha Abzianidze, Johan Bos

    Abstract: We investigate the effects of multi-task learning using the recently introduced task of semantic tagging. We employ semantic tagging as an auxiliary task for three different NLP tasks: part-of-speech tagging, Universal Dependency parsing, and Natural Language Inference. We compare full neural network sharing, partial neural network sharing, and what we term the learning what to share setting where…

    Submitted 29 August, 2018; originally announced August 2018.

    Comments: 9 pages with references and appendixes. EMNLP 2018 camera ready

  14. arXiv:1706.05066  [pdf, ps, other]

    cs.LO

    Asymmetric Unification and Disunification

    Authors: Veena Ravishankar, Kimberly A. Gero, Paliath Narendran

    Abstract: We compare two kinds of unification problems: Asymmetric Unification and Disunification, which are variants of Equational Unification. Asymmetric Unification is a type of Equational Unification where the right-hand sides of the equations are in normal form with respect to the given term rewriting system. In Disunification we solve equations and disequations with respect to an equational theory for…

    Submitted 5 October, 2017; v1 submitted 15 June, 2017; originally announced June 2017.