Showing 1–6 of 6 results for author: Wijesiriwardene, T

  1. arXiv:2310.07818  [pdf, other]

    cs.CL cs.AI

    On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models

    Authors: Thilini Wijesiriwardene, Ruwan Wickramarachchi, Aishwarya Naresh Reganti, Vinija Jain, Aman Chadha, Amit Sheth, Amitava Das

    Abstract: The ability of Large Language Models (LLMs) to encode syntactic and semantic structures of language is well examined in NLP. Additionally, analogy identification, in the form of word analogies are extensively studied in the last decade of language modeling literature. In this work we specifically look at how LLMs' abilities to capture sentence analogies (sentences that convey analogous meaning to…

    Submitted 5 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: To appear in Findings of EACL 2024

  2. arXiv:2308.01936  [pdf]

    cs.AI cs.CL

    Why Do We Need Neuro-symbolic AI to Model Pragmatic Analogies?

    Authors: Thilini Wijesiriwardene, Amit Sheth, Valerie L. Shalin, Amitava Das

    Abstract: A hallmark of intelligence is the ability to use a familiar domain to make inferences about a less familiar domain, known as analogical reasoning. In this article, we delve into the performance of Large Language Models (LLMs) in dealing with progressively complex analogies expressed in unstructured text. We discuss analogies at four distinct levels of complexity: lexical analogies, syntactic analo…

    Submitted 12 September, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Comments: 12 pages 3 figures

  3. arXiv:2305.05050  [pdf, other]

    cs.CL cs.AI

    ANALOGICAL -- A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models

    Authors: Thilini Wijesiriwardene, Ruwan Wickramarachchi, Bimal G. Gajera, Shreeyash Mukul Gowaikar, Chandan Gupta, Aman Chadha, Aishwarya Naresh Reganti, Amit Sheth, Amitava Das

    Abstract: Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw…

    Submitted 25 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted as a long paper at Findings of ACL 2023

  4. arXiv:2204.12716  [pdf, other]

    cs.CL cs.AI

    UBERT: A Novel Language Model for Synonymy Prediction at Scale in the UMLS Metathesaurus

    Authors: Thilini Wijesiriwardene, Vinh Nguyen, Goonmeet Bajaj, Hong Yung Yip, Vishesh Javangula, Yuqing Mao, Kin Wah Fung, Srinivasan Parthasarathy, Amit P. Sheth, Olivier Bodenreider

    Abstract: The UMLS Metathesaurus integrates more than 200 biomedical source vocabularies. During the Metathesaurus construction process, synonymous terms are clustered into concepts by human editors, assisted by lexical similarity algorithms. This process is error-prone and time-consuming. Recently, a deep learning model (LexLM) has been developed for the UMLS Vocabulary Alignment (UVA) task. This work intr…

    Submitted 27 April, 2022; originally announced April 2022.

  5. arXiv:2109.13348  [pdf, other]

    cs.CL

    Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus

    Authors: Goonmeet Bajaj, Vinh Nguyen, Thilini Wijesiriwardene, Hong Yung Yip, Vishesh Javangula, Srinivasan Parthasarathy, Amit Sheth, Olivier Bodenreider

    Abstract: The current UMLS (Unified Medical Language System) Metathesaurus construction process for integrating over 200 biomedical source vocabularies is expensive and error-prone as it relies on the lexical algorithms and human editors for deciding if the two biomedical terms are synonymous. Recent advances in Natural Language Processing such as Transformer models like BERT and its biomedical variants wit…

    Submitted 15 October, 2021; v1 submitted 14 September, 2021; originally announced September 2021.

  6. ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter

    Authors: Thilini Wijesiriwardene, Hale Inan, Ugur Kursuncu, Manas Gaur, Valerie L. Shalin, Krishnaprasad Thirunarayan, Amit Sheth, I. Budak Arpinar

    Abstract: The convenience of social media has also enabled its misuse, potentially resulting in toxic behavior. Nearly 66% of internet users have observed online harassment, and 41% claim personal experience, with 18% facing severe forms of online harassment. This toxic communication has a significant impact on the well-being of young individuals, affecting mental health and, in some cases, resulting in sui…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: Accepted: Social Informatics 2020

    Journal ref: International Conference on Social Informatics. 12467 (2020) 427-439