Showing 1–6 of 6 results for author: Harel-Canada, F

  1. arXiv:2406.12680

    cs.CL

    Measuring Psychological Depth in Language Models

    Authors: Fabrice Harel-Canada, Hanyu Zhou, Sreya Mupalla, Zeynep Yildiz, Amit Sahai, Nanyun Peng

    Abstract: Evaluations of creative stories generated by large language models (LLMs) often focus on objective properties of the text, such as its style, coherence, and toxicity. While these metrics are indispensable, they do not speak to a story's subjective, psychological impact from a reader's perspective. We introduce the Psychological Depth Scale (PDS), a novel framework rooted in literary theory that me…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Preprint. Under review.

  2. arXiv:2404.18881

    cs.HC cs.LG cs.SE

    Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking

    Authors: Hong Jin Kang, Fabrice Harel-Canada, Muhammad Ali Gulzar, Violet Peng, Miryung Kim

    Abstract: Data augmentation techniques apply transformations to existing texts to generate additional data. The transformations may produce low-quality texts, where the meaning of the text is changed and the text may even be mangled beyond human comprehension. Analyzing the synthetically generated texts and their corresponding labels is slow and demanding. To winnow out texts with incorrect labels, we devel…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 Findings

  3. arXiv:2210.12362

    cs.CL

    EnDex: Evaluation of Dialogue Engagingness at Scale

    Authors: Guangxuan Xu, Ruibo Liu, Fabrice Harel-Canada, Nischal Reddy Chandra, Nanyun Peng

    Abstract: We propose EnDex, the first human-reaction based model to evaluate dialogue engagingness. EnDex is trained on 80k Reddit-based Engagement Dataset (RED) curated using a novel distant-supervision framework. Engagingness is a key measure that captures high-level quality of AI dialogue systems and closely reflects actual user experience. However, data shortage, plus the abstract and extensive definiti…

    Submitted 22 October, 2022; originally announced October 2022.

    Journal ref: Findings of EMNLP 2022

  4. arXiv:2205.05137

    cs.CL cs.LG

    Sibylvariant Transformations for Robust Text Classification

    Authors: Fabrice Harel-Canada, Muhammad Ali Gulzar, Nanyun Peng, Miryung Kim

    Abstract: The vast majority of text transformation techniques in NLP are inherently limited in their ability to expand input space coverage due to an implicit constraint to preserve the original class label. In this work, we propose the notion of sibylvariance (SIB) to describe the broader set of transforms that relax the label-preserving constraint, knowably vary the expected class, and lead to significant…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: 9 pages, Findings of ACL 2022

  5. arXiv:2112.02721

    cs.CL cs.AI cs.LG

    NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

    Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)

    Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split…

    Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

  6. arXiv:2009.10396

    cs.LG cs.AI math.OC stat.ML

    Is Q-Learning Provably Efficient? An Extended Analysis

    Authors: Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar

    Abstract: This work extends the analysis of the theoretical results presented within the paper Is Q-Learning Provably Efficient? by Jin et al. We include a survey of related research to contextualize the need for strengthening the theoretical guarantees related to perhaps the most important threads of model-free reinforcement learning. We also expound upon the reasoning used in the proofs to highlight the c…

    Submitted 22 September, 2020; originally announced September 2020.