
Showing 1–6 of 6 results for author: Sia, S

  1. arXiv:2403.04510

    cs.CL cs.AI

    Where does In-context Translation Happen in Large Language Models

    Authors: Suzanna Sia, David Mueller, Kevin Duh

    Abstract: Self-supervised large language models have demonstrated the ability to perform Machine Translation (MT) via in-context learning, but little is known about where the model performs the task with respect to prompt instructions and demonstration examples. In this work, we attempt to characterize the region where large language models transition from in-context learners to translation models. Through…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 19 pages. Under Review

  2. arXiv:2311.08324

    cs.CL cs.AI

    Anti-LM Decoding for Zero-shot In-context Machine Translation

    Authors: Suzanna Sia, Alexandra DeLucia, Kevin Duh

    Abstract: Zero-shot In-context learning is the phenomenon where models can perform the task simply given the instructions. However, pre-trained large language models are known to be poorly calibrated for this task. One of the most effective approaches to handling this bias is to adopt a contrastive decoding objective, which accounts for the prior probability of generating the next token by conditioning on s…

    Submitted 2 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL Findings 2024

  3. arXiv:2305.03573

    cs.CL cs.AI

    In-context Learning as Maintaining Coherency: A Study of On-the-fly Machine Translation Using Large Language Models

    Authors: Suzanna Sia, Kevin Duh

    Abstract: The phenomenon of in-context learning has typically been thought of as "learning from examples". In this work, which focuses on Machine Translation, we present a perspective of in-context learning as the desired generation task maintaining coherency with its context, i.e., the prompt examples. We first investigate randomly sampled prompts across 4 domains, and find that translation performance impro…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: 9 pages

  4. arXiv:2205.12469

    cs.CL

    Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI

    Authors: Suzanna Sia, Anton Belyy, Amjad Almahairi, Madian Khabsa, Luke Zettlemoyer, Lambert Mathias

    Abstract: Evaluating an explanation's faithfulness is desired for many reasons such as trust, interpretability and diagnosing the sources of a model's errors. In this work, which focuses on the NLI task, we introduce the methodology of Faithfulness-through-Counterfactuals, which first generates a counterfactual hypothesis based on the logical predicates expressed in the explanation, and then evaluates if the…

    Submitted 24 May, 2022; originally announced May 2022.

    Comments: Under Review

  5. arXiv:2108.05525

    cs.AI

    Clustering with UMAP: Why and How Connectivity Matters

    Authors: Ayush Dalmia, Suzanna Sia

    Abstract: Topology based dimensionality reduction methods such as t-SNE and UMAP have seen increasing success and popularity in high-dimensional data. These methods have strong mathematical foundations and are based on the intuition that the topology in low dimensions should be close to that of high dimensions. Given that the initial topological structure is a precursor to the success of the algorithm, this…

    Submitted 16 December, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

    Comments: Published as a long paper at 2nd Graphs and more Complex structures for Learning and Reasoning Workshop in AAAI 2022

  6. arXiv:2004.14914

    cs.CL

    Tired of Topic Models? Clusters of Pretrained Word Embeddings Make for Fast and Good Topics too!

    Authors: Suzanna Sia, Ayush Dalmia, Sabrina J. Mielke

    Abstract: Topic models are a useful analysis tool to uncover the underlying themes within document collections. The dominant approach is to use probabilistic topic models that posit a generative story, but in this paper we propose an alternative way to obtain topics: clustering pre-trained word embeddings while incorporating document information for weighted clustering and reranking top words. We provide be…

    Submitted 6 October, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: Published as a short paper at EMNLP 2020