Showing 1–13 of 13 results for author: Chiu, J T

  1. arXiv:2406.07524  [pdf, other]

    cs.CL cs.AI cs.LG

    Simple and Effective Masked Diffusion Language Models

    Authors: Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov

    Abstract: While diffusion models excel at generating high-quality images, prior work reports a significant performance gap between diffusion and autoregressive (AR) methods in language modeling. In this work, we show that simple masked discrete diffusion is more performant than previously thought. We apply an effective training recipe that improves the performance of masked diffusion models and derive a sim…

    Submitted 11 June, 2024; originally announced June 2024.

    Report number: cr07

  2. arXiv:2311.13647  [pdf, other]

    cs.CL cs.LG

    Language Model Inversion

    Authors: John X. Morris, Wenting Zhao, Justin T. Chiu, Vitaly Shmatikov, Alexander M. Rush

    Abstract: Language models produce a distribution over the next token; can we use this information to recover the prompt tokens? We consider the problem of language model inversion and show that next-token probabilities contain a surprising amount of information about the preceding text. Often we can recover the text in cases where it is hidden from the user, motivating a method for recovering unknown prompt…

    Submitted 22 November, 2023; originally announced November 2023.

  3. arXiv:2311.08584  [pdf, other]

    cs.CL

    Asking More Informative Questions for Grounded Retrieval

    Authors: Sedrick Keh, Justin T. Chiu, Daniel Fried

    Abstract: When a model is trying to gather information in an interactive setting, it benefits from asking informative questions. However, in the case of a grounded multi-turn image identification task, previous studies have been constrained to polar yes/no questions, limiting how much information the model can gain in a single turn. We present an approach that formulates more informative, open-ended questio…

    Submitted 14 November, 2023; originally announced November 2023.

  4. arXiv:2311.08469  [pdf, other]

    cs.CL

    UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations

    Authors: Wenting Zhao, Justin T Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi, Xiang Lorraine Li, Alane Suhr

    Abstract: Language technologies that accurately model the dynamics of events must perform commonsense reasoning. Existing work evaluating commonsense reasoning focuses on making inferences about common, everyday situations. To instead investigate the ability to model unusual, unexpected, and unlikely situations, we explore the task of uncommonsense abductive reasoning. Given a piece of context with an unexp…

    Submitted 1 May, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: accepted at NAACL'24

  5. arXiv:2311.08390  [pdf, other]

    cs.CL

    Predicting Text Preference Via Structured Comparative Reasoning

    Authors: Jing Nathan Yan, Tianqi Liu, Justin T Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Yao Zhao, Charu Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky

    Abstract: Comparative reasoning plays a crucial role in text preference prediction; however, large language models (LLMs) often demonstrate inconsistencies in their reasoning. While approaches like Chain-of-Thought improve accuracy in many other settings, they struggle to consistently distinguish the similarities and differences of complex texts. We introduce SC, a prompting approach that predicts text pref…

    Submitted 1 July, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  6. arXiv:2310.17140  [pdf, other]

    cs.CL cs.AI

    Symbolic Planning and Code Generation for Grounded Dialogue

    Authors: Justin T. Chiu, Wenting Zhao, Derek Chen, Saujas Vaduguru, Alexander M. Rush, Daniel Fried

    Abstract: Large language models (LLMs) excel at processing and generating both text and code. However, LLMs have had limited applicability in grounded task-oriented dialogue as they are difficult to steer toward task objectives and fail to handle novel grounding. We present a modular and interpretable grounded dialogue system that addresses these shortcomings by composing LLMs with a symbolic planner and gr…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023

  7. arXiv:2305.14618  [pdf, other]

    cs.CL cs.AI

    Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations

    Authors: Wenting Zhao, Justin T. Chiu, Claire Cardie, Alexander M. Rush

    Abstract: Abductive reasoning aims to find plausible explanations for an event. This style of reasoning is critical for commonsense tasks where there are often multiple plausible explanations. Existing approaches for abductive reasoning in natural language processing (NLP) often rely on manually generated annotations for supervision; however, such annotations can be subjective and biased. Instead of using d…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: accepted at ACL'23

  8. arXiv:2305.14237  [pdf, ps, other]

    cs.CL cs.AI

    HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision

    Authors: Wenting Zhao, Justin T. Chiu, Claire Cardie, Alexander M. Rush

    Abstract: Explainable multi-hop question answering (QA) not only predicts answers but also identifies rationales, i.e., subsets of input sentences used to derive the answers. This problem has been extensively studied under the supervised setting, where both answer and rationale annotations are given. Because rationale annotations are expensive to collect and not always available, recent efforts have been de…

    Submitted 23 May, 2023; originally announced May 2023.

  9. arXiv:2210.13763  [pdf, other]

    cs.NI cs.LG

    Teal: Learning-Accelerated Optimization of WAN Traffic Engineering

    Authors: Zhiying Xu, Francis Y. Yan, Rachee Singh, Justin T. Chiu, Alexander M. Rush, Minlan Yu

    Abstract: The rapid expansion of global cloud wide-area networks (WANs) has posed a challenge for commercial optimization engines to efficiently solve network traffic engineering (TE) problems at scale. Existing acceleration strategies decompose TE optimization into concurrent subproblems but realize limited parallelism due to an inherent tradeoff between run time and allocation performance. We present Te…

    Submitted 19 May, 2024; v1 submitted 25 October, 2022; originally announced October 2022.

  10. arXiv:2210.11528  [pdf, other]

    cs.CL

    Unsupervised Text Deidentification

    Authors: John X. Morris, Justin T. Chiu, Ramin Zabih, Alexander M. Rush

    Abstract: Deidentification seeks to anonymize textual data prior to distribution. Automatic deidentification primarily uses supervised named entity recognition from human-labeled data points. We propose an unsupervised deidentification method that masks words that leak personally-identifying information. The approach utilizes a specially trained reidentification model to identify individuals from redacted p…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022

  11. arXiv:2201.02715  [pdf, other]

    cs.CL

    Low-Rank Constraints for Fast Inference in Structured Models

    Authors: Justin T. Chiu, Yuntian Deng, Alexander M. Rush

    Abstract: Structured distributions, i.e. distributions over combinatorial spaces, are commonly used to learn latent probabilistic representations from observed data. However, scaling these models is bottlenecked by the high computational and memory complexity with respect to the size of the latent representations. Common models such as Hidden Markov Models (HMMs) and Probabilistic Context-Free Grammars (PCF…

    Submitted 7 January, 2022; originally announced January 2022.

    Comments: 22 pages. Published at NeurIPS 2021

  12. arXiv:2109.05042  [pdf, other]

    cs.CL

    Reference-Centric Models for Grounded Collaborative Dialogue

    Authors: Daniel Fried, Justin T. Chiu, Dan Klein

    Abstract: We present a grounded neural dialogue model that successfully collaborates with people in a partially-observable reference game. We focus on a setting where two agents each observe an overlapping part of a world context and need to identify and agree on some object they share. Therefore, the agents should pool their information and communicate pragmatically to solve the task. Our dialogue agent ac…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  13. arXiv:2011.04640  [pdf, other]

    cs.CL cs.LG

    Scaling Hidden Markov Language Models

    Authors: Justin T. Chiu, Alexander M. Rush

    Abstract: The hidden Markov model (HMM) is a fundamental tool for sequence modeling that cleanly separates the hidden state from the emission structure. However, this separation makes it difficult to fit HMMs to large datasets in modern NLP, and they have fallen out of use due to very poor performance compared to fully observed models. This work revisits the challenge of scaling HMMs to language modeling da…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: 9 pages, accepted as a short paper at EMNLP 2020

    Journal ref: EMNLP 2020