Showing 1–16 of 16 results for author: Lou, R

Searching in archive cs.
  1. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.16203  [pdf, other]

    cs.CL

    LLMs' Classification Performance is Overclaimed

    Authors: Hanzi Xu, Renze Lou, Jiangshu Du, Vahid Mahzoon, Elmira Talebianaraki, Zhuoan Zhou, Elizabeth Garrison, Slobodan Vucetic, Wenpeng Yin

    Abstract: In many classification tasks designed for AI or human to solve, gold labels are typically included within the label space by default, often posed as "which of the following is correct?" This standard setup has traditionally highlighted the strong performance of advanced AI, particularly top-performing Large Language Models (LLMs), in routine classification tasks. However, when the gold label is in…

    Submitted 29 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  3. arXiv:2406.05948  [pdf, other]

    cs.CR cs.AI

    Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models

    Authors: Xi Li, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang

    Abstract: Backdoor attacks present significant threats to Large Language Models (LLMs), particularly with the rise of third-party services that offer API integration and prompt engineering. Untrustworthy third parties can plant backdoors into LLMs and pose risks to users by embedding malicious instructions into user queries. The backdoor-compromised LLM will generate malicious output when an input is embed…

    Submitted 9 June, 2024; originally announced June 2024.

  4. arXiv:2404.03602  [pdf, other]

    cs.CL

    Evaluating LLMs at Detecting Errors in LLM Responses

    Authors: Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang

    Abstract: With Large Language Models (LLMs) being widely used across various tasks, detecting errors in their responses is increasingly crucial. However, little research has been conducted on error detection of LLM responses. Collecting error annotations on LLM responses is challenging due to the subjective nature of many NLP tasks, and thus previous research focuses on tasks of little practical value (e.g.…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Benchmark and code: https://github.com/psunlpgroup/ReaLMistake

  5. arXiv:2402.01622  [pdf, other]

    cs.CL

    TravelPlanner: A Benchmark for Real-World Planning with Language Agents

    Authors: Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su

    Abstract: Planning has been part of the core pursuit for artificial intelligence since its conception, but earlier AI agents mostly focused on constrained settings because many of the cognitive substrates necessary for human-level planning have been lacking. Recently, language agents powered by large language models (LLMs) have shown interesting capabilities such as tool use and reasoning. Are these languag…

    Submitted 23 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML 2024 (Spotlight)

  6. arXiv:2402.00157  [pdf, other]

    cs.CL

    Large Language Models for Mathematical Reasoning: Progresses and Challenges

    Authors: Janice Ahn, Rishu Verma, Renze Lou, Di Liu, Rui Zhang, Wenpeng Yin

    Abstract: Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive capabilities of human intelligence. In recent times, there has been a notable surge in the development of Large Language Models (LLMs) geared towards the automated resolution of mathematical problems. However, the landscape of mathematical problem types is vast and varied, with LLM-oriented techniques undergoing…

    Submitted 5 April, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: EACL 2024 Student Research Workshop, 8 pages

  7. arXiv:2401.03082  [pdf, other]

    cs.AI

    UMIE: Unified Multimodal Information Extraction with Instruction Tuning

    Authors: Lin Sun, Kai Zhang, Qingyuan Li, Renze Lou

    Abstract: Multimodal information extraction (MIE) gains significant attention as the popularity of multimedia content increases. However, current MIE methods often resort to using task-specific model structures, which results in limited generalizability across tasks and underutilizes shared knowledge across MIE tasks. To address these issues, we propose UMIE, a unified multimodal information extractor to un…

    Submitted 5 January, 2024; originally announced January 2024.

  8. arXiv:2312.02436  [pdf, other]

    cs.CL cs.AI

    MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

    Authors: Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin

    Abstract: In the realm of large language models (LLMs), enhancing instruction-following capability often involves curating expansive training data. This is achieved through two primary schemes: i) Scaling-Inputs: Amplifying (input, output) pairs per task instruction, aiming for better instruction adherence. ii) Scaling Input-Free Tasks: Enlarging tasks, each composed of an (instruction, output) pair (withou…

    Submitted 14 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: ICLR 2024. Data, model, and code are available at: https://renzelou.github.io/Muffin/

  9. arXiv:2309.12998  [pdf, other]

    cs.CL cs.AI

    Audience-specific Explanations for Machine Translation

    Authors: Renhan Lou, Jan Niehues

    Abstract: In machine translation, a common problem is that the translation of certain words even if translated can cause incomprehension of the target language audience due to different cultural backgrounds. A solution to solve this problem is to add explanations for these words. In a first step, we therefore need to identify these words or phrases. In this work we explore techniques to extract example expl…

    Submitted 22 September, 2023; originally announced September 2023.

  10. arXiv:2308.03795  [pdf, other]

    cs.CL

    Toward Zero-Shot Instruction Following

    Authors: Renze Lou, Wenpeng Yin

    Abstract: This work proposes a challenging yet more realistic setting for zero-shot cross-task generalization: zero-shot instruction following, presuming the existence of a paragraph-style task definition while no demonstrations exist. To better learn the task supervision from the definition, we propose two strategies: first, to automatically find out the critical sentences in the definition; second, a rank…

    Submitted 25 January, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

    Comments: EACL 2024 Student Research Workshop

  11. arXiv:2305.13300  [pdf, other]

    cs.CL cs.AI

    Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

    Authors: Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, Yu Su

    Abstract: By providing external information to large language models (LLMs), tool augmentation (including retrieval augmentation) has emerged as a promising solution for addressing the limitations of LLMs' static parametric memory. However, how receptive are LLMs to such external evidence, especially when the evidence conflicts with their parametric memory? We present the first comprehensive and controlled…

    Submitted 27 February, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: ICLR 2024 (Spotlight)

  12. arXiv:2303.10475  [pdf, other]

    cs.CL

    Large Language Model Instruction Following: A Survey of Progresses and Challenges

    Authors: Renze Lou, Kai Zhang, Wenpeng Yin

    Abstract: Task semantics can be expressed by a set of input-output examples or a piece of textual instruction. Conventional machine learning approaches for natural language processing (NLP) mainly rely on the availability of large-scale sets of task-specific examples. Two issues arise: first, collecting task-specific labeled examples does not apply to scenarios where tasks may be too complicated or costly t…

    Submitted 24 May, 2024; v1 submitted 18 March, 2023; originally announced March 2023.

    Comments: Accepted by Computational Linguistics Journal. The paper list is available at https://github.com/RenzeLou/awesome-instruction-learning

  13. arXiv:2303.01795  [pdf, other]

    cs.CL

    PAGE: A Position-Aware Graph-Based Model for Emotion Cause Entailment in Conversation

    Authors: Xiaojie Gu, Renze Lou, Lin Sun, Shangxin Li

    Abstract: Conversational Causal Emotion Entailment (C2E2) is a task that aims at recognizing the causes corresponding to a target emotion in a conversation. The order of utterances in the conversation affects the causal inference. However, most current position encoding strategies ignore the order relation among utterances and speakers. To address the issue, we devise a novel position-aware graph to encode…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: ICASSP 2023

  14. arXiv:2206.00289  [pdf, other]

    cs.CL cs.AI

    MORE: A Metric Learning Based Framework for Open-domain Relation Extraction

    Authors: Yutong Wang, Renze Lou, Kai Zhang, MaoYan Chen, Yujiu Yang

    Abstract: Open relation extraction (OpenRE) is the task of extracting relation schemes from open-domain corpora. Most existing OpenRE methods either do not fully benefit from high-quality labeled corpora or can not learn semantic representation directly, affecting downstream clustering efficiency. To address these problems, in this work, we propose a novel learning framework named MORE (Metric learning-base…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 5 pages, 3 figures, accepted by ICASSP 2021

  15. arXiv:2109.05748  [pdf, other]

    cs.LG cs.CL

    GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks

    Authors: Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi

    Abstract: A key problem in multi-task learning (MTL) research is how to select high-quality auxiliary tasks automatically. This paper presents GradTS, an automatic auxiliary task selection method based on gradient calculation in Transformer-based models. Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model, from 0.33% to 17.93% on 8 na…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: In EMNLP 2021

  16. Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks

    Authors: Weicheng Ma, Kai Zhang, Renze Lou, Lili Wang, Soroush Vosoughi

    Abstract: This paper studies the relative importance of attention heads in Transformer-based models to aid their interpretability in cross-lingual and multi-lingual tasks. Prior research has found that only a few attention heads are important in each mono-lingual Natural Language Processing (NLP) task and pruning the remaining heads leads to comparable or improved performance of the model. However, the impa…

    Submitted 18 August, 2021; originally announced August 2021.

    Comments: In ACL 2021