Showing 1–50 of 71 results for author: Chen, N F

  1. arXiv:2406.16020  [pdf, other]

    cs.SD cs.CL eess.AS

    AudioBench: A Universal Benchmark for Audio Large Language Models

    Authors: Bin Wang, Xunlong Zou, Geyu Lin, Shuo Sun, Zhuohan Liu, Wenyu Zhang, Zhengyuan Liu, AiTi Aw, Nancy F. Chen

    Abstract: We introduce AudioBench, a new benchmark designed to evaluate audio large language models (AudioLLMs). AudioBench encompasses 8 distinct tasks and 26 carefully selected or newly curated datasets, focusing on speech understanding, voice interpretation, and audio scene understanding. Despite the rapid advancement of large language models, including multimodal versions, a significant gap exists in co…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: 20 pages; v2 - typo update; Code: https://github.com/AudioLLMs/AudioBench

  2. arXiv:2406.02963  [pdf, other]

    cs.SD eess.AS

    Dataset-Distillation Generative Model for Speech Emotion Recognition

    Authors: Fabian Ritter-Gutierrez, Kuan-Po Huang, Jeremy H. M. Wong, Dianwen Ng, Hung-yi Lee, Nancy F. Chen, Eng Siong Chng

    Abstract: Deep learning models for speech rely on large datasets, presenting computational challenges. Yet, performance hinges on training data size. Dataset Distillation (DD) aims to learn a smaller dataset without much performance degradation when training with it. DD has been investigated in computer vision but not yet in speech. This paper presents the first approach for DD to speech targeting Speech Em…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at Interspeech 2024

  3. arXiv:2405.15329  [pdf, other]

    cs.CL

    Decompose and Aggregate: A Step-by-Step Interpretable Evaluation Framework

    Authors: Minzhi Li, Zhengyuan Liu, Shumin Deng, Shafiq Joty, Nancy F. Chen, Min-Yen Kan

    Abstract: The acceleration of Large Language Models (LLMs) research has opened up new possibilities for evaluating generated texts. They serve as scalable and economical evaluators, but the question of how reliable these evaluators are has emerged as a crucial research question. Prior research efforts in the meta-evaluation of LLMs as judges limit the prompting of an LLM to a single use to obtain a final ev…

    Submitted 14 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  4. arXiv:2405.03138  [pdf, other]

    cs.CL

    CRAFT: Extracting and Tuning Cultural Instructions from the Wild

    Authors: Bin Wang, Geyu Lin, Zhengyuan Liu, Chengwei Wei, Nancy F. Chen

    Abstract: Large language models (LLMs) have rapidly evolved as the foundation of various natural language processing (NLP) applications. Despite their wide use cases, their understanding of culturally-related concepts and reasoning remains limited. Meantime, there is a significant need to enhance these models' cultural reasoning capabilities, especially concerning underrepresented regions. This paper introd…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 6 pages

  5. arXiv:2404.11932  [pdf, other]

    cs.CL cs.AI

    CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment

    Authors: Geyu Lin, Bin Wang, Zhengyuan Liu, Nancy F. Chen

    Abstract: Multilingual proficiency presents a significant challenge for large language models (LLMs). English-centric models are usually suboptimal in other languages, particularly those that are linguistically distant from English. This performance discrepancy mainly stems from the imbalanced distribution of training data across languages during pre-training and instruction tuning stages. To address this p…

    Submitted 12 June, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: 11 pages

  6. arXiv:2404.09754  [pdf, other]

    cs.CL

    Resilience of Large Language Models for Noisy Instructions

    Authors: Bin Wang, Chengwei Wei, Zhengyuan Liu, Geyu Lin, Nancy F. Chen

    Abstract: As the rapidly advancing domain of natural language processing (NLP), large language models (LLMs) have emerged as powerful tools for interpreting human commands and generating text across various tasks. Nonetheless, the resilience of LLMs to handle text containing inherent errors, stemming from human interactions and collaborative systems, has not been thoroughly explored. Our study investigates…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 12 pages

  7. arXiv:2404.06762  [pdf, other]

    cs.CL cs.HC

    Personality-aware Student Simulation for Conversational Intelligent Tutoring Systems

    Authors: Zhengyuan Liu, Stella Xin Yin, Geyu Lin, Nancy F. Chen

    Abstract: Intelligent Tutoring Systems (ITSs) can provide personalized and self-paced learning experience. The emergence of large language models (LLMs) further enables better human-machine interaction, and facilitates the development of conversational ITSs in various disciplines such as math and language learning. In dialogic teaching, recognizing and adapting to individual characteristics can significantl…

    Submitted 10 April, 2024; originally announced April 2024.

  8. arXiv:2404.03429  [pdf, other]

    cs.CL

    Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions

    Authors: Zhengyuan Liu, Stella Xin Yin, Carolyn Lee, Nancy F. Chen

    Abstract: Intelligent tutoring systems (ITSs) that imitate human tutors and aim to provide immediate and customized instructions or feedback to learners have shown their effectiveness in education. With the emergence of generative artificial intelligence, large language models (LLMs) further entitle the systems to complex and coherent conversational interactions. These systems would be of great help in lang…

    Submitted 4 April, 2024; originally announced April 2024.

  9. arXiv:2403.11123  [pdf, other]

    cs.CL

    Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking

    Authors: Taha Aksu, Nancy F. Chen

    Abstract: Current metrics for evaluating Dialogue State Tracking (DST) systems exhibit three primary limitations. They: i) erroneously presume a uniform distribution of slots throughout the dialog, ii) neglect to assign partial scores for individual turns, iii) frequently overestimate or underestimate performance by repeatedly counting the models' successful or failed predictions. To address these shortcomi…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to COLING 2024

  10. arXiv:2402.00658  [pdf, other]

    cs.AI cs.CL

    Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing

    Authors: Fangkai Jiao, Chengwei Qin, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty

    Abstract: Large Language Models (LLMs) have demonstrated significant potential in handling complex reasoning tasks through step-by-step rationale generation. However, recent studies have raised concerns regarding the hallucination and flaws in their reasoning process. Substantial efforts are being made to improve the reliability and faithfulness of the generated rationales. Some approaches model reasoning a…

    Submitted 15 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 17 pages, 9 figures

  11. arXiv:2401.17919  [pdf, other]

    cs.CL cs.LG

    LOCOST: State-Space Models for Long Document Abstractive Summarization

    Authors: Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy F. Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari

    Abstract: State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of $O(L \log L)$, this architecture can handle significantly longer sequences than state-of-the-a…

    Submitted 25 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures, 7 tables, EACL 2024 conference

  12. arXiv:2312.12153  [pdf, other]

    cs.SD eess.AS

    Noise robust distillation of self-supervised speech models via correlation metrics

    Authors: Fabian Ritter-Gutierrez, Kuan-Po Huang, Dianwen Ng, Jeremy H. M. Wong, Hung-yi Lee, Eng Siong Chng, Nancy F. Chen

    Abstract: Compared to large speech foundation models, small distilled models exhibit degraded noise robustness. The student's robustness can be improved by introducing noise at the inputs during pre-training. Despite this, using the standard distillation loss still yields a student with degraded performance. Thus, this paper proposes improving student robustness via distillation with correlation metrics. Te…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 6 pages

  13. arXiv:2312.09541  [pdf, other]

    cs.CL

    Picking the Underused Heads: A Network Pruning Perspective of Attention Head Selection for Fusing Dialogue Coreference Information

    Authors: Zhengyuan Liu, Nancy F. Chen

    Abstract: The Transformer-based models with the multi-head self-attention mechanism are widely used in natural language processing, and provide state-of-the-art results. While the pre-trained language backbones are shown to implicitly capture certain linguistic knowledge, explicitly incorporating structure-aware features can bring about further improvement on the downstream tasks. However, such enhancement…

    Submitted 15 December, 2023; originally announced December 2023.

  14. arXiv:2312.02614  [pdf, other]

    cs.LG cs.CL

    Prompt Optimization via Adversarial In-Context Learning

    Authors: Xuan Long Do, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Shieh, Junxian He

    Abstract: We propose a new method, Adversarial In-Context Learning (adv-ICL), to optimize prompt for in-context learning (ICL) by employing one LLM as a generator, another as a discriminator, and a third as a prompt modifier. As in traditional adversarial learning, adv-ICL is implemented as a two-player game between the generator and discriminator, where the generator tries to generate realistic enough outp…

    Submitted 22 June, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: ACL 2024

  15. arXiv:2311.08385  [pdf, other]

    cs.CL

    ChOiRe: Characterizing and Predicting Human Opinions with Chain of Opinion Reasoning

    Authors: Xuan Long Do, Kenji Kawaguchi, Min-Yen Kan, Nancy F. Chen

    Abstract: Aligning language models (LMs) with human opinion is challenging yet vital to enhance their grasp of human values, preferences, and beliefs. We present ChOiRe, a four-step framework to predict human opinion which differentially models the user explicit personae (i.e. demographic or ideological attributes) that are manually declared, and implicit personae inferred from user historical opinions. ChO…

    Submitted 27 February, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 22 pages

  16. arXiv:2311.07172  [pdf, other]

    cs.CL cs.PL

    VerityMath: Advancing Mathematical Reasoning by Self-Verification Through Unit Consistency

    Authors: Vernon Toh, Ratish Puduppully, Nancy F. Chen

    Abstract: Large Language Models (LLMs) combined with program-based solving techniques are increasingly demonstrating proficiency in mathematical reasoning. However, such progress is mostly demonstrated in closed-source models such as OpenAI-GPT4 and Claude. In this paper, we seek to study the performance of strong open-source LLMs. Specifically, we analyze the outputs of Code Llama (7B) when applied to math…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Work in Progress

  17. arXiv:2311.04495  [pdf, other]

    cs.CL

    Multi-label and Multi-target Sampling of Machine Annotation for Computational Stance Detection

    Authors: Zhengyuan Liu, Hai Leong Chieu, Nancy F. Chen

    Abstract: Data collection from manual labeling provides domain-specific and task-aligned supervision for data-driven approaches, and a critical mass of well-annotated resources is required to achieve reasonable performance in natural language processing tasks. However, manual annotations are often challenging to scale up in terms of time and budget, especially when domain knowledge, capturing subtle semanti…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Findings of EMNLP 2023. arXiv admin note: text overlap with arXiv:2305.19845

  18. CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation

    Authors: Minzhi Li, Taiwei Shi, Caleb Ziems, Min-Yen Kan, Nancy F. Chen, Zhengyuan Liu, Diyi Yang

    Abstract: Annotated data plays a critical role in Natural Language Processing (NLP) in training models and evaluating their performance. Given recent developments in Large Language Models (LLMs), models such as ChatGPT demonstrate zero-shot capability on many text-annotation tasks, comparable with or even exceeding human annotators. Such LLMs can serve as alternatives for manual annotation, due to lower cos…

    Submitted 24 October, 2023; originally announced October 2023.

  19. arXiv:2310.14110  [pdf, other]

    cs.CL

    Finite-context Indexing of Restricted Output Space for NLP Models Facing Noisy Input

    Authors: Minh Nguyen, Nancy F. Chen

    Abstract: NLP models excel on tasks with clean inputs, but are less accurate with noisy inputs. In particular, character-level noise such as human-written typos and adversarially-engineered realistic-looking misspellings often appears in text and can easily trip up NLP models. Prior solutions to address character-level noise often alter the content of the inputs (low fidelity), thus inadvertently lowering m…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: Accepted at IJCNLP-AACL 2023

  20. arXiv:2310.10981  [pdf, other]

    cs.CL

    Instructive Dialogue Summarization with Query Aggregations

    Authors: Bin Wang, Zhengyuan Liu, Nancy F. Chen

    Abstract: Conventional dialogue summarization methods directly generate summaries and do not consider user's specific interests. This poses challenges in cases where the users are more focused on particular topics or aspects. With the advancement of instruction-finetuned language models, we introduce instruction-tuning to dialogues to expand the capability set of dialogue summarization models. To overcome t…

    Submitted 9 December, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Conference - Summarization

  21. arXiv:2310.10570  [pdf, other]

    cs.CL

    On Context Utilization in Summarization with Large Language Models

    Authors: Mathieu Ravaut, Aixin Sun, Nancy F. Chen, Shafiq Joty

    Abstract: Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries. Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens. However, in question answering, language models exhibit uneven utilization of their input context. They tend to favor the initial and final segments, resulting in a U-shaped perfo…

    Submitted 14 June, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: ACL 2024. 9 pages, 7 figures, 3 tables

  22. arXiv:2310.03473  [pdf, other]

    cs.CL

    Controllable Multi-document Summarization: Coverage & Coherence Intuitive Policy with Large Language Model Based Rewards

    Authors: Litton J Kurisinkel, Nancy F. Chen

    Abstract: Memory-efficient large language models are good at refining text input for better readability. However, controllability is a matter of concern when it comes to text generation tasks with long inputs, such as multi-document summarization. In this work, we investigate for a generic controllable approach for multi-document summarization that leverages the capabilities of LLMs to refine the text. In p…

    Submitted 5 October, 2023; originally announced October 2023.

  23. arXiv:2310.03414  [pdf, other]

    cs.CL

    LLM Based Multi-Document Summarization Exploiting Main-Event Biased Monotone Submodular Content Extraction

    Authors: Litton J Kurisinkel, Nancy F. Chen

    Abstract: Multi-document summarization is a challenging task due to its inherent subjective bias, highlighted by the low inter-annotator ROUGE-1 score of 0.4 among DUC-2004 reference summaries. In this work, we aim to enhance the objectivity of news summarization by focusing on the main event of a group of related news documents and presenting it coherently with sufficient context. Our primary objective is…

    Submitted 5 October, 2023; originally announced October 2023.

  24. arXiv:2309.04766  [pdf, other]

    cs.CL cs.AI

    SeaEval for Multilingual Foundation Models: From Cross-Lingual Alignment to Cultural Reasoning

    Authors: Bin Wang, Zhengyuan Liu, Xin Huang, Fangkai Jiao, Yang Ding, AiTi Aw, Nancy F. Chen

    Abstract: We present SeaEval, a benchmark for multilingual foundation models. In addition to characterizing how these models understand and reason with natural language, we also investigate how well they comprehend cultural practices, nuances, and values. Alongside standard accuracy metrics, we investigate the brittleness of foundation models in the dimensions of semantics and multilinguality. Our analyses…

    Submitted 31 March, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

    Comments: To appear in NAACL 2024. 20 pages. More datasets (2 on Cross-Lingual Consistency and 4 on Cultural Understanding) and more supported languages. Code: https://seaeval.github.io/

    Journal ref: NAACL 2024

  25. arXiv:2306.04724  [pdf, other]

    cs.CL

    Prompter: Zero-shot Adaptive Prefixes for Dialogue State Tracking Domain Adaptation

    Authors: Taha Aksu, Min-Yen Kan, Nancy F. Chen

    Abstract: A challenge in the Dialogue State Tracking (DST) field is adapting models to new domains without using any supervised data, zero-shot domain adaptation. Parameter-Efficient Transfer Learning (PETL) has the potential to address this problem due to its robustness. However, it has yet to be applied to the zero-shot scenarios, as it is not clear how to apply it unsupervisedly. Our method, Prompter,…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023

  26. arXiv:2306.02719  [pdf, ps, other]

    cs.CL cs.LG cs.SD eess.AS

    Multiple output samples per input in a single-output Gaussian process

    Authors: Jeremy H. M. Wong, Huayun Zhang, Nancy F. Chen

    Abstract: The standard Gaussian Process (GP) only considers a single output sample per input in the training set. Datasets for subjective tasks, such as spoken language assessment, may be annotated with output labels from multiple human raters per input. This paper proposes to generalise the GP to allow for these multiple output samples in the training set, and thus make use of available output uncertainty…

    Submitted 25 January, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: This paper is presented in the "Symposium for Celebrating 40 Years of Bayesian Learning in Speech and Language Processing and Beyond", which is a satellite event of the ASRU workshop, on 20 December 2023. https://bayesian40.github.io/

  27. arXiv:2305.19845  [pdf, other]

    cs.CL

    Guiding Computational Stance Detection with Expanded Stance Triangle Framework

    Authors: Zhengyuan Liu, Yong Keong Yap, Hai Leong Chieu, Nancy F. Chen

    Abstract: Stance detection determines whether the author of a piece of text is in favor of, against, or neutral towards a specified target, and can be used to gain valuable insights into social media. The ubiquitous indirect referral of targets makes this task challenging, as it requires computational solutions to model semantic features and infer the corresponding implications from a literal statement. Mor…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Main Conference in ACL 2023

  28. arXiv:2305.13718  [pdf, other]

    cs.CL

    Exploring Self-supervised Logic-enhanced Training for Large Language Models

    Authors: Fangkai Jiao, Zhiyang Teng, Bosheng Ding, Zhengyuan Liu, Nancy F. Chen, Shafiq Joty

    Abstract: Existing efforts to improve logical reasoning ability of language models have predominantly relied on supervised fine-tuning, hindering generalization to new domains and/or tasks. The development of Large Language Models (LLMs) has demonstrated the capacity of compressing abundant knowledge into a single proxy, enabling them to tackle multiple tasks effectively. Our preliminary experiments, nevert…

    Submitted 16 June, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 16 pages, NAACL 2024

  29. arXiv:2305.13085  [pdf, other]

    cs.CL

    Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models

    Authors: Ratish Puduppully, Anoop Kunchukuttan, Raj Dabre, Ai Ti Aw, Nancy F. Chen

    Abstract: This study investigates machine translation between related languages i.e., languages within the same family that share linguistic characteristics such as word order and lexical similarity. Machine translation through few-shot prompting leverages a small set of translation pair examples to generate translations for test sentences. This procedure requires the model to learn how to generate translat…

    Submitted 22 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 (Main, Long paper)

  30. arXiv:2305.05945  [pdf, other]

    cs.CL

    Adapter-TST: A Parameter Efficient Method for Multiple-Attribute Text Style Transfer

    Authors: Zhiqiang Hu, Roy Ka-Wei Lee, Nancy F. Chen

    Abstract: Adapting a large language model for multiple-attribute text style transfer via fine-tuning can be challenging due to the significant amount of computational resources and labeled data required for the specific task. In this paper, we address this challenge by introducing AdapterTST, a framework that freezes the pre-trained model's original parameters and enables the development of a multiple-attri…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 11 pages, 3 figures

  31. arXiv:2305.03088  [pdf, other]

    cs.CL cs.AI

    Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation

    Authors: Xuan Long Do, Bowei Zou, Shafiq Joty, Anh Tai Tran, Liangming Pan, Nancy F. Chen, Ai Ti Aw

    Abstract: Conversational Question Generation (CQG) is a critical task for machines to assist humans in fulfilling their information needs through conversations. The task is generally cast into two different settings: answer-aware and answer-unaware. While the former facilitates the models by exposing the expected answer, the latter is more realistic and receiving growing attentions recently. What-to-ask and…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 17 pages, ACL 2023

  32. arXiv:2211.07283  [pdf, other]

    eess.AS cs.SD

    SNIPER Training: Single-Shot Sparse Training for Text-to-Speech

    Authors: Perry Lam, Huayun Zhang, Nancy F. Chen, Berrak Sisman, Dorien Herremans

    Abstract: Text-to-speech (TTS) models have achieved remarkable naturalness in recent years, yet like most deep neural models, they have more parameters than necessary. Sparse TTS models can improve on dense models via pruning and extra retraining, or converge faster than dense models with some performance loss. Thus, we propose training TTS models using decaying sparsity, i.e. a high initial sparsity to acc…

    Submitted 1 June, 2024; v1 submitted 14 November, 2022; originally announced November 2022.

  33. arXiv:2210.12942  [pdf, other]

    cs.CL

    Are Current Task-oriented Dialogue Systems Able to Satisfy Impolite Users?

    Authors: Zhiqiang Hu, Roy Ka-Wei Lee, Nancy F. Chen

    Abstract: Task-oriented dialogue (TOD) systems have assisted users on many tasks, including ticket booking and service inquiries. While existing TOD systems have shown promising performance in serving customer needs, these systems mostly assume that users would interact with the dialogue agent politely. This assumption is unrealistic as impatient or frustrated customers may also interact with TOD systems im…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 12 pages

  34. arXiv:2210.08779  [pdf, other]

    cs.CL

    Towards Summary Candidates Fusion

    Authors: Mathieu Ravaut, Shafiq Joty, Nancy F. Chen

    Abstract: Sequence-to-sequence deep neural models fine-tuned for abstractive summarization can achieve great performance on datasets with enough human annotations. Yet, it has been shown that they have not reached their full potential, with a wide gap between the top beam search output and the oracle beam. Recently, re-ranking methods have been proposed, to learn to select a better summary candidate. Howeve…

    Submitted 26 May, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: 4 Figures, 9 Tables, EMNLP 2022

  35. EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models

    Authors: Perry Lam, Huayun Zhang, Nancy F. Chen, Berrak Sisman

    Abstract: Neural models are known to be over-parameterized, and recent work has shown that sparse text-to-speech (TTS) models can outperform dense models. Although a plethora of sparse methods has been proposed for other domains, such methods have rarely been applied in TTS. In this work, we seek to answer the question: what are the characteristics of selected sparse techniques on the performance and model…

    Submitted 22 September, 2022; originally announced September 2022.

    Journal ref: Interspeech 2022, 823-827 (2022)

  36. arXiv:2209.06652  [pdf, other]

    cs.CL

    CoHS-CQG: Context and History Selection for Conversational Question Generation

    Authors: Xuan Long Do, Bowei Zou, Liangming Pan, Nancy F. Chen, Shafiq Joty, Ai Ti Aw

    Abstract: Conversational question generation (CQG) serves as a vital task for machines to assist humans, such as interactive reading comprehension, through conversations. Compared to traditional single-turn question generation (SQG), CQG is more challenging in the sense that the generated question is required not only to be meaningful, but also to align with the occurred conversation history. While previous…

    Submitted 10 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: Accepted by 29th International Conference on Computational Linguistics (COLING 2022)

  37. arXiv:2208.01006  [pdf, other]

    cs.CL

    Multi-Document Summarization with Centroid-Based Pretraining

    Authors: Ratish Puduppully, Parag Jain, Nancy F. Chen, Mark Steedman

    Abstract: In Multi-Document Summarization (MDS), the input can be modeled as a set of documents, and the output is its summary. In this paper, we focus on pretraining objectives for MDS. Specifically, we introduce a novel pretraining objective, which involves selecting the ROUGE-based centroid of each document cluster as a proxy for its summary. Our objective thus does not require human written summaries an…

    Submitted 31 May, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: ACL 2023 camera-ready

  38. arXiv:2208.00840  [pdf, other]

    q-bio.NC cs.LG eess.IV

    A Transformer-based Neural Language Model that Synthesizes Brain Activation Maps from Free-Form Text Queries

    Authors: Gia H. Ngo, Minh Nguyen, Nancy F. Chen, Mert R. Sabuncu

    Abstract: Neuroimaging studies are often limited by the number of subjects and cognitive processes that can be feasibly interrogated. However, a rapidly growing number of neuroscientific studies have collectively accumulated an extensive wealth of results. Digesting this growing literature and obtaining novel insights remains to be a major challenge, since existing meta-analytic tools are constrained to key…

    Submitted 24 July, 2022; originally announced August 2022.

    Comments: arXiv admin note: text overlap with arXiv:2109.13814

    Journal ref: Medical Image Analysis. 2022 Jul 19:102540

  39. arXiv:2206.07898  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Multimodal Dialogue State Tracking

    Authors: Hung Le, Nancy F. Chen, Steven C. H. Hoi

    Abstract: Designed for tracking user goals in dialogues, a dialogue state tracker is an essential component in a dialogue system. However, the research of dialogue state tracking has largely been limited to unimodality, in which slots and slot values are limited by knowledge domains (e.g. restaurant domain with slots of restaurant name and price range) and are defined by specific database schema. In this pa…

    Submitted 15 June, 2022; originally announced June 2022.

    Comments: Accepted at NAACL 2022 (Oral)

  40. arXiv:2206.02428  [pdf, other]

    cs.CL

    Domain-specific Language Pre-training for Dialogue Comprehension on Clinical Inquiry-Answering Conversations

    Authors: Zhengyuan Liu, Pavitra Krishnaswamy, Nancy F. Chen

    Abstract: There is growing interest in the automated extraction of relevant information from clinical dialogues. However, it is difficult to collect and construct large annotated resources for clinical dialogue tasks. Recent developments in natural language processing suggest that large-scale pre-trained language backbones could be leveraged for such machine comprehension and information extraction tasks. Y…

    Submitted 6 June, 2022; originally announced June 2022.

    Comments: W3PHIAI-2022

  41. arXiv:2205.09324  [pdf, other]

    cs.CL cs.AI

    Learning from Bootstrapping and Stepwise Reinforcement Reward: A Semi-Supervised Framework for Text Style Transfer

    Authors: Zhengyuan Liu, Nancy F. Chen

    Abstract: Text style transfer is an important task in controllable language generation. Supervised approaches have pushed performance improvement on style-oriented rewriting such as formality conversion. However, challenges remain due to the scarcity of large-scale parallel data in many domains. While unsupervised approaches do not rely on annotated sentence pairs for each style, they are often plagued with…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: In Findings of NAACL 2022

  42. arXiv:2203.06569  [pdf, other]

    cs.CL

    SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization

    Authors: Mathieu Ravaut, Shafiq Joty, Nancy F. Chen

    Abstract: Sequence-to-sequence neural networks have recently achieved great success in abstractive summarization, especially through fine-tuning large pre-trained language models on the downstream dataset. These models are typically decoded with beam search to generate a unique summary. However, the search space is very large, and with the exposure bias, such decoding is not optimal. In this paper, we show…

    Submitted 26 May, 2023; v1 submitted 13 March, 2022; originally announced March 2022.

    Comments: 9 pages, 6 figures, 6 tables, 9 appendix pages, ACL 2022

  43. arXiv:2202.09108  [pdf, other]

    cs.CL cs.SD eess.AS

    Large-Scale Acoustic Characterization of Singaporean Children's English Pronunciation

    Authors: Yuling Gu, Nancy F. Chen

    Abstract: In this work, we investigate pronunciation differences in English spoken by Singaporean children in relation to their American and British counterparts by conducting Kmeans clustering and Archetypal analysis on selected vowel pairs and approximants. Given that Singapore adopts British English as the institutional standard due to historical reasons, one might expect Singaporean children to follow B…

    Submitted 18 February, 2022; originally announced February 2022.

  44. arXiv:2201.12546  [pdf, other]

    cs.CL cs.SD eess.AS

    Progressive Continual Learning for Spoken Keyword Spotting

    Authors: Yizheng Huang, Nana Hou, Nancy F. Chen

    Abstract: Catastrophic forgetting is a thorny challenge when updating keyword spotting (KWS) models after deployment. To tackle such challenges, we propose a progressive continual learning strategy for small-footprint spoken keyword spotting (PCL-KWS). Specifically, the proposed PCL-KWS framework introduces a network instantiator to generate the task-specific sub-networks for remembering previously learned…

    Submitted 6 February, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

    Comments: ICASSP 2022

  45. arXiv:2110.04526  [pdf, other]

    cs.CL

    Improving Multi-Party Dialogue Discourse Parsing via Domain Integration

    Authors: Zhengyuan Liu, Nancy F. Chen

    Abstract: While multi-party conversations are often less structured than monologues and documents, they are implicitly organized by semantic level correlations across the interactive turns, and dialogue discourse analysis can be applied to predict the dependency structure and relations between the elementary discourse units, and provide feature-rich structural information for downstream tasks. However, the…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: Published in CODI@EMNLP 2021

  46. arXiv:2110.04518  [pdf, other]

    cs.CL

    DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing

    Authors: Zhengyuan Liu, Ke Shi, Nancy F. Chen

    Abstract: Text discourse parsing weighs importantly in understanding information flow and argumentative structure in natural language, making it beneficial for downstream tasks. While previous work significantly improves the performance of RST discourse parsing, they are not readily applicable to practical use cases: (1) EDU segmentation is not integrated into most existing tree parsing frameworks, thus it…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: Published in CODI@EMNLP 2021

  47. arXiv:2109.13814  [pdf, other]

    q-bio.NC cs.LG

    Text2Brain: Synthesis of Brain Activation Maps from Free-form Text Query

    Authors: Gia H. Ngo, Minh Nguyen, Nancy F. Chen, Mert R. Sabuncu

    Abstract: Most neuroimaging experiments are under-powered, limited by the number of subjects and cognitive processes that an individual study can investigate. Nonetheless, over decades of research, neuroscience has accumulated an extensive wealth of results. It remains a challenge to digest this growing knowledge base and obtain new insights since existing meta-analytic tools are limited to keyword queries.…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: MICCAI 2021

  48. arXiv:2109.13070  [pdf, other]

    cs.CL

    Controllable Neural Dialogue Summarization with Personal Named Entity Planning

    Authors: Zhengyuan Liu, Nancy F. Chen

    Abstract: In this paper, we propose a controllable neural generation framework that can flexibly guide dialogue summarization with personal named entity planning. The conditional sequences are modulated to decide what types of information or what perspective to focus on when forming summaries to tackle the under-constrained problem in summarization tasks. This framework supports two types of use cases: (1)…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 Main Conference

  49. arXiv:2108.13629  [pdf, other]

    cs.CL

    Dynamic Sliding Window for Meeting Summarization

    Authors: Zhengyuan Liu, Nancy F. Chen

    Abstract: Recently abstractive spoken language summarization raises emerging research interest, and neural sequence-to-sequence approaches have brought significant performance improvement. However, summarizing long meeting transcripts remains challenging. Due to the large length of source contents and targeted summaries, neural models are prone to be distracted on the context, and produce summaries with deg…

    Submitted 31 August, 2021; originally announced August 2021.

    Comments: SummDial@SIGDial 2021

  50. arXiv:2107.03675  [pdf, other]

    cs.CL cs.SD eess.AS

    Multilingual Speech Evaluation: Case Studies on English, Malay and Tamil

    Authors: Huayun Zhang, Ke Shi, Nancy F. Chen

    Abstract: Speech evaluation is an essential component in computer-assisted language learning (CALL). While speech evaluation on English has been popular, automatic speech scoring on low resource languages remains challenging. Work in this area has focused on monolingual specific designs and handcrafted features stemming from resource-rich languages like English. Such approaches are often difficult to genera…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: Accepted at INTERSPEECH 2021