Showing 1–50 of 118 results for author: Che, W

Searching in archive cs.
  1. arXiv:2407.01081  [pdf, other]

    cs.CV cs.CL

    CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation

    Authors: Yuxuan Wang, Yijun Liu, Fei Yu, Chen Huang, Kexin Li, Zhiguo Wan, Wanxiang Che

    Abstract: Despite the rapid development of Chinese vision-language models (VLMs), most existing Chinese vision-language (VL) datasets are constructed on Western-centric images from existing English VL datasets. The cultural bias in the images makes these datasets unsuitable for evaluating VLMs in Chinese culture. To remedy this issue, we present a new Chinese Vision-Language Understanding Evaluation (CVLUE…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.17456  [pdf, other]

    cs.CL cs.AI

    Improving Grammatical Error Correction via Contextual Data Augmentation

    Authors: Yixuan Wang, Baoxin Wang, Yijun Liu, Qingfu Zhu, Dayong Wu, Wanxiang Che

    Abstract: Nowadays, data augmentation through synthetic data has been widely used in the field of Grammatical Error Correction (GEC) to alleviate the problem of data scarcity. However, these synthetic data are mainly used in the pre-training phase rather than the data-limited fine-tuning phase due to inconsistent error distribution and noisy labels. In this paper, we propose a synthetic data construction me…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted as Findings of ACL 2024

  3. arXiv:2406.17404  [pdf, other]

    cs.CL cs.LG

    Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training

    Authors: Yixuan Wang, Xianzhen Luo, Fuxuan Wei, Yijun Liu, Qingfu Zhu, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Existing speculative decoding methods typically require additional model structure and training processes to assist the model for draft token generation. This makes the migration of acceleration methods to the new model more costly and more demanding on device memory. To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning st…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 11 pages, 6 figures

  4. arXiv:2406.17233  [pdf, other]

    cs.SE cs.CL

    Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement

    Authors: Yunlong Feng, Yang Xu, Dechuan Teng, Honglin Mu, Xiao Xu, Libo Qin, Wanxiang Che, Qingfu Zhu

    Abstract: Decompilation transforms compiled code back into a high-level programming language for analysis when source code is unavailable. Previous work has primarily focused on enhancing decompilation performance by increasing the scale of model parameters or training data for pre-training. Based on the characteristics of the decompilation task, we propose two methods: (1) Without fine-tuning, the Self-Con…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under Review

  5. arXiv:2406.13940  [pdf, other]

    cs.CL

    AutoCAP: Towards Automatic Cross-lingual Alignment Planning for Zero-shot Chain-of-Thought

    Authors: Yongheng Zhang, Qiguang Chen, Min Li, Wanxiang Che, Libo Qin

    Abstract: Cross-lingual chain-of-thought can effectively complete reasoning tasks across languages, which gains increasing attention. Recently, dominant approaches in the literature improve cross-lingual alignment capabilities by integrating reasoning knowledge from different languages. Despite achieving excellent performance, current methods still have two main challenges: (1) Manual language specification…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL2024 Findings

  6. arXiv:2406.10505  [pdf, other]

    cs.CL

    CroPrompt: Cross-task Interactive Prompting for Zero-shot Spoken Language Understanding

    Authors: Libo Qin, Fuxuan Wei, Qiguang Chen, Jingxuan Zhou, Shijue Huang, Jiasheng Si, Wenpeng Lu, Wanxiang Che

    Abstract: Slot filling and intent detection are two highly correlated tasks in spoken language understanding (SLU). Recent SLU research attempts to explore zero-shot prompting techniques in large language models to alleviate the data scarcity problem. Nevertheless, the existing prompting work ignores the cross-task interaction information for SLU, which leads to sub-optimal performance. To solve this proble…

    Submitted 15 June, 2024; originally announced June 2024.

  7. arXiv:2406.08068  [pdf, other]

    cs.CL

    Large Language Models Meet Text-Centric Multimodal Sentiment Analysis: A Survey

    Authors: Hao Yang, Yanyan Zhao, Yang Wu, Shilong Wang, Tian Zheng, Hongbo Zhang, Wanxiang Che, Bing Qin

    Abstract: Compared to traditional sentiment analysis, which only considers text, multimodal sentiment analysis needs to consider emotional signals from multimodal sources simultaneously and is therefore more consistent with the way how humans process sentiment in real-world scenarios. It involves processing emotional information from various sources such as natural language, images, videos, audio, physiolog…

    Submitted 12 June, 2024; originally announced June 2024.

  8. arXiv:2405.16473  [pdf, other]

    cs.CV cs.AI cs.CL

    M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought

    Authors: Qiguang Chen, Libo Qin, Jin Zhang, Zhi Chen, Xiao Xu, Wanxiang Che

    Abstract: Multi-modal Chain-of-Thought (MCoT) requires models to leverage knowledge from both textual and visual modalities for step-by-step reasoning, which gains increasing attention. Nevertheless, the current MCoT benchmark still faces some challenges: (1) absence of visual modal reasoning, (2) single-step visual modal reasoning, and (3) Domain missing, thereby hindering the development of MCoT. Motivate…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted at ACL2024 Main Conference

  9. arXiv:2405.12819  [pdf, other]

    cs.CL cs.AI

    Large Language Models Meet NLP: A Survey

    Authors: Libo Qin, Qiguang Chen, Xiachong Feng, Yang Wu, Yongheng Zhang, Yinghui Li, Min Li, Wanxiang Che, Philip S. Yu

    Abstract: While large language models (LLMs) like ChatGPT have shown impressive capabilities in Natural Language Processing (NLP) tasks, a systematic investigation of their potential in this field remains largely unexplored. This study aims to address this gap by exploring the following questions: (1) How are LLMs currently applied to NLP tasks in the literature? (2) Have traditional NLP tasks already been…

    Submitted 21 May, 2024; originally announced May 2024.

  10. arXiv:2404.07017  [pdf, other]

    cs.CL cs.AI

    Improving Language Model Reasoning with Self-motivated Learning

    Authors: Yunlong Feng, Yang Xu, Libo Qin, Yasheng Wang, Wanxiang Che

    Abstract: Large-scale high-quality training data is important for improving the performance of models. After trained with data that has rationales (reasoning steps), models gain reasoning capability. However, the dataset with high-quality rationales is relatively scarce due to the high annotation cost. To address this issue, we propose Self-motivated Learning framework. The framework motivates the…

    Submitted 30 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024

  11. arXiv:2404.04925  [pdf, other]

    cs.CL

    Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers

    Authors: Libo Qin, Qiguang Chen, Yuhang Zhou, Zhi Chen, Yinghui Li, Lizi Liao, Min Li, Wanxiang Che, Philip S. Yu

    Abstract: Multilingual Large Language Models are capable of using powerful Large Language Models to handle and respond to queries in multiple languages, which achieves remarkable success in multilingual natural language processing tasks. Despite these breakthroughs, there still remains a lack of a comprehensive survey to summarize existing approaches and recent developments in this field. To this end, in th…

    Submitted 7 April, 2024; originally announced April 2024.

  12. arXiv:2404.00629  [pdf, other]

    cs.CL

    Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

    Authors: Lizhi Lin, Honglin Mu, Zenan Zhai, Minghan Wang, Yuxia Wang, Renxi Wang, Junjie Gao, Yixuan Zhang, Wanxiang Che, Timothy Baldwin, Xudong Han, Haonan Li

    Abstract: Generative models are rapidly gaining popularity and being integrated into everyday applications, raising concerns over their safety issues as various vulnerabilities are exposed. Faced with the problem, the field of red teaming is experiencing fast-paced growth, which highlights the need for a comprehensive organization covering the entire pipeline and addressing emerging topics for the community…

    Submitted 31 March, 2024; originally announced April 2024.

  13. arXiv:2403.17413  [pdf, other]

    cs.CL

    LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction

    Authors: Yixuan Wang, Baoxin Wang, Yijun Liu, Dayong Wu, Wanxiang Che

    Abstract: Over-correction is a critical problem in Chinese grammatical error correction (CGEC) task. Recent work using model ensemble methods based on voting can effectively mitigate over-correction and improve the precision of the GEC system. However, these methods still require the output of several GEC systems and inevitably lead to reduced error recall. In this light, we propose the LM-Combiner, a rewri…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted to COLING 2024

  14. arXiv:2403.11128  [pdf, other]

    cs.CL

    Beyond Static Evaluation: A Dynamic Approach to Assessing AI Assistants' API Invocation Capabilities

    Authors: Honglin Mu, Yang Xu, Yunlong Feng, Xiaofeng Han, Yitong Li, Yutai Hou, Wanxiang Che

    Abstract: With the rise of Large Language Models (LLMs), AI assistants' ability to utilize tools, especially through API calls, has advanced notably. This progress has necessitated more accurate evaluation methods. Many existing studies adopt static evaluation, where they assess AI assistants' API call based on pre-defined dialogue histories. However, such evaluation method can be misleading, as an AI assis…

    Submitted 27 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  15. arXiv:2403.05133  [pdf, other]

    cs.IT cs.LG cs.NI

    RIS-empowered Topology Control for Distributed Learning in Urban Air Mobility

    Authors: Kai Xiong, Rui Wang, Supeng Leng, Wenyang Che, Chongwen Huang, Chau Yuen

    Abstract: Urban Air Mobility (UAM) expands vehicles from the ground to the near-ground space, envisioned as a revolution for transportation systems. Comprehensive scene perception is the foundation for autonomous aerial driving. However, UAM encounters the intelligent perception challenge: high perception learning requirements conflict with the limited sensors and computing chips of flying cars. To overcome…

    Submitted 8 March, 2024; originally announced March 2024.

  16. arXiv:2403.00338  [pdf, other]

    cs.CL

    Semi-Instruct: Bridging Natural-Instruct and Self-Instruct for Code Large Language Models

    Authors: Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Xu Wang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Instruction tuning plays a pivotal role in Code Large Language Models (Code LLMs) for the task of program synthesis. Presently, two dominant paradigms for collecting tuning data are natural-instruct (human-written) and self-instruct (automatically generated). Natural-instruct includes diverse and correct codes but lacks instruction-code pairs, and exists improper code formats like nested single-li…

    Submitted 1 March, 2024; originally announced March 2024.

  17. arXiv:2402.11295  [pdf, other]

    cs.CL

    OneBit: Towards Extremely Low-bit Large Language Models

    Authors: Yuzhuang Xu, Xu Han, Zonghan Yang, Shuo Wang, Qingfu Zhu, Zhiyuan Liu, Weidong Liu, Wanxiang Che

    Abstract: Model quantification uses low bit-width values to represent the weight matrices of existing models to be quantized, which is a promising approach to reduce both storage and computational overheads of deploying highly anticipated LLMs. However, current quantization methods suffer severe performance degradation when the bit-width is extremely reduced, and thus focus on utilizing 4-bit or 8-bit value…

    Submitted 21 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: 6 figures, 6 tables

  18. arXiv:2402.10812  [pdf, other]

    cs.CL

    Exploring Hybrid Question Answering via Program-based Prompting

    Authors: Qi Shi, Han Cui, Haofeng Wang, Qingfu Zhu, Wanxiang Che, Ting Liu

    Abstract: Question answering over heterogeneous data requires reasoning over diverse sources of data, which is challenging due to the large scale of information and organic coupling of heterogeneous data. Various approaches have been proposed to address these challenges. One approach involves training specialized retrievers to select relevant information, thereby reducing the input length. Another approach…

    Submitted 16 February, 2024; originally announced February 2024.

  19. arXiv:2402.10691  [pdf, other]

    cs.CL

    Python is Not Always the Best Choice: Embracing Multilingual Program of Thoughts

    Authors: Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Libo Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: Program of Thoughts (PoT) is an approach characterized by its executable intermediate steps, which ensure the accuracy of the logical calculations in the reasoning process. Currently, PoT primarily uses Python. However, relying solely on a single language may result in suboptimal solutions and overlook the potential benefits of other programming languages. In this paper, we conduct comprehensive e…

    Submitted 16 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: under review

  20. arXiv:2402.10666  [pdf, other]

    cs.CL

    Multi-Hop Table Retrieval for Open-Domain Text-to-SQL

    Authors: Xuanliang Zhang, Dingzirui Wang, Longxu Dou, Qingfu Zhu, Wanxiang Che

    Abstract: Open-domain text-to-SQL is an important task that retrieves question-relevant tables from massive databases and then generates SQL. However, existing retrieval methods that retrieve in a single hop do not pay attention to the text-to-SQL challenge of schema linking, which is aligning the entities in the question with table entities, reflected in two aspects: similar irrelevant entity and domain mi…

    Submitted 19 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  21. arXiv:2402.10663  [pdf, other]

    cs.CL

    Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL

    Authors: Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che

    Abstract: Currently, the in-context learning method based on large language models (LLMs) has become the mainstream of text-to-SQL research. Previous works have discussed how to select demonstrations related to the user question from a human-labeled demonstration pool. However, human labeling suffers from the limitations of insufficient diversity and high labeling overhead. Therefore, in this paper, we disc…

    Submitted 26 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  22. arXiv:2402.10654  [pdf, other]

    cs.CL

    Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes

    Authors: Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che

    Abstract: Numerical reasoning is an essential ability for NLP systems to handle numeric information. Recent research indicates that fine-tuning a small-scale model to learn generating reasoning processes alongside answers can significantly enhance performance. However, current methods have the limitation that most methods generate reasoning processes with large language models (LLMs), which are "unreliable"…

    Submitted 16 February, 2024; originally announced February 2024.

  23. arXiv:2402.08259  [pdf, other]

    cs.CL

    A Survey of Table Reasoning with Large Language Models

    Authors: Xuanliang Zhang, Dingzirui Wang, Longxu Dou, Qingfu Zhu, Wanxiang Che

    Abstract: Table reasoning, which aims to generate the corresponding answer to the question following the user requirement according to the provided table, and optionally a text description of the table, effectively improving the efficiency of obtaining information. Recently, using Large Language Models (LLMs) has become the mainstream method for table reasoning, because it not only significantly reduces the…

    Submitted 13 February, 2024; originally announced February 2024.

  24. arXiv:2402.03900  [pdf, other]

    cs.CL

    Pro-HAN: A Heterogeneous Graph Attention Network for Profile-Based Spoken Language Understanding

    Authors: Dechuan Teng, Chunlin Lu, Xiao Xu, Wanxiang Che, Libo Qin

    Abstract: Recently, Profile-based Spoken Language Understanding (SLU) has gained increasing attention, which aims to incorporate various types of supplementary profile information (i.e., Knowledge Graph, User Profile, Context Awareness) to eliminate the prevalent ambiguities in user utterances. However, existing approaches can only separately model different profile information, without considering their in…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted at ICASSP 2024

  25. arXiv:2402.02547  [pdf]

    cs.AI cs.CL

    Integration of cognitive tasks into artificial general intelligence test for large models

    Authors: Youzhi Qu, Chen Wei, Penghui Du, Wenxin Che, Chi Zhang, Wanli Ouyang, Yatao Bian, Feiyang Xu, Bin Hu, Kai Du, Haiyan Wu, Jia Liu, Quanying Liu

    Abstract: During the evolution of large models, performance evaluation is necessarily performed to assess their capabilities and ensure safety before practical application. However, current model evaluations mainly rely on specific tasks and datasets, lacking a united framework for assessing the multidimensional intelligence of large models. In this perspective, we advocate for a comprehensive framework of…

    Submitted 5 March, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  26. arXiv:2401.08295  [pdf, other]

    cs.CL

    SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models

    Authors: Weixiang Zhao, Shilong Wang, Yulin Hu, Yanyan Zhao, Bing Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

    Abstract: The continual learning (CL) ability is vital for deploying large language models (LLMs) in the dynamic world. Existing methods devise the learning module to acquire task-specific knowledge with parameter-efficient tuning (PET) block and the selection module to pick out the corresponding one for the testing input, aiming at handling the challenges of catastrophic forgetting and knowledge transfer i…

    Submitted 6 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: To appear at ACL 2024

  27. arXiv:2311.09008  [pdf, other]

    cs.CL

    End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions

    Authors: Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li

    Abstract: End-to-end task-oriented dialogue (EToD) can directly generate responses in an end-to-end fashion without modular training, which attracts escalating popularity. The advancement of deep neural networks, especially the successful use of large pre-trained models, has further led to significant progress in EToD research in recent years. In this paper, we present a thorough review and provide a unifie…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP2023

  28. arXiv:2310.14799  [pdf, other]

    cs.CL cs.AI

    Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages

    Authors: Libo Qin, Qiguang Chen, Fuxuan Wei, Shijue Huang, Wanxiang Che

    Abstract: Chain-of-thought (CoT) is capable of eliciting models to explicitly generate reasoning paths, thus promoting reasoning accuracy and attracting increasing attention. Specifically, zero-shot CoT achieves remarkable improvements in a wide range of reasoning tasks by simply instructing the LLM with the prompt "Let's think step by step!". Despite the success of zero-shot CoT, the existing zero-shot pro…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP2023 Main Conference

  29. arXiv:2310.14626  [pdf, other]

    cs.CL cs.IR

    Conversational Recommender System and Large Language Model Are Made for Each Other in E-commerce Pre-sales Dialogue

    Authors: Yuanxing Liu, Wei-Nan Zhang, Yifan Chen, Yuchi Zhang, Haopeng Bai, Fan Feng, Hengbin Cui, Yongbin Li, Wanxiang Che

    Abstract: E-commerce pre-sales dialogue aims to understand and elicit user needs and preferences for the items they are seeking so as to provide appropriate recommendations. Conversational recommender systems (CRSs) learn user representation and provide accurate recommendations based on dialogue context, but rely on external knowledge. Large language models (LLMs) generate responses that mimic pre-sales dia…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings

  30. arXiv:2308.10585  [pdf, other]

    cs.CL

    Exploring Equation as a Better Intermediate Meaning Representation for Numerical Reasoning

    Authors: Dingzirui Wang, Longxu Dou, Wenbin Zhang, Junyu Zeng, Wanxiang Che

    Abstract: Numerical reasoning is vital for natural language processing models to understand and process numerical information in real-world scenarios. Most current methods first generate the Intermediate Meaning Representations (IMRs) of questions and then generate answers. Current SOTA methods generate programs as IMRs with large language models (LLMs). Intuitively, equations have fewer restrictions and cl…

    Submitted 21 August, 2023; originally announced August 2023.

  31. arXiv:2307.09723  [pdf, other]

    cs.SD eess.AS

    Improving Domain Generalization for Sound Classification with Sparse Frequency-Regularized Transformer

    Authors: Honglin Mu, Wentian Xia, Wanxiang Che

    Abstract: Sound classification models' performance suffers from generalizing on out-of-distribution (OOD) data. Numerous methods have been proposed to help the model generalize. However, most either introduce inference overheads or focus on long-lasting CNN-variants, while Transformers has been proven to outperform CNNs on numerous natural language processing and computer vision tasks. We propose FRITO, an…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted by ICME 2023

  32. arXiv:2307.07135  [pdf, other]

    cs.CL

    MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System

    Authors: Libo Qin, Shijue Huang, Qiguang Chen, Chenran Cai, Yudi Zhang, Bin Liang, Wanxiang Che, Ruifeng Xu

    Abstract: Multi-modal sarcasm detection has attracted much recent attention. Nevertheless, the existing benchmark (MMSD) has some shortcomings that hinder the development of reliable multi-modal sarcasm detection system: (1) There are some spurious cues in MMSD, leading to the model bias learning; (2) The negative samples in MMSD are not always reasonable. To solve the aforementioned issues, we introduce MM…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: Accepted by ACL2023 Findings

  33. arXiv:2306.08892  [pdf, other]

    cs.CL

    MetricPrompt: Prompting Model as a Relevance Metric for Few-shot Text Classification

    Authors: Hongyuan Dong, Weinan Zhang, Wanxiang Che

    Abstract: Prompting methods have shown impressive performance in a variety of text mining tasks and applications, especially few-shot ones. Despite the promising prospects, the performance of prompting model largely depends on the design of prompt template and verbalizer. In this work, we propose MetricPrompt, which eases verbalizer design difficulty by reformulating few-shot text classification task into t…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: Accepted at KDD 2023

  34. arXiv:2306.03287  [pdf, other]

    cs.CV

    ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

    Authors: Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, Mingyu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, et al. (2 additional authors not shown)

    Abstract: Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios of past benchmarks are limited, and the corresponding evaluation protocols usually focus on the submodules of the structured text extraction scheme. In order to eliminate these problems, we organized the ICDAR 2023 competition on Structured text extracti…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: ICDAR 2023 Competition on SVRD report (to appear in ICDAR 2023)

  35. arXiv:2306.00103  [pdf, other]

    cs.CV cs.CL cs.LG

    ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

    Authors: Xiao Xu, Bei Li, Chenfei Wu, Shao-Yen Tseng, Anahita Bhiwandiwalla, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan

    Abstract: Two-Tower Vision-Language (VL) models have shown promising improvements on various downstream VL tasks. Although the most advanced work improves performance by building bridges between encoders, it suffers from ineffective layer-by-layer utilization of uni-modal representations and cannot flexibly exploit different levels of uni-modal semantic knowledge. In this work, we propose ManagerTower, a no…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted by ACL 2023 Main Conference, Oral

  36. arXiv:2305.10231  [pdf, other]

    cs.CL

    OpenSLU: A Unified, Modularized, and Extensible Toolkit for Spoken Language Understanding

    Authors: Libo Qin, Qiguang Chen, Xiao Xu, Yunlong Feng, Wanxiang Che

    Abstract: Spoken Language Understanding (SLU) is one of the core components of a task-oriented dialogue system, which aims to extract the semantic meaning of user queries (e.g., intents and slots). In this work, we introduce OpenSLU, an open-source toolkit to provide a unified, modularized, and extensible toolkit for spoken language understanding. Specifically, OpenSLU unifies 10 SLU models for both single-…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: ACL2023 Demo Paper

  37. arXiv:2305.05183  [pdf, other]

    cs.CL cs.AI

    CSED: A Chinese Semantic Error Diagnosis Corpus

    Authors: Bo Sun, Baoxin Wang, Yixuan Wang, Wanxiang Che, Dayong Wu, Shijin Wang, Ting Liu

    Abstract: Recently, much Chinese text error correction work has focused on Chinese Spelling Check (CSC) and Chinese Grammatical Error Diagnosis (CGED). In contrast, little attention has been paid to the complicated problem of Chinese Semantic Error Diagnosis (CSED), which lacks relevant datasets. The study of semantic errors is important because they are very common and may lead to syntactic irregularities…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 12 pages. arXiv admin note: text overlap with arXiv:2204.07464

  38. U-NEED: A Fine-grained Dataset for User Needs-Centric E-commerce Conversational Recommendation

    Authors: Yuanxing Liu, Weinan Zhang, Baohua Dong, Yan Fan, Hang Wang, Fan Feng, Yifan Chen, Ziyu Zhuang, Hengbin Cui, Yongbin Li, Wanxiang Che

    Abstract: Conversational recommender systems (CRSs) aim to understand the information needs and preferences expressed in a dialogue to recommend suitable items to the user. Most of the existing conversational recommendation datasets are synthesized or simulated with crowdsourcing, which has a large gap with real-world scenarios. To bridge the gap, previous work contributes a dataset E-ConvRec, based on pre-…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: SIGIR23 Resource Track

  39. arXiv:2304.13902   

    cs.CL

    Controllable Data Augmentation for Context-Dependent Text-to-SQL

    Authors: Dingzirui Wang, Longxu Dou, Wanxiang Che

    Abstract: The limited scale of annotated data constraints existing context-dependent text-to-SQL models because of the complexity of labeling. The data augmentation method is a commonly used method to solve this problem. However, the data generated by current augmentation methods often lack diversity. In this paper, we introduce ConDA, which generates interactive questions and corresponding SQL results. We…

    Submitted 27 April, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: fix overlap

  40. arXiv:2304.09820  [pdf, other]

    cs.CL cs.AI

    A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification

    Authors: Yunlong Feng, Bohan Li, Libo Qin, Xiao Xu, Wanxiang Che

    Abstract: Cross-domain text classification aims to adapt models to a target domain that lacks labeled data. It leverages or reuses rich labeled data from the different but related source domain(s) and unlabeled data from the target domain. To this end, previous work focuses on either extracting domain-invariant features or task-agnostic features, ignoring domain-aware features that may be present in the tar…

    Submitted 10 April, 2024; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Accepted at LREC-COLING 2024

  41. arXiv:2304.09402  [pdf, other]

    cs.CL cs.LG

    MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning

    Authors: Bohan Li, Longxu Dou, Yutai Hou, Yunlong Feng, Honglin Mu, Qingfu Zhu, Qinghua Sun, Wanxiang Che

    Abstract: Prompt-based learning has shown considerable promise in reformulating various downstream tasks as cloze problems by combining original input with a predetermined template. This approach demonstrates its effectiveness, especially in few-shot learning scenarios, where the model is trained on a scarce amount of data. Despite its successes, the limited templates and text in few-shot prompt-based learn…

    Submitted 11 November, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: 19 pages, 5 figures, 6 tables

  42. arXiv:2304.04256  [pdf, other]

    cs.CL

    A Preliminary Evaluation of ChatGPT for Zero-shot Dialogue Understanding

    Authors: Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin

    Abstract: Zero-shot dialogue understanding aims to enable dialogue to track the user's needs without any training data, which has gained increasing attention. In this work, we investigate the understanding ability of ChatGPT for zero-shot dialogue understanding tasks including spoken language understanding (SLU) and dialogue state tracking (DST). Experimental results on four popular benchmarks reveal the gr…

    Submitted 9 April, 2023; originally announced April 2023.

    Comments: Technical Report

  43. arXiv:2302.02070

    cs.CV cs.LG

    Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification

    Authors: Bohan Li, Xiao Xu, Xinghao Wang, Yutai Hou, Yunlong Feng, Feng Wang, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che

    Abstract: Existing image augmentation methods consist of two categories: perturbation-based methods and generative methods. Perturbation-based methods apply pre-defined perturbations to augment an original image, but only locally vary the image, thus lacking image diversity. In contrast, generative methods bring more image diversity in the augmented images but may not preserve semantic consistency, thus inc…

    Submitted 18 January, 2024; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: AAAI 2024

  44. arXiv:2301.02010

    cs.CL

    HIT-SCIR at MMNLU-22: Consistency Regularization for Multilingual Spoken Language Understanding

    Authors: Bo Zheng, Zhouyang Li, Fuxuan Wei, Qiguang Chen, Libo Qin, Wanxiang Che

    Abstract: Multilingual spoken language understanding (SLU) consists of two sub-tasks, namely intent detection and slot filling. To improve the performance of these two sub-tasks, we propose to use consistency regularization based on a hybrid data augmentation strategy. The consistency regularization enforces the predicted distributions for an example and its semantically equivalent augmentation to be consis…

    Submitted 5 January, 2023; originally announced January 2023.

    Comments: Accepted by EMNLP2022 MMNLU-22 Workshop. The winner of the MMNLU-22 Competition Full Dataset Task. Code is available at https://github.com/bozheng-hit/MMNLU-22-HIT-SCIR

  45. arXiv:2301.01067

    cs.CL

    Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge

    Authors: Longxu Dou, Yan Gao, Xuqi Liu, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Min-Yen Kan, Jian-Guang Lou

    Abstract: In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by a…

    Submitted 3 January, 2023; originally announced January 2023.

    Comments: EMNLP 2022 Main Conference

  46. arXiv:2212.13492

    cs.CL

    MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

    Authors: Longxu Dou, Yan Gao, Mingyang Pan, Dingzirui Wang, Wanxiang Che, Dechen Zhan, Jian-Guang Lou

    Abstract: Text-to-SQL semantic parsing is an important NLP task, which greatly facilitates the interaction between users and the database and becomes the key component in many human-computer interaction systems. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English. In this work, we present MultiSpider, the largest multilingual text-to-SQL data…

    Submitted 27 December, 2022; originally announced December 2022.

    Comments: AAAI2023 Main Conference. Code: https://github.com/microsoft/ContextualSP

  47. arXiv:2212.13465

    cs.CL cs.AI

    A Survey on Table-and-Text HybridQA: Concepts, Methods, Challenges and Future Directions

    Authors: Dingzirui Wang, Longxu Dou, Wanxiang Che

    Abstract: Table-and-text hybrid question answering (HybridQA) is a widely used and challenging NLP task commonly applied in the financial and scientific domain. The early research focuses on migrating other QA task methods to HybridQA, while with further research, more and more HybridQA-specific methods have been present. With the rapid development of HybridQA, the systematic survey is still under-explored…

    Submitted 1 February, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: 7 pages

  48. arXiv:2212.05773

    cs.CL

    A Survey on Natural Language Processing for Programming

    Authors: Qingfu Zhu, Xianzhen Luo, Fang Liu, Cuiyun Gao, Wanxiang Che

    Abstract: Natural language processing for programming aims to use NLP techniques to assist programming. It is increasingly prevalent for its effectiveness in improving productivity. Distinct from natural language, a programming language is highly structured and functional. Constructing a structure-based representation and a functionality-oriented algorithm is at the heart of program understanding and genera…

    Submitted 5 August, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

  49. arXiv:2211.05344

    cs.CL cs.LG

    LERT: A Linguistically-motivated Pre-trained Language Model

    Authors: Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu

    Abstract: Pre-trained Language Model (PLM) has become a representative foundation model in the natural language processing field. Most PLMs are trained with linguistic-agnostic pre-training tasks on the surface form of the text, such as the masked language model (MLM). To further empower the PLMs with richer linguistic features, in this paper, we aim to propose a simple but effective way to learn linguistic…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 11 pages

  50. arXiv:2209.11486

    cs.CL

    MetaPrompting: Learning to Learn Better Prompts

    Authors: Yutai Hou, Hongyuan Dong, Xinghao Wang, Bohan Li, Wanxiang Che

    Abstract: Prompting method is regarded as one of the crucial progress for few-shot nature language processing. Recent research on prompting moves from discrete tokens based ``hard prompts'' to continuous ``soft prompts'', which employ learnable vectors as pseudo prompt tokens and achieve better performance. Though showing promising prospects, these soft-prompting methods are observed to rely heavily on good…

    Submitted 3 February, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

    Comments: Accepted as COLING 2022 long paper