Showing 1–50 of 178 results for author: Tang, B

Searching in archive cs.
  1. arXiv:2407.01178  [pdf, other]

    cs.CL cs.AI cs.LG

    $\text{Memory}^3$: Language Modeling with Explicit Memory

    Authors: Hongkang Yang, Zehao Lin, Wen** Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, **bo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang, Weinan E

    Abstract: The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowled…

    Submitted 1 July, 2024; originally announced July 2024.

    MSC Class: 68T50 ACM Class: I.2.7

  2. arXiv:2406.19353  [pdf, other]

    cs.CV

    CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement

    Authors: Chengwen Zhang, Yun Liu, Ruofan Xing, Bingda Tang, Li Yi

    Abstract: Understanding how humans cooperatively rearrange household objects is critical for VR/AR and human-robot interaction. However, in-depth studies on modeling these behaviors are under-researched due to the lack of relevant datasets. We fill this gap by presenting CORE4D, a novel large-scale 4D human-object-human interaction dataset focusing on collaborative object rearrangement, which encompasses di…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.16069  [pdf, other]

    cs.CL cs.AI

    FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models

    Authors: Junyi Zhu, Shuochen Liu, Yu Yu, Bo Tang, Yibo Yan, Zhiyu Li, Feiyu Xiong, Tong Xu, Matthew B. Blaschko

    Abstract: Large language models (LLMs) excel in generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method designed to enhance instruction fine-tuned LLMs' context awareness through fast memorization of the prompt. FastMem maximizes the likelihood of the prompt before in…

    Submitted 23 June, 2024; originally announced June 2024.

  4. arXiv:2406.13662  [pdf, other]

    cs.CL

    ObscurePrompt: Jailbreaking Large Language Models via Obscure Input

    Authors: Yue Huang, **gyu Tang, Dong** Chen, Bingda Tang, Yao Wan, Lichao Sun, Xiangliang Zhang

    Abstract: Recently, Large Language Models (LLMs) have garnered significant attention for their exceptional natural language processing capabilities. However, concerns about their trustworthiness remain unresolved, particularly in addressing "jailbreaking" attacks on aligned LLMs. Previous research predominantly relies on scenarios with white-box LLMs or specific and fixed prompt templates, which are often i…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.12435  [pdf, other]

    cs.LG cs.AI cs.DC

    Federated Learning with Limited Node Labels

    Authors: Bisheng Tang, Xiaojun Chen, Shaopu Wang, Yuexin Xuan, Zhendong Zhao

    Abstract: Subgraph federated learning (SFL) is a research methodology that has gained significant attention for its potential to handle distributed graph-structured data. In SFL, the local model comprises graph neural networks (GNNs) with a partial graph structure. However, some SFL models have overlooked the significance of missing cross-subgraph edges, which can lead to local GNNs being unable to message-…

    Submitted 18 June, 2024; originally announced June 2024.

  6. arXiv:2406.12017  [pdf, other]

    stat.ML cs.LG stat.CO

    Sparsity-Constraint Optimization via Splicing Iteration

    Authors: Zezhi Wang, ** Zhu, Junxian Zhu, Borui Tang, Hongmei Lin, Xueqin Wang

    Abstract: Sparsity-constraint optimization has wide applicability in signal processing, statistics, and machine learning. Existing fast algorithms must burdensomely tune parameters, such as the step size or the implementation of precise stop criteria, which may be challenging to determine in practice. To address this issue, we develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEratio…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 34 pages

  7. arXiv:2406.09089  [pdf, other]

    cs.LG

    DiffPoGAN: Diffusion Policies with Generative Adversarial Networks for Offline Reinforcement Learning

    Authors: Xuemin Hu, Shen Li, Yingfen Xu, Bo Tang, Long Chen

    Abstract: Offline reinforcement learning (RL) can learn optimal policies from pre-collected offline datasets without interacting with the environment, but the sampled actions of the agent often cannot cover the action distribution under a given state, resulting in the extrapolation error issue. Recent works address this issue by employing generative adversarial networks (GANs). However, these methods often…

    Submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.07824  [pdf, other]

    quant-ph cs.CR

    Efficient Arbitrated Quantum Digital Signature with Multi-Receiver Verification

    Authors: Siyu Xiong, Bangying Tang, Hui Han, **quan Huang, Mingqiang Bai, Fangzhao Li, Wanrong Yu, Zhiwen Mo, Bo Liu

    Abstract: Quantum digital signature is used to authenticate the identity of the signer with information theoretical security, while providing non-forgery and non-repudiation services. In traditional multi-receiver quantum digital signature schemes without an arbitrator, the transferability of one-to-one signature is always required to achieve unforgeability, with complicated implementation and heavy key con…

    Submitted 11 June, 2024; originally announced June 2024.

  9. arXiv:2405.19480  [pdf, other]

    cs.NI

    RANFusion: A Comprehensive Tool for Simulating Handover In Next-G RAN

    Authors: Seyed Bagher Hashemi Natanzi, Bo Tang

    Abstract: The rapid advancement of 5G networks and the upcoming transition to 6G necessitate the use of the Open Radio Access Network (O-RAN) architecture to enable greater flexibility, interoperability, and innovation. This shift towards 6G and O-RAN requires the development of advanced simulation tools for testing, analyzing, and optimizing Radio Access Network (RAN) operations. This need becomes critical…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures, 2 tables

  10. arXiv:2405.16933  [pdf, other]

    cs.CL cs.IR

    Empowering Large Language Models to Set up a Knowledge Retrieval Indexer via Self-Learning

    Authors: Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Shichao Song, Hanyu Wang, Jiawei Yang, Feiyu Xiong, Bo Tang, Chenyang Xi

    Abstract: Retrieval-Augmented Generation (RAG) offers a cost-effective approach to injecting real-time knowledge into large language models (LLMs). Nevertheless, constructing and validating high-quality knowledge repositories require considerable effort. We propose a pre-retrieval framework named Pseudo-Graph Retrieval-Augmented Generation (PG-RAG), which conceptualizes LLMs as students by providing them wi…

    Submitted 27 May, 2024; originally announced May 2024.

  11. arXiv:2405.11874  [pdf, other]

    cs.CL

    xFinder: Robust and Pinpoint Answer Extraction for Large Language Models

    Authors: Qingchen Yu, Zifan Zheng, Shichao Song, Zhiyu Li, Feiyu Xiong, Bo Tang, Ding Chen

    Abstract: The continuous advancement of large language models (LLMs) has brought increasing attention to the critical issue of developing fair and reliable methods for evaluating their performance. Particularly, the emergence of subjective or non-subjective cheating phenomena, such as test set leakage and prompt format overfitting, poses significant challenges to the reliable evaluation of LLMs. Since evalu…

    Submitted 23 May, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 37 Pages

  12. arXiv:2405.09492  [pdf, other]

    cs.LG

    MGSER-SAM: Memory-Guided Soft Experience Replay with Sharpness-Aware Optimization for Enhanced Continual Learning

    Authors: Xingyu Li, Bo Tang

    Abstract: Deep neural networks suffer from the catastrophic forgetting problem in the field of continual learning (CL). To address this challenge, we propose MGSER-SAM, a novel memory replay-based algorithm specifically engineered to enhance the generalization capabilities of CL models. We first integrate the SAM optimizer, a component designed for optimizing flatness, which seamlessly fits into well-known…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 8 pages, 5 figures

  13. arXiv:2405.05231  [pdf, other]

    cs.LG

    DiskGNN: Bridging I/O Efficiency and Model Accuracy for Out-of-Core GNN Training

    Authors: Renjie Liu, Yichuan Wang, Xiao Yan, Zhenkun Cai, Minjie Wang, Haitian Jiang, Bo Tang, **yang Li

    Abstract: Graph neural networks (GNNs) are machine learning models specialized for graph data and widely used in many applications. To train GNNs on large graphs that exceed CPU memory, several systems store data on disk and conduct out-of-core processing. However, these systems suffer from either read amplification when reading node features that are usually smaller than a disk page or degraded model accur…

    Submitted 8 May, 2024; originally announced May 2024.

  14. arXiv:2405.01312  [pdf, other]

    cs.DB cs.CR

    Privacy-Enhanced Database Synthesis for Benchmark Publishing

    Authors: Yongrui Zhong, Yunqing Ge, Jianbin Qin, Shuyuan Zheng, Bo Tang, Yu-Xuan Qiu, Rui Mao, Ye Yuan, Makoto Onizuka, Chuan Xiao

    Abstract: Benchmarking is crucial for evaluating a DBMS, yet existing benchmarks often fail to reflect the varied nature of user workloads. As a result, there is increasing momentum toward creating databases that incorporate real-world user data to more accurately mirror business environments. However, privacy concerns deter users from directly sharing their data, underscoring the importance of creating syn…

    Submitted 2 May, 2024; originally announced May 2024.

  15. arXiv:2404.09192  [pdf, other]

    cs.SD cs.AI eess.AS

    Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling

    Authors: Quanxiu Wang, Hui Huang, Mingjie Wang, Yong Dai, **zuomu Zhong, Benlai Tang

    Abstract: Over the past decade, a series of unflagging efforts have been dedicated to developing highly expressive and controllable text-to-speech (TTS) systems. In general, the holistic TTS comprises two interconnected components: the frontend module and the backend module. The frontend excels in capturing linguistic representations from the raw text input, while the backend module converts linguistic cues…

    Submitted 14 April, 2024; originally announced April 2024.

  16. arXiv:2403.18209  [pdf, other]

    cs.LG cs.AI cs.RO

    Long and Short-Term Constraints Driven Safe Reinforcement Learning for Autonomous Driving

    Authors: Xuemin Hu, Pan Chen, Yijun Wen, Bo Tang, Long Chen

    Abstract: Reinforcement learning (RL) has been widely used in decision-making tasks, but it cannot guarantee the agent's safety in the training process due to the requirements of interaction with the environment, which seriously limits its industrial applications such as autonomous driving. Safe RL methods are developed to handle this issue by constraining the expected safety violation costs as a training o…

    Submitted 26 March, 2024; originally announced March 2024.

  17. arXiv:2403.05834  [pdf, other]

    cs.MM cs.SD eess.AS

    Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information

    Authors: Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng

    Abstract: Dance generation, as a branch of human motion generation, has attracted increasing attention. Recently, a few works attempt to enhance dance expressiveness, which includes genre matching, beat alignment, and dance dynamics, from certain aspects. However, the enhancement is quite limited as they lack comprehensive consideration of the aforementioned three factors. In this paper, we propose Expressi…

    Submitted 9 March, 2024; originally announced March 2024.

  18. arXiv:2403.04283  [pdf, other]

    cs.CL cs.AI cs.LG

    Proxy-RLHF: Decoupling Generation and Alignment in Large Language Model with Proxy

    Authors: Yu Zhu, Chuxiong Sun, Wenfei Yang, Wenqiang Wei, Bo Tang, Tianzhu Zhang, Zhiyu Li, Shifeng Zhang, Feiyu Xiong, Jie Hu, Mingchuan Yang

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is the prevailing approach to ensure Large Language Models (LLMs) align with human values. However, existing RLHF methods require a high computational cost, one main reason being that RLHF assigns both the generation and alignment tasks to the LLM simultaneously. In this paper, we introduce Proxy-RLHF, which decouples the generation and alignment p…

    Submitted 7 March, 2024; originally announced March 2024.

  19. arXiv:2403.04261  [pdf]

    cs.AI cs.CL cs.LG

    Advancing Biomedical Text Mining with Community Challenges

    Authors: Hui Zong, Rongrong Wu, Jiaxue Cha, Erman Wu, Jiakun Li, Liang Tao, Zuofeng Li, Buzhou Tang, Bairong Shen

    Abstract: The field of biomedical research has witnessed a significant increase in the accumulation of vast amounts of textual data from various sources such as scientific literature, electronic health records, clinical trial reports, and social media. However, manually processing and analyzing these extensive and complex resources is time-consuming and inefficient. To address this challenge, biomedical te…

    Submitted 7 March, 2024; originally announced March 2024.

  20. arXiv:2403.00878  [pdf, other]

    cs.CR cs.AI

    Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models

    Authors: Jiandong **, Bowen Tang, Mingxuan Ma, Xiao Liu, Yunfei Wang, Qingnan Lai, Jia Yang, Changling Zhou

    Abstract: We introduce Crimson, a system that enhances the strategic reasoning capabilities of Large Language Models (LLMs) within the realm of cybersecurity. By correlating CVEs with MITRE ATT&CK techniques, Crimson advances threat anticipation and strategic defense efforts. Our approach includes defining and evaluating cybersecurity strategic tasks, alongside implementing a comprehensive human-in-the-loo…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 9 pages, 7 figures

  21. arXiv:2403.00862  [pdf, other]

    cs.CL cs.AI

    NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism

    Authors: Miao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Peng Cheng, Yi Luo

    Abstract: We present NewsBench, a novel evaluation framework to systematically assess the capabilities of Large Language Models (LLMs) for editorial capabilities in Chinese journalism. Our constructed benchmark dataset is focused on four facets of writing proficiency and six facets of safety adherence, and it comprises 1,267 manually and carefully designed test samples in the types of multiple choice questi…

    Submitted 4 June, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

    Comments: Long paper, ACL 2024 Main

  22. arXiv:2402.19011  [pdf, other]

    cs.CR

    Ruledger: Ensuring Execution Integrity in Trigger-Action IoT Platforms

    Authors: **gwen Fan, Yi He, Bo Tang, Qi Li, Ravi Sandhu

    Abstract: Smart home IoT systems utilize trigger-action platforms, e.g., IFTTT, to manage devices from various vendors. However, they may be abused by triggering malicious rule execution with forged IoT devices or events violating the execution integrity and the intentions of the users. To address this issue, we propose a ledger based IoT platform called Ruledger, which ensures the correct execution of rule…

    Submitted 29 February, 2024; originally announced February 2024.

    Journal ref: 10.1109/INFOCOM42981.2021.9488687

  23. arXiv:2402.11218  [pdf, other]

    cs.CL

    Controlled Text Generation for Large Language Model with Dynamic Attribute Graphs

    Authors: Xun Liang, Hanyu Wang, Shichao Song, Mengting Hu, Xunzhi Wang, Zhiyu Li, Feiyu Xiong, Bo Tang

    Abstract: Controlled Text Generation (CTG) aims to produce texts that exhibit specific desired attributes. In this study, we introduce a pluggable CTG framework for Large Language Models (LLMs) named Dynamic Attribute Graphs-based controlled text generation (DATG). This framework utilizes an attribute scorer to evaluate the attributes of sentences generated by LLMs and constructs dynamic attribute graphs. D…

    Submitted 24 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: 18 Pages, Accepted by ACL 2024 Findings

  24. arXiv:2402.10464  [pdf, other]

    cs.LG cs.NI

    FedKit: Enabling Cross-Platform Federated Learning for Android and iOS

    Authors: Sichang He, Beilong Tang, Boyan Zhang, Jiaoqi Shao, Xiaomin Ouyang, Daniel Nata Nugraha, Bing Luo

    Abstract: We present FedKit, a federated learning (FL) system tailored for cross-platform FL research on Android and iOS devices. FedKit pipelines cross-platform FL development by enabling model conversion, hardware-accelerated training, and cross-platform model aggregation. Our FL workflow supports flexible machine learning operations (MLOps) in production, facilitating continuous model delivery and traini…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: This work has been accepted for demonstration at the IEEE International Conference on Computer Communications (INFOCOM) 2024

  25. Debiasing Recommendation with Personal Popularity

    Authors: Wentao Ning, Reynold Cheng, Xiao Yan, Ben Kao, Nan Huo, Nur AI Hasan Haldar, Bo Tang

    Abstract: Global popularity (GP) bias is the phenomenon that popular items are recommended much more frequently than they should be, which goes against the goal of providing personalized recommendations and harms user experience and recommendation accuracy. Many methods have been proposed to reduce GP bias but they fail to notice the fundamental problem of GP, i.e., it considers popularity from a \textit{gl…

    Submitted 21 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW'24 as a research full paper

  26. arXiv:2402.05569  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Simplifying Hypergraph Neural Networks

    Authors: Bohan Tang, Zexi Liu, Keyue Jiang, Siheng Chen, Xiaowen Dong

    Abstract: Hypergraphs are crucial for modeling higher-order interactions in real-world data. Hypergraph neural networks (HNNs) effectively utilise these structures by message passing to generate informative node features for various downstream tasks like node classification. However, the message passing block in existing HNNs typically requires a computationally intensive training process, which limits thei…

    Submitted 22 May, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  27. arXiv:2401.17503  [pdf, other]

    quant-ph cs.NI

    Differentiated Service Entanglement Routing for Quantum Networks

    Authors: Hui Han, Bo Liu, Bangying Tang, Siyu Xiong, **quan Huang, Wanrong Yu, Shuhui Chen

    Abstract: The entanglement distribution networks with various topologies are mainly implemented by active wavelength multiplexing routing strategies. However, designing an entanglement routing scheme, which achieves the maximized network connections and the optimal overall network efficiency simultaneously, remains a huge challenge for quantum networks. In this article, we propose a differentiated service e…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 25 pages, 14 figures

  28. arXiv:2401.17043  [pdf, other]

    cs.CL

    CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

    Authors: Yuanjie Lyu, Zhiyu Li, Simin Niu, Feiyu Xiong, Bo Tang, Wen** Wang, Hao Wu, Huanyong Liu, Tong Xu, Enhong Chen, Yi Luo, Peng Cheng, Haiying Deng, Zhonghao Wang, Zijia Lu

    Abstract: Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources. This method addresses common LLM limitations, including outdated information and the tendency to produce inaccurate "hallucinated" content. However, the evaluation of RAG systems is challenging, as existing benchmarks are limited in scope a…

    Submitted 18 February, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 26 Pages

  29. arXiv:2401.15657  [pdf, other]

    cs.CV

    Data-Free Generalized Zero-Shot Learning

    Authors: Bowen Tang, Long Yan, **g Zhang, Qian Yu, Lu Sheng, Dong Xu

    Abstract: Deep learning models have the ability to extract rich knowledge from large-scale datasets. However, the sharing of data has become increasingly challenging due to concerns regarding data copyright and privacy. Consequently, this hampers the effective transfer of knowledge from existing data to novel downstream tasks and concepts. Zero-shot learning (ZSL) approaches aim to recognize new classes by…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI24

  30. arXiv:2401.14758  [pdf, other]

    cs.LG

    Off-Policy Primal-Dual Safe Reinforcement Learning

    Authors: Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

    Abstract: Primal-dual safe RL methods commonly perform iterations between the primal update of the policy and the dual update of the Lagrange Multiplier. Such a training paradigm is highly susceptible to the error in cumulative cost estimation since this estimation serves as the key bond connecting the primal and dual update processes. We show that this problem causes significant underestimation of cost whe…

    Submitted 15 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 Poster

  31. arXiv:2401.13697  [pdf, other]

    cs.CV cs.AI cs.CL

    Toward Robust Multimodal Learning using Multimodal Foundational Models

    Authors: Xianbing Zhao, Soujanya Poria, Xuejiao Li, Yixin Chen, Buzhou Tang

    Abstract: Existing multimodal sentiment analysis tasks rely heavily on the assumption that the training and test sets are complete multimodal data, while this assumption can be difficult to hold: the multimodal data are often incomplete in real-world scenarios. Therefore, a robust multimodal model in scenarios with randomly missing modalities is highly preferred. Recently, CLIP-based multimodal foundatio…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Under Review

  32. arXiv:2401.10811  [pdf, other]

    stat.ML cs.LG

    Simulation Based Bayesian Optimization

    Authors: Roi Naveiro, Becky Tang

    Abstract: Bayesian Optimization (BO) is a powerful method for optimizing black-box functions by combining prior knowledge with ongoing function evaluations. BO constructs a probabilistic surrogate model of the objective function given the covariates, which is in turn used to inform the selection of future evaluation points through an acquisition function. For smooth continuous search spaces, Gaussian Proces…

    Submitted 19 January, 2024; originally announced January 2024.

  33. arXiv:2401.07532  [pdf, other]

    cs.SD cs.AI eess.AS

    Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

    Authors: Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, **g Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng

    Abstract: Variational Autoencoders (VAEs) constitute a crucial component of neural symbolic music generation, among which some works have yielded outstanding results and attracted considerable attention. Nevertheless, previous VAEs still encounter issues with overly long feature sequences and generated results lack contextual coherence, thus the challenge of modeling long multi-track symbolic music still re…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  34. arXiv:2401.03385  [pdf, other]

    cs.CL

    Grimoire is All You Need for Enhancing Large Language Models

    Authors: Ding Chen, Shichao Song, Qingchen Yu, Zhiyu Li, Wen** Wang, Feiyu Xiong, Bo Tang

    Abstract: In-context Learning (ICL) is one of the key methods for enhancing the performance of large language models on specific tasks by providing a set of few-shot examples. However, the ICL capability of different types of models shows significant variation due to factors such as model architecture, volume of learning data, and the size of parameters. Generally, the larger the model's parameter size and…

    Submitted 10 January, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

    Comments: 9 pages

  35. arXiv:2401.02993  [pdf, other]

    cs.CL cs.AI

    ReFusion: Improving Natural Language Understanding with Computation-Efficient Retrieval Representation Fusion

    Authors: Shangyu Wu, Ying Xiong, Yufei Cui, Xue Liu, Buzhou Tang, Tei-Wei Kuo, Chun Jason Xue

    Abstract: Retrieval-based augmentations (RA) incorporating knowledge from an external database into language models have greatly succeeded in various knowledge-intensive (KI) tasks. However, integrating retrievals in non-knowledge-intensive (NKI) tasks is still challenging. Existing works focus on concatenating retrievals with inputs to improve model performance. Unfortunately, the use of retrieval concaten…

    Submitted 27 May, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

  36. arXiv:2401.01369  [pdf, other]

    cs.IR cs.AI cs.LG

    RL-MPCA: A Reinforcement Learning Based Multi-Phase Computation Allocation Approach for Recommender Systems

    Authors: Jiahong Zhou, Shunhui Mao, Guoliang Yang, Bo Tang, Qianlong Xie, Lebin Lin, Xingxing Wang, Dong Wang

    Abstract: Recommender systems aim to recommend the most suitable items to users from a large number of candidates. Their computation cost grows as the number of user requests and the complexity of services (or models) increases. Under the limitation of computation resources (CRs), how to make a trade-off between computation cost and business revenue becomes an essential question. The existing studies focus…

    Submitted 27 December, 2023; originally announced January 2024.

    Comments: 11 pages, 7 figures, published in Proceedings of the ACM Web Conference 2023

  37. arXiv:2312.17522  [pdf, other]

    cs.CL

    Overview of the PromptCBLUE Shared Task in CHIP2023

    Authors: Wei Zhu, Xiaoling Wang, Mosha Chen, Buzhou Tang

    Abstract: This paper presents an overview of the PromptCBLUE shared task (http://cips-chip.org.cn/2023/eval1) held in the CHIP-2023 Conference. This shared task reformulates the CBLUE benchmark, and provides a good testbed for Chinese open-domain or medical-domain large language models (LLMs) in general medical natural language processing. Two different tracks are held: (a) prompt tuning track, investigating…

    Submitted 29 December, 2023; originally announced December 2023.

  38. arXiv:2312.17503  [pdf, other]

    cs.LG cs.GT

    HiBid: A Cross-Channel Constrained Bidding System with Budget Allocation by Hierarchical Offline Deep Reinforcement Learning

    Authors: Hao Wang, Bo Tang, Chi Harold Liu, Shangqin Mao, Jiahong Zhou, Zipeng Dai, Yaqi Sun, Qianlong Xie, Xingxing Wang, Dong Wang

    Abstract: Online display advertising platforms service numerous advertisers by providing real-time bidding (RTB) for the scale of billions of ad requests every day. The bidding strategy handles ad requests across multiple channels to maximize the number of clicks under the set financial constraints, i.e., total budget and cost-per-click (CPC), etc. Different from existing works mainly focusing on single chan…

    Submitted 29 December, 2023; originally announced December 2023.

    Report number: 23-NX-HOIX

  39. arXiv:2312.11858  [pdf, other]

    cs.LG cs.SI

    SimCalib: Graph Neural Network Calibration based on Similarity between Nodes

    Authors: Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, Shun Lei, Helen Meng

    Abstract: Graph neural networks (GNNs) have exhibited impressive performance in modeling graph data as exemplified in various applications. Recently, the GNN calibration problem has attracted increasing attention, especially in cost-sensitive scenarios. Previous work has gained empirical insights on the issue, and devised effective approaches for it, but theoretical supports still fall short. In this work,…

    Submitted 18 December, 2023; originally announced December 2023.

  40. arXiv:2312.11442  [pdf, other]

    cs.HC cs.AI

    Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations

    Authors: Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun Chen, Yu Yang, Boshi Tang, Zhiyong Wu

    Abstract: This paper presents an Exploratory 3D Dance generation framework, E3D2, designed to address the exploration capability deficiency in existing music-conditioned 3D dance generation models. Current models often generate monotonous and simplistic dance sequences that misalign with human preferences because they lack exploration capabilities. The E3D2 framework involves a reward model trained from aut…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: AAAI-24

    ACM Class: I.3.7

  41. arXiv:2312.11385  [pdf, other]

    cs.LG

    Hypergraph Transformer for Semi-Supervised Classification

    Authors: Zexi Liu, Bohan Tang, Ziyuan Ye, Xiaowen Dong, Siheng Chen, Yanfeng Wang

    Abstract: Hypergraphs play a pivotal role in the modelling of data featuring higher-order relations involving more than two entities. Hypergraph neural networks emerge as a powerful tool for processing hypergraph-structured data, delivering remarkable performance across various tasks, e.g., hypergraph node classification. However, these models struggle to capture global structural information due to their r…

    Submitted 2 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024

  42. arXiv:2312.09778  [pdf, other]

    cs.LG eess.SP

    Hypergraph-MLP: Learning on Hypergraphs without Message Passing

    Authors: Bohan Tang, Siheng Chen, Xiaowen Dong

    Abstract: Hypergraphs are vital in modelling data with higher-order relations containing more than two entities, gaining prominence in machine learning and signal processing. Many hypergraph neural networks leverage message passing over hypergraph structures to enhance node representation learning, yielding impressive performances in tasks like hypergraph node classification. However, these message-passing-…

    Submitted 2 June, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024

  43. arXiv:2312.09305  [pdf, other]

    cs.CV

    Stable Score Distillation for High-Quality 3D Generation

    Authors: Boshi Tang, Jianan Wang, Zhiyong Wu, Lei Zhang

    Abstract: Although Score Distillation Sampling (SDS) has exhibited remarkable performance in conditional 3D content generation, a comprehensive understanding of its formulation is still lacking, hindering the development of 3D generation. In this work, we decompose SDS as a combination of three functional components, namely mode-seeking, mode-disengaging and variance-reducing terms, analyzing the properties…

    Submitted 7 February, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

  44. arXiv:2312.07718  [pdf, other]

    cs.LG math.OC

    CaVE: A Cone-Aligned Approach for Fast Predict-then-optimize with Binary Linear Programs

    Authors: Bo Tang, Elias B. Khalil

    Abstract: The end-to-end predict-then-optimize framework, also known as decision-focused learning, has gained popularity for its ability to integrate optimization into the training procedure of machine learning models that predict the unknown cost (objective function) coefficients of optimization problems from contextual instance information. Naturally, most of the problems of interest in this space can be…

    Submitted 15 March, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  45. arXiv:2312.07630  [pdf, other]

    cs.CV

    Building Universal Foundation Models for Medical Image Analysis with Spatially Adaptive Networks

    Authors: Lingxiao Luo, Xuanzhong Chen, Bingda Tang, Xinsheng Chen, Rong Han, Chengpeng Hu, Yujiang Li, Ting Chen

    Abstract: Recent advancements in foundation models, typically trained with self-supervised learning on large-scale and diverse datasets, have shown great potential in medical image analysis. However, due to the significant spatial heterogeneity of medical imaging data, current models must tailor specific structures for different datasets, making it challenging to leverage the abundant unlabeled data. In thi…

    Submitted 23 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  46. arXiv:2311.15296  [pdf, other]

    cs.CL

    UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation

    Authors: Xun Liang, Shichao Song, Simin Niu, Zhiyu Li, Feiyu Xiong, Bo Tang, Yezhaohui Wang, Dawei He, Peng Cheng, Zhonghao Wang, Haiying Deng

    Abstract: Large language models (LLMs) have emerged as pivotal contributors in contemporary natural language processing and are increasingly being applied across a diverse range of industries. However, these large-scale probabilistic statistical models cannot currently ensure the requisite quality in professional content generation. These models often produce hallucinated text, compromising their practical…

    Submitted 23 May, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: Accepted by ACL 2024

  47. arXiv:2311.13765  [pdf, other]

    math.OC cs.LG physics.soc-ph

    Learning Optimal and Fair Policies for Online Allocation of Scarce Societal Resources from Data Collected in Deployment

    Authors: Bill Tang, Çağıl Koçyiğit, Eric Rice, Phebe Vayanos

    Abstract: We study the problem of allocating scarce societal resources of different types (e.g., permanent housing, deceased donor kidneys for transplantation, ventilators) to heterogeneous allocatees on a waitlist (e.g., people experiencing homelessness, individuals suffering from end-stage renal disease, Covid-19 patients) based on their observed covariates. We leverage administrative data collected in de…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: 61 pages, 9 figures, 2 tables

  48. arXiv:2310.16654  [pdf, other]

    cs.CL

    ChatGPT is a Potential Zero-Shot Dependency Parser

    Authors: Boda Lin, Xinyi Zhou, Binghao Tang, Xiaocheng Gong, Si Li

    Abstract: Pre-trained language models have been widely used in dependency parsing task and have achieved significant improvements in parser performance. However, it remains an understudied question whether pre-trained language models can spontaneously exhibit the ability of dependency parsing without introducing additional parser structure in the zero-shot scenario. In this paper, we propose to explore the…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 10 pages

  49. arXiv:2310.14151  [pdf, other]

    cs.CL

    PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain

    Authors: Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang

    Abstract: Biomedical language understanding benchmarks are the driving forces for artificial intelligence applications with large language model (LLM) back-ends. However, most current benchmarks: (a) are limited to English which makes it challenging to replicate many of the successes in English for other languages, or (b) focus on knowledge probing of LLMs and neglect to evaluate how LLMs apply these knowle…

    Submitted 21 October, 2023; originally announced October 2023.

  50. arXiv:2310.12733  [pdf, other]

    eess.IV cs.CV

    Multiscale Motion-Aware and Spatial-Temporal-Channel Contextual Coding Network for Learned Video Compression

    Authors: Yiming Wang, Qian Huang, Bin Tang, Huashan Sun, Xing Li

    Abstract: Recently, learned video compression has achieved exciting performance. Following the traditional hybrid prediction coding framework, most learned methods generally adopt the motion estimation motion compensation (MEMC) method to remove inter-frame redundancy. However, inaccurate motion vector (MV) usually lead to the distortion of reconstructed frame. In addition, most approaches ignore the spatia…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 12 pages, 12 figures