Showing 1–50 of 227 results for author: Lei, Y

Searching in archive cs.
  1. arXiv:2406.14958  [pdf, other]

    cs.CV

    Skip and Skip: Segmenting Medical Images with Prompts

    Authors: Jiawei Chen, Dingkang Yang, Yuxuan Lei, Lihua Zhang

    Abstract: Most medical image lesion segmentation methods rely on hand-crafted accurate annotations of the original image for supervised learning. Recently, a series of weakly supervised or unsupervised methods have been proposed to reduce the dependence on pixel-level annotations. However, these methods are essentially based on pixel-level annotation, ignoring the image-level diagnostic results of the curre…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Work in progress

  2. arXiv:2406.13282  [pdf, other]

    cs.CL

    Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective

    Authors: Meizhi Zhong, Chen Zhang, Yikun Lei, Xikai Liu, Yan Gao, Yao Hu, Kehai Chen, Min Zhang

    Abstract: Enabling LLMs to handle lengthy context is currently a research hotspot. Most LLMs are built upon rotary position embedding (RoPE), a popular position encoding method. Therefore, a prominent path is to extrapolate the RoPE trained on comparably short texts to far longer texts. A heavy bunch of efforts have been dedicated to boosting the extrapolation via extending the formulations of the RoPE, how…

    Submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.10985  [pdf, other]

    cs.CL

    Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens

    Authors: Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui

    Abstract: Large language models (LLMs) have shown promising efficacy across various tasks, becoming powerful tools in numerous aspects of human life. However, Transformer-based LLMs suffer a performance degradation when modeling long-term contexts due to they discard some information to reduce computational overhead. In this work, we propose a simple yet effective method to enable LLMs to take a deep breath…

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.02983  [pdf, other]

    cs.RO cs.AI

    FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality

    Authors: Keyu Chen, Yuheng Lei, Hao Cheng, Haoran Wu, Wenchao Sun, Sifa Zheng

    Abstract: Generating safety-critical scenarios, which are essential yet difficult to collect at scale, offers an effective method to evaluate the robustness of autonomous vehicles (AVs). Existing methods focus on optimizing adversariality while preserving the naturalness of scenarios, aiming to achieve a balance through data-driven approaches. However, without an appropriate upper bound for adversariality,…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 19 pages. Under review

  5. arXiv:2405.06288  [pdf, other]

    cs.CV

    PCLMix: Weakly Supervised Medical Image Segmentation via Pixel-Level Contrastive Learning and Dynamic Mix Augmentation

    Authors: Yu Lei, Haolun Luo, Lituan Wang, Zhenwei Zhang, Lei Zhang

    Abstract: In weakly supervised medical image segmentation, the absence of structural priors and the discreteness of class feature distribution present a challenge, i.e., how to accurately propagate supervision signals from local to global regions without excessively spreading them to other irrelevant regions? To address this, we propose a novel weakly supervised medical image segmentation framework named PC…

    Submitted 18 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  6. arXiv:2405.02692  [pdf]

    cs.CV physics.med-ph

    Diffeomorphic Transformer-based Abdomen MRI-CT Deformable Image Registration

    Authors: Yang Lei, Luke A. Matkovic, Justin Roper, Tonghe Wang, Jun Zhou, Beth Ghavidel, Mark McDonald, Pretesh Patel, Xiaofeng Yang

    Abstract: This paper aims to create a deep learning framework that can estimate the deformation vector field (DVF) for directly registering abdominal MRI-CT images. The proposed method assumed a diffeomorphic deformation. By using topology-preserved deformation features extracted from the probabilistic diffeomorphic registration model, abdominal motion can be accurately obtained and utilized for DVF estimat…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 18 pages and 4 figures

  7. arXiv:2404.13004  [pdf, other]

    cs.CE cs.AI

    FinLangNet: A Novel Deep Learning Framework for Credit Risk Prediction Using Linguistic Analogy in Financial Data

    Authors: Yu Lei, Zixuan Wang, Chu Liu, Tongyao Wang, Dongyang Lee

    Abstract: Recent industrial applications in risk prediction still heavily rely on extensively manually-tuned, statistical learning methods. Real-world financial data, characterized by its high-dimensionality, sparsity, high noise levels, and significant imbalance, poses unique challenges for the effective application of deep neural network models. In this work, we introduce a novel deep learning risk predic…

    Submitted 19 April, 2024; originally announced April 2024.

  8. arXiv:2404.12228  [pdf, other]

    cs.AI cs.LG

    Relationship Discovery for Drug Recommendation

    Authors: Xiang Li, Shunpan Liang, Yu Lei, Chen Li, Yulei Hou, Tengfei Ma

    Abstract: Medication recommendation systems are designed to deliver personalized drug suggestions that are closely aligned with individual patient needs. Previous studies have primarily concentrated on developing medication embeddings, achieving significant progress. Nonetheless, these approaches often fall short in accurately reflecting individual patient profiles, mainly due to challenges in distinguishin…

    Submitted 18 April, 2024; originally announced April 2024.

  9. arXiv:2404.11993  [pdf, other]

    cs.IR cs.LG

    Knowledge-Aware Multi-Intent Contrastive Learning for Multi-Behavior Recommendation

    Authors: Shunpan Liang, Junjie Zhao, Chen Li, Yu Lei

    Abstract: Multi-behavioral recommendation optimizes user experiences by providing users with more accurate choices based on their diverse behaviors, such as view, add to cart, and purchase. Current studies on multi-behavioral recommendation mainly explore the connections and differences between multi-behaviors from an implicit perspective. Specifically, they directly model those relations using black-box ne…

    Submitted 18 April, 2024; originally announced April 2024.

  10. arXiv:2404.07072  [pdf, other]

    cs.CV

    Implicit Multi-Spectral Transformer: An Lightweight and Effective Visible to Infrared Image Translation Model

    Authors: Yijia Chen, **hua Chen, Xiangxin Zhou, Yingtie Lei, Ziyang Zhou, Mingxian Li

    Abstract: In the field of computer vision, visible light images often exhibit low contrast in low-light conditions, presenting a significant challenge. While infrared imagery provides a potential solution, its utilization entails high costs and practical limitations. Recent advancements in deep learning, particularly the deployment of Generative Adversarial Networks (GANs), have facilitated the transformati…

    Submitted 27 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted by IJCNN 2024

  11. arXiv:2404.04953  [pdf, other]

    cs.CV

    High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning

    Authors: Yu Lei, Guoshuai Sheng, Fangfang Li, Quanxue Gao, Cheng Deng, Qin Li

    Abstract: Zero-shot learning(ZSL) aims to recognize new classes without prior exposure to their samples, relying on semantic knowledge from observed classes. However, current attention-based models may overlook the transferability of visual features and the distinctiveness of attribute localization when learning regional features in images. Additionally, they often overlook shared attributes among different…

    Submitted 7 April, 2024; originally announced April 2024.

  12. arXiv:2404.03210  [pdf, other]

    cs.CV eess.IV

    HDR Imaging for Dynamic Scenes with Events

    Authors: Li Xiaopeng, Zeng Zhaoyuan, Fan Cien, Zhao Chen, Deng Lei, Yu Lei

    Abstract: High dynamic range imaging (HDRI) for real-world dynamic scenes is challenging because moving objects may lead to hybrid degradation of low dynamic range and motion blur. Existing event-based approaches only focus on a separate task, while cascading HDRI and motion deblurring would lead to sub-optimal solutions, and unavailable ground-truth sharp HDR images aggravate the predicament. To address th…

    Submitted 4 April, 2024; originally announced April 2024.

  13. arXiv:2404.02663  [pdf]

    eess.SP cs.IT

    Ground-to-UAV sub-Terahertz channel measurement and modeling

    Authors: Da Li, Peian Li, Jiabiao Zhao, Jianjian Liang, Jiacheng Liu, Guohao Liu, Yuanshuai Lei, Wenbo Liu, Jianqin Deng, Fuyong Liu, Jianjun Ma

    Abstract: Unmanned Aerial Vehicle (UAV) assisted terahertz (THz) wireless communications have been expected to play a vital role in the next generation of wireless networks. UAVs can serve as either repeaters or data collectors within the communication link, thereby potentially augmenting the efficacy of communication systems. Despite their promise, the channel analysis and modeling specific to THz wireless…

    Submitted 28 June, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Submitted to Optics Express

  14. arXiv:2404.01722  [pdf, other]

    cs.CL

    Sentence-level Media Bias Analysis with Event Relation Graph

    Authors: Yuanyuan Lei, Ruihong Huang

    Abstract: Media outlets are becoming more partisan and polarized nowadays. In this paper, we identify media bias at the sentence level, and pinpoint bias sentences that intend to sway readers' opinions. As bias sentences are often expressed in a neutral and factual way, considering broader context outside a sentence can help reveal the bias. In particular, we observe that events in a bias sentence need to b…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024

  15. arXiv:2404.01715  [pdf, other]

    cs.CL

    EMONA: Event-level Moral Opinions in News Articles

    Authors: Yuanyuan Lei, Md Messal Monem Miah, Ayesha Qamar, Sai Ramana Reddy, Jonathan Tong, Haotian Xu, Ruihong Huang

    Abstract: Most previous research on moral frames has focused on social media short texts, little work has explored moral sentiment within news articles. In news articles, authors often express their opinions or political stance through moral judgment towards events, specifically whether the event is right or wrong according to social moral rules. This paper initiates a new task to understand moral opinions…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024

  16. arXiv:2404.01706  [pdf, other]

    cs.CL

    Polarity Calibration for Opinion Summarization

    Authors: Yuanyuan Lei, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Ruihong Huang, Dong Yu

    Abstract: Opinion summarization is automatically generating summaries from a variety of subjective information, such as product reviews or political opinions. The challenge of opinions summarization lies in presenting divergent or even conflicting opinions. We conduct an analysis of previous summarization models, which reveals their inclination to amplify the polarity bias, emphasizing the majority opinions…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024

  17. arXiv:2404.00312  [pdf, other]

    cs.CV cs.AI

    Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

    Authors: Yibo Miao, Yu Lei, Feng Zhou, Zhijie Deng

    Abstract: Low-shot image classification is a fundamental task in computer vision, and the emergence of large-scale vision-language models such as CLIP has greatly advanced the forefront of research in this field. However, most existing CLIP-based methods lack the flexibility to effectively incorporate other pre-trained models that encompass knowledge distinct from CLIP. To bridge the gap, this work proposes…

    Submitted 30 March, 2024; originally announced April 2024.

  18. arXiv:2403.11461  [pdf, other]

    cs.RO

    VIHE: Virtual In-Hand Eye Transformer for 3D Robotic Manipulation

    Authors: Weiyao Wang, Yutian Lei, Shiyu Jin, Gregory D. Hager, Liangjun Zhang

    Abstract: In this work, we introduce the Virtual In-Hand Eye Transformer (VIHE), a novel method designed to enhance 3D manipulation capabilities through action-aware view rendering. VIHE autoregressively refines actions in multiple stages by conditioning on rendered views posed from action predictions in the earlier stages. These virtual in-hand views provide a strong inductive bias for effectively recogniz…

    Submitted 18 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

  19. arXiv:2403.10172  [pdf, other]

    cs.HC

    Unpacking ICT-supported Social Connections and Support of Late-life Migration: From the Lens of Social Convoys

    Authors: Ying Lei, Shuai Ma, Yuling Sun

    Abstract: Migration and aging-related dilemmas have limited the opportunities for late-life migrants to rebuild social connections and access support. While research on migrants has drawn increasing attention in HCI, limited attention has been paid to the increasing number of late-life migrants. This paper reports a qualitative study examining the social connections and support of late-life migrants. In par…

    Submitted 15 March, 2024; originally announced March 2024.

  20. arXiv:2403.09552  [pdf, other]

    cs.HC

    "Are You Really Sure?" Understanding the Effects of Human Self-Confidence Calibration in AI-Assisted Decision Making

    Authors: Shuai Ma, Xinru Wang, Ying Lei, Chuhan Shi, Ming Yin, Xiaojuan Ma

    Abstract: In AI-assisted decision-making, it is crucial but challenging for humans to achieve appropriate reliance on AI. This paper approaches this problem from a human-centered perspective, "human self-confidence calibration". We begin by proposing an analytical framework to highlight the importance of calibrated human self-confidence. In our first study, we explore the relationship between human self-con…

    Submitted 14 March, 2024; originally announced March 2024.

  21. arXiv:2403.06973  [pdf, other]

    cs.CV cs.LG

    Bayesian Diffusion Models for 3D Shape Reconstruction

    Authors: Haiyang Xu, Yu Lei, Zeyuan Chen, Xiang Zhang, Yue Zhao, Yilin Wang, Zhuowen Tu

    Abstract: We present Bayesian Diffusion Models (BDM), a prediction algorithm that performs effective Bayesian inference by tightly coupling the top-down (prior) information with the bottom-up (data-driven) procedure via joint diffusion processes. We show the effectiveness of BDM on the 3D shape reconstruction task. Compared to prototypical deep learning data-driven approaches trained on paired (supervised)…

    Submitted 21 April, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024; Project Page: https://mlpc-ucsd.github.io/BDM/

  22. RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems

    Authors: Jianxun Lian, Yuxuan Lei, Xu Huang, Jing Yao, Wei Xu, Xing Xie

    Abstract: This paper introduces RecAI, a practical toolkit designed to augment or even revolutionize recommender systems with the advanced capabilities of Large Language Models (LLMs). RecAI provides a suite of tools, including Recommender AI Agent, Recommendation-oriented Language Models, Knowledge Plugin, RecExplainer, and Evaluator, to facilitate the integration of LLMs into recommender systems from mult…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 4 pages. Webconf 2024 demo track

    MSC Class: 68T50

  23. arXiv:2403.06420  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    RLingua: Improving Reinforcement Learning Sample Efficiency in Robotic Manipulations With Large Language Models

    Authors: Liangliang Chen, Yutian Lei, Shiyu Jin, Ying Zhang, Liangjun Zhang

    Abstract: Reinforcement learning (RL) has demonstrated its capability in solving various tasks but is notorious for its low sample efficiency. In this paper, we propose RLingua, a framework that can leverage the internal knowledge of large language models (LLMs) to reduce the sample complexity of RL in robotic manipulations. To this end, we first present a method for extracting the prior knowledge of LLMs b…

    Submitted 19 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  24. arXiv:2403.01840  [pdf, other]

    cs.CV cs.AI

    FreeA: Human-object Interaction Detection using Free Annotation Labels

    Authors: Yuxiao Wang, Zhenao Wei, Xinyu Jiang, Yu Lei, Weiying Xue, Jinxiu Liu, Qi Liu

    Abstract: Recent human-object interaction (HOI) detection approaches rely on high cost of manpower and require comprehensive annotated image datasets. In this paper, we propose a novel self-adaption language-driven HOI detection method, termed as FreeA, without labeling by leveraging the adaptability of CLIP to generate latent HOI labels. To be specific, FreeA matches image features of human-object pairs wi…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures, 6 tables

  25. arXiv:2403.00880  [pdf, other]

    cs.IR cs.AI

    Dual-Granularity Medication Recommendation Based on Causal Inference

    Authors: Shunpan Liang, Xiang Li, Xiang Li, Chen Li, Yu Lei, Yulei Hou, Tengfei Ma

    Abstract: As medical demands grow and machine learning technology advances, AI-based diagnostic and treatment systems are garnering increasing attention. Medication recommendation aims to integrate patients' long-term health records with medical knowledge, recommending accuracy and safe medication combinations for specific conditions. However, most existing researches treat medication recommendation systems…

    Submitted 1 March, 2024; originally announced March 2024.

  26. arXiv:2402.18899  [pdf, other]

    cs.IR

    Aligning Language Models for Versatile Text-based Item Retrieval

    Authors: Yuxuan Lei, Jianxun Lian, Jing Yao, Mingqi Wu, Defu Lian, Xing Xie

    Abstract: This paper addresses the gap between general-purpose text embeddings and the specific demands of item retrieval tasks. We demonstrate the shortcomings of existing models in capturing the nuances necessary for zero-shot performance on item retrieval tasks. To overcome these limitations, we propose generate in-domain dataset from ten tasks tailored to unlocking models' representation ability for ite…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 4 pages,1 figures, 4 tables

  27. arXiv:2402.18458  [pdf, other]

    cs.CL

    Meta-Task Prompting Elicits Embedding from Large Language Models

    Authors: Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates

    Abstract: In this work, we introduce a new unsupervised embedding method, Meta-Task Prompting with Explicit One-Word Limitation (MetaEOL), for generating high-quality sentence embeddings from Large Language Models (LLMs) without the need for model fine-tuning or task-specific engineering. Leveraging meta-task prompting, MetaEOL guides LLMs to produce embeddings through a series of carefully designed prompts…

    Submitted 28 February, 2024; originally announced February 2024.

  28. arXiv:2402.18031  [pdf, other]

    cs.IR cs.CL

    Corpus-Steered Query Expansion with Large Language Models

    Authors: Yibin Lei, Yu Cao, Tianyi Zhou, Tao Shen, Andrew Yates

    Abstract: Recent studies demonstrate that query expansions generated by large language models (LLMs) can considerably enhance information retrieval systems by generating hypothetical documents that answer the queries as expansions. However, challenges arise from misalignments between the expansions and the retrieval corpus, resulting in issues like hallucinations and outdated information due to the limited…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: EACL 2024 (Short)

  29. arXiv:2402.16872  [pdf, other]

    cs.IR

    NFT1000: A Visual Text Dataset For Non-Fungible Token Retrieval

    Authors: Shuxun Wang, Yunfei Lei, Ziqi Zhang, Wei Liu, Haowei Liu, Li Yang, Wenjuan Li, Bing Li, Weiming Hu

    Abstract: With the rise of 'Metaverse' and 'Web3.0', NFT ( Non-Fungible Token ) has emerged as a kind of pivotal digital asset, garnering significant attention. By the end of November 2023, more than 1.4 billion NFT tokens have been minted across various blockchain platforms. To effectively locate a satisfactory NFT token, conducting searches within the extensive array of NFT data is essential. The challeng…

    Submitted 28 January, 2024; originally announced February 2024.

    Comments: 6 pages,7 figures

  30. arXiv:2402.14272  [pdf, other]

    cs.CL

    Qsnail: A Questionnaire Dataset for Sequential Question Generation

    Authors: Yan Lei, Liang Pang, Yuanzhuo Wang, Huawei Shen, Xueqi Cheng

    Abstract: The questionnaire is a professional research methodology used for both qualitative and quantitative analysis of human opinions, preferences, attitudes, and behaviors. However, designing and evaluating questionnaires demands significant effort due to their intricate and complex structure. Questionnaires entail a series of questions that must conform to intricate constraints involving the questions,…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: Accepted to the LREC-COLING 2024

  31. arXiv:2402.14228  [pdf, other]

    cs.LG cs.AI

    COPR: Continual Human Preference Learning via Optimal Policy Regularization

    Authors: Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is commonly utilized to improve the alignment of Large Language Models (LLMs) with human preferences. Given the evolving nature of human preferences, continual alignment becomes more crucial and practical in comparison to traditional static alignment. Nevertheless, making RLHF compatible with Continual Learning (CL) is challenging due to its comple…

    Submitted 27 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  32. arXiv:2402.06798  [pdf, other]

    cs.RO

    Reasoning Grasping via Multimodal Large Language Model

    Authors: Shiyu Jin, Jinxuan Xu, Yutian Lei, Liangjun Zhang

    Abstract: Despite significant progress in robotic systems for operation within human-centric environments, existing models still heavily rely on explicit human commands to identify and manipulate specific objects. This limits their effectiveness in environments where understanding and acting on implicit human intentions are crucial. In this study, we introduce a novel task: reasoning grasping, where robots…

    Submitted 25 April, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

  33. arXiv:2402.06073  [pdf]

    cs.CL cs.SD eess.AS

    LightCAM: A Fast and Light Implementation of Context-Aware Masking based D-TDNN for Speaker Verification

    Authors: Di Cao, Xianchen Wang, Junfeng Zhou, Jiakai Zhang, Yanjing Lei, Wenpeng Chen

    Abstract: Traditional Time Delay Neural Networks (TDNN) have achieved state-of-the-art performance at the cost of high computational complexity and slower inference speed, making them difficult to implement in an industrial environment. The Densely Connected Time Delay Neural Network (D-TDNN) with Context Aware Masking (CAM) module has proven to be an efficient structure to reduce complexity while maintaini…

    Submitted 12 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  34. arXiv:2401.14665  [pdf, other]

    q-bio.BM cs.AI

    PepGB: Facilitating peptide drug discovery via graph neural networks

    Authors: Yipin Lei, Xu Wang, Meng Fang, Han Li, Xiang Li, Jianyang Zeng

    Abstract: Peptides offer great biomedical potential and serve as promising drug candidates. Currently, the majority of approved peptide drugs are directly derived from well-explored natural human peptides. It is quite necessary to utilize advanced deep learning techniques to identify novel peptide drugs in the vast, unexplored biochemical space. Despite various in silico methods having been developed to acc…

    Submitted 26 January, 2024; originally announced January 2024.

  35. arXiv:2401.05163  [pdf, other]

    cs.CV cs.AI

    MISS: A Generative Pretraining and Finetuning Approach for Med-VQA

    Authors: Jiawei Chen, Dingkang Yang, Yue Jiang, Yuxuan Lei, Lihua Zhang

    Abstract: Medical visual question answering (VQA) is a challenging multimodal task, where Vision-Language Pre-training (VLP) models can effectively improve the generalization performance. However, most methods in the medical field treat VQA as an answer classification task which is difficult to transfer to practical application scenarios. Additionally, due to the privacy of medical images and the expensive…

    Submitted 19 June, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: ICANN, 2024

  36. arXiv:2401.00657  [pdf, other]

    math.OC cs.CV math.SP

    Optimizing ADMM and Over-Relaxed ADMM Parameters for Linear Quadratic Problems

    Authors: Jintao Song, Wenqi Lu, Yunwen Lei, Yuchao Tang, Zhenkuan Pan, Jinming Duan

    Abstract: The Alternating Direction Method of Multipliers (ADMM) has gained significant attention across a broad spectrum of machine learning applications. Incorporating the over-relaxation technique shows potential for enhancing the convergence rate of ADMM. However, determining optimal algorithmic parameters, including both the associated penalty and relaxation parameters, often relies on empirical approa…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: Accepted to AAAI 2024

  37. arXiv:2401.00243  [pdf, other]

    cs.LG

    Uncertainty-Penalized Reinforcement Learning from Human Feedback with Diverse Reward LoRA Ensembles

    Authors: Yuanzhao Zhai, Han Zhang, Yu Lei, Yue Yu, Kele Xu, Dawei Feng, Bo Ding, Huaimin Wang

    Abstract: Reinforcement learning from human feedback (RLHF) emerges as a promising paradigm for aligning large language models (LLMs). However, a notable challenge in RLHF is overoptimization, where beyond a certain threshold, the pursuit of higher rewards leads to a decline in human preferences. In this paper, we observe the weakness of KL regularization which is commonly employed in existing RLHF methods…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: 10 pages, 5 figures,

  38. arXiv:2312.16850  [pdf, other]

    cs.SD eess.AS

    Accent-VITS:accent transfer for end-to-end TTS

    Authors: Linhan Ma, Yongmao Zhang, Xinfa Zhu, Yi Lei, Ziqian Ning, Pengcheng Zhu, Lei Xie

    Abstract: Accent transfer aims to transfer an accent from a source speaker to synthetic speech in the target speaker's voice. The main challenge is how to effectively disentangle speaker timbre and accent which are entangled in speech. This paper presents a VITS-based end-to-end accent transfer model named Accent-VITS.Based on the main structure of VITS, Accent-VITS makes substantial improvements to enable…

    Submitted 29 December, 2023; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Accepted by NCMMSC2023

  39. arXiv:2312.15380  [pdf, other]

    cs.NI eess.SP

    Battery-Care Resource Allocation and Task Offloading in Multi-Agent Post-Disaster MEC Environment

    Authors: Yiwei Tang, Hualong Huang, Wenhan Zhan, Geyong Min, Zhekai Duan, Yuchuan Lei

    Abstract: Being an up-and-coming application scenario of mobile edge computing (MEC), the post-disaster rescue suffers multitudinous computing-intensive tasks but unstably guaranteed network connectivity. In rescue environments, quality of service (QoS), such as task execution delay, energy consumption and battery state of health (SoH), is of significant meaning. This paper studies a multi-user post-disaste…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: accepted by wcnc2024

  40. arXiv:2312.05038  [pdf, other]

    cs.CV

    Prompt-In-Prompt Learning for Universal Image Restoration

    Authors: Zilong Li, Yiming Lei, Chenglong Ma, Junping Zhang, Hongming Shan

    Abstract: Image restoration, which aims to retrieve and enhance degraded images, is fundamental across a wide range of applications. While conventional deep learning approaches have notably improved the image quality across various tasks, they still suffer from (i) the high storage cost needed for various task-specific models and (ii) the lack of interactivity and flexibility, hindering their wider applicat…

    Submitted 8 December, 2023; originally announced December 2023.

  41. arXiv:2312.02821  [pdf, other]

    cs.CV

    RotaTR: Detection Transformer for Dense and Rotated Object

    Authors: Zhu Yuke, Ruan Yumeng, Yang Lei, Guo Sheng

    Abstract: Detecting the objects in dense and rotated scenes is a challenging task. Recent works on this topic are mostly based on Faster RCNN or Retinanet. As they are highly dependent on the pre-set dense anchors and the NMS operation, the approach is indirect and suboptimal.The end-to-end DETR-based detectors have achieved great success in horizontal object detection and many other areas like segmentation…

    Submitted 5 December, 2023; originally announced December 2023.

  42. RecExplainer: Aligning Large Language Models for Explaining Recommendation Models

    Authors: Yuxuan Lei, Jianxun Lian, Jing Yao, Xu Huang, Defu Lian, Xing Xie

    Abstract: Recommender systems are widely used in online services, with embedding-based models being particularly popular due to their expressiveness in representing complex signals. However, these models often function as a black box, making them less transparent and reliable for both users and developers. Recently, large language models (LLMs) have demonstrated remarkable intelligence in understanding, rea…

    Submitted 22 June, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 12 pages, 9 figures, 5 tables

  43. arXiv:2311.05812  [pdf, other]

    cs.CL

    CFBenchmark: Chinese Financial Assistant Benchmark for Large Language Model

    Authors: Yang Lei, Jiangtong Li, Dawei Cheng, Zhijun Ding, Changjun Jiang

    Abstract: Large language models (LLMs) have demonstrated great potential in the financial domain. Thus, it becomes important to assess the performance of LLMs in the financial tasks. In this work, we introduce CFBenchmark, to evaluate the performance of LLMs for Chinese financial assistant. The basic version of CFBenchmark is designed to evaluate the basic ability in Chinese financial text processing from t…

    Submitted 21 May, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: 12 pages, 4 figures

  44. arXiv:2311.05171  [pdf, other]

    cs.NE

    Rethinking Residual Connection in Training Large-Scale Spiking Neural Networks

    Authors: Yudong Li, Yunlin Lei, Xu Yang

    Abstract: Spiking Neural Network (SNN) is known as the most famous brain-inspired model, but the non-differentiable spiking mechanism makes it hard to train large-scale SNNs. To facilitate the training of large-scale SNNs, many training methods are borrowed from Artificial Neural Networks (ANNs), among which deep residual learning is the most commonly used. But the unique features of SNNs make prior intuiti…

    Submitted 9 November, 2023; originally announced November 2023.

  45. arXiv:2310.20456  [pdf, other]

    cs.CL

    Towards a Deep Understanding of Multilingual End-to-End Speech Translation

    Authors: Haoran Sun, Xiaohu Zhao, Yikun Lei, Shaolin Zhu, Deyi Xiong

    Abstract: In this paper, we employ Singular Value Canonical Correlation Analysis (SVCCA) to analyze representations learnt in a multilingual end-to-end speech translation model trained over 22 languages. SVCCA enables us to estimate representational similarity across languages and layers, enhancing our understanding of the functionality of multilingual speech translation and its potential connection to mult…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  46. arXiv:2310.20210  [pdf, other]

    cs.CV

    UWFormer: Underwater Image Enhancement via a Semi-Supervised Multi-Scale Transformer

    Authors: Weiwen Chen, Yingtie Lei, Shenghong Luo, Ziyang Zhou, Mingxian Li, Chi-Man Pun

    Abstract: Underwater images often exhibit poor quality, distorted color balance and low contrast due to the complex and intricate interplay of light, water, and objects. Despite the significant contributions of previous underwater enhancement techniques, there exist several problems that demand further improvement: (i) The current deep learning methods rely on Convolutional Neural Networks (CNNs) that lack…

    Submitted 24 April, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted by IJCNN 2024

  47. arXiv:2310.19852  [pdf, other]

    cs.AI

    AI Alignment: A Comprehensive Survey

    Authors: Jiaming Ji, Tianyi Qiu, Boyuan Chen, Borong Zhang, Hantao Lou, Kaile Wang, Yawen Duan, Zhonghao He, Jiayi Zhou, Zhaowei Zhang, Fanzhi Zeng, Kwan Yee Ng, Juntao Dai, Xuehai Pan, Aidan O'Gara, Yingshan Lei, Hua Xu, Brian Tse, Jie Fu, Stephen McAleer, Yaodong Yang, Yizhou Wang, Song-Chun Zhu, Yike Guo, Wen Gao

    Abstract: AI alignment aims to make AI systems behave in line with human intentions and values. As AI systems grow more capable, so do risks from misalignment. To provide a comprehensive and up-to-date overview of the alignment field, in this survey, we delve into the core concepts, methodology, and practice of alignment. First, we identify four principles as the key objectives of AI alignment: Robustness,…

    Submitted 1 May, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Continually updated, including weak-to-strong generalization and socio-technical thinking. 58 pages (excluding bibliography), 801 references

  48. arXiv:2310.19654  [pdf, other]

    cs.CV cs.AI

    MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval

    Authors: Youbo Lei, Feifei He, Chen Chen, Yingbin Mo, Si Jia Li, Defeng Xie, Haonan Lu

    Abstract: Due to the success of large-scale visual-language pretraining (VLP) models and the widespread use of image-text retrieval in industry areas, it is now critically necessary to reduce the model size and streamline their mobile-device deployment. Single- and dual-stream model structures are commonly used in image-text retrieval with the goal of closing the semantic gap between textual and visual moda…

    Submitted 1 April, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted by NAACL 2024 Findings

  49. arXiv:2310.18545  [pdf, other]

    cs.CL

    Identifying Conspiracy Theories News based on Event Relation Graph

    Authors: Yuanyuan Lei, Ruihong Huang

    Abstract: Conspiracy theories, as a type of misinformation, are narratives that explains an event or situation in an irrational or malicious manner. While most previous work examined conspiracy theory in social media short texts, limited attention was put on such misinformation in long news documents. In this paper, we aim to identify whether a news article contains conspiracy theories. We observe that a co…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  50. arXiv:2310.18544  [pdf, other]

    cs.CL

    Discourse Structures Guided Fine-grained Propaganda Identification

    Authors: Yuanyuan Lei, Ruihong Huang

    Abstract: Propaganda is a form of deceptive narratives that instigate or mislead the public, usually with a political purpose. In this paper, we aim to identify propaganda in political news at two fine-grained levels: sentence-level and token-level. We observe that propaganda content is more likely to be embedded in sentences that attribute causality or assert contrast to nearby sentences, as well as seen i…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023