Showing 1–50 of 343 results for author: He, K

Searching in archive cs.

  1. arXiv:2407.00241  [pdf, ps, other]

    quant-ph cs.IT math.OC

    Exploiting Structure in Quantum Relative Entropy Programs

    Authors: Kerry He, James Saunderson, Hamza Fawzi

    Abstract: Quantum relative entropy programs are convex optimization problems which minimize a linear functional over an affine section of the epigraph of the quantum relative entropy function. Recently, the self-concordance of a natural barrier function was proved for this set. This has opened up the opportunity to use interior-point methods for nonsymmetric cone programs to solve these optimization problem…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: 36 pages, 8 tables

  2. arXiv:2406.19258  [pdf, other]

    cs.LG

    Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers

    Authors: Jinsong Chen, Hanpeng Liu, John E. Hopcroft, Kun He

    Abstract: While tokenized graph Transformers have demonstrated strong performance in node classification tasks, their reliance on a limited subset of nodes with high similarity scores for constructing token sequences overlooks valuable information from other nodes, hindering their ability to fully harness graph information for learning optimal node representations. To address this limitation, we propose a n…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.19249  [pdf, other]

    cs.LG

    NTFormer: A Composite Node Tokenized Graph Transformer for Node Classification

    Authors: Jinsong Chen, Siyu Jiang, Kun He

    Abstract: Recently, the emerging graph Transformers have made significant advancements for node classification on graphs. In most graph Transformers, a crucial step involves transforming the input graph into token sequences as the model input, enabling Transformer to effectively learn the node representations. However, we observe that existing methods only express partial graph information of nodes through…

    Submitted 27 June, 2024; originally announced June 2024.

  4. arXiv:2406.18585  [pdf, other]

    cs.CV cs.AI

    Flexible ViG: Learning the Self-Saliency for Flexible Object Recognition

    Authors: Lin Zuo, Kunshan Yang, Xianlong Tian, Kunbin He, Yongqi Ding, Mengmeng Jing

    Abstract: Existing computer vision methods mainly focus on the recognition of rigid objects, whereas the recognition of flexible objects remains unexplored. Recognizing flexible objects poses significant challenges due to their inherently diverse shapes and sizes, translucent attributes, ambiguous boundaries, and subtle inter-class differences. In this paper, we claim that these problems primarily arise fro…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: under review

  5. FlexCare: Leveraging Cross-Task Synergy for Flexible Multimodal Healthcare Prediction

    Authors: Muhao Xu, Zhenfeng Zhu, Youru Li, Shuai Zheng, Yawei Zhao, Kunlun He, Yao Zhao

    Abstract: Multimodal electronic health record (EHR) data can offer a holistic assessment of a patient's health status, supporting various predictive healthcare tasks. Recently, several studies have embraced the multitask learning approach in the healthcare domain, exploiting the inherent correlations among clinical tasks to predict multiple outcomes simultaneously. However, existing methods necessitate samp…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024 (Research Track)

  6. arXiv:2406.11838  [pdf, other]

    cs.CV

    Autoregressive Image Generation without Vector Quantization

    Authors: Tianhong Li, Yonglong Tian, He Li, Mingyang Deng, Kaiming He

    Abstract: Conventional wisdom holds that autoregressive models for image generation are typically accompanied by vector-quantized tokens. We observe that while a discrete-valued space can facilitate representing a categorical distribution, it is not a necessity for autoregressive modeling. In this work, we propose to model the per-token probability distribution using a diffusion procedure, which allows us t…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Tech report

  7. arXiv:2406.00921  [pdf, other]

    cs.SE

    Towards Effective Detection of Ponzi schemes on Ethereum with Contract Runtime Behavior Graph

    Authors: Ruichao Liang, Jing Chen, Cong Wu, Kun He, Yueming Wu, Weisong Sun, Ruiying Du, Qingchuan Zhao, Yang Liu

    Abstract: Ponzi schemes, a form of scam, have been discovered in Ethereum smart contracts in recent years, causing massive financial losses. Existing detection methods primarily focus on rule-based approaches and machine learning techniques that utilize static information as features. However, these methods have significant limitations. Rule-based approaches rely on pre-defined rules with limited capabiliti…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Submitted to ACM Transactions on Software Engineering and Methodology

  8. arXiv:2406.00427  [pdf, other]

    cs.CV

    You Only Need Less Attention at Each Stage in Vision Transformers

    Authors: Shuoxi Zhang, Hanpeng Liu, Stephen Lin, Kun He

    Abstract: The advent of Vision Transformers (ViTs) marks a substantial paradigm shift in the realm of computer vision. ViTs capture the global information of images through self-attention modules, which perform dot product computations among patchified image tokens. While self-attention modules empower ViTs to capture long-range dependencies, the computational complexity grows quadratically with the number…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Camera-Ready; 10 pages, 3 figures

  9. arXiv:2405.20510  [pdf, other]

    cs.CV

    Physically Compatible 3D Object Modeling from a Single Image

    Authors: Minghao Guo, Bohan Wang, Pingchuan Ma, Tianyuan Zhang, Crystal Elaine Owens, Chuang Gan, Joshua B. Tenenbaum, Kaiming He, Wojciech Matusik

    Abstract: We present a computational framework that transforms single images into 3D physical objects. The visual geometry of a physical object in an image is determined by three orthogonal attributes: mechanical properties, external forces, and rest-shape geometry. Existing single-view 3D reconstruction methods often overlook this underlying composition, presuming rigidity or neglecting external forces. Co…

    Submitted 3 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  10. arXiv:2405.20283  [pdf, other]

    cs.CV cs.GR

    TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes

    Authors: Minghao Guo, Bohan Wang, Kaiming He, Wojciech Matusik

    Abstract: We present TetSphere splatting, an explicit, Lagrangian representation for reconstructing 3D shapes with high-quality geometry. In contrast to conventional object reconstruction methods which predominantly use Eulerian representations, including both neural implicit (e.g., NeRF, NeuS) and explicit representations (e.g., DMTet), and often struggle with high computational demands and suboptimal mesh…

    Submitted 17 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  11. arXiv:2405.17100  [pdf, other]

    cs.CR cs.SD eess.AS

    Sok: Comprehensive Security Overview, Challenges, and Future Directions of Voice-Controlled Systems

    Authors: Haozhe Xu, Cong Wu, Yangyang Gu, Xingcan Shang, Jing Chen, Kun He, Ruiying Du

    Abstract: The integration of Voice Control Systems (VCS) into smart devices and their growing presence in daily life accentuate the importance of their security. Current research has uncovered numerous vulnerabilities in VCS, presenting significant risks to user privacy and security. However, a cohesive and systematic examination of these vulnerabilities and the corresponding solutions is still absent. This…

    Submitted 27 May, 2024; originally announced May 2024.

  12. arXiv:2405.16850  [pdf, other]

    eess.IV cs.CV cs.LG

    UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation

    Authors: Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai

    Abstract: In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times. Our novel method, "UniCompress", innovatively extends the compression capabilities of INR by being the first to compress multi…

    Submitted 27 May, 2024; originally announced May 2024.

  13. arXiv:2405.16380  [pdf, other]

    cs.LG quant-ph

    Dynamic Inhomogeneous Quantum Resource Scheduling with Reinforcement Learning

    Authors: Linsen Li, Pratyush Anand, Kaiming He, Dirk Englund

    Abstract: A central challenge in quantum information science and technology is achieving real-time estimation and feedforward control of quantum systems. This challenge is compounded by the inherent inhomogeneity of quantum resources, such as qubit properties and controls, and their intrinsically probabilistic nature. This leads to stochastic challenges in error detection and probabilistic outcomes in proce…

    Submitted 25 May, 2024; originally announced May 2024.

  14. arXiv:2405.12970  [pdf, other]

    cs.CV

    Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

    Authors: Yue Han, Junwei Zhu, Keke He, Xu Chen, Yanhao Ge, Wei Li, Xiangtai Li, Jiangning Zhang, Chengjie Wang, Yong Liu

    Abstract: Current face reenactment and swapping methods mainly rely on GAN frameworks, but recent focus has shifted to pre-trained diffusion models for their superior generation capabilities. However, training these models is resource-intensive, and the results have not yet achieved satisfactory performance levels. To address this issue, we introduce Face-Adapter, an efficient and effective adapter designed…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Project Page: https://faceadapter.github.io/face-adapter.github.io/

  15. arXiv:2405.11478  [pdf, other]

    cs.CV eess.IV

    Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement

    Authors: Igor Morawski, Kai He, Shusil Dangi, Winston H. Hsu

    Abstract: Currently, low-light conditions present a significant challenge for machine cognition. In this paper, rather than optimizing models by assuming that human and machine cognition are correlated, we use zero-reference low-light enhancement to improve the performance of downstream task models. We propose to improve the zero-reference low-light enhancement method by leveraging the rich visual-linguisti…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024 Workshop NTIRE: New Trends in Image Restoration and Enhancement workshop and Challenges

  16. arXiv:2405.03722  [pdf, other]

    cs.CV

    Class-relevant Patch Embedding Selection for Few-Shot Image Classification

    Authors: Weihao Jiang, Haoyang Cui, Kun He

    Abstract: Effective image classification hinges on discerning relevant features from both foreground and background elements, with the foreground typically holding the critical information. While humans adeptly classify images with limited exposure, artificial neural networks often struggle with feature selection from rare samples. To address this challenge, we propose a novel method for selecting class-rel…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2405.03109

  17. arXiv:2405.03176  [pdf, other]

    cs.NE

    FIMP-HGA: A Novel Approach to Addressing the Partitioning Min-Max Weighted Matching Problem

    Authors: Yuxuan Wang, Jiongzhi Zheng, Jinyao Xie, Kun He

    Abstract: The Partitioning Min-Max Weighted Matching (PMMWM) problem, being a practical NP-hard problem, integrates the task of partitioning the vertices of a bipartite graph into disjoint sets of limited size with the classical Maximum-Weight Perfect Matching (MPWM) problem. Initially introduced in 2015, the state-of-the-art method for addressing PMMWM is the MP$_{\text{LS}}$. In this paper, we present a n…

    Submitted 6 May, 2024; originally announced May 2024.

  18. arXiv:2405.03109  [pdf, other]

    cs.CV

    Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning

    Authors: Weihao Jiang, Chang Liu, Kun He

    Abstract: Humans possess remarkable ability to accurately classify new, unseen images after being exposed to only a few examples. Such ability stems from their capacity to identify common features shared between new and previously seen images while disregarding distractions such as background variations. However, for artificial neural network models, determining the most relevant features for distinguishing…

    Submitted 5 May, 2024; originally announced May 2024.

  19. arXiv:2404.15854  [pdf, other]

    cs.CR cs.LG cs.SD eess.AS

    CLAD: Robust Audio Deepfake Detection Against Manipulation Attacks with Contrastive Learning

    Authors: Haolin Wu, Jing Chen, Ruiying Du, Cong Wu, Kun He, Xingcan Shang, Hao Ren, Guowen Xu

    Abstract: The increasing prevalence of audio deepfakes poses significant security threats, necessitating robust detection methods. While existing detection systems exhibit promise, their robustness against malicious audio manipulations remains underexplored. To bridge the gap, we undertake the first comprehensive study of the susceptibility of the most widely adopted audio deepfake detectors to manipulation…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE TDSC

  20. arXiv:2404.15000  [pdf, other]

    cs.CR

    EarPass: Secure and Implicit Call Receiver Authentication Using Ear Acoustic Sensing

    Authors: Xiping Sun, Jing Chen, Kun He, Zhixiang He, Ruiying Du, Yebo Feng, Qingchuan Zhao, Cong Wu

    Abstract: Private voice communication often contains sensitive information, making it critical to ensure that only authorized users have access to such calls. Unfortunately, current authentication mechanisms, such as PIN-based passwords, fingerprint recognition, and face recognition, fail to authenticate the call receiver, leaving a gap in security. To fill the gap, we present EarPass, a secure and implicit…

    Submitted 23 April, 2024; originally announced April 2024.

  21. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  22. arXiv:2404.09997  [pdf, other]

    cs.NE

    An Efficient Evolutionary Algorithm for Diversified Top-k (Weight) Clique Search Problems

    Authors: Jiongzhi Zheng, Jinghui Xue, Kun He, Chu-Min Li, Yanli Liu

    Abstract: In many real-world problems and applications, finding only a single element, even though the best, among all possible candidates, cannot fully meet the requirements. We may wish to have a collection where each individual is not only outstanding but also distinctive. Diversified Top-k (DTk) problems are a kind of combinatorial optimization problem for finding such a promising collection of multiple…

    Submitted 19 January, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures, 4 tables

  23. arXiv:2404.01106  [pdf, other]

    cs.CR

    MagLive: Near-Field Magnetic Sensing-Based Voice Liveness Detection on Smartphones

    Authors: Xiping Sun, Jing Chen, Cong Wu, Kun He, Haozhe Xu, Yebo Feng, Ruiying Du, Xianhao Chen

    Abstract: Voice authentication has been widely used on smartphones. However, it remains vulnerable to spoofing attacks, where the attacker replays recorded voice samples from authentic humans using loudspeakers to bypass the voice authentication system. In this paper, we present MagLive, a robust voice liveness detection scheme designed for smartphones to mitigate such spoofing attacks. MagLive leverages di…

    Submitted 1 April, 2024; originally announced April 2024.

  24. arXiv:2404.00557  [pdf, other]

    cs.CL

    DivTOD: Unleashing the Power of LLMs for Diversifying Task-Oriented Dialogue Representations

    Authors: Weihao Zeng, Dayuan Fu, Keqing He, Yejie Wang, Yukai Xu, Weiran Xu

    Abstract: Language models pre-trained on general text have achieved impressive results in diverse fields. Yet, the distinct linguistic characteristics of task-oriented dialogues (TOD) compared to general text limit the practical utility of existing language models. Current task-oriented dialogue pre-training methods overlook the one-to-many property of conversations, where multiple responses can be appropri…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: NAACL 2024 (Findings)

  25. arXiv:2403.20261  [pdf, other]

    q-bio.BM cs.AI cs.LG

    FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

    Authors: Kaiyuan Gao, Qizhi Pei, Jinhua Zhu, Kun He, Lijun Wu

    Abstract: Molecular docking is a pivotal process in drug discovery. While traditional techniques rely on extensive sampling and simulation governed by physical principles, these methods are often slow and costly. The advent of deep learning-based approaches has shown significant promise, offering increases in both accuracy and efficiency. Building upon the foundational work of FABind, a model designed with…

    Submitted 7 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 17 pages, 14 figures, 5 tables

  26. arXiv:2403.18330  [pdf, other]

    cs.CV cs.LG

    Tracking-Assisted Object Detection with Event Cameras

    Authors: Ting-Kang Yen, Igor Morawski, Shusil Dangi, Kai He, Chung-Yi Lin, Jia-Fong Yeh, Hung-Ting Su, Winston Hsu

    Abstract: Event-based object detection has recently garnered attention in the computer vision community due to the exceptional properties of event cameras, such as high dynamic range and no motion blur. However, feature asynchronism and sparsity cause invisible objects due to no relative motion to the camera, posing a significant challenge in the task. Prior works have studied various memory mechanisms to p…

    Submitted 27 March, 2024; originally announced March 2024.

  27. arXiv:2403.16428  [pdf, other]

    cs.CV

    Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

    Authors: Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

    Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from the…

    Submitted 25 March, 2024; originally announced March 2024.

  28. arXiv:2403.15483  [pdf]

    eess.SP cs.LG

    Rolling bearing fault diagnosis method based on generative adversarial enhanced multi-scale convolutional neural network model

    Authors: Maoxuan Zhou, Wei Kang, Kun He

    Abstract: In order to solve the problem that current convolutional neural networks cannot capture the correlation features between the time domain signals of rolling bearings effectively, and the model accuracy is limited by the number and quality of samples, a rolling bearing fault diagnosis method based on generative adversarial enhanced multi-scale convolutional neural network model is proposed. Firstly…

    Submitted 21 March, 2024; originally announced March 2024.

  29. arXiv:2403.13663  [pdf, other]

    cs.CV

    T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image

    Authors: Shijie Zhang, Boyan Jiang, Keke He, Junwei Zhu, Ying Tai, Chengjie Wang, Yinda Zhang, Yanwei Fu

    Abstract: Pixel2Mesh (P2M) is a classical approach for reconstructing 3D shapes from a single color image through coarse-to-fine mesh deformation. Although P2M is capable of generating plausible global shapes, its Graph Convolution Network (GCN) often produces overly smooth results, causing the loss of fine-grained geometry details. Moreover, P2M generates non-credible features for occluded regions and stru…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Received by ICASSP 2024

  30. arXiv:2403.08632  [pdf, other]

    cs.CV cs.LG

    A Decade's Battle on Dataset Bias: Are We There Yet?

    Authors: Zhuang Liu, Kaiming He

    Abstract: We revisit the "dataset classification" experiment suggested by Torralba and Efros a decade ago, in the new era with large-scale, diverse, and hopefully less biased datasets as well as more capable neural network architectures. Surprisingly, we observe that modern neural networks can achieve excellent accuracy in classifying which dataset an image is from: e.g., we report 84.7% accuracy on held-ou…

    Submitted 13 March, 2024; originally announced March 2024.

  31. arXiv:2403.07943  [pdf, other]

    cs.LG cs.CR

    Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack

    Authors: Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, Dongrui Fan

    Abstract: Edge perturbation is a basic method to modify graph structures. It can be categorized into two veins based on their effects on the performance of graph neural networks (GNNs), i.e., graph data augmentation and attack. Surprisingly, both veins of edge perturbation methods employ the same operations, yet yield opposite effects on GNNs' accuracy. A distinct boundary between these methods in using edg…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: 14P

  32. arXiv:2403.06211  [pdf, other]

    cs.CG

    Solution-Hashing Search Based on Layout-Graph Transformation for Unequal Circle Packing

    Authors: Jianrong Zhou, Jiyao He, Kun He

    Abstract: The problem of packing unequal circles into a circular container stands as a classic and challenging optimization problem in computational geometry. This study introduces a suite of innovative and efficient methods to tackle this problem. Firstly, we present a novel layout-graph transformation method that represents configurations as graphs, together with an inexact hash method facilitating fast c…

    Submitted 10 March, 2024; originally announced March 2024.

  33. arXiv:2403.03472  [pdf, other]

    cs.LG cs.CV

    Boosting Meta-Training with Base Class Information for Few-Shot Learning

    Authors: Weihao Jiang, Guodong Liu, Di He, Kun He

    Abstract: Few-shot learning, a challenging task in machine learning, aims to learn a classifier adaptable to recognize new, unseen classes with limited labeled examples. Meta-learning has emerged as a prominent framework for few-shot learning. Its training framework is originally a task-level learning method, such as Model-Agnostic Meta-Learning (MAML) and Prototypical Networks. And a recently proposed trai…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 11 pages, 6 figures, submitted to a journal

  34. arXiv:2403.01472  [pdf, other]

    cs.CR cs.CL cs.LG

    WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection

    Authors: Anudeex Shetty, Yue Teng, Ke He, Qiongkai Xu

    Abstract: Embedding as a Service (EaaS) has become a widely adopted solution, which offers feature extraction capabilities for addressing various downstream tasks in Natural Language Processing (NLP). Prior studies have shown that EaaS can be prone to model extraction attacks; nevertheless, this concern could be mitigated by adding backdoor watermarks to the text embeddings and subsequently verifying the at…

    Submitted 9 June, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted to ACL2024 (Main Proceedings)

  35. arXiv:2403.01163  [pdf, other]

    cs.CL

    BootTOD: Bootstrap Task-oriented Dialogue Representations by Aligning Diverse Responses

    Authors: Weihao Zeng, Keqing He, Yejie Wang, Dayuan Fu, Weiran Xu

    Abstract: Pre-trained language models have been successful in many scenarios. However, their usefulness in task-oriented dialogues is limited due to the intrinsic linguistic differences between general text and task-oriented dialogues. Current task-oriented dialogue pre-training methods rely on a contrastive framework, which faces challenges such as selecting true positives and hard negatives, as well as la…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  36. arXiv:2402.17256  [pdf, other]

    cs.CL

    Beyond the Known: Investigating LLMs Performance on Out-of-Domain Intent Detection

    Authors: Pei Wang, Keqing He, Yejie Wang, Xiaoshuai Song, Yutao Mou, Jingang Wang, Yunsen Xian, Xunliang Cai, Weiran Xu

    Abstract: Out-of-domain (OOD) intent detection aims to examine whether the user's query falls outside the predefined domain of the system, which is crucial for the proper functioning of task-oriented dialogue (TOD) systems. Previous methods address it by fine-tuning discriminative models. Recently, some studies have been exploring the application of large language models (LLMs) represented by ChatGPT to var…

    Submitted 4 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Journal ref: LREC-COLING 2024

  37. arXiv:2402.11534  [pdf, other]

    cs.CL cs.AI

    PreAct: Predicting Future in ReAct Enhances Agent's Planning Ability

    Authors: Dayuan Fu, Jianzhao Huang, Siyuan Lu, Guanting Dong, Yejie Wang, Keqing He, Weiran Xu

    Abstract: Addressing the discrepancies between predictions and actual outcomes often aids individuals in expanding their thought processes and engaging in reflection, thereby facilitating reasoning in the correct direction. In this paper, we introduce PreAct, an agent framework that integrates prediction with reasoning and action. Leveraging the information provid…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 13 pages, 6 figures

  38. arXiv:2402.11279  [pdf, other]

    cs.CL cs.AI

    Multi-Perspective Consistency Enhances Confidence Estimation in Large Language Models

    Authors: Pei Wang, Yejie Wang, Muxi Diao, Keqing He, Guanting Dong, Weiran Xu

    Abstract: In the deployment of large language models (LLMs), accurate confidence estimation is critical for assessing the credibility of model predictions. However, existing methods often fail to overcome the issue of overconfidence on incorrect answers. In this work, we focus on improving the confidence estimation of large language models. Considering the fragility of self-awareness in language models, we…

    Submitted 17 February, 2024; originally announced February 2024.

  39. arXiv:2402.10353  [pdf, other]

    cs.CL cs.LG

    Prompt-Based Bias Calibration for Better Zero/Few-Shot Learning of Language Models

    Authors: Kang He, Yinghan Long, Kaushik Roy

    Abstract: Prompt learning is susceptible to intrinsic bias present in pre-trained language models (LMs), resulting in sub-optimal performance of prompt-based zero/few-shot learning. In this work, we propose a null-input prompting method to calibrate intrinsic bias encoded in pre-trained LMs. Different from prior efforts that address intrinsic bias primarily for social fairness and often involve excessive co…

    Submitted 15 February, 2024; originally announced February 2024.

  40. arXiv:2402.09782  [pdf, other]

    cs.LG cs.AI

    MC-DBN: A Deep Belief Network-Based Model for Modality Completion

    Authors: Zihong Luo, Zheng Tao, Yuxuan Huang, Kexin He, Chengzhi Liu

    Abstract: Recent advancements in multi-modal artificial intelligence (AI) have revolutionized the fields of stock market forecasting and heart rate monitoring. Utilizing diverse data sources can substantially improve prediction accuracy. Nonetheless, additional data may not always align with the original dataset. Interpolation methods are commonly utilized for handling missing values in modal data, though t…

    Submitted 20 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Journal ref: International Conference on Computer Supported Cooperative Work in Design 2024

  41. arXiv:2402.09136  [pdf, other]

    cs.CL cs.AI

    DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning

    Authors: Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Yutao Mou, Mengdi Zhang, Jingang Wang, Xunliang Cai, Weiran Xu

    Abstract: Code Large Language Models (Code LLMs) have demonstrated outstanding performance in code-related tasks. Several instruction tuning approaches have been proposed to boost the code generation performance of pre-trained Code LLMs. In this paper, we introduce a diverse instruction model (DolphCoder) with self-evaluating for code generation. It learns diverse instruction targets and combines a code eva…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 14 pages, 6 figures

  42. arXiv:2402.08631  [pdf, other]

    cs.CL cs.AI cs.LG

    Knowledge Editing on Black-box Large Language Models

    Authors: Xiaoshuai Song, Zhengyang Wang, Keqing He, Guanting Dong, Yutao Mou, Jinxu Zhao, Weiran Xu

    Abstract: Knowledge editing (KE) aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge without negatively influencing other knowledge. Current research primarily focuses on white-box LLMs editing, overlooking an important scenario: black-box LLMs editing, where LLMs are accessed through interfaces and only textual output is available. In this pape…

    Submitted 17 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: Work in progress

  43. arXiv:2402.07529  [pdf, other]

    cs.DC cs.DS cs.LG cs.NI

    Accelerating Distributed Deep Learning using Lossless Homomorphic Compression

    Authors: Haoyu Li, Yuchen Xu, Jiayi Chen, Rohit Dwivedula, Wenfei Wu, Keqiang He, Aditya Akella, Daehyeok Kim

    Abstract: As deep neural networks (DNNs) grow in complexity and size, the resultant increase in communication overhead during distributed training has become a significant bottleneck, challenging the scalability of distributed training systems. Existing solutions, while aiming to mitigate this bottleneck through worker-level compression and in-network aggregation, fall short due to their inability to effici…

    Submitted 12 February, 2024; originally announced February 2024.

  44. arXiv:2402.03736  [pdf, other]

    cs.DS cs.DM cs.LG

    An Effective Branch-and-Bound Algorithm with New Bounding Methods for the Maximum $s$-Bundle Problem

    Authors: Jinghui Xue, Jiongzhi Zheng, Mingming Jin, Kun He

    Abstract: The Maximum s-Bundle Problem (MBP) addresses the task of identifying a maximum s-bundle in a given graph. A graph G=(V, E) is called an s-bundle if its vertex connectivity is at least |V|-s, where the vertex connectivity equals the minimum number of vertices whose deletion yields a disconnected or trivial graph. MBP is NP-hard and holds relevance in numerous real-world scenarios emphasizing the ver…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 10 pages, 2 figures, 5 tables

  45. arXiv:2402.01348  [pdf, other]

    cs.LG cs.AI

    CORE: Mitigating Catastrophic Forgetting in Continual Learning through Cognitive Replay

    Authors: Jianshu Zhang, Yankai Fu, Ziheng Peng, Dongyu Yao, Kun He

    Abstract: This paper introduces a novel perspective to significantly mitigate catastrophic forgetting in continuous learning (CL), which emphasizes models' capacity to preserve existing knowledge and assimilate new information. Current replay-based methods treat every task and data sample equally and thus cannot fully exploit the potential of the replay buffer. In response, we propose COgnitive REplay (COR…

    Submitted 9 April, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted by CogSci24 as oral presentation

  46. arXiv:2401.16465  [pdf, other]

    cs.CV cs.GR

    DressCode: Autoregressively Sewing and Generating Garments from Text Guidance

    Authors: Kai He, Kaixin Yao, Qixuan Zhang, Jingyi Yu, Lingjie Liu, Lan Xu

    Abstract: Apparel's significant role in human appearance underscores the importance of garment digitalization for digital human creation. Recent advances in 3D content creation are pivotal for digital human creation. Nonetheless, garment generation from text guidance is still nascent. We introduce a text-driven 3D garment generation framework, DressCode, which aims to democratize design for novices and offe…

    Submitted 14 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Project page: https://IHe-KaiI.github.io/DressCode/

  47. arXiv:2401.14404  [pdf, other]

    cs.CV cs.LG

    Deconstructing Denoising Diffusion Models for Self-Supervised Learning

    Authors: Xinlei Chen, Zhuang Liu, Saining Xie, Kaiming He

    Abstract: In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoencoder (DAE). This deconstructive procedure allows us to explore how various components of modern DDMs influence self-supervised representation learni…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Technical report, 10 pages

  48. arXiv:2401.12461  [pdf, other]

    cs.CL cs.AI

    Fast Adversarial Training against Textual Adversarial Attacks

    Authors: Yichen Yang, Xin Liu, Kun He

    Abstract: Many adversarial defense methods have been proposed to enhance the adversarial robustness of natural language processing models. However, most of them introduce additional pre-set linguistic knowledge and assume that the synonym candidates used by attackers are accessible, which is an ideal assumption. We delve into adversarial training in the embedding space and propose a Fast Adversarial Trainin…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 4 pages, 4 figures

  49. arXiv:2401.10589  [pdf, other]

    cs.AI

    Rethinking the Soft Conflict Pseudo Boolean Constraint on MaxSAT Local Search Solvers

    Authors: Jiongzhi Zheng, Zhuo Chen, Chu-Min Li, Kun He

    Abstract: MaxSAT is an optimization version of the famous NP-complete Satisfiability problem (SAT). Algorithms for MaxSAT mainly include complete solvers and local search incomplete solvers. In many complete solvers, once a better solution is found, a Soft conflict Pseudo Boolean (SPB) constraint will be generated to enforce the algorithm to find better solutions. In many local search algorithms, clause wei…

    Submitted 19 January, 2024; originally announced January 2024.

  50. arXiv:2401.02650  [pdf, other]

    cs.LG stat.ML

    Improving sample efficiency of high dimensional Bayesian optimization with MCMC

    Authors: Zeji Yi, Yunyue Wei, Chu Xin Cheng, Kaibo He, Yanan Sui

    Abstract: Sequential optimization methods are often confronted with the curse of dimensionality in high-dimensional spaces. Current approaches under the Gaussian process framework are still burdened by the computational complexity of tracking Gaussian process posteriors and need to partition the optimization problem into small regions to ensure exploration or assume an underlying low-dimensional structure.…

    Submitted 5 January, 2024; originally announced January 2024.