Skip to main content

Showing 1–50 of 290 results for author: Sun, F

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.18053  [pdf, other

    cs.LG cs.AI

    Bidirectional-Reachable Hierarchical Reinforcement Learning with Mutually Responsive Policies

    Authors: Yu Luo, Fuchun Sun, Tianying Ji, Xianyuan Zhan

    Abstract: Hierarchical reinforcement learning (HRL) addresses complex long-horizon tasks by skillfully decomposing them into subgoals. Therefore, the effectiveness of HRL is greatly influenced by subgoal reachability. Typical HRL methods only consider subgoal reachability from the unilateral level, where a dominant level enforces compliance to the subordinate level. However, we observe that when the dominan… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  2. When Vision Meets Touch: A Contemporary Review for Visuotactile Sensors from the Signal Processing Perspective

    Authors: Shoujie Li, Zihan Wang, Changsheng Wu, Xiang Li, Shan Luo, Bin Fang, Fuchun Sun, Xiao-** Zhang, Wenbo Ding

    Abstract: Tactile sensors, which provide information about the physical properties of objects, are an essential component of robotic systems. The visuotactile sensing technology with the merits of high resolution and low cost has facilitated the development of robotics from environment exploration to dexterous operation. Over the years, several reviews on visuotactile sensors for robots have been presented,… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Accepted by IEEE Journal of Selected Topics in Signal Processing

  3. arXiv:2406.11263  [pdf, other

    cs.CL cs.AI

    The Fall of ROME: Understanding the Collapse of LLMs in Model Editing

    Authors: Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Du Su, Dawei Yin, Huawei Shen

    Abstract: Despite significant progress in model editing methods, their application in real-world scenarios remains challenging as they often cause large language models (LLMs) to collapse. Among them, ROME is particularly concerning, as it could disrupt LLMs with only a single edit. In this paper, we study the root causes of such collapse. Through extensive analysis, we identify two primary factors that con… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2406.10157  [pdf, other

    cs.RO cs.AI

    RoboGolf: Mastering Real-World Minigolf with a Reflective Multi-Modality Vision-Language Model

    Authors: Hantao Zhou, Tianying Ji, Jianwei Zhang, Fuchun Sun, Huazhe Xu

    Abstract: Minigolf, a game with countless court layouts, and complex ball motion, constitutes a compelling real-world testbed for the study of embodied intelligence. As it not only challenges spatial and kinodynamic reasoning but also requires reflective and corrective capacities to address erroneously designed courses. We introduce RoboGolf, a framework that perceives dual-camera visual inputs with nested… ▽ More

    Submitted 23 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://jity16.github.io/RoboGolf/

  5. arXiv:2406.09612  [pdf, other

    cs.AI cs.LG physics.chem-ph

    Automated Molecular Concept Generation and Labeling with Large Language Models

    Authors: Shichang Zhang, Botao Xia, Zimin Zhang, Qianli Wu, Fang Sun, Ziniu Hu, Yizhou Sun

    Abstract: Artificial intelligence (AI) is significantly transforming scientific research. Explainable AI methods, such as concept-based models (CMs), are promising for driving new scientific discoveries because they make predictions based on meaningful concepts and offer insights into the prediction process. In molecular science, however, explainable CMs are not as common compared to black-box models like G… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2406.07876  [pdf, other

    cs.CV cs.AI cs.LG

    Small Scale Data-Free Knowledge Distillation

    Authors: He Liu, Yikai Wang, Hua** Liu, Fuchun Sun, Anbang Yao

    Abstract: Data-free knowledge distillation is able to utilize the knowledge learned by a large teacher network to augment the training of a smaller student network without accessing the original training data, avoiding privacy, security, and proprietary risks in real applications. In this line of research, existing methods typically follow an inversion-and-distillation paradigm in which a generative adversa… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: This work is accepted to CVPR 2024. The project page: https://github.com/OSVAI/SSD-KD

  7. arXiv:2406.05837  [pdf, other

    cs.CV cs.AI

    Solution for CVPR 2024 UG2+ Challenge Track on All Weather Semantic Segmentation

    Authors: Jun Yu, Yunxiang Zhang, Fengzhao Sun, Leilei Wang, Renjie Lu

    Abstract: In this report, we present our solution for the semantic segmentation in adverse weather, in UG2+ Challenge at CVPR 2024. To achieve robust and accurate segmentation results across various weather conditions, we initialize the InternImage-H backbone with pre-trained weights from the large-scale joint dataset and enhance it with the state-of-the-art Upernet segmentation method. Specifically, we uti… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Solution for CVPR 2024 UG2+ Challenge Track on All Weather Semantic Segmentation

  8. arXiv:2406.02925  [pdf, other

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition

    Authors: Hsuan Su, Hua Farn, Fan-Yun Sun, Shang-Tse Chen, Hung-yi Lee

    Abstract: Synthetic data is widely used in speech recognition due to the availability of text-to-speech models, which facilitate adapting models to previously unseen text domains. However, existing methods suffer in performance when they fine-tune an automatic speech recognition (ASR) model on synthetic data as they suffer from the distributional shift commonly referred to as the synthetic-to-real gap. In t… ▽ More

    Submitted 15 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  9. arXiv:2405.19080  [pdf, other

    cs.LG cs.AI

    OMPO: A Unified Framework for RL under Policy and Dynamics Shifts

    Authors: Yu Luo, Tianying Ji, Fuchun Sun, Jianwei Zhang, Huazhe Xu, Xianyuan Zhan

    Abstract: Training reinforcement learning policies using environment interaction data collected from varying policies or dynamics presents a fundamental challenge. Existing works often overlook the distribution discrepancies induced by policy or dynamics shifts, or rely on specialized algorithms with task priors, thus often resulting in suboptimal policy performances and high learning variances. In this pap… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  10. arXiv:2405.18520  [pdf, other

    cs.LG cs.AI

    Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL

    Authors: Yu Luo, Tianying Ji, Fuchun Sun, Jianwei Zhang, Huazhe Xu, Xianyuan Zhan

    Abstract: Off-policy reinforcement learning (RL) has achieved notable success in tackling many complex real-world tasks, by leveraging previously collected data for policy learning. However, most existing off-policy RL algorithms fail to maximally exploit the information in the replay buffer, limiting sample efficiency and policy performance. In this work, we discover that concurrently training an offline R… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  11. arXiv:2405.16822  [pdf, other

    cs.CV

    Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels

    Authors: Yikai Wang, Xinzhou Wang, Zilong Chen, Zhengyi Wang, Fuchun Sun, Jun Zhu

    Abstract: Video generative models are receiving particular attention given their ability to generate realistic and imaginative frames. Besides, these models are also observed to exhibit strong 3D consistency, significantly enhancing their potential to act as world simulators. In this work, we present Vidu4D, a novel reconstruction model that excels in accurately reconstructing 4D (i.e., sequential 3D) repre… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Project page: https://vidu4d-dgs.github.io

  12. arXiv:2405.14280  [pdf, other

    cs.IR

    ASI++: Towards Distributionally Balanced End-to-End Generative Retrieval

    Authors: Yuxuan Liu, Tianchi Yang, Zihan Zhang, Minghui Song, Haizhen Huang, Weiwei Deng, Feng Sun, Qi Zhang

    Abstract: Generative retrieval, a promising new paradigm in information retrieval, employs a seq2seq model to encode document features into parameters and decode relevant document identifiers (IDs) based on search queries. Existing generative retrieval solutions typically rely on a preprocessing stage to pre-define document IDs, which can suffer from a semantic gap between these IDs and the retrieval task.… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  13. arXiv:2405.14219  [pdf, other

    cs.LG cs.AI

    Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making

    Authors: Hanzhao Wang, Yu Pan, Fupeng Sun, Shang Liu, Kalyan Talluri, Guanting Chen, Xiaocheng Li

    Abstract: In this paper, we consider the supervised pretrained transformer for a class of sequential decision-making problems. The class of considered problems is a subset of the general formulation of reinforcement learning in that there is no transition probability matrix, and the class of problems covers bandits, dynamic pricing, and newsvendor problems as special cases. Such a structure enables the use… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  14. arXiv:2405.12130  [pdf, other

    cs.CL cs.LG

    MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

    Authors: Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang

    Abstract: Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models. In this paper, we analyze the impact of low-rank updating, as implemented in LoRA. Our findings suggest that the low-rank updating mechanism may limit the ability of LLMs to effectively learn and memorize new knowledge. Inspired by this observation, we propose a new method called MoRA, which employs… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Work in Progress

  15. arXiv:2405.07237  [pdf, other

    cs.RO

    Soft Contact Simulation and Manipulation Learning of Deformable Objects with Vision-based Tactile Sensor

    Authors: Jianhua Shan, Yuhao Sun, Shixin Zhang, Fuchun Sun, Zixi Chen, Zirong Shen, Cesare Stefanini, Yiyong Yang, Shan Luo, Bin Fang

    Abstract: Deformable object manipulation is a classical and challenging research area in robotics. Compared with rigid object manipulation, this problem is more complex due to the deformation properties including elastic, plastic, and elastoplastic deformation. In this paper, we describe a new deformable object manipulation method including soft contact simulation, manipulation learning, and sim-to-real tra… ▽ More

    Submitted 12 May, 2024; originally announced May 2024.

  16. arXiv:2405.02914  [pdf, other

    cs.RO

    Simulation of Optical Tactile Sensors Supporting Slip and Rotation using Path Tracing and IMPM

    Authors: Zirong Shen, Yuhao Sun, Shixin Zhang, Zixi Chen, Heyi Sun, Fuchun Sun, Bin Fang

    Abstract: Optical tactile sensors are extensively utilized in intelligent robot manipulation due to their ability to acquire high-resolution tactile information at a lower cost. However, achieving adequate reality and versatility in simulating optical tactile sensors is challenging. In this paper, we propose a simulation method and validate its effectiveness through experiments. We utilize path tracing for… ▽ More

    Submitted 5 May, 2024; originally announced May 2024.

  17. arXiv:2405.02803  [pdf, other

    cs.LG cs.DC

    Is Flash Attention Stable?

    Authors: Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Ye** Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

    Abstract: Training large-scale machine learning models poses distinct system challenges, given both the size and complexity of today's workloads. Recently, many organizations training state-of-the-art Generative AI models have reported cases of instability during training, often taking the form of loss spikes. Numeric deviation has emerged as a potential cause of this training instability, although quantify… ▽ More

    Submitted 4 May, 2024; originally announced May 2024.

  18. arXiv:2405.00700  [pdf

    cs.NE cond-mat.str-el

    Oxygen vacancies modulated VO2 for neurons and Spiking Neural Network construction

    Authors: Liang Li, Ting Zhou, Tong Liu, Zhiwei Liu, Ya** Li, Shuo Wu, Shanguang Zhao, **glin Zhu, Meiling Liu, Zhihan Lin, Bowen Sun, Jianjun Li, Fangwen Sun, Chongwen Zou

    Abstract: Artificial neuronal devices are the basic building blocks for neuromorphic computing systems, which have been motivated by realistic brain emulation. Aiming for these applications, various device concepts have been proposed to mimic the neuronal dynamics and functions. While till now, the artificial neuron devices with high efficiency, high stability and low power consumption are still far from pr… ▽ More

    Submitted 16 April, 2024; originally announced May 2024.

    Comments: 18 pages,4 figures

  19. arXiv:2404.18201  [pdf, other

    cs.RO

    What Foundation Models can Bring for Robot Learning in Manipulation : A Survey

    Authors: Dingzhe Li, Yixiang **, Yong A, Hongze Yu, Jun Shi, Xiaoshuai Hao, Peng Hao, Hua** Liu, Fuchun Sun, Bin Fang

    Abstract: The realization of universal robots is an ultimate goal of researchers. However, a key hurdle in achieving this goal lies in the robots' ability to manipulate objects in their unstructured surrounding environments according to different tasks. The learning-based approach is considered an effective way to address generalization. The impressive performance of foundation models in the fields of compu… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  20. arXiv:2404.18166  [pdf, other

    cs.IR

    Behavior-Contextualized Item Preference Modeling for Multi-Behavior Recommendation

    Authors: Mingshi Yan, Fan Liu, **g Sun, Fuming Sun, Zhiyong Cheng, Yahong Han

    Abstract: In recommender systems, multi-behavior methods have demonstrated their effectiveness in mitigating issues like data sparsity, a common challenge in traditional single-behavior recommendation approaches. These methods typically infer user preferences from various auxiliary behaviors and apply them to the target behavior for recommendations. However, this direct transfer can introduce noise to the t… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted by SIGIR 2024

  21. arXiv:2404.17287  [pdf, other

    cs.CL

    When to Trust LLMs: Aligning Confidence with Response Quality

    Authors: Shuchang Tao, Liuyi Yao, Hanxing Ding, Yuexiang Xie, Qi Cao, Fei Sun, **yang Gao, Huawei Shen, Bolin Ding

    Abstract: Despite the success of large language models (LLMs) in natural language generation, much evidence shows that LLMs may produce incorrect or nonsensical text. This limitation highlights the importance of discerning when to trust LLMs, especially in safety-critical domains. Existing methods often express reliability by confidence level, however, their effectiveness is limited by the lack of objective… ▽ More

    Submitted 9 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted by ACL 2024

  22. arXiv:2404.15034  [pdf, other

    cs.LG cs.AI

    Deep Multi-View Channel-Wise Spatio-Temporal Network for Traffic Flow Prediction

    Authors: Hao Miao, Senzhang Wang, Meiyue Zhang, Diansheng Guo, Funing Sun, Fan Yang

    Abstract: Accurately forecasting traffic flows is critically important to many real applications including public safety and intelligent transportation systems. The challenges of this problem include both the dynamic mobility patterns of the people and the complex spatial-temporal correlations of the urban traffic data. Meanwhile, most existing models ignore the diverse impacts of the various traffic observ… ▽ More

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted by AAAI2020 workshop

  23. arXiv:2404.10337  [pdf, other

    cs.AI

    Intriguing Properties of Positional Encoding in Time Series Forecasting

    Authors: Jianqi Zhang, **gyao Wang, Wenwen Qiang, Fanjiang Xu, Changwen Zheng, Fuchun Sun, Hui Xiong

    Abstract: Transformer-based methods have made significant progress in time series forecasting (TSF). They primarily handle two types of tokens, i.e., temporal tokens that contain all variables of the same timestamp, and variable tokens that contain all input time points for a specific variable. Transformer-based methods rely on positional encoding (PE) to mark tokens' positions, facilitating the model to pe… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

  24. arXiv:2404.04317  [pdf, other

    stat.ML cs.LG q-bio.QM

    DeepLINK-T: deep learning inference for time series data using knockoffs and LSTM

    Authors: Wenxuan Zuo, Zifan Zhu, Yuxuan Du, Yi-Chun Yeh, Jed A. Fuhrman, **chi Lv, Yingying Fan, Fengzhu Sun

    Abstract: High-dimensional longitudinal time series data is prevalent across various real-world applications. Many such applications can be modeled as regression problems with high-dimensional time series covariates. Deep learning has been a popular and powerful tool for fitting these regression models. Yet, the development of interpretable and reproducible deep-learning models is challenging and remains un… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

  25. arXiv:2404.00959  [pdf, other

    cs.CV

    Equivariant Local Reference Frames for Unsupervised Non-rigid Point Cloud Shape Correspondence

    Authors: Ling Wang, Runfa Chen, Yikai Wang, Fuchun Sun, Xinzhou Wang, Sun Kai, Guangyuan Fu, Jianwei Zhang, Wenbing Huang

    Abstract: Unsupervised non-rigid point cloud shape correspondence underpins a multitude of 3D vision tasks, yet itself is non-trivial given the exponential complexity stemming from inter-point degree-of-freedom, i.e., pose transformations. Based on the assumption of local rigidity, one solution for reducing complexity is to decompose the overall shape into independent local regions using Local Reference Fra… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  26. arXiv:2403.20091  [pdf, other

    cs.IT eess.SP

    A Signature Based Approach Towards Global Channel Charting with Ultra Low Complexity

    Authors: Longhai Zhao, Yunchuan Yang, Qi Xiong, He Wang, Bin Yu, Feifei Sun, Chengjun Sun

    Abstract: Channel charting, an unsupervised learning method that learns a low-dimensional representation from channel information to preserve geometrical property of physical space of user equipments (UEs), has drawn many attentions from both academic and industrial communities, because it can facilitate many downstream tasks, such as indoor localization, UE handover, beam management, and so on. However, ma… ▽ More

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: accepted by IEEE ICC 2024 Workshops

  27. arXiv:2403.13293  [pdf, other

    cs.CV cs.AI cs.LG

    Building Optimal Neural Architectures using Interpretable Knowledge

    Authors: Keith G. Mills, Fred X. Han, Mohammad Salameh, Shengyao Lu, Chunhua Zhou, Jiao He, Fengyu Sun, Di Niu

    Abstract: Neural Architecture Search is a costly practice. The fact that a search space can span a vast number of design choices with each architecture evaluation taking nontrivial overhead makes it hard for an algorithm to sufficiently explore candidate networks. In this paper, we propose AutoBuild, a scheme which learns to align the latent embeddings of operations and architecture modules with the ground-… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: CVPR'24; 18 Pages, 18 Figures, 3 Tables

  28. arXiv:2403.10395  [pdf, other

    cs.CV cs.LG

    Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding

    Authors: Pengkun Liu, Yikai Wang, Fuchun Sun, Jiafang Li, Hang Xiao, Hongxiang Xue, Xinzhou Wang

    Abstract: Encouraged by the growing availability of pre-trained 2D diffusion models, image-to-3D generation by leveraging Score Distillation Sampling (SDS) is making remarkable progress. Most existing methods combine novel-view lifting from 2D diffusion models which usually take the reference image as a condition while applying hard L2 image supervision at the reference view. Yet heavily adhering to the ima… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Project page: https://isotropic3d.github.io/ Source code: https://github.com/pkunliu/Isotropic3D

  29. arXiv:2403.01265  [pdf, other

    cs.RO eess.SY

    Smooth Computation without Input Delay: Robust Tube-Based Model Predictive Control for Robot Manipulator Planning

    Authors: Yu Luo, Qie Sima, Tianying Ji, Fuchun Sun, Hua** Liu, Jianwei Zhang

    Abstract: Model Predictive Control (MPC) has exhibited remarkable capabilities in optimizing objectives and meeting constraints. However, the substantial computational burden associated with solving the Optimal Control Problem (OCP) at each triggering instant introduces significant delays between state sampling and control application. These delays limit the practicality of MPC in resource-constrained syste… ▽ More

    Submitted 7 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2103.09693

  30. arXiv:2402.18039  [pdf, other

    cs.CL cs.AI

    ResLoRA: Identity Residual Map** in Low-Rank Adaption

    Authors: Shuhua Shi, Shaohan Huang, Minghui Song, Zhoujun Li, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

    Abstract: As one of the most popular parameter-efficient fine-tuning (PEFT) methods, low-rank adaptation (LoRA) is commonly applied to fine-tune large language models (LLMs). However, updating the weights of LoRA blocks effectively and expeditiously is challenging due to the long calculation path in the original model. To address this, we propose ResLoRA, an improved framework of LoRA. By adding residual pa… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 14 pages, 7 figures

  31. arXiv:2402.17511  [pdf, other

    cs.RO cs.AI

    Rethinking Mutual Information for Language Conditioned Skill Discovery on Imitation Learning

    Authors: Zhaoxun Ju, Chao Yang, Hongbo Wang, Yu Qiao, Fuchun Sun

    Abstract: Language-conditioned robot behavior plays a vital role in executing complex tasks by associating human commands or instructions with perception and actions. The ability to compose long-horizon tasks based on unconstrained language instructions necessitates the acquisition of a diverse set of general-purpose skills. However, acquiring inherent primitive skills in a coupled and long-horizon environm… ▽ More

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 16 pages

    ACM Class: I.2.6

  32. arXiv:2402.15754  [pdf, other

    cs.CL

    HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition

    Authors: Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

    Abstract: Large language models (LLMs) have emerged as a promising alternative to expensive human evaluations. However, the alignment and coverage of LLM-based evaluations are often limited by the scope and potential bias of the evaluation prompts and criteria. To address this challenge, we propose HD-Eval, a novel framework that iteratively aligns LLM-based evaluators with human preference via Hierarchical… ▽ More

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 20 pages, 13 figures

  33. arXiv:2402.14843  [pdf, other

    cs.CL cs.AI cs.LG

    Text Diffusion with Reinforced Conditioning

    Authors: Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang

    Abstract: Diffusion models have demonstrated exceptional capability in generating high-quality images, videos, and audio. Due to their adaptiveness in iterative refinement, they provide a strong potential for achieving better non-autoregressive sequence generation. However, existing text diffusion models still fall short in their performance due to a challenge in handling the discreteness of language. This… ▽ More

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 9 pages, 3 figures

  34. arXiv:2402.14528  [pdf, other

    cs.LG cs.AI

    ACE : Off-Policy Actor-Critic with Causality-Aware Entropy Regularization

    Authors: Tianying Ji, Yongyuan Liang, Yan Zeng, Yu Luo, Guowei Xu, Jiawei Guo, Ruijie Zheng, Furong Huang, Fuchun Sun, Huazhe Xu

    Abstract: The varying significance of distinct primitive behaviors during the policy learning process has been overlooked by prior model-free RL algorithms. Leveraging this insight, we explore the causal relationship between different action dimensions and rewards to evaluate the significance of various primitive behaviors during training. We introduce a causality-aware entropy term that effectively identif… ▽ More

    Submitted 22 May, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted by ICML 2024

    ACM Class: I.2

  35. arXiv:2402.10695  [pdf, other

    cs.LG cs.AI cs.CR

    Unlink to Unlearn: Simplifying Edge Unlearning in GNNs

    Authors: Jiajun Tan, Fei Sun, Ruichen Qiu, Du Su, Huawei Shen

    Abstract: As concerns over data privacy intensify, unlearning in Graph Neural Networks (GNNs) has emerged as a prominent research frontier in academia. This concept is pivotal in enforcing the \textit{right to be forgotten}, which entails the selective removal of specific data from trained GNNs upon user request. Our research focuses on edge unlearning, a process of particular relevance to real-world applic… ▽ More

    Submitted 11 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024 as a Short Research Paper

  36. arXiv:2402.09656  [pdf, other

    cs.AI

    The Butterfly Effect of Model Editing: Few Edits Can Trigger Large Language Models Collapse

    Authors: Wanli Yang, Fei Sun, Xinyu Ma, Xun Liu, Dawei Yin, Xueqi Cheng

    Abstract: Although model editing has shown promise in revising knowledge in Large Language Models (LLMs), its impact on the inherent capabilities of LLMs is often overlooked. In this work, we reveal a critical phenomenon: even a single edit can trigger model collapse, manifesting as significant performance degradation in various benchmark tasks. However, benchmarking LLMs after each edit, while necessary to… ▽ More

    Submitted 5 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted at Findings of ACL 2024

  37. arXiv:2402.04869  [pdf, other

    cs.LG cs.AI

    Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy

    Authors: Ruichu Cai, Siyang Huang, Jie Qiao, Wei Chen, Yan Zeng, Keli Zhang, Fuchun Sun, Yang Yu, Zhifeng Hao

    Abstract: As a key component to intuitive cognition and reasoning solutions in human intelligence, causal knowledge provides great potential for reinforcement learning (RL) agents' interpretability towards decision-making by hel** reduce the searching space. However, there is still a considerable gap in discovering and incorporating causality into RL, which hinders the rapid development of causal RL. In t… ▽ More

    Submitted 7 February, 2024; originally announced February 2024.

  38. Zero-shot sketch-based remote sensing image retrieval based on multi-level and attention-guided tokenization

    Authors: Bo Yang, Chen Wang, Xiaoshuang Ma, Bei** Song, Zhuang Liu, Fangde Sun

    Abstract: Effectively and efficiently retrieving images from remote sensing databases is a critical challenge in the realm of remote sensing big data. Utilizing hand-drawn sketches as retrieval inputs offers intuitive and user-friendly advantages, yet the potential of multi-level feature integration from sketches remains underexplored, leading to suboptimal retrieval performance. To address this gap, our st… ▽ More

    Submitted 15 May, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: 44 pages, 6 figures

    Journal ref: Remote Sens. 2024, 16, 1653

  39. arXiv:2401.17723  [pdf, other

    cs.IR

    LoRec: Large Language Model for Robust Sequential Recommendation against Poisoning Attacks

    Authors: Kaike Zhang, Qi Cao, Yunfan Wu, Fei Sun, Huawei Shen, Xueqi Cheng

    Abstract: Sequential recommender systems stand out for their ability to capture users' dynamic interests and the patterns of item-to-item transitions. However, the inherent openness of sequential recommender systems renders them vulnerable to poisoning attacks, where fraudulent users are injected into the training data to manipulate learned patterns. Traditional defense strategies predominantly depend on pr… ▽ More

    Submitted 31 January, 2024; originally announced January 2024.

  40. arXiv:2401.15235  [pdf, other

    eess.IV cs.CV cs.LG

    CascadedGaze: Efficiency in Global Context Extraction for Image Restoration

    Authors: Amirhosein Ghasemabadi, Muhammad Kamran Janjua, Mohammad Salameh, Chunhua Zhou, Fengyu Sun, Di Niu

    Abstract: Image restoration tasks traditionally rely on convolutional neural networks. However, given the local nature of the convolutional operator, they struggle to capture global information. The promise of attention mechanisms in Transformers is to circumvent this problem, but it comes at the cost of intensive computational overhead. Many recent studies in image restoration have focused on solving the c… ▽ More

    Submitted 7 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Published in Transactions on Machine Learning Research (TMLR), 2024. 20 pages

  41. arXiv:2401.14166  [pdf, other

    cs.CL cs.AI

    BayesPrompt: Prompting Large-Scale Pre-Trained Language Models on Few-shot Inference via Debiased Domain Abstraction

    Authors: Jiangmeng Li, Fei Song, Yifan **, Wenwen Qiang, Changwen Zheng, Fuchun Sun, Hui Xiong

    Abstract: As a novel and effective fine-tuning paradigm based on large-scale pre-trained language models (PLMs), prompt-tuning aims to reduce the gap between downstream tasks and pre-training objectives. While prompt-tuning has yielded continuous advancements in various tasks, such an approach still remains a persistent defect: prompt-tuning methods fail to generalize to specific few-shot patterns. From the… ▽ More

    Submitted 20 March, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted by ICLR2024

  42. arXiv:2401.11911  [pdf, other

    cs.CL cs.AI

    Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?

    Authors: Hexiang Tan, Fei Sun, Wanli Yang, Yuanzhuo Wang, Qi Cao, Xueqi Cheng

    Abstract: While auxiliary information has become a key to enhancing Large Language Models (LLMs), relatively little is known about how LLMs merge these contexts, specifically contexts generated by LLMs and those retrieved from external sources. To investigate this, we formulate a systematic framework to identify whether LLMs' responses are attributed to either generated or retrieved contexts. To easily trac… ▽ More

    Submitted 10 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted at ACL 2024 Main, Homepage (https://tan-hexiang.github.io/Blinded_by_Generated_Contexts/)

  43. arXiv:2401.07284  [pdf, other

    cs.CL

    Improving Domain Adaptation through Extended-Text Reading Comprehension

    Authors: Ting Jiang, Shaohan Huang, Shengyue Luo, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Deqing Wang, Fuzhen Zhuang

    Abstract: To enhance the domain-specific capabilities of large language models, continued pre-training on a domain-specific corpus is a prevalent method. Recent work demonstrates that adapting models using reading comprehension data formatted by regex-based patterns can significantly improve performance on domain-specific tasks. However, regex-based patterns are incapable of parsing raw corpora using domain… ▽ More

    Submitted 18 January, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

    Comments: Work in Progress

  44. arXiv:2401.05055  [pdf, other

    cs.CV

    Application of Deep Learning in Blind Motion Deblurring: Current Status and Future Prospects

    Authors: Yawen Xiang, Heng Zhou, Chengyang Li, Fangwei Sun, Zhongbo Li, Yongqiang Xie

    Abstract: Motion deblurring is one of the fundamental problems of computer vision and has received continuous attention. The variability in blur, both within and across images, imposes limitations on non-blind deblurring techniques that rely on estimating the blur kernel. As a response, blind motion deblurring has emerged, aiming to restore clear and detailed images without prior knowledge of the blur type,… ▽ More

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 29 pages, 13 figures, more than 150 papers have been included

  45. arXiv:2312.15300  [pdf, other

    cs.CV

    Q-Boost: On Visual Quality Assessment Ability of Low-level Multi-Modality Foundation Models

    Authors: Zicheng Zhang, Haoning Wu, Zhongpeng Ji, Chunyi Li, Erli Zhang, Wei Sun, Xiaohong Liu, Xiongkuo Min, Fengyu Sun, Shangling Jui, Weisi Lin, Guangtao Zhai

    Abstract: Recent advancements in Multi-modality Large Language Models (MLLMs) have demonstrated remarkable capabilities in complex high-level vision tasks. However, the exploration of MLLM potential in visual quality assessment, a vital aspect of low-level vision, remains limited. To address this gap, we introduce Q-Boost, a novel strategy designed to enhance low-level MLLMs in image quality assessment (IQA… ▽ More

    Submitted 23 December, 2023; originally announced December 2023.

  46. arXiv:2312.14385  [pdf, other

    cs.DC cs.LG cs.MM

    Generative AI Beyond LLMs: System Implications of Multi-Modal Generation

    Authors: Alicia Golden, Samuel Hsia, Fei Sun, Bilge Acun, Basil Hosmer, Ye** Lee, Zachary DeVito, Jeff Johnson, Gu-Yeon Wei, David Brooks, Carole-Jean Wu

    Abstract: As the development of large-scale Generative AI models evolve beyond text (1D) generation to include image (2D) and video (3D) generation, processing spatial and temporal information presents unique challenges to quality, performance, and efficiency. We present the first work towards understanding this new system design space for multi-modal text-to-image (TTI) and text-to-video (TTV) generation m… ▽ More

    Submitted 5 May, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Published at 2024 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

  47. arXiv:2312.14222  [pdf, other

    cs.LG cs.AI

    Hierarchical Topology Isomorphism Expertise Embedded Graph Contrastive Learning

    Authors: Jiangmeng Li, Yifan **, Hang Gao, Wenwen Qiang, Changwen Zheng, Fuchun Sun

    Abstract: Graph contrastive learning (GCL) aims to align the positive features while differentiating the negative features in the latent space by minimizing a pair-wise contrastive loss. As the embodiment of an outstanding discriminative unsupervised graph representation learning approach, GCL achieves impressive successes in various graph benchmarks. However, such an approach falls short of recognizing the… ▽ More

    Submitted 25 December, 2023; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI2024

  48. arXiv:2312.09067  [pdf, other

    cs.CV cs.AI cs.CL cs.RO

    Holodeck: Language Guided Generation of 3D Embodied AI Environments

    Authors: Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark

    Abstract: 3D simulated environments play a critical role in Embodied AI, but their creation requires expertise and extensive manual effort, restricting their diversity and scope. To mitigate this limitation, we present Holodeck, a system that generates 3D environments to match a user-supplied prompt fully automatedly. Holodeck can generate diverse scenes, e.g., arcades, spas, and museums, adjust the designs… ▽ More

    Submitted 22 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Published in CVPR 2024, 21 pages, 27 figures, 2 tables

  49. arXiv:2312.06240  [pdf, other

    cs.CV

    UIEDP:Underwater Image Enhancement with Diffusion Prior

    Authors: Dazhao Du, Enhan Li, Lingyu Si, Fanjiang Xu, Jianwei Niu, Fuchun Sun

    Abstract: Underwater image enhancement (UIE) aims to generate clear images from low-quality underwater images. Due to the unavailability of clear reference images, researchers often synthesize them to construct paired datasets for training deep models. However, these synthesized images may sometimes lack quality, adversely affecting training outcomes. To address this issue, we propose UIE with Diffusion Pri… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

  50. arXiv:2312.05476  [pdf, other

    cs.CV

    Exploring the Naturalness of AI-Generated Images

    Authors: Zijian Chen, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhongpeng Ji, Fengyu Sun, Shangling Jui, Xiongkuo Min, Guangtao Zhai, Wenjun Zhang

    Abstract: The proliferation of Artificial Intelligence-Generated Images (AGIs) has greatly expanded the Image Naturalness Assessment (INA) problem. Different from early definitions that mainly focus on tone-mapped images with limited distortions (e.g., exposure, contrast, and color reproduction), INA on AI-generated images is especially challenging as it has more diverse contents and could be affected by fa… ▽ More

    Submitted 4 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: 33 pages