Showing 1–50 of 8,727 results for author: Su

  1. arXiv:2407.09417  [pdf, other]

    cs.CL cs.IR

    Mitigating Entity-Level Hallucination in Large Language Models

    Authors: Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu

    Abstract: The emergence of Large Language Models (LLMs) has revolutionized how users access information, shifting from traditional search engines to direct question-and-answer interactions with LLMs. However, the widespread adoption of LLMs has revealed a significant challenge known as hallucination, wherein LLMs generate coherent yet factually inaccurate responses. This hallucination phenomenon has led to…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.09233  [pdf, other]

    astro-ph.SR physics.space-ph

    An impulsive geomagnetic effect from an early-impulsive flare

    Authors: Hugh S. Hudson, Edward W. Cliver, Lyndsay Fletcher, Declan A. Diver, Peter T. Gallagher, Ying Li, Christopher M. J. Osborne, Craig Stark, Yang Su

    Abstract: The geomagnetic "solar flare effect" (SFE) results from excess ionization in the Earth's ionosphere, famously first detected at the time of the Carrington flare in 1859. This indirect detection of a flare constituted one of the first cases of "multimessenger astronomy," whereby solar ionizing radiation stimulates ionospheric currents. Well-observed SFEs have few-minute time scales and perturbation…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: MNRAS to be published

  3. arXiv:2407.09024  [pdf, other]

    cs.LG

    Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control

    Authors: Huayu Chen, Kaiwen Zheng, Hang Su, Jun Zhu

    Abstract: Drawing upon recent advances in language model alignment, we formulate offline Reinforcement Learning as a two-stage optimization problem: First pretraining expressive generative policies on reward-free behavior datasets, then fine-tuning these policies to align with task-specific annotations like Q-values. This strategy allows us to leverage abundant and diverse behavior data to enhance generaliz…

    Submitted 12 July, 2024; originally announced July 2024.

  4. arXiv:2407.08961  [pdf]

    eess.IV cs.CV

    Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT

    Authors: Jie Zheng, Ru Wen, Haiqin Hu, Lina Wei, Kui Su, Wei Chen, Chen Liu, Jun Wang

    Abstract: Existing Masked Image Modeling (MIM) depends on a spatial patch-based masking-reconstruction strategy to perceive objects' features from unlabeled images, which may face two limitations when applied to chest CT: 1) inefficient feature learning due to complex anatomical details presented in CT images, and 2) suboptimal knowledge transfer owing to input disparity between upstream and downstream model…

    Submitted 11 July, 2024; originally announced July 2024.

  5. arXiv:2407.08949  [pdf, other]

    cs.CV

    One-Shot Pose-Driving Face Animation Platform

    Authors: He Feng, Donglin Di, Yongjia Ma, Wei Chen, Tonghua Su

    Abstract: The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs. Current approaches often require fine-tuning for specific identities and frequently fail to produce expressive videos due to the limited effectiveness of Wav2Pose modules. To facilitate the generation of one-…

    Submitted 11 July, 2024; originally announced July 2024.

  6. arXiv:2407.08762  [pdf, ps, other]

    cs.SI cs.LG

    Commute-Time-Optimised Graphs for GNNs

    Authors: Igor Sterner, Shiye Su, Petar Veličković

    Abstract: We explore graph rewiring methods that optimise commute time. Recent graph rewiring approaches facilitate long-range interactions in sparse graphs, making such rewirings commute-time-optimal $\textit{on average}$. However, when an expert prior exists on which node pairs should or should not interact, a superior rewiring would favour short commute times between these privileged node pairs. We const…

    Submitted 9 July, 2024; originally announced July 2024.

  7. arXiv:2407.08558  [pdf, other]

    cs.AI

    ST-Mamba: Spatial-Temporal Mamba for Traffic Flow Estimation Recovery using Limited Data

    Authors: Doncheng Yuan, Jianzhe Xue, Jinshan Su, Wenchao Xu, Haibo Zhou

    Abstract: Traffic flow estimation (TFE) is crucial for urban intelligent traffic systems. While traditional on-road detectors are hindered by limited coverage and high costs, cloud computing and data mining of vehicular network data, such as driving speeds and GPS coordinates, present a promising and cost-effective alternative. Furthermore, minimizing data collection can significantly reduce overhead. Howev…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by 2024 IEEE/CIC International Conference on Communications in China (ICCC)

  8. arXiv:2407.08525  [pdf, other]

    quant-ph

    Faster Preparation of Multi-qubit Entanglement with Higher Success Rates

    Authors: B. -B. Liu, Shi-Lei Su, Y. -L. Zuo, Gang Chen, Ş. K. Özdemir, H. Jing

    Abstract: A noteworthy discovery in recent research is that the process of two-qubit quantum entanglement preparation can be significantly accelerated near the exceptional point (EP) or spectral coalescence of non-Hermitian systems, as compared to conventional Hermitian setups. Nevertheless, a significant obstacle for quantum EP-based devices is their limited success rate in generating highly entangled stat…

    Submitted 11 July, 2024; originally announced July 2024.

  9. arXiv:2407.08353  [pdf]

    cond-mat.mtrl-sci

    One-dimensional flat bands in phosphorene nanoribbons with pentagonal nature

    Authors: Shuo Sun, Jing-Yang You, Zhihao Cai, Jie Su, Tong Yang, Xinnan Peng, Yihe Wang, Daiyu Geng, Jian Gou, Yuli Huang, Sisheng Duan, Lan Chen, Kehui Wu, Andrew T. S. Wee, Yuan Ping Feng, Jia Lin Zhang, Jiong Lu, Baojie Feng, Wei Chen

    Abstract: Materials with topological flat bands can serve as a promising platform to investigate strongly interacting phenomena. However, experimental realization of ideal flat bands is mostly limited to artificial lattices or moiré systems. Here we report a general way to construct one-dimensional (1D) flat bands in phosphorene nanoribbons (PNRs) with pentagonal nature: penta-hexa-PNRs and penta-dodeca-PNR…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 13 pages, 4 figures

  10. arXiv:2407.08317  [pdf, other]

    stat.ME

    Inference procedures in sequential trial emulation with survival outcomes: comparing confidence intervals based on the sandwich variance estimator, bootstrap and jackknife

    Authors: Juliette M. Limozin, Shaun R. Seaman, Li Su

    Abstract: Sequential trial emulation (STE) is an approach to estimating causal treatment effects by emulating a sequence of target trials from observational data. In STE, inverse probability weighting is commonly utilised to address time-varying confounding and/or dependent censoring. Then structural models for potential outcomes are applied to the weighted data to estimate treatment effects. For inference,…

    Submitted 12 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: Main text: 23 pages, 5 figures, 5 tables. Supplementary materials included

  11. arXiv:2407.08268  [pdf, other]

    cs.CV

    Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

    Authors: Tong Shao, Zhuotao Tian, Hang Zhao, Jingyong Su

    Abstract: CLIP, as a vision-language model, has significantly advanced Open-Vocabulary Semantic Segmentation (OVSS) with its zero-shot capabilities. Despite its success, its application to OVSS faces challenges due to its initial image-level alignment training, which affects its performance in tasks requiring detailed local context. Our study delves into the impact of CLIP's [CLS] token on patch feature cor…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ECCV24 accepted

  12. arXiv:2407.08238  [pdf, ps, other]

    eess.SY

    Integrated User Matching and Pricing in Round-Trip Car-Sharing

    Authors: Avalpreet Singh Brar, Rong Su, Gioele Zardini, Jaskaranveer Kaur

    Abstract: Traditional round-trip car rental systems mandate users to return vehicles to their point of origin, limiting the system adaptability to meet diverse mobility demands. This constraint often leads to fleet under-utilization and incurs high parking costs for idle vehicles. To address this inefficiency, we propose an N-user matching algorithm which is designed to facilitate one-way trips within the ro…

    Submitted 11 July, 2024; originally announced July 2024.

  13. arXiv:2407.08234  [pdf, other]

    cs.RO eess.SY

    Model Predictive Control For Mobile Manipulators Based On Neural Dynamics (Extended version)

    Authors: Tao Su, Shiqi Zheng

    Abstract: This article focuses on the trajectory tracking problem of mobile manipulators (MMs). Firstly, we construct a position and orientation model predictive tracking control (POMPTC) scheme for mobile manipulators. The proposed POMPTC scheme can simultaneously minimize the tracking error, joint velocity, and joint acceleration. Moreover, it can achieve synchronous control for the position and orientati…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: This article consists of 13 pages, including the text and the proof process

  14. arXiv:2407.08206  [pdf]

    cs.CL

    System Report for CCL24-Eval Task 7: Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation

    Authors: Jingshen Zhang, Xiangyu Yang, Xinkai Su, Xinglu Chen, Tianyou Huang, Xinying Qiu

    Abstract: This system report presents our approaches and results for the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024. For Track 1, we optimized predictions for challenging fine-grained error types using binary classification models and trained coarse-grained models on the Chinese Learner 4W corpus. In Track 2, we enhanced performance by constructing a pseudo-dataset with multiple error types pe…

    Submitted 11 July, 2024; originally announced July 2024.

  15. arXiv:2407.08150  [pdf, other]

    cs.CV

    Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding

    Authors: Minghui Wu, Chenxu Zhao, Anyang Su, Donglin Di, Tianyu Fu, Da An, Min He, Ya Gao, Meng Ma, Kun Yan, ** Wang

    Abstract: Understanding of video creativity and content often varies among individuals, with differences in focal points and cognitive levels across different ages, experiences, and genders. There is currently a lack of research in this area, and most existing benchmarks suffer from several drawbacks: 1) a limited number of modalities and answers with restrictive length; 2) the content and scenarios within…

    Submitted 10 July, 2024; originally announced July 2024.

  16. arXiv:2407.07930  [pdf]

    q-bio.BM cs.LG

    Token-Mol 1.0: Tokenized drug design with large language model

    Authors: Jike Wang, Rui Qin, Mingyang Wang, Meijing Fang, Yangyang Zhang, Yuchen Zhu, Qun Su, Qiaolin Gou, Chao Shen, Odin Zhang, Zhenxing Wu, Dejun Jiang, Xujun Zhang, Huifeng Zhao, Xiaozhe Wan, Zhourui Wu, Liwei Liu, Yu Kang, Chang-Yu Hsieh, Tingjun Hou

    Abstract: Significant interest has recently arisen in leveraging sequence-based large language models (LLMs) for drug design. However, most current applications of LLMs in drug discovery lack the ability to comprehend three-dimensional (3D) structures, thereby limiting their effectiveness in tasks that explicitly involve molecular conformations. In this study, we introduced Token-Mol, a token-only 3D drug…

    Submitted 10 July, 2024; originally announced July 2024.

  17. arXiv:2407.07651  [pdf, other]

    hep-ex physics.data-an

    Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

    Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

    Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946 GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…

    Submitted 10 July, 2024; originally announced July 2024.

  18. arXiv:2407.07538  [pdf, other]

    q-bio.QM nlin.PS

    An epidemical model with nonlocal spatial infections

    Authors: Su Yang, Weiqi Chu, Panayotis Kevrekidis

    Abstract: The SIR model is one of the most prototypical compartmental models in epidemiology. Generalizing this ordinary differential equation (ODE) framework into a spatially distributed partial differential equation (PDE) model is a considerable challenge. In the present work, we extend a recently proposed model based on nearest-neighbor spatial interactions by one of the authors in~\cite{vaziry2022modell…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 15 pages, 9 figures

  19. arXiv:2407.07433  [pdf, other]

    cs.CV cs.AI

    Controllable Navigation Instruction Generation with Chain of Thought Prompting

    Authors: Xianghao Kong, Jinyu Chen, Wenguan Wang, Hang Su, Xiaolin Hu, Yi Yang, Si Liu

    Abstract: Instruction generation is a vital and multidisciplinary research area with broad applications. Existing instruction generation models are limited to generating instructions in a single style from a particular dataset, and the style and content of generated instructions cannot be controlled. Moreover, most existing instruction generation methods also disregard the spatial modeling of the navigation…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  20. arXiv:2407.07365  [pdf, other]

    cs.CV

    High-Resolution Cloud Detection Network

    Authors: Jingsheng Li, Tianxiang Xue, Jiayi Zhao, Jingmin Ge, Yufang Min, Wei Su, Kun Zhan

    Abstract: The complexity of clouds, particularly in terms of texture detail at high resolutions, has not been well explored by most existing cloud detection networks. This paper introduces the High-Resolution Cloud Detection Network (HR-cloud-Net), which utilizes a hierarchical high-resolution integration approach. HR-cloud-Net integrates a high-resolution representation module, layer-wise cascaded feature…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Journal of Electronic Imaging

  21. arXiv:2407.07169  [pdf, other]

    astro-ph.GA astro-ph.CO astro-ph.SR hep-ph

    Unraveling the role of merger histories in the population of Insitu stars: linking IllustrisTNG cosmological simulation to H3 survey

    Authors: Razieh Emami, Lars Hernquist, Randall Smith, James F. Steiner, Grant Tremblay, Douglas Finkbeiner, Mark Vogelsberger, Josh Grindlay, Federico Marinacci, Kung-Yi Su, Cecilia Garraffo, Yuan-Sen Ting, Phillip A. Cargile, Rebecca L. Davies, Chloë E. Benton, Yijia Li, Letizia Bugiani, Amir H. Khoram, Sownak Bose

    Abstract: We undertake a comprehensive investigation into the distribution of insitu stars within Milky Way-like galaxies, leveraging TNG50 simulations and comparing their predictions with data from the H3 survey. Our analysis reveals that 28% of galaxies demonstrate reasonable agreement with H3, while only 12% exhibit excellent alignment in their profiles, regardless of the specific spatial cut employed to…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 20 pages, 13 figures

  22. arXiv:2407.07089  [pdf, other]

    cs.LG

    Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic

    Authors: Ruochen Jin, Bojian Hou, Jiancong Xiao, Weijie Su, Li Shen

    Abstract: Task arithmetic has recently emerged as a cost-effective and scalable approach to edit pre-trained models directly in weight space, by adding the fine-tuned weights of different tasks. The performance has been further improved by a linear property which is illustrated by weight disentanglement. Yet, conventional linearization methods (e.g., NTK linearization) not only double the time and training…

    Submitted 9 July, 2024; originally announced July 2024.

  23. arXiv:2407.07038  [pdf, other]

    cs.CL

    Decoding Climate Disagreement: A Graph Neural Network-Based Approach to Understanding Social Media Dynamics

    Authors: Ruiran Su, Janet B. Pierrehumbert

    Abstract: This work introduces the ClimateSent-GAT Model, an innovative method that integrates Graph Attention Networks (GATs) with techniques from natural language processing to accurately identify and predict disagreements within Reddit comment-reply pairs. Our model classifies disagreements into three categories: agree, disagree, and neutral. Leveraging the inherent graph structure of Reddit comment-repl…

    Submitted 9 July, 2024; originally announced July 2024.

  24. arXiv:2407.07024  [pdf, other]

    cs.CV cs.AI

    Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization

    Authors: Jeongseok Hyun, Su Ho Han, Hyolim Kang, Joon-Young Lee, Seon Joo Kim

    Abstract: The vocabulary size in temporal action localization (TAL) is constrained by the scarcity of large-scale annotated datasets. To address this, recent works incorporate powerful pre-trained vision-language models (VLMs), such as CLIP, to perform open-vocabulary TAL (OV-TAL). However, unlike VLMs trained on extensive image/video-text pairs, existing OV-TAL methods still rely on small, fully labeled TA…

    Submitted 9 July, 2024; originally announced July 2024.

  25. arXiv:2407.06865  [pdf, ps, other]

    math.RT math.QA

    Affine $\imath$quantum groups and Steinberg varieties of type C

    Authors: Changjian Su, Weiqiang Wang

    Abstract: We provide a geometric realization of the quasi-split affine $\imath$quantum group of type AIII$_{2n-1}^{(τ)}$ in terms of equivariant K-groups of non-connected Steinberg varieties of type C. This uses a new Drinfeld type presentation of this affine $\imath$quantum group which admits very nontrivial Serre relations. We then construct à la Springer a family of finite-dimensional standard modules an…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 47 pages. Comments are welcome

  26. arXiv:2407.06373  [pdf]

    eess.IV eess.SP

    Enhancing super-resolution ultrasound localisation through multi-frame deconvolution exploiting spatiotemporal coherence

    Authors: Su Yan, Clotilde Vié, Marcelo Lerendegui, Herman Verinaz-Jadan, Jipeng Yan, Martina Tashkova, James Burn, Bingxue Wang, Gary Frost, Kevin G. Murphy, Meng-Xing Tang

    Abstract: Super-resolution ultrasound imaging through microbubble (MB) localisation and tracking, also known as ultrasound localisation microscopy, allows non-invasive sub-diffraction resolution imaging of microvasculature in animals and humans. The number of MBs localised from the acquired contrast-enhanced ultrasound (CEUS) images and the localisation precision directly influence the quality of the result…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 26 pages, 1 table, 7 figures

  27. arXiv:2407.06329  [pdf, other]

    cs.LG cs.AI

    Solving Multi-Model MDPs by Coordinate Ascent and Dynamic Programming

    Authors: Xihong Su, Marek Petrik

    Abstract: Multi-model Markov decision process (MMDP) is a promising framework for computing policies that are robust to parameter uncertainty in MDPs. MMDPs aim to find a policy that maximizes the expected return over a distribution of MDP models. Because MMDPs are NP-hard to solve, most methods resort to approximations. In this paper, we derive the policy gradient of MMDPs and propose CADP, which combines…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted at UAI 2023

  28. arXiv:2407.06135  [pdf, other]

    cs.CL cs.AI cs.CV

    ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation

    Authors: Ethan Chern, Jiadi Su, Yan Ma, Pengfei Liu

    Abstract: Previous open-source large multimodal models (LMMs) have faced several limitations: (1) they often lack native integration, requiring adapters to align visual representations with pre-trained large language models (LLMs); (2) many are restricted to single-modal generation; (3) while some support multimodal generation, they rely on separate diffusion models for visual modeling and generation. To mi…

    Submitted 8 July, 2024; originally announced July 2024.

  29. arXiv:2407.06006  [pdf, other]

    quant-ph

    Heisenberg-limited Bayesian phase estimation with low-depth digital quantum circuits

    Authors: Su Direkci, Ran Finkelstein, Manuel Endres, Tuvia Gefen

    Abstract: Optimal phase estimation protocols require complex state preparation and readout schemes, generally unavailable or unscalable in many quantum platforms. We develop and analyze a scheme that achieves near-optimal precision up to a constant overhead for Bayesian phase estimation, using simple digital quantum circuits with depths scaling logarithmically with the number of qubits. We find that for Gau…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 27 pages, 13 figures

  30. arXiv:2407.05993  [pdf, other]

    cs.CV

    Self-Prior Guided Mamba-UNet Networks for Medical Image Super-Resolution

    Authors: Zexin Ji, Beiji Zou, Xiaoyan Kui, Pierre Vera, Su Ruan

    Abstract: In this paper, we propose a self-prior guided Mamba-UNet network (SMamba-UNet) for medical image super-resolution. Existing methods are primarily based on convolutional neural networks (CNNs) or Transformers. CNNs-based methods fail to capture long-range dependencies, while Transformer-based approaches face heavy calculation challenges due to their quadratic computational complexity. Recently, Sta…

    Submitted 8 July, 2024; originally announced July 2024.

  31. arXiv:2407.05969  [pdf, other]

    cs.CV

    Deform-Mamba Network for MRI Super-Resolution

    Authors: Zexin Ji, Beiji Zou, Xiaoyan Kui, Pierre Vera, Su Ruan

    Abstract: In this paper, we propose a new architecture, called Deform-Mamba, for MR image super-resolution. Unlike conventional CNN or Transformer-based super-resolution approaches which encounter challenges related to the local receptive field or heavy computational cost, our approach aims to effectively explore the local and global information of images. Specifically, we develop a Deform-Mamba encoder wh…

    Submitted 8 July, 2024; originally announced July 2024.

  32. arXiv:2407.05963  [pdf, ps, other]

    cs.SE cs.AI cs.NI cs.SI

    6GSoft: Software for Edge-to-Cloud Continuum

    Authors: Muhammad Azeem Akbar, Matteo Esposito, Sami Hyrynsalmi, Karthikeyan Dinesh Kumar, Valentina Lenarduzzi, Xiaozhou Li, Ali Mehraj, Tommi Mikkonen, Sergio Moreschini, Niko Mäkitalo, Markku Oivo, Anna-Sofia Paavonen, Risha Parveen, Kari Smolander, Ruoyu Su, Kari Systä, Davide Taibi, Nan Yang, Zheying Zhang, Muhammad Zohaib

    Abstract: In the era of 6G, developing and managing software requires cutting-edge software engineering (SE) theories and practices tailored for such complexity across a vast number of connected edge devices. Our project aims to lead the development of sustainable methods and energy-efficient orchestration models specifically for edge environments, enhancing architectural support driven by AI for contempora…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  33. arXiv:2407.05638  [pdf, other]

    cs.CV

    HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion

    Authors: Junhao Su, Chenghao He, Feiyu Zhu, Xiaojie Xu, Dongzhi Guan, Chenyang Si

    Abstract: Traditional deep learning relies on end-to-end backpropagation for training, but it suffers from drawbacks such as high memory consumption and not aligning with biological neural networks. Recent advancements have introduced locally supervised learning, which divides networks into modules with isolated gradients and trains them locally. However, this approach can lead to performance lag due to lim…

    Submitted 8 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  34. arXiv:2407.05623  [pdf, other]

    cs.CV

    Momentum Auxiliary Network for Supervised Local Learning

    Authors: Junhao Su, Changpeng Cai, Feiyu Zhu, Chenghao He, Xiaojie Xu, Dongzhi Guan, Chenyang Si

    Abstract: Deep neural networks conventionally employ end-to-end backpropagation for their training process, which lacks biological credibility and triggers a locking dilemma during network parameter updates, leading to significant GPU memory use. Supervised local learning segments the network into multiple local blocks updated by independent auxiliary networks. However, these methods cannot replace e…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  35. arXiv:2407.05577  [pdf, other]

    cs.CV

    Audio-driven High-resolution Seamless Talking Head Video Editing via StyleGAN

    Authors: Jiacheng Su, Kunhong Liu, Liyan Chen, Junfeng Yao, Qingsong Liu, Dongdong Lv

    Abstract: Existing methods for audio-driven talking head video editing are limited by poor visual effects. This paper tries to tackle this problem by seamlessly editing talking face images with different emotions based on two modules: (1) an audio-to-landmark module, consisting of the CrossReconstructed Emotion Disentanglement and an alignment network module. It bridges the gap between speec…

    Submitted 7 July, 2024; originally announced July 2024.

  36. arXiv:2407.05576  [pdf, other]

    cs.CV

    ORMNet: Object-centric Relationship Modeling for Egocentric Hand-object Segmentation

    Authors: Yuejiao Su, Yi Wang, Lap-Pui Chau

    Abstract: Egocentric hand-object segmentation (EgoHOS) is a brand-new task aiming at segmenting the hands and interacting objects in the egocentric image. Although significant advancements have been achieved by current methods, establishing an end-to-end model with high accuracy remains an unresolved challenge. Moreover, existing methods lack explicit modeling of the relationships between hands and objects…

    Submitted 7 July, 2024; originally announced July 2024.

  37. arXiv:2407.05395  [pdf, other]

    nucl-th

    Quantifying angular distributions in multinucleon transfer reactions with a semi-classical method

    Authors: Zehong Liao, Zepeng Gao, Yu Yang, Yue** Fang, Jun Su, Long Zhu

    Abstract: The multinucleon transfer (MNT) process in low-energy heavy ion collisions can be utilized to produce unknown nuclei far beyond the stability line. However, the reaction products exhibit broad angular and energy distributions, which could lower the experimental detection efficiency. We present a classical approach that employs a parameterized angular distribution to describe the complex issue. By…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: 6 pages, 6 figures

  38. arXiv:2407.05389  [pdf, other]

    cs.CV cs.AI

    Image-Conditional Diffusion Transformer for Underwater Image Enhancement

    Authors: Xingyang Nie, Su Pan, Xiaoyu Zhai, Shifei Tao, Fengzhong Qu, Biao Wang, Huilin Ge, Guojie Xiao

    Abstract: Underwater image enhancement (UIE) has attracted much attention owing to its importance for underwater operation and marine engineering. Motivated by the recent advance in generative models, we propose a novel UIE method based on image-conditional diffusion transformer (ICDT). Our method takes the degraded underwater image as the conditional input and converts it into latent space where ICDT is ap…

    Submitted 7 July, 2024; originally announced July 2024.

  39. arXiv:2407.05248  [pdf, other]

    cs.CV

    Self-Paced Sample Selection for Barely-Supervised Medical Image Segmentation

    Authors: Junming Su, Zhiqiang Shen, Peng Cao, Jinzhu Yang, Osmar R. Zaiane

    Abstract: The existing barely-supervised medical image segmentation (BSS) methods, adopting a registration-segmentation paradigm, aim to learn from data with very few annotations to mitigate the extreme label scarcity problem. However, this paradigm poses a challenge: pseudo-labels generated by image registration come with significant noise. To address this issue, we propose a self-paced sample selection fr…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: Accepted to MICCAI 2024

  40. arXiv:2407.05229  [pdf, other]

    cs.LG

    HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning

    Authors: Liyuan Wang, Jingyi Xie, Xingxing Zhang, Hang Su, Jun Zhu

    Abstract: The deployment of pre-trained models (PTMs) has greatly advanced the field of continual learning (CL), enabling positive knowledge transfer and resilience to catastrophic forgetting. To sustain these advantages for sequentially arriving tasks, a promising direction involves keeping the pre-trained backbone frozen while employing parameter-efficient tuning (PET) techniques to instruct representatio…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: This is a generalized version of our HiDe-Prompt (NeurIPS 2023, Spotlight)

  41. arXiv:2407.05138  [pdf, other]

    cs.SE cs.AI

    Vortex under Ripplet: An Empirical Study of RAG-enabled Applications

    Authors: Yuchen Shao, Yuheng Huang, Jiawei Shen, Lei Ma, Ting Su, Chengcheng Wan

    Abstract: Large language models (LLMs) enhanced by retrieval-augmented generation (RAG) provide effective solutions in various application scenarios. However, developers face challenges in integrating RAG-enhanced LLMs into software systems, due to lack of interface specification, requirements from software context, and complicated system management. In this paper, we manually studied 100 open-source applic…

    Submitted 6 July, 2024; originally announced July 2024.

  42. arXiv:2407.05117  [pdf, ps, other]

    hep-ex

    Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II

    Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (349 additional authors not shown)

    Abstract: We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 8 pages, 4 figures

    Report number: Belle II Preprint 2024-020; KEK Preprint 2024-17

  43. arXiv:2407.05098  [pdf, other]

    cs.LG cs.AI

    FedTSA: A Cluster-based Two-Stage Aggregation Method for Model-heterogeneous Federated Learning

    Authors: Boyu Fan, Chenrui Wu, Xiang Su, Pan Hui

    Abstract: Despite extensive research into data heterogeneity in federated learning (FL), system heterogeneity remains a significant yet often overlooked challenge. Traditional FL approaches typically assume homogeneous hardware resources across FL clients, implying that clients can train a global model within a comparable time. However, in practical FL systems, clients often have heterogeneous resources, wh…

    Submitted 6 July, 2024; originally announced July 2024.

  44. arXiv:2407.04739  [pdf, other]

    eess.SP

    Classification of Power Quality Disturbances Using Resnet with Channel Attention Mechanism

    Authors: Su Pan, Xingyang Nie, Xiaoyu Zhai, Biao Wang, Huilin Ge, Cheng He, Zhenping Ding

    Abstract: The detection and classification of power quality disturbances (PQDs) carries significant importance for power systems. In response to this imperative, numerous intelligent diagnostic methods have been developed. However, existing identification methods usually concentrate on single-type signals or on complex signals with two types, rendering them susceptible to noisy labels and environmental effe…

    Submitted 2 July, 2024; originally announced July 2024.

  45. arXiv:2407.04232  [pdf]

    q-bio.QM physics.bio-ph q-bio.BM q-bio.SC

    A Unified Intracellular pH Landscape with SITE-pHorin: a Quantum-Entanglement-Enhanced pH Probe

    Authors: Shu-Ang Li, Xiao-Yan Meng, Su Zhang, Ying-Jie Zhang, Run-Zhou Yang, Dian-Dian Wang, Yang Yang, Pei-Pei Liu, Jian-Sheng Kang

    Abstract: An accurate map of intracellular organelle pH is crucial for comprehending cellular metabolism and organellar functions. However, a unified intracellular pH spectrum using a single probe is still lacking. Here, we developed a novel quantum entanglement-enhanced pH-sensitive probe called SITE-pHorin, which featured a wide pH-sensitive range and ratiometric quantitative measurement capabilities. Subseq…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 64 pages, 7 figures, the supplemental material contains 13 supplemental figures and 4 supplemental tables

  46. arXiv:2407.04197  [pdf]

    physics.ins-det hep-ex nucl-ex

    Compact Ion Beam System for Fusion Demonstration

    Authors: Allan Xi Chen, Nai-Wei Liu, Alexander Gunn, Zhe Su, Benjamin F. Sigal, Matthew Salazar, Nawar Abdalla, James Chen, Alfred Y. Wong, Qiong Wang

    Abstract: We demonstrate a compact ion beam device capable of accelerating H$^+$ and D$^+$ ions up to 75 keV energy, onto a solid target, with sufficient beam current to study fusion reactions. The ion beam system uses a microwave driven plasma source to generate ions that are accelerated to high energy with a DC acceleration structure. The plasma source is driven by pulsed microwaves from a solid-state RF…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 18 pages, 13 figures

  47. arXiv:2407.03884  [pdf, other]

    cs.CL cs.AI

    Planning with Large Language Models for Conversational Agents

    Authors: Zhigen Li, Jianxiang Peng, Yanmeng Wang, Tianhao Shen, Minghui Zhang, Linxi Su, Shang Wu, Yihang Wu, Yuqian Wang, Ye Wang, Wei Hu, Jianfeng Li, Shaojun Wang, Jing Xiao, Deyi Xiong

    Abstract: Controllability and proactivity are crucial properties of autonomous conversational agents (CAs). Controllability requires the CAs to follow the standard operating procedures (SOPs), such as verifying identity before activating credit cards. Proactivity requires the CAs to guide the conversation towards the goal during user uncooperation, such as persuasive dialogue. Existing research cannot be un…

    Submitted 4 July, 2024; originally announced July 2024.

  48. arXiv:2407.03804  [pdf, other]

    cs.LG cs.NI

    Multi-Time Scale Service Caching and Pricing in MEC Systems with Dynamic Program Popularity

    Authors: Yiming Chen, Xingyuan Hu, Bo Gu, Shimin Gong, Zhou Su

    Abstract: In mobile edge computing systems, base stations (BSs) equipped with edge servers can provide computing services to users to reduce their task execution time. However, there is always a conflict of interest between the BS and users. The BS prices the service programs based on user demand to maximize its own profit, while the users determine their offloading strategies based on the prices to minimiz…

    Submitted 4 July, 2024; originally announced July 2024.

  49. arXiv:2407.03724  [pdf, other]

    cs.RO

    Flight Structure Optimization of Modular Reconfigurable UAVs

    Authors: Yao Su, Ziyuan Jiao, Zeyu Zhang, Jingwen Zhang, Hang Li, Meng Wang, Hangxin Liu

    Abstract: This paper presents a Genetic Algorithm (GA) designed to reconfigure a large group of modular Unmanned Aerial Vehicles (UAVs), each with different weights and inertia parameters, into an over-actuated flight structure with improved dynamic properties. Previous research efforts either utilized expert knowledge to design flight structures for a specific task or relied on enumeration-based algorithms…

    Submitted 4 July, 2024; originally announced July 2024.

  50. arXiv:2407.03695  [pdf, other]

    cs.CV

    M^3:Manipulation Mask Manufacturer for Arbitrary-Scale Super-Resolution Mask

    Authors: Xinyu Yang, Xiaochen Ma, Xuekang Zhu, Bo Du, Lei Su, Bingkui Tong, Zeyu Lei, Jizhe Zhou

    Abstract: In the field of image manipulation localization (IML), the small quantity and poor quality of existing datasets have always been major issues. A dataset containing various types of manipulations will greatly help improve the accuracy of IML models. Images on the internet (such as those on Baidu Tieba's PS Bar) are manipulated using various techniques, and creating a dataset from these images will…

    Submitted 4 July, 2024; originally announced July 2024.