Showing 1–50 of 292 results for author: Fu, Z

  1. arXiv:2407.01031  [pdf, other]

    cs.LG cs.CL

    PocketLLM: Enabling On-Device Fine-Tuning for Personalized LLMs

    Authors: Dan Peng, Zhihui Fu, Jun Wang

    Abstract: Recent advancements in large language models (LLMs) have indeed showcased their impressive capabilities. On mobile devices, the wealth of valuable, non-public data generated daily holds great promise for locally fine-tuning personalized LLMs, while maintaining privacy through on-device processing. However, the constraints of mobile device resources pose challenges to direct on-device LLM fine-tuni… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted to the ACL 2024 Workshop on Privacy in Natural Language Processing (PrivateNLP)

  2. arXiv:2406.18099  [pdf, other]

    cs.DB

    CompassDB: Pioneering High-Performance Key-Value Store with Perfect Hash

    Authors: ** Jiang, Dongsheng He, Yu Hu, Dong Liu, Chenfan Xiao, Hongxiao Bi, Yusong Zhang, Chaoqu Jiang, Zhijun Fu

    Abstract: Modern mainstream persistent key-value storage engines utilize Log-Structured Merge tree (LSM-tree) based designs, optimizing read/write performance by leveraging sequential disk I/O. However, the advent of SSDs, with their significant improvements in bandwidth and IOPS, shifts the bottleneck from I/O to CPU. The high compaction cost and large read/write amplification associated with LSM trees hav… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.13292  [pdf, other]

    q-bio.QM cs.AI eess.IV

    An interpretable generative multimodal neuroimaging-genomics framework for decoding Alzheimer's disease

    Authors: Giorgio Dolci, Federica Cruciani, Md Abdur Rahaman, Anees Abrol, Jiayu Chen, Zening Fu, Ilaria Boscolo Galazzo, Gloria Menegaz, Vince D. Calhoun

    Abstract: Alzheimer's disease (AD) is the most prevalent form of dementia with a progressive decline in cognitive abilities. The AD continuum encompasses a prodromal stage known as Mild Cognitive Impairment (MCI), where patients may either progress to AD or remain stable. In this study, we leveraged structural and functional MRI to investigate the disease-induced grey matter and functional network connectiv…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 27 pages, 7 figures, submitted to a journal

  4. arXiv:2406.12529  [pdf, other]

    cs.IR cs.AI

    LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation

    Authors: Yuhao Wang, Yichao Wang, Zichuan Fu, Xiangyang Li, Xiangyu Zhao, Huifeng Guo, Ruiming Tang

    Abstract: As the demand for more personalized recommendation grows and a dramatic boom in commercial scenarios arises, the study on multi-scenario recommendation (MSR) has attracted much attention, which uses the data from all scenarios to simultaneously improve their recommendation performance. However, existing methods tend to integrate insufficient scenario knowledge and neglect learning personalized cro… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  5. arXiv:2406.10454  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    HumanPlus: Humanoid Shadowing and Imitation from Humans

    Authors: Zipeng Fu, Qingqing Zhao, Qi Wu, Gordon Wetzstein, Chelsea Finn

    Abstract: One of the key arguments for building robots that have similar form factors to human beings is that we can leverage the massive human data for training. Yet, doing so has remained challenging in practice due to the complexities in humanoid perception and control, lingering physical gaps between humanoids and humans in morphologies and actuation, and lack of a data pipeline for humanoids to learn a… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: project website: https://humanoid-ai.github.io/

  6. arXiv:2406.09781  [pdf, other]

    cs.CV

    GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding

    Authors: Yiqi Wu, Xiaodan Hu, Ziming Fu, Siling Zhou, Jiangong Li

    Abstract: Animal ethology is a crucial aspect of animal research, and animal behavior labeling is the foundation for studying animal behavior. This process typically involves labeling video clips with behavioral semantic tags, a task that is complex, subjective, and multimodal. With the rapid development of multimodal large language models (LLMs), new applications have emerged for animal behavior understandi…

    Submitted 14 June, 2024; originally announced June 2024.

  7. arXiv:2406.09315  [pdf, other]

    cs.AI cs.CV cs.LG

    Vertical LoRA: Dense Expectation-Maximization Interpretation of Transformers

    Authors: Zhuolin Fu

    Abstract: In this paper, we show how Transformers can be interpreted as dense Expectation-Maximization algorithms performed on Bayesian Nets. Based on the above interpretation, we propose a new model design paradigm, namely Vertical LoRA (VLoRA), which reduces the parameter count dramatically while preserving performance. In VLoRA, a model consists of layers, each of which recursively learns an increment ba… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  8. arXiv:2406.06580  [pdf, other]

    cs.CL cs.AI

    Break the Chain: Large Language Models Can be Shortcut Reasoners

    Authors: Mengru Ding, Hanmeng Liu, Zhizhang Fu, Jian Song, Wenbo Xie, Yue Zhang

    Abstract: Recent advancements in Chain-of-Thought (CoT) reasoning utilize complex modules but are hampered by high token consumption, limited applicability, and challenges in reproducibility. This paper conducts a critical evaluation of CoT prompting, extending beyond arithmetic to include complex logical and commonsense reasoning tasks, areas where standard CoT methods fall short. We propose the integratio… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  9. arXiv:2406.04378  [pdf, other]

    astro-ph.IM cs.LG hep-ex

    TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising

    Authors: J. T. Fry, Aobo Li, Lindley Winslow, Xinyi Hope Fu, Zhenghao Fu, Kaliroe M. W. Pappas

    Abstract: Dark matter makes up approximately 85% of total matter in our universe, yet it has never been directly observed in any laboratory on Earth. The origin of dark matter is one of the most important questions in contemporary physics, and a convincing detection of dark matter would be a Nobel-Prize-level breakthrough in fundamental science. The ABRACADABRA experiment was specifically designed to search… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  10. arXiv:2406.00380  [pdf, other]

    cs.CL cs.AI

    The Best of Both Worlds: Toward an Honest and Helpful Large Language Model

    Authors: Chujie Gao, Qihui Zhang, Dongping Chen, Yue Huang, Siyuan Wu, Zhengyan Fu, Yao Wan, Xiangliang Zhang, Lichao Sun

    Abstract: Large Language Models (LLMs) have achieved remarkable success across various industries due to their exceptional generative capabilities. However, for safe and effective real-world deployments, ensuring honesty and helpfulness is critical. This paper addresses the question: Can we prioritize the helpfulness of LLMs while preserving their honesty? To begin with, we establish exhaustive principles a… ▽ More

    Submitted 1 June, 2024; originally announced June 2024.

  11. arXiv:2405.16849  [pdf, other]

    cs.CV

    Sync4D: Video Guided Controllable Dynamics for Physics-Based 4D Generation

    Authors: Zhoujie Fu, Jiacheng Wei, Wenhao Shen, Chaoyue Song, Xiaofeng Yang, Fayao Liu, Xulei Yang, Guosheng Lin

    Abstract: In this work, we introduce a novel approach for creating controllable dynamics in 3D-generated Gaussians using casually captured reference videos. Our method transfers the motion of objects from reference videos to a variety of generated 3D Gaussians across different categories, ensuring precise and customizable motion transfer. We achieve this by employing blend skinning-based non-parametric shap… ▽ More

    Submitted 6 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Our project page: https://sync4dphys.github.io/

  12. arXiv:2405.11526  [pdf, other]

    cs.CV

    Register assisted aggregation for Visual Place Recognition

    Authors: Xuan Yu, Zhenyong Fu

    Abstract: Visual Place Recognition (VPR) refers to the process of using computer vision to recognize the position of the current query image. Due to the significant changes in appearance caused by season, lighting, and time spans between query images and database images for retrieval, these differences increase the difficulty of place recognition. Previous methods often discarded useless features (such as s… ▽ More

    Submitted 19 May, 2024; originally announced May 2024.

  13. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference… ▽ More

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  14. arXiv:2405.01119  [pdf]

    cs.SI eess.SY

    Towards Understanding Worldwide Cross-cultural Differences in Implicit Driving Cues: Review, Comparative Analysis, and Research Roadmap

    Authors: Yongqi Dong, Chang Liu, Yiyun Wang, Zhe Fu

    Abstract: Recognizing and understanding implicit driving cues across diverse cultures is imperative for fostering safe and efficient global transportation systems, particularly when training new immigrants holding driving licenses from culturally disparate countries. Additionally, it is essential to consider cross-cultural differences in the development of Automated Driving features tailored to different co… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 7 pages, 1 figure, under review by the 27th IEEE International Conference on Intelligent Transportation Systems (IEEE ITSC 2024)

  15. arXiv:2405.01065  [pdf, other]

    cs.CV

    MFDS-Net: Multi-Scale Feature Depth-Supervised Network for Remote Sensing Change Detection with Global Semantic and Detail Information

    Authors: Zhenyang Huang, Zhaojin Fu, Song Jintao, Genji Yuan, Jinjiang Li

    Abstract: Change detection as an interdisciplinary discipline in the field of computer vision and remote sensing at present has been receiving extensive attention and research. Due to the rapid development of society, the geographic information captured by remote sensing satellites is changing faster and more complex, which undoubtedly poses a higher challenge and highlights the value of change detection ta… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

  16. arXiv:2405.00472  [pdf, other]

    eess.IV cs.CV

    DmADs-Net: Dense multiscale attention and depth-supervised network for medical image segmentation

    Authors: Zhaojin Fu, Zheng Chen, Jinjiang Li, Lu Ren

    Abstract: Deep learning has made important contributions to the development of medical image segmentation. Convolutional neural networks, as a crucial branch, have attracted strong attention from researchers. Through the tireless efforts of numerous researchers, convolutional neural networks have yielded numerous outstanding algorithms for processing medical images. The ideas and architectures of these algo… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

  17. arXiv:2404.18231  [pdf, other]

    cs.CL cs.AI

    From Persona to Personalization: A Survey on Role-Playing Language Agents

    Authors: Jiangjie Chen, Xintao Wang, Rui Xu, Siyu Yuan, Yikai Zhang, Wei Shi, Jian Xie, Shuang Li, Ruihan Yang, Tinghui Zhu, Aili Chen, Nianqi Li, Lida Chen, Caiyu Hu, Siye Wu, Scott Ren, Ziquan Fu, Yanghua Xiao

    Abstract: Recent advancements in large language models (LLMs) have significantly boosted the rise of Role-Playing Language Agents (RPLAs), i.e., specialized AI systems designed to simulate assigned personas. By harnessing multiple advanced abilities of LLMs, including in-context learning, instruction following, and social intelligence, RPLAs achieve a remarkable sense of human likeness and vivid role-playin… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Preprint

  18. arXiv:2404.18047  [pdf, other]

    cs.RO

    LIKO: LiDAR, Inertial, and Kinematic Odometry for Bipedal Robots

    Authors: Qingrui Zhao, Mingyuan Li, Yongliang Shi, Xuechao Chen, Zhangguo Yu, Lianqiang Han, Zhenyuan Fu, Jintao Zhang, Chao Li, Yuanxi Zhang, Qiang Huang

    Abstract: High-frequency and accurate state estimation is crucial for biped robots. This paper presents a tightly-coupled LiDAR-Inertial-Kinematic Odometry (LIKO) for biped robot state estimation based on an iterated extended Kalman filter. Beyond state estimation, the foot contact position is also modeled and estimated. This allows for both position and velocity updates from kinematic measurement. Addition… ▽ More

    Submitted 27 April, 2024; originally announced April 2024.

  19. arXiv:2404.04102  [pdf, other]

    cs.LG cs.AI cs.CL

    ROPO: Robust Preference Optimization for Large Language Models

    Authors: Xize Liang, Chao Chen, Shuang Qiu, Jie Wang, Yue Wu, Zhihang Fu, Zhihao Shi, Feng Wu, Jieping Ye

    Abstract: Preference alignment is pivotal for empowering large language models (LLMs) to generate helpful and harmless responses. However, the performance of preference alignment is highly sensitive to the prevalent noise in the preference data. Recent efforts for this problem either marginally alleviate the impact of noise without the ability to actually reduce its presence, or rely on costly teacher LLMs… ▽ More

    Submitted 28 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  20. arXiv:2404.00144  [pdf, other]

    eess.IV cs.CV

    An Interpretable Cross-Attentive Multi-modal MRI Fusion Framework for Schizophrenia Diagnosis

    Authors: Ziyu Zhou, Anton Orlichenko, Gang Qu, Zening Fu, Vince D Calhoun, Zhengming Ding, Yu-Ping Wang

    Abstract: Both functional and structural magnetic resonance imaging (fMRI and sMRI) are widely used for the diagnosis of mental disorder. However, combining complementary information from these two modalities is challenging due to their heterogeneity. Many existing methods fall short of capturing the interaction between these modalities, frequently defaulting to a simple combination of latent features. In t… ▽ More

    Submitted 29 March, 2024; originally announced April 2024.

  21. arXiv:2403.17994  [pdf, other]

    cs.CV cs.LG

    Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023

    Authors: Hongpeng Pan, Yang Yang, Zhongtian Fu, Yuxuan Zhang, Shian Du, Yi Xu, Xiangyang Ji

    Abstract: This report proposes an improved method for the Tracking Any Point (TAP) task, which tracks any physical surface through a video. Several existing approaches have explored the TAP by considering the temporal relationships to obtain smooth point motion trajectories, however, they still suffer from the cumulative error caused by temporal prediction. To address this issue, we propose a simple yet eff… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  22. arXiv:2403.10790  [pdf, other]

    quant-ph cs.CR cs.LG

    QuantumLeak: Stealing Quantum Neural Networks from Cloud-based NISQ Machines

    Authors: Zhenxiao Fu, Min Yang, Cheng Chu, Yilun Xu, Gang Huang, Fan Chen

    Abstract: Variational quantum circuits (VQCs) have become a powerful tool for implementing Quantum Neural Networks (QNNs), addressing a wide range of complex problems. Well-trained VQCs serve as valuable intellectual assets hosted on cloud-based Noisy Intermediate Scale Quantum (NISQ) computers, making them susceptible to malicious VQC stealing attacks. However, traditional model extraction techniques desig… ▽ More

    Submitted 15 March, 2024; originally announced March 2024.

    Journal ref: published in IJCNN 2024

  23. arXiv:2403.09140  [pdf, other]

    cs.CV

    Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

    Authors: Cheng Chen, Xiaofeng Yang, Fan Yang, Chengzeng Feng, Zhoujie Fu, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

    Abstract: Recent works on text-to-3d generation show that using only 2D diffusion supervision for 3D generation tends to produce results with inconsistent appearances (e.g., faces on the back view) and inaccurate shapes (e.g., animals with extra legs). Existing methods mainly address this issue by retraining diffusion models with images rendered from 3D data to ensure multi-view consistency while struggling… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024. Project Page: https://stellarcheng.github.io/Sculpt3D/

  24. arXiv:2403.06138  [pdf, other]

    cs.CV

    BSDA: Bayesian Random Semantic Data Augmentation for Medical Image Classification

    Authors: Yaoyao Zhu, Xiuding Cai, Xueyao Wang, Xiaoqing Chen, Yu Yao, Zhongliang Fu

    Abstract: Data augmentation is a crucial regularization technique for deep neural networks, particularly in medical image classification. Mainstream data augmentation (DA) methods are usually applied at the image level. Due to the specificity and diversity of medical imaging, expertise is often required to design effective DA strategies, and improper augmentation operations can degrade model performance. Al… ▽ More

    Submitted 27 June, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

  25. arXiv:2402.11021  [pdf, other]

    quant-ph cs.ET

    TITAN: A Distributed Large-Scale Trapped-Ion NISQ Computer

    Authors: Cheng Chu, Zhenxiao Fu, Yilun Xu, Gang Huang, Hausi Muller, Fan Chen, Lei Jiang

    Abstract: Trapped-Ion (TI) technology offers potential breakthroughs for Noisy Intermediate Scale Quantum (NISQ) computing. TI qubits offer extended coherence times and high gate fidelity, making them appealing for large-scale NISQ computers. Constructing such computers demands a distributed architecture connecting Quantum Charge Coupled Devices (QCCDs) via quantum matter-links and photonic switches. Howeve… ▽ More

    Submitted 16 February, 2024; originally announced February 2024.

  26. arXiv:2402.10062  [pdf, other]

    cs.LG stat.ML

    Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection

    Authors: Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye

    Abstract: For a machine learning model deployed in real world scenarios, the ability of detecting out-of-distribution (OOD) samples is indispensable and challenging. Most existing OOD detection methods focused on exploring advanced training skills or training-free tricks to prevent the model from yielding overconfident confidence score for unknown samples. The training-based methods require expensive traini… ▽ More

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by NeurIPS 2023. 19 pages

    Journal ref: NeurIPS 2023

  27. arXiv:2402.07858  [pdf, other]

    cs.LG

    Multiscale Neuroimaging Features for the Identification of Medication Class and Non-Responders in Mood Disorder Treatment

    Authors: Bradley T. Baker, Mustafa S. Salman, Zening Fu, Armin Iraji, Elizabeth Osuch, Jeremy Bockholt, Vince D. Calhoun

    Abstract: In the clinical treatment of mood disorders, the complex behavioral symptoms presented by patients and variability of patient response to particular medication classes can create difficulties in providing fast and reliable treatment when standard diagnostic and prescription methods are used. Increasingly, the incorporation of physiological information such as neuroimaging scans and derivatives int… ▽ More

    Submitted 12 February, 2024; originally announced February 2024.

  28. arXiv:2402.03744  [pdf, other]

    cs.CL

    INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection

    Authors: Chao Chen, Kai Liu, Ze Chen, Yi Gu, Yue Wu, Mingyuan Tao, Zhihang Fu, Jieping Ye

    Abstract: Knowledge hallucinations have raised widespread concerns for the security and reliability of deployed LLMs. Previous efforts in detecting hallucinations have been employed at logit-level uncertainty estimation or language-level self-consistency evaluation, where the semantic information is inevitably lost during the token-decoding procedure. Thus, we propose to explore the dense semantic informatio…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted by ICLR-2024

  29. arXiv:2401.15691  [pdf, other]

    cs.LG

    One for all: A novel Dual-space Co-training baseline for Large-scale Multi-View Clustering

    Authors: Zisen Kong, Zhiqiang Fu, Dongxia Chang, Yiming Wang, Yao Zhao

    Abstract: In this paper, we propose a novel multi-view clustering model, named Dual-space Co-training Large-scale Multi-view Clustering (DSCMC). The main objective of our approach is to enhance the clustering performance by leveraging co-training in two distinct spaces. In the original space, we learn a projection matrix to obtain latent consistent anchor graphs from different views. This process involves c… ▽ More

    Submitted 28 January, 2024; originally announced January 2024.

  30. arXiv:2401.05952  [pdf, other]

    cs.CL

    LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?

    Authors: Qihui Zhang, Chujie Gao, Dongping Chen, Yue Huang, Yixin Huang, Zhenyang Sun, Shilin Zhang, Weiye Li, Zhengyan Fu, Yao Wan, Lichao Sun

    Abstract: With the rapid development and widespread application of Large Language Models (LLMs), the use of Machine-Generated Text (MGT) has become increasingly common, bringing with it potential risks, especially in terms of quality and integrity in fields like news, education, and science. Current research mainly focuses on purely MGT detection without adequately addressing mixed scenarios, including AI-r… ▽ More

    Submitted 30 March, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by NAACL 2024

  31. arXiv:2401.02954  [pdf, other]

    cs.CL cs.AI cs.LG

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    Authors: DeepSeek-AI, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, Jianzhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, et al. (63 additional authors not shown)

    Abstract: The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

  32. arXiv:2401.02117  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

    Authors: Zipeng Fu, Tony Z. Zhao, Chelsea Finn

    Abstract: Imitation learning from human demonstrations has shown impressive performance in robotics. However, most results focus on table-top manipulation, lacking the mobility and dexterity necessary for generally useful tasks. In this work, we develop a system for imitating mobile manipulation tasks that are bimanual and require whole-body control. We first present Mobile ALOHA, a low-cost and whole-body… ▽ More

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: Project website: https://mobile-aloha.github.io (Zipeng Fu and Tony Z. Zhao are project co-leads, Chelsea Finn is the advisor)

  33. arXiv:2312.14372  [pdf, other]

    physics.data-an cs.LG

    Generative Models for Simulation of KamLAND-Zen

    Authors: Z. Fu, C. Grant, D. M. Krawiec, A. Li, L. Winslow

    Abstract: The next generation of searches for neutrinoless double beta decay (0νββ) are poised to answer deep questions on the nature of neutrinos and the source of the Universe's matter-antimatter asymmetry. They will be looking for event rates of less than one event per ton of instrumented isotope per year. To claim discovery, accurate and efficient simulations of detector events that mimic 0ν…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Submitted to EPJC

  34. arXiv:2312.10743  [pdf, other]

    cs.IR

    A Unified Framework for Multi-Domain CTR Prediction via Large Language Models

    Authors: Zichuan Fu, Xiangyang Li, Chuhan Wu, Yichao Wang, Kuicai Dong, Xiangyu Zhao, Mengchen Zhao, Huifeng Guo, Ruiming Tang

    Abstract: Click-Through Rate (CTR) prediction is a crucial task in online recommendation platforms as it involves estimating the probability of user engagement with advertisements or items by clicking on them. Given the availability of various services like online shopping, ride-sharing, food delivery, and professional services on commercial platforms, recommendation systems in these platforms are required…

    Submitted 23 February, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: Submitted to TOIS

  35. arXiv:2312.02473  [pdf, other]

    cs.LG cs.DC

    NeutronStream: A Dynamic GNN Training Framework with Sliding Window for Graph Streams

    Authors: Chaoyi Chen, Dechao Gao, Yanfeng Zhang, Qiange Wang, Zhenbo Fu, Xuecang Zhang, Junhua Zhu, Yu Gu, Ge Yu

    Abstract: Existing Graph Neural Network (GNN) training frameworks have been designed to help developers easily create performant GNN implementations. However, most existing GNN frameworks assume that the input graphs are static, but ignore that most real-world graphs are constantly evolving. Though many dynamic GNN models have emerged to learn from evolving graphs, the training process of these dynamic GNNs… ▽ More

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 12 pages, 15 figures

  36. arXiv:2311.09622  [pdf]

    cs.RO

    Homography Initialization and Dynamic Weighting Algorithm Based on a Downward-Looking Camera and IMU

    Authors: Bo Dong, Yongkang Tao, Deng Peng, Zhigang Fu

    Abstract: In recent years, the technology in visual-inertial odometry (VIO) has matured considerably and has been widely used in many applications. However, we still encounter challenges when applying VIO to a micro air vehicle (MAV) equipped with a downward-looking camera. Specifically, VIO cannot compute the correct initialization results during take-off and the cumulative drift is large when the MAV is f… ▽ More

    Submitted 16 November, 2023; originally announced November 2023.

  37. arXiv:2311.04256  [pdf, ps, other]

    cs.AI cs.IT cs.LG

    Foundational theories of hesitant fuzzy sets and hesitant fuzzy information systems and their applications for multi-strength intelligent classifiers

    Authors: Shizhan Lu, Zeshui Xu, Zhu Fu

    Abstract: Hesitant fuzzy sets are widely used in certain instances of uncertainty and hesitation. In sets, the inclusion relationship is an important and foundational definition. Thus, as a kind of set, hesitant fuzzy sets require an explicit definition of inclusion relationship. Based on the hesitant fuzzy membership degree of discrete form, several kinds of inclusion relationships for hesitant fuzzy sets… ▽ More

    Submitted 17 February, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: 25 pages

  38. arXiv:2311.01059  [pdf, other]

    cs.RO cs.LG

    Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment

    Authors: Annie S. Chen, Govind Chada, Laura Smith, Archit Sharma, Zipeng Fu, Sergey Levine, Chelsea Finn

    Abstract: To succeed in the real world, robots must cope with situations that differ from those seen during training. We study the problem of adapting on-the-fly to such novel scenarios during deployment, by drawing upon a diverse repertoire of previously learned behaviors. Our approach, RObust Autonomous Modulation (ROAM), introduces a mechanism based on the perceived value of pre-trained behaviors to sele… ▽ More

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 19 pages, 6 figures

  39. arXiv:2310.18920  [pdf, other]

    cs.CV

    Improving Multi-Person Pose Tracking with A Confidence Network

    Authors: Zehua Fu, Wenhang Zuo, Zhenghui Hu, Qingjie Liu, Yunhong Wang

    Abstract: Human pose estimation and tracking are fundamental tasks for understanding human behaviors in videos. Existing top-down framework-based methods usually perform three-stage tasks: human detection, pose estimation and tracking. Although promising results have been achieved, these methods rely heavily on high-performance detectors and may fail to track persons who are occluded or miss-detected. To ov… ▽ More

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE Transactions on Multimedia. 11 pages, 5 figures

  40. arXiv:2310.10226  [pdf, other]

    cs.CL

    Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective

    Authors: Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su

    Abstract: There are a number of diverging hypotheses about the neural text degeneration problem, i.e., generating repetitive and dull loops, which makes this problem both interesting and confusing. In this work, we aim to advance our understanding by presenting a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the dege… ▽ More

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  41. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method… ▽ More

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  42. arXiv:2310.05025  [pdf, other]

    cs.CL

    Synslator: An Interactive Machine Translation Tool with Online Learning

    Authors: Jiayi Wang, Ke Wang, Fengming Zhou, Chengyu Wang, Zhiyong Fu, Zeyu Feng, Yu Zhao, Yuqi Zhang

    Abstract: Interactive machine translation (IMT) has emerged as a progression of the computer-aided translation paradigm, where the machine translation system and the human translator collaborate to produce high-quality translations. This paper introduces Synslator, a user-friendly computer-aided translation (CAT) tool that not only supports IMT, but is adept at online learning with real-time translation mem…

    Submitted 8 October, 2023; originally announced October 2023.

  43. arXiv:2309.15635  [pdf, other]

    cs.CV

    Position and Orientation-Aware One-Shot Learning for Medical Action Recognition from Signal Data

    Authors: Leiyu Xie, Yuxing Yang, Zeyu Fu, Syed Mohsen Naqvi

    Abstract: In this work, we propose a position and orientation-aware one-shot learning framework for medical action recognition from signal data. The proposed framework comprises two stages and each stage includes signal-level image generation (SIG), cross-attention (CsA), dynamic time warping (DTW) modules and the information fusion between the proposed privacy-preserved position and orientation features. T…

    Submitted 27 September, 2023; originally announced September 2023.

  44. arXiv:2309.09949  [pdf, other]

    cs.AI cs.CL

    How to Generate Popular Post Headlines on Social Media?

    Authors: Zhouxiang Fang, Min Yu, Zhendong Fu, Boning Zhang, Xuanwen Huang, Xiaoqi Tang, Yang Yang

    Abstract: Posts, as important containers of user-generated-content pieces on social media, are of tremendous social influence and commercial value. As an integral components of a post, the headline has a decisive contribution to the post's popularity. However, current mainstream method for headline generation is still manually writing, which is unstable and requires extensive human effort. This drives us to…

    Submitted 18 September, 2023; originally announced September 2023.

  45. arXiv:2309.07921  [pdf, other]

    cs.CV

    OpenIllumination: A Multi-Illumination Dataset for Inverse Rendering Evaluation on Real Objects

    Authors: Isabella Liu, Linghao Chen, Ziyang Fu, Liwen Wu, Haian Jin, Zhong Li, Chin Ming Ryan Wong, Yi Xu, Ravi Ramamoorthi, Zexiang Xu, Hao Su

    Abstract: We introduce OpenIllumination, a real-world dataset containing over 108K images of 64 objects with diverse materials, captured under 72 camera views and a large number of different illuminations. For each image in the dataset, we provide accurate camera parameters, illumination ground truth, and foreground segmentation masks. Our dataset enables the quantitative evaluation of most inverse renderin…

    Submitted 1 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  46. arXiv:2309.05665  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Robot Parkour Learning

    Authors: Ziwen Zhuang, Zipeng Fu, Jianren Wang, Christopher Atkeson, Soeren Schwertfeger, Chelsea Finn, Hang Zhao

    Abstract: Parkour is a grand challenge for legged locomotion that requires robots to overcome various obstacles rapidly in complex environments. Existing methods can generate either diverse but blind locomotion skills or vision-based but specialized skills by using reference animal data or complex rewards. However, autonomous parkour requires robots to learn generalizable skills that are both vision-based a…

    Submitted 11 September, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: CoRL 2023 (Oral). Project website at https://robot-parkour.github.io

  47. arXiv:2309.05028  [pdf, other]

    cs.CV

    SC-NeRF: Self-Correcting Neural Radiance Field with Sparse Views

    Authors: Liang Song, Guangming Wang, Jiuming Liu, Zhenyang Fu, Yanzi Miao, Hesheng

    Abstract: In recent studies, the generalization of neural radiance fields for novel view synthesis task has been widely explored. However, existing methods are limited to objects and indoor scenes. In this work, we extend the generalization task to outdoor scenes, trained only on object-level datasets. This approach presents two challenges. Firstly, the significant distributional shift between training and…

    Submitted 10 September, 2023; originally announced September 2023.

  48. arXiv:2309.01112  [pdf]

    cs.RO eess.SY

    Swing Leg Motion Strategy for Heavy-load Legged Robot Based on Force Sensing

    Authors: Ze Fu, Yinghui Li, Weizhong Guo

    Abstract: The heavy-load legged robot has strong load carrying capacity and can adapt to various unstructured terrains. But the large weight results in higher requirements for motion stability and environmental perception ability. In order to utilize force sensing information to improve its motion performance, in this paper, we propose a finite state machine model for the swing leg in the static gait by imi…

    Submitted 3 September, 2023; originally announced September 2023.

  49. arXiv:2308.12797  [pdf, other]

    cs.RO cs.MA eess.SY

    TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search

    Authors: Licheng Wen, Ze Fu, Pinlong Cai, Daocheng Fu, Song Mao, Botian Shi

    Abstract: Digital twins for intelligent transportation systems are currently attracting great interests, in which generating realistic, diverse, and human-like traffic flow in simulations is a formidable challenge. Current approaches often hinge on predefined driver models, objective optimization, or reliance on pre-recorded driving datasets, imposing limitations on their scalability, versatility, and adapt…

    Submitted 31 August, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

  50. arXiv:2308.05401  [pdf, other]

    cs.RO

    Fast calibration for ultrasound imaging guidance based on depth camera

    Authors: Fuqiang Zhao, Mingchang Li, Mengde Li, Zhongtao Fu, Miao Li

    Abstract: During the process of robot-assisted ultrasound(US) puncture, it is important to estimate the location of the puncture from the 2D US images. To this end, the calibration of the US image becomes an important issue. In this paper, we proposed a depth camera-based US calibration method, where an easy-to-deploy device is designed for the calibration. With this device, the coordinates of the puncture…

    Submitted 10 August, 2023; originally announced August 2023.