Showing 1–50 of 105 results for author: Gu, C

Searching in archive cs.
  1. arXiv:2407.00987

    cs.NI eess.SY

    Exploiting Dependency-Aware Priority Adjustment for Mixed-Criticality TSN Flow Scheduling

    Authors: Miao Guo, Yifei Sun, Chaojie Gu, Shibo He, Zhiguo Shi

    Abstract: Time-Sensitive Networking (TSN) serves as a one-size-fits-all solution for mixed-criticality communication, in which flow scheduling is vital to guarantee real-time transmissions. Traditional approaches statically assign priorities to flows based on their associated applications, resulting in significant queuing delays. In this paper, we observe that assigning different priorities to a flow leads…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by IWQoS'24

  2. arXiv:2406.06579

    cs.CL cs.AI cs.CV

    From Redundancy to Relevance: Enhancing Explainability in Multimodal Large Language Models

    Authors: Xiaofeng Zhang, Chen Shen, Xiaosong Yuan, Shaotian Yan, Liang Xie, Wenxiao Wang, Chaochen Gu, Hao Tang, Jieping Ye

    Abstract: Recently, multimodal large language models have exploded with an endless variety, most of the popular Large Vision Language Models (LVLMs) depend on sequential visual representation, where images are converted into hundreds or thousands of tokens before being input into the Large Language Model (LLM) along with language prompts. The black-box design hinders the interpretability of visual-language…

    Submitted 13 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  3. arXiv:2406.06258

    cs.CV

    Tuning-Free Visual Customization via View Iterative Self-Attention Control

    Authors: Xiaojie Li, Chenghao Gu, Shuzhao Xie, Yunpeng Bai, Weixiang Zhang, Zhi Wang

    Abstract: Fine-Tuning Diffusion Models enable a wide range of personalized generation and editing applications on diverse visual modalities. While Low-Rank Adaptation (LoRA) accelerates the fine-tuning process, it still requires multiple reference images and time-consuming training, which constrains its scalability for large-scale and real-time applications. In this paper, we propose \textit{View Iterative…

    Submitted 10 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Under review

  4. arXiv:2406.01579

    cs.CV

    Tetrahedron Splatting for 3D Generation

    Authors: Chun Gu, Zeyu Yang, Zijie Pan, Xiatian Zhu, Li Zhang

    Abstract: 3D representation is essential to the significant advance of 3D generation with 2D diffusion priors. As a flexible representation, NeRF has been first adopted for 3D representation. With density-based volumetric rendering, it however suffers both intensive computational overhead and inaccurate mesh extraction. Using a signed distance field and Marching Tetrahedra, DMTet allows for precise mesh ext…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/fudan-zvg/tet-splatting

  5. arXiv:2405.16437

    cs.CV

    Incremental Pseudo-Labeling for Black-Box Unsupervised Domain Adaptation

    Authors: Yawen Zou, Chunzhi Gu, Jun Yu, Shangce Gao, Chao Zhang

    Abstract: Black-Box unsupervised domain adaptation (BBUDA) learns knowledge only with the prediction of target data from the source model without access to the source data and source model, which attempts to alleviate concerns about the privacy and security of data. However, incorrect pseudo-labels are prevalent in the prediction generated by the source model due to the cross-domain discrepancy, which may s…

    Submitted 26 May, 2024; originally announced May 2024.

  6. arXiv:2405.00816

    cs.SI cs.LG

    Sifting out communities in large sparse networks

    Authors: Sharlee Climer, Kenneth Smith Jr, Wei Yang, Lisa de las Fuentes, Victor G. Dávila-Román, C. Charles Gu

    Abstract: Research data sets are growing to unprecedented sizes and network modeling is commonly used to extract complex relationships in diverse domains, such as genetic interactions involved in disease, logistics, and social communities. As the number of nodes increases in a network, an increasing sparsity of edges is a practical limitation due to memory restrictions. Moreover, many of these sparse networ…

    Submitted 1 May, 2024; originally announced May 2024.

  7. arXiv:2404.18359

    cs.CL cs.AI

    FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models

    Authors: Wei Li, Ren Ma, Jiang Wu, Chenya Gu, Jiahui Peng, Jinyang Len, Songyang Zhang, Hang Yan, Dahua Lin, Conghui He

    Abstract: In the burgeoning field of large language models (LLMs), the assessment of fundamental knowledge remains a critical challenge, particularly for models tailored to Chinese language and culture. This paper introduces FoundaBench, a pioneering benchmark designed to rigorously evaluate the fundamental knowledge capabilities of Chinese LLMs. FoundaBench encompasses a diverse array of 3354 multiple-choi…

    Submitted 28 April, 2024; originally announced April 2024.

  8. arXiv:2404.17381

    cs.CV

    Frequency-Guided Multi-Level Human Action Anomaly Detection with Normalizing Flows

    Authors: Shun Maeda, Chunzhi Gu, Jun Yu, Shogo Tokai, Shangce Gao, Chao Zhang

    Abstract: We introduce the task of human action anomaly detection (HAAD), which aims to identify anomalous motions in an unsupervised manner given only the pre-determined normal category of training action samples. Compared to prior human-related anomaly detection tasks which primarily focus on unusual events from videos, HAAD involves the learning of specific action labels to recognize semantically anomalo…

    Submitted 26 April, 2024; originally announced April 2024.

  9. arXiv:2404.10311

    eess.SY cs.AI

    Learning and Optimization for Price-based Demand Response of Electric Vehicle Charging

    Authors: Chengyang Gu, Yuxin Pan, Ruohong Liu, Yize Chen

    Abstract: In the context of charging electric vehicles (EVs), the price-based demand response (PBDR) is becoming increasingly significant for charging load management. Such response usually encourages cost-sensitive customers to adjust their energy demand in response to changes in price for financial incentives. Thus, to model and optimize EV charging, it is important for charging station operator to model…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted by American Control Conference (ACC) 2024

  10. arXiv:2404.02148

    cs.CV

    Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models

    Authors: Zeyu Yang, Zijie Pan, Chun Gu, Li Zhang

    Abstract: Recent advancements in 3D generation are predominantly propelled by improvements in 3D-aware image diffusion models which are pretrained on Internet-scale image data and fine-tuned on massive 3D data, offering the capability of producing highly consistent multi-view images. However, due to the scarcity of synchronized multi-view video data, it is impractical to adapt this paradigm to 4D generation…

    Submitted 22 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Technical Report

  11. arXiv:2404.01958

    cs.LG

    MESEN: Exploit Multimodal Data to Design Unimodal Human Activity Recognition with Few Labels

    Authors: Lilin Xu, Chaojie Gu, Rui Tan, Shibo He, Jiming Chen

    Abstract: Human activity recognition (HAR) will be an essential function of various emerging applications. However, HAR typically encounters challenges related to modality limitations and label scarcity, leading to an application gap between current solutions and real-world requirements. In this work, we propose MESEN, a multimodal-empowered unimodal sensing framework, to utilize unlabeled multimodal data a…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to the 21th ACM Conference on Embedded Networked Sensor Systems (SenSys 2023)

  12. arXiv:2404.01284

    cs.CV

    Large Motion Model for Unified Multi-Modal Motion Generation

    Authors: Mingyuan Zhang, Daisheng Jin, Chenyang Gu, Fangzhou Hong, Zhongang Cai, Jingfang Huang, Chongzhi Zhang, Xinying Guo, Lei Yang, Ying He, Ziwei Liu

    Abstract: Human motion generation, a cornerstone technique in animation and video production, has widespread applications in various tasks like text-to-motion and music-to-dance. Previous works focus on developing specialist models tailored for each task without scalability. In this work, we present Large Motion Model (LMM), a motion-centric, multi-modal framework that unifies mainstream motion generation t…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Homepage: https://mingyuan-zhang.github.io/projects/LMM.html

  13. A Learning-based Incentive Mechanism for Mobile AIGC Service in Decentralized Internet of Vehicles

    Authors: Jiani Fan, Minrui Xu, Ziyao Liu, Huanyi Ye, Chaojie Gu, Dusit Niyato, Kwok-Yan Lam

    Abstract: Artificial Intelligence-Generated Content (AIGC) refers to the paradigm of automated content generation utilizing AI models. Mobile AIGC services in the Internet of Vehicles (IoV) network have numerous advantages over traditional cloud-based AIGC services, including enhanced network efficiency, better reconfigurability, and stronger data security and privacy. Nonetheless, AIGC service provisioning…

    Submitted 9 May, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 2023 IEEE 98th Vehicular Technology Conference (VTC2023-Fall)

  14. arXiv:2403.17297

    cs.CL cs.AI

    InternLM2 Technical Report

    Authors: Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, et al. (75 additional authors not shown)

    Abstract: The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI). However, replicating such advancements in open-source models has been challenging. This paper introduces InternLM2, an open-source LLM that outperforms its predecessors in comprehensive evaluations across 6 dimensions and 30 benchmarks, long-context m…

    Submitted 25 March, 2024; originally announced March 2024.

  15. arXiv:2403.14112

    cs.CL

    Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations

    Authors: Jiaxing Sun, Weiquan Huang, Jiang Wu, Chenya Gu, Wei Li, Songyang Zhang, Hang Yan, Conghui He

    Abstract: We introduce CHARM, the first benchmark for comprehensively and in-depth evaluating the commonsense reasoning ability of large language models (LLMs) in Chinese, which covers both globally known and Chinese-specific commonsense. We evaluated 7 English and 12 Chinese-oriented LLMs on CHARM, employing 5 representative prompt strategies for improving LLMs' reasoning ability, such as Chain-of-Thought.…

    Submitted 19 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Equal contribution: Jiaxing Sun, Weiquan Huang, Jiang Wu; Corresponding author: Conghui He

  16. arXiv:2403.10020

    cs.CL cs.MM

    Lost in Overlap: Exploring Watermark Collision in LLMs

    Authors: Yiyang Luo, Ke Lin, Chao Gu

    Abstract: The proliferation of large language models (LLMs) in generating content raises concerns about text copyright. Watermarking methods, particularly logit-based approaches, embed imperceptible identifiers into text to address these challenges. However, the widespread use of watermarking across diverse LLMs has led to an inevitable issue known as watermark collision during common tasks like question an…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Short Paper, 4 pages

  17. arXiv:2403.05530

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  18. arXiv:2401.16235

    cs.LG stat.AP

    Player Pressure Map -- A Novel Representation of Pressure in Soccer for Evaluating Player Performance in Different Game Contexts

    Authors: Chaoyi Gu, Jiaming Na, Yisheng Pei, Varuna De Silva

    Abstract: In soccer, contextual player performance metrics are invaluable to coaches. For example, the ability to perform under pressure during matches distinguishes the elite from the average. Appropriate pressure metric enables teams to assess players' performance accurately under pressure and design targeted training scenarios to address their weaknesses. The primary objective of this paper is to leverag…

    Submitted 7 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  19. arXiv:2312.11805

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  20. arXiv:2312.10109   

    cs.CV cs.AI

    Enlighten-Your-Voice: When Multimodal Meets Zero-shot Low-light Image Enhancement

    Authors: Xiaofeng Zhang, Zishan Xu, Hao Tang, Chaochen Gu, Wei Chen, Shanying Zhu, ** Guan

    Abstract: Low-light image enhancement is a crucial visual task, and many unsupervised methods tend to overlook the degradation of visible information in low-light scenes, which adversely affects the fusion of complementary information and hinders the generation of satisfactory results. To address this, our study introduces "Enlighten-Your-Voice", a multimodal enhancement framework that innovatively enriches…

    Submitted 1 February, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: It needs revised

  21. arXiv:2312.08343

    eess.IV cs.CV q-bio.QM

    Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework

    Authors: Zhuoyao Xin, Christopher Wu, Dong Liu, Chunming Gu, Jia Guo, Jun Hua

    Abstract: Image segmentation, real-value prediction, and cross-modal translation are critical challenges in medical imaging. In this study, we propose a versatile multi-task neural network framework, based on an enhanced Transformer U-Net architecture, capable of simultaneously, selectively, and adaptively addressing these medical image tasks. Validation is performed on a public repository of human brain MR…

    Submitted 17 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: 4 pages, 3 figures, 2 tables

  22. arXiv:2312.04469

    cs.LG cs.CL cs.CR

    On the Learnability of Watermarks for Language Models

    Authors: Chenchen Gu, Xiang Lisa Li, Percy Liang, Tatsunori Hashimoto

    Abstract: Watermarking of language model outputs enables statistical detection of model-generated text, which can mitigate harms and misuses of language models. Existing watermarking strategies operate by altering the decoder of an existing language model. In this paper, we ask whether language models can directly learn to generate watermarked text, which would have significant implications for the real-wor…

    Submitted 2 May, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at ICLR 2024

  23. arXiv:2311.18561

    cs.CV

    Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering

    Authors: Yurui Chen, Chun Gu, Junzhe Jiang, Xiatian Zhu, Li Zhang

    Abstract: Modeling dynamic, large-scale urban scenes is challenging due to their highly intricate geometric structures and unconstrained dynamics in both space and time. Prior methods often employ high-level architectural priors, separating static and dynamic elements, resulting in suboptimal capture of their synergistic interactions. To address this challenge, we present a unified representation model, cal…

    Submitted 20 March, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Project page: https://fudan-zvg.github.io/PVG/

  24. arXiv:2311.18332

    cs.CV

    Multilevel Saliency-Guided Self-Supervised Learning for Image Anomaly Detection

    Authors: Jianjian Qin, Chunzhi Gu, Jun Yu, Chao Zhang

    Abstract: Anomaly detection (AD) is a fundamental task in computer vision. It aims to identify incorrect image data patterns which deviate from the normal ones. Conventional methods generally address AD by preparing augmented negative samples to enforce self-supervised learning. However, these techniques typically do not consider semantics during augmentation, leading to the generation of unrealistic or inv…

    Submitted 30 November, 2023; originally announced November 2023.

  25. arXiv:2311.16043

    cs.CV cs.GR

    Relightable 3D Gaussian: Real-time Point Cloud Relighting with BRDF Decomposition and Ray Tracing

    Authors: Jian Gao, Chun Gu, Youtian Lin, Hao Zhu, Xun Cao, Li Zhang, Yao Yao

    Abstract: We present a novel differentiable point-based rendering framework for material and lighting decomposition from multi-view images, enabling editing, ray-tracing, and real-time relighting of the 3D point cloud. Specifically, a 3D scene is represented as a set of relightable 3D Gaussian points, where each point is additionally associated with a normal direction, BRDF parameters, and incident lights f…

    Submitted 27 November, 2023; originally announced November 2023.

  26. arXiv:2311.04095

    cs.CV

    Image-Pointcloud Fusion based Anomaly Detection using PD-REAL Dataset

    Authors: Jianjian Qin, Chunzhi Gu, Jun Yu, Chao Zhang

    Abstract: We present PD-REAL, a novel large-scale dataset for unsupervised anomaly detection (AD) in the 3D domain. It is motivated by the fact that 2D-only representations in the AD task may fail to capture the geometric structures of anomalies due to uncertainty in lighting conditions or shooting angles. PD-REAL consists entirely of Play-Doh models for 15 object categories and focuses on the analysis of p…

    Submitted 7 November, 2023; originally announced November 2023.

  27. arXiv:2310.14907

    cs.CV cs.GR

    Orientation-Aware Leg Movement Learning for Action-Driven Human Motion Prediction

    Authors: Chunzhi Gu, Chao Zhang, Shigeru Kuriyama

    Abstract: The task of action-driven human motion prediction aims to forecast future human motion based on the observed sequence while respecting the given action label. It requires modeling not only the stochasticity within human motion but the smooth yet realistic transition between multiple action labels. However, the fact that most datasets do not contain such transition data complicates this task. Exist…

    Submitted 5 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  28. arXiv:2310.01876

    cs.CV eess.IV

    A Dual Attentive Generative Adversarial Network for Remote Sensing Image Change Detection

    Authors: Luyi Qiu, Xiaofeng Zhang, ChaoChen Gu, ShanYing Zhu

    Abstract: Remote sensing change detection between bi-temporal images receives growing concentration from researchers. However, comparing two bi-temporal images for detecting changes is challenging, as they demonstrate different appearances. In this paper, we propose a dual attentive generative adversarial network for achieving very high-resolution remote sensing image change detection tasks, which regards t…

    Submitted 3 October, 2023; originally announced October 2023.

  29. arXiv:2309.07791

    cs.NE

    A Multi-In and Multi-Out Dendritic Neuron Model and its Optimization

    Authors: Yu Ding, Jun Yu, Chunzhi Gu, Shangce Gao, Chao Zhang

    Abstract: Artificial neural networks (ANNs), inspired by the interconnection of real neurons, have achieved unprecedented success in various fields such as computer vision and natural language processing. Recently, a novel mathematical ANN model, known as the dendritic neuron model (DNM), has been proposed to address nonlinear problems by more accurately reflecting the structure of real neurons. However, th…

    Submitted 14 September, 2023; originally announced September 2023.

  30. arXiv:2309.01941

    q-bio.NC cs.AI cs.LG

    Dynamic Brain Transformer with Multi-level Attention for Functional Brain Network Analysis

    Authors: Xuan Kan, Antonio Aodong Chen Gu, Hejie Cui, Ying Guo, Carl Yang

    Abstract: Recent neuroimaging studies have highlighted the importance of network-centric brain analysis, particularly with functional magnetic resonance imaging. The emergence of Deep Neural Networks has fostered a substantial interest in predicting clinical outcomes and categorizing individuals based on brain networks. However, the conventional approach involving static brain network analysis offers limite…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE BHI 2023

    MSC Class: 68T07; 68T05 ACM Class: I.2.6; J.3

  31. arXiv:2309.01377

    cs.CV cs.AI

    Memory augment is All You Need for image restoration

    Authors: Xiao Feng Zhang, Chao Chen Gu, Shan Ying Zhu

    Abstract: Image restoration is a low-level vision task, most CNN methods are designed as a black box, lacking transparency and internal aesthetics. Although some methods combining traditional optimization algorithms with DNNs have been proposed, they all have some limitations. In this paper, we propose a three-granularity memory layer and contrast learning named MemoryNet, specifically, dividing the samples…

    Submitted 4 September, 2023; originally announced September 2023.

  32. arXiv:2308.15122

    cs.CL

    SpikeBERT: A Language Spikformer Learned from BERT with Knowledge Distillation

    Authors: Changze Lv, Tianlong Li, Jianhan Xu, Chenxi Gu, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang

    Abstract: Spiking neural networks (SNNs) offer a promising avenue to implement deep neural networks in a more energy-efficient way. However, the network architectures of existing SNNs for language tasks are still simplistic and relatively shallow, and deep architectures have not been fully explored, resulting in a significant performance gap compared to mainstream transformer-based networks such as BERT. To…

    Submitted 21 February, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

  33. arXiv:2308.08998

    cs.CL cs.LG

    Reinforced Self-Training (ReST) for Language Modeling

    Authors: Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Ksenia Konyushkova, Lotte Weerts, Abhishek Sharma, Aditya Siddhant, Alex Ahern, Miaosen Wang, Chenjie Gu, Wolfgang Macherey, Arnaud Doucet, Orhan Firat, Nando de Freitas

    Abstract: Reinforcement learning from human feedback (RLHF) can improve the quality of large language model's (LLM) outputs by aligning them with human preferences. We propose a simple algorithm for aligning LLMs with human preferences inspired by growing batch reinforcement learning (RL), which we call Reinforced Self-Training (ReST). Given an initial LLM policy, ReST produces a dataset by generating sampl…

    Submitted 21 August, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: 23 pages, 16 figures

  34. arXiv:2307.11360

    cs.CV

    ParGANDA: Making Synthetic Pedestrians A Reality For Object Detection

    Authors: Daria Reshetova, Guanhang Wu, Marcel Puyat, Chunhui Gu, Huizhong Chen

    Abstract: Object detection is the key technique to a number of Computer Vision applications, but it often requires large amounts of annotated data to achieve decent results. Moreover, for pedestrian detection specifically, the collected data might contain some personally identifiable information (PII), which is highly restricted in many countries. This label intensive and privacy concerning task has recentl…

    Submitted 21 July, 2023; originally announced July 2023.

  35. arXiv:2306.10286   

    cs.CV cs.AI

    Enlighten Anything: When Segment Anything Model Meets Low-Light Image Enhancement

    Authors: Qihan Zhao, Xiaofeng Zhang, Hao Tang, Chaochen Gu, Shanying Zhu

    Abstract: Image restoration is a low-level visual task, and most CNN methods are designed as black boxes, lacking transparency and intrinsic aesthetics. Many unsupervised approaches ignore the degradation of visible information in low-light scenes, which will seriously affect the aggregation of complementary information and also make the fusion algorithm unable to produce satisfactory fusion results under e…

    Submitted 31 July, 2023; v1 submitted 17 June, 2023; originally announced June 2023.

    Comments: it will be revised

  36. arXiv:2306.10200

    cs.CR

    Privacy-Enhancing Technologies for Financial Data Sharing

    Authors: Panagiotis Chatzigiannis, Wanyun Catherine Gu, Srinivasan Raghuraman, Peter Rindal, Mahdi Zamani

    Abstract: Today, financial institutions (FIs) store and share consumers' financial data for various reasons such as offering loans, processing payments, and protecting against fraud and financial crime. Such sharing of sensitive data have been subject to data breaches in the past decade. While some regulations (e.g., GDPR, FCRA, and CCPA) help to prevent institutions from freely sharing clients' sensitive…

    Submitted 16 June, 2023; originally announced June 2023.

  37. arXiv:2306.06113

    cs.CV

    SAM-helps-Shadow:When Segment Anything Model meet shadow removal

    Authors: Xiaofeng Zhang, Chaochen Gu, Shanying Zhu

    Abstract: The challenges surrounding the application of image shadow removal to real-world images and not just constrained datasets like ISTD/SRD have highlighted an urgent need for zero-shot learning in this field. In this study, we innovatively adapted the SAM (Segment anything model) for shadow removal by introducing SAM-helps-Shadow, effectively integrating shadow detection and removal into a single sta…

    Submitted 1 June, 2023; originally announced June 2023.

  38. Embedding Contextual Information through Reward Shaping in Multi-Agent Learning: A Case Study from Google Football

    Authors: Chaoyi Gu, Varuna De Silva, Corentin Artaud, Rafael Pina

    Abstract: Artificial Intelligence has been used to help human complete difficult tasks in complicated environments by providing optimized strategies for decision-making or replacing the manual labour. In environments including multiple agents, such as football, the most common methods to train agents are Imitation Learning and Multi-Agent Reinforcement Learning (MARL). However, the agents trained by Imitati…

    Submitted 21 July, 2023; v1 submitted 25 March, 2023; originally announced March 2023.

    Journal ref: 2023 IEEE 13th International Conference on Pattern Recognition Systems (ICPRS), Guayaquil, Ecuador, 2023, pp. 1-8

  39. arXiv:2303.13323

    stat.ML cs.LG cs.MA

    Deep Generative Multi-Agent Imitation Model as a Computational Benchmark for Evaluating Human Performance in Complex Interactive Tasks: A Case Study in Football

    Authors: Chaoyi Gu, Varuna De Silva

    Abstract: Evaluating the performance of human is a common need across many applications, such as in engineering and sports. When evaluating human performance in completing complex and interactive tasks, the most common way is to use a metric having been proved efficient for that context, or to use subjective measurement techniques. However, this can be an error prone and unreliable process since static metr…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: 8 pages, 10 figures

  40. arXiv:2303.09952

    cs.CV

    Single-view Neural Radiance Fields with Depth Teacher

    Authors: Yurui Chen, Chun Gu, Feihu Zhang, Li Zhang

    Abstract: Neural Radiance Fields (NeRF) have been proposed for photorealistic novel view rendering. However, it requires many different views of one scene for training. Moreover, it has poor generalizations to new scenes and requires retraining or fine-tuning on each scene. In this paper, we develop a new NeRF model for novel view synthesis using only a single image as input. We propose to combine the (coar…

    Submitted 11 May, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

  41. arXiv:2302.14012

    quant-ph cs.ET

    Drone-based quantum key distribution

    Authors: Xiao-Hui Tian, Ran Yang, Ji-Ning Zhang, Hua Yu, Yao Zhang, Pengfei Fan, Mengwen Chen, Changsheng Gu, Xin Ni, Mingzhe Hu, Xun Cao, Xiaopeng Hu, Gang Zhao, Yan-Qing Lu, Zhi-Jun Yin, Hua-Ying Liu, Yan-Xiao Gong, Zhenda Xie, Shi-Ning Zhu

    Abstract: Drone-based quantum link has the potential to realize mobile quantum network, and entanglement distribution has been demonstrated using one and two drones. Here we report the first drone-based quantum key distribution (QKD), with average secure key rate larger than 8 kHz using decoy-state BB84 protocol with polarization coding. Compact acquisition, pointing, and tracking (APT) system and QKD modul…

    Submitted 27 February, 2023; originally announced February 2023.

  42. arXiv:2212.14566

    eess.SY cs.LG

    Pontryagin Optimal Control via Neural Networks

    Authors: Chengyang Gu, Hui Xiong, Yize Chen

    Abstract: Solving real-world optimal control problems is a challenging task, as the complex, high-dimensional system dynamics are usually unrevealed to the decision maker. It is thus hard to find the optimal control actions numerically. To deal with such modeling and computation challenges, in this paper, we integrate Neural Networks with the Pontryagin's Maximum Principle (PMP), and propose a sample effici…

    Submitted 15 January, 2024; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: In submission

  43. arXiv:2212.10066  [pdf, other]

    cs.CV cs.AI

    RepMode: Learning to Re-parameterize Diverse Experts for Subcellular Structure Prediction

    Authors: Donghao Zhou, Chunbin Gu, Junde Xu, Furui Liu, Qiong Wang, Guangyong Chen, Pheng-Ann Heng

    Abstract: In biological research, fluorescence staining is a key technique to reveal the locations and morphology of subcellular structures. However, it is slow, expensive, and harmful to cells. In this paper, we model it as a deep learning task termed subcellular structure prediction (SSP), aiming to predict the 3D fluorescent images of multiple subcellular structures from a 3D transmitted-light image. Unf…

    Submitted 25 March, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted by CVPR2023 (Highlight)

  44. HDNet: A Hierarchically Decoupled Network for Crowd Counting

    Authors: Chenliang Gu, Changan Wang, Bin-Bin Gao, Jun Liu, Tianliang Zhang

    Abstract: Recently, density map regression-based methods have dominated in crowd counting owing to their excellent fitting ability on density distribution. However, further improvement tends to saturate mainly because of the confusing background noise and the large density variation. In this paper, we propose a Hierarchically Decoupled Network (HDNet) to solve the above two problems within a unified framewo…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: Accepted by IEEE International Conference on Multimedia and Expo (ICME), 2022

  45. arXiv:2211.00312  [pdf, other]

    cs.CV cs.LG

    HDNet: Hierarchical Dynamic Network for Gait Recognition using Millimeter-Wave Radar

    Authors: Yanyan Huang, Yong Wang, Kun Shi, Chaojie Gu, Yu Fu, Cheng Zhuo, Zhiguo Shi

    Abstract: Gait recognition is widely used in diversified practical applications. Currently, the most prevalent approach is to recognize human gait from RGB images, owing to the progress of computer vision technologies. Nevertheless, the perception capability of RGB cameras deteriorates in rough circumstances, and visual surveillance may cause privacy invasion. Due to the robustness and non-invasive feature…

    Submitted 1 November, 2022; originally announced November 2022.

  46. arXiv:2210.17258  [pdf, other]

    cs.CV

    Teacher-Student Network for 3D Point Cloud Anomaly Detection with Few Normal Samples

    Authors: Jianjian Qin, Chunzhi Gu, Jun Yu, Chao Zhang

    Abstract: Anomaly detection, which is a critical and popular topic in computer vision, aims to detect anomalous samples that are different from the normal (i.e., non-anomalous) ones. The current mainstream methods focus on anomaly detection for images, whereas little attention has been paid to 3D point cloud. In this paper, drawing inspiration from the knowledge transfer ability of teacher-student architect…

    Submitted 9 May, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

  47. arXiv:2210.15255  [pdf, other]

    cs.AR

    RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN

    Authors: Yilong Zhao, Li Jiang, Mingyu Gao, Naifeng Jing, Chengyang Gu, Qidong Tang, Fangxin Liu, Tao Yang, Xiaoyao Liang

    Abstract: The second-order training methods can converge much faster than first-order optimizers in DNN training. This is because the second-order training utilizes the inversion of the second-order information (SOI) matrix to find a more accurate descent direction and step size. However, the huge SOI matrices bring significant computational and memory overheads in the traditional architectures like GPU and…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 13 pages, 13 figures

  48. arXiv:2210.07543  [pdf, other]

    cs.CL cs.LG

    Watermarking Pre-trained Language Models with Backdooring

    Authors: Chenxi Gu, Chengsong Huang, Xiaoqing Zheng, Kai-Wei Chang, Cho-Jui Hsieh

    Abstract: Large pre-trained language models (PLMs) have proven to be a crucial component of modern natural language processing systems. PLMs typically need to be fine-tuned on task-specific downstream datasets, which makes it hard to claim the ownership of PLMs and protect the developer's intellectual property due to the catastrophic forgetting phenomenon. We show that PLMs can be watermarked with a multi-t…

    Submitted 10 February, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

  49. arXiv:2210.06951  [pdf, ps, other]

    cs.IT physics.data-an

    Performance Optimization and Parameters Estimation for MIMO-OFDM Dual-functional Communication-radar Systems

    Authors: Chen Zhong, Chunrong Gu, Lan Tang, Yechao Bai, Mengting Lou

    Abstract: In dual-functional communication-radar systems, common radio frequency (RF) signals are used for both communication and detection. For better compatibility with existing communication systems, we adopt multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) signals as integrated signals and investigate the estimation performance of MIMO-OFDM signals. We first analyz…

    Submitted 27 August, 2022; originally announced October 2022.

    Comments: Digital Communications and network

  50. arXiv:2209.15543  [pdf, other]

    physics.geo-ph cs.LG

    Bayesian Neural Networks for Geothermal Resource Assessment: Prediction with Uncertainty

    Authors: Stephen Brown, William L. Rodi, Marco Seracini, Chen Gu, Michael Fehler, James Faulds, Connor M. Smith, Sven Treitel

    Abstract: We consider the application of machine learning to the evaluation of geothermal resource potential. A supervised learning problem is defined where maps of 10 geological and geophysical features within the state of Nevada, USA are used to define geothermal potential across a broad region. We have available a relatively small set of positive training sites (known resources or active power plants) an…

    Submitted 25 October, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: 27 pages, 12 figures