Showing 1–50 of 167 results for author: Zhong, C

Searching in archive cs.
  1. arXiv:2405.09985  [pdf, other]

    cs.CV

    VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing

    Authors: Binghui Chen, Chongyang Zhong, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

    Abstract: Due to the significant advances in large-scale text-to-image generation by diffusion model (DM), controllable human image generation has been attracting much attention recently. Existing works, such as Controlnet [36], T2I-adapter [20] and HumanSD [10] have demonstrated good abilities in generating human images based on pose conditions, they still fail to meet the requirements of real e-commerce s…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: project page: https://aigcdesigngroup.github.io/replace-anything

  2. arXiv:2404.19750  [pdf, other]

    cs.IT eess.SP

    A Joint Communication and Computation Design for Distributed RISs Assisted Probabilistic Semantic Communication in IIoT

    Authors: Zhouxiang Zhao, Zhaohui Yang, Chongwen Huang, Li Wei, Qianqian Yang, Caijun Zhong, Wei Xu, Zhaoyang Zhang

    Abstract: In this paper, the problem of spectral-efficient communication and computation resource allocation for distributed reconfigurable intelligent surfaces (RISs) assisted probabilistic semantic communication (PSC) in industrial Internet-of-Things (IIoT) is investigated. In the considered model, multiple RISs are deployed to serve multiple users, while PSC adopts compute-then-transmit protocol to reduc…

    Submitted 30 April, 2024; originally announced April 2024.

  3. arXiv:2403.11974  [pdf, other]

    eess.IV cs.CV

    OUCopula: Bi-Channel Multi-Label Copula-Enhanced Adapter-Based CNN for Myopia Screening Based on OU-UWF Images

    Authors: Yang Li, Qiuyi Huang, Chong Zhong, Danjuan Yang, Meiyan Li, A. H. Welsh, Aiyi Liu, Bo Fu, Catherine C. Liu, Xingtao Zhou

    Abstract: Myopia screening using cutting-edge ultra-widefield (UWF) fundus imaging is potentially significant for ophthalmic outcomes. Current multidisciplinary research between ophthalmology and deep learning (DL) concentrates primarily on disease classification and diagnosis using single-eye images, largely ignoring joint modeling and prediction for Oculus Uterque (OU, both eyes). Inspired by the complex…

    Submitted 18 March, 2024; originally announced March 2024.

  4. arXiv:2403.11693  [pdf, other]

    cs.IT eess.SP

    Beamforming Design for Semantic-Bit Coexisting Communication System

    Authors: Maojun Zhang, Guangxu Zhu, Richeng Jin, Xiaoming Chen, Qingjiang Shi, Caijun Zhong, Kaibin Huang

    Abstract: Semantic communication (SemCom) is emerging as a key technology for future sixth-generation (6G) systems. Unlike traditional bit-level communication (BitCom), SemCom directly optimizes performance at the semantic level, leading to superior communication efficiency. Nevertheless, the task-oriented nature of SemCom renders it challenging to completely replace BitCom. Consequently, it is desired to c…

    Submitted 22 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE for possible publication

  5. arXiv:2403.11057  [pdf, other]

    cs.CV cs.RO

    Large Language Models Powered Context-aware Motion Prediction

    Authors: Xiaoji Zheng, Lixiu Wu, Zhijie Yan, Yuanrong Tang, Hao Zhao, Chen Zhong, Bokui Chen, Jiangtao Gong

    Abstract: Motion prediction is among the most fundamental tasks in autonomous driving. Traditional methods of motion forecasting primarily encode vector information of maps and historical trajectory data of traffic participants, lacking a comprehensive understanding of overall traffic semantics, which in turn affects the performance of prediction tasks. In this paper, we utilized Large Language Models (LLMs…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: 6 pages, 4 figures

    MSC Class: 68T45

  6. arXiv:2403.09637  [pdf, other]

    cs.RO cs.CV

    GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping

    Authors: Yuhang Zheng, Xiangyu Chen, Yupeng Zheng, Songen Gu, Runyi Yang, Bu Jin, Pengfei Li, Chengliang Zhong, Zengmao Wang, Lina Liu, Chao Yang, Dawei Wang, Zhen Chen, Xiaoxiao Long, Meiqing Wang

    Abstract: Constructing a 3D scene capable of accommodating open-ended language queries, is a pivotal pursuit, particularly within the domain of robotics. Such technology facilitates robots in executing object manipulations based on human language directives. To tackle this challenge, some research efforts have been dedicated to the development of language-embedded implicit fields. However, implicit fields (…

    Submitted 14 March, 2024; originally announced March 2024.

  7. arXiv:2403.08766  [pdf, other]

    cs.CV

    MonoOcc: Digging into Monocular Semantic Occupancy Prediction

    Authors: Yupeng Zheng, Xiang Li, Pengfei Li, Yuhang Zheng, Bu Jin, Chengliang Zhong, Xiaoxiao Long, Hao Zhao, Qichao Zhang

    Abstract: Monocular Semantic Occupancy Prediction aims to infer the complete 3D geometry and semantic information of scenes from only 2D images. It has garnered significant attention, particularly due to its potential to enhance the 3D perception of autonomous vehicles. However, existing methods rely on a complex cascaded framework with relatively limited information to restore 3D scenes, including a depend…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted by ICRA 2024

  8. arXiv:2403.02714  [pdf, other]

    cs.CV

    DomainVerse: A Benchmark Towards Real-World Distribution Shifts For Tuning-Free Adaptive Domain Generalization

    Authors: Feng Hou, Jin Yuan, Ying Yang, Yang Liu, Yang Zhang, Cheng Zhong, Zhongchao Shi, Jianping Fan, Yong Rui, Zhiqiang He

    Abstract: Traditional cross-domain tasks, including domain adaptation and domain generalization, rely heavily on training model by source domain data. With the recent advance of vision-language models (VLMs), viewed as natural source models, the cross-domain task changes to directly adapt the pre-trained source model to arbitrary target domains equipped with prior domain knowledge, and we name this task Ada…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Currently in review for ICML 2024

  9. arXiv:2403.01480  [pdf, ps, other]

    cs.IT eess.SP

    Deep Learning-based Design of Uplink Integrated Sensing and Communication

    Authors: Qiao Qi, Xiaoming Chen, Caijun Zhong, Chau Yuen, Zhaoyang Zhang

    Abstract: In this paper, we investigate the issue of uplink integrated sensing and communication (ISAC) in 6G wireless networks where the sensing echo signal and the communication signal are received simultaneously at the base station (BS). To effectively mitigate the mutual interference between sensing and communication caused by the sharing of spectrum and hardware resources, we provide a joint sensing tr…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: IEEE Transactions on Wireless Communications, 2024

  10. arXiv:2402.10593  [pdf, other]

    cs.IT eess.SP

    Bayesian Learning for Double-RIS Aided ISAC Systems with Superimposed Pilots and Data

    Authors: Xu Gan, Chongwen Huang, Zhaohui Yang, Caijun Zhong, Xiaoming Chen, Zhaoyang Zhang, Qinghua Guo, Chau Yuen, Merouane Debbah

    Abstract: Reconfigurable intelligent surface (RIS) has great potential to improve the performance of integrated sensing and communication (ISAC) systems, especially in scenarios where line-of-sight paths between the base station and users are blocked. However, the spectral efficiency (SE) of RIS-aided ISAC uplink transmissions may be drastically reduced by the heavy burden of pilot overhead for realizing se…

    Submitted 16 February, 2024; originally announced February 2024.

  11. arXiv:2402.10456  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Generative Modeling for Tabular Data via Penalized Optimal Transport Network

    Authors: Wenhui Sophia Lu, Chenyang Zhong, Wing Hung Wong

    Abstract: The task of precisely learning the probability distribution of rows within tabular data and producing authentic synthetic samples is both crucial and non-trivial. Wasserstein generative adversarial network (WGAN) marks a notable improvement in generative modeling, addressing the challenges faced by its predecessor, generative adversarial network. However, due to the mixed data types and multimodal…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 37 pages, 23 figures

  12. arXiv:2401.17509  [pdf, other]

    cs.CV

    Anything in Any Scene: Photorealistic Video Object Insertion

    Authors: Chen Bai, Zeman Shao, Guoxiang Zhang, Di Liang, Jie Yang, Zhuorui Zhang, Yujian Guo, Chengzhang Zhong, Yiqiao Qiu, Zhendong Wang, Yichen Guan, Xiaoyin Zheng, Tao Wang, Cheng Lu

    Abstract: Realistic video simulation has shown significant potential across diverse applications, from virtual reality to film production. This is particularly true for scenarios where capturing videos in real-world settings is either impractical or expensive. Existing approaches in video simulation often fail to accurately model the lighting environment, represent the object geometry, or achieve high level…

    Submitted 30 January, 2024; originally announced January 2024.

  13. arXiv:2401.01629  [pdf, ps, other]

    cs.LG cs.AI cs.CY

    Synthetic Data in AI: Challenges, Applications, and Ethical Implications

    Authors: Shuang Hao, Wenfeng Han, Tao Jiang, Yi** Li, Haonan Wu, Chunlin Zhong, Zhangjun Zhou, He Tang

    Abstract: In the rapidly evolving field of artificial intelligence, the creation and utilization of synthetic datasets have become increasingly significant. This report delves into the multifaceted aspects of synthetic data, particularly emphasizing the challenges and potential biases these datasets may harbor. It explores the methodologies behind synthetic data generation, spanning traditional statistical…

    Submitted 3 January, 2024; originally announced January 2024.

  14. arXiv:2312.11024  [pdf, other]

    cs.CV cs.AI

    Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis

    Authors: Tianyao He, Huabin Liu, Yuxi Li, Xiao Ma, Cheng Zhong, Yang Zhang, Weiyao Lin

    Abstract: Video Correlation Learning (VCL), which aims to analyze the relationships between videos, has been widely studied and applied in various general video tasks. However, applying VCL to instructional videos is still quite challenging due to their intrinsic procedural temporal structure. Specifically, procedural knowledge is critical for accurate correlation analyses on instructional videos. Neverthel…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: has been accepted by AAAI 24

  15. Nash Equilibria of Two-round Auctions

    Authors: Chulong Zhong, Xiang Yan, Yuyi Wang, Shuang** Huang, ** Zhong

    Abstract: In a two-round auction, a subset of bidders is selected (probabilistically), according to their bids in the first round, for the second round, where they can increase their bids. We formalize the two-round auction model, restricting the second round to a dominant strategy incentive compatible (DSIC) auction for the selected bidders. It turns out that, however, such two-round auctions are not direc…

    Submitted 7 December, 2023; originally announced December 2023.

  16. arXiv:2312.03033  [pdf, other]

    cs.CV

    LiDAR-based Person Re-identification

    Authors: Wenxuan Guo, Zhiyu Pan, Ying** Liang, Ziheng Xi, Zhi Chen Zhong, Jianjiang Feng, Jie Zhou

    Abstract: Camera-based person re-identification (ReID) systems have been widely applied in the field of public security. However, cameras often lack the perception of 3D morphological information of human and are susceptible to various limitations, such as inadequate illumination, complex background, and personal privacy. In this paper, we propose a LiDAR-based ReID framework, ReID3D, that utilizes pre-trai…

    Submitted 11 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

  17. arXiv:2312.00088  [pdf, ps, other]

    cs.LG eess.SP eess.SY

    Anomaly Detection via Learning-Based Sequential Controlled Sensing

    Authors: Geethu Joseph, Chen Zhong, M. Cenk Gursoy, Senem Velipasalar, Pramod K. Varshney

    Abstract: In this paper, we address the problem of detecting anomalies among a given set of binary processes via learning-based controlled sensing. Each process is parameterized by a binary random variable indicating whether the process is anomalous. To identify the anomalies, the decision-making agent is allowed to observe a subset of the processes at each time instant. Also, probing each process has an as…

    Submitted 30 November, 2023; originally announced December 2023.

  18. arXiv:2311.18288  [pdf, other]

    cs.CV

    CosAvatar: Consistent and Animatable Portrait Video Tuning with Text Prompt

    Authors: Haiyao Xiao, Chenglai Zhong, Xuan Gao, Yudong Guo, Juyong Zhang

    Abstract: Recently, text-guided digital portrait editing has attracted more and more attentions. However, existing methods still struggle to maintain consistency across time, expression, and view or require specific data prerequisites. To solve these challenging problems, we propose CosAvatar, a high-quality and user-friendly framework for portrait tuning. With only monocular video and text instructions as…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: Project page: https://ustc3dv.github.io/CosAvatar/

  19. Spectrum Sharing between UAV-based Wireless Mesh Networks and Ground Networks

    Authors: Zhiqing Wei, Zijun Guo, Zhiyong Feng, Jialin Zhu, Caijun Zhong, Qihui Wu, Huici Wu

    Abstract: The unmanned aerial vehicle (UAV)-based wireless mesh networks can economically provide wireless services for the areas with disasters. However, the capacity of air-to-air communications is limited due to the multi-hop transmissions. In this paper, the spectrum sharing between UAV-based wireless mesh networks and ground networks is studied to improve the capacity of the UAV networks. Considering t…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: 6 pages, 6 figures

  20. arXiv:2311.03967  [pdf, other]

    cs.CV stat.ML

    CeCNN: Copula-enhanced convolutional neural networks in joint prediction of refraction error and axial length based on ultra-widefield fundus images

    Authors: Chong Zhong, Yang Li, Danjuan Yang, Meiyan Li, Xingyao Zhou, Bo Fu, Catherine C. Liu, A. H. Welsh

    Abstract: Ultra-widefield (UWF) fundus images are replacing traditional fundus images in screening, detection, prediction, and treatment of complications related to myopia because their much broader visual range is advantageous for highly myopic eyes. Spherical equivalent (SE) is extensively used as the main myopia outcome measure, and axial length (AL) has drawn increasing interest as an important ocular c…

    Submitted 1 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

  21. arXiv:2309.05098  [pdf, other]

    cs.CV

    3D Implicit Transporter for Temporally Consistent Keypoint Discovery

    Authors: Chengliang Zhong, Yuhang Zheng, Yupeng Zheng, Hao Zhao, Li Yi, Xiaodong Mu, Ling Wang, Pengfei Li, Guyue Zhou, Chao Yang, Xinliang Zhang, Jian Zhao

    Abstract: Keypoint-based representation has proven advantageous in various visual and robotic tasks. However, the existing 2D and 3D methods for detecting keypoints mainly rely on geometric consistency to achieve spatial alignment, neglecting temporal consistency. To address this issue, the Transporter method was introduced for 2D data, which reconstructs the target frame from the source frame to incorporat…

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: ICCV2023 oral paper

  22. arXiv:2309.00796  [pdf, other]

    cs.CV

    AttT2M: Text-Driven Human Motion Generation with Multi-Perspective Attention Mechanism

    Authors: Chongyang Zhong, Lei Hu, Zihao Zhang, Shihong Xia

    Abstract: Generating 3D human motion based on textual descriptions has been a research focus in recent years. It requires the generated motion to be diverse, natural, and conform to the textual description. Due to the complex spatio-temporal nature of human motion and the difficulty in learning the cross-modal relationship between text and motion, text-driven motion generation is still a challenging problem…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: IEEE International Conference on Computer Vision 2023, 9 pages

  23. arXiv:2308.14346  [pdf, other]

    cs.CL cs.AI

    DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation

    Authors: Zhijie Bao, Wei Chen, Shengze Xiao, Kuang Ren, Jiaao Wu, Cheng Zhong, Jiajie Peng, Xuanjing Huang, Zhongyu Wei

    Abstract: We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical response in end-to-end conversational healthcare services. To construct high-quality Supervised Fine-Tuning (SFT) datasets, we employ three strategies: utilizing medical knowledge-graphs, reconstructing real-world dialogues, and incorporating human-guided preference…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Work in progress

  24. arXiv:2308.13245  [pdf, other]

    cs.CV

    Unpaired Multi-domain Attribute Translation of 3D Facial Shapes with a Square and Symmetric Geometric Map

    Authors: Zhenfeng Fan, Zhiheng Zhang, Shuang Yang, Chongyang Zhong, Min Cao, Shihong Xia

    Abstract: While impressive progress has recently been made in image-oriented facial attribute translation, shape-oriented 3D facial attribute translation remains an unsolved issue. This is primarily limited by the lack of 3D generative models and ineffective usage of 3D facial data. We propose a learning framework for 3D facial attribute translation to relieve these limitations. Firstly, we customize a nove…

    Submitted 25 August, 2023; originally announced August 2023.

  25. arXiv:2308.05881  [pdf, other]

    cs.CV cs.LG

    Aphid Cluster Recognition and Detection in the Wild Using Deep Learning Models

    Authors: Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Cuncong Zhong, Bo Luo, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

    Abstract: Aphid infestation poses a significant threat to crop production, rural communities, and global food security. While chemical pest control is crucial for maximizing yields, applying chemicals across entire fields is both environmentally unsustainable and costly. Hence, precise localization and management of aphids are essential for targeted pesticide application. The paper primarily focuses on usin…

    Submitted 10 August, 2023; originally announced August 2023.

  26. arXiv:2307.05929  [pdf, other]

    cs.CV cs.AI

    A New Dataset and Comparative Study for Aphid Cluster Detection

    Authors: Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Cuncong Zhong, Bo Luo, Ivan Grijalva Teran, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

    Abstract: Aphids are one of the main threats to crops, rural families, and global food security. Chemical pest control is a necessary component of crop production for maximizing yields, however, it is unnecessary to apply the chemical approaches to the entire fields in consideration of the environmental pollution and the cost. Thus, accurately localizing the aphid and estimating the infestation level is cru…

    Submitted 12 July, 2023; originally announced July 2023.

  27. Pose-aware Attention Network for Flexible Motion Retargeting by Body Part

    Authors: Lei Hu, Zihao Zhang, Chongyang Zhong, Boyuan Jiang, Shihong Xia

    Abstract: Motion retargeting is a fundamental problem in computer graphics and computer vision. Existing approaches usually have many strict requirements, such as the source-target skeletons needing to have the same number of joints or share the same topology. To tackle this problem, we note that skeletons with different structure may have some common body parts despite the differences in joint numbers. Fol…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 17 pages, 12 figures. IEEE Transactions on Visualization and Computer Graphics, (2023)

    MSC Class: 68U05; 68T40 ACM Class: I.3.0; I.2.0

  28. arXiv:2306.04915  [pdf, ps, other]

    cs.IT eess.SP

    Sensing-based Beamforming Design for Joint Performance Enhancement of RIS-Aided ISAC Systems

    Authors: Xiaowei Qian, Xiaoling Hu, Chenxi Liu, Mugen Peng, Caijun Zhong

    Abstract: Reconfigurable intelligent surface (RIS) has shown its great potential in facilitating device-based integrated sensing and communication (ISAC), where sensing and communication tasks are mostly conducted on different time-frequency resources. While the more challenging scenarios of simultaneous sensing and communication (SSC) have so far drawn little attention. In this paper, we propose a novel RI…

    Submitted 7 June, 2023; originally announced June 2023.

  29. arXiv:2305.03969  [pdf, ps, other]

    cs.IT cs.DC

    Joint Compression and Deadline Optimization for Wireless Federated Learning

    Authors: Maojun Zhang, Yang Li, Dongzhu Liu, Richeng Jin, Guangxu Zhu, Caijun Zhong, Tony Q. S. Quek

    Abstract: Federated edge learning (FEEL) is a popular distributed learning framework for privacy-preserving at the edge, in which densely distributed edge devices periodically exchange model-updates with the server to complete the global model training. Due to limited bandwidth and uncertain wireless environment, FEEL may impose heavy burden to the current communication system. In addition, under the common…

    Submitted 12 December, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Comments: 13 pages, accepted by IEEE Transactions on Mobile Computing (TMC)

  30. arXiv:2304.06686  [pdf, other]

    cs.LG stat.ML

    OKRidge: Scalable Optimal k-Sparse Ridge Regression

    Authors: Jiachang Liu, Sam Rosen, Chudi Zhong, Cynthia Rudin

    Abstract: We consider an important problem in scientific discovery, namely identifying sparse governing equations for nonlinear dynamical systems. This involves solving sparse ridge regression problems to provable optimality in order to determine which terms drive the underlying dynamics. We propose a fast algorithm, OKRidge, for sparse ridge regression, using a novel lower bound calculation involving, firs…

    Submitted 11 January, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 Spotlight

  31. arXiv:2304.00838  [pdf, other]

    cs.CV cs.GR

    MetaHead: An Engine to Create Realistic Digital Head

    Authors: Dingyun Zhang, Chenglai Zhong, Yudong Guo, Yang Hong, Juyong Zhang

    Abstract: Collecting and labeling training data is one important step for learning-based methods because the process is time-consuming and biased. For face analysis tasks, although some generative models can be used to generate face data, they can only achieve a subset of generation diversity, reconstruction accuracy, 3D consistency, high-fidelity visual quality, and easy editability. One recent related wor…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: Project page: https://ustc3dv.github.io/MetaHead/

  32. arXiv:2303.16047  [pdf, other]

    cs.LG cs.AI stat.ML

    Exploring and Interacting with the Set of Good Sparse Generalized Additive Models

    Authors: Chudi Zhong, Zhi Chen, Jiachang Liu, Margo Seltzer, Cynthia Rudin

    Abstract: In real applications, interaction between machine learning models and domain experts is critical; however, the classical machine learning paradigm that usually produces only a single model does not facilitate such interaction. Approximating and exploring the Rashomon set, i.e., the set of all near-optimal models, addresses this practical challenge by providing the user with a searchable space cont…

    Submitted 17 November, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  33. arXiv:2303.11319  [pdf, other]

    cs.IT eess.SP

    Over-the-Air Federated Edge Learning with Error-Feedback One-Bit Quantization and Power Control

    Authors: Yuding Liu, Dongzhu Liu, Guangxu Zhu, Qingjiang Shi, Caijun Zhong

    Abstract: Over-the-air federated edge learning (Air-FEEL) is a communication-efficient framework for distributed machine learning using training data distributed at edge devices. This framework enables all edge devices to transmit model updates simultaneously over the entire available bandwidth, allowing for over-the-air aggregation. A one-bit digital over-the-air aggregation (OBDA) scheme has been recently…

    Submitted 20 March, 2023; originally announced March 2023.

  34. arXiv:2302.09634  [pdf, ps, other]

    cs.LG

    Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification in the Presence of Data Heterogeneity

    Authors: Richeng Jin, Xiaofan He, Caijun Zhong, Zhaoyang Zhang, Tony Quek, Huaiyu Dai

    Abstract: Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks. To alleviate the concern, various gradient compression methods have been proposed, and sign-based algorithms are of surging interest. However, SIGNSGD fails to converge in the presence of data heterogeneity, which is commonly observed in the emerging federated learning (FL) paradigm.…

    Submitted 19 February, 2023; originally announced February 2023.

  35. arXiv:2302.09624  [pdf, ps, other]

    cs.CR cs.IT cs.LG

    Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy

    Authors: Richeng Jin, Zhonggen Su, Caijun Zhong, Zhaoyang Zhang, Tony Quek, Huaiyu Dai

    Abstract: We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability. The commonly adopted compression schemes introduce information loss into local data while improving communication efficiency, and it remains an open problem whether such discrete-valued mechanisms provide any priv…

    Submitted 1 February, 2024; v1 submitted 19 February, 2023; originally announced February 2023.

  36. arXiv:2302.01334  [pdf, other]

    cs.CV

    STEPS: Joint Self-supervised Nighttime Image Enhancement and Depth Estimation

    Authors: Yupeng Zheng, Chengliang Zhong, Pengfei Li, Huan-ang Gao, Yuhang Zheng, Bu Jin, Ling Wang, Hao Zhao, Guyue Zhou, Qichao Zhang, Dongbin Zhao

    Abstract: Self-supervised depth estimation draws a lot of attention recently as it can promote the 3D sensing capabilities of self-driving vehicles. However, it intrinsically relies upon the photometric consistency assumption, which hardly holds during nighttime. Although various supervised nighttime image enhancement methods have been proposed, their generalization performance in challenging driving scenar…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: Accepted by ICRA 2023, Code: https://github.com/ucaszyp/STEPS

  37. arXiv:2302.00461  [pdf, ps, other]

    cs.IT eess.SP

    AMP-SBL Unfolding for Wideband MmWave Massive MIMO Channel Estimation

    Authors: Jiabao Gao, Caijun Zhong, Geoffrey Ye Li

    Abstract: In wideband millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems, channel estimation is challenging due to the hybrid analog-digital architecture, which compresses the received pilot signal and makes channel estimation a compressive sensing (CS) problem. However, existing high-performance CS algorithms usually suffer from high complexity. On the other hand, the beam squin…

    Submitted 1 February, 2023; originally announced February 2023.

  38. arXiv:2212.00227  [pdf, other]

    cs.IT eess.SP

    Wireless Image Transmission with Semantic and Security Awareness

    Authors: Maojun Zhang, Yang Li, Zezhong Zhang, Guangxu Zhu, Caijun Zhong

    Abstract: Semantic communication is an increasingly popular framework for wireless image transmission due to its high communication efficiency. With the aid of the joint-source-and-channel (JSC) encoder implemented by neural network, semantic communication directly maps original images into symbol sequences containing semantic information. Compared with the traditional separate source and channel coding des…

    Submitted 30 November, 2022; originally announced December 2022.

    Comments: Submitted to IEEE WCL for possible publication

  39. arXiv:2211.14866  [pdf, ps, other]

    cs.IT

    Spatially Sparse Precoding in Wideband Hybrid Terahertz Massive MIMO Systems

    Authors: Jiabao Gao, Caijun Zhong, Geoffrey Ye Li, Joseph B. Soriaga, Arash Behboodi

    Abstract: In terahertz (THz) massive multiple-input multiple-output (MIMO) systems, the combination of huge bandwidth and massive antennas results in severe beam split, thus making the conventional phase-shifter based hybrid precoding architecture ineffective. With the incorporation of true-time-delay (TTD) lines in the hardware implementation of the analog precoders, delay-phase precoding (DPP) emerges as…

    Submitted 27 November, 2022; originally announced November 2022.

  40. arXiv:2211.04582  [pdf, other]

    cs.LG

    Learning to Learn Domain-invariant Parameters for Domain Generalization

    Authors: Feng Hou, Yao Zhang, Yang Liu, Jin Yuan, Cheng Zhong, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He

    Abstract: Due to domain shift, deep neural networks (DNNs) usually fail to generalize well on unknown test data in practice. Domain generalization (DG) aims to overcome this issue by capturing domain-invariant representations from source domains. Motivated by the insight that only partial parameters of DNNs are optimized to extract domain-invariant representations, we expect a general model that is capable…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP'23

  41. arXiv:2210.14319  [pdf, other]

    cs.CV

    Explicitly Increasing Input Information Density for Vision Transformers on Small Datasets

    Authors: Xiangyu Chen, Ying Qin, Wenju Xu, Andrés M. Bur, Cuncong Zhong, Guanghui Wang

    Abstract: Vision Transformers have attracted a lot of attention recently since the successful implementation of Vision Transformer (ViT) on vision tasks. With vision Transformers, specifically the multi-head self-attention modules, networks can capture long-term dependencies inherently. However, these attention modules normally need to be trained on large datasets, and vision Transformers show inferior perf…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS workshop (VTTA) 2022

  42. arXiv:2210.13795  [pdf, other]

    cs.LG cs.IR cs.SI

    Line Graph Contrastive Learning for Link Prediction

    Authors: Zehua Zhang, Shilin Sun, Guixiang Ma, Caiming Zhong

    Abstract: Link prediction tasks focus on predicting possible future connections. Most existing researches measure the likelihood of links by different similarity scores on node pairs and predict links between nodes. However, the similarity-based approaches have some challenges in information loss on nodes and generalization ability on similarity indexes. To address the above issues, we propose a Line Graph…

    Submitted 7 March, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 37 pages

  43. arXiv:2210.12333  [pdf, other]

    cs.CV

    Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets

    Authors: Xiangyu Chen, Qinghao Hu, Kaidong Li, Cuncong Zhong, Guanghui Wang

    Abstract: Vision Transformers have demonstrated competitive performance on computer vision tasks benefiting from their ability to capture long-range dependencies with multi-head self-attention modules and multi-layer perceptron. However, calculating global attention brings another disadvantage compared with convolutional neural networks, i.e. requiring much more data and computations to converge, which makes…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Camera-Ready Version for WACV 2023

  44. arXiv:2210.06951  [pdf, ps, other]

    cs.IT physics.data-an

    Performance Optimization and Parameters Estimation for MIMO-OFDM Dual-functional Communication-radar Systems

    Authors: Chen Zhong, Chunrong Gu, Lan Tang, Yechao Bai, Mengting Lou

    Abstract: In dual-functional communication-radar systems, common radio frequency (RF) signals are used for both communication and detection. For better compatibility with existing communication systems, we adopt multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) signals as integrated signals and investigate the estimation performance of MIMO-OFDM signals. We first analyz…

    Submitted 27 August, 2022; originally announced October 2022.

    Comments: Digital Communications and network

  45. Reconstructing Personalized Semantic Facial NeRF Models From Monocular Video

    Authors: Xuan Gao, Chenglai Zhong, Jun Xiang, Yang Hong, Yudong Guo, Juyong Zhang

    Abstract: We present a novel semantic model for human head defined with neural radiance field. The 3D-consistent head model consists of a set of disentangled and interpretable bases, and can be driven by low-dimensional expression coefficients. Thanks to the powerful representation ability of neural radiance field, the constructed model can represent complex facial attributes including hair, wearings, which…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted by SIGGRAPH Asia 2022 (Journal Track). Project page: https://ustc3dv.github.io/NeRFBlendShape/

    Journal ref: ACM Trans. Graph. 41, 6, Article 200 (December 2022), 12 pages

  46. arXiv:2210.05846  [pdf, other]

    cs.LG

    FasterRisk: Fast and Accurate Interpretable Risk Scores

    Authors: Jiachang Liu, Chudi Zhong, Boxuan Li, Margo Seltzer, Cynthia Rudin

    Abstract: Over the last century, risk scores have been the most popular form of predictive model used in healthcare and criminal justice. Risk scores are sparse linear models with integer coefficients; often these models can be memorized or placed on an index card. Typically, risk scores have been created either without data or by rounding logistic regression coefficients, but these methods do not reliably…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  47. arXiv:2210.04742  [pdf, other]

    cs.LG cs.AI

    Over-the-Air Split Machine Learning in Wireless MIMO Networks

    Authors: Yuzhi Yang, Zhaoyang Zhang, Yuqing Tian, Zhaohui Yang, Chongwen Huang, Caijun Zhong, Kai-Kit Wong

    Abstract: In split machine learning (ML), different partitions of a neural network (NN) are executed by different computing nodes, requiring a large amount of communication cost. To ease communication burden, over-the-air computation (OAC) can efficiently implement all or part of the computation at the same time of communication. Based on the proposed system, the system implementation over wireless network…

    Submitted 11 December, 2022; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: to be published in IEEE Journal on Selected Areas in Communications, 15 pages, 13 figures, journal paper

  48. TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization

    Authors: Zijie J. Wang, Chudi Zhong, Rui Xin, Takuya Takagi, Zhi Chen, Duen Horng Chau, Cynthia Rudin, Margo Seltzer

    Abstract: Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees--a huge set of almost-optimal interpretable ML models. To help ML practitioners identify models with desirable properties from this Rashomon set, we develop TimberTrek, the f…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: Accepted at IEEE VIS 2022. 5 pages, 6 figures. For a demo video, see https://youtu.be/3eGqTmsStJM. For a live demo, visit https://poloclub.github.io/timbertrek

  49. arXiv:2209.08040  [pdf, other]

    cs.LG cs.AI

    Exploring the Whole Rashomon Set of Sparse Decision Trees

    Authors: Rui Xin, Chudi Zhong, Zhi Chen, Takuya Takagi, Margo Seltzer, Cynthia Rudin

    Abstract: In any given machine learning problem, there may be many models that could explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what could be expressed within a loss function. The Rashomon set is the set of these all almost-optima…

    Submitted 25 October, 2022; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: NeurIPS 2022 (Oral)

  50. arXiv:2208.06072  [pdf, ps, other]

    cs.IT eess.SP

    Multiple RISs Assisted Cell-Free Networks With Two-timescale CSI: Performance Analysis and System Design

    Authors: Xu Gan, Caijun Zhong, Chongwen Huang, Zhaohui Yang, Zhaoyang Zhang

    Abstract: Reconfigurable intelligent surface (RIS) can be employed in a cell-free system to create favorable propagation conditions from base stations (BSs) to users via configurable elements. However, prior works on RIS-aided cell-free system designs mainly rely on the instantaneous channel state information (CSI), which may incur substantial overhead due to extremely high dimensions of estimated channels.…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: 31 pages, 9 figures