Showing 1–50 of 110 results for author: Chu, W

Searching in archive cs.
  1. arXiv:2407.01926  [pdf]

    physics.med-ph cs.CV

    Chemical Shift Encoding based Double Bonds Quantification in Triglycerides using Deep Image Prior

    Authors: Chaoxing Huang, Ziqiang Yu, Zijian Gao, Qiuyi Shen, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

    Abstract: This study evaluated a deep learning-based method using Deep Image Prior (DIP) to quantify triglyceride double bonds from chemical-shift encoded multi-echo gradient echo images without network training. We employed a cost function based on signal constraints to iteratively update the neural network on a single dataset. The method was validated using phantom experiments and in vivo scans. Results s…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.01521  [pdf, other]

    cs.LG cs.AI cs.CV

    Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing

    Authors: Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song

    Abstract: Diffusion models have recently achieved success in solving Bayesian inverse problems with learned data priors. Current methods build on top of the diffusion sampling process, where each denoising step makes small modifications to samples from the previous step. However, this process struggles to correct errors from earlier sampling steps, leading to worse performance in complicated nonlinear inver…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.12002  [pdf, other]

    q-bio.PE cs.LG math.NA physics.soc-ph

    Modeling, Inference, and Prediction in Mobility-Based Compartmental Models for Epidemiology

    Authors: Ning Jiang, Weiqi Chu, Yao Li

    Abstract: Classical compartmental models in epidemiology often struggle to accurately capture real-world dynamics due to their inability to address the inherent heterogeneity of populations. In this paper, we introduce a novel approach that incorporates heterogeneity through a mobility variable, transforming the traditional ODE system into a system of integro-differential equations that describe the dynamic…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 16 pages, 7 figures

  4. arXiv:2405.17401  [pdf, other]

    cs.LG cs.CV stat.ML

    RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

    Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu

    Abstract: We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution for training-free personalization of diffusion models. Existing training-free approaches exhibit difficulties in (a) style extraction from reference images in the absence of additional style or content text descriptions, (b) unwanted content leakage from reference style images, and (c) effective composition of styl…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review

  5. arXiv:2405.03141  [pdf, other]

    eess.IV cs.AI cs.CV physics.med-ph

    Automatic Ultrasound Curve Angle Measurement via Affinity Clustering for Adolescent Idiopathic Scoliosis Evaluation

    Authors: Yihao Zhou, Timothy Tin-Yan Lee, Kelly Ka-Lee Lai, Chonglin Wu, Hin Ting Lau, De Yang, Chui-Yi Chan, Winnie Chiu-Wing Chu, Jack Chun-Yiu Cheng, Tsz-** Lam, Yong-** Zheng

    Abstract: The current clinical gold standard for evaluating adolescent idiopathic scoliosis (AIS) is X-ray radiography, using Cobb angle measurement. However, the frequent monitoring of the AIS progression using X-rays poses a challenge due to the cumulative radiation exposure. Although 3D ultrasound has been validated as a reliable and radiation-free alternative for scoliosis assessment, the process of mea…

    Submitted 6 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  6. arXiv:2405.02280  [pdf, other]

    cs.CV

    DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

    Authors: Wen-Hsuan Chu, Lei Ke, Katerina Fragkiadaki

    Abstract: View-predictive generative models provide strong priors for lifting object-centric images and videos into 3D and 4D through rendering and score distillation objectives. A question then remains: what about lifting complete multi-object dynamic scenes? There are two challenges in this direction: First, rendering error gradients are often insufficient to recover fast object motion, and second, view p…

    Submitted 23 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Project page: https://dreamscene4d.github.io/

  7. arXiv:2403.18270  [pdf, other]

    cs.CV eess.IV

    Image Deraining via Self-supervised Reinforcement Learning

    Authors: He-Hao Liao, Yan-Tsung Peng, Wen-Tao Chu, **-Chun Hsieh, Chung-Chi Tsai

    Abstract: The quality of images captured outdoors is often affected by the weather. One factor that interferes with sight is rain, which can obstruct the view of observers and computer vision applications that rely on those images. The work aims to recover rain images by removing rain streaks via Self-supervised Reinforcement Learning (RL) for image deraining (SRL-Derain). We locate rain streak pixels from…

    Submitted 27 March, 2024; originally announced March 2024.

  8. arXiv:2403.02329  [pdf, other]

    cs.LG cs.CR cs.CV

    COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks

    Authors: Zijian Huang, Wenda Chu, Linyi Li, Chejian Xu, Bo Li

    Abstract: Multi-sensor fusion systems (MSFs) play a vital role as the perception module in modern autonomous vehicles (AVs). Therefore, ensuring their robustness against common and realistic adversarial semantic transformations, such as rotation and shifting in the physical world, is crucial for the safety of AVs. While empirical evidence suggests that MSFs exhibit improved robustness compared to single-mod…

    Submitted 4 March, 2024; originally announced March 2024.

  9. arXiv:2402.16124  [pdf, other]

    cs.CV

    AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation

    Authors: Yasheng Sun, Wenqing Chu, Hang Zhou, Kaisiyuan Wang, Hideki Koike

    Abstract: While considerable progress has been made in achieving accurate lip synchronization for 3D speech-driven talking face generation, the task of incorporating expressive facial detail synthesis aligned with the speaker's speaking status remains challenging. Our goal is to directly leverage the inherent style information conveyed by human speech for generating an expressive talking face that aligns wi…

    Submitted 25 February, 2024; originally announced February 2024.

  10. arXiv:2402.13297  [pdf, other]

    q-bio.QM cs.AI

    Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences

    Authors: Zhanglu Yan, Weiran Chu, Yuhua Sheng, Kaiwen Tang, Shida Wang, Yanfeng Liu, Weng-Fai Wong

    Abstract: N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization such as rational design and statistics-guided approaches are labor-intensive and yield only relatively small improvements. T…

    Submitted 20 February, 2024; originally announced February 2024.

  11. arXiv:2402.06599  [pdf, other]

    cs.CV cs.AI

    On the Out-Of-Distribution Generalization of Multimodal Large Language Models

    Authors: Xingxuan Zhang, Jiansheng Li, Wen**g Chu, Junjia Hai, Renzhe Xu, Yuqing Yang, Shikai Guan, Jiazheng Xu, Peng Cui

    Abstract: We investigate the generalization boundaries of current Multimodal Large Language Models (MLLMs) via comprehensive evaluation under out-of-distribution scenarios and domain-specific tasks. We evaluate their zero-shot generalization across synthetic images, real-world distributional shifts, and specialized datasets like medical and molecular imagery. Empirical results indicate that MLLMs struggle w…

    Submitted 9 February, 2024; originally announced February 2024.

  12. SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks

    Authors: Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu

    Abstract: We present a framework for learning cross-modal video representations by directly pre-training on raw data to facilitate various downstream video-text tasks. Our main contributions lie in the pre-training framework and proxy tasks. First, based on the shortcomings of two mainstream pixel-level pre-training architectures (limited applications or less efficient), we propose Shared Network Pre-traini…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted by TCSVT (IEEE Transactions on Circuits and Systems for Video Technology)

  13. arXiv:2401.15362  [pdf, other]

    cs.CV

    Transformer-based Clipped Contrastive Quantization Learning for Unsupervised Image Retrieval

    Authors: Ayush Dubey, Shiv Ram Dubey, Satish Kumar Singh, Wei-Ta Chu

    Abstract: Unsupervised image retrieval aims to learn the important visual characteristics without any given level to retrieve the similar images for a given query image. The Convolutional Neural Network (CNN)-based approaches have been extensively exploited with self-supervised contrastive learning for image hashing. However, the existing approaches suffer due to lack of effective utilization of global feat…

    Submitted 27 January, 2024; originally announced January 2024.

  14. arXiv:2401.04354  [pdf, other]

    cs.CV

    Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

    Authors: Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

    Abstract: With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important. In this paper, we address the problem of video scene recognition, whose goal is to learn a high-level video representation to classify scenes in videos. Due to the diversity and complexity of video contents in realistic scenarios, this task remains a challeng…

    Submitted 8 January, 2024; originally announced January 2024.

  15. arXiv:2312.00852  [pdf, other]

    cs.LG cs.CV stat.ML

    Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion

    Authors: Litu Rout, Yujia Chen, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu

    Abstract: Sampling from the posterior distribution poses a major computational challenge in solving inverse problems using latent diffusion models. Common methods rely on Tweedie's first-order moments, which are known to induce a quality-limiting bias. Existing second-order approximations are impractical due to prohibitive computational costs, making standard reverse diffusion processes intractable for post…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Preprint

  16. arXiv:2311.08430  [pdf, other]

    cs.LG cs.AI cs.IR

    Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

    Authors: Wei Wen, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen

    Abstract: Neural Architecture Search (NAS) has demonstrated its efficacy in computer vision and potential for ranking systems. However, prior work focused on academic problems, which are evaluated at small scale under well-controlled fixed baselines. In industry system, such as ranking system in Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Wei Wen and Kuang-Hung Liu contribute equally

  17. arXiv:2311.06791  [pdf, other]

    cs.CV

    InfMLLM: A Unified Framework for Visual-Language Tasks

    Authors: Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

    Abstract: Large language models (LLMs) have proven their remarkable versatility in handling a comprehensive range of language-centric applications. To expand LLMs' capabilities to a broader spectrum of modal inputs, multimodal large language models (MLLMs) have attracted growing interest. This work delves into enabling LLMs to tackle more vision-language-related tasks, particularly image captioning, visual…

    Submitted 6 December, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: 8

  18. arXiv:2310.06992  [pdf, other]

    cs.CV

    Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models

    Authors: Wen-Hsuan Chu, Adam W. Harley, Pavel Tokmakov, Achal Dave, Leonidas Guibas, Katerina Fragkiadaki

    Abstract: Object tracking is central to robot perception and scene understanding. Tracking-by-detection has long been a dominant paradigm for object tracking of specific object categories. Recently, large-scale pre-trained models have shown promising advances in detecting and segmenting objects and parts in 2D static images in the wild. This begs the question: can we re-purpose these large-scale pre-trained…

    Submitted 25 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Project page available at https://wenhsuanchu.github.io/ovtracktor/

  19. arXiv:2309.15458  [pdf, other]

    cs.AI cs.SC

    LogicMP: A Neuro-symbolic Approach for Encoding First-order Logic Constraints

    Authors: Weidi Xu, **gwei Wang, Lele Xie, Jianshan He, Hongting Zhou, Taifeng Wang, Xiaopei Wan, **gdong Chen, Chao Qu, Wei Chu

    Abstract: Integrating first-order logic constraints (FOLCs) with neural networks is a crucial but challenging problem since it involves modeling intricate correlations to satisfy the constraints. This paper proposes a novel neural layer, LogicMP, whose layers perform mean-field variational inference over an MLN. It can be plugged into any off-the-shelf neural network to encode FOLCs while retaining modulari…

    Submitted 16 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: 28 pages, 14 figures, 12 tables

  20. Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

    Authors: Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

    Abstract: With the explosive growth of web videos in recent years, large-scale Content-Based Video Retrieval (CBVR) becomes increasingly essential in video filtering, recommendation, and copyright protection. Segment-level CBVR (S-CBVR) locates the start and end time of similar segments in finer granularity, which is beneficial for user browsing efficiency and infringement detection especially in long video…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted by ACM MM 2021

  21. arXiv:2309.11082  [pdf, other]

    cs.CV cs.CL cs.MM

    Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

    Authors: Chen Jiang, Hong Liu, Xuzheng Yu, Qing Wang, Yuan Cheng, Jia Xu, Zhongyi Liu, Qingpei Guo, Wei Chu, Ming Yang, Yuan Qi

    Abstract: In recent years, the explosion of web videos makes text-video retrieval increasingly essential and popular for video filtering, recommendation, and search. Text-video retrieval aims to rank relevant text/video higher than irrelevant ones. The core of this task is to precisely measure the cross-modal similarity between texts and videos. Recently, contrastive learning methods have shown promising re…

    Submitted 26 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted by ACM MM 2023

  22. arXiv:2309.08825  [pdf, other]

    cs.LG cs.AI

    Distributionally Robust Post-hoc Classifiers under Prior Shifts

    Authors: Jiaheng Wei, Harikrishna Narasimhan, Ehsan Amid, Wen-Sheng Chu, Yang Liu, Abhishek Kumar

    Abstract: The generalization ability of machine learning models degrades significantly when the test distribution shifts away from the training distribution. We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors. The presence of skewed training priors can often lead to the models overfitting to spurious features. Unlike…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Camera ready version, accepted at ICLR 2023

  23. Dynamic Frame Interpolation in Wavelet Domain

    Authors: Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Ying Tai, Chengjie Wang, Jie Yang

    Abstract: Video frame interpolation is an important low-level vision task, which can increase frame rate for more fluent visual experience. Existing methods have achieved great success by employing advanced motion models and synthesis networks. However, the spatial redundancy when synthesizing the target frame has not been fully explored, that can result in lots of inefficient computation. On the other hand…

    Submitted 20 September, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE TIP

  24. arXiv:2309.00398  [pdf, other]

    cs.CV cs.MM

    VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation

    Authors: Xin Li, Wenqing Chu, Ye Wu, Weihang Yuan, Fanglong Liu, Qi Zhang, Fu Li, Haocheng Feng, Errui Ding, **gdong Wang

    Abstract: In this paper, we present VideoGen, a text-to-video generation approach, which can generate a high-definition video with high frame fidelity and strong temporal consistency using reference-guided latent diffusion. We leverage an off-the-shelf text-to-image generation model, e.g., Stable Diffusion, to generate an image with high content quality from the text prompt, as a reference image to guide vi…

    Submitted 7 September, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: 8 pages, 8 figures, project page: https://videogen.github.io/VideoGen/

  25. arXiv:2307.02736  [pdf]

    physics.med-ph cs.CV

    An Uncertainty Aided Framework for Learning based Liver $T_1ρ$ Mapping and Analysis

    Authors: Chaoxing Huang, Vincent Wai Sun Wong, Queenie Chan, Winnie Chiu Wing Chu, Weitian Chen

    Abstract: Objective: Quantitative $T_1ρ$ imaging has potential for assessment of biochemical alterations of liver pathologies. Deep learning methods have been employed to accelerate quantitative $T_1ρ$ imaging. To employ artificial intelligence-based quantitative imaging methods in complicated clinical environment, it is valuable to estimate the uncertainty of the predicted $T_1ρ$ values to provide the con…

    Submitted 9 October, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  26. arXiv:2307.01778  [pdf, other]

    cs.CV cs.AI cs.CR

    Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling

    Authors: Zhanhao Hu, Wenda Chu, Xiaopei Zhu, Hui Zhang, Bo Zhang, Xiaolin Hu

    Abstract: Recent works have proposed to craft adversarial clothes for evading person detectors, while they are either only effective at limited viewing angles or very conspicuous to humans. We aim to craft adversarial texture for clothes based on 3D modeling, an idea that has been used to craft rigid adversarial objects such as a 3D-printed turtle. Unlike rigid objects, humans and clothes are non-rigid, lea…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: Accepted by CVPR 2023

  27. arXiv:2306.14182  [pdf, other]

    cs.CV cs.AI

    Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input

    Authors: Qingpei Guo, Kaisheng Yao, Wei Chu

    Abstract: The ability to model intra-modal and inter-modal interactions is fundamental in multimodal machine learning. The current state-of-the-art models usually adopt deep learning models with fixed structures. They can achieve exceptional performances on specific tasks, but face a particularly challenging problem of modality mismatch because of diversity of input modalities and their fixed structures. In…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: Accepted by ECCV2022

  28. arXiv:2305.02610  [pdf, other]

    cs.CV

    Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval

    Authors: Tan Pan, Furong Xu, Xudong Yang, Sifeng He, Chen Jiang, Qingpei Guo, Feng Qian, Xiaobo Zhang, Yuan Cheng, Lei Yang, Wei Chu

    Abstract: Image retrieval plays an important role in the Internet world. Usually, the core parts of mainstream visual retrieval systems include an online service of the embedding model and a large-scale vector database. For traditional model upgrades, the old model will not be replaced by the new one until the embeddings of all the images in the database are re-computed by the new model, which takes days or…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: accepted by CVPR 2023

  29. arXiv:2305.02572  [pdf, other]

    cs.CV

    High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

    Authors: Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong Liu

    Abstract: Recently, emotional talking face generation has received considerable attention. However, existing methods only adopt one-hot coding, image, or audio as emotion conditions, thus lacking flexible control in practical applications and failing to handle unseen emotion styles due to limited semantics. They either ignore the one-shot setting or the quality of generated faces. In this paper, we propose…

    Submitted 30 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

  30. A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition

    Authors: Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan

    Abstract: Recently, end-to-end models have been widely used in automatic speech recognition (ASR) systems. Two of the most representative approaches are connectionist temporal classification (CTC) and attention-based encoder-decoder (AED) models. Autoregressive transformers, variants of AED, adopt an autoregressive mechanism for token generation and thus are relatively slow during inference. In this paper,…

    Submitted 15 April, 2023; originally announced April 2023.

    Comments: Published in IEEE Transactions on Audio, Speech, and Language Processing

  31. arXiv:2304.06662  [pdf, other]

    eess.IV cs.CV

    Deep Learning in Breast Cancer Imaging: A Decade of Progress and Future Directions

    Authors: Luyang Luo, Xi Wang, Yi Lin, Xiaoqi Ma, Andong Tan, Ronald Chan, Varut Vardhanabhuti, Winnie CW Chu, Kwang-Ting Cheng, Hao Chen

    Abstract: Breast cancer has reached the highest incidence rate worldwide among all malignancies since 2020. Breast imaging plays a significant role in early diagnosis and intervention to improve the outcome of breast cancer patients. In the past decade, deep learning has shown remarkable progress in breast cancer imaging analysis, holding great promise in interpreting the rich information and complex contex…

    Submitted 20 January, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: IEEE RBME 2024

  32. arXiv:2303.13662  [pdf, other]

    cs.CV

    Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment

    Authors: Yiyou Sun, Yaojie Liu, Xiaoming Liu, Yixuan Li, Wen-Sheng Chu

    Abstract: This work studies the generalization issue of face anti-spoofing (FAS) models on domain gaps, such as image resolution, blurriness and sensor variations. Most prior works regard domain-specific signals as a negative impact, and apply metric learning or adversarial losses to remove them from feature representation. Though learning a domain-invariant feature space is viable for the training data, we…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR2023

  33. arXiv:2302.14335  [pdf, other]

    cs.CV

    DC-Former: Diverse and Compact Transformer for Person Re-Identification

    Authors: Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng, Yuan Cheng, Wei Chu

    Abstract: In person re-identification (re-ID) task, it is still challenging to learn discriminative representation by deep learning, due to limited data. Generally speaking, the model will get better performance when increasing the amount of data. The addition of similar classes strengthens the ability of the classifier to identify similar identities, thereby improving the discrimination of representation.…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted by AAAI23

  34. arXiv:2302.06637  [pdf, other]

    cs.LG cs.AI

    PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

    Authors: Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

    Abstract: Personalized Federated Learning (pFL) has emerged as a promising solution to tackle data heterogeneity across clients in FL. However, existing pFL methods either (1) introduce high communication and computation costs or (2) overfit to local data, which can be limited in scope, and are vulnerable to evolved test samples with natural shifts. In this paper, we propose PerAda, a parameter-efficient pF…

    Submitted 6 April, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: CVPR 2024

  35. arXiv:2302.05083  [pdf, other]

    cs.LG

    DRGCN: Dynamic Evolving Initial Residual for Deep Graph Convolutional Networks

    Authors: Lei Zhang, Xiaodong Yan, Jianshan He, Ruopeng Li, Wei Chu

    Abstract: Graph convolutional networks (GCNs) have been proved to be very practical to handle various graph-related tasks. It has attracted considerable research interest to study deep GCNs, due to their potential superior performance compared with shallow ones. However, simply increasing network depth will, on the contrary, hurt the performance due to the over-smoothing problem. Adding residual connection…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: 8 pages, Accepted by Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)

  36. arXiv:2302.00848  [pdf, other]

    cs.LG stat.ME stat.ML

    Causal Effect Estimation: Recent Advances, Challenges, and Opportunities

    Authors: Zhixuan Chu, Jianmin Huang, Ruopeng Li, Wei Chu, Sheng Li

    Abstract: Causal inference has numerous real-world applications in many domains, such as health care, marketing, political science, and online advertising. Treatment effect estimation, a fundamental problem in causal inference, has been extensively studied in statistics for decades. However, traditional treatment effect estimation methods may not well handle large-scale and high-dimensional heterogeneous da…

    Submitted 1 February, 2023; originally announced February 2023.

  37. arXiv:2212.14489  [pdf, other]

    cs.SI math.OC math.ST physics.soc-ph

    Inference of interaction kernels in mean-field models of opinion dynamics

    Authors: Weiqi Chu, Qin Li, Mason A. Porter

    Abstract: In models of opinion dynamics, many parameters -- either in the form of constants or in the form of functions -- play a critical role in describing, calibrating, and forecasting how opinions change with time. When examining a model of opinion dynamics, it is beneficial to infer its parameters using empirical data. In this paper, we study an example of such an inference problem. We consider a mean-…

    Submitted 26 October, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

    Comments: 20 pages, 3 figures

    MSC Class: 91D30; 35R30; 45Q05; 65K10

  38. arXiv:2211.00509  [pdf, other]

    cs.CV

    Self-Supervised Intensity-Event Stereo Matching

    Authors: **** Gu, **an Zhou, Ringo Sai Wo Chu, Yan Chen, Jiawei Zhang, Xuanye Cheng, Song Zhang, Jimmy S. Ren

    Abstract: Event cameras are novel bio-inspired vision sensors that output pixel-level intensity changes in microsecond accuracy with a high dynamic range and low power consumption. Despite these advantages, event cameras cannot be directly applied to computational imaging tasks due to the inability to obtain high-quality intensity and events simultaneously. This paper aims to connect a standalone event came…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: This paper has been accepted by the Journal of Imaging Science & Technology

  39. arXiv:2208.12787  [pdf, other]

    physics.soc-ph cs.SI math.DS math.PR nlin.AO

    Non-Markovian models of opinion dynamics on temporal networks

    Authors: Weiqi Chu, Mason A. Porter

    Abstract: Traditional models of opinion dynamics, in which the nodes of a network change their opinions based on their interactions with neighboring nodes, consider how opinions evolve either on time-independent networks or on temporal networks with edges that follow Poisson statistics. Most such models are Markovian. However, in many real-life networks, interactions between individuals (and hence the edges…

    Submitted 10 March, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: 24 pages, 7 figures

    MSC Class: 91D30; 37H10; 05C80

  40. arXiv:2207.11669  [pdf, other]

    cs.DC cs.LG

    Distributed Robust Principal Component Analysis

    Authors: Wenda Chu

    Abstract: We study the robust principal component analysis (RPCA) problem in a distributed setting. The goal of RPCA is to find an underlying low-rank estimation for a raw data matrix when the data matrix is subject to the corruption of gross sparse errors. Previous studies have developed RPCA algorithms that provide stable solutions with fast convergence. However, these algorithms are typically hard to sca…

    Submitted 12 August, 2022; v1 submitted 24 July, 2022; originally announced July 2022.

    Comments: 13 pages

  41. arXiv:2207.10315  [pdf, other]

    cs.CV

    SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer

    Authors: Haoran Zhou, Yun Cao, Wenqing Chu, Junwei Zhu, Tong Lu, Ying Tai, Chengjie Wang

    Abstract: Point cloud completion has become increasingly popular among generation tasks of 3D point clouds, as it is a challenging yet indispensable problem to recover the complete shape of a 3D object from its partial observation. In this paper, we propose a novel SeedFormer to improve the ability of detail preservation and recovery in point cloud completion. Unlike previous methods based on a global featu…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: Camera-ready, to be published in ECCV 2022, with supplementary material

  42. arXiv:2207.10265  [pdf, other]

    cs.LG cs.CY

    FOCUS: Fairness via Agent-Awareness for Federated Learning on Heterogeneous Data

    Authors: Wenda Chu, Chulin Xie, Boxin Wang, Linyi Li, Lang Yin, Arash Nourian, Han Zhao, Bo Li

    Abstract: Federated learning (FL) allows agents to jointly train a global model without sharing their local data. However, due to the heterogeneous nature of local data, it is challenging to optimize or even define fairness of the trained global model for the agents. For instance, existing work usually considers accuracy equity as fairness for different agents in FL, which is limited, especially under the h…

    Submitted 15 November, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

  43. arXiv:2207.03105  [pdf]

    q-bio.TO cs.CV eess.IV physics.med-ph

    Uncertainty-Aware Self-supervised Neural Network for Liver $T_{1ρ}$ Mapping with Relaxation Constraint

    Authors: Chaoxing Huang, Yurui Qian, Simon Chun Ho Yu, Jian Hou, Baiyan Jiang, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

    Abstract: $T_{1ρ}$ mapping is a promising quantitative MRI technique for the non-invasive assessment of tissue properties. Learning-based approaches can map $T_{1ρ}$ from a reduced number of $T_{1ρ}$ weighted images, but require significant amounts of high-quality training data. Moreover, existing methods do not provide the confidence level of the $T_{1ρ}…

    Submitted 25 October, 2022; v1 submitted 7 July, 2022; originally announced July 2022.

    Comments: Provisionally accepted by Physics in Medicine and Biology

  44. arXiv:2205.14620  [pdf, other]

    cs.CV

    IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation

    Authors: Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Xiaoming Huang, Ying Tai, Chengjie Wang, Jie Yang

    Abstract: Prevailing video frame interpolation algorithms, which generate the intermediate frames from consecutive inputs, typically rely on complex model architectures with heavy parameters or large delay, hindering them from diverse real-time applications. In this work, we devise an efficient encoder-decoder based network, termed IFRNet, for fast intermediate frame synthesis. It first extracts pyramid f…

    Submitted 29 May, 2022; originally announced May 2022.

    Comments: Accepted by CVPR 2022

  45. arXiv:2205.13346  [pdf, other]

    cs.CL

    Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation

    Authors: Mingzhe Li, XieXiong Lin, Xiuying Chen, Jinxiong Chang, Qishen Zhang, Feng Wang, Taifeng Wang, Zhongyi Liu, Wei Chu, Dongyan Zhao, Rui Yan

    Abstract: Contrastive learning has achieved impressive success in generation tasks to mitigate the "exposure bias" problem and discriminatively exploit the different quality of references. Existing works mostly focus on contrastive learning at the instance level without discriminating the contribution of each word, while keywords are the gist of the text and dominate the constrained mapping relationships. H…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted by ACL2022

  46. arXiv:2204.13957  [pdf, other]

    cs.CL cs.AI

    PIE: a Parameter and Inference Efficient Solution for Large Scale Knowledge Graph Embedding Reasoning

    Authors: Linlin Chao, Xiexiong Lin, Taifeng Wang, Wei Chu

    Abstract: Knowledge graph (KG) embedding methods, which map entities and relations to unique embeddings in the KG, have shown promising results on many reasoning tasks. However, using the same embedding dimension for both dense entities and sparse entities will cause either over-parameterization (sparse entities) or underfitting (dense entities). Normally, a large dimension is set to get better performance. Meanwh…

    Submitted 4 May, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

  47. ESCM$^2$: Entire Space Counterfactual Multi-Task Model for Post-Click Conversion Rate Estimation

    Authors: Hao Wang, Tai-Wei Chang, Tianqiao Liu, Jianmin Huang, Zhichao Chen, Chao Yu, Ruopeng Li, Wei Chu

    Abstract: Accurate estimation of post-click conversion rate is critical for building recommender systems, which have long been confronted with sample selection bias and data sparsity issues. Methods in the Entire Space Multi-task Model (ESMM) family leverage the sequential pattern of user actions, i.e. $impression\rightarrow click \rightarrow conversion$, to address the data sparsity issue. However, they still fa…

    Submitted 23 May, 2022; v1 submitted 3 April, 2022; originally announced April 2022.

    Journal ref: SIGIR 2022

  48. arXiv:2203.12189  [pdf, other]

    cs.SI eess.SY math.DS math.PR physics.soc-ph

    A density description of a bounded-confidence model of opinion dynamics on hypergraphs

    Authors: Weiqi Chu, Mason A. Porter

    Abstract: Social interactions often occur between three or more agents simultaneously. Examining opinion dynamics on hypergraphs allows one to study the effect of such polyadic interactions on the opinions of agents. In this paper, we consider a bounded-confidence model (BCM), in which opinions take continuous values and interacting agents compromise their opinions if they are close enough to each other. We s…

    Submitted 27 April, 2023; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: 19 pages, 4 figures

    MSC Class: 91D30; 05C65; 45J05

  49. arXiv:2203.12175  [pdf, other]

    cs.CV

    Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing

    Authors: Hsin-Ping Huang, Deqing Sun, Yaojie Liu, Wen-Sheng Chu, Taihong Xiao, Jinwei Yuan, Hartwig Adam, Ming-Hsuan Yang

    Abstract: While recent face anti-spoofing methods perform well under intra-domain setups, an effective approach needs to account for much larger appearance variations of images acquired in complex scenes with different sensors for robust performance. In this paper, we present adaptive vision transformers (ViT) for robust cross-domain face anti-spoofing. Specifically, we adopt ViT as a backbone to exploit…

    Submitted 28 July, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

  50. arXiv:2203.06696  [pdf, other]

    cs.CV

    Training Protocol Matters: Towards Accurate Scene Text Recognition via Training Protocol Searching

    Authors: Xiaojie Chu, Yongtao Wang, Chunhua Shen, Jingdong Chen, Wei Chu

    Abstract: The development of scene text recognition (STR) in the era of deep learning has been mainly focused on novel architectures of STR models. However, the training protocol (i.e., settings of the hyper-parameters involved in the training of STR models), which plays an equally important role in successfully training a good STR model, is under-explored for scene text recognition. In this work, we attempt to…

    Submitted 16 March, 2022; v1 submitted 13 March, 2022; originally announced March 2022.