Showing 1–50 of 391 results for author: Chu, W

  1. arXiv:2407.01926  [pdf]

    physics.med-ph cs.CV

    Chemical Shift Encoding based Double Bonds Quantification in Triglycerides using Deep Image Prior

    Authors: Chaoxing Huang, Ziqiang Yu, Zijian Gao, Qiuyi Shen, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

    Abstract: This study evaluated a deep learning-based method using Deep Image Prior (DIP) to quantify triglyceride double bonds from chemical-shift encoded multi-echo gradient echo images without network training. We employed a cost function based on signal constraints to iteratively update the neural network on a single dataset. The method was validated using phantom experiments and in vivo scans. Results s…

    Submitted 3 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.01521  [pdf, other]

    cs.LG cs.AI cs.CV

    Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing

    Authors: Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song

    Abstract: Diffusion models have recently achieved success in solving Bayesian inverse problems with learned data priors. Current methods build on top of the diffusion sampling process, where each denoising step makes small modifications to samples from the previous step. However, this process struggles to correct errors from earlier sampling steps, leading to worse performance in complicated nonlinear inver…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2406.12002  [pdf, other]

    q-bio.PE cs.LG math.NA physics.soc-ph

    Modeling, Inference, and Prediction in Mobility-Based Compartmental Models for Epidemiology

    Authors: Ning Jiang, Weiqi Chu, Yao Li

    Abstract: Classical compartmental models in epidemiology often struggle to accurately capture real-world dynamics due to their inability to address the inherent heterogeneity of populations. In this paper, we introduce a novel approach that incorporates heterogeneity through a mobility variable, transforming the traditional ODE system into a system of integro-differential equations that describe the dynamic…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 16 pages, 7 figures

  4. arXiv:2405.17401  [pdf, other]

    cs.LG cs.CV stat.ML

    RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

    Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu

    Abstract: We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution for training-free personalization of diffusion models. Existing training-free approaches exhibit difficulties in (a) style extraction from reference images in the absence of additional style or content text descriptions, (b) unwanted content leakage from reference style images, and (c) effective composition of styl…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review

  5. arXiv:2405.03141  [pdf, other]

    eess.IV cs.AI cs.CV physics.med-ph

    Automatic Ultrasound Curve Angle Measurement via Affinity Clustering for Adolescent Idiopathic Scoliosis Evaluation

    Authors: Yihao Zhou, Timothy Tin-Yan Lee, Kelly Ka-Lee Lai, Chonglin Wu, Hin Ting Lau, De Yang, Chui-Yi Chan, Winnie Chiu-Wing Chu, Jack Chun-Yiu Cheng, Tsz-Ping Lam, Yong-Ping Zheng

    Abstract: The current clinical gold standard for evaluating adolescent idiopathic scoliosis (AIS) is X-ray radiography, using Cobb angle measurement. However, the frequent monitoring of the AIS progression using X-rays poses a challenge due to the cumulative radiation exposure. Although 3D ultrasound has been validated as a reliable and radiation-free alternative for scoliosis assessment, the process of mea…

    Submitted 6 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  6. arXiv:2405.02280  [pdf, other]

    cs.CV

    DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos

    Authors: Wen-Hsuan Chu, Lei Ke, Katerina Fragkiadaki

    Abstract: View-predictive generative models provide strong priors for lifting object-centric images and videos into 3D and 4D through rendering and score distillation objectives. A question then remains: what about lifting complete multi-object dynamic scenes? There are two challenges in this direction: First, rendering error gradients are often insufficient to recover fast object motion, and second, view p…

    Submitted 23 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: Project page: https://dreamscene4d.github.io/

  7. arXiv:2404.09891  [pdf, ps, other]

    math.CO

    Convolution Identities of Stirling Numbers

    Authors: Nadia Na Li, Wenchang Chu

    Abstract: By means of the generating function method, a linear recurrence relation is explicitly resolved. The solution is expressed in terms of the Stirling numbers of both the first and the second kind. Two remarkable pairs of combinatorial identities are established as applications, that contain some well-known convolution formulae on Stirling numbers as special cases.

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 7 pages

    MSC Class: 05A10; 11B65

  8. arXiv:2403.18270  [pdf, other]

    cs.CV eess.IV

    Image Deraining via Self-supervised Reinforcement Learning

    Authors: He-Hao Liao, Yan-Tsung Peng, Wen-Tao Chu, Ping-Chun Hsieh, Chung-Chi Tsai

    Abstract: The quality of images captured outdoors is often affected by the weather. One factor that interferes with sight is rain, which can obstruct the view of observers and computer vision applications that rely on those images. The work aims to recover rain images by removing rain streaks via Self-supervised Reinforcement Learning (RL) for image deraining (SRL-Derain). We locate rain streak pixels from…

    Submitted 27 March, 2024; originally announced March 2024.

  9. arXiv:2403.02329  [pdf, other]

    cs.LG cs.CR cs.CV

    COMMIT: Certifying Robustness of Multi-Sensor Fusion Systems against Semantic Attacks

    Authors: Zijian Huang, Wenda Chu, Linyi Li, Chejian Xu, Bo Li

    Abstract: Multi-sensor fusion systems (MSFs) play a vital role as the perception module in modern autonomous vehicles (AVs). Therefore, ensuring their robustness against common and realistic adversarial semantic transformations, such as rotation and shifting in the physical world, is crucial for the safety of AVs. While empirical evidence suggests that MSFs exhibit improved robustness compared to single-mod…

    Submitted 4 March, 2024; originally announced March 2024.

  10. arXiv:2402.16124  [pdf, other]

    cs.CV

    AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation

    Authors: Yasheng Sun, Wenqing Chu, Hang Zhou, Kaisiyuan Wang, Hideki Koike

    Abstract: While considerable progress has been made in achieving accurate lip synchronization for 3D speech-driven talking face generation, the task of incorporating expressive facial detail synthesis aligned with the speaker's speaking status remains challenging. Our goal is to directly leverage the inherent style information conveyed by human speech for generating an expressive talking face that aligns wi…

    Submitted 25 February, 2024; originally announced February 2024.

  11. arXiv:2402.13297  [pdf, other]

    q-bio.QM cs.AI

    Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences

    Authors: Zhanglu Yan, Weiran Chu, Yuhua Sheng, Kaiwen Tang, Shida Wang, Yanfeng Liu, Weng-Fai Wong

    Abstract: N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization such as rational design and statistics-guided approaches are labor-intensive and yield only relatively small improvements. T…

    Submitted 20 February, 2024; originally announced February 2024.

  12. arXiv:2402.06599  [pdf, other]

    cs.CV cs.AI

    On the Out-Of-Distribution Generalization of Multimodal Large Language Models

    Authors: Xingxuan Zhang, Jiansheng Li, Wenjing Chu, Junjia Hai, Renzhe Xu, Yuqing Yang, Shikai Guan, Jiazheng Xu, Peng Cui

    Abstract: We investigate the generalization boundaries of current Multimodal Large Language Models (MLLMs) via comprehensive evaluation under out-of-distribution scenarios and domain-specific tasks. We evaluate their zero-shot generalization across synthetic images, real-world distributional shifts, and specialized datasets like medical and molecular imagery. Empirical results indicate that MLLMs struggle w…

    Submitted 9 February, 2024; originally announced February 2024.

  13. SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks

    Authors: Xingning Dong, Qingpei Guo, Tian Gan, Qing Wang, Jianlong Wu, Xiangyuan Ren, Yuan Cheng, Wei Chu

    Abstract: We present a framework for learning cross-modal video representations by directly pre-training on raw data to facilitate various downstream video-text tasks. Our main contributions lie in the pre-training framework and proxy tasks. First, based on the shortcomings of two mainstream pixel-level pre-training architectures (limited applications or less efficient), we propose Shared Network Pre-traini…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: Accepted by TCSVT (IEEE Transactions on Circuits and Systems for Video Technology)

  14. arXiv:2401.15362  [pdf, other]

    cs.CV

    Transformer-based Clipped Contrastive Quantization Learning for Unsupervised Image Retrieval

    Authors: Ayush Dubey, Shiv Ram Dubey, Satish Kumar Singh, Wei-Ta Chu

    Abstract: Unsupervised image retrieval aims to learn the important visual characteristics without any given label to retrieve the similar images for a given query image. The Convolutional Neural Network (CNN)-based approaches have been extensively exploited with self-supervised contrastive learning for image hashing. However, the existing approaches suffer due to lack of effective utilization of global feat…

    Submitted 27 January, 2024; originally announced January 2024.

  15. Pressure-induced superconductivity in a novel germanium allotrope

    Authors: Liangzi Deng, Jianbo Zhang, Yuki Sakai, Zhongjia Tang, Moein Adnani, Rabin Dahal, Alexander P. Litvinchuk, James R. Chelikowsky, Marvin L. Cohen, Russell J. Hemley, Arnold Guloy, Yang Ding, Ching-Wu Chu

    Abstract: High-pressure studies on elements play an essential role in superconductivity research, with implications for both fundamental science and applications. Here we report the experimental discovery of surprisingly low pressure driving a novel germanium allotrope into a superconducting state in comparison to that for alpha-Ge. Raman measurements revealed structural phase transitions and possible elect…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 30 pages, 13 figures

  16. arXiv:2401.04354  [pdf, other]

    cs.CV

    Knowledge-enhanced Multi-perspective Video Representation Learning for Scene Recognition

    Authors: Xuzheng Yu, Chen Jiang, Wei Zhang, Tian Gan, Linlin Chao, Jianan Zhao, Yuan Cheng, Qingpei Guo, Wei Chu

    Abstract: With the explosive growth of video data in real-world applications, a comprehensive representation of videos becomes increasingly important. In this paper, we address the problem of video scene recognition, whose goal is to learn a high-level video representation to classify scenes in videos. Due to the diversity and complexity of video contents in realistic scenarios, this task remains a challeng…

    Submitted 8 January, 2024; originally announced January 2024.

  17. arXiv:2312.00852  [pdf, other]

    cs.LG cs.CV stat.ML

    Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion

    Authors: Litu Rout, Yujia Chen, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu

    Abstract: Sampling from the posterior distribution poses a major computational challenge in solving inverse problems using latent diffusion models. Common methods rely on Tweedie's first-order moments, which are known to induce a quality-limiting bias. Existing second-order approximations are impractical due to prohibitive computational costs, making standard reverse diffusion processes intractable for post…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Preprint

  18. arXiv:2311.08430  [pdf, other]

    cs.LG cs.AI cs.IR

    Rankitect: Ranking Architecture Search Battling World-class Engineers at Meta Scale

    Authors: Wei Wen, Kuang-Hung Liu, Igor Fedorov, Xin Zhang, Hang Yin, Weiwei Chu, Kaveh Hassani, Mengying Sun, Jiang Liu, Xu Wang, Lin Jiang, Yuxin Chen, Buyun Zhang, Xi Liu, Dehua Cheng, Zhengxing Chen, Guang Zhao, Fangqiu Han, Jiyan Yang, Yuchen Hao, Liang Xiong, Wen-Yen Chen

    Abstract: Neural Architecture Search (NAS) has demonstrated its efficacy in computer vision and potential for ranking systems. However, prior work focused on academic problems, which are evaluated at small scale under well-controlled fixed baselines. In industry systems, such as the ranking system at Meta, it is unclear whether NAS algorithms from the literature can outperform production baselines because of: (1…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Wei Wen and Kuang-Hung Liu contribute equally

  19. arXiv:2311.06791  [pdf, other]

    cs.CV

    InfMLLM: A Unified Framework for Visual-Language Tasks

    Authors: Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

    Abstract: Large language models (LLMs) have proven their remarkable versatility in handling a comprehensive range of language-centric applications. To expand LLMs' capabilities to a broader spectrum of modal inputs, multimodal large language models (MLLMs) have attracted growing interest. This work delves into enabling LLMs to tackle more vision-language-related tasks, particularly image captioning, visual…

    Submitted 6 December, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: 8

  20. arXiv:2311.03558  [pdf]

    cond-mat.supr-con

    Replication and study of anomalies in LK-99--the alleged ambient-pressure, room-temperature superconductor

    Authors: T. Habamahoro, T. Bontke, M. Chirom, Z. Wu, J. M. Bao, L. Z. Deng, C. W. Chu

    Abstract: We have studied LK-99 [Pb$_{10-x}$Cu$_x$(PO$_4$)$_6$O], alleged by Lee et al. to exhibit superconductivity above room temperature and at ambient pressure, and have reproduced all anomalies in electric and magnetic measurements that they reported as evidence for the claim of LK-99 being an ambient-pressure, room-temperature superconductor. We found that these anomalies are associated with the struc…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 15 pages, 7 figures

  21. arXiv:2310.19847  [pdf, ps, other]

    math.GM

    Integrals of Hyperbolic Tangent Function

    Authors: Jing Li, Wenchang Chu

    Abstract: By means of the contour integration method, we evaluate, in closed form, a class of definite integrals involving the hyperbolic tangent function.

    Submitted 30 October, 2023; originally announced October 2023.

    MSC Class: 33E20; 11M32

  22. arXiv:2310.06992  [pdf, other]

    cs.CV

    Zero-Shot Open-Vocabulary Tracking with Large Pre-Trained Models

    Authors: Wen-Hsuan Chu, Adam W. Harley, Pavel Tokmakov, Achal Dave, Leonidas Guibas, Katerina Fragkiadaki

    Abstract: Object tracking is central to robot perception and scene understanding. Tracking-by-detection has long been a dominant paradigm for object tracking of specific object categories. Recently, large-scale pre-trained models have shown promising advances in detecting and segmenting objects and parts in 2D static images in the wild. This begs the question: can we re-purpose these large-scale pre-trained…

    Submitted 25 January, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Project page available at https://wenhsuanchu.github.io/ovtracktor/

  23. arXiv:2309.15458  [pdf, other]

    cs.AI cs.SC

    LogicMP: A Neuro-symbolic Approach for Encoding First-order Logic Constraints

    Authors: Weidi Xu, Jingwei Wang, Lele Xie, Jianshan He, Hongting Zhou, Taifeng Wang, Xiaopei Wan, Jingdong Chen, Chao Qu, Wei Chu

    Abstract: Integrating first-order logic constraints (FOLCs) with neural networks is a crucial but challenging problem since it involves modeling intricate correlations to satisfy the constraints. This paper proposes a novel neural layer, LogicMP, whose layers perform mean-field variational inference over an MLN. It can be plugged into any off-the-shelf neural network to encode FOLCs while retaining modulari…

    Submitted 16 April, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: 28 pages, 14 figures, 12 tables

  24. Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

    Authors: Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

    Abstract: With the explosive growth of web videos in recent years, large-scale Content-Based Video Retrieval (CBVR) becomes increasingly essential in video filtering, recommendation, and copyright protection. Segment-level CBVR (S-CBVR) locates the start and end time of similar segments in finer granularity, which is beneficial for user browsing efficiency and infringement detection especially in long video…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted by ACM MM 2021

  25. arXiv:2309.11082  [pdf, other]

    cs.CV cs.CL cs.MM

    Dual-Modal Attention-Enhanced Text-Video Retrieval with Triplet Partial Margin Contrastive Learning

    Authors: Chen Jiang, Hong Liu, Xuzheng Yu, Qing Wang, Yuan Cheng, Jia Xu, Zhongyi Liu, Qingpei Guo, Wei Chu, Ming Yang, Yuan Qi

    Abstract: In recent years, the explosion of web videos makes text-video retrieval increasingly essential and popular for video filtering, recommendation, and search. Text-video retrieval aims to rank relevant text/video higher than irrelevant ones. The core of this task is to precisely measure the cross-modal similarity between texts and videos. Recently, contrastive learning methods have shown promising re…

    Submitted 26 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted by ACM MM 2023

  26. arXiv:2309.08825  [pdf, other]

    cs.LG cs.AI

    Distributionally Robust Post-hoc Classifiers under Prior Shifts

    Authors: Jiaheng Wei, Harikrishna Narasimhan, Ehsan Amid, Wen-Sheng Chu, Yang Liu, Abhishek Kumar

    Abstract: The generalization ability of machine learning models degrades significantly when the test distribution shifts away from the training distribution. We investigate the problem of training models that are robust to shifts caused by changes in the distribution of class-priors or group-priors. The presence of skewed training priors can often lead to the models overfitting to spurious features. Unlike…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Camera ready version, accepted at ICLR 2023

  27. Dynamic Frame Interpolation in Wavelet Domain

    Authors: Lingtong Kong, Boyuan Jiang, Donghao Luo, Wenqing Chu, Ying Tai, Chengjie Wang, Jie Yang

    Abstract: Video frame interpolation is an important low-level vision task, which can increase frame rate for more fluent visual experience. Existing methods have achieved great success by employing advanced motion models and synthesis networks. However, the spatial redundancy when synthesizing the target frame has not been fully explored, which can result in a lot of inefficient computation. On the other hand…

    Submitted 20 September, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE TIP

  28. arXiv:2309.00398  [pdf, other]

    cs.CV cs.MM

    VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation

    Authors: Xin Li, Wenqing Chu, Ye Wu, Weihang Yuan, Fanglong Liu, Qi Zhang, Fu Li, Haocheng Feng, Errui Ding, Jingdong Wang

    Abstract: In this paper, we present VideoGen, a text-to-video generation approach, which can generate a high-definition video with high frame fidelity and strong temporal consistency using reference-guided latent diffusion. We leverage an off-the-shelf text-to-image generation model, e.g., Stable Diffusion, to generate an image with high content quality from the text prompt, as a reference image to guide vi…

    Submitted 7 September, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: 8 pages, 8 figures, project page: https://videogen.github.io/VideoGen/

  29. arXiv:2307.02736  [pdf]

    physics.med-ph cs.CV

    An Uncertainty Aided Framework for Learning based Liver $T_1ρ$ Mapping and Analysis

    Authors: Chaoxing Huang, Vincent Wai Sun Wong, Queenie Chan, Winnie Chiu Wing Chu, Weitian Chen

    Abstract: Objective: Quantitative $T_1ρ$ imaging has potential for assessment of biochemical alterations of liver pathologies. Deep learning methods have been employed to accelerate quantitative $T_1ρ$ imaging. To employ artificial intelligence-based quantitative imaging methods in complicated clinical environment, it is valuable to estimate the uncertainty of the predicted $T_1ρ$ values to provide the con…

    Submitted 9 October, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

  30. arXiv:2307.01778  [pdf, other]

    cs.CV cs.AI cs.CR

    Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling

    Authors: Zhanhao Hu, Wenda Chu, Xiaopei Zhu, Hui Zhang, Bo Zhang, Xiaolin Hu

    Abstract: Recent works have proposed to craft adversarial clothes for evading person detectors, while they are either only effective at limited viewing angles or very conspicuous to humans. We aim to craft adversarial texture for clothes based on 3D modeling, an idea that has been used to craft rigid adversarial objects such as a 3D-printed turtle. Unlike rigid objects, humans and clothes are non-rigid, lea…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: Accepted by CVPR 2023

  31. arXiv:2307.01490  [pdf]

    physics.optics

    Bistable scattering of nano-silicon for super-linear super-resolution imaging

    Authors: Po-Hsueh Tseng, Kentaro Nishida, Pang-Han Wu, Yu-Lung Tang, Yu-Chieh Chen, Chi-Yin Yang, Jhen-Hong Yang, Wei-Ruei Chen, Olesiya Pashina, Mihail Petrov, Kuo-Ping Chen, Shi-Wei Chu

    Abstract: Optical bistability is fundamental for all-optical switches, but typically requires high-Q cavities with micrometer sizes. Through boosting nonlinearity with photo-thermo-optical effects, we achieve bistability in a silicon Mie resonator with a volume size of $10^{-3}$ μm$^3$ and Q-factor < 10, both are record-low. Furthermore, bistable scattering naturally leads to large super-linear emission-excitation…

    Submitted 4 July, 2023; originally announced July 2023.

  32. arXiv:2306.16602  [pdf, other]

    physics.flu-dyn math.AP

    An electro-hydrodynamics modeling of droplet actuation on solid surface by surfactant-mediated electro-dewetting

    Authors: Weiqi Chu, Hangjie Ji, Qining Wang, Chang-Jin "CJ" Kim, Andrea L. Bertozzi

    Abstract: We propose an electro-hydrodynamics model to describe the dynamic evolution of a slender drop containing a dilute ionic surfactant on a naturally wettable surface, with a varying external electric field. This unified model reproduces fundamental microfluidic operations controlled by electrical signals, including dewetting, rewetting, and droplet shifting. In this paper, lubrication theory analysis…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: 16 pages, 13 figures

  33. arXiv:2306.14182  [pdf, other]

    cs.CV cs.AI

    Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input

    Authors: Qingpei Guo, Kaisheng Yao, Wei Chu

    Abstract: The ability to model intra-modal and inter-modal interactions is fundamental in multimodal machine learning. The current state-of-the-art models usually adopt deep learning models with fixed structures. They can achieve exceptional performances on specific tasks, but face a particularly challenging problem of modality mismatch because of diversity of input modalities and their fixed structures. In…

    Submitted 25 June, 2023; originally announced June 2023.

    Comments: Accepted by ECCV2022

  34. arXiv:2305.04732  [pdf]

    cond-mat.mtrl-sci physics.comp-ph

    Photo-accelerated hot carrier transfer at MoS2/WS2: a first-principles study

    Authors: Zhi-Guo Tao, Guo-Jun Zhu, Weibin Chu, Xin-Gao Gong, Ji-Hui Yang

    Abstract: Charge transfer in type-II heterostructures plays important roles in determining device performance for photovoltaic and photocatalytic applications. However, current theoretical studies of charge transfer process don't consider the effects of operating conditions such as illuminations and yield systemically larger interlayer transfer time of hot electrons in MoS2/WS2 compared to experimental resu…

    Submitted 8 May, 2023; originally announced May 2023.

  35. arXiv:2305.02610  [pdf, other]

    cs.CV

    Boundary-aware Backward-Compatible Representation via Adversarial Learning in Image Retrieval

    Authors: Tan Pan, Furong Xu, Xudong Yang, Sifeng He, Chen Jiang, Qingpei Guo, Feng Qian, Xiaobo Zhang, Yuan Cheng, Lei Yang, Wei Chu

    Abstract: Image retrieval plays an important role in the Internet world. Usually, the core parts of mainstream visual retrieval systems include an online service of the embedding model and a large-scale vector database. For traditional model upgrades, the old model will not be replaced by the new one until the embeddings of all the images in the database are re-computed by the new model, which takes days or…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: accepted by CVPR 2023

  36. arXiv:2305.02572  [pdf, other]

    cs.CV

    High-fidelity Generalized Emotional Talking Face Generation with Multi-modal Emotion Space Learning

    Authors: Chao Xu, Junwei Zhu, Jiangning Zhang, Yue Han, Wenqing Chu, Ying Tai, Chengjie Wang, Zhifeng Xie, Yong Liu

    Abstract: Recently, emotional talking face generation has received considerable attention. However, existing methods only adopt one-hot coding, image, or audio as emotion conditions, thus lacking flexible control in practical applications and failing to handle unseen emotion styles due to limited semantics. They either ignore the one-shot setting or the quality of generated faces. In this paper, we propose…

    Submitted 30 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

  37. A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition

    Authors: Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan

    Abstract: Recently, end-to-end models have been widely used in automatic speech recognition (ASR) systems. Two of the most representative approaches are connectionist temporal classification (CTC) and attention-based encoder-decoder (AED) models. Autoregressive transformers, variants of AED, adopt an autoregressive mechanism for token generation and thus are relatively slow during inference. In this paper,…

    Submitted 15 April, 2023; originally announced April 2023.

    Comments: Published in IEEE Transactions on Audio, Speech, and Language Processing

  38. arXiv:2304.06662  [pdf, other]

    eess.IV cs.CV

    Deep Learning in Breast Cancer Imaging: A Decade of Progress and Future Directions

    Authors: Luyang Luo, Xi Wang, Yi Lin, Xiaoqi Ma, Andong Tan, Ronald Chan, Varut Vardhanabhuti, Winnie CW Chu, Kwang-Ting Cheng, Hao Chen

    Abstract: Breast cancer has reached the highest incidence rate worldwide among all malignancies since 2020. Breast imaging plays a significant role in early diagnosis and intervention to improve the outcome of breast cancer patients. In the past decade, deep learning has shown remarkable progress in breast cancer imaging analysis, holding great promise in interpreting the rich information and complex contex…

    Submitted 20 January, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: IEEE RBME 2024

  39. arXiv:2304.03592  [pdf]

    physics.chem-ph

    Role of electrodes in study of hydrovoltaic effects

    Authors: Chunxiao Zheng, Sunmiao Fang, Weicun Chu, Jin Tan, Bingkun Tian, Xiaofeng Jiang, Wanlin Guo

    Abstract: The last decade has witnessed the emergence of hydrovoltaic technology, which can harvest electricity from different forms of water movement, such as raindrops, waves, flows, moisture, and natural evaporation. In particular, the evaporation-induced hydrovoltaic effect received great attention since its discovery in 2017 due to its negative heat emission property. Nevertheless, the influence of ele…

    Submitted 7 April, 2023; originally announced April 2023.

  40. arXiv:2303.18167  [pdf, other]

    stat.ME stat.AP

    Accounting for Vibration Noise in Stochastic Measurement Errors

    Authors: Lionel Voirol, Davide A. Cucci, Mucyo Karemera, Wenfei Chu, Roberto Molinari, Stéphane Guerrier

    Abstract: The measurement of data over time and/or space is of utmost importance in a wide range of domains from engineering to physics. Devices that perform these measurements therefore need to be extremely precise to obtain correct system diagnostics and accurate predictions, consequently requiring a rigorous calibration procedure which models their errors before being employed. While the deterministic co…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: 30 pages, 9 figures

  41. arXiv:2303.13662  [pdf, other]

    cs.CV

    Rethinking Domain Generalization for Face Anti-spoofing: Separability and Alignment

    Authors: Yiyou Sun, Yaojie Liu, Xiaoming Liu, Yixuan Li, Wen-Sheng Chu

    Abstract: This work studies the generalization issue of face anti-spoofing (FAS) models on domain gaps, such as image resolution, blurriness and sensor variations. Most prior works regard domain-specific signals as a negative impact, and apply metric learning or adversarial losses to remove them from feature representation. Though learning a domain-invariant feature space is viable for the training data, we…

    Submitted 23 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR2023

  42. arXiv:2303.07623  [pdf, other]

    physics.med-ph eess.IV

    Uncertainty-weighted Multi-tasking for $T_{1ρ}$ and T$_2$ Mapping in the Liver with Self-supervised Learning

    Authors: Chaoxing Huang, Yurui Qian, Jian Hou, Baiyan Jiang, Queenie Chan, Vincent WS Wong, Winnie CW Chu, Weitian Chen

    Abstract: Multi-parametric mapping of MRI relaxations in liver has the potential of revealing pathological information of the liver. A self-supervised learning based multi-parametric mapping method is proposed to map $T_{1ρ}$ and T$_2$ simultaneously, by utilising the relaxation constraint in the learning process. Data noise of different mapping tasks is utilised to make the model uncertainty-aware, which…

    Submitted 14 March, 2023; originally announced March 2023.

  43. arXiv:2303.01023  [pdf, other

    quant-ph

    Adiabatic quantum learning

    Authors: Nannan Ma, Wenhao Chu, Jiangbin Gong

    Abstract: Adiabatic quantum control protocols have been of wide interest to quantum computation due to their robustness and insensitivity to their actual duration of execution. As an extension of previous quantum learning algorithms, this work proposes to execute some quantum learning protocols based entirely on adiabatic quantum evolution, hence dubbed as "adiabatic quantum learning". In a conventional qu…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: 9 pages, 3 figures

  44. arXiv:2302.14335  [pdf, other

    cs.CV

    DC-Former: Diverse and Compact Transformer for Person Re-Identification

    Authors: Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng, Yuan Cheng, Wei Chu

    Abstract: In person re-identification (re-ID) task, it is still challenging to learn discriminative representation by deep learning, due to limited data. Generally speaking, the model will get better performance when increasing the amount of data. The addition of similar classes strengthens the ability of the classifier to identify similar identities, thereby improving the discrimination of representation.…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted by AAAI23

  45. arXiv:2302.13300  [pdf

    cond-mat.str-el cond-mat.mtrl-sci

    Low-temperature thermal Hall conductivity of Pr2Zr2O7 single crystal

    Authors: Wenjun Chu, Xuefeng Sun

    Abstract: To probe the peculiar excitations spinons and magnetic monopoles in the quantum spin ice candidate Pr2Zr2O7, we studied the low-temperature thermal Hall conductivity (κxy) and thermal conductivity (κxx) of Pr2Zr2O7 single crystal with magnetic fields applied along the [111] axis. The magnetic field dependencies of κxx suggest the roles of magnetic excitations in thermal conduc…

    Submitted 26 February, 2023; originally announced February 2023.

  46. arXiv:2302.06637  [pdf, other

    cs.LG cs.AI

    PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

    Authors: Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

    Abstract: Personalized Federated Learning (pFL) has emerged as a promising solution to tackle data heterogeneity across clients in FL. However, existing pFL methods either (1) introduce high communication and computation costs or (2) overfit to local data, which can be limited in scope, and are vulnerable to evolved test samples with natural shifts. In this paper, we propose PerAda, a parameter-efficient pF…

    Submitted 6 April, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: CVPR 2024

  47. arXiv:2302.05083  [pdf, other

    cs.LG

    DRGCN: Dynamic Evolving Initial Residual for Deep Graph Convolutional Networks

    Authors: Lei Zhang, Xiaodong Yan, Jianshan He, Ruopeng Li, Wei Chu

    Abstract: Graph convolutional networks (GCNs) have been proved to be very practical to handle various graph-related tasks. It has attracted considerable research interest to study deep GCNs, due to their potential superior performance compared with shallow ones. However, simply increasing network depth will, on the contrary, hurt the performance due to the over-smoothing problem. Adding residual connection…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: 8 pages, Accept by Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)

  48. arXiv:2302.00848  [pdf, other

    cs.LG stat.ME stat.ML

    Causal Effect Estimation: Recent Advances, Challenges, and Opportunities

    Authors: Zhixuan Chu, Jianmin Huang, Ruopeng Li, Wei Chu, Sheng Li

    Abstract: Causal inference has numerous real-world applications in many domains, such as health care, marketing, political science, and online advertising. Treatment effect estimation, a fundamental problem in causal inference, has been extensively studied in statistics for decades. However, traditional treatment effect estimation methods may not well handle large-scale and high-dimensional heterogeneous da…

    Submitted 1 February, 2023; originally announced February 2023.

  49. arXiv:2302.00439  [pdf

    physics.comp-ph

    Accelerating the calculation of electron-phonon coupling by machine learning methods

    Authors: Yang Zhong, Zhiguo Tao, Weibin Chu, Xingao Gong, Hongjun Xiang

    Abstract: Electron-phonon coupling (EPC) plays an important role in many fundamental physical phenomena, but the high computational cost of the EPC matrix hinders the theoretical research on them. In this paper, an analytical formula is derived to calculate the EPC matrix in terms of the Hamiltonian and its gradient in the nonorthogonal atomic orbital bases. The recently-developed E(3) equivariant neural ne…

    Submitted 1 February, 2023; originally announced February 2023.

    Comments: 11 pages, 2 figures, 2 tables

  50. arXiv:2212.14489  [pdf, other

    cs.SI math.OC math.ST physics.soc-ph

    Inference of interaction kernels in mean-field models of opinion dynamics

    Authors: Weiqi Chu, Qin Li, Mason A. Porter

    Abstract: In models of opinion dynamics, many parameters -- either in the form of constants or in the form of functions -- play a critical role in describing, calibrating, and forecasting how opinions change with time. When examining a model of opinion dynamics, it is beneficial to infer its parameters using empirical data. In this paper, we study an example of such an inference problem. We consider a mean-…

    Submitted 26 October, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

    Comments: 20 pages, 3 figures

    MSC Class: 91D30; 35R30; 45Q05; 65K10