Showing 1–50 of 175 results for author: Ge, R

  1. arXiv:2406.15968  [pdf, other]

    cs.CL cs.LG

    ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods

    Authors: Roy Xie, Junlin Wang, Ruomin Huang, Minxing Zhang, Rong Ge, Jian Pei, Neil Zhenqiang Gong, Bhuwan Dhingra

    Abstract: The rapid scaling of large language models (LLMs) has raised concerns about the transparency and fair use of the pretraining data used for training them. Detecting such content is challenging due to the scale of the data and limited exposure of each instance during training. We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) to detect LLMs' pretraini…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.04845  [pdf, other]

    cs.CL cs.AI cs.DC cs.LG cs.MA

    FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models

    Authors: Rui Ye, Rui Ge, Xinyu Zhu, Jingyi Chai, Yaxin Du, Yang Liu, Yanfeng Wang, Siheng Chen

    Abstract: Federated learning has enabled multiple parties to collaboratively train large language models without directly sharing their data (FedLLM). Following this training paradigm, the community has put massive efforts from diverse aspects including framework, performance, and privacy. However, an unpleasant fact is that there are currently no realistic datasets and benchmarks for FedLLM and previous wo…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 22 pages

  3. arXiv:2406.04068  [pdf, other]

    cs.LG math.ST stat.ML

    Reassessing How to Compare and Improve the Calibration of Machine Learning Models

    Authors: Muthu Chidambaram, Rong Ge

    Abstract: A machine learning model is calibrated if its predicted probability for an outcome matches the observed frequency for that outcome conditional on the model prediction. This property has become increasingly important as the impact of machine learning models has continued to spread to various domains. As a result, there are now a dizzying number of recent papers on measuring and improving the calibr…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 20 pages, 7 figures

  4. arXiv:2406.01766  [pdf, ps, other]

    cs.LG stat.ML

    How Does Gradient Descent Learn Features -- A Local Analysis for Regularized Two-Layer Neural Networks

    Authors: Mo Zhou, Rong Ge

    Abstract: The ability to learn useful features is one of the major advantages of neural networks. Although recent works show that neural networks can operate in a neural tangent kernel (NTK) regime that does not allow feature learning, many works also demonstrate the potential for neural networks to go beyond the NTK regime and perform feature learning. Recently, a line of work highlighted the feature learnin…

    Submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2405.10561  [pdf, other]

    eess.IV cs.CV

    Infrared Image Super-Resolution via Lightweight Information Split Network

    Authors: Shijie Liu, Kang Yan, Feiwei Qin, Changmiao Wang, Ruiquan Ge, Kai Zhang, Jie Huang, Yong Peng, Jin Cao

    Abstract: Single image super-resolution (SR) is an established pixel-level vision task aimed at reconstructing a high-resolution image from its degraded low-resolution counterpart. Despite the notable advancements achieved by leveraging deep neural networks for SR, most existing deep learning architectures feature an extensive number of layers, leading to high computational complexity and substantial memory…

    Submitted 27 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  6. Shared Virtual Memory: Its Design and Performance Implications for Diverse Applications

    Authors: Bennett Cooper, Thomas R. W. Scogland, Rong Ge

    Abstract: Discrete GPU accelerators, while providing massive computing power for supercomputers and data centers, have their separate memory domain. Explicit memory management across device and host domains in programming is tedious and error-prone. To improve programming portability and productivity, Unified Memory (UM) integrates GPU memory into the host virtual memory systems, and provides transparent da…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: To be published in ICS '24

  7. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  8. arXiv:2405.00542  [pdf, other]

    eess.IV cs.CV

    UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement

    Authors: Ruiquan Ge, Zhaojie Fang, Pengxue Wei, Zhanghao Chen, Hongyang Jiang, Ahmed Elazab, Wangting Li, Xiang Wan, Shaochong Zhang, Changmiao Wang

    Abstract: Fundus photography, in combination with the ultra-wide-angle fundus (UWF) techniques, becomes an indispensable diagnostic tool in clinical settings by offering a more comprehensive view of the retina. Nonetheless, UWF fluorescein angiography (UWF-FA) necessitates the administration of a fluorescent dye via injection into the patient's hand or elbow unlike UWF scanning laser ophthalmoscopy (UWF-SLO…

    Submitted 1 May, 2024; originally announced May 2024.

  9. arXiv:2404.19531  [pdf, other]

    cs.CV

    MoST: Multi-modality Scene Tokenization for Motion Prediction

    Authors: Norman Mu, Jingwei Ji, Zhenpei Yang, Nate Harada, Haotian Tang, Kan Chen, Charles R. Qi, Runzhou Ge, Kratarth Goel, Zoey Yang, Scott Ettinger, Rami Al-Rfou, Dragomir Anguelov, Yin Zhou

    Abstract: Many existing motion prediction approaches rely on symbolic perception outputs to generate agent trajectories, such as bounding boxes, road graph information and traffic lights. This symbolic representation is a high-level abstraction of the real world, which may render the motion prediction model vulnerable to perception errors (e.g., failures in detecting open-vocabulary obstacles) while missing…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  10. arXiv:2404.18007  [pdf, ps, other]

    cs.LO

    A Formal Model to Prove Instantiation Termination for E-matching-Based Axiomatisations (Extended Version)

    Authors: Rui Ge, Ronald Garcia, Alexander J. Summers

    Abstract: SMT-based program analysis and verification often involve reasoning about program features that have been specified using quantifiers; incorporating quantifiers into SMT-based reasoning is, however, known to be challenging. If quantifier instantiation is not carefully controlled, then runtime and outcomes can be brittle and hard to predict. In particular, uncontrolled quantifier instantiation can…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: extended version of IJCAR 2024 publication

  11. arXiv:2403.12401  [pdf, other]

    cs.CV

    VQ-NeRV: A Vector Quantized Neural Representation for Videos

    Authors: Yunjie Xu, Xiang Feng, Feiwei Qin, Ruiquan Ge, Yong Peng, Changmiao Wang

    Abstract: Implicit neural representations (INR) excel in encoding videos within neural networks, showcasing promise in computer vision tasks like video compression and denoising. INR-based approaches reconstruct video frames from content-agnostic embeddings, which hampers their efficacy in video frame regression and restricts their generalization ability for video interpolation. To address these deficiencie…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Under Review

  12. arXiv:2403.10547  [pdf, ps, other]

    math.OC cs.AI cs.DS cs.LG

    Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing

    Authors: Shuyao Li, Yu Cheng, Ilias Diakonikolas, Jelena Diakonikolas, Rong Ge, Stephen J. Wright

    Abstract: Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong c…

    Submitted 11 March, 2024; originally announced March 2024.

  13. arXiv:2402.17187  [pdf, other]

    eess.IV cs.CV

    PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction

    Authors: Zhaoxin Guo, Zhipeng Wang, Ruiquan Ge, Jianxun Yu, Feiwei Qin, Yuan Tian, Yuqing Peng, Yonghong Li, Changmiao Wang

    Abstract: The early detection of a pulmonary embolism (PE) is critical for enhancing patient survival rates. Both image-based and non-image-based features are of utmost importance in medical classification tasks. In a clinical setting, physicians tend to rely on the contextual information provided by Electronic Medical Records (EMR) to interpret medical imaging. However, very few models effectively integrat…

    Submitted 17 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  14. arXiv:2402.14180  [pdf, other]

    cs.LG

    Linear Transformers are Versatile In-Context Learners

    Authors: Max Vladymyrov, Johannes von Oswald, Mark Sandler, Rong Ge

    Abstract: Recent research has demonstrated that transformers, particularly linear attention models, implicitly execute gradient-descent-like algorithms on data provided in-context during their forward inference step. However, their capability in handling more complex problems remains unexplored. In this paper, we prove that any linear transformer maintains an implicit linear model and can be interpreted as…

    Submitted 21 February, 2024; originally announced February 2024.

  15. arXiv:2402.11307  [pdf, other]

    cs.CV

    ICHPro: Intracerebral Hemorrhage Prognosis Classification Via Joint-attention Fusion-based 3d Cross-modal Network

    Authors: Xinlei Yu, Xinyang Li, Ruiquan Ge, Shibin Wu, Ahmed Elazab, Jichao Zhu, Lingyan Zhang, Gangyong Jia, Taosheng Xu, Xiang Wan, Changmiao Wang

    Abstract: Intracerebral Hemorrhage (ICH) is the deadliest subtype of stroke, necessitating timely and accurate prognostic evaluation to reduce mortality and disability. However, the multi-factorial nature and complexity of ICH make methods based solely on computed tomography (CT) image features inadequate. Despite the capacity of cross-modal networks to fuse additional information, the effective combination…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: 6 pages, 4 figures, 4 tables, accepted by ISBI

  16. arXiv:2402.11274  [pdf, other]

    eess.IV cs.CV cs.LG

    TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method

    Authors: Chenyan Zhang, Yifei Chen, Zhenxiong Fan, Yiyu Huang, Wenchao Weng, Ruiquan Ge, Dong Zeng, Changmiao Wang

    Abstract: Recently, diffusion models have gained significant attention as a novel set of deep learning-based generative methods. These models attempt to sample data from a Gaussian distribution that adheres to a target distribution, and have been successfully adapted to the reconstruction of MRI data. However, as an unconditional generative model, the diffusion model typically disrupts image coordination be…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: 5 pages, 2 figures, accepted at ISBI 2024

    Journal ref: ISBI 2024

  17. arXiv:2402.08948  [pdf, ps, other]

    cs.LG math.AP

    Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input

    Authors: Ziang Chen, Rong Ge

    Abstract: In this work, we study the mean-field flow for learning subspace-sparse polynomials using stochastic gradient descent and two-layer neural networks, where the input distribution is standard Gaussian and the output only depends on the projection of the input onto a low-dimensional subspace. We propose a basis-free generalization of the merged-staircase property in Abbe et al. (2022) and establish a…

    Submitted 8 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  18. arXiv:2402.06855  [pdf, other]

    cs.LG cs.CV

    For Better or For Worse? Learning Minimum Variance Features With Label Augmentation

    Authors: Muthu Chidambaram, Rong Ge

    Abstract: Data augmentation has been pivotal in successfully training deep learning models on classification tasks over the past decade. An important subclass of data augmentation techniques - which includes both label smoothing and Mixup - involves modifying not only the input data but also the input label during model training. In this work, we analyze the role played by the label augmentation aspect of s…

    Submitted 27 May, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: 18 pages, 3 figures

  19. arXiv:2402.02100  [pdf, ps, other]

    quant-ph physics.optics

    Weak-measurement-based pseudospin pointer: A cost-effective scheme for precision measurement

    Authors: Ling Ye, Lan Luo, An Wang, Rongchun Ge, Zhiyou Zhang

    Abstract: As an essential component of state-of-the-art quantum technologies, fast and efficient quantum measurements are in persistent demand over time. We present a proof-of-principle experiment on a new dimensionless pseudo-spin pointer based on weak measurement. In the context of optical parameter estimation, we demonstrate that the parametric distribution's moment is obtained experimentally by employin…

    Submitted 12 June, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: 6 figures

    Journal ref: Phys. Rev. A 109, L060601 (2024)

  20. arXiv:2401.11859  [pdf, other]

    eess.IV cs.CV

    LKFormer: Large Kernel Transformer for Infrared Image Super-Resolution

    Authors: Feiwei Qin, Kang Yan, Changmiao Wang, Ruiquan Ge, Yong Peng, Kai Zhang

    Abstract: Given the broad application of infrared technology across diverse fields, there is an increasing emphasis on investigating super-resolution techniques for infrared images within the realm of deep learning. Despite the impressive results of current Transformer-based methods in image super-resolution tasks, their reliance on the self-attentive mechanism intrinsic to the Transformer architecture resu…

    Submitted 24 January, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 14 pages, 4 figures, accepted by Multimedia Tools and Applications

  21. Transfer-Learning-Based Autotuning Using Gaussian Copula

    Authors: Thomas Randall, Jaehoon Koo, Brice Videau, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall, Rong Ge, Prasanna Balaprakash

    Abstract: As diverse high-performance computing (HPC) systems are built, many opportunities arise for applications to solve larger problems than ever before. Given the significantly increased complexity of these HPC systems and application tuning, empirical performance tuning, such as autotuning, has emerged as a promising approach in recent years. Despite its effectiveness, autotuning is often a computatio…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures, 7 tables, the definitive version of this work is published in the Proceedings of the ACM International Conference on Supercomputing 2023, available at https://dl.acm.org/doi/10.1145/3577193.3593712

    ACM Class: I.2.4; G.3; D.2.8

    Journal ref: Proceedings of the 37th International Conference on Supercomputing (2023) 37-49

  22. arXiv:2401.02954  [pdf, other]

    cs.CL cs.AI cs.LG

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    Authors: DeepSeek-AI, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, Jianzhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, et al. (63 additional authors not shown)

    Abstract: The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B…

    Submitted 5 January, 2024; originally announced January 2024.

  23. arXiv:2312.15601  [pdf, ps, other]

    physics.optics

    On-chip Lithium Niobate Heterogeneous Photonic Crystal Nanocavity Laser

    Authors: Xiangmin Liu, Rui Ge, Chengyu Chen, Jiangwei Wu, Yuping Chen, Xianfeng Chen

    Abstract: Thin film lithium niobate (TFLN) has become a platform for modern integrated circuits due to its excellent optical properties. With the development of rare earth ion doped TFLN, important breakthroughs in on-chip microlasers have emerged and show significant application for optical communication, computing and quantum photonics. However, challenges still remain in developing compact lasers with sm…

    Submitted 24 December, 2023; originally announced December 2023.

  24. arXiv:2312.07743  [pdf, other]

    cs.LG cs.CL cs.DC

    FULL-W2V: Fully Exploiting Data Reuse for W2V on GPU-Accelerated Systems

    Authors: Thomas Randall, Tyler Allen, Rong Ge

    Abstract: Word2Vec remains one of the highly-impactful innovations in the field of Natural Language Processing (NLP) that represents latent grammatical and syntactical information in human text with dense vectors in a low dimension. Word2Vec has high computational cost due to the algorithm's inherent sequentiality, intensive memory accesses, and the large vocabularies it represents. While prior studies have…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 12 pages, 7 figures, 7 tables, the definitive version of this work is published in the Proceedings of the ACM International Conference on Supercomputing 2021, available at https://doi.org/10.1145/3447818.3460373

    ACM Class: I.2.7; D.1.3; G.4

    Journal ref: Proceedings of the ACM International Conference on Supercomputing (2021) 455-466

  25. arXiv:2312.01175  [pdf]

    physics.acc-ph

    High Q and high gradient performance of the first medium-temperature baking 1.3 GHz cryomodule

    Authors: Jiyuan Zhai, Weimin Pan, Feisi He, Rui Ge, Zhenghui Mi, Peng Sha, Song Jin, Ruixiong Han, Qunyao Wang, Haiying Lin, Guangwei Wang, Mei Li, Minjing Sang, Liangrui Sun, Rui Ye, Tongxian Zhao, Shaopeng Li, Keyu Zhu, Baiqi Liu, Xiaolong Wang, Xiangchen Yang, Xiaojuan Bian, Xiangzhen Zhang, Huizhou Ma, Xuwen Dai, et al. (14 additional authors not shown)

    Abstract: World's first 1.3 GHz cryomodule containing eight 9-cell superconducting radio-frequency (RF) cavities treated by medium-temperature furnace baking (mid-T bake) was developed, assembled and tested at IHEP for the Dalian Advanced Light Source (DALS) and CEPC R&D. The 9-cell cavities in the cryomodule achieved an unprecedented highest average Q0 of 3.8E10 at 16 MV/m and 3.6E10 at 21 MV/m in the hori…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 5 pages, 6 figures

  26. arXiv:2311.15328  [pdf, other]

    eess.IV cs.CV

    BS-Diff: Effective Bone Suppression Using Conditional Diffusion Models from Chest X-Ray Images

    Authors: Zhanghao Chen, Yifei Sun, Wenjian Qin, Ruiquan Ge, Cheng Pan, Wenming Deng, Zhou Liu, Wenwen Min, Ahmed Elazab, Xiang Wan, Changmiao Wang

    Abstract: Chest X-rays (CXRs) are commonly utilized as a low-dose modality for lung screening. Nonetheless, the efficacy of CXRs is somewhat impeded, given that approximately 75% of the lung area overlaps with bone, which in turn hampers the detection and diagnosis of diseases. As a remedial measure, bone suppression techniques have been introduced. The current dual-energy subtraction imaging technique in t…

    Submitted 28 February, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures, accepted by IEEE ISBI 2024

  27. arXiv:2311.07033  [pdf, other]

    eess.IV cs.CV

    TTMFN: Two-stream Transformer-based Multimodal Fusion Network for Survival Prediction

    Authors: Ruiquan Ge, Xiangyang Hu, Rungen Huang, Gangyong Jia, Yaqi Wang, Renshu Gu, Changmiao Wang, Elazab Ahmed, Linyan Wang, Juan Ye, Ye Li

    Abstract: Survival prediction plays a crucial role in assisting clinicians with the development of cancer treatment protocols. Recent evidence shows that multimodal data can help in the diagnosis of cancer disease and improve survival prediction. Currently, deep learning-based approaches have experienced increasing success in survival prediction by integrating pathological images and gene expression data. H…

    Submitted 12 November, 2023; originally announced November 2023.

  28. arXiv:2311.04772  [pdf, other]

    eess.IV cs.CV

    GCS-ICHNet: Assessment of Intracerebral Hemorrhage Prognosis using Self-Attention with Domain Knowledge Integration

    Authors: Xuhao Shan, Xinyang Li, Ruiquan Ge, Shibin Wu, Ahmed Elazab, Jichao Zhu, Lingyan Zhang, Gangyong Jia, Qingying Xiao, Xiang Wan, Changmiao Wang

    Abstract: Intracerebral Hemorrhage (ICH) is a severe condition resulting from damaged brain blood vessel ruptures, often leading to complications and fatalities. Timely and accurate prognosis and management are essential due to its high mortality rate. However, conventional methods heavily rely on subjective clinician expertise, which can lead to inaccurate diagnoses and delays in treatment. Artificial inte…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 6 pages, 3 figures, 5 tables, published to BIBM 2023

  29. arXiv:2310.02777  [pdf, other]

    cs.CL

    The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-Language Models

    Authors: Chenwei Wu, Li Erran Li, Stefano Ermon, Patrick Haffner, Rong Ge, Zaiwei Zhang

    Abstract: Compositionality is a common property in many modalities including natural languages and images, but the compositional generalization of multi-modal models is not well-understood. In this paper, we identify two sources of visual-linguistic compositionality: linguistic priors and the interplay between images and texts. We show that current attempts to improve compositional generalization rely on li…

    Submitted 4 October, 2023; originally announced October 2023.

  30. arXiv:2307.11530  [pdf, other]

    eess.IV cs.CV

    UWAT-GAN: Fundus Fluorescein Angiography Synthesis via Ultra-wide-angle Transformation Multi-scale GAN

    Authors: Zhaojie Fang, Zhanghao Chen, Pengxue Wei, Wangting Li, Shaochong Zhang, Ahmed Elazab, Gangyong Jia, Ruiquan Ge, Changmiao Wang

    Abstract: Fundus photography is an essential examination for clinical and differential diagnosis of fundus diseases. Recently, Ultra-Wide-angle Fundus (UWF) techniques, UWF Fluorescein Angiography (UWF-FA) and UWF Scanning Laser Ophthalmoscopy (UWF-SLO) have been gradually put into use. However, Fluorescein Angiography (FA) and UWF-FA require injecting sodium fluorescein which may have detrimental influence…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: 26th International Conference on Medical Image Computing and Computer Assisted Intervention

  31. arXiv:2307.07246  [pdf, other]

    cs.CV cs.LG

    Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-Training

    Authors: Xiaofei Chen, Yuting He, Cheng Xue, Rongjun Ge, Shuo Li, Guanyu Yang

    Abstract: The foundation models based on pre-training technology have significantly advanced artificial intelligence from theoretical to practical applications. These models have facilitated the feasibility of computer-aided diagnosis for widespread use. Medical contrastive vision-language pre-training, which does not require human annotations, is an effective approach for guiding representation learning us…

    Submitted 17 July, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: accepted by MICCAI 2023

  32. arXiv:2306.07824  [pdf, ps, other]

    eess.IV

    JCCS-PFGM: A Novel Circle-Supervision based Poisson Flow Generative Model for Multiphase CECT Progressive Low-Dose Reconstruction with Joint Condition

    Authors: Rongjun Ge, Yuting He, Cong Xia, Yang Chen, Daoqiang Zhang, Ge Wang

    Abstract: Multiphase contrast-enhanced computed tomography (CECT) scan is clinically significant to demonstrate the anatomy at different phases. In practice, such a multiphase CECT scan inherently takes longer time and deposits much more radiation dose into a patient body than a regular CT scan, and reduction of the radiation dose typically compromises the CECT image quality and its diagnostic value. With Jo…

    Submitted 13 June, 2023; originally announced June 2023.

  33. arXiv:2306.00740  [pdf, other]

    cs.LG stat.ML

    On the Limitations of Temperature Scaling for Distributions with Overlaps

    Authors: Muthu Chidambaram, Rong Ge

    Abstract: Despite the impressive generalization capabilities of deep neural networks, they have been repeatedly shown to be overconfident when they are wrong. Fixing this issue is known as model calibration, and has consequently received much attention in the form of modified training schemes and post-training calibration procedures such as temperature scaling. While temperature scaling is frequently used b…

    Submitted 13 February, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 27 pages, 9 Figures, published in ICLR 2024

  34. arXiv:2305.10633  [pdf, other]

    cs.LG cs.IT stat.ML

    Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models

    Authors: Alex Damian, Eshaan Nichani, Rong Ge, Jason D. Lee

    Abstract: We focus on the task of learning a single index model $σ(w^\star \cdot x)$ with respect to the isotropic Gaussian distribution in $d$ dimensions. Prior work has shown that the sample complexity of learning $w^\star$ is governed by the information exponent $k^\star$ of the link function $σ$, which is defined as the index of the first nonzero Hermite coefficient of $σ$. Ben Arous et al. (2021) showe…

    Submitted 17 May, 2023; originally announced May 2023.

  35. arXiv:2304.03834  [pdf, other]

    cs.CV

    WOMD-LiDAR: Raw Sensor Dataset Benchmark for Motion Forecasting

    Authors: Kan Chen, Runzhou Ge, Hang Qiu, Rami Al-Rfou, Charles R. Qi, Xuanyu Zhou, Zoey Yang, Scott Ettinger, Pei Sun, Zhaoqi Leng, Mustafa Baniodeh, Ivan Bogun, Weiyue Wang, Mingxing Tan, Dragomir Anguelov

    Abstract: Widely adopted motion forecasting datasets substitute the observed sensory inputs with higher-level abstractions such as 3D boxes and polylines. These sparse shapes are inferred through annotating the original scenes with perception systems' predictions. Such intermediate representations tie the quality of the motion forecasting models to the performance of computer vision models. Moreover, the hu…

    Submitted 18 February, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: ICRA 2024 camera ready version. Dataset website: https://waymo.com/open/data/motion/

  36. arXiv:2304.02826  [pdf, ps, other]

    physics.optics quant-ph

    Meta-lenses for differential imaging based on weak measurement

    Authors: Xiong Liu, Rongchun Ge, Xinrui Li, Jinglei Du, Hong Zhang, Zhiyou Zhang

    Abstract: All-optical information communication, processing and computation have received substantial interest of both fundamental and applied research due to its unrivaled speed and broad bandwidth. Compared to its electronic counterpart, photons seldom interact with each other which makes them obtain a long coherence time on one hand and relieved from heavy energy dissipation on the other. However, one of…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 8 pages, 7 figures

  37. arXiv:2304.01063  [pdf, other]

    cs.LG math.OC

    Depth Separation with Multilayer Mean-Field Networks

    Authors: Yunwei Ren, Mo Zhou, Rong Ge

    Abstract: Depth separation -- why a deeper network is more powerful than a shallower one -- has been a major problem in deep learning theory. Previous results often focus on representation power. For example, arXiv:1904.06984 constructed a function that is easy to approximate using a 3-layer network but not approximable by any 2-layer network. In this paper, we show that this separation is in fact algorithm…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: ICLR 2023

  38. arXiv:2303.08117  [pdf, other]

    cs.CL cs.LG

    Do Transformers Parse while Predicting the Masked Word?

    Authors: Haoyu Zhao, Abhishek Panigrahi, Rong Ge, Sanjeev Arora

    Abstract: Pre-trained language models have been shown to encode linguistic structures, e.g. dependency and constituency parse trees, in their embeddings while being trained on unsupervised loss functions like masked language modeling. Some doubts have been raised whether the models actually are doing parsing or only some computation weakly correlated with it. We study questions: (a) Is it possible to explic…

    Submitted 15 October, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: Accepted to EMNLP 2023, 30 pages

  39. arXiv:2303.00874  [pdf, other]

    cs.CV cs.AI

    Geometric Visual Similarity Learning in 3D Medical Image Self-supervised Pre-training

    Authors: Yuting He, Guanyu Yang, Rongjun Ge, Yang Chen, Jean-Louis Coatrieux, Boyu Wang, Shuo Li

    Abstract: Learning inter-image similarity is crucial for 3D medical images self-supervised pre-training, due to their sharing of numerous same semantic regions. However, the lack of the semantic prior in metrics and the semantic-independent variation in 3D medical images make it challenging to get a reliable measurement for the inter-image similarity, hindering the learning of consistent representation for…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR 2023

    Journal ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023

  40. arXiv:2302.12715  [pdf, other]

    cs.LG cs.AI

    Hiding Data Helps: On the Benefits of Masking for Sparse Coding

    Authors: Muthu Chidambaram, Chenwei Wu, Yu Cheng, Rong Ge

    Abstract: Sparse coding, which refers to modeling a signal as sparse linear combinations of the elements of a learned dictionary, has proven to be a successful (and interpretable) approach in applications such as signal processing, computer vision, and medical imaging. While this success has spurred much work on provable guarantees for dictionary recovery when the learned dictionary is the same size as the…

    Submitted 1 June, 2023; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: 16 pages, 1 figure, ICML 2023

  41. arXiv:2302.00257  [pdf, other]

    cs.LG stat.ML

    Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression

    Authors: Mo Zhou, Rong Ge

    Abstract: In deep learning, often the training process finds an interpolator (a solution with 0 training loss), but the test loss is still low. This phenomenon, known as benign overfitting, is a major mystery that received a lot of recent attention. One common mechanism for benign overfitting is implicit regularization, where the training process leads to additional properties for the interpolator, often ch…

    Submitted 25 May, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: ICML 2023 camera ready version

  42. arXiv:2212.14789  [pdf, ps, other]

    physics.optics

    Simultaneous $χ^{(2)}$-$χ^{(2)}$ and $χ^{(2)}$-$χ^{(3)}$ nonlinear processes generation in thin film lithium tantalate microcavity

    Authors: Xiongshuo Yan, Miao Xue, Jiangwei Wu, Rui Ge, Tingge Yuan, Yuping Chen, Xianfeng Chen

    Abstract: On-chip efficient nonlinear functions are instrumental in escalating the utilities and performance of photonic integrated circuits (PICs), especially for a wide range of classical and quantum applications, such as tunable coherent radiation, optical frequency conversion, spectroscopy, quantum science, etc. Lithium tantalate (LT) has been widely used in nonlinear wavelength converters, surface acou…

    Submitted 30 December, 2022; originally announced December 2022.

  43. Doubly resonant photonic crystal cavity using merged bound states in the continuum

    Authors: Rui Ge, Xiangmin Liu, Xiongshuo Yan, Xianfeng Chen, Yuping Chen

    Abstract: In this work, a doubly resonant photonic crystal (PhC) cavity using the merged bound states in the continuum (BICs) is proposed to obtain a higher second harmonic generation (SHG) efficiency. Firstly by scanning geometry parameters the accidental BICs and a band-edge mode outside the light cone can be obtained. Then as the lattice constant or the thickness of the slab is adjusted the accidental BI…

    Submitted 3 November, 2022; originally announced November 2022.

  44. arXiv:2210.13512  [pdf, other]

    cs.LG cs.AI cs.CV math.OC stat.ML

    Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup

    Authors: Muthu Chidambaram, Xiang Wang, Chenwei Wu, Rong Ge

    Abstract: Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels. In recent years, Mixup has become a standard primitive used in the training of state-of-the-art image classification models due to its demonstrated benefits over empirical risk minimization with regards to generalization and robustness. In this work, we try to explain so…

    Submitted 1 June, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: 37 pages, 2 figures, ICML 2023

  45. arXiv:2210.11799  [pdf, other]

    physics.optics

    Large Quality Factor Enhancement Based on Cascaded Uniform Lithium Niobate Bichromatic Photonic Crystal Cavities

    Authors: Rui Ge, Xiongshuo Yan, Zhaokang Liang, Hao Li, Jiangwei Wu, Xiangmin Liu, Yuping Chen, Xianfeng Chen

    Abstract: In this paper, by cascading several bichromatic photonic crystals we demonstrate that the quality factor can be much larger compared with that in an isolated cavity without increasing the total size of the device. We take lithium niobate photonic crystal as an example to illustrate that the simulated quality factor of the cascaded cavity can attain 10^5 with a 70° slant angle, which is an order of…

    Submitted 21 October, 2022; originally announced October 2022.

  46. arXiv:2210.03294  [pdf, other]

    cs.LG math.OC stat.ML

    Understanding Edge-of-Stability Training Dynamics with a Minimalist Example

    Authors: Xingyu Zhu, Zixuan Wang, Xiang Wang, Mo Zhou, Rong Ge

    Abstract: Recently, researchers observed that gradient descent for deep neural networks operates in an ``edge-of-stability'' (EoS) regime: the sharpness (maximum eigenvalue of the Hessian) is often larger than stability threshold $2/η$ (where $η$ is the step size). Despite this, the loss oscillates and converges in the long run, and the sharpness at the end is just slightly below $2/η$. While many other wel…

    Submitted 21 February, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: 53 pages, 19 figures

    ACM Class: I.2.6

  47. arXiv:2210.01019  [pdf, other]

    stat.ML cs.LG

    Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks

    Authors: Xiang Wang, Annie N. Wang, Mo Zhou, Rong Ge

    Abstract: Monotonic linear interpolation (MLI) - on the line connecting a random initialization with the minimizer it converges to, the loss and accuracy are monotonic - is a phenomenon that is commonly observed in the training of neural networks. Such a phenomenon may seem to suggest that optimization of neural networks is easy. In this paper, we show that the MLI property is not necessarily related to the…

    Submitted 14 February, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: ICLR 2023

  48. arXiv:2208.06898  [pdf, other]

    cond-mat.mes-hall cond-mat.dis-nn quant-ph

    Cavity induced many-body localization

    Authors: Rong-Chun Ge, Saeed Rahmanian Koshkaki, Michael H. Kolodrubetz

    Abstract: In this manuscript, we explore the feasibility of achieving many-body localization in the context of cavity quantum electrodynamics at strong coupling. Working with a spinless electronic Hubbard chain sitting coupled to a single-mode cavity, we show that the global coupling between electrons and photons -- which generally would be expected to delocalize the fermionic excitations -- can instead fav…

    Submitted 19 July, 2023; v1 submitted 14 August, 2022; originally announced August 2022.

    Comments: Updated version with more details provided

  49. arXiv:2208.05808  [pdf, ps, other]

    physics.optics quant-ph

    A general scheme of differential imaging employing weak measurement

    Authors: Xiong Liu, An Wang, Junfan Zhu, Ling Ye, Rongchun Ge, Jinglei Du, Hong Zhang, Zhiyou Zhang

    Abstract: We propose and experimentally realize a general scheme of differential imaging employing the idea of weak measurement. We show that the weak coupling between the system of interest and a two-level ancilla can introduce a two-beam circuit after an arbitrary pre-selection of the ancilla. By choosing the post-selection orthogonal to the pre-selection measurement, an effective imaging platform based o…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: 11 figures, accepted for publication in Phys. Rev. A

  50. arXiv:2207.08301  [pdf, other]

    cs.RO

    Vision-based Relative Detection and Tracking for Teams of Micro Aerial Vehicles

    Authors: Rundong Ge, Moonyoung Lee, Vivek Radhakrishnan, Yang Zhou, Guanrui Li, Giuseppe Loianno

    Abstract: In this paper, we address the vision-based detection and tracking problems of multiple aerial vehicles using a single camera and Inertial Measurement Unit (IMU) as well as the corresponding perception consensus problem (i.e., uniqueness and identical IDs across all observing agents). We design several vision-based decentralized Bayesian multi-tracking filtering strategies to resolve the associatio…

    Submitted 17 July, 2022; originally announced July 2022.