
Showing 1–50 of 219 results for author: Shen, D

Searching in archive cs.
  1. arXiv:2406.12577  [pdf, other]

    cs.CV

    Cephalometric Landmark Detection across Ages with Prototypical Network

    Authors: Han Wu, Chong Wang, Lanzhuju Mei, Tong Yang, Min Zhu, Dinggang Shen, Zhiming Cui

    Abstract: Automated cephalometric landmark detection is crucial in real-world orthodontic diagnosis. Current studies mainly focus on only adult subjects, neglecting the clinically crucial scenario presented by adolescents whose landmarks often exhibit significantly different appearances compared to adults. Hence, an open question arises about how to develop a unified and effective detection algorithm across…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  2. arXiv:2406.12465  [pdf, other]

    cs.CY cs.AI cs.IR

    RIGL: A Unified Reciprocal Approach for Tracing the Independent and Group Learning Processes

    Authors: Xiaoshan Yu, Chuan Qin, Dazhong Shen, Shangshang Yang, Haiping Ma, Hengshu Zhu, Xingyi Zhang

    Abstract: In the realm of education, both independent learning and group learning are esteemed as the most classic paradigms. The former allows learners to self-direct their studies, while the latter is typically characterized by teacher-directed scenarios. Recent studies in the field of intelligent education have leveraged deep temporal models to trace the learning process, capturing the dynamics of studen…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. 12 pages

  3. arXiv:2406.05974  [pdf, other]

    eess.IV cs.CV

    Inter-slice Super-resolution of Magnetic Resonance Images by Pre-training and Self-supervised Fine-tuning

    Authors: Xin Wang, Zhiyun Song, Yitao Zhu, Sheng Wang, Lichi Zhang, Dinggang Shen, Qian Wang

    Abstract: In clinical practice, 2D magnetic resonance (MR) sequences are widely adopted. While individual 2D slices can be stacked to form a 3D volume, the relatively large slice spacing can pose challenges for both image visualization and subsequent analysis tasks, which often require isotropic voxel spacing. To reduce slice spacing, deep-learning-based super-resolution techniques are widely investigated.…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: ISBI 2024

  4. arXiv:2405.18407  [pdf, other]

    cs.LG cs.CV

    Phased Consistency Model

    Authors: Fu-Yun Wang, Zhaoyang Huang, Alexander William Bergman, Dazhong Shen, Peng Gao, Michael Lingelbach, Keqiang Sun, Weikang Bian, Guanglu Song, Yu Liu, Hongsheng Li, Xiaogang Wang

    Abstract: The consistency model (CM) has recently made significant progress in accelerating the generation of diffusion models. However, its application to high-resolution, text-conditioned image generation in the latent space (a.k.a., LCM) remains unsatisfactory. In this paper, we identify three key flaws in the current design of LCM. We investigate the reasons behind these limitations and propose the Phas…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2405.17835  [pdf, other]

    cs.CV

    Deform3DGS: Flexible Deformation for Fast Surgical Scene Reconstruction with Gaussian Splatting

    Authors: Shuojue Yang, Qian Li, Daiyun Shen, Bingchen Gong, Qi Dou, Yueming Jin

    Abstract: Tissue deformation poses a key challenge for accurate surgical scene reconstruction. Despite yielding high reconstruction quality, existing methods suffer from slow rendering speeds and long training times, limiting their intraoperative applicability. Motivated by recent progress in 3D Gaussian Splatting, an emerging technology in real-time 3D rendering, this work presents a novel fast reconstruct…

    Submitted 30 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Early accepted at MICCAI 2024, 10 pages, 2 figures

  6. arXiv:2405.10705  [pdf, other]

    eess.IV cs.CV

    3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning

    Authors: Zhentao Liu, Huangxuan Zhao, Wenhui Qin, Zhenghong Zhou, Xinggang Wang, Wenping Wang, Xiaochun Lai, Chuansheng Zheng, Dinggang Shen, Zhiming Cui

    Abstract: Digital Subtraction Angiography (DSA) is one of the gold standards in vascular disease diagnosing. With the help of contrast agent, time-resolved 2D DSA images deliver comprehensive insights into blood flow information and can be utilized to reconstruct 3D vessel structures. Current commercial DSA systems typically demand hundreds of scanning views to perform reconstruction, resulting in substanti…

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 12 pages, 13 figures, 5 tables

  7. arXiv:2405.10691  [pdf, other]

    eess.IV cs.CV

    LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

    Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

    Abstract: The infant brain undergoes rapid development in the first few years after birth. Compared to cross-sectional studies, longitudinal studies can depict the trajectories of infants brain development with higher accuracy, statistical power and flexibility. However, the collection of infant longitudinal magnetic resonance (MR) data suffers a notorious dropout problem, resulting in incomplete datasets wit…

    Submitted 17 May, 2024; originally announced May 2024.

  8. arXiv:2404.13067  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Efficient Resume Understanding: A Multi-Granularity Multi-Modal Pre-Training Approach

    Authors: Feihu Jiang, Chuan Qin, Jingshuai Zhang, Kaichun Yao, Xi Chen, Dazhong Shen, Chen Zhu, Hengshu Zhu, Hui Xiong

    Abstract: In the contemporary era of widespread online recruitment, resume understanding has been widely acknowledged as a fundamental and crucial task, which aims to extract structured information from resume documents automatically. Compared to the traditional rule-based approaches, the utilization of recently proposed pre-trained document understanding models can greatly enhance the effectiveness of resu…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: ICME 2024 Accepted

  9. arXiv:2404.13046  [pdf, other]

    cs.CV

    MoVA: Adapting Mixture of Vision Experts to Multimodal Context

    Authors: Zhuofan Zong, Bingqi Ma, Dazhong Shen, Guanglu Song, Hao Shao, Dongzhi Jiang, Hongsheng Li, Yu Liu

    Abstract: As the key component in multimodal large language models (MLLMs), the ability of the visual encoder greatly affects MLLM's understanding on diverse image content. Although some large-scale pretrained vision encoders such as vision encoders in CLIP and DINOv2 have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understandi…

    Submitted 19 April, 2024; originally announced April 2024.

  10. arXiv:2404.05384  [pdf, other]

    cs.CV cs.AI

    Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance

    Authors: Dazhong Shen, Guanglu Song, Zeyue Xue, Fu-Yun Wang, Yu Liu

    Abstract: Classifier-Free Guidance (CFG) has been widely used in text-to-image diffusion models, where the CFG scale is introduced to control the strength of text guidance on the whole image space. However, we argue that a global CFG scale results in spatial inconsistency on varying semantic strengths and suboptimal image quality. To address this problem, we present a novel approach, Semantic-aware Classifi…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: accepted by CVPR-2024

  11. arXiv:2404.03653  [pdf, other]

    cs.CV cs.AI cs.CL

    CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

    Authors: Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu, Hongsheng Li

    Abstract: Diffusion models have demonstrated great success in the field of text-to-image generation. However, alleviating the misalignment between the text prompts and images is still challenging. The root reason behind the misalignment has not been extensively investigated. We observe that the misalignment is caused by inadequate token attention activation. We further attribute this phenomenon to the diffu…

    Submitted 3 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Project Page: https://caraj7.github.io/comat

  12. arXiv:2404.01563  [pdf]

    eess.IV cs.CV

    Two-Phase Multi-Dose-Level PET Image Reconstruction with Dose Level Awareness

    Authors: Yuchen Fei, Yanmei Luo, Yan Wang, Jiaqi Cui, Yuanyuan Xu, Jiliu Zhou, Dinggang Shen

    Abstract: To obtain high-quality positron emission tomography (PET) while minimizing radiation exposure, a range of methods have been designed to reconstruct standard-dose PET (SPET) from corresponding low-dose PET (LPET) images. However, most current methods merely learn the mapping between single-dose-level LPET and SPET images, but omit the dose disparity of LPET images in clinical scenarios. In this pap…

    Submitted 10 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by ISBI2024

  13. arXiv:2403.20058  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Revolutionizing Disease Diagnosis with simultaneous functional PET/MR and Deeply Integrated Brain Metabolic, Hemodynamic, and Perfusion Networks

    Authors: Luoyu Wang, Yitian Tao, Qing Yang, Yan Liang, Siwei Liu, Hongcheng Shi, Dinggang Shen, Han Zhang

    Abstract: Simultaneous functional PET/MR (sf-PET/MR) presents a cutting-edge multimodal neuroimaging technique. It provides an unprecedented opportunity for concurrently monitoring and integrating multifaceted brain networks built by spatiotemporally covaried metabolic activity, neural activity, and cerebral blood flow (perfusion). Albeit high scientific/clinical values, short in hardware accessibility of P…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 11 pages

  14. arXiv:2403.17416  [pdf, other]

    cs.IR

    AFDGCF: Adaptive Feature De-correlation Graph Collaborative Filtering for Recommendations

    Authors: Wei Wu, Chao Wang, Dazhong Shen, Chuan Qin, Liyi Chen, Hui Xiong

    Abstract: Collaborative filtering methods based on graph neural networks (GNNs) have witnessed significant success in recommender systems (RS), capitalizing on their ability to capture collaborative signals within intricate user-item relationships via message-passing mechanisms. However, these GNN-based RS inadvertently introduce excess linear correlation between user and item embeddings, contradicting the…

    Submitted 15 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by SIGIR2024

  15. arXiv:2403.13745  [pdf, other]

    cs.CV

    Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

    Authors: Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li

    Abstract: Video outpainting is a challenging task, aiming at generating video content outside the viewport of the input video while maintaining inter-frame and intra-frame consistency. Existing methods fall short in either generation quality or flexibility. We introduce MOTIA Mastering Video Outpainting Through Input-Specific Adaptation, a diffusion-based pipeline that leverages both the intrinsic data-spec…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Code will be available at https://github.com/G-U-N/Be-Your-Outpainter

  16. arXiv:2403.12440  [pdf, other]

    cs.CV

    Self-learning Canonical Space for Multi-view 3D Human Pose Estimation

    Authors: Xiaoben Li, Mancheng Meng, Ziyan Wu, Terrence Chen, Fan Yang, Dinggang Shen

    Abstract: Multi-view 3D human pose estimation is naturally superior to single view one, benefiting from more comprehensive information provided by images of multiple views. The information includes camera poses, 2D/3D human poses, and 3D geometry. However, the accurate annotation of these information is hard to obtain, making it challenging to predict accurate 3D human pose from multi-view images. To deal w…

    Submitted 29 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  17. arXiv:2403.12434  [pdf, other]

    cs.CV

    Human Mesh Recovery from Arbitrary Multi-view Images

    Authors: Xiaoben Li, Mancheng Meng, Ziyan Wu, Terrence Chen, Fan Yang, Dinggang Shen

    Abstract: Human mesh recovery from arbitrary multi-view images involves two characteristics: the arbitrary camera poses and arbitrary number of camera views. Because of the variability, designing a unified framework to tackle this task is challenging. The challenges can be summarized as the dilemma of being able to simultaneously estimate arbitrary camera poses and recover human mesh from arbitrary multi-vi…

    Submitted 17 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  18. arXiv:2403.12416  [pdf, other]

    cs.CV cs.CL

    Eye-gaze Guided Multi-modal Alignment for Medical Representation Learning

    Authors: Chong Ma, Hanqi Jiang, Wenting Chen, Yiwei Li, Zihao Wu, Xiaowei Yu, Zhengliang Liu, Lei Guo, Dajiang Zhu, Tuo Zhang, Dinggang Shen, Tianming Liu, Xiang Li

    Abstract: In the medical multi-modal frameworks, the alignment of cross-modality features presents a significant challenge. However, existing works have learned features that are implicitly aligned from the data, without considering the explicit relationships in the medical context. This data-reliance may lead to low generalization of the learned alignment relationships. In this work, we propose the Eye-gaz…

    Submitted 13 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: 12 pages, 6 figures

    MSC Class: 68T07 ACM Class: I.2.0; I.4.0; I.5.4; I.7.0

  19. arXiv:2403.04287  [pdf, other]

    cs.IR

    DGR: A General Graph Desmoothing Framework for Recommendation via Global and Local Perspectives

    Authors: Leilei Ding, Dazhong Shen, Chao Wang, Tianfu Wang, Le Zhang, Yanyong Zhang

    Abstract: Graph Convolutional Networks (GCNs) have become pivotal in recommendation systems for learning user and item embeddings by leveraging the user-item interaction graph's node information and topology. However, these models often face the famous over-smoothing issue, leading to indistinct user and item embeddings and reduced personalization. Traditional desmoothing methods in GCN-based systems are mo…

    Submitted 22 April, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  20. arXiv:2402.13776  [pdf, other]

    eess.IV cs.CV cs.LG

    Cas-DiffCom: Cascaded diffusion model for infant longitudinal super-resolution 3D medical image completion

    Authors: Lianghu Guo, Tianli Tao, Xinyi Cai, Zihao Zhu, Jiawei Huang, Lixuan Zhu, Zhuoyang Gu, Haifeng Tang, Rui Zhou, Siyan Han, Yan Liang, Qing Yang, Dinggang Shen, Han Zhang

    Abstract: Early infancy is a rapid and dynamic neurodevelopmental period for behavior and neurocognition. Longitudinal magnetic resonance imaging (MRI) is an effective tool to investigate such a crucial stage by capturing the developmental trajectories of the brain structures. However, longitudinal MRI acquisition always meets a serious data-missing problem due to participant dropout and failed scans, makin…

    Submitted 21 February, 2024; originally announced February 2024.

  21. arXiv:2402.08409  [pdf, other]

    cs.CV cs.AI

    Transferring Ultrahigh-Field Representations for Intensity-Guided Brain Segmentation of Low-Field Magnetic Resonance Imaging

    Authors: Kwanseok Oh, Jieun Lee, Da-Woon Heo, Dinggang Shen, Heung-Il Suk

    Abstract: Ultrahigh-field (UHF) magnetic resonance imaging (MRI), i.e., 7T MRI, provides superior anatomical details of internal brain structures owing to its enhanced signal-to-noise ratio and susceptibility-induced contrast. However, the widespread use of 7T MRI is limited by its high cost and lower accessibility compared to low-field (LF) MRI. This study proposes a deep-learning framework that systematic…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 32 pages, 9 figures, and 5 tables

  22. arXiv:2402.02029  [pdf, other]

    cs.CV cs.AI cs.LG

    ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation

    Authors: Zihan Li, Yuan Zheng, Dandan Shan, Shuzhou Yang, Qingde Li, Beizhan Wang, Yuanting Zhang, Qingqi Hong, Dinggang Shen

    Abstract: Most recent scribble-supervised segmentation methods commonly adopt a CNN framework with an encoder-decoder architecture. Despite its multiple benefits, this framework generally can only capture small-range feature dependency for the convolutional layer with the local receptive field, which makes it difficult to learn global shape information from the limited information provided by scribble annot…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE Transactions on Medical Imaging (TMI)

  23. Image2Points: A 3D Point-based Context Clusters GAN for High-Quality PET Image Reconstruction

    Authors: Jiaqi Cui, Yan Wang, Lu Wen, Pinxian Zeng, Xi Wu, Jiliu Zhou, Dinggang Shen

    Abstract: To obtain high-quality Positron emission tomography (PET) images while minimizing radiation exposure, numerous methods have been proposed to reconstruct standard-dose PET (SPET) images from the corresponding low-dose PET (LPET) images. However, these methods heavily rely on voxel-based representations, which fall short of adequately accounting for the precise structure and fine-grained context, le…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: Accepted by ICASSP 2024

  24. arXiv:2401.10749  [pdf, other]

    cs.CY cs.LG

    ReliCD: A Reliable Cognitive Diagnosis Framework with Confidence Awareness

    Authors: Yunfei Zhang, Chuan Qin, Dazhong Shen, Haiping Ma, Le Zhang, Xingyi Zhang, Hengshu Zhu

    Abstract: During the past few decades, cognitive diagnostics modeling has attracted increasing attention in computational education communities, which is capable of quantifying the learning status and knowledge mastery levels of students. Indeed, the recent advances in neural networks have greatly enhanced the performance of traditional cognitive diagnosis models through learning the deep representations of…

    Submitted 29 December, 2023; originally announced January 2024.

  25. arXiv:2401.01383  [pdf, other]

    q-bio.NC cs.AI cs.CV cs.LG

    Predicting Infant Brain Connectivity with Federated Multi-Trajectory GNNs using Scarce Data

    Authors: Michalis Pistos, Gang Li, Weili Lin, Dinggang Shen, Islem Rekik

    Abstract: The understanding of the convoluted evolution of infant brain networks during the first postnatal year is pivotal for identifying the dynamics of early brain connectivity development. Existing deep learning solutions suffer from three major limitations. First, they cannot generalize to multi-trajectory prediction tasks, where each graph trajectory corresponds to a particular imaging modality or co…

    Submitted 8 January, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

  26. arXiv:2312.07353  [pdf, other]

    cs.CV

    CLIP in Medical Imaging: A Comprehensive Survey

    Authors: Zihao Zhao, Yuxiao Liu, Han Wu, Yonghao Li, Sheng Wang, Lin Teng, Disheng Liu, Zhiming Cui, Qian Wang, Dinggang Shen

    Abstract: Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training paradigm, successfully introduces text supervision to vision models. It has shown promising results across various tasks, attributable to its generalizability and interpretability. The use of CLIP has recently gained increasing interest in the medical imaging domain, serving both as a pre-training paradigm for alig…

    Submitted 21 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Project page available at https://github.com/zhaozh10/Awesome-CLIP-in-Medical-Imaging

  27. arXiv:2312.06069  [pdf, other]

    cs.CV

    Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis

    Authors: Zihao Zhao, Sheng Wang, Qian Wang, Dinggang Shen

    Abstract: Obtaining large-scale radiology reports can be difficult for medical images due to various reasons, limiting the effectiveness of contrastive pre-training in the medical image domain and underscoring the need for alternative methods. In this paper, we propose eye-tracking as an alternative to text reports, as it allows for the passive collection of gaze signals without disturbing radiologist's rou…

    Submitted 12 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

    Comments: *These authors contributed equally. Accepted by AAAI 2024

  28. arXiv:2312.05256  [pdf, other]

    eess.IV cs.AI

    Holistic Evaluation of GPT-4V for Biomedical Imaging

    Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang, et al. (25 additional authors not shown)

    Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and mor…

    Submitted 10 November, 2023; originally announced December 2023.

  29. arXiv:2311.09590  [pdf, other]

    eess.IV cs.CV

    MARformer: An Efficient Metal Artifact Reduction Transformer for Dental CBCT Images

    Authors: Yuxuan Shi, Jun Xu, Dinggang Shen

    Abstract: Cone Beam Computed Tomography (CBCT) plays a key role in dental diagnosis and surgery. However, the metal teeth implants could bring annoying metal artifacts during the CBCT imaging process, interfering diagnosis and downstream processing such as tooth segmentation. In this paper, we develop an efficient Transformer to perform metal artifacts reduction (MAR) from dental CBCT images. The proposed M…

    Submitted 18 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: under consideration of Computer Vision and Image Understanding journal

  30. arXiv:2311.08236  [pdf, other]

    cs.CV

    MeLo: Low-rank Adaptation is Better than Fine-tuning for Medical Image Diagnosis

    Authors: Yitao Zhu, Zhenrong Shen, Zihao Zhao, Sheng Wang, Xin Wang, Xiangyu Zhao, Dinggang Shen, Qian Wang

    Abstract: The common practice in developing computer-aided diagnosis (CAD) models based on transformer architectures usually involves fine-tuning from ImageNet pre-trained weights. However, with recent advances in large-scale pre-training and the practice of scaling laws, Vision Transformers (ViT) have become much larger and less accessible to medical imaging communities. Additionally, in real-world scenari…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 5 pages, 3 figures

  31. arXiv:2311.04410  [pdf, other]

    cs.RO eess.SY

    An Efficient Probabilistic Solution to Mapping Errors in LiDAR-Camera Fusion for Autonomous Vehicles

    Authors: Dan Shen, Zhengming Zhang, Renran Tian, Yaobin Chen, Rini Sherony

    Abstract: LiDAR-camera fusion is one of the core processes for the perception system of current automated driving systems. The typical sensor fusion process includes a list of coordinate transformation operations following system calibration. Although a significant amount of research has been done to improve the fusion accuracy, there are still inherent data mapping errors in practice related to system sync…

    Submitted 7 November, 2023; originally announced November 2023.

  32. arXiv:2311.04383  [pdf, other]

    cs.RO eess.SY

    Active Collision Avoidance System for E-Scooters in Pedestrian Environment

    Authors: Xuke Yan, Dan Shen

    Abstract: In the dense fabric of urban areas, electric scooters have rapidly become a preferred mode of transportation. As they cater to modern mobility demands, they present significant safety challenges, especially when interacting with pedestrians. In general, e-scooters are suggested to be ridden in bike lanes/sidewalks or share the road with cars at the maximum speed of about 15-20 mph, which is more f…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Submitted to SAE 2024

  33. arXiv:2310.11106  [pdf, other]

    cs.CV

    3D Structure-guided Network for Tooth Alignment in 2D Photograph

    Authors: Yulong Dou, Lanzhuju Mei, Dinggang Shen, Zhiming Cui

    Abstract: Orthodontics focuses on rectifying misaligned teeth (i.e., malocclusions), affecting both masticatory function and aesthetics. However, orthodontic treatment often involves complex, lengthy procedures. As such, generating a 2D photograph depicting aligned teeth prior to orthodontic treatment is crucial for effective dentist-patient communication and, more importantly, for encouraging patients to a…

    Submitted 17 October, 2023; originally announced October 2023.

  34. arXiv:2310.05242  [pdf, other]

    cs.CL cs.AI

    ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data

    Authors: Tianyang Zhong, Wei Zhao, Yutong Zhang, Yi Pan, Peixin Dong, Zuowei Jiang, Xiaoyan Kui, Youlan Shang, Li Yang, Yaonai Wei, Longtao Yang, Hao Chen, Huan Zhao, Yuxiao Liu, Ning Zhu, Yiwei Li, Yisong Wang, Jiaqi Yao, Jiaqi Wang, Ying Zeng, Lei He, Chao Zheng, Zhixue Zhang, Ming Li, Zhengliang Liu, et al. (17 additional authors not shown)

    Abstract: Radiology report generation, as a key step in medical image analysis, is critical to the quantitative analysis of clinically informed decision-making levels. However, complex and diverse radiology reports with cross-source heterogeneity pose a huge generalizability challenge to the current methods under massive data volume, mainly because the style and normativity of radiology reports are obviousl…

    Submitted 9 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

  35. arXiv:2309.15769  [pdf, other]

    math.ST cs.LG stat.ME

    Algebraic and Statistical Properties of the Ordinary Least Squares Interpolator

    Authors: Dennis Shen, Dogyoon Song, Peng Ding, Jasjeet S. Sekhon

    Abstract: Deep learning research has uncovered the phenomenon of benign overfitting for overparameterized statistical models, which has drawn significant theoretical interest in recent years. Given its simplicity and practicality, the ordinary least squares (OLS) interpolator has become essential to gain foundational insights into this phenomenon. While properties of OLS are well established in classical, u…

    Submitted 30 May, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

  36. arXiv:2309.06419  [pdf, other]

    cs.CL

    Radiology-Llama2: Best-in-Class Large Language Model for Radiology

    Authors: Zhengliang Liu, Yiwei Li, Peng Shu, Aoxiao Zhong, Longtao Yang, Chao Ju, Zihao Wu, Chong Ma, Jie Luo, Cheng Chen, Sekeun Kim, Jiang Hu, Haixing Dai, Lin Zhao, Dajiang Zhu, Jun Liu, Wei Liu, Dinggang Shen, Tianming Liu, Quanzheng Li, Xiang Li

    Abstract: This paper introduces Radiology-Llama2, a large language model specialized for radiology through a process known as instruction tuning. Radiology-Llama2 is based on the Llama2 architecture and further trained on a large dataset of radiology reports to generate coherent and clinically useful impressions from radiological findings. Quantitative evaluations using ROUGE metrics on the MIMIC-CXR and Op…

    Submitted 29 August, 2023; originally announced September 2023.

  37. arXiv:2308.10157  [pdf, ps, other]

    eess.IV cs.CV

    Contrastive Diffusion Model with Auxiliary Guidance for Coarse-to-Fine PET Reconstruction

    Authors: Zeyu Han, Yuhan Wang, Luping Zhou, Peng Wang, Binyu Yan, Jiliu Zhou, Yan Wang, Dinggang Shen

    Abstract: To obtain high-quality positron emission tomography (PET) scans while reducing radiation exposure to the human body, various approaches have been proposed to reconstruct standard-dose PET (SPET) images from low-dose PET (LPET) images. One widely adopted technique is the generative adversarial networks (GANs), yet recently, diffusion probabilistic models (DPMs) have emerged as a compelling alternat…

    Submitted 20 August, 2023; originally announced August 2023.

    Comments: Accepted and presented in MICCAI 2023. To be published in Proceedings

  38. arXiv:2308.06762  [pdf, other]

    eess.IV cs.CV

    Tissue Segmentation of Thick-Slice Fetal Brain MR Scans with Guidance from High-Quality Isotropic Volumes

    Authors: Shijie Huang, Xukun Zhang, Zhiming Cui, He Zhang, Geng Chen, Dinggang Shen

    Abstract: Accurate tissue segmentation of thick-slice fetal brain magnetic resonance (MR) scans is crucial for both reconstruction of isotropic brain MR volumes and the quantification of fetal brain development. However, this task is challenging due to the use of thick-slice scans in clinically-acquired fetal brain data. To address this issue, we propose to leverage high-quality isotropic fetal brain MR vol…

    Submitted 4 December, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: 10 pages, 9 figures, 5 tables, Fetal MRI, Brain tissue segmentation, Unsupervised domain adaptation, Cycle-consistency

  39. arXiv:2308.05365  [pdf]

    eess.IV cs.CV

    TriDo-Former: A Triple-Domain Transformer for Direct PET Reconstruction from Low-Dose Sinograms

    Authors: Jiaqi Cui, Pinxian Zeng, Xinyi Zeng, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang, Dinggang Shen

    Abstract: To obtain high-quality positron emission tomography (PET) images while minimizing radiation exposure, various methods have been proposed for reconstructing standard-dose PET (SPET) images from low-dose PET (LPET) sinograms directly. However, current methods often neglect boundaries during sinogram-to-image reconstruction, resulting in high-frequency distortion in the frequency domain and diminishe…

    Submitted 10 August, 2023; originally announced August 2023.

  40. arXiv:2307.13693  [pdf, other]

    cs.CL

    Evaluating Large Language Models for Radiology Natural Language Processing

    Authors: Zhengliang Liu, Tianyang Zhong, Yiwei Li, Yutong Zhang, Yi Pan, Zihao Zhao, Peixin Dong, Chao Cao, Yuxiao Liu, Peng Shu, Yaonai Wei, Zihao Wu, Chong Ma, Jiaqi Wang, Sheng Wang, Mengyue Zhou, Zuowei Jiang, Chunlin Li, Jason Holmes, Shaochen Xu, Lu Zhang, Haixing Dai, Kai Zhang, Lin Zhao, Yuanhao Chen, et al. (20 additional authors not shown)

    Abstract: The rise of large language models (LLMs) has marked a pivotal shift in the field of natural language processing (NLP). LLMs have revolutionized a multitude of domains, and they have made a significant impact in the medical field. Large language models are now more abundant than ever, and many of these models exhibit bilingual capabilities, proficient in both English and Chinese. However, a compreh…

    Submitted 27 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

  41. arXiv:2307.12845  [pdf, other]

    eess.IV cs.CV

    Multi-View Vertebra Localization and Identification from CT Images

    Authors: Han Wu, Jiadong Zhang, Yu Fang, Zhentao Liu, Nizhuan Wang, Zhiming Cui, Dinggang Shen

    Abstract: Accurately localizing and identifying vertebrae from CT images is crucial for various clinical applications. However, most existing efforts are performed on 3D with cropping patch operation, suffering from the large computation costs and limited global information. In this paper, we propose a multi-view vertebra localization and identification from CT images, converting the 3D problem into a 2D lo…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: MICCAI 2023

  42. arXiv:2307.07953  [pdf]

    cs.CV

    Accurate 3D Prediction of Missing Teeth in Diverse Patterns for Precise Dental Implant Planning

    Authors: Lei Ma, Peng Xue, Yuning Gu, Yue Zhao, Min Zhu, Zhongxiang Ding, Dinggang Shen

    Abstract: In recent years, the demand for dental implants has surged, driven by their high success rates and esthetic advantages. However, accurate prediction of missing teeth for precise digital implant planning remains a challenge due to the intricate nature of dental structures and the variability in tooth loss patterns. This study presents a novel framework for accurate prediction of missing teeth in di…

    Submitted 16 July, 2023; originally announced July 2023.

  43. arXiv:2307.03195  [pdf, other]

    cs.CY cs.AI

    A Comprehensive Survey of Artificial Intelligence Techniques for Talent Analytics

    Authors: Chuan Qin, Le Zhang, Yihang Cheng, Rui Zha, Dazhong Shen, Qi Zhang, Xi Chen, Ying Sun, Chen Zhu, Hengshu Zhu, Hui Xiong

    Abstract: In today's competitive and fast-evolving business environment, it is a critical time for organizations to rethink how to make talent-related decisions in a quantitative manner. Indeed, the recent development of Big Data and Artificial Intelligence (AI) techniques have revolutionized human resource management. The availability of large-scale talent and management-related data provides unparalleled…

    Submitted 5 May, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 61 pages, 15 figures

  44. arXiv:2307.00855  [pdf, other]

    cs.CV cs.AI

    Review of Large Vision Models and Visual Prompt Engineering

    Authors: Jiaqi Wang, Zhengliang Liu, Lin Zhao, Zihao Wu, Chong Ma, Sigang Yu, Haixing Dai, Qiushi Yang, Yiheng Liu, Songyao Zhang, Enze Shi, Yi Pan, Tuo Zhang, Dajiang Zhu, Xiang Li, Xi Jiang, Bao Ge, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

    Abstract: Visual prompt engineering is a fundamental technology in the field of visual and image Artificial General Intelligence, serving as a key component for achieving zero-shot capabilities. As the development of large vision models progresses, the importance of prompt engineering becomes increasingly evident. Designing suitable prompts for specific visual tasks has emerged as a meaningful research dire…

    Submitted 3 July, 2023; originally announced July 2023.

  45. arXiv:2306.14392  [pdf, other]

    cs.CV

    ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction with Multimodal Transformer

    Authors: Jiaxin Deng, Dong Shen, Shiyao Wang, Xiangyu Wu, Fan Yang, Guorui Zhou, Gaofeng Meng

    Abstract: In recent years, live streaming platforms have gained immense popularity as they allow users to broadcast their videos and interact in real-time with hosts and peers. Due to the dynamic changes of live content, accurate recommendation models are crucial for enhancing user experience. However, most previous works treat the live as a whole item and explore the Click-through-Rate (CTR) prediction fra…

    Submitted 25 June, 2023; originally announced June 2023.

  46. arXiv:2306.08666  [pdf, other]

    cs.CL cs.AI

    Radiology-GPT: A Large Language Model for Radiology

    Authors: Zhengliang Liu, Aoxiao Zhong, Yiwei Li, Longtao Yang, Chao Ju, Zihao Wu, Chong Ma, Peng Shu, Cheng Chen, Sekeun Kim, Haixing Dai, Lin Zhao, Lichao Sun, Dajiang Zhu, Jun Liu, Wei Liu, Dinggang Shen, Xiang Li, Quanzheng Li, Tianming Liu

    Abstract: We introduce Radiology-GPT, a large language model for radiology. Using an instruction tuning approach on an extensive dataset of radiology domain knowledge, Radiology-GPT demonstrates superior performance compared to general language models such as StableLM, Dolly and LLaMA. It exhibits significant versatility in radiological diagnosis, research, and communication. This work serves as a catalyst…

    Submitted 19 March, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

  47. arXiv:2306.05480  [pdf, other]

    cs.AI

    Artificial General Intelligence for Medical Imaging

    Authors: Xiang Li, Lu Zhang, Zihao Wu, Zhengliang Liu, Lin Zhao, Yixuan Yuan, Jun Liu, Gang Li, Dajiang Zhu, Pingkun Yan, Quanzheng Li, Wei Liu, Tianming Liu, Dinggang Shen

    Abstract: In this review, we explore the potential applications of Artificial General Intelligence (AGI) models in healthcare, focusing on foundational Large Language Models (LLMs), Large Vision Models, and Large Multimodal Models. We emphasize the importance of integrating clinical expertise, domain knowledge, and multimodal capabilities into AGI models. In addition, we lay out key roadmaps that guide the…

    Submitted 2 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

  48. arXiv:2306.00676  [pdf, other]

    cs.CV

    Hyperspectral Target Detection Based on Low-Rank Background Subspace Learning and Graph Laplacian Regularization

    Authors: Dunbin Shen, Xiaorui Ma, Wenfeng Kong, Jiacheng Tian, Hongyu Wang

    Abstract: Hyperspectral target detection is good at finding dim and small objects based on spectral characteristics. However, existing representation-based methods are hindered by the problem of the unknown background dictionary and insufficient utilization of spatial information. To address these issues, this paper proposes an efficient optimizing approach based on low-rank representation (LRR) and graph L…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 4 pages, 3 figures, 1 table

  49. ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs

    Authors: Zihao Zhao, Sheng Wang, Jinchen Gu, Yitao Zhu, Lanzhuju Mei, Zixu Zhuang, Zhiming Cui, Qian Wang, Dinggang Shen

    Abstract: The integration of Computer-Aided Diagnosis (CAD) with Large Language Models (LLMs) presents a promising frontier in clinical applications, notably in automating diagnostic processes akin to those performed by radiologists and providing consultations similar to a virtual family doctor. Despite the promising potential of this integration, current works face at least two limitations: (1) From the pe…

    Submitted 17 April, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Authors Zihao Zhao, Sheng Wang, Jinchen Gu, Yitao Zhu contributed equally to this work and should be considered co-first authors

  50. arXiv:2305.14736  [pdf, other]

    cs.AI cs.FL eess.SY

    Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes

    Authors: Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

    Abstract: Autonomous systems often have logical constraints arising, for example, from safety, operational, or regulatory requirements. Such constraints can be expressed using temporal logic specifications. The system state is often partially observable. Moreover, it could encompass a team of multiple agents with a common objective but disparate information structures and constraints. In this paper, we firs…

    Submitted 19 June, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2203.09038