Showing 1–50 of 219 results for author: Fan, D

Searching in archive cs.
  1. arXiv:2406.18242  [pdf, other]

    cs.CV eess.IV

    ConStyle v2: A Strong Prompter for All-in-One Image Restoration

    Authors: Dongqi Fan, Junhao Zhang, Liang Chang

    Abstract: This paper introduces ConStyle v2, a strong plug-and-play prompter designed to output clean visual prompts and assist U-Net Image Restoration models in handling multiple degradations. The joint training process of IRConStyle, an Image Restoration framework consisting of ConStyle and a general restoration network, is divided into two stages: first, pre-training ConStyle alone, and then freezing its…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.12052  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    UniGLM: Training One Unified Language Model for Text-Attributed Graphs

    Authors: Yi Fang, Dongzhe Fan, Sirui Ding, Ninghao Liu, Qiaoyu Tan

    Abstract: Representation learning on text-attributed graphs (TAGs), where nodes are represented by textual descriptions, is crucial for textual and relational knowledge systems and recommendation systems. Currently, state-of-the-art embedding methods for TAGs primarily focus on fine-tuning language models (e.g., BERT) using structure-aware training signals. While effective, these methods are tailored for in…

    Submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2406.11945  [pdf, other]

    cs.LG cs.AI cs.IR

    GAugLLM: Improving Graph Contrastive Learning for Text-Attributed Graphs with Large Language Models

    Authors: Yi Fang, Dongzhe Fan, Daochen Zha, Qiaoyu Tan

    Abstract: This work studies self-supervised graph learning for text-attributed graphs (TAGs) where nodes are represented by textual attributes. Unlike traditional graph contrastive methods that perturb the numerical feature space and alter the graph's topological structure, we aim to improve view generation through language supervision. This is driven by the prevalence of textual attributes in real applicat…

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2406.00988  [pdf, other]

    cs.AR

    ADE-HGNN: Accelerating HGNNs through Attention Disparity Exploitation

    Authors: Dengke Han, Meng Wu, Runzhen Xue, Mingyu Yan, Xiaochun Ye, Dongrui Fan

    Abstract: Heterogeneous Graph Neural Networks (HGNNs) have recently demonstrated great power in handling heterogeneous graph data, rendering them widely applied in many critical real-world domains. Most HGNN models leverage attention mechanisms to significantly improve model accuracy, albeit at the cost of increased computational complexity and memory bandwidth requirements. Fortunately, the attention dispar…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 15 pages, 9 figures, accepted by Euro-PAR 2024

  5. arXiv:2405.18784  [pdf, other]

    cs.CV

    LP-3DGS: Learning to Prune 3D Gaussian Splatting

    Authors: Zhaoliang Zhang, Tianchen Song, Yongjae Lee, Li Yang, Cheng Peng, Rama Chellappa, Deliang Fan

    Abstract: Recently, 3D Gaussian Splatting (3DGS) has become one of the mainstream methodologies for novel view synthesis (NVS) due to its high quality and fast rendering speed. However, as a point-based scene representation, 3DGS potentially generates a large number of Gaussians to fit the scene, leading to high memory usage. Improvements that have been proposed require either an empirical and preset prunin…

    Submitted 29 May, 2024; originally announced May 2024.

  6. arXiv:2405.17793  [pdf, other]

    cs.CV

    SafeguardGS: 3D Gaussian Primitive Pruning While Avoiding Catastrophic Scene Destruction

    Authors: Yongjae Lee, Zhaoliang Zhang, Deliang Fan

    Abstract: 3D Gaussian Splatting (3DGS) has made a significant stride in novel view synthesis, demonstrating top-notch rendering quality while achieving real-time rendering speed. However, the excessively large number of Gaussian primitives resulting from 3DGS' suboptimal densification process poses a major challenge, slowing down frame-per-second (FPS) and demanding considerable memory cost, making it unfav…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Comprehensive experiments are in progress

  7. arXiv:2405.14251  [pdf, other]

    cs.RO eess.SY

    Efficient Navigation of a Robotic Fish Swimming Across the Vortical Flow Field

    Authors: Haodong Feng, Dehan Yuan, Jiale Miao, Jie You, Yue Wang, Yi Zhu, Dixia Fan

    Abstract: Navigating efficiently across vortical flow fields presents a significant challenge in various robotic applications. The dynamic and unsteady nature of vortical flows often disturbs the control of underwater robots, complicating their operation in hydrodynamic environments. Conventional control methods, which depend on accurate modeling, fail in these settings due to the complexity of fluid-struct…

    Submitted 23 May, 2024; originally announced May 2024.

  8. arXiv:2405.09822  [pdf, other]

    cs.RO

    SEEK: Semantic Reasoning for Object Goal Navigation in Real World Inspection Tasks

    Authors: Muhammad Fadhil Ginting, Sung-Kyun Kim, David D. Fan, Matteo Palieri, Mykel J. Kochenderfer, Ali-akbar Agha-Mohammadi

    Abstract: This paper addresses the problem of object-goal navigation in autonomous inspections in real-world environments. Object-goal navigation is crucial to enable effective inspections in various settings, often requiring the robot to identify the target object within a large search space. Current object inspection methods fall short of human efficiency because they typically cannot bootstrap prior and…

    Submitted 16 May, 2024; originally announced May 2024.

  9. arXiv:2405.06247  [pdf, other]

    cs.LG cs.AI cs.CR

    Disttack: Graph Adversarial Attacks Toward Distributed GNN Training

    Authors: Yuxiang Zhang, Xin Liu, Meng Wu, Wei Yan, Mingyu Yan, Xiaochun Ye, Dongrui Fan

    Abstract: Graph Neural Networks (GNNs) have emerged as potent models for graph learning. Distributing the training process across multiple computing nodes is the most promising solution to address the challenges of ever-growing real-world graphs. However, current adversarial attack methods on GNNs neglect the characteristics and applications of the distributed scenario, leading to suboptimal performance and…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted by 30th International European Conference on Parallel and Distributed Computing (Euro-Par 2024)

  10. arXiv:2405.03708  [pdf]

    cs.DC cs.DB cs.LG

    Delta Tensor: Efficient Vector and Tensor Storage in Delta Lake

    Authors: Zhiwei Bao, Liu Liao-Liao, Zhiyu Wu, Yifan Zhou, Dan Fan, Michal Aibin, Yvonne Coady, Andrew Brownsword

    Abstract: The exponential growth of artificial intelligence (AI) and machine learning (ML) applications has necessitated the development of efficient storage solutions for vector and tensor data. This paper presents a novel approach for tensor storage in a Lakehouse architecture using Delta Lake. By adopting the multidimensional array storage strategy from array databases and sparse encoding methods to Delt…

    Submitted 13 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

  11. arXiv:2404.09753  [pdf, other]

    cs.CL cs.LG

    Personalized Collaborative Fine-Tuning for On-Device Large Language Models

    Authors: Nicolas Wagner, Dongyang Fan, Martin Jaggi

    Abstract: We explore on-device self-supervised collaborative fine-tuning of large language models with limited local data availability. Taking inspiration from the collaborative learning community, we introduce three distinct trust-weighted gradient aggregation schemes: weight similarity-based, prediction similarity-based and validation performance-based. To minimize communication overhead, we integrate Low…

    Submitted 15 April, 2024; originally announced April 2024.

  12. GDR-HGNN: A Heterogeneous Graph Neural Networks Accelerator Frontend with Graph Decoupling and Recoupling

    Authors: Runzhen Xue, Mingyu Yan, Dengke Han, Yihan Teng, Zhimin Tang, Xiaochun Ye, Dongrui Fan

    Abstract: Heterogeneous Graph Neural Networks (HGNNs) have broadened the applicability of graph representation learning to heterogeneous graphs. However, the irregular memory access pattern of HGNNs leads to the buffer thrashing issue in HGNN accelerators. In this work, we identify an opportunity to address buffer thrashing in HGNN acceleration through an analysis of the topology of heterogeneous graphs. To…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 6 pages, 10 figures, accepted by DAC'61

  13. Low Frequency Sampling in Model Predictive Path Integral Control

    Authors: Bogdan Vlahov, Jason Gibson, David D. Fan, Patrick Spieler, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou

    Abstract: Sampling-based model-predictive controllers have become a powerful optimization tool for planning and control problems in various challenging environments. In this paper, we show how the default choice of uncorrelated Gaussian distributions can be improved upon with the use of a colored noise distribution. Our choice of distribution allows for the emphasis on low frequency control signals, which c…

    Submitted 18 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: Published to RA-L

    Journal ref: IEEE Robotics and Automation Letters, vol. 9, no. 5, pp. 4543-4550, 2024

  14. arXiv:2404.01892  [pdf, other]

    cs.CV

    Minimize Quantization Output Error with Bias Compensation

    Authors: Cheng Gong, Haoshuai Zheng, Mengting Hu, Zheng Lin, Deng-Ping Fan, Yuzhi Zhang, Tao Li

    Abstract: Quantization is a promising method that reduces memory usage and computational intensity of Deep Neural Networks (DNNs), but it often leads to significant output error that hinders model deployment. In this paper, we propose Bias Compensation (BC) to minimize the output error, thus realizing ultra-low-precision quantization without model fine-tuning. Instead of optimizing the non-convex quantizatio…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures

    Journal ref: CAAI Artificial Intelligence Research, 2024

  15. arXiv:2404.01487  [pdf, other]

    cs.LG

    Explainable AI Integrated Feature Engineering for Wildfire Prediction

    Authors: Di Fan, Ayan Biswas, James Paul Ahrens

    Abstract: Wildfires present intricate challenges for prediction, necessitating the use of sophisticated machine learning techniques for effective modeling. In our research, we conducted a thorough assessment of various machine learning algorithms for both classification and regression tasks relevant to predicting wildfires. We found that for classifying different types or stages of wild…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: text overlap with arXiv:2307.09615 by other authors

  16. arXiv:2404.00292  [pdf, other]

    cs.CV

    LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

    Authors: Pancheng Zhao, Peng Xu, Pengda Qin, Deng-Ping Fan, Zhicheng Zhang, Guoli Jia, Bowen Zhou, Jufeng Yang

    Abstract: Camouflaged vision perception is an important vision task with numerous practical applications. Due to the expensive collection and labeling costs, this community struggles with a major bottleneck that the species category of its datasets is limited to a small number of object species. However, the existing camouflaged generation methods require specifying the background manually, thus failing to…

    Submitted 12 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024, Fig.3 revised

  17. arXiv:2403.14350  [pdf, other]

    cs.CV

    Annotation-Efficient Polyp Segmentation via Active Learning

    Authors: Duojun Huang, Xinyu Xiong, De-Jun Fan, Feng Gao, Xiao-Jian Wu, Guanbin Li

    Abstract: Deep learning-based techniques have proven effective in polyp segmentation tasks when provided with sufficient pixel-wise labeled data. However, the high cost of manual annotation has created a bottleneck for model generalization. To minimize annotation costs, we propose a deep active learning framework for annotation-efficient polyp segmentation. In practice, we measure the uncertainty of each sa…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 2024 IEEE 21st International Symposium on Biomedical Imaging (ISBI)

  18. arXiv:2403.07943  [pdf, other]

    cs.LG cs.CR

    Revisiting Edge Perturbation for Graph Neural Network in Graph Data Augmentation and Attack

    Authors: Xin Liu, Yuxiang Zhang, Meng Wu, Mingyu Yan, Kun He, Wei Yan, Shirui Pan, Xiaochun Ye, Dongrui Fan

    Abstract: Edge perturbation is a basic method to modify graph structures. It can be categorized into two veins based on their effects on the performance of graph neural networks (GNNs), i.e., graph data augmentation and attack. Surprisingly, both veins of edge perturbation methods employ the same operations, yet yield opposite effects on GNNs' accuracy. A distinct boundary between these methods in using edg…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: 14P

  19. arXiv:2403.06444  [pdf, other]

    cs.CV

    Latent Semantic Consensus For Deterministic Geometric Model Fitting

    Authors: Guobao Xiao, Jun Yu, Jiayi Ma, Deng-Ping Fan, Ling Shao

    Abstract: Estimating reliable geometric model parameters from the data with severe outliers is a fundamental and important task in computer vision. This paper attempts to sample high-quality subsets and select model instances to estimate parameters in the multi-structural data. To address this, we propose an effective method called Latent Semantic Consensus (LSC). The principle of LSC is to preserve the lat…

    Submitted 11 March, 2024; originally announced March 2024.

  20. arXiv:2403.06066  [pdf]

    eess.IV cs.CV cs.LG

    CausalCellSegmenter: Causal Inference inspired Diversified Aggregation Convolution for Pathology Image Segmentation

    Authors: Dawei Fan, Yifan Gao, Jiaming Yu, Yan** Chen, Wencheng Li, Chuancong Lin, Kaibin Li, Changcai Yang, Riqing Chen, Lifang Wei

    Abstract: Deep learning models have shown promising performance for cell nucleus segmentation in the field of pathology image analysis. However, training a robust model from multiple domains remains a great challenge for cell nucleus segmentation. Additionally, the shortcomings of background noise, highly overlapping between cell nucleus, and blurred edges often lead to poor performance. To address these ch…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 10 pages, 5 figures, 2 tables, MICCAI

  21. arXiv:2403.04306  [pdf, other]

    cs.CV cs.AI cs.LG

    Effectiveness Assessment of Recent Large Vision-Language Models

    Authors: Yao Jiang, Xinyu Yan, Ge-Peng Ji, Keren Fu, Meijun Sun, Huan Xiong, Deng-Ping Fan, Fahad Shahbaz Khan

    Abstract: The advent of large vision-language models (LVLMs) represents a remarkable advance in the quest for artificial general intelligence. However, the model's effectiveness in both specialized and general tasks warrants further investigation. This paper endeavors to evaluate the competency of popular LVLMs in specialized and general tasks, respectively, aiming to offer a comprehensive understanding of…

    Submitted 11 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted by Visual Intelligence

  22. arXiv:2402.15784  [pdf, other]

    cs.CV

    IRConStyle: Image Restoration Framework Using Contrastive Learning and Style Transfer

    Authors: Dongqi Fan, Xin Zhao, Liang Chang

    Abstract: Recently, the contrastive learning paradigm has achieved remarkable success in high-level tasks such as classification, detection, and segmentation. However, contrastive learning applied in low-level tasks, like image restoration, is limited, and its effectiveness is uncertain. This raises a question: Why does the contrastive learning paradigm not yield satisfactory results in image restoration? I…

    Submitted 7 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  23. arXiv:2402.13089  [pdf, other]

    cs.LG cs.AI cs.CL

    Towards an empirical understanding of MoE design choices

    Authors: Dongyang Fan, Bettina Messmer, Martin Jaggi

    Abstract: In this study, we systematically evaluate the impact of common design choices in Mixture of Experts (MoEs) on validation performance, uncovering distinct influences at token and sequence levels. We also present empirical evidence showing comparable performance between a learned router and a frozen, randomly initialized router, suggesting that learned routing may not be essential. Our study further…

    Submitted 20 February, 2024; originally announced February 2024.

  24. arXiv:2402.01368  [pdf, other]

    cs.CV

    LIR: A Lightweight Baseline for Image Restoration

    Authors: Dongqi Fan, Ting Yue, Xin Zhao, Renjing Xu, Liang Chang

    Abstract: Recently, there have been significant advancements in Image Restoration based on CNN and transformer. However, the inherent characteristics of the Image Restoration task are often overlooked in many works. They, instead, tend to focus on the basic block design and stack numerous such blocks to the model, leading to redundant parameters and unnecessary computations. Thus, the efficiency of the imag…

    Submitted 24 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  25. arXiv:2402.01143  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Network Representations with Disentangled Graph Auto-Encoder

    Authors: Di Fan, Chuanhou Gao

    Abstract: The (variational) graph auto-encoder is extensively employed for learning representations of graph-structured data. However, the formation of real-world graphs is a complex and heterogeneous process influenced by latent factors. Existing encoders are fundamentally holistic, neglecting the entanglement of latent factors. This not only makes graph analysis tasks less effective but also makes it hard…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 61 pages, 13 figures

  26. arXiv:2401.17191  [pdf, other]

    cs.RO

    Semantic Belief Behavior Graph: Enabling Autonomous Robot Inspection in Unknown Environments

    Authors: Muhammad Fadhil Ginting, David D. Fan, Sung-Kyun Kim, Mykel J. Kochenderfer, Ali-akbar Agha-mohammadi

    Abstract: This paper addresses the problem of autonomous robotic inspection in complex and unknown environments. This capability is crucial for efficient and precise inspections in various real-world scenarios, even when faced with perceptual uncertainty and lack of prior knowledge of the environment. Existing methods for real-world autonomous inspections typically rely on predefined targets and waypoints a…

    Submitted 30 January, 2024; originally announced January 2024.

  27. arXiv:2401.15261  [pdf, other]

    cs.CV

    Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes

    Authors: Diandian Guo, Deng-Ping Fan, Tongyu Lu, Christos Sakaridis, Luc Van Gool

    Abstract: The estimation of implicit cross-frame correspondences and the high computational cost have long been major challenges in video semantic segmentation (VSS) for driving scenes. Prior works utilize keyframes, feature propagation, or cross-frame attention to address these issues. By contrast, we are the first to harness vanishing point (VP) priors for more effective segmentation. Intuitively, objects…

    Submitted 25 April, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: CVPR 2024 highlight

  28. An annotated grain kernel image database for visual quality inspection

    Authors: Lei Fan, Yiwen Ding, Dongdong Fan, Yong Wu, Hongxia Chu, Maurice Pagnucco, Yang Song

    Abstract: We present a machine vision-based database named GrainSet for the purpose of visual quality inspection of grain kernels. The database contains more than 350K single-kernel images with experts' annotations. The grain kernels used in the study consist of four types of cereal grains including wheat, maize, sorghum and rice, and were collected from over 20 regions in 5 countries. The surface informati…

    Submitted 20 November, 2023; originally announced January 2024.

    Comments: Accepted by Nature Scientific Data (2023), https://github.com/hellodfan/GrainSet

  29. arXiv:2401.03407  [pdf, other]

    cs.CV

    Bilateral Reference for High-Resolution Dichotomous Image Segmentation

    Authors: Peng Zheng, Dehong Gao, Deng-Ping Fan, Li Liu, Jorma Laaksonen, Wanli Ouyang, Nicu Sebe

    Abstract: We introduce a novel bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). It comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef). The LM aids in object localization using global semantic information. Within the RM, we utilize BiRef for the reconstruction proce…

    Submitted 25 June, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

    Comments: Version 5, with updated DIS performance, accuracy-efficiency comparison, and 3rd-party applications

  30. arXiv:2401.02317  [pdf, other]

    cs.CV

    BA-SAM: Scalable Bias-Mode Attention Mask for Segment Anything Model

    Authors: Yiran Song, Qianyu Zhou, Xiangtai Li, Deng-Ping Fan, Xuequan Lu, Lizhuang Ma

    Abstract: In this paper, we address the challenge of image resolution variation for the Segment Anything Model (SAM). SAM, known for its zero-shot generalizability, exhibits a performance degradation when faced with datasets with varying image sizes. Previous approaches tend to resize the image to a fixed size or adopt structure modifications, hindering the preservation of SAM's rich prior knowledge. Beside…

    Submitted 19 March, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024

  31. arXiv:2311.18605  [pdf, other]

    cs.CV

    Learning Triangular Distribution in Visual World

    Authors: ** Chen, Xingpeng Zhang, Chengtao Zhou, Dichao Fan, Peng Tu, Le Zhang, Yanlin Qian

    Abstract: Convolution neural network is successful in pervasive vision tasks, including label distribution learning, which usually takes the form of learning an injection from the non-linear visual features to the well-defined labels. However, how the discrepancy between features is mapped to the label discrepancy is ambient, and its correctness is not guaranteed. To address these problems, we study the math…

    Submitted 18 March, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR 2024 (11 pages, 5 figures)

  32. arXiv:2311.17122  [pdf, other]

    cs.CV

    Large Model Based Referring Camouflaged Object Detection

    Authors: Shupeng Cheng, Ge-Peng Ji, Pengda Qin, Deng-Ping Fan, Bowen Zhou, Peng Xu

    Abstract: Referring camouflaged object detection (Ref-COD) is a recently-proposed problem aiming to segment out specified camouflaged objects matched with a textual or visual reference. This task involves two major challenges: the COD domain-specific perception and multimodal reference-image alignment. Our motivation is to make full use of the semantic intelligence and intrinsic knowledge of recent Multimod…

    Submitted 28 November, 2023; originally announced November 2023.

  33. arXiv:2311.15011  [pdf, other]

    cs.CV

    VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

    Authors: Ziyang Luo, Nian Liu, Wangbo Zhao, Xuguang Yang, Dingwen Zhang, Deng-Ping Fan, Fahad Khan, Junwei Han

    Abstract: Salient object detection (SOD) and camouflaged object detection (COD) are related yet distinct binary mapping tasks. These tasks involve multiple modalities, sharing commonalities and unique cues. Existing research often employs intricate task-specific specialist models, potentially leading to redundancy and suboptimal results. We introduce VSCode, a generalist model with novel 2D prompt learning,…

    Submitted 11 April, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR2024

  34. Identifying the Defective: Detecting Damaged Grains for Cereal Appearance Inspection

    Authors: Lei Fan, Yiwen Ding, Dongdong Fan, Yong Wu, Maurice Pagnucco, Yang Song

    Abstract: Cereal grain plays a crucial role in the human diet as a major source of essential nutrients. Grain Appearance Inspection (GAI) serves as an essential process to determine grain quality and facilitate grain circulation and processing. However, GAI is routinely performed manually by inspectors with cumbersome procedures, which poses a significant bottleneck in smart agriculture. In this paper, we…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted by ECAI2023. https://github.com/hellodfan/AI4GrainInsp

  35. arXiv:2310.16015  [pdf, other]

    physics.ao-ph cs.AI cs.LG

    Physically Explainable Deep Learning for Convective Initiation Nowcasting Using GOES-16 Satellite Observations

    Authors: Da Fan, Steven J. Greybush, David John Gagne II, Eugene E. Clothiaux

    Abstract: Convection initiation (CI) nowcasting remains a challenging problem for both numerical weather prediction models and existing nowcasting algorithms. In this study, object-based probabilistic deep learning models are developed to predict CI based on multichannel infrared GOES-R satellite observations. The data come from patches surrounding potential CI events identified in Multi-Radar Multi-Sensor…

    Submitted 24 October, 2023; originally announced October 2023.

  36. Acquiring Weak Annotations for Tumor Localization in Temporal and Volumetric Data

    Authors: Yu-Cheng Chou, Bowen Li, Deng-Ping Fan, Alan Yuille, Zongwei Zhou

    Abstract: Creating large-scale and well-annotated datasets to train AI algorithms is crucial for automated tumor detection and localization. However, with limited resources, it is challenging to determine the best type of annotations when annotating massive amounts of unlabeled data. To address this issue, we focus on polyps in colonoscopy videos and pancreatic tumors in abdominal CT scans; both application…

    Submitted 20 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Published in Machine Intelligence Research

    Journal ref: Mach. Intell. Res. (2024)

  37. arXiv:2309.13207  [pdf, other]

    cs.LG

    Evidential Deep Learning: Enhancing Predictive Uncertainty Estimation for Earth System Science Applications

    Authors: John S. Schreck, David John Gagne II, Charlie Becker, William E. Chapman, Kim Elmore, Da Fan, Gabrielle Gantos, Eliot Kim, Dhamma Kimpara, Thomas Martin, Maria J. Molina, Vanessa M. Pryzbylo, Jacob Radford, Belen Saavedra, Justin Willson, Christopher Wirz

    Abstract: Robust quantification of predictive uncertainty is critical for understanding factors that drive weather and climate outcomes. Ensembles provide predictive uncertainty estimates and can be decomposed physically, but both physics and machine learning ensembles are computationally expensive. Parametric deep learning can estimate uncertainty with one model by predicting the parameters of a probabilit…

    Submitted 19 February, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

  38. arXiv:2309.10523  [pdf, other]

    cs.CV

    Edge-aware Feature Aggregation Network for Polyp Segmentation

    Authors: Tao Zhou, Yizhe Zhang, Geng Chen, Yi Zhou, Ye Wu, Deng-Ping Fan

    Abstract: Precise polyp segmentation is vital for the early diagnosis and prevention of colorectal cancer (CRC) in clinical practice. However, due to scale variation and blurry polyp boundaries, it is still a challenging task to achieve satisfactory segmentation performance with different scales and shapes. In this study, we present a novel Edge-aware Feature Aggregation Network (EFA-Net) for polyp segmenta…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 20 pages 8 figures

  39. arXiv:2309.07581  [pdf, ps, other]

    cs.AR

    A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives

    Authors: Zhengyang Lv, Mingyu Yan, Xin Liu, Mengyao Dong, Xiaochun Ye, Dongrui Fan, Ninghui Sun

    Abstract: Graph-related applications have experienced significant growth in academia and industry, driven by the powerful representation capabilities of graph. However, efficiently executing these applications faces various challenges, such as load imbalance, random memory access, etc. To address these challenges, researchers have proposed various acceleration systems, including software frameworks and hard…

    Submitted 14 September, 2023; originally announced September 2023.

  40. arXiv:2308.12962  [pdf, other]

    cs.CV

    Motion-Guided Masking for Spatiotemporal Representation Learning

    Authors: David Fan, Jue Wang, Shuai Liao, Yi Zhu, Vimal Bhat, Hector Santos-Villalobos, Rohith MV, Xinyu Li

    Abstract: Several recent works have directly extended the image masked autoencoder (MAE) with random masking into video domain, achieving promising results. However, unlike images, both spatial and temporal information are important for video understanding. This suggests that the random masking strategy that is inherited from the image MAE is less effective for video MAE. This motivates the design of a nove…

    Submitted 24 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023

  41. arXiv:2308.11185  [pdf, other]

    cs.CV

    MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation

    Authors: Najmeh Sadoughi, Xinyu Li, Avijit Vajpayee, David Fan, Bing Shuai, Hector Santos-Villalobos, Vimal Bhat, Rohith MV

    Abstract: Previous research has studied the task of segmenting cinematic videos into scenes and into narrative acts. However, these studies have overlooked the essential task of multimodal alignment and fusion for effectively and efficiently processing long-form videos (>60min). In this paper, we introduce Multimodal alignmEnt aGgregation and distillAtion (MEGA) for cinematic long-video segmentation. MEGA t…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: ICCV 2023 accepted

  42. arXiv:2308.08269  [pdf, other]

    eess.IV cs.CV

    OnUVS: Online Feature Decoupling Framework for High-Fidelity Ultrasound Video Synthesis

    Authors: Han Zhou, Dong Ni, Ao Chang, Xinrui Zhou, Rusi Chen, Yanlin Chen, Lian Liu, Jiamin Liang, Yuhao Huang, Tong Han, Zhe Liu, Deng-Ping Fan, Xin Yang

    Abstract: Ultrasound (US) imaging is indispensable in clinical practice. To diagnose certain diseases, sonographers must observe corresponding dynamic anatomic structures to gather comprehensive information. However, the limited availability of specific US video cases causes teaching difficulties in identifying corresponding diseases, which potentially impacts the detection rate of such cases. The synthesis…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 14 pages, 13 figures and 6 tables

  43. arXiv:2307.16262  [pdf, other]

    eess.IV cs.CV

    Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges

    Authors: Debesh Jha, Vanshali Sharma, Debapriya Banik, Debayan Bhattacharya, Kaushiki Roy, Steven A. Hicks, Nikhil Kumar Tomar, Vajira Thambawita, Adrian Krenzer, Ge-Peng Ji, Sahadev Poudel, George Batchkala, Saruar Alam, Awadelrahman M. A. Ahmed, Quoc-Huy Trinh, Zeshan Khan, Tien-Phat Nguyen, Shruti Shrestha, Sabari Nathan, Jeonghwan Gwak, Ritika K. Jha, Zheyuan Zhang, Alexander Schlaefer, Debotosh Bhattacharjee, M. K. Bhuyan, et al. (8 additional authors not shown)

    Abstract: Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue leading to a high polyp miss-rate. Deep learning has…

    Submitted 6 May, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

  44. How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges

    Authors: Haotong Qin, Ge-Peng Ji, Salman Khan, Deng-Ping Fan, Fahad Shahbaz Khan, Luc Van Gool

    Abstract: Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT in the field of conversational AI. Notably, Bard has recently been updated to handle visual inputs alongside text prompts during conversations. Given Bard's impressive track record in handling textual inputs, we explore its capabilities in understanding and interpreting visual data (images) conditioned by text questions. This…

    Submitted 30 August, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

    Journal ref: Machine Intelligence Research. 20(5), October 2023, 605-613

  45. arXiv:2307.12765  [pdf, other]

    cs.AR

    HiHGNN: Accelerating HGNNs through Parallelism and Data Reusability Exploitation

    Authors: Runzhen Xue, Dengke Han, Mingyu Yan, Mo Zou, Xiaocheng Yang, Duo Wang, Wenming Li, Zhimin Tang, John Kim, Xiaochun Ye, Dongrui Fan

    Abstract: Heterogeneous graph neural networks (HGNNs) have emerged as powerful algorithms for processing heterogeneous graphs (HetGs), widely used in many critical fields. To capture both structural and semantic information in HetGs, HGNNs first aggregate the neighboring feature vectors for each vertex in each semantic graph and then fuse the aggregated results across all semantic graphs for each vertex. Un…

    Submitted 26 April, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: 16 pages, 17 figures; To appear in IEEE TPDS 2024

  46. arXiv:2307.08098  [pdf, other]

    cs.CV

    CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation

    Authors: Jialun Pei, Tao Jiang, He Tang, Nian Liu, Yueming Jin, Deng-Ping Fan, Pheng-Ann Heng

    Abstract: We propose a novel approach for RGB-D salient instance segmentation using a dual-branch cross-modal feature calibration architecture called CalibNet. Our method simultaneously calibrates depth and RGB features in the kernel and mask branches to generate instance-aware kernels and mask features. CalibNet consists of three simple modules, a dynamic interactive kernel (DIK) and a weight-sharing fusio…

    Submitted 11 June, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: This work has been accepted by TIP 2024

  47. arXiv:2307.05383  [pdf]

    eess.SP cs.HC cs.LG

    Human Emotion Recognition Based On Galvanic Skin Response signal Feature Selection and SVM

    Authors: Di Fan, Mingyang Liu, Xiaohan Zhang, Xiaopeng Gong

    Abstract: A novel human emotion recognition method based on automatically selected Galvanic Skin Response (GSR) signal features and SVM is proposed in this paper. GSR signals were acquired by e-Health Sensor Platform V2.0. Then, the data is de-noised by wavelet function and normalized to get rid of the individual difference. 30 features are extracted from the normalized data, however, directly using of thes…

    Submitted 3 July, 2023; originally announced July 2023.

  48. arXiv:2306.07532  [pdf, other]

    cs.CV

    Referring Camouflaged Object Detection

    Authors: Xuying Zhang, Bowen Yin, Zheng Lin, Qibin Hou, Deng-Ping Fan, Ming-Ming Cheng

    Abstract: We consider the problem of referring camouflaged object detection (Ref-COD), a new task that aims to segment specified camouflaged objects based on a small set of referring images with salient target objects. We first assemble a large-scale dataset, called R2C7K, which consists of 7K images covering 64 object categories in real-world scenarios. Then, we develop a simple but strong dual-branch fram…

    Submitted 11 July, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

  49. arXiv:2306.03497  [pdf, other]

    cs.CV

    Instructive Feature Enhancement for Dichotomous Medical Image Segmentation

    Authors: Lian Liu, Han Zhou, Jiongquan Chen, Sijing Liu, Wenlong Shi, Dong Ni, Deng-Ping Fan, Xin Yang

    Abstract: Deep neural networks have been widely applied in dichotomous medical image segmentation (DMIS) of many anatomical structures in several modalities, achieving promising performance. However, existing networks tend to struggle with task-specific, heavy and complex designs to improve accuracy. They made little instructions to which feature channels would be more beneficial for segmentation, and that…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by MICCAI 2023

  50. arXiv:2305.18497  [pdf, other]

    cs.LG

    Collaborative Learning via Prediction Consensus

    Authors: Dongyang Fan, Celestine Mendler-Dünner, Martin Jaggi

    Abstract: We consider a collaborative learning setting where the goal of each agent is to improve their own model by leveraging the expertise of collaborators, in addition to their own training data. To facilitate the exchange of expertise among agents, we propose a distillation-based method leveraging shared unlabeled auxiliary data, which is pseudo-labeled by the collective. Central to our method is a tru…

    Submitted 14 November, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: Accepted to the 37th Conference on Neural Information Processing Systems (NeurIPS 2023)