Showing 1–48 of 48 results for author: Dou, H

Searching in archive cs.
  1. arXiv:2407.01921  [pdf, other]

    cs.CV

    GVDIFF: Grounded Text-to-Video Generation with Diffusion Models

    Authors: Huanzhang Dou, Ruixiang Li, Wei Su, Xi Li

    Abstract: In text-to-video (T2V) generation, significant attention has been directed toward its development, yet unifying discrete and continuous grounding conditions in T2V generation remains under-explored. This paper proposes a Grounded text-to-Video generation framework, termed GVDIFF. First, we inject the grounding condition into the self-attention through an uncertainty-based representation to explici…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.18048  [pdf, other]

    cs.CV

    ScanFormer: Referring Expression Comprehension by Iteratively Scanning

    Authors: Wei Su, Peihan Miao, Huanzhang Dou, Xi Li

    Abstract: Referring Expression Comprehension (REC) aims to localize the target objects specified by free-form natural language descriptions in images. While state-of-the-art methods achieve impressive performance, they perform a dense perception of images, which incorporates redundant visual regions unrelated to linguistic queries, leading to additional computational overhead. This inspires us to explore a…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by CVPR2024

  3. arXiv:2406.14098  [pdf, ps, other]

    cs.CV

    HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models

    Authors: Xinrui Zhou, Yuhao Huang, Wufeng Xue, Haoran Dou, Jun Cheng, Han Zhou, Dong Ni

    Abstract: Echocardiography (ECHO) video is widely used for cardiac examination. In clinical, this procedure heavily relies on operator experience, which needs years of training and maybe the assistance of deep learning-based systems for enhanced accuracy and efficiency. However, it is challenging since acquiring sufficient customized data (e.g., abnormal cases) for novice training and deep model development…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted by MICCAI 2024

  4. arXiv:2406.10673  [pdf, other]

    cs.CV

    SemanticMIM: Marring Masked Image Modeling with Semantics Compression for General Visual Representation

    Authors: Yike Yuan, Huanzhang Dou, Fengjun Guo, Xi Li

    Abstract: This paper represents a neat yet effective framework, named SemanticMIM, to integrate the advantages of masked image modeling (MIM) and contrastive learning (CL) for general visual representation. We conduct a thorough comparative analysis between CL and MIM, revealing that their complementary advantages fundamentally stem from two distinct phases, i.e., compression and reconstruction. Specificall…

    Submitted 15 June, 2024; originally announced June 2024.

  5. arXiv:2404.18106  [pdf, other]

    cs.CV

    Semi-supervised Text-based Person Search

    Authors: Daming Gao, Yang Bai, Min Cao, Hao Dou, Mang Ye, Min Zhang

    Abstract: Text-based person search (TBPS) aims to retrieve images of a specific person from a large image gallery based on a natural language description. Existing methods rely on massive annotated image-text data to achieve satisfactory performance in fully-supervised learning. It poses a significant challenge in practice, as acquiring person images from surveillance videos is relatively easy, while obtain…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages

  6. arXiv:2404.01121  [pdf, other]

    cs.CV eess.IV

    CMT: Cross Modulation Transformer with Hybrid Loss for Pansharpening

    Authors: Wen-Jie Shu, Hong-Xia Dou, Rui Wen, Xiao Wu, Liang-Jian Deng

    Abstract: Pansharpening aims to enhance remote sensing image (RSI) quality by merging high-resolution panchromatic (PAN) with multispectral (MS) images. However, prior techniques struggled to optimally fuse PAN and MS images for enhanced spatial and spectral information, due to a lack of a systematic framework capable of effectively coordinating their individual strengths. In response, we present the Cross…

    Submitted 1 April, 2024; originally announced April 2024.

  7. arXiv:2402.15239  [pdf, other]

    cs.CV cs.LG

    GS-EMA: Integrating Gradient Surgery Exponential Moving Average with Boundary-Aware Contrastive Learning for Enhanced Domain Generalization in Aneurysm Segmentation

    Authors: Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Nina Cheng, Nishant Ravikumar, Alejandro F. Frangi

    Abstract: The automated segmentation of cerebral aneurysms is pivotal for accurate diagnosis and treatment planning. Confronted with significant domain shifts and class imbalance in 3D Rotational Angiography (3DRA) data from various medical institutions, the task becomes challenging. These shifts include differences in image appearance, intensity distribution, resolution, and aneurysm size, all of which com…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted by ISBI 2024

  8. arXiv:2402.15237  [pdf, other]

    cs.CV cs.LG

    Unsupervised Domain Adaptation for Brain Vessel Segmentation through Transwarp Contrastive Learning

    Authors: Fengming Lin, Yan Xia, Michael MacRaild, Yash Deo, Haoran Dou, Qiongyao Liu, Kun Wu, Nishant Ravikumar, Alejandro F. Frangi

    Abstract: Unsupervised domain adaptation (UDA) aims to align the labelled source distribution with the unlabelled target distribution to obtain domain-invariant predictive models. Since cross-modality medical data exhibit significant intra and inter-domain shifts and most are unlabelled, UDA is more important while challenging in medical image analysis. This paper proposes a simple yet potent contrastive le…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted by ISBI 2024

  9. arXiv:2312.05541  [pdf, other]

    cs.CV

    DPoser: Diffusion Model as Robust 3D Human Pose Prior

    Authors: Junzhe Lu, Jing Lin, Hongkun Dou, Ailing Zeng, Yue Deng, Yulun Zhang, Haoqian Wang

    Abstract: This work targets to construct a robust human pose prior. However, it remains a persistent challenge due to biomechanical constraints and diverse human movements. Traditional priors like VAEs and NDFs often exhibit shortcomings in realism and generalization, notably with unseen noisy poses. To address these issues, we introduce DPoser, a robust and versatile human pose prior built upon diffusion m…

    Submitted 23 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: Project Page: https://dposer.github.io; Code Released: https://github.com/moonbow721/DPoser

  10. arXiv:2308.12861  [pdf, other]

    eess.IV cs.CV

    Learned Local Attention Maps for Synthesising Vessel Segmentations

    Authors: Yash Deo, Rodrigo Bonazzola, Haoran Dou, Yan Xia, Tianyou Wei, Nishant Ravikumar, Alejandro F. Frangi, Toni Lassila

    Abstract: Magnetic resonance angiography (MRA) is an imaging modality for visualising blood vessels. It is useful for several diagnostic applications and for assessing the risk of adverse events such as haemorrhagic stroke (resulting from the rupture of aneurysms in blood vessels). However, MRAs are not acquired routinely, hence, an approach to synthesise blood vessel segmentations from more routinely acqui…

    Submitted 24 August, 2023; originally announced August 2023.

  11. arXiv:2308.06781  [pdf, other]

    eess.IV cs.CV

    Shape-guided Conditional Latent Diffusion Models for Synthesising Brain Vasculature

    Authors: Yash Deo, Haoran Dou, Nishant Ravikumar, Alejandro F. Frangi, Toni Lassila

    Abstract: The Circle of Willis (CoW) is the part of cerebral vasculature responsible for delivering blood to the brain. Understanding the diverse anatomical variations and configurations of the CoW is paramount to advance research on cerebrovascular diseases and refine clinical interventions. However, comprehensive investigation of less prevalent CoW variations remains challenging because of the dominance o…

    Submitted 13 August, 2023; originally announced August 2023.

  12. arXiv:2307.00885  [pdf, other]

    eess.IV cs.CV

    An Explainable Deep Framework: Towards Task-Specific Fusion for Multi-to-One MRI Synthesis

    Authors: Luyi Han, Tianyu Zhang, Yunzhi Huang, Haoran Dou, Xin Wang, Yuan Gao, Chunyao Lu, Tan Tao, Ritse Mann

    Abstract: Multi-sequence MRI is valuable in clinical settings for reliable diagnosis and treatment prognosis, but some sequences may be unusable or missing for various reasons. To address this issue, MRI synthesis is a potential solution. Recent deep learning-based methods have achieved good performance in combining multiple available sequences for missing sequence synthesis. Despite their success, these me…

    Submitted 3 July, 2023; originally announced July 2023.

  13. arXiv:2306.14687  [pdf, other]

    eess.IV cs.CV

    GSMorph: Gradient Surgery for cine-MRI Cardiac Deformable Registration

    Authors: Haoran Dou, Ning Bi, Luyi Han, Yuhao Huang, Ritse Mann, Xin Yang, Dong Ni, Nishant Ravikumar, Alejandro F. Frangi, Yunzhi Huang

    Abstract: Deep learning-based deformable registration methods have been widely investigated in diverse medical applications. Learning-based deformable registration relies on weighted objective functions trading off registration accuracy and smoothness of the deformation field. Therefore, they inevitably require tuning the hyperparameter for optimal registration performance. Tuning the hyperparameters is hig…

    Submitted 20 July, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted at MICCAI 2023

  14. arXiv:2306.14680  [pdf, other]

    eess.IV cs.CV cs.LG

    A Conditional Flow Variational Autoencoder for Controllable Synthesis of Virtual Populations of Anatomy

    Authors: Haoran Dou, Nishant Ravikumar, Alejandro F. Frangi

    Abstract: The generation of virtual populations (VPs) of anatomy is essential for conducting in silico trials of medical devices. Typically, the generated VP should capture sufficient variability while remaining plausible and should reflect the specific characteristics and demographics of the patients observed in real populations. In several applications, it is desirable to synthesise virtual populations in…

    Submitted 28 July, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

    Comments: Accepted at MICCAI 2023

  15. arXiv:2306.04652  [pdf, other]

    cs.CV

    Language Adaptive Weight Generation for Multi-task Visual Grounding

    Authors: Wei Su, Peihan Miao, Huanzhang Dou, Gaoang Wang, Liang Qiao, Zheyang Li, Xi Li

    Abstract: Although the impressive performance in visual grounding, the prevailing approaches usually exploit the visual backbone in a passive way, i.e., the visual backbone extracts features with fixed weights without expression-related hints. The passive perception may lead to mismatches (e.g., redundant and missing), limiting further performance improvement. Ideally, the visual backbone should actively ex…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by CVPR2023

  16. arXiv:2306.04650  [pdf, other]

    cs.CV

    GaitMPL: Gait Recognition with Memory-Augmented Progressive Learning

    Authors: Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lin Dong, Zequn Qin, Xi Li

    Abstract: Gait recognition aims at identifying the pedestrians at a long distance by their biometric gait patterns. It is inherently challenging due to the various covariates and the properties of silhouettes (textureless and colorless), which result in two kinds of pair-wise hard samples: the same pedestrian could have distinct silhouettes (intra-class diversity) and different pedestrians could have simila…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by TIP2022

  17. arXiv:2306.04451  [pdf, other]

    cs.CV

    Referring Expression Comprehension Using Language Adaptive Inference

    Authors: Wei Su, Peihan Miao, Huanzhang Dou, Yongjian Fu, Xi Li

    Abstract: Different from universal object detection, referring expression comprehension (REC) aims to locate specific objects referred to by natural language expressions. The expression provides high-level concepts of relevant visual and contextual patterns, which vary significantly with different expressions and account for only a few of those encoded in the REC model. This leads us to a question: do we re…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by AAAI2023

  18. arXiv:2306.03445  [pdf, other]

    cs.CV cs.AI

    MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition

    Authors: Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Xi Li

    Abstract: Gait recognition, which aims at identifying individuals by their walking patterns, has recently drawn increasing research attention. However, gait recognition still suffers from the conflicts between the limited binary visual clues of the silhouette and numerous covariates with diverse scales, which brings challenges to the model's adaptiveness. In this paper, we address this conflict by developin…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by ECCV2022

  19. arXiv:2306.03428  [pdf, other]

    cs.CV

    GaitGCI: Generative Counterfactual Intervention for Gait Recognition

    Authors: Huanzhang Dou, Pengyi Zhang, Wei Su, Yunlong Yu, Yining Lin, Xi Li

    Abstract: Gait is one of the most promising biometrics that aims to identify pedestrians from their walking patterns. However, prevailing methods are susceptible to confounders, resulting in the networks hardly focusing on the regions that reflect effective walking patterns. To address this fundamental problem in gait recognition, we propose a Generative Counterfactual Intervention framework, dubbed GaitGCI…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by CVPR2023

  20. arXiv:2306.02544  [pdf, other]

    cs.CV

    Fourier Test-time Adaptation with Multi-level Consistency for Robust Classification

    Authors: Yuhao Huang, Xin Yang, Xiaoqiong Huang, Xinrui Zhou, Haozhe Chi, Haoran Dou, Xindi Hu, Jian Wang, Xuedong Deng, Dong Ni

    Abstract: Deep classifiers may encounter significant performance degradation when processing unseen testing data from varying centers, vendors, and protocols. Ensuring the robustness of deep models against these domain shifts is crucial for their widespread clinical application. In this study, we propose a novel approach called Fourier Test-time Adaptation (FTTA), which employs a dual-adaptation design to i…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Accepted by MICCAI 2023

  21. arXiv:2305.08328  [pdf, other]

    cs.IR cs.LG

    FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning

    Authors: Penghui Wei, Hongjian Dou, Shaoguo Liu, Rongjun Tang, Li Liu, Liang Wang, Bo Zheng

    Abstract: Conversion rate (CVR) estimation aims to predict the probability of conversion event after a user has clicked an ad. Typically, online publisher has user browsing interests and click feedbacks, while demand-side advertising platform collects users' post-click behaviors such as dwell time and conversion decisions. To estimate CVR accurately and protect data privacy better, vertical federated learni…

    Submitted 14 May, 2023; originally announced May 2023.

    Comments: SIGIR 2023, Resource Track

  22. arXiv:2304.10051  [pdf, other]

    cs.LG

    HyperTuner: A Cross-Layer Multi-Objective Hyperparameter Auto-Tuning Framework for Data Analytic Services

    Authors: Hui Dou, Shanshan Zhu, Yiwen Zhang, Pengfei Chen, Zibin Zheng

    Abstract: Hyper-parameters optimization (HPO) is vital for machine learning models. Besides model accuracy, other tuning intentions such as model training time and energy consumption are also worthy of attention from data analytic service providers. Hence, it is essential to take both model hyperparameters and system parameters into consideration to execute cross-layer multi-objective hyperparameter auto-tu…

    Submitted 19 April, 2023; originally announced April 2023.

  23. arXiv:2210.01607  [pdf, other]

    eess.IV cs.CV cs.LG

    A Generative Shape Compositional Framework to Synthesise Populations of Virtual Chimaeras

    Authors: Haoran Dou, Seppo Virtanen, Nishant Ravikumar, Alejandro F. Frangi

    Abstract: Generating virtual populations of anatomy that capture sufficient variability while remaining plausible is essential for conducting in-silico trials of medical devices. However, not all anatomical shapes of interest are always available for each individual in a population. Hence, missing/partially-overlapping anatomical information is often available across individuals in a population. We introduc…

    Submitted 4 March, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: 15 pages, 4 figures, 4 tables. Accepted by IEEE Transactions on Neural Networks and Learning Systems

  24. arXiv:2207.00476  [pdf, other]

    cs.CV cs.AI cs.LG

    Online Reflective Learning for Robust Medical Image Segmentation

    Authors: Yuhao Huang, Xin Yang, Xiaoqiong Huang, Jiamin Liang, Xinrui Zhou, Cheng Chen, Haoran Dou, Xindi Hu, Yan Cao, Dong Ni

    Abstract: Deep segmentation models often face the failure risks when the testing image presents unseen distributions. Improving model robustness against these risks is crucial for the large-scale clinical application of deep models. In this study, inspired by human learning cycle, we propose a novel online reflective learning framework (RefSeg) to improve segmentation robustness. Based on the reflection-on-…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Accepted by MICCAI 2022

  25. arXiv:2207.00475  [pdf, other]

    cs.CV cs.AI cs.LG eess.IV

    Agent with Tangent-based Formulation and Anatomical Perception for Standard Plane Localization in 3D Ultrasound

    Authors: Yuxin Zou, Haoran Dou, Yuhao Huang, Xin Yang, Jikuan Qian, Chaojiong Zhen, Xiaodan Ji, Nishant Ravikumar, Guoqiang Chen, Weijun Huang, Alejandro F. Frangi, Dong Ni

    Abstract: Standard plane (SP) localization is essential in routine clinical ultrasound (US) diagnosis. Compared to 2D US, 3D US can acquire multiple view planes in one scan and provide complete anatomy with the addition of coronal plane. However, manually navigating SPs in 3D US is laborious and biased due to the orientation variability and huge search space. In this study, we introduce a novel reinforcemen…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Accepted by MICCAI 2022

  26. arXiv:2206.15254  [pdf, other]

    eess.IV cs.CV

    Localizing the Recurrent Laryngeal Nerve via Ultrasound with a Bayesian Shape Framework

    Authors: Haoran Dou, Luyi Han, Yushuang He, Jun Xu, Nishant Ravikumar, Ritse Mann, Alejandro F. Frangi, Pew-Thian Yap, Yunzhi Huang

    Abstract: Tumor infiltration of the recurrent laryngeal nerve (RLN) is a contraindication for robotic thyroidectomy and can be difficult to detect via standard laryngoscopy. Ultrasound (US) is a viable alternative for RLN detection due to its safety and ability to provide real-time feedback. However, the tininess of the RLN, with a diameter typically less than 3mm, poses significant challenges to the accura…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Early Accepted by MICCAI 2022

  27. arXiv:2110.15027  [pdf, other]

    eess.IV cs.CV

    Deformable Registration of Brain MR Images via a Hybrid Loss

    Authors: Luyi Han, Haoran Dou, Yunzhi Huang, Pew-Thian Yap

    Abstract: Unsupervised learning strategy is widely adopted by the deformable registration models due to the lack of ground truth of deformation fields. These models typically depend on the intensity-based similarity loss to obtain the learning convergence. Despite the success, such dependence is insufficient. For the deformable registration of mono-modality image, well-aligned two images not only have indis…

    Submitted 19 December, 2021; v1 submitted 28 October, 2021; originally announced October 2021.

    Comments: Ranked fifth on the brain T1w deformable registration task organized by the MICCAI 2021 Learn2Reg challenge

  28. arXiv:2108.00752  [pdf, other]

    cs.CV cs.AI cs.LG cs.MA

    Flip Learning: Erase to Segment

    Authors: Yuhao Huang, Xin Yang, Yuxin Zou, Chaoyu Chen, Jian Wang, Haoran Dou, Nishant Ravikumar, Alejandro F Frangi, Jianqiao Zhou, Dong Ni

    Abstract: Nodule segmentation from breast ultrasound images is challenging yet essential for the diagnosis. Weakly-supervised segmentation (WSS) can help reduce time-consuming and cumbersome manual annotation. Unlike existing weakly-supervised approaches, in this study, we propose a novel and general WSS framework called Flip Learning, which only needs the box annotation. Specifically, the target in the lab…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: Accepted by MICCAI 2021

  29. arXiv:2105.14421   

    cs.CV

    VersatileGait: A Large-Scale Synthetic Gait Dataset Towards in-the-Wild Simulation

    Authors: Pengyi Zhang, Huanzhang Dou, Wenhu Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Xi Li

    Abstract: Gait recognition has a rapid development in recent years. However, gait recognition in the wild is not well explored yet. An obvious reason could be ascribed to the lack of diverse training data from the perspective of intrinsic and extrinsic factors. To remedy this problem, we propose to construct a large-scale gait dataset with the help of controllable computer simulation. In detail, to diversif…

    Submitted 31 May, 2021; v1 submitted 29 May, 2021; originally announced May 2021.

    Comments: We should have updated 2101.01394 but we did a new submission

  30. arXiv:2105.13695  [pdf, other]

    cs.CV

    AutoSampling: Search for Effective Data Sampling Schedules

    Authors: Ming Sun, Haoxuan Dou, Baopu Li, Lei Cui, Junjie Yan, Wanli Ouyang

    Abstract: Data sampling acts as a pivotal role in training deep learning models. However, an effective sampling schedule is difficult to learn due to the inherently high dimension of parameters in learning the sampling schedule. In this paper, we propose an AutoSampling method to automatically learn sampling schedules for model training, which consists of the multi-exploitation step aiming for optimal local…

    Submitted 28 May, 2021; originally announced May 2021.

    Comments: AutoML for sampling firstly without any assumption

    Journal ref: ICML 2021

  31. arXiv:2105.10626  [pdf, other]

    cs.CV cs.MA eess.IV

    Searching Collaborative Agents for Multi-plane Localization in 3D Ultrasound

    Authors: Xin Yang, Yuhao Huang, Ruobing Huang, Haoran Dou, Rui Li, Jikuan Qian, Xiaoqiong Huang, Wenlong Shi, Chaoyu Chen, Yuanji Zhang, Haixia Wang, Yi Xiong, Dong Ni

    Abstract: 3D ultrasound (US) has become prevalent due to its rich spatial and diagnostic information not contained in 2D US. Moreover, 3D US can contain multiple standard planes (SPs) in one shot. Thus, automatically localizing SPs in 3D US has the potential to improve user-independence and scanning-efficiency. However, manual SP localization in 3D US is challenging because of the low image quality, huge se…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted by Medical Image Analysis (10 figures, 8 tables)

  32. arXiv:2105.08994  [pdf, other]

    cs.CV

    Efficient Transfer Learning via Joint Adaptation of Network Architecture and Weight

    Authors: Ming Sun, Haoxuan Dou, Junjie Yan

    Abstract: Transfer learning can boost the performance on the target task by leveraging the knowledge of the source domain. Recent works in neural architecture search (NAS), especially one-shot NAS, can aid transfer learning by establishing sufficient network search space. However, existing NAS methods tend to approximate huge search spaces by explicitly building giant super-networks with multiple sub-paths, an…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: NAS is one part of transfer learning

    Journal ref: ECCV 2020

  33. arXiv:2103.14502  [pdf, other]

    eess.IV cs.CV

    Agent with Warm Start and Adaptive Dynamic Termination for Plane Localization in 3D Ultrasound

    Authors: Xin Yang, Haoran Dou, Ruobing Huang, Wufeng Xue, Yuhao Huang, Jikuan Qian, Yuanji Zhang, Huanjia Luo, Huizhi Guo, Tianfu Wang, Yi Xiong, Dong Ni

    Abstract: Accurate standard plane (SP) localization is the fundamental step for prenatal ultrasound (US) diagnosis. Typically, dozens of US SPs are collected to determine the clinical diagnosis. 2D US has to perform scanning for each SP, which is time-consuming and operator-dependent. While 3D US containing multiple SPs in one shot has the inherent advantages of less user-dependency and more efficiency. Aut…

    Submitted 26 March, 2021; originally announced March 2021.

    Comments: Accepted by IEEE Transactions on Medical Imaging (12 pages, 8 figures, 11 tables)

  34. arXiv:2101.01394  [pdf, other]

    cs.CV

    VersatileGait: A Large-Scale Synthetic Gait Dataset with Fine-Grained Attributes and Complicated Scenarios

    Authors: Huanzhang Dou, Wenhu Zhang, Pengyi Zhang, Yuhan Zhao, Songyuan Li, Zequn Qin, Fei Wu, Lin Dong, Xi Li

    Abstract: With the motivation of practical gait recognition applications, we propose to automatically create a large-scale synthetic gait dataset (called VersatileGait) by a game engine, which consists of around one million silhouette sequences of 11,000 subjects with fine-grained attributes in various complicated scenarios. Compared with existing real gait datasets with limited samples and simple scenarios…

    Submitted 5 January, 2021; originally announced January 2021.

  35. arXiv:2010.04928  [pdf, other]

    eess.IV cs.CV cs.LG

    Contrastive Rendering for Ultrasound Image Segmentation

    Authors: Haoming Li, Xin Yang, Jiamin Liang, Wenlong Shi, Chaoyu Chen, Haoran Dou, Rui Li, Rui Gao, Guangquan Zhou, Jinghui Fang, Xiaowen Liang, Ruobing Huang, Alejandro Frangi, Zhiyi Chen, Dong Ni

    Abstract: Ultrasound (US) image segmentation embraced its significant improvement in deep learning era. However, the lack of sharp boundaries in US images still remains an inherent challenge for segmentation. Previous methods often resort to global context, multi-scale cues or auxiliary guidance to estimate the boundaries. It is hard for these methods to approach pixel-level learning for fine-grained bounda…

    Submitted 10 October, 2020; originally announced October 2020.

    Comments: 10 pages, 5 figures, 2 tables, 13 references

  36. arXiv:2009.03098  [pdf, other]

    cs.CV cs.AI

    Progressive Bilateral-Context Driven Model for Post-Processing Person Re-Identification

    Authors: Min Cao, Chen Chen, Hao Dou, Xiyuan Hu, Silong Peng, Arjan Kuijper

    Abstract: Most existing person re-identification methods compute pairwise similarity by extracting robust visual features and learning the discriminative metric. Owing to visual ambiguities, these content-based methods that determine the pairwise relationship only based on the similarity between them, inevitably produce a suboptimal ranking list. Instead, the pairwise similarity can be estimated more accura…

    Submitted 7 September, 2020; originally announced September 2020.

    Journal ref: Transactions on Multimedia 2020

  37. arXiv:2007.15273  [pdf, other]

    cs.CV eess.IV eess.SP

    Searching Collaborative Agents for Multi-plane Localization in 3D Ultrasound

    Authors: Yuhao Huang, Xin Yang, Rui Li, Jikuan Qian, Xiaoqiong Huang, Wenlong Shi, Haoran Dou, Chaoyu Chen, Yuanji Zhang, Huanjia Luo, Alejandro Frangi, Yi Xiong, Dong Ni

    Abstract: 3D ultrasound (US) is widely used due to its rich diagnostic information, portability and low cost. Automated standard plane (SP) localization in US volume not only improves efficiency and reduces user-dependence, but also boosts 3D US interpretation. In this study, we propose a novel Multi-Agent Reinforcement Learning (MARL) framework to localize multiple uterine SPs in 3D US simultaneously. Our…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: Early accepted by MICCAI 2020

  38. arXiv:2005.00306  [pdf, other]

    cs.CV

    PCA-SRGAN: Incremental Orthogonal Projection Discrimination for Face Super-resolution

    Authors: Hao Dou, Chen Chen, Xiyuan Hu, Zuxing Xuan, Zhisen Hu, Silong Peng

    Abstract: Generative Adversarial Networks (GAN) have been employed for face super resolution but they bring distorted facial details easily and still have weakness on recovering realistic texture. To further improve the performance of GAN based models on super-resolving face images, we propose PCA-SRGAN which pays attention to the cumulative discrimination in the orthogonal projection space spanned by PCA p…

    Submitted 28 August, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

  39. arXiv:2004.13567  [pdf, other]

    eess.IV cs.CV

    Hybrid Attention for Automatic Segmentation of Whole Fetal Head in Prenatal Ultrasound Volumes

    Authors: Xin Yang, Xu Wang, Yi Wang, Haoran Dou, Shengli Li, Huaxuan Wen, Yi Lin, Pheng-Ann Heng, Dong Ni

    Abstract: Background and Objective: Biometric measurements of fetal head are important indicators for maternal and fetal health monitoring during pregnancy. 3D ultrasound (US) has unique advantages over 2D scan in covering the whole fetal head and may promote the diagnoses. However, automatically segmenting the whole fetal head in US volumes still pends as an emerging and unsolved problem. The challenges th…

    Submitted 28 April, 2020; originally announced April 2020.

    Comments: Accepted by Computer Methods and Programs in Biomedicine

  40. arXiv:2004.12847  [pdf, other]

    cs.CV eess.IV q-bio.QM

    A Deep Attentive Convolutional Neural Network for Automatic Cortical Plate Segmentation in Fetal MRI

    Authors: Haoran Dou, Davood Karimi, Caitlin K. Rollins, Cynthia M. Ortinau, Lana Vasung, Clemente Velasco-Annis, Abdelhakim Ouaalam, Xin Yang, Dong Ni, Ali Gholipour

    Abstract: Fetal cortical plate segmentation is essential in quantitative analysis of fetal brain maturation and cortical folding. Manual segmentation of the cortical plate, or manual refinement of automatic segmentations is tedious and time-consuming. Automatic segmentation of the cortical plate, on the other hand, is challenged by the relatively low resolution of the reconstructed fetal brain MRI scans com…

    Submitted 2 April, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: Accepted by IEEE Transactions on Medical Imaging

  41. arXiv:2004.00226  [pdf, other]

    eess.IV cs.CV

    Synthesis and Edition of Ultrasound Images via Sketch Guided Progressive Growing GANs

    Authors: Jiamin Liang, Xin Yang, Haoming Li, Yi Wang, Manh The Van, Haoran Dou, Chaoyu Chen, Jinghui Fang, Xiaowen Liang, Zixin Mai, Guowen Zhu, Zhiyi Chen, Dong Ni

    Abstract: Ultrasound (US) is widely accepted in clinic for anatomical structure inspection. However, lacking in resources to practice US scan, novices often struggle to learn the operation skills. Also, in the deep learning era, automated US image analysis is limited by the lack of annotated samples. Efficiently synthesizing realistic, editable and high resolution US images can solve the problems. The task…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: IEEE International Symposium on Biomedical Imaging (IEEE ISBI 2020)

  42. arXiv:2002.05844  [pdf, other]

    eess.IV cs.CV

    Remove Appearance Shift for Ultrasound Image Segmentation via Fast and Universal Style Transfer

    Authors: Zhendong Liu, Xin Yang, Rui Gao, Shengfeng Liu, Haoran Dou, Shuangchi He, Yuhao Huang, Yankai Huang, Huanjia Luo, Yuanji Zhang, Yi Xiong, Dong Ni

    Abstract: Deep Neural Networks (DNNs) suffer from the performance degradation when image appearance shift occurs, especially in ultrasound (US) image segmentation. In this paper, we propose a novel and intuitive framework to remove the appearance shift, and hence improve the generalization ability of DNNs. Our work has three highlights. First, we follow the spirit of universal style transfer to remove appea…

    Submitted 13 February, 2020; originally announced February 2020.

    Comments: IEEE International Symposium on Biomedical Imaging (IEEE ISBI 2020)

  43. arXiv:1912.02911  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Deep learning with noisy labels: exploring techniques and remedies in medical image analysis

    Authors: Davood Karimi, Haoran Dou, Simon K. Warfield, Ali Gholipour

    Abstract: Supervised training of deep learning models requires large labeled datasets. There is a growing interest in obtaining such datasets for medical image analysis applications. However, the impact of label noise has not received sufficient attention. Recent studies have shown that label noise can significantly impact the performance of deep learning models in many machine learning and computer vision…

    Submitted 20 March, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

  44. arXiv:1910.04935  [pdf, other]

    cs.CV cs.LG eess.IV

    FetusMap: Fetal Pose Estimation in 3D Ultrasound

    Authors: Xin Yang, Wenlong Shi, Haoran Dou, Jikuan Qian, Yi Wang, Wufeng Xue, Shengli Li, Dong Ni, Pheng-Ann Heng

    Abstract: The 3D ultrasound (US) entrance inspires a multitude of automated prenatal examinations. However, studies about the structuralized description of the whole fetus in 3D US are still rare. In this paper, we propose to estimate the 3D pose of fetus in US volumes to facilitate its quantitative analyses in global and local scales. Given the great challenges in 3D US, including the high volume dimension…

    Submitted 3 March, 2024; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 9 pages, 6 figures, 2 tables. Accepted by MICCAI 2019

  45. arXiv:1910.04331  [pdf, other]

    eess.IV cs.CV cs.LG

    Agent with Warm Start and Active Termination for Plane Localization in 3D Ultrasound

    Authors: Haoran Dou, Xin Yang, Jikuan Qian, Wufeng Xue, Hao Qin, Xu Wang, Lequan Yu, Shujun Wang, Yi Xiong, Pheng-Ann Heng, Dong Ni

    Abstract: Standard plane localization is crucial for ultrasound (US) diagnosis. In prenatal US, dozens of standard planes are manually acquired with a 2D probe. It is time-consuming and operator-dependent. In comparison, 3D US containing multiple standard planes in one shot has the inherent advantages of less user-dependency and more efficiency. However, manual plane localization in US volume is challenging…

    Submitted 3 March, 2024; v1 submitted 9 October, 2019; originally announced October 2019.

    Comments: 9 pages, 5 figures, 1 table. Accepted by MICCAI 2019 (oral)

  46. arXiv:1909.00186  [pdf, other]

    eess.IV cs.CV

    Joint Segmentation and Landmark Localization of Fetal Femur in Ultrasound Volumes

    Authors: Xu Wang, Xin Yang, Haoran Dou, Shengli Li, Pheng-Ann Heng, Dong Ni

    Abstract: Volumetric ultrasound has great potentials in promoting prenatal examinations. Automated solutions are highly desired to efficiently and effectively analyze the massive volumes. Segmentation and landmark localization are two key techniques in making the quantitative evaluation of prenatal ultrasound volumes available in clinic. However, both tasks are non-trivial when considering the poor image qu…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: Accepted by IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI), 2019

  47. arXiv:1907.01743  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Deep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound

    Authors: Yi Wang, Haoran Dou, Xiaowei Hu, Lei Zhu, Xin Yang, Ming Xu, Jing Qin, Pheng-Ann Heng, Tianfu Wang, Dong Ni

    Abstract: Automatic prostate segmentation in transrectal ultrasound (TRUS) images is of essential importance for image-guided prostate interventions and treatment planning. However, developing such automatic solutions remains very challenging due to the missing/ambiguous boundary and inhomogeneous intensity distribution of the prostate in TRUS, as well as the large variability in prostate shapes. This paper…

    Submitted 3 March, 2024; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: 11 pages, 10 figures, 2 tables. Accepted by IEEE transactions on Medical Imaging

  48. arXiv:1807.11141  [pdf, other]

    cs.IR

    KB4Rec: A Dataset for Linking Knowledge Bases with Recommender Systems

    Authors: Wayne Xin Zhao, Gaole He, Hongjian Dou, Jin Huang, Siqi Ouyang, Ji-Rong Wen

    Abstract: To develop a knowledge-aware recommender system, a key data problem is how we can obtain rich and structured knowledge information for recommender system (RS) items. Existing datasets or methods either use side information from original recommender systems (containing very few kinds of useful information) or utilize private knowledge base (KB). In this paper, we present the first public linked KB…

    Submitted 27 December, 2020; v1 submitted 29 July, 2018; originally announced July 2018.