AD-Aligning: Emulating Human-like Generalization for Cognitive Domain Adaptation in Deep Learning

Authors: Zhuoying Li, Bohua Wan, Cong Mu, Ruzhang Zhao, Shushan Qiu, Chao Yan

Abstract: Domain adaptation is pivotal for enabling deep learning models to generalize across diverse domains, a task complicated by variations in presentation and cognitive nuances. In this paper, we introduce AD-Aligning, a novel approach that combines adversarial training with source-target domain alignment to enhance generalization capabilities. By pretraining with Coral loss and standard loss, AD-Aligning aligns target domain statistics with those of the pretrained encoder, preserving robustness while accommodating domain shifts. Through extensive experiments on diverse datasets and domain shift scenarios, including noise-induced shifts and cognitive domain adaptation tasks, we demonstrate AD-Aligning's superior performance compared to existing methods such as Deep Coral and ADDA. Our findings highlight AD-Aligning's ability to emulate the nuanced cognitive processes inherent in human perception, making it a promising solution for real-world applications requiring adaptable and robust domain adaptation strategies.

Submitted 21 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted by 2024 5th International Conference on Electronic Communication and Artificial Intelligence

arXiv:2403.08758 [pdf]

Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI

Authors: Shihan Qiu, Shaoyan Pan, Yikang Liu, Lin Zhao, Jian Xu, Qi Liu, Terrence Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun

Abstract: Current deep learning reconstruction for accelerated cardiac cine MRI suffers from spatial and temporal blurring. We aim to improve image sharpness and motion delineation for cine MRI under high undersampling rates. A spatiotemporal diffusion enhancement model conditional on an existing deep learning reconstruction along with a novel paired sampling strategy was developed. The diffusion model provided sharper tissue boundaries and clearer motion than the original reconstruction in expert evaluation on clinical data. The innovative paired sampling strategy substantially reduced artificial noise in the generative results.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.08749 [pdf]

Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI

Authors: Shihan Qiu, Shaoyan Pan, Yikang Liu, Lin Zhao, Jian Xu, Qi Liu, Terrence Chen, Eric Z. Chen, Xiao Chen, Shanhui Sun

Abstract: The currently limited quality of accelerated cardiac cine reconstruction may potentially be improved by the emerging diffusion models, but the clinically unacceptably long processing time poses a challenge. We aim to develop a clinically feasible diffusion-model-based reconstruction pipeline to improve the image quality of cine MRI. A multi-in multi-out diffusion enhancement model together with fast inference strategies were developed to be used in conjunction with a reconstruction model. The diffusion reconstruction reduced spatial and temporal blurring in prospectively undersampled clinical data, as validated by expert inspection. The 1.5 s per video processing time enabled the approach to be applied in clinical scenarios.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2401.10544 [pdf, other]

AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks

Authors: Yun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang

Abstract: Recently, Transformers have been introduced into the field of acoustics recognition. They are pre-trained on large-scale datasets using methods such as supervised learning and semi-supervised learning, demonstrating robust generality: they fine-tune easily to downstream tasks and show more robust performance. However, the predominant fine-tuning method currently used is still full fine-tuning, which involves updating all parameters during training. This not only incurs significant memory usage and time costs but also compromises the model's generality. Other fine-tuning methods either struggle to address this issue or fail to achieve matching performance. Therefore, we conducted a comprehensive analysis of existing fine-tuning methods and proposed an efficient fine-tuning approach based on Adapter tuning, namely AAT. The core idea is to freeze the audio Transformer model and insert extra learnable Adapters, efficiently acquiring downstream task knowledge without compromising the model's original generality. Extensive experiments have shown that our method achieves performance comparable to or even superior to full fine-tuning while optimizing only 7.118% of the parameters. It also demonstrates superiority over other fine-tuning methods.

Submitted 19 January, 2024; originally announced January 2024.

Comments: Preprint version for ICASSP 2024, Korea

arXiv:2310.06109 [pdf, other]

QR-Tag: Angular Measurement and Tracking with a QR-Design Marker

Authors: Simeng Qiu, Hadi Amata, Wolfgang Heidrich

Abstract: Directional information measurement has many applications in domains such as robotics, virtual and augmented reality, and industrial computer vision. Conventional methods either require pre-calibration or necessitate controlled environments. The state-of-the-art MoireTag approach exploits the Moire effect and QR-design to continuously track the angular shift precisely. However, it is still not a fully QR code design. To overcome the above challenges, we propose a novel snapshot method for discrete angular measurement and tracking with scannable QR-design patterns that are generated by binary structures printed on both sides of a glass plate. The QR codes, resulting from the parallax effect due to the geometry alignment between two layers, can be readily measured as angular information using a phone camera. The simulation results show that the proposed non-contact object tracking framework is computationally efficient with high accuracy.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2306.11837 [pdf, other]

Brain Anatomy Prior Modeling to Forecast Clinical Progression of Cognitive Impairment with Structural MRI

Authors: Lintao Zhang, **jian Wu, Lihong Wang, Li Wang, David C. Steffens, Shijun Qiu, Guy G. Potter, Mingxia Liu

Abstract: Brain structural MRI has been widely used to assess the future progression of cognitive impairment (CI). Previous learning-based studies usually suffer from the issue of small-sized labeled training data, while there exist a huge amount of structural MRIs in large-scale public databases. Intuitively, brain anatomical structures derived from these public MRIs (even without task-specific label information) can be used to boost CI progression trajectory prediction. However, previous studies seldom take advantage of such brain anatomy prior. To this end, this paper proposes a brain anatomy prior modeling (BAPM) framework to forecast the clinical progression of cognitive impairment with small-sized target MRIs by exploring anatomical brain structures. Specifically, the BAPM consists of a pretext model and a downstream model, with a shared brain anatomy-guided encoder to model brain anatomy prior explicitly. Besides the encoder, the pretext model also contains two decoders for two auxiliary tasks (i.e., MRI reconstruction and brain tissue segmentation), while the downstream model relies on a predictor for classification. The brain anatomy-guided encoder is pre-trained with the pretext model on 9,344 auxiliary MRIs without diagnostic labels for anatomy prior modeling. With this encoder frozen, the downstream model is then fine-tuned on limited target MRIs for prediction. We validate the BAPM on two CI-related studies with T1-weighted MRIs from 448 subjects. Experimental results suggest the effectiveness of BAPM in (1) four CI progression prediction tasks, (2) MR image reconstruction, and (3) brain tissue segmentation, compared with several state-of-the-art methods.

Submitted 26 June, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

arXiv:2301.02367 [pdf]

doi 10.1109/TMI.2023.3261346

MR Elastography with Optimization-Based Phase Unwrapping and Traveling Wave Expansion-based Neural Network (TWENN)

Authors: Shengyuan Ma, Runke Wang, Suhao Qiu, Ruokun Li, Qi Yue, Qingfang Sun, Liang Chen, Fuhua Yan, Guang-Zhong Yang, Yuan Feng

Abstract: Magnetic Resonance Elastography (MRE) can characterize biomechanical properties of soft tissue for disease diagnosis and treatment planning. However, complicated wavefields acquired from MRE coupled with noise pose challenges for accurate displacement extraction and modulus estimation. Here we propose a pipeline for processing MRE images using optimization-based displacement extraction and Traveling Wave Expansion-based Neural Network (TWENN) modulus estimation. Phase unwrapping and displacement extraction were achieved by optimization of an objective function with Dual Data Consistency (Dual-DC). A complex-valued neural network using displacement covariance as input has been constructed for the estimation of complex wavenumbers. A model of traveling wave expansion is used to generate training datasets with different levels of noise for the network. The complex shear modulus map is obtained by a fusion of multifrequency and multidirectional data. Validation using images of brain and liver simulation demonstrates the practical value of the proposed pipeline, which can estimate the biomechanical properties with minimum root-mean-square-errors compared with state-of-the-art methods. Applications of the proposed method for processing MRE images of phantom, brain, and liver show clear anatomical features and that the pipeline is robust to noise and has a good generalization capability.

Submitted 4 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

arXiv:2208.10933 [pdf]

doi 10.1021/acsnano.2c06432

Large-Scale Integrated Flexible Tactile Sensor Array for Sensitive Smart Robotic Touch

Authors: Zhenxuan Zhao, Jianshi Tang, Jian Yuan, Yijun Li, Yuan Dai, Jian Yao, Qingtian Zhang, Sanchuan Ding, Tingyu Li, Ruirui Zhang, Yu Zheng, Zhengyou Zhang, Song Qiu, Qingwen Li, Bin Gao, Ning Deng, He Qian, Fei Xing, Zheng You, Huaqiang Wu

Abstract: In the long pursuit of smart robotics, it has been envisioned to empower robots with human-like senses, especially vision and touch. While tremendous progress has been made in image sensors and computer vision over the past decades, the tactile sense abilities are lagging behind due to the lack of large-scale flexible tactile sensor array with high sensitivity, high spatial resolution, and fast response. In this work, we have demonstrated a 64x64 flexible tactile sensor array with a record-high spatial resolution of 0.9 mm (equivalently 28.2 pixels per inch), by integrating a high-performance piezoresistive film (PRF) with a large-area active matrix of carbon nanotube thin-film transistors. PRF with self-formed microstructures exhibited high pressure-sensitivity of ~385 kPa⁻¹ for MWCNTs concentration of 6%, while the 14% one exhibited fast response time of ~3 ms, good linearity, broad detection range beyond 1400 kPa, and excellent cyclability over 3000 cycles. Using this fully integrated tactile sensor array, the footprint maps of an artificial honeybee were clearly identified. Furthermore, we hardware-implemented a smart tactile system by integrating the PRF-based sensor array with a memristor-based computing-in-memory chip to record and recognize handwritten digits and Chinese calligraphy, achieving high classification accuracies of 98.8% and 97.3% in hardware, respectively. The integration of sensor networks with deep learning hardware may enable edge or near-sensor computing with significantly reduced power consumption and latency. Our work could pave the road to building large-scale intelligent sensor networks for next-generation smart robotics.

Submitted 3 November, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Correction in Methods: The weight ratio of TPU:DMF was set to be 1:5

Journal ref: ACS Nano 2022, 16, 16784

arXiv:2206.05279 [pdf, other]

PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework

Authors: Ning Kang, Shanzhao Qiu, Shifeng Zhang, Zhenguo Li, Shutao Xia

Abstract: Generative model based image lossless compression algorithms have seen a great success in improving compression ratio. However, the throughput for most of them is less than 1 MB/s even with the most advanced AI accelerated chips, preventing them from most real-world applications, which often require 100 MB/s. In this paper, we propose PILC, an end-to-end image lossless compression framework that achieves 200 MB/s for both compression and decompression with a single NVIDIA Tesla V100 GPU, 10 times faster than the most efficient one before. To obtain this result, we first develop an AI codec that combines auto-regressive model and VQ-VAE which performs well in lightweight setting, then we design a low complexity entropy coder that works well with our codec. Experiments show that our framework compresses better than PNG by a margin of 30% in multiple datasets. We believe this is an important step to bring AI compression forward to commercial use.

Submitted 9 June, 2022; originally announced June 2022.

arXiv:1911.12885 [pdf, other]

Geometric Back-projection Network for Point Cloud Classification

Authors: Shi Qiu, Saeed Anwar, Nick Barnes

Abstract: As the basic task of point cloud analysis, classification is fundamental but always challenging. To address some unsolved problems of existing methods, we propose a network that captures geometric features of point clouds for better representations. To achieve this, on the one hand, we enrich the geometric information of points in low-level 3D space explicitly. On the other hand, we apply CNN-based structures in high-level feature spaces to learn local geometric context implicitly. Specifically, we leverage an idea of error-correcting feedback structure to capture the local features of point clouds comprehensively. Furthermore, an attention module based on channel affinity assists the feature map to avoid possible redundancy by emphasizing its distinct channels. The performance on both synthetic and real-world point cloud datasets demonstrates the superiority and applicability of our network. Compared with other state-of-the-art methods, our approach balances accuracy and efficiency.

Submitted 13 April, 2021; v1 submitted 28 November, 2019; originally announced November 2019.

Comments: 14 pages, 8 figures

arXiv:1908.06553 [pdf]

LabelECG: A Web-based Tool for Distributed Electrocardiogram Annotation

Authors: Zijian Ding, Shan Qiu, Yutong Guo, Jian** Lin, Li Sun, Dapeng Fu, Zhen Yang, Chengquan Li, Yang Yu, Long Meng, Tingting Lv, Dan Li, ** Zhang

Abstract: Electrocardiography plays an essential role in diagnosing and screening cardiovascular diseases in daily healthcare. Deep neural networks have shown the potential to improve the accuracy of arrhythmia detection based on electrocardiograms (ECGs). However, more ECG records with ground truth are needed to promote the development and progression of deep learning techniques in automatic ECG analysis. Here we propose a web-based tool for ECG viewing and annotating, LabelECG. With the facilitation of unified data management, LabelECG is able to distribute large cohorts of ECGs to dozens of technicians and physicians, who can simultaneously make annotations through web-browsers on PCs, tablets and cell phones. Along with the doctors from four hospitals in China, we applied LabelECG to support the annotations of about 15,000 12-lead resting ECG records in three months. These annotated ECGs have successfully supported the First China ECG intelligent Competition. LabelECG will be freely accessible on the Internet to support similar research, and will also be upgraded through future work.

Submitted 18 August, 2019; originally announced August 2019.

arXiv:1908.00410 [pdf, other]

Pathological Myopic Image Analysis with Transfer Learning

Authors: Ruitao Xie, Libo Liu, **gxin Liu, Connor S Qiu

Abstract: We present a summary of transfer learning based methods for several challenging myopic fundus image analysis tasks including classification of pathological and non-pathological myopia, localisation of fovea, and segmentation of optic disc. By adapting existing popular deep learning architectures, our proposed methods have achieved 1st and 2nd place in several tasks at the Pathologic Myopia Challenge held at ISBI 2019.

Submitted 31 July, 2019; originally announced August 2019.

Comments: MIDL 2019 [arXiv:1907.08612]

Report number: MIDL/2019/ExtendedAbstract/BkeLp6mTFE

Showing 1–12 of 12 results for author: Qiu, S