-
EEG-MACS: Manifold Attention and Confidence Stratification for EEG-based Cross-Center Brain Disease Diagnosis under Unreliable Annotations
Authors:
Zhenxi Song,
Ruihan Qin,
Huixia Ren,
Zhen Liang,
Yi Guo,
Min Zhang,
Zhiguo Zhang
Abstract:
Cross-center data heterogeneity and annotation unreliability significantly challenge the intelligent diagnosis of diseases using brain signals. A notable example is the EEG-based diagnosis of neurodegenerative diseases, which features subtler abnormal neural dynamics typically observed in small-group settings. To advance this area, in this work, we introduce a transferable framework employing Manifold Attention and Confidence Stratification (MACS) to diagnose neurodegenerative disorders based on EEG signals sourced from four centers with unreliable annotations. The MACS framework's effectiveness stems from these features: 1) The Augmentor generates various EEG-represented brain variants to enrich the data space; 2) The Switcher enhances the feature space for trusted samples and reduces overfitting on incorrectly labeled samples; 3) The Encoder uses the Riemannian manifold and Euclidean metrics to capture spatiotemporal variations and dynamic synchronization in EEG; 4) The Projector, equipped with dual heads, monitors consistency across multiple brain variants and ensures diagnostic accuracy; 5) The Stratifier adaptively stratifies learned samples by confidence levels throughout the training process; 6) Forward and backpropagation in MACS are constrained by confidence stratification to stabilize the learning system amid unreliable annotations. Our subject-independent experiments, conducted on both neurocognitive and movement disorders using cross-center corpora, have demonstrated superior performance compared to existing related algorithms. This work not only improves EEG-based diagnostics for cross-center and small-setting brain diseases but also offers insights into extending MACS techniques to other data analyses, tackling data heterogeneity and annotation unreliability in multimedia and multimodal content understanding.
Submitted 29 April, 2024;
originally announced May 2024.
-
Image Fusion in Remote Sensing: An Overview and Meta Analysis
Authors:
Hessah Albanwan,
Rongjun Qin,
Yang Tang
Abstract:
Image fusion in Remote Sensing (RS) has been a consistent demand due to its ability to turn raw images of different resolutions, sources, and modalities into accurate, complete, and spatio-temporally coherent images. It greatly facilitates downstream applications such as pan-sharpening, change detection, land-cover classification, etc. Yet, image fusion solutions are highly disparate to various remote sensing problems and thus are often narrowly defined in existing reviews as topical applications, such as pan-sharpening, and spatial-temporal image fusion. Considering that image fusion can be theoretically applied to any gridded data through pixel-level operations, in this paper, we expanded its scope by comprehensively surveying relevant works with a simple taxonomy: 1) many-to-one image fusion; 2) many-to-many image fusion. This simple taxonomy defines image fusion as a mapping problem that turns either a single or a set of images into another single or set of images, depending on the desired coherence, e.g., spectral, spatial/resolution coherence, etc. We show that this simple taxonomy, despite the significant modality difference it covers, can be presented by a conceptually easy framework. In addition, we provide a meta-analysis to review the major papers studying the various types of image fusion and their applications over the years (from the 1980s to date), covering 5,926 peer-reviewed papers. Finally, we discuss the main benefits and emerging challenges to provide open research directions and potential future works.
Submitted 16 January, 2024;
originally announced January 2024.
-
Hybrid Gate-Pulse Model for Variational Quantum Algorithms
Authors:
Zhiding Liang,
Zhixin Song,
Jinglei Cheng,
Zichang He,
Ji Liu,
Hanrui Wang,
Ruiyang Qin,
Yiru Wang,
Song Han,
Xuehai Qian,
Yiyu Shi
Abstract:
Current quantum programs are mostly synthesized and compiled at the gate level, where quantum circuits are composed of quantum gates. The gate-level workflow, however, introduces significant redundancy when quantum gates are eventually transformed into control signals and applied on quantum devices. For superconducting quantum computers, the control signals are microwave pulses. Therefore, pulse-level optimization has gained more attention from researchers due to its advantages in terms of circuit duration. Recent works, however, are limited by their poor scalability brought by the large parameter space of control signals. In addition, the lack of gate-level "knowledge" also affects the performance of pure pulse-level frameworks. We present a hybrid gate-pulse model that can mitigate these problems. We propose to use gate-level compilation and optimization for the "fixed" part of the quantum circuits and to use pulse-level methods for problem-agnostic parts. Experimental results demonstrate the efficiency of the proposed framework in discrete optimization tasks. We achieve a performance boost of at most 8% with 60% shorter pulse duration in the problem-agnostic layer.
Submitted 1 December, 2022;
originally announced December 2022.
-
Time and Cost-Efficient Bathymetric Mapping System using Sparse Point Cloud Generation and Automatic Object Detection
Authors:
Andres Pulido,
Ruoyao Qin,
Antonio Diaz,
Andrew Ortega,
Peter Ifju,
Jaejeong Shin
Abstract:
Generating 3D point cloud (PC) data from noisy sonar measurements is a problem that has potential applications for bathymetry mapping, artificial object inspection, mapping of aquatic plants and fauna as well as underwater navigation and localization of vehicles such as submarines. Side-scan sonar sensors are available in inexpensive cost ranges, especially in fish-finders, where the transducers are usually mounted to the bottom of a boat and can approach shallower depths than the ones attached to an Uncrewed Underwater Vehicle (UUV) can. However, extracting 3D information from side-scan sonar imagery is a difficult task because of its low signal-to-noise ratio and missing angle and depth information in the imagery. Since most algorithms that generate a 3D point cloud from side-scan sonar imagery use Shape from Shading (SFS) techniques, extracting 3D information is especially difficult when the seafloor is smooth, is slowly changing in depth, or does not have identifiable objects that make acoustic shadows. This paper introduces an efficient algorithm that generates a sparse 3D point cloud from side-scan sonar images. This computation is done in a computationally efficient manner by leveraging the geometry of the first sonar return combined with known positions provided by GPS and down-scan sonar depth measurement at each data point. Additionally, this paper implements another algorithm that uses a Convolutional Neural Network (CNN) using transfer learning to perform object detection on side-scan sonar images collected in real life and generated with a simulation. The algorithm was tested on both real and synthetic images to show reasonably accurate anomaly detection and classification.
Submitted 18 October, 2022;
originally announced October 2022.
-
A Simulation Study of Passing Drivers' Responses to the Autonomous Truck-Mounted Attenuator System in Road Maintenance
Authors:
Yu Li,
Bill Wang,
William Li,
Ruwen Qin
Abstract:
The Autonomous Truck-Mounted Attenuator (ATMA) system is a lead-follower vehicle system based on autonomous driving and connected vehicle technologies. The lead truck performs maintenance tasks on the road, and the unmanned follower truck alerts passing vehicles about the moving work zone and protects workers and the equipment. While the ATMA has been under testing by transportation maintenance and operations agencies recently, a simulator-based testing capability is a supplement, especially if human subjects are involved. This paper aims to discover how passing drivers perceive, understand, and react to the ATMA system in road maintenance. With the driving simulator developed for this ATMA study, the paper performed a simulation study wherein a screen-based eye tracker collected sixteen subjects' gaze points and pupil diameters. Data analysis evidenced the change in subjects' visual attention patterns while passing the ATMA. On average, the ATMA starts to attract subjects' attention from 500 ft behind the follower truck. Most (87.50%) understood the follower truck's protection purpose, and many (60%) reasoned the association between the two trucks. Nevertheless, nearly half of the participants (43.75%) did not recognize that ATMA is a connected autonomous vehicle system. While all subjects safely changed lanes and attempted to pass the slow-moving ATMA, their inadequate understanding of the ATMA is a potential risk, such as cutting into the ATMA. Results implied that transportation maintenance and operations agencies should consider this in establishing the deployment guidance.
Submitted 21 November, 2022; v1 submitted 1 August, 2022;
originally announced August 2022.
-
A Multi-tasking Model of Speaker-Keyword Classification for Keeping Human in the Loop of Drone-assisted Inspection
Authors:
Yu Li,
Anisha Parsan,
Bill Wang,
Penghao Dong,
Shanshan Yao,
Ruwen Qin
Abstract:
Audio commands are a preferred communication medium to keep inspectors in the loop of civil infrastructure inspection performed by a semi-autonomous drone. To understand job-specific commands from a group of heterogeneous and dynamic inspectors, a model must be developed cost-effectively for the group and easily adapted when the group changes. This paper is motivated to build a multi-tasking deep learning model that possesses a Share-Split-Collaborate architecture. This architecture allows the two classification tasks to share the feature extractor and then split subject-specific and keyword-specific features intertwined in the extracted features through feature projection and collaborative training. A base model for a group of five authorized subjects is trained and tested on the inspection keyword dataset collected by this study. The model achieved a 95.3% or higher mean accuracy in classifying the keywords of any authorized inspectors. Its mean accuracy in speaker classification is 99.2%. Due to the richer keyword representations that the model learns from the pooled training data, adapting the base model to a new inspector requires only a little training data from that inspector, like five utterances per keyword. Using the speaker classification scores for inspector verification can achieve a success rate of at least 93.9% in verifying authorized inspectors and 76.1% in detecting unauthorized ones. Further, the paper demonstrates the applicability of the proposed model to larger-size groups on a public dataset. This paper provides a solution to addressing challenges facing AI-assisted human-robot interaction, including worker heterogeneity, worker dynamics, and job heterogeneity.
Submitted 31 October, 2022; v1 submitted 8 July, 2022;
originally announced July 2022.
-
A Review of Landcover Classification with Very-High Resolution Remotely Sensed Optical Images-Analysis Unit, Model Scalability and Transferability
Authors:
Rongjun Qin,
Tao Liu
Abstract:
As an important application in remote sensing, landcover classification remains one of the most challenging tasks in very-high-resolution (VHR) image analysis. As the rapidly increasing number of Deep Learning (DL) based landcover methods and training strategies are claimed to be the state-of-the-art, the already fragmented technical landscape of landcover mapping methods has been further complicated. Although there exists a plethora of literature review work attempting to guide researchers in making an informed choice of landcover mapping methods, the articles either focus on the review of applications in a specific area or revolve around general deep learning models, which lack a systematic view of the ever advancing landcover mapping methods. In addition, issues related to training samples and model transferability have become more critical than ever in an era dominated by data-driven approaches, but these issues were addressed to a lesser extent in previous review articles regarding remote sensing classification. Therefore, in this paper, we present a systematic overview of existing methods by starting from learning methods and varying basic analysis units for landcover mapping tasks, to challenges and solutions on three aspects of scalability and transferability with a remote sensing classification focus including (1) sparsity and imbalance of data; (2) domain gaps across different geographical regions; and (3) multi-source and multi-view fusion. We discuss in detail each of these categorical methods and draw concluding remarks in these developments and recommend potential directions for the continued endeavor.
Submitted 7 February, 2022;
originally announced February 2022.
-
Attention-Based Sensor Fusion for Human Activity Recognition Using IMU Signals
Authors:
Wenjin Tao,
Haodong Chen,
Md Moniruzzaman,
Ming C. Leu,
Zhaozheng Yi,
Ruwen Qin
Abstract:
Human Activity Recognition (HAR) using wearable devices such as smart watches embedded with Inertial Measurement Unit (IMU) sensors has various applications relevant to our daily life, such as workout tracking and health monitoring. In this paper, we propose a novel attention-based approach to human activity recognition using multiple IMU sensors worn at different body locations. Firstly, a sensor-wise feature extraction module is designed to extract the most discriminative features from individual sensors with Convolutional Neural Networks (CNNs). Secondly, an attention-based fusion mechanism is developed to learn the importance of sensors at different body locations and to generate an attentive feature representation. Finally, an inter-sensor feature extraction module is applied to learn the inter-sensor correlations, which are connected to a classifier to output the predicted classes of activities. The proposed approach is evaluated using five public datasets and it outperforms state-of-the-art methods on a wide variety of activity categories.
Submitted 20 December, 2021;
originally announced December 2021.
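The attention-based fusion step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `softmax` and `attention_fuse` are hypothetical, the per-sensor feature vectors are assumed to be already extracted, and the attention scores are passed in as plain numbers, whereas in a trained model they would be learned.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(sensor_features, scores):
    """Weight each sensor's feature vector by its attention weight and sum them.

    sensor_features: one equal-length feature vector per sensor (assumed given).
    scores: one raw attention score per sensor (assumed given; learned in practice).
    """
    weights = softmax(scores)
    dim = len(sensor_features[0])
    fused = [
        sum(w * feats[i] for w, feats in zip(weights, sensor_features))
        for i in range(dim)
    ]
    return weights, fused


# Three hypothetical sensors (e.g. wrist, waist, ankle) with 2-D features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, fused = attention_fuse(features, [0.0, 0.0, 0.0])
```

With equal scores every sensor receives the same weight, so the fused vector is the plain average of the inputs; raising one sensor's score shifts the fused representation toward that sensor.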
-
A New Weakly Supervised Learning Approach for Real-time Iron Ore Feed Load Estimation
Authors:
Li Guo,
Yonghong Peng,
Rui Qin,
Bingyu Liu
Abstract:
Iron ore feed load control is one of the most critical settings in a mineral grinding process, directly impacting the quality of final products. The setting of the feed load is mainly determined by the characteristics of the ore pellets. However, the characterisation of ore is challenging to acquire in many production environments, leading to poor feed load settings and inefficient production processes. This paper presents our work using deep learning models for direct ore feed load estimation from ore pellet images. To address the challenges caused by the large size of a full ore pellets image and the shortage of accurately annotated data, we treat the whole modelling process as a weakly supervised learning problem. A two-stage model training algorithm and two neural network architectures are proposed. The experiment results show competitive model performance, and the trained models can be used for real-time feed load estimation for grind process optimisation.
Submitted 6 October, 2021;
originally announced October 2021.
-
Quality assessment of image matchers for DSM generation -- a comparative study based on UAV images
Authors:
Rongjun Qin,
Armin Gruen,
Clive Fraser
Abstract:
Recently developed automatic dense image matching algorithms are now being implemented for DSM/DTM production, with their pixel-level surface generation capability offering the prospect of partially alleviating the need for manual and semi-automatic stereoscopic measurements. In this paper, five commercial/public software packages for 3D surface generation are evaluated, using 5cm GSD imagery recorded from a UAV. Generated surface models are assessed against point clouds generated from mobile LiDAR and manual stereoscopic measurements. The software packages considered are APS, MICMAC, SURE, Pix4UAV and an SGM implementation from DLR.
Submitted 18 August, 2021;
originally announced August 2021.
-
Change Detection for Geodatabase Updating
Authors:
Rongjun Qin
Abstract:
The geodatabase (vectorized data) has nowadays become a rather standard digital city infrastructure; however, updating a geodatabase efficiently and economically remains a fundamental and practical issue in the geospatial industry. The cost of building a geodatabase is extremely high and labor intensive, and very often the maps we use have several months and even years of latency. One solution is to develop more automated methods for (vectorized) geospatial data generation, which has proven a difficult task in the past decades. An alternative solution is to first detect the differences between the new data and the existing geospatial data, and then only update the area identified as changes. The second approach is becoming more favored due to its high practicality and flexibility. A highly relevant technique is change detection. This article aims to provide an overview of the state-of-the-art change detection methods in the field of Remote Sensing and Geomatics to support the task of updating geodatabases. Because the data used for change detection are highly disparate, we structure our review intuitively based on the dimension of the data: 1) change detection with 2D data; 2) change detection with 3D data. Conclusions will be drawn based on the reviewed efforts in the field, and we will share our outlook on the topic of updating geodatabases.
Submitted 27 June, 2021;
originally announced June 2021.
-
A Comprehensive Survey of Machine Learning Applied to Radar Signal Processing
Authors:
Ping Lang,
Xiongjun Fu,
Marco Martorella,
Jian Dong,
Rui Qin,
Xianpeng Meng,
Min Xie
Abstract:
Modern radar systems have high requirements in terms of accuracy, robustness and real-time capability when operating on increasingly complex electromagnetic environments. Traditional radar signal processing (RSP) methods have shown some limitations when meeting such requirements, particularly in matters of target classification. With the rapid development of machine learning (ML), especially deep learning, radar researchers have started integrating these new methods when solving RSP-related problems. This paper aims at helping researchers and practitioners to better understand the application of ML techniques to RSP-related problems by providing a comprehensive, structured and reasoned literature overview of ML-based RSP techniques. This work is amply introduced by providing general elements of ML-based RSP and by stating the motivations behind them. The main applications of ML-based RSP are then analysed and structured based on the application field. This paper then concludes with a series of open questions and proposed research directions, in order to indicate current gaps and potential future solutions and trends.
Submitted 28 September, 2020;
originally announced September 2020.
-
Multi-View Large-Scale Bundle Adjustment Method for High-Resolution Satellite Images
Authors:
Xu Huang,
Rongjun Qin
Abstract:
Given enough multi-view image corresponding points (also called tie points) and ground control points (GCP), bundle adjustment for high-resolution satellite images is used to refine the orientations, or the most often used geometric parameters, the Rational Polynomial Coefficients (RPC), of each satellite image in a unified geodetic framework, which is very critical in many photogrammetry and computer vision applications. However, the growing number of high-resolution spaceborne optical sensors has brought two challenges to the bundle adjustment: 1) images that come from different satellite cameras may have different imaging dates, viewing angles, resolutions, etc., thus resulting in geometric and radiometric distortions in the bundle adjustment; 2) a large-scale mapping area always corresponds to a vast number of bundle adjustment corrections (including RPC bias and object space point coordinates). Due to the limitation of computer memory, it is hard to refine all corrections at the same time. Hence, how to efficiently realize the bundle adjustment in large-scale regions is very important. This paper particularly addresses the multi-view large-scale bundle adjustment problem in two steps: 1) to get robust tie points among different satellite images, we design a multi-view, multi-source tie point matching algorithm based on plane rectification and epipolar constraints, which is able to compensate for geometric and local nonlinear radiometric distortions among satellite datasets, and 2) to solve for tens of thousands or even millions of bundle adjustment correction variables in the large-scale bundle adjustment, we use an efficient solution with only a little computer memory. Experiments on in-track and off-track satellite datasets show that the proposed method is capable of computing sub-pixel accuracy bundle adjustment results.
Submitted 22 May, 2019;
originally announced May 2019.
-
Analysis of critical parameters of satellite stereo image for 3D reconstruction and mapping
Authors:
Rongjun Qin
Abstract:
Although nowadays advanced dense image matching (DIM) algorithms are able to produce LiDAR (Light Detection And Ranging) comparable dense point clouds from satellite stereo images, the accuracy and completeness of such point clouds heavily depend on the geometric parameters of the satellite stereo images. The intersection angle between two images is normally seen as the most important one in stereo data acquisition, as the state-of-the-art DIM algorithms work best on narrow baseline (smaller intersection angle) stereos (e.g., Semi-Global Matching regards 15-25 degrees as a good intersection angle). This factor is in line with the traditional aerial photogrammetry configuration, as the intersection angle directly relates to the base-height ratio and texture distortion in the parallax direction, thus affecting both the horizontal and vertical accuracy. However, our experiments found that even with very similar (and good) intersection angles, the same DIM algorithm applied on different stereo pairs (of the same area) produced point clouds with dramatically different accuracy as compared to the ground truth LiDAR data. This raises a very practical question that is often asked by practitioners: what factors constitute a good satellite stereo pair, such that it produces accurate and optimal results for mapping purposes? In this work, we provide a comprehensive analysis on this matter by performing stereo matching over 1,000 satellite stereo pairs with different acquisition parameters including their intersection angles, off-nadir angles, sun elevation & azimuth angles, as well as time differences, thus to offer a thorough answer to this question. This work will potentially provide a valuable reference to researchers working on multi-view satellite image reconstruction, as well as industrial practitioners minimizing costs for high-quality large-scale mapping.
Submitted 17 May, 2019;
originally announced May 2019.
-
Automated 3D recovery from very high resolution multi-view satellite images
Authors:
Rongjun Qin
Abstract:
This paper presents an automated pipeline for processing multi-view satellite images to 3D digital surface models (DSM). The proposed pipeline performs automated geo-referencing and generates high-quality densely matched point clouds. In particular, a novel approach is developed that fuses multiple depth maps derived by stereo matching to generate high-quality 3D maps. By learning critical configurations of stereo pairs from sample LiDAR data, we rank the image pairs based on the proximity of the results to the sample data. Multiple depth maps derived from individual image pairs are fused with an adaptive 3D median filter that considers the image spectral similarities. We demonstrate that the proposed adaptive median filter generally delivers better results than a normal median filter, and achieves an accuracy improvement of 0.36 meters RMSE in the best case. Results and analysis are introduced in detail.
Submitted 4 October, 2019; v1 submitted 17 May, 2019;
originally announced May 2019.
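As a point of reference for the depth-map fusion mentioned in the abstract above, the following is a minimal sketch of the "normal median filter" baseline only: a per-pixel median across several co-registered depth maps. The paper's adaptive filter, which also considers image spectral similarity, is not reproduced here. The function name `fuse_depth_maps`, the nested-list layout, and the use of NaN to mark missing depths are assumptions made for illustration.

```python
import math
from statistics import median


def fuse_depth_maps(depth_maps):
    """Fuse same-sized depth maps by taking the per-pixel median of valid values.

    depth_maps: list of 2-D nested lists of floats; NaN marks a missing depth
    (an assumption of this sketch, not a convention taken from the paper).
    """
    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    fused = [[math.nan] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the depths that are actually available at this pixel.
            values = [m[r][c] for m in depth_maps if not math.isnan(m[r][c])]
            if values:
                fused[r][c] = median(values)
    return fused


# Three hypothetical 2x2 depth maps from different stereo pairs (meters).
maps = [
    [[10.0, 12.0], [math.nan, 8.0]],
    [[11.0, 50.0], [7.0, 8.5]],
    [[10.5, 12.5], [7.5, math.nan]],
]
fused = fuse_depth_maps(maps)
```

In this toy input the outlier depth of 50.0 does not affect the fused value at its pixel, which is the usual motivation for median rather than mean fusion.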