-
Subject-Adaptive Transfer Learning Using Resting State EEG Signals for Cross-Subject EEG Motor Imagery Classification
Authors:
Sion An,
Myeongkyun Kang,
Soopil Kim,
Philip Chikontwe,
Li Shen,
Sang Hyun Park
Abstract:
Electroencephalography (EEG) motor imagery (MI) classification is a fundamental, yet challenging task due to the variation of signals between individuals i.e., inter-subject variability. Previous approaches try to mitigate this using task-specific (TS) EEG signals from the target subject in training. However, recording TS EEG signals requires time and limits its applicability in various fields. In contrast, resting state (RS) EEG signals are a viable alternative due to ease of acquisition with rich subject information. In this paper, we propose a novel subject-adaptive transfer learning strategy that utilizes RS EEG signals to adapt models on unseen subject data. Specifically, we disentangle extracted features into task- and subject-dependent features and use them to calibrate RS EEG signals for obtaining task information while preserving subject characteristics. The calibrated signals are then used to adapt the model to the target subject, enabling the model to simulate processing TS EEG signals of the target subject. The proposed method achieves state-of-the-art accuracy on three public benchmarks, demonstrating the effectiveness of our method in cross-subject EEG MI classification. Our findings highlight the potential of leveraging RS EEG signals to advance practical brain-computer interface systems.
Submitted 17 May, 2024;
originally announced May 2024.
-
Inverse Nonlinearity Compensation of Hyperelastic Deformation in Dielectric Elastomer for Acoustic Actuation
Authors:
** Woo Lee,
Gwang Seok An,
Jeong-Yun Sun,
Kyogu Lee
Abstract:
This paper delves into the analysis of nonlinear deformation induced by dielectric actuation in pre-stressed ideal dielectric elastomers. It formulates a nonlinear ordinary differential equation governing this deformation based on the hyperelastic model under dielectric stress. Through numerical integration and neural network approximations, the relationship between voltage and stretch is established. Neural networks are employed to approximate solutions for voltage-to-stretch and stretch-to-voltage transformations obtained via an explicit Runge-Kutta method. The effectiveness of these approximations is demonstrated by leveraging them for compensating nonlinearity through the waveshaping of the input signal. The comparative analysis highlights the superior accuracy of the approximated solutions over baseline methods, resulting in minimized harmonic distortions when utilizing dielectric elastomers as acoustic actuators. This study underscores the efficacy of the proposed approach in mitigating nonlinearities and enhancing the performance of dielectric elastomers in acoustic actuation applications.
Submitted 8 January, 2024;
originally announced January 2024.
-
Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Classification
Authors:
Guoxin Wang,
Xuyang Cao,
Shan An,
Fengmei Fan,
Chao Zhang,
**song Wang,
Feng Yu,
Zhiren Wang
Abstract:
Deep learning approaches, together with neuroimaging techniques, play an important role in psychiatric disorder classification. Previous studies on psychiatric disorder diagnosis mainly use functional connectivity matrices of resting-state functional magnetic resonance imaging (rs-fMRI) as input, which does not fully utilize the rich temporal information in the rs-fMRI time series. In this work, we propose a multi-dimension-embedding-aware modality fusion transformer (MFFormer) for schizophrenia and bipolar disorder classification using rs-fMRI and T1-weighted structural MRI (T1w sMRI). Concretely, to fully utilize the temporal information of rs-fMRI and the spatial information of sMRI, we construct a deep learning architecture that takes as input 2D rs-fMRI time series and 3D T1w sMRI volumes. Furthermore, to promote intra-modality attention and information fusion across modalities, a fusion transformer module (FTM) is designed through extensive self-attention over hybrid multi-modality feature maps. In addition, a dimension-up and dimension-down strategy is suggested to properly align multi-dimensional feature maps from different modalities. Experimental results on our private dataset and the public OpenfMRI dataset show that the proposed MFFormer performs better than methods using a single modality or multi-modality MRI on schizophrenia and bipolar disorder diagnosis.
Submitted 4 October, 2023;
originally announced October 2023.
-
Input-Output Feedback Linearization Preserving Task Priority for Multivariate Nonlinear Systems Having Singular Input Gain Matrix
Authors:
Sang-ik An,
Dongheui Lee,
Gyunghoon Park
Abstract:
We propose an extension of input-output feedback linearization for a class of multivariate systems that are not input-output linearizable in the classical manner. The key observation is that the usual input-output linearization problem can be interpreted as the problem of solving simultaneous linear equations associated with the input gain matrix: thus, even at points where the input gain matrix becomes singular, it is still possible to solve a part of the linear equations, by which a subset of the input-output relations is made linear or close to linear. Based on this observation, we adopt the task priority-based approach in the input-output linearization problem. First, we generalize the classical Byrnes-Isidori normal form to a prioritized normal form having a triangular structure, so that the singularity of a subblock of the input gain matrix related to lower-priority tasks does not directly propagate to higher-priority tasks. Next, we present a prioritized input-output linearization via multi-objective optimization with lexicographical ordering, resulting in a prioritized semilinear form that establishes input-output relations whose higher-priority subset is linear or close to linear. Finally, a Lyapunov analysis of ultimate boundedness and task achievement is provided, particularly for the case where the proposed prioritized input-output linearization is applied to the output tracking problem. This work introduces a new control framework for complex systems having critical and noncritical control issues, by assigning higher priority to the critical ones.
Submitted 4 May, 2023; v1 submitted 3 May, 2023;
originally announced May 2023.
-
A holistically 3D-printed flexible millimeter-wave Doppler radar: Towards fully printed high-frequency multilayer flexible hybrid electronics systems
Authors:
Hong Tang,
Yingjie Zhang,
Bowen Zheng,
Sensong An,
Mohammad Haerinia,
Yunxi Dong,
Yi Huang,
Wei Guo,
Hualiang Zhang
Abstract:
Flexible hybrid electronics (FHE) is an emerging technology enabled through the integration of advanced semiconductor devices and 3D printing technology. It unlocks tremendous market potential by realizing low-cost flexible circuits and systems that can be conformally integrated into various applications. However, the operating frequencies of most reported FHE systems are relatively low. It is also worth noting that reported FHE systems have been limited to relatively simple design concepts (since complex systems impose challenges in aspects such as multilayer interconnections, printing materials, and bonding layers). Here, we report a fully 3D-printed flexible four-layer millimeter-wave Doppler radar (i.e., a millimeter-wave FHE system). The sensing performance and flexibility of the 3D-printed radar are characterized and validated by general field tests and bending tests, respectively. Our results demonstrate the feasibility of developing fully 3D-printed high-frequency multilayer FHE, which can be conformally integrated into irregular surfaces (e.g., vehicle bumpers) for applications such as vehicle radars and wearable electronics.
Submitted 23 February, 2023;
originally announced February 2023.
-
Visualisation of sulphur on single fibre level for pulping industry
Authors:
Börje Norlin,
Siwen An,
Thomas Granfeldt,
David Krapohl,
Barry Lai,
Hafizur Rahman,
Faisal Zeeshan,
Per Engstrand
Abstract:
In the pulp and paper industry, about 5 Mt/y of chemithermomechanical pulp (CTMP) is produced globally from softwood chips for production of carton board grades. To tailor CTMP for this purpose, wood chips are impregnated with aqueous sodium sulphite for sulphonation of the wood lignin. When lignin is sulphonated, the defibration of wood into pulp becomes more selective, resulting in enhanced pulp properties. On a microscopic fibre scale, however, one could strongly assume that the sulphonation of the wood structure is very uneven due to the macroscale size of the wood chips. If this is the case and the sulphonation could be done significantly more evenly, the CTMP process could be more efficient and produce pulp even better suited for carton boards. Therefore, the present study aimed to develop a technique based on X-ray fluorescence microscopy imaging (uXRF) for quantifying the sulphur distribution on CTMP wood fibres.
The feasibility of uXRF imaging for sulphur homogeneity measurements in wood fibres needs investigation. In particular, it must be clarified which spatial and spectral resolution allows visualization of sulphur impregnation into single wood fibres. Measurements of single fibre imaging were carried out at the APS synchrotron facility. With a synchrotron beam using a 1 um scanning step, elemental mapping images were acquired from CTMP samples diluted with non-sulphonated pulp under specified conditions. Since the measurements show significant differences between sulphonated and non-sulphonated fibres, and a significant peak concentration in the shell of the sulphonated fibres, the proposed technique is found to be feasible. The required spatial resolution of the uXRF imaging for an on-site CTMP sulphur homogeneity measurement setup is about 15 um, and the homogeneity measured along the fibre shells is suggested as the CTMP sulphonation measurement parameter.
Submitted 21 December, 2022;
originally announced December 2022.
-
Characterization of micro pore optics for full-field X-ray fluorescence imaging
Authors:
Siwen An,
David Krapohl,
Benny Thörnberg,
Romain Roudot,
Emile Schyns,
Börje Norlin
Abstract:
Elemental mapping images can be achieved through step scanning imaging using pinhole optics or micro pore optics (MPO), or alternatively by full-field X-ray fluorescence imaging (FF-XRF). X-ray optics for FF-XRF can be manufactured with different micro-channel geometries such as square, hexagonal or circular channels. Each optic geometry creates different imaging artefacts. Square-channel MPOs generate a high intensity central spot due to two reflections via orthogonal channel walls inside a single channel, which is the desirable part for image formation, and two perpendicular lines forming a cross due to reflections in one plane only.
Thus, we have studied the performance of a square-channel MPO in an FF-XRF imaging system. The setup consists of a commercially available MPO provided by Photonis and a Timepix3 readout chip with a silicon detector. Imaging of fluorescence from small metal particles has been used to obtain the point spread function (PSF) characteristics. The transmission through MPO channels and the variation of the critical reflection angle are characterized by measurements of fluorescence from copper and titanium metal fragments. Since the critical angle of reflection is energy dependent, the cross-arm artefacts will affect the resolution differently for different fluorescence energies. It is possible to identify metal fragments from the form of the PSF. The PSF can be further characterized using a Fourier transform to suppress diffuse background signals in the image.
Submitted 21 December, 2022;
originally announced December 2022.
-
Replacing the Framingham-based equation for prediction of cardiovascular disease risk and adverse outcome by using artificial intelligence and retinal imaging
Authors:
Ehsan Vaghefi,
David Squirrell,
Songyang An,
Song Yang,
John Marshall
Abstract:
Purpose: To create and evaluate the accuracy of an artificial intelligence deep learning platform (ORAiCLE) capable of using only retinal fundus images to predict both an individual's overall 5-year cardiovascular disease (CVD) risk and the relative contribution of the component risk factors that comprise this risk. Methods: We used 165,907 retinal images from a database of 47,236 patient visits. Initially, each image was paired with biometric data (age, ethnicity, sex, presence and duration of diabetes, and HDL/LDL ratios) as well as any CVD event within 5 years of the retinal image acquisition. A risk score based on Framingham equations was calculated. The real CVD event rate was also determined for the individuals and the overall population. Finally, ORAiCLE was trained using only age, ethnicity, and sex plus retinal images. Results: Compared to the Framingham-based score, ORAiCLE was up to 12% more accurate in predicting a cardiovascular event in the next 5 years, especially for the highest-risk group of people. The reliability and accuracy of each of the restrictive models was suboptimal compared to ORAiCLE's performance, indicating that it was using data from both sets of data to derive its final results. Conclusion: Retinal photography is inexpensive, and only minimal training is required to acquire the images, as fully automated, inexpensive camera systems are now widely available. As such, AI-based CVD risk algorithms such as ORAiCLE promise to make CV health screening more accurate, more affordable, and more accessible for all. Furthermore, ORAiCLE's unique ability to assess the relative contribution of the components that comprise an individual's overall risk would inform treatment decisions based on the specific needs of an individual, thereby increasing the likelihood of positive health outcomes.
Submitted 25 August, 2022; v1 submitted 17 July, 2022;
originally announced July 2022.
-
Source-free Unsupervised Domain Adaptation for Blind Image Quality Assessment
Authors:
Jianzhao Liu,
Xin Li,
Shukun An,
Zhibo Chen
Abstract:
Existing learning-based methods for blind image quality assessment (BIQA) are heavily dependent on large amounts of annotated training data, and usually suffer from severe performance degradation when encountering the domain/distribution shift problem. Thanks to the development of unsupervised domain adaptation (UDA), some works attempt to transfer knowledge from a label-sufficient source domain to a label-free target domain under domain shift with UDA. However, UDA requires the coexistence of source and target data, which might be impractical for source data due to privacy or storage issues. In this paper, we take the first step towards source-free unsupervised domain adaptation (SFUDA) in a simple yet efficient manner for BIQA to tackle the domain shift without access to the source data. Specifically, we cast the quality assessment task as a rating distribution prediction problem. Based on the intrinsic properties of BIQA, we present a group of well-designed self-supervised objectives to guide the adaptation of the BN affine parameters towards the target domain. Among them, minimizing the prediction entropy and maximizing the batch prediction diversity aim to encourage more confident results while avoiding the trivial solution. Besides, based on the observation that the IQA rating distribution of a single image follows a Gaussian distribution, we apply Gaussian regularization to the predicted rating distribution to make it more consistent with the nature of human scoring. Extensive experimental results under cross-domain scenarios demonstrate the effectiveness of our proposed method in mitigating the domain shift.
Submitted 15 August, 2022; v1 submitted 17 July, 2022;
originally announced July 2022.
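The two objectives named in the abstract above (minimizing prediction entropy and maximizing batch prediction diversity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `diversity_weight` parameter, and the way the two terms are combined are assumptions made for illustration, and the Gaussian regularization term is left out.

```python
import math


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one discrete rating distribution."""
    return -sum(p * math.log(p + eps) for p in probs)


def adaptation_loss(batch_probs, diversity_weight=1.0):
    """Hypothetical combined objective: low per-sample entropy, high batch diversity.

    batch_probs: list of predicted rating distributions, one per image.
    diversity_weight: assumed trade-off parameter (not taken from the paper).
    """
    n = len(batch_probs)
    k = len(batch_probs[0])
    # Term 1: average entropy of each prediction (to be minimized).
    mean_entropy = sum(entropy(p) for p in batch_probs) / n
    # Term 2: entropy of the batch-averaged prediction (to be maximized),
    # which penalizes the trivial solution of predicting the same rating everywhere.
    mean_pred = [sum(p[j] for p in batch_probs) / n for j in range(k)]
    diversity = entropy(mean_pred)
    return mean_entropy - diversity_weight * diversity


# Confident predictions spread over different ratings.
confident_diverse = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01], [0.01, 0.01, 0.98]]
# Confident predictions that all collapse onto the same rating.
collapsed = [[0.98, 0.01, 0.01]] * 3

print(adaptation_loss(confident_diverse) < adaptation_loss(collapsed))
```

In this sketch the collapsed batch scores worse than the confident, diverse batch even though every individual prediction is equally confident, which is the role the diversity term plays in avoiding the trivial solution.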
-
Fast and Scalable Human Pose Estimation using mmWave Point Cloud
Authors:
Sizhe An,
Umit Y. Ogras
Abstract:
Millimeter-Wave (mmWave) radar can enable high-resolution human pose estimation with low cost and computational requirements. However, mmWave data point cloud, the primary input to processing algorithms, is highly sparse and carries significantly less information than other alternatives such as video frames. Furthermore, the scarce labeled mmWave data impedes the development of machine learning (ML) models that can generalize to unseen scenarios. We propose a fast and scalable human pose estimation (FUSE) framework that combines multi-frame representation and meta-learning to address these challenges. Experimental evaluations show that FUSE adapts to the unseen scenarios 4$\times$ faster than current supervised learning approaches and estimates human joint coordinates with about 7 cm mean absolute error.
Submitted 29 April, 2022;
originally announced May 2022.
-
BlazeNeo: Blazing fast polyp segmentation and neoplasm detection
Authors:
Nguyen Sy An,
Phan Ngoc Lan,
Dao Viet Hang,
Dao Van Long,
Tran Quang Trung,
Nguyen Thi Thuy,
Dinh Viet Sang
Abstract:
In recent years, computer-aided automatic polyp segmentation and neoplasm detection have been an emerging topic in medical image analysis, providing valuable support to colonoscopy procedures. Much attention has been paid to improving the accuracy of polyp detection and segmentation. However, not much focus has been given to latency and throughput for performing these tasks on dedicated devices, which can be crucial for practical applications. This paper introduces a novel deep neural network architecture called BlazeNeo, for the task of polyp segmentation and neoplasm detection with an emphasis on compactness and speed while maintaining high accuracy. The model leverages the highly efficient HarDNet backbone alongside lightweight Receptive Field Blocks for computational efficiency, and an auxiliary training mechanism to take full advantage of the training data for segmentation quality. Our experiments on a challenging dataset show that BlazeNeo achieves improvements in latency and model size while maintaining comparable accuracy against state-of-the-art methods. When deployed on the Jetson AGX Xavier edge device in INT8 precision, BlazeNeo achieves over 155 fps while yielding the best accuracy among all compared methods.
Submitted 28 February, 2022;
originally announced March 2022.
-
NeoUNet: Towards accurate colon polyp segmentation and neoplasm detection
Authors:
Phan Ngoc Lan,
Nguyen Sy An,
Dao Viet Hang,
Dao Van Long,
Tran Quang Trung,
Nguyen Thi Thuy,
Dinh Viet Sang
Abstract:
Automatic polyp segmentation has proven to be immensely helpful for endoscopy procedures, reducing the missing rate of adenoma detection for endoscopists while increasing efficiency. However, classifying a polyp as being neoplasm or not and segmenting it at the pixel level is still a challenging task for doctors to perform in a limited time. In this work, we propose a fine-grained formulation for the polyp segmentation problem. Our formulation aims to not only segment polyp regions, but also identify those at high risk of malignancy with high accuracy. In addition, we present a UNet-based neural network architecture called NeoUNet, along with a hybrid loss function to solve this problem. Experiments show highly competitive results for NeoUNet on our benchmark dataset compared to existing polyp segmentation models.
Submitted 11 July, 2021;
originally announced July 2021.
-
Self-Supervised Learning based CT Denoising using Pseudo-CT Image Pairs
Authors:
Dongkyu Won,
Eui** Jung,
Sion An,
Philip Chikontwe,
Sang Hyun Park
Abstract:
Recently, self-supervised learning methods able to perform image denoising without ground truth labels have been proposed. These methods create low-quality images by adding random or Gaussian noise to images and then train a model for denoising. Ideally, it would be beneficial if one could generate high-quality CT images with only a few training samples via self-supervision. However, the performance of CT denoising is generally limited due to the complexity of CT noise. To address this problem, we propose a novel self-supervised learning-based CT denoising method. In particular, we pre-train CT denoising and noise models that can predict CT noise from Low-dose CT (LDCT) using available LDCT and Normal-dose CT (NDCT) pairs. For a given test LDCT, we generate Pseudo-LDCT and NDCT pairs using the pre-trained denoising and noise models and then update the parameters of the denoising model using these pairs to remove noise in the test LDCT. To make realistic Pseudo-LDCT, we train multiple noise models from individual images and generate the noise using the ensemble of noise models. We evaluate our method on the 2016 AAPM Low-Dose CT Grand Challenge dataset. The proposed ensemble noise model can generate realistic CT noise, and thus our method significantly improves the denoising performance of existing denoising models trained by supervised and self-supervised learning.
Submitted 6 April, 2021;
originally announced April 2021.
-
MGait: Model-Based Gait Analysis Using Wearable Bend and Inertial Sensors
Authors:
Sizhe An,
Yigit Tuncel,
Toygun Basaklar,
Gokul Krishna Krishnakumar,
Ganapati Bhat,
Umit Ogras
Abstract:
Movement disorders, such as Parkinson's disease, affect more than 10 million people worldwide. Gait analysis is a critical step in the diagnosis and rehabilitation of these disorders. Specifically, step length provides valuable insights into the gait quality and rehabilitation process. However, traditional approaches for estimating step length are not suitable for continuous daily monitoring since they rely on special mats and clinical environments. To address this limitation, we present a novel and practical step-length estimation technique using low-power wearable bend and inertial sensors. Experimental results show that the proposed model estimates step length with 5.49% mean absolute percentage error and provides accurate real-time feedback to the user.
Submitted 7 September, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
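The accuracy figure quoted in the abstract above is a mean absolute percentage error (MAPE). As a minimal sketch of how that metric is computed, the snippet below uses made-up step lengths in centimeters; the values and the function name are illustrative assumptions and are not data or code from the paper.

```python
def mean_absolute_percentage_error(true_values, predicted_values):
    """MAPE in percent: the average of |true - predicted| / |true| over all samples."""
    if len(true_values) != len(predicted_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        abs(t - p) / abs(t) for t, p in zip(true_values, predicted_values)
    )
    return 100.0 * total / len(true_values)


# Hypothetical reference and estimated step lengths in cm (not from the paper).
reference_steps = [60.0, 65.0, 70.0]
estimated_steps = [63.0, 65.0, 66.5]

print(round(mean_absolute_percentage_error(reference_steps, estimated_steps), 2))
```

Because each error is divided by the reference step length, the metric is comparable across users with different stride lengths, which may be why a percentage error is reported rather than an absolute one.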
-
Transfer Learning for Human Activity Recognition using Representational Analysis of Neural Networks
Authors:
Sizhe An,
Ganapati Bhat,
Suat Gumussoy,
Umit Ogras
Abstract:
Human activity recognition (HAR) research has increased in recent years due to its applications in mobile health monitoring, activity recognition, and patient rehabilitation. The typical approach is training a HAR classifier offline with known users and then using the same classifier for new users. However, the accuracy for new users can be low with this approach if their activity patterns are different than those in the training data. At the same time, training from scratch for new users is not feasible for mobile applications due to the high computational cost and training time. To address this issue, we propose a HAR transfer learning framework with two components. First, a representational analysis reveals common features that can transfer across users and user-specific features that need to be customized. Using this insight, we transfer the reusable portion of the offline classifier to new users and fine-tune only the rest. Our experiments with five datasets show up to 43% accuracy improvement and 66% training time reduction when compared to the baseline without using transfer learning. Furthermore, measurements on the Nvidia Jetson Xavier-NX hardware platform reveal that the power and energy consumption decrease by 43% and 68%, respectively, while achieving the same or higher accuracy as training from scratch.
Submitted 23 February, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Improving Monocular Depth Estimation by Leveraging Structural Awareness and Complementary Datasets
Authors:
Tian Chen,
Shijie An,
Yuan Zhang,
Chongyang Ma,
Huayan Wang,
Xiaoyan Guo,
Wen Zheng
Abstract:
Monocular depth estimation plays a crucial role in 3D recognition and understanding. One key limitation of existing approaches lies in their lack of structural information exploitation, which leads to inaccurate spatial layout, discontinuous surface, and ambiguous boundaries. In this paper, we tackle this problem in three aspects. First, to exploit the spatial relationship of visual features, we propose a structure-aware neural network with spatial attention blocks. These blocks guide the network attention to global structures or local details across different feature layers. Second, we introduce a global focal relative loss for uniform point pairs to enhance spatial constraint in the prediction, and explicitly increase the penalty on errors in depth-wise discontinuous regions, which helps preserve the sharpness of estimation results. Finally, based on analysis of failure cases for prior methods, we collect a new Hard Case (HC) Depth dataset of challenging scenes, such as special lighting conditions, dynamic objects, and tilted camera angles. The new dataset is leveraged by an informed learning curriculum that mixes training examples incrementally to handle diverse data distributions. Experimental results show that our method outperforms state-of-the-art approaches by a large margin in terms of both prediction accuracy on NYUDv2 dataset and generalization performance on unseen datasets.
Submitted 22 July, 2020;
originally announced July 2020.
-
Few-Shot Relation Learning with Attention for EEG-based Motor Imagery Classification
Authors:
Sion An,
Soopil Kim,
Philip Chikontwe,
Sang Hyun Park
Abstract:
Brain-Computer Interfaces (BCI) based on Electroencephalography (EEG) signals, in particular motor imagery (MI) data, have received a lot of attention and show potential for the design of key technologies both in healthcare and in other industries. MI data is generated when a subject imagines movement of limbs and can be used to aid rehabilitation as well as in autonomous driving scenarios. Thus, classification of MI signals is vital for EEG-based BCI systems. Recently, MI EEG classification techniques using deep learning have shown improved performance over conventional techniques. However, due to inter-subject variability, the scarcity of unseen subject data, and low signal-to-noise ratio, extracting robust features and improving accuracy is still challenging. In this context, we propose a novel two-way few-shot network that efficiently learns representative features of unseen subject categories and how to classify them with limited MI EEG data. The pipeline includes an embedding module that learns feature representations from a set of samples, an attention mechanism for key signal feature discovery, and a relation module for final classification based on relation scores between a support set and a query signal. In addition to the unified learning of feature similarity and a few-shot classifier, our method emphasizes informative features in the support data that are relevant to the query data, which generalizes better on unseen subjects. For evaluation, we used the BCI competition IV 2b dataset and achieved a 9.3% accuracy improvement in the 20-shot classification task with state-of-the-art performance. Experimental results demonstrate the effectiveness of employing attention and the overall generality of our method.
Submitted 19 August, 2020; v1 submitted 2 March, 2020;
originally announced March 2020.
-
Prioritized Inverse Kinematics: Desired Task Trajectories in Nonsingular Task Spaces
Authors:
Sang-ik An,
Dongheui Lee
Abstract:
A prioritized inverse kinematics (PIK) solution can be considered as a (regulation or output tracking) control law of a dynamical system with prioritized multiple outputs. We propose a method that guarantees that a joint trajectory generated from a class of PIK solutions exists uniquely in a nonsingular configuration space. We start by assuming that desired task trajectories stay in nonsingular task spaces and find conditions under which task trajectories stay in a neighborhood of the desired task trajectories in which we can guarantee existence and uniqueness of a joint trajectory in a nonsingular configuration space. Based on this result, we find a sufficient condition for task convergence and analyze various stability notions such as stability, uniform stability, uniform asymptotic stability, and exponential stability in both continuous and discrete time. We discuss why the number of tasks is limited in discrete time and show how preconditioning can be used to overcome this limitation.
Submitted 22 October, 2019;
originally announced October 2019.
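For readers unfamiliar with prioritized inverse kinematics, the following is a minimal sketch of the classical two-task null-space projection scheme, in which the lower-priority task is resolved only within the null space of the higher-priority task. It is an illustration of the general idea, not the specific class of PIK solutions analyzed in the paper above; the function name `prioritized_ik_step` and the toy Jacobians are assumptions made for this example.

```python
import numpy as np


def prioritized_ik_step(J1, dx1, J2, dx2):
    """Joint velocity for two prioritized tasks (hypothetical helper).

    J1, J2 are task Jacobians; dx1, dx2 are desired task velocities.
    Task 1 has strict priority over task 2.
    """
    J1_pinv = np.linalg.pinv(J1)
    # Minimum-norm joint velocity that realizes the primary task.
    dq1 = J1_pinv @ dx1
    # Projector onto the null space of J1: motion here leaves task 1 unchanged.
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1
    # Resolve the remaining task-2 error using only null-space motion.
    dq2 = np.linalg.pinv(J2 @ N1) @ (dx2 - J2 @ dq1)
    return dq1 + dq2


# Toy 3-joint example with two scalar tasks (values chosen for illustration).
J1 = np.array([[1.0, 0.0, 0.0]])
J2 = np.array([[0.0, 1.0, 1.0]])
dx1 = np.array([0.5])
dx2 = np.array([1.0])

dq = prioritized_ik_step(J1, dx1, J2, dx2)
print(dq)
```

In this toy case the two tasks are compatible, so both are met exactly. When `J2 @ N1` loses rank, the pseudoinverse changes discontinuously, which is one reason existence and uniqueness of joint trajectories near singularities require separate analysis.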
-
Prioritized Inverse Kinematics: Nonsmoothness, Trajectory Existence, Task Convergence, Stability
Authors:
Sang-ik An,
Dongheui Lee
Abstract:
In this paper, we study various theoretical properties of a class of prioritized inverse kinematics (PIK) solutions that can be considered as a class of (output regulation or tracking) control laws of a dynamical system with prioritized multiple outputs. We first develop tools to investigate nonsmoothness of PIK solutions and find a sufficient condition for nonsmoothness. It implies that existence and uniqueness of a joint trajectory satisfying a PIK solution cannot be guaranteed by the classical theorems. We therefore construct an alternative existence and uniqueness theorem that uses structural information of PIK solutions. Then, we narrow the class of PIK solutions down to the case in which all tasks are designed to follow some desired task trajectories and discover a few properties related to task convergence. The study goes further to analyze stability of equilibrium points of the differential equation whose right-hand side is a PIK solution when all tasks are designed to reach some desired task positions. Finally, we furnish an example with a two-link manipulator that shows how our findings can be used to analyze the behavior of a joint trajectory generated from a PIK solution.
Submitted 22 January, 2020; v1 submitted 8 May, 2019;
originally announced May 2019.