Search | arXiv e-print repository

Graph Neural Networks in EEG-based Emotion Recognition: A Survey

Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Ruizhi Yang, Liming Zhai, Ziyu Jia, Yang Liu

Abstract: Compared to other modalities, EEG-based emotion recognition can intuitively respond to the emotional patterns in the human brain and, therefore, has become one of the most concerning tasks in the brain-computer interfaces field. Since dependencies within brain regions are closely related to emotion, a significant trend is to develop Graph Neural Networks (GNNs) for EEG-based emotion recognition. H… ▽ More Compared to other modalities, EEG-based emotion recognition can intuitively respond to the emotional patterns in the human brain and, therefore, has become one of the most concerning tasks in the brain-computer interfaces field. Since dependencies within brain regions are closely related to emotion, a significant trend is to develop Graph Neural Networks (GNNs) for EEG-based emotion recognition. However, brain region dependencies in emotional EEG have physiological bases that distinguish GNNs in this field from those in other time series fields. Besides, there is neither a comprehensive review nor guidance for constructing GNNs in EEG-based emotion recognition. In the survey, our categorization reveals the commonalities and differences of existing approaches under a unified framework of graph construction. We analyze and categorize methods from three stages in the framework to provide clear guidance on constructing GNNs in EEG-based emotion recognition. In addition, we discuss several open challenges and future directions, such as Temporal full-connected graph and Graph condensation. △ Less

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2312.00437 [pdf]

Investigation on data fusion of sun-induced chlorophyll fluorescence and reflectance for photosynthetic capacity of rice

Authors: Yu-an Zhou, Li Zhai, Weijun Zhou, Ji Zhou, Haiyan Cen

Abstract: Studying crop photosynthesis is crucial for improving yield, but current methods are labor-intensive. This research aims to enhance accuracy by combining leaf reflectance and sun-induced chlorophyll fluorescence (SIF) signals to estimate key photosynthetic traits in rice. The study analyzes 149 leaf samples from two rice cultivars, considering reflectance, SIF, chlorophyll, carotenoids, and CO2 re… ▽ More Studying crop photosynthesis is crucial for improving yield, but current methods are labor-intensive. This research aims to enhance accuracy by combining leaf reflectance and sun-induced chlorophyll fluorescence (SIF) signals to estimate key photosynthetic traits in rice. The study analyzes 149 leaf samples from two rice cultivars, considering reflectance, SIF, chlorophyll, carotenoids, and CO2 response curves. After noise removal, SIF and reflectance spectra are used for data fusion at different levels (raw, feature, and decision). Competitive adaptive reweighted sampling (CARS) extracts features, and partial least squares regression (PLSR) builds regression models. Results indicate that using either reflectance or SIF alone provides modest estimations for photosynthetic traits. However, combining these data sources through measurement-level data fusion significantly improves accuracy, with mid-level and decision-level fusion also showing positive outcomes. In particular, decision-level fusion enhances predictive capabilities, suggesting the potential for efficient crop phenoty**. Overall, sun-induced chlorophyll fluorescence spectra effectively predict rice's photosynthetic capacity, and data fusion methods contribute to increased accuracy, paving the way for high-throughput crop phenoty**. △ Less

Submitted 1 December, 2023; originally announced December 2023.

arXiv:2309.11783 [pdf, other]

Frame Pairwise Distance Loss for Weakly-supervised Sound Event Detection

Authors: Rui Tao, Yuxing Huang, Xiangdong Wang, Long Yan, Lufeng Zhai, Kazushige Ouchi, Taihao Li

Abstract: Weakly-supervised learning has emerged as a promising approach to leverage limited labeled data in various domains by bridging the gap between fully supervised methods and unsupervised techniques. Acquisition of strong annotations for detecting sound events is prohibitively expensive, making weakly supervised learning a more cost-effective and broadly applicable alternative. In order to enhance th… ▽ More Weakly-supervised learning has emerged as a promising approach to leverage limited labeled data in various domains by bridging the gap between fully supervised methods and unsupervised techniques. Acquisition of strong annotations for detecting sound events is prohibitively expensive, making weakly supervised learning a more cost-effective and broadly applicable alternative. In order to enhance the recognition rate of the learning of detection of weakly-supervised sound events, we introduce a Frame Pairwise Distance (FPD) loss branch, complemented with a minimal amount of synthesized data. The corresponding sampling and label processing strategies are also proposed. Two distinct distance metrics are employed to evaluate the proposed approach. Finally, the method is validated on the DCASE 2023 task4 dataset. The obtained experimental results corroborated the efficacy of this approach. △ Less

Submitted 7 December, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: Submitted to ICASSP 2024

arXiv:2305.05152 [pdf, other]

Who is Speaking Actually? Robust and Versatile Speaker Traceability for Voice Conversion

Authors: Yanzhen Ren, Hongcheng Zhu, Liming Zhai, Zongkun Sun, Rubing Shen, Lina Wang

Abstract: Voice conversion (VC), as a voice style transfer technology, is becoming increasingly prevalent while raising serious concerns about its illegal use. Proactively tracing the origins of VC-generated speeches, i.e., speaker traceability, can prevent the misuse of VC, but unfortunately has not been extensively studied. In this paper, we are the first to investigate the speaker traceability for VC and… ▽ More Voice conversion (VC), as a voice style transfer technology, is becoming increasingly prevalent while raising serious concerns about its illegal use. Proactively tracing the origins of VC-generated speeches, i.e., speaker traceability, can prevent the misuse of VC, but unfortunately has not been extensively studied. In this paper, we are the first to investigate the speaker traceability for VC and propose a traceable VC framework named VoxTracer. Our VoxTracer is similar to but beyond the paradigm of audio watermarking. We first use unique speaker embedding to represent speaker identity. Then we design a VAE-Glow structure, in which the hiding process imperceptibly integrates the source speaker identity into the VC, and the tracing process accurately recovers the source speaker identity and even the source speech in spite of severe speech quality degradation. To address the speech mismatch between the hiding and tracing processes affected by different distortions, we also adopt an asynchronous training strategy to optimize the VAE-Glow models. The VoxTracer is versatile enough to be applied to arbitrary VC methods and popular audio coding standards. Extensive experiments demonstrate that the VoxTracer achieves not only high imperceptibility in hiding, but also nearly 100% tracing accuracy against various types of audio lossy compressions (AAC, MP3, Opus and SILK) with a broad range of bitrates (16 kbps - 128 kbps) even in a very short time duration (0.74s). Our speech demo is available at https://anonymous.4open.science/w/DEMOofVoxTracer. △ Less

Submitted 26 July, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: has been accepted by ACM MM 2023

arXiv:2304.14920 [pdf, other]

An EEG Channel Selection Framework for Driver Drowsiness Detection via Interpretability Guidance

Authors: Xinliang Zhou, Dan Lin, Ziyu Jia, Chenyu Liu, Liming Zhai, Yang Liu

Abstract: Drowsy driving has a crucial influence on driving safety, creating an urgent demand for driver drowsiness detection. Electroencephalogram (EEG) signal can accurately reflect the mental fatigue state and thus has been widely studied in drowsiness monitoring. However, the raw EEG data is inherently noisy and redundant, which is neglected by existing works that just use single-channel EEG data or ful… ▽ More Drowsy driving has a crucial influence on driving safety, creating an urgent demand for driver drowsiness detection. Electroencephalogram (EEG) signal can accurately reflect the mental fatigue state and thus has been widely studied in drowsiness monitoring. However, the raw EEG data is inherently noisy and redundant, which is neglected by existing works that just use single-channel EEG data or full-head channel EEG data for model training, resulting in limited performance of driver drowsiness detection. In this paper, we are the first to propose an Interpretability-guided Channel Selection (ICS) framework for the driver drowsiness detection task. Specifically, we design a two-stage training strategy to progressively select the key contributing channels with the guidance of interpretability. We first train a teacher network in the first stage using full-head channel EEG data. Then we apply the class activation map** (CAM) to the trained teacher model to highlight the high-contributing EEG channels and further propose a channel voting scheme to select the top N contributing EEG channels. Finally, we train a student network with the selected channels of EEG data in the second stage for driver drowsiness detection. Experiments are designed on a public dataset, and the results demonstrate that our method is highly applicable and can significantly improve the performance of cross-subject driver drowsiness detection. △ Less

Submitted 26 April, 2023; originally announced April 2023.

arXiv:2304.10755 [pdf, other]

Interpretable and Robust AI in EEG Systems: A Survey

Authors: Xinliang Zhou, Chenyu Liu, Liming Zhai, Ziyu Jia, Cuntai Guan, Yang Liu

Abstract: The close coupling of artificial intelligence (AI) and electroencephalography (EEG) has substantially advanced human-computer interaction (HCI) technologies in the AI era. Different from traditional EEG systems, the interpretability and robustness of AI-based EEG systems are becoming particularly crucial. The interpretability clarifies the inner working mechanisms of AI models and thus can gain th… ▽ More The close coupling of artificial intelligence (AI) and electroencephalography (EEG) has substantially advanced human-computer interaction (HCI) technologies in the AI era. Different from traditional EEG systems, the interpretability and robustness of AI-based EEG systems are becoming particularly crucial. The interpretability clarifies the inner working mechanisms of AI models and thus can gain the trust of users. The robustness reflects the AI's reliability against attacks and perturbations, which is essential for sensitive and fragile EEG signals. Thus the interpretability and robustness of AI in EEG systems have attracted increasing attention, and their research has achieved great progress recently. However, there is still no survey covering recent advances in this field. In this paper, we present the first comprehensive survey and summarize the interpretable and robust AI techniques for EEG systems. Specifically, we first propose a taxonomy of interpretability by characterizing it into three types: backpropagation, perturbation, and inherently interpretable methods. Then we classify the robustness mechanisms into four classes: noise and artifacts, human variability, data acquisition instability, and adversarial attacks. Finally, we identify several critical and unresolved challenges for interpretable and robust AI in EEG systems and further discuss their future directions. △ Less

Submitted 30 August, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

Showing 1–6 of 6 results for author: Zhai, L