Audio-Visual Segmentation via Unlabeled Frame Exploitation

Authors: Jinxiang Liu, Yikun Liu, Fei Zhang, Chen Ju, Ya Zhang, Yanfeng Wang

Abstract: Audio-visual segmentation (AVS) aims to segment the sounding objects in video frames. Although great progress has been witnessed, we experimentally reveal that current methods reach marginal performance gain within the use of the unlabeled frames, leading to the underutilization issue. To fully explore the potential of the unlabeled frames for AVS, we explicitly divide them into two categories based on their temporal characteristics, i.e., neighboring frame (NF) and distant frame (DF). NFs, temporally adjacent to the labeled frame, often contain rich motion information that assists in the accurate localization of sounding objects. Contrary to NFs, DFs have long temporal distances from the labeled frame, which share semantic-similar objects with appearance variations. Considering their unique characteristics, we propose a versatile framework that effectively leverages them to tackle AVS. Specifically, for NFs, we exploit the motion cues as the dynamic guidance to improve the objectness localization. Besides, we exploit the semantic cues in DFs by treating them as valid augmentations to the labeled frames, which are then used to enrich data diversity in a self-training manner. Extensive experimental results demonstrate the versatility and superiority of our method, unleashing the power of the abundant unlabeled frames.

Submitted 16 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2401.12440 [pdf, ps, other]

doi 10.1109/ICASSP48485.2024.10447161

Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models

Authors: Chenyang Gao, Brecht Desplanques, Chelsea J. -T. Ju, Aman Chadha, Andreas Stolcke

Abstract: Automated speaker identification (SID) is a crucial step for the personalization of a wide range of speech-enabled services. Typical SID systems use a symmetric enrollment-verification framework with a single model to derive embeddings both offline for voice profiles extracted from enrollment utterances, and online from runtime utterances. Due to the distinct circumstances of enrollment and runtime, such as different computation and latency constraints, several applications would benefit from an asymmetric enrollment-verification framework that uses different models for enrollment and runtime embedding generation. To support this asymmetric SID where each of the two models can be updated independently, we propose using a lightweight neural network to map the embeddings from the two independent models to a shared speaker embedding space. Our results show that this approach significantly outperforms cosine scoring in a shared speaker logit space for models that were trained with a contrastive loss on large datasets with many speaker identities. This proposed Neural Embedding Speaker Space Alignment (NESSA) combined with an asymmetric update of only one of the models delivers at least 60% of the performance gain achieved by updating both models in the standard symmetric SID approach.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Accepted to ICASSP 2024

arXiv:2307.13236 [pdf, other]

Audio-aware Query-enhanced Transformer for Audio-Visual Segmentation

Authors: Jinxiang Liu, Chen Ju, Chaofan Ma, Yanfeng Wang, Yu Wang, Ya Zhang

Abstract: The goal of the audio-visual segmentation (AVS) task is to segment the sounding objects in the video frames using audio cues. However, current fusion-based methods have performance limitations due to the small receptive field of convolution and inadequate fusion of audio-visual features. To overcome these issues, we propose a novel Audio-aware query-enhanced TRansformer (AuTR) to tackle the task. Unlike existing methods, our approach introduces a multimodal transformer architecture that enables deep fusion and aggregation of audio-visual features. Furthermore, we devise an audio-aware query-enhanced transformer decoder that explicitly helps the model focus on the segmentation of the pinpointed sounding objects based on audio signals, while disregarding silent yet salient objects. Experimental results show that our method outperforms previous methods and demonstrates better generalization ability in multi-sound and open-set scenarios.

Submitted 24 July, 2023; originally announced July 2023.

Comments: arXiv admin note: text overlap with arXiv:2305.11019

arXiv:2302.11410 [pdf, other]

doi 10.1109/EMBC40787.2023.10340899

Score-Based Data Generation for EEG Spatial Covariance Matrices: Towards Boosting BCI Performance

Authors: Ce Ju, Reinmar Josef Kobler, Cuntai Guan

Abstract: The efficacy of Electroencephalogram (EEG) classifiers can be augmented by increasing the quantity of available data. In the case of geometric deep learning classifiers, the input consists of spatial covariance matrices derived from EEGs. In order to synthesize these spatial covariance matrices and facilitate future improvements of geometric deep learning classifiers, we propose a generative modeling technique based on state-of-the-art score-based models. The quality of generated samples is evaluated through visual and quantitative assessments using a left/right-hand-movement motor imagery dataset. The exceptional pixel-level resolution of these generative samples highlights the formidable capacity of score-based generative modeling. Additionally, the center (Frechet mean) of the generated samples aligns with neurophysiological evidence that event-related desynchronization and synchronization occur on electrodes C3 and C4 within the Mu and Beta frequency bands during motor imagery processing. The quantitative evaluation revealed that 84.3% of the generated samples could be accurately predicted by a pre-trained classifier and an improvement of up to 8.7% in the average accuracy over ten runs for a specific test subject in a holdout experiment.

Submitted 15 December, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: 7 pages, 4 figures; This work has been accepted by the 2023 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Conference (IEEE EMBC 2023'). Copyright will be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2211.02641 [pdf, ps, other]

doi 10.1109/TNNLS.2023.3307470

Graph Neural Networks on SPD Manifolds for Motor Imagery Classification: A Perspective from the Time-Frequency Analysis

Authors: Ce Ju, Cuntai Guan

Abstract: The motor imagery (MI) classification has been a prominent research topic in brain-computer interfaces based on electroencephalography (EEG). Over the past few decades, the performance of MI-EEG classifiers has seen gradual enhancement. In this study, we amplify the geometric deep learning-based MI-EEG classifiers from the perspective of time-frequency analysis, introducing a new architecture called Graph-CSPNet. We refer to this category of classifiers as Geometric Classifiers, highlighting their foundation in differential geometry stemming from EEG spatial covariance matrices. Graph-CSPNet utilizes novel manifold-valued graph convolutional techniques to capture the EEG features in the time-frequency domain, offering heightened flexibility in signal segmentation for capturing localized fluctuations. To evaluate the effectiveness of Graph-CSPNet, we employ five commonly-used publicly available MI-EEG datasets, achieving near-optimal classification accuracies in nine out of eleven scenarios. The Python repository can be found at https://github.com/GeometricBCI/Tensor-CSPNet-and-Graph-CSPNet.

Submitted 20 August, 2023; v1 submitted 25 October, 2022; originally announced November 2022.

Comments: 15 pages, 5 figures, 6 Tables; This work has been accepted by the IEEE Transactions on Neural Networks and Learning Systems, 2023. Copyright will be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2207.07776 [pdf, other]

doi 10.21437/Interspeech.2022-10948

Adversarial Reweighting for Speaker Verification Fairness

Authors: Minho Jin, Chelsea J. -T. Ju, Zeya Chen, Yi-Chieh Liu, Jasha Droppo, Andreas Stolcke

Abstract: We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and shown to improve results across different subgroups of gender and nationality, without requiring annotation of subgroups in the training data. An adversarial network learns a weight for each training sample in the batch so that the main learner is forced to focus on poorly performing instances. Using a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female speakers, with relative EER reductions of 7.7%, 10.1% and 3.0%, respectively. For nationality subgroups, the proposed algorithm showed 1.04% EER for US speakers, 0.76% for UK speakers, and 1.22% for all others. The absolute EER gap between gender groups was reduced from 0.70% to 0.58%, while the standard deviation over nationality groups decreased from 0.21 to 0.19.

Submitted 15 July, 2022; originally announced July 2022.

Journal ref: Proc. Interspeech, Sept. 2022, pp. 4800-4804

arXiv:2206.12772 [pdf, other]

doi 10.1145/3503161.3548317

Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation

Authors: Jinxiang Liu, Chen Ju, Weidi Xie, Ya Zhang

Abstract: We present a simple yet effective self-supervised framework for audio-visual representation learning, to localize the sound source in videos. To understand what enables learning useful representations, we systematically investigate the effects of data augmentations, and reveal that (1) composition of data augmentations plays a critical role, i.e. explicitly encouraging the audio-visual representations to be invariant to various transformations (transformation invariance); (2) enforcing geometric consistency substantially improves the quality of learned representations, i.e. the detected sound source should follow the same transformation applied on input video frames (transformation equivariance). Extensive experiments demonstrate that our model significantly outperforms previous methods on two sound localization benchmarks, namely, Flickr-SoundNet and VGG-Sound. Additionally, we also evaluate audio retrieval and cross-modal retrieval tasks. In both cases, our self-supervised models demonstrate superior retrieval performances, even competitive with the supervised approach in audio retrieval. This reveals the proposed framework learns strong multi-modal representations that are beneficial to sound localisation and generalization to further applications. All codes will be available.

Submitted 15 August, 2022; v1 submitted 25 June, 2022; originally announced June 2022.

Comments: Camera-ready Version for ACMMM 2022, Project page is https://jinxiang-liu.github.io/SSL-TIE/

arXiv:2202.02472 [pdf, ps, other]

doi 10.1109/TNNLS.2022.3172108

Tensor-CSPNet: A Novel Geometric Deep Learning Framework for Motor Imagery Classification

Authors: Ce Ju, Cuntai Guan

Abstract: Deep learning (DL) has been widely investigated in a vast majority of applications in electroencephalography (EEG)-based brain-computer interfaces (BCIs), especially for motor imagery (MI) classification in the past five years. The mainstream DL methodology for the MI-EEG classification exploits the temporospatial patterns of EEG signals using convolutional neural networks (CNNs), which have remarkably succeeded in visual images. However, since the statistical characteristics of visual images depart radically from EEG signals, a natural question arises whether an alternative network architecture exists apart from CNNs. To address this question, we propose a novel geometric deep learning (GDL) framework called Tensor-CSPNet, which characterizes spatial covariance matrices derived from EEG signals on symmetric positive definite (SPD) manifolds and fully captures the temporospatiofrequency patterns using existing deep neural networks on SPD manifolds, integrating with experiences from many successful MI-EEG classifiers to optimize the framework. In the experiments, Tensor-CSPNet attains or slightly outperforms the current state-of-the-art performance on the cross-validation and holdout scenarios in two commonly-used MI-EEG datasets. Moreover, the visualization and interpretability analyses also exhibit the validity of Tensor-CSPNet for the MI-EEG classification. To conclude, in this study, we provide a feasible answer to the question by generalizing the DL methodologies on SPD manifolds, which indicates the start of a specific GDL methodology for the MI-EEG classification.

Submitted 23 September, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: 15 pages, 10 figures, 12 tables; This work has been accepted by the IEEE Transactions on Neural Networks and Learning Systems. Copyright will be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2201.05745 [pdf, other]

Deep Optimal Transport for Domain Adaptation on SPD Manifolds

Authors: Ce Ju, Cuntai Guan

Abstract: The machine learning community has shown increasing interest in addressing the domain adaptation problem on symmetric positive definite (SPD) manifolds. This interest is primarily driven by the complexities of neuroimaging data generated from brain signals, which often exhibit shifts in data distribution across recording sessions. These neuroimaging data, represented by signal covariance matrices, possess the mathematical properties of symmetry and positive definiteness. However, applying conventional domain adaptation methods is challenging because these mathematical properties can be disrupted when operating on covariance matrices. In this study, we introduce a novel geometric deep learning-based approach utilizing optimal transport on SPD manifolds to manage discrepancies in both marginal and conditional distributions between the source and target domains. We evaluate the effectiveness of this approach in three cross-session brain-computer interface scenarios and provide visualized results for further insights. The GitHub repository of this study can be accessed at https://github.com/GeometricBCI/Deep-Optimal-Transport-for-Domain-Adaptation-on-SPD-Manifolds.

Submitted 3 June, 2024; v1 submitted 14 January, 2022; originally announced January 2022.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

ACM Class: I.2.0

arXiv:2106.10169 [pdf, other]

Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition

Authors: Ruirui Li, Chelsea J. -T. Ju, Zeya Chen, Hongda Mao, Oguz Elibol, Andreas Stolcke

Abstract: By implicitly recognizing a user based on his/her speech input, speaker identification enables many downstream applications, such as personalized system behavior and expedited shopping checkouts. Based on whether the speech content is constrained or not, both text-dependent (TD) and text-independent (TI) speaker recognition models may be used. We wish to combine the advantages of both types of models through an ensemble system to make more reliable predictions. However, any such combined approach has to be robust to incomplete inputs, i.e., when either TD or TI input is missing. As a solution we propose a fusion of embeddings network (foenet) architecture, combining joint learning with neural attention. We compare foenet with four competitive baseline methods on a dataset of voice assistant inputs, and show that it achieves higher accuracy than the baseline and score fusion methods, especially in the presence of incomplete inputs.

Submitted 18 June, 2021; originally announced June 2021.

arXiv:2102.03745 [pdf, other]

Hierarchically Coordinated Energy Management for A Regional Multi-microgrid Community

Authors: Chengquan Ju

Abstract: This paper proposes a novel hierarchically coordinated energy management system (EMS) for a regional community (e.g., residential area, campus, industrial park, etc.) comprising multiple small-scale microgrids (MGs) (e.g., houses, buildings, etc.). It aims to minimize the total operational cost of the MG community and maximize the individual benefit of each MG simultaneously. At the local level inside each MG, with the detailed modeling of various energy resources including photovoltaics (PVs), energy storages (ESs), electric vehicles (EVs) and dispatchable loads, the individual optimization problem is formulated as a mixed-integer linear program (MILP). Local EMSs make power dispatch decisions for all the controllable units to minimize the operational cost in individual MGs. At the community level, a novel pairing algorithm is proposed to explicitly find the MG pairings with surplus and deficit. The community-level EMS employs the pairing algorithm to determine specific power exchanges among MGs and minimizes the energy transactions with the upstream grid. The operational cost of each individual MG is further reduced by additional economic benefits procured by the community-level EMS. The proposed method has distinguishing advantages on modeling generality, computational complexity and privacy protection, and its performance is verified by the simulation results.

Submitted 7 February, 2021; originally announced February 2021.

Comments: 11 pages

arXiv:2011.03682 [pdf, other]

Non-local convolutional neural networks (nlcnn) for speaker recognition

Authors: Haici Yang, Hongda Mao, Ruirui Li, Chelsea J. T. Ju, Oguz Elibol

Abstract: Speaker recognition is the process of identifying a speaker based on the voice. The technology has attracted more attention with the recent increase in popularity of smart voice assistants, such as Amazon Alexa. In the past few years, various convolutional neural network (CNN) based speaker recognition algorithms have been proposed and achieved satisfactory performance. However, convolutional operations are building blocks that typically operate on a local neighborhood at a time and thus fail to capture global, long-range interactions at the feature level, which are critical for understanding the pattern in a speaker's voice. In this work, we propose to apply Non-local Convolutional Neural Networks (NLCNN) to improve the capability of capturing long-range dependencies at the feature level, therefore improving speaker recognition performance. Specifically, we introduce non-local blocks where the output response of a position is computed as a weighted sum of the input features at all positions. Combining non-local blocks with pre-defined CNN networks, we investigate the effectiveness of NLCNN models. Without extensive tuning, the proposed NLCNN models outperform state-of-the-art speaker recognition algorithms on the public VoxCeleb dataset. What's more, we investigate different types of non-local operations applied to the frequency-time domain, time domain, frequency domain and frame-level respectively. Among them, time domain is the most effective one for speaker recognition applications.

Submitted 19 May, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

arXiv:2004.12321 [pdf, other]

doi 10.1109/EMBC44109.2020.9175344

Federated Transfer Learning for EEG Signal Classification

Authors: Ce Ju, Dashan Gao, Ravikiran Mane, Ben Tan, Yang Liu, Cuntai Guan

Abstract: The success of deep learning (DL) methods in the Brain-Computer Interfaces (BCI) field for classification of electroencephalographic (EEG) recordings has been restricted by the lack of large datasets. Privacy concerns associated with EEG signals limit the possibility of constructing a large EEG-BCI dataset by the conglomeration of multiple small ones for jointly training machine learning models. Hence, in this paper, we propose a novel privacy-preserving DL architecture named federated transfer learning (FTL) for EEG classification that is based on the federated learning framework. Working with the single-trial covariance matrix, the proposed architecture extracts common discriminative information from multi-subject EEG data with the help of domain adaptation techniques. We evaluate the performance of the proposed architecture on the PhysioNet dataset for 2-class motor imagery classification. While avoiding the actual data sharing, our FTL approach achieves 2% higher classification accuracy in a subject-adaptive analysis. Also, in the absence of multi-subject data, our architecture provides 6% better accuracy compared to other state-of-the-art DL architectures.

Submitted 25 January, 2021; v1 submitted 26 April, 2020; originally announced April 2020.

Comments: 6 pages, 2 figures, Accepted for IEEE Engineering in Medicine and Biology Society (EMBC) 2020 GitHub: https://github.com/DashanGao/Federated-Transfer-Leraning-for-EEG

ACM Class: I.5.4

Journal ref: 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 2020, pp. 3040-3045

arXiv:2002.08602 [pdf, other]

A Hybrid Systems-based Hierarchical Control Architecture for Heterogeneous Field Robot Teams

Authors: Chanyoung Ju, Hyoung Il Son

Abstract: Field robot systems have recently been applied to a wide range of research fields. Making such systems more automated, advanced, and activated requires cooperation among heterogeneous robots. Classic control theory is inefficient in managing large-scale complex dynamic systems. Therefore, the supervisory control theory based on discrete event system needs to be introduced to overcome this limitation. In this study, we propose a hybrid systems-based hierarchical control architecture through a supervisory control-based high-level controller and a traditional control-based low-level controller. The hybrid systems and its dynamics are modeled through a formal method called hybrid automata, and the behavior specifications expressing the control objectives for cooperation are designed. Additionally, a modular supervisor that is more scalable and maintainable than a centralized supervisory controller was synthesized. The proposed hybrid systems and hierarchical control architecture were implemented, validated, and then evaluated for performance through the physics-based simulator. Experimental results confirmed that the heterogeneous field robot team satisfied the given specifications and presented systematic results, validating the efficiency of the proposed control architecture.

Submitted 20 February, 2020; originally announced February 2020.

Comments: 23 pages, 19 figures, submitted for publication

arXiv:2002.07630 [pdf]

Extending iLQR method with control delay

Authors: Cheng Ju, Yan Qin, Chunjiang Fu

Abstract: The iterative linear quadratic regulator (iLQR) has become a benchmark method for nonlinear stochastic optimal control problems. However, it does not apply to systems with delay. In this paper, we extend the iLQR theory and prove a new theorem for the case of an input signal with fixed delay, which could be beneficial for machine learning or optimal control applications to real-time robots or human assistive devices.

Submitted 15 February, 2020; originally announced February 2020.

arXiv:1909.05784 [pdf, other]

HHHFL: Hierarchical Heterogeneous Horizontal Federated Learning for Electroencephalography

Authors: Dashan Gao, Ce Ju, Xiguang Wei, Yang Liu, Tianjian Chen, Qiang Yang

Abstract: Electroencephalography (EEG) classification techniques have been widely studied for human behavior and emotion recognition tasks. But it is still a challenging issue since the data may vary from subject to subject, may change over time for the same subject, and may be heterogeneous. In recent years, increasing privacy-preserving demands pose new challenges to this task. The data heterogeneity, as well as the privacy constraint of the EEG data, has not been addressed in previous studies. To fill this gap, in this paper, we propose a heterogeneous federated learning approach to train machine learning models over heterogeneous EEG data, while preserving the data privacy of each party. To verify the effectiveness of our approach, we conduct experiments on a real-world EEG dataset, consisting of heterogeneous data collected from diverse devices. Our approach achieves consistent performance improvement on every task.

Submitted 10 September, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

Comments: 5 pages, 6 figures, Accepted for International Workshop on Federated Machine Learning for User Privacy and Data Confidentiality in Conjunction with IJCAI 2019 (FL-IJCAI'2019)

ACM Class: I.2.6; I.2.11

Showing 1–16 of 16 results for author: Ju, C