-
Few-Shot Continual Learning via Flat-to-Wide Approaches
Authors:
Muhammad Anwar Ma'sum,
Mahardhika Pratama,
Edwin Lughofer,
Lin Liu,
Habibullah,
Ryszard Kowalczyk
Abstract:
Existing approaches on continual learning call for a lot of samples in their training processes. Such approaches are impractical for many real-world problems having limited samples because of the overfitting problem. This paper proposes a few-shot continual learning approach, termed FLat-tO-WidE AppRoach (FLOWER), where a flat-to-wide learning process finding the flat-wide minima is proposed to address the catastrophic forgetting problem. The issue of data scarcity is overcome with a data augmentation approach making use of a ball generator concept to restrict the sampling space into the smallest enclosing ball. Our numerical studies demonstrate the advantage of FLOWER achieving significantly improved performances over prior arts notably in the small base tasks. For further study, source codes of FLOWER, competitor algorithms and experimental logs are shared publicly at https://github.com/anwarmaxsum/FLOWER.
Submitted 13 July, 2023; v1 submitted 25 June, 2023;
originally announced June 2023.
-
Assessor-Guided Learning for Continual Environments
Authors:
Muhammad Anwar Ma'sum,
Mahardhika Pratama,
Edwin Lughofer,
Weiping Ding,
Wisnu Jatmiko
Abstract:
This paper proposes an assessor-guided learning strategy for continual learning where an assessor guides the learning process of a base learner by controlling the direction and pace of the learning process thus allowing an efficient learning of new environments while protecting against the catastrophic interference problem. The assessor is trained in a meta-learning manner with a meta-objective to boost the learning process of the base learner. It performs a soft-weighting mechanism of every sample accepting positive samples while rejecting negative samples. The training objective of a base learner is to minimize a meta-weighted combination of the cross entropy loss function, the dark experience replay (DER) loss function and the knowledge distillation loss function whose interactions are controlled in such a way to attain an improved performance. A compensated over-sampling (COS) strategy is developed to overcome the class imbalance problem of the episodic memory due to limited memory budgets. Our approach, Assessor-Guided Learning Approach (AGLA), has been evaluated in the class-incremental and task-incremental learning problems. AGLA achieves improved performances compared to its competitors while the theoretical analysis of the COS strategy is offered. Source codes of AGLA, baseline algorithms and experimental logs are shared publicly at https://github.com/anwarmaxsum/AGLA for further study.
Submitted 21 March, 2023;
originally announced March 2023.
-
Evolving Multi-Label Fuzzy Classifier
Authors:
Edwin Lughofer
Abstract:
Multi-label classification has attracted much attention in the machine learning community to address the problem of assigning single samples to more than one class at the same time. We propose an evolving multi-label fuzzy classifier (EFC-ML) which is able to self-adapt and self-evolve its structure with new incoming multi-label samples in an incremental, single-pass manner. It is based on a multi-output Takagi-Sugeno type architecture, where for each class a separate consequent hyper-plane is defined. The learning procedure embeds a locally weighted incremental correlation-based algorithm combined with (conventional) recursive fuzzily weighted least squares and Lasso-based regularization. The correlation-based part ensures that the interrelations between class labels, a specific well-known property in multi-label classification for improved performance, are preserved properly; the Lasso-based regularization reduces the curse of dimensionality effects in the case of a higher number of inputs. Antecedent learning is achieved by product-space clustering and conducted for all class labels together, which yields a single rule base, allowing a compact knowledge view. Furthermore, our approach comes with an online active learning (AL) strategy for updating the classifier on just a number of selected samples, which in turn makes the approach applicable for scarcely labelled streams in applications, where the annotation effort is typically expensive. Our approach was evaluated on several data sets from the MULAN repository and showed significantly improved classification accuracy compared to (evolving) one-versus-rest or classifier chaining concepts. A significant result was that, due to the online AL method, a 90% reduction in the number of samples used for classifier updates had little effect on the accumulated accuracy trend lines compared to a full update in most data set cases.
Submitted 29 March, 2022;
originally announced March 2022.
-
Scalable Teacher Forcing Network for Semi-Supervised Large Scale Data Streams
Authors:
Mahardhika Pratama,
Choiru Za'in,
Edwin Lughofer,
Eric Pardede,
Dwi A. P. Rahayu
Abstract:
The large-scale data stream problem refers to high-speed information flow which cannot be processed in a scalable manner under a traditional computing platform. This problem also imposes expensive labelling cost making the deployment of fully supervised algorithms unfeasible. On the other hand, the problem of semi-supervised large-scale data streams is little explored in the literature because most works are designed in the traditional single-node computing environments while also being fully supervised approaches. This paper offers Weakly Supervised Scalable Teacher Forcing Network (WeScatterNet) to cope with the scarcity of labelled samples and the large-scale data streams simultaneously. WeScatterNet is crafted under the distributed computing platform of Apache Spark with a data-free model fusion strategy for model compression after the parallel computing stage. It features an open network structure to address the global and local drift problems while integrating a data augmentation, annotation and auto-correction ($DA^3$) method for handling partially labelled data streams. The performance of WeScatterNet is numerically evaluated in six large-scale data stream problems with only 25% label proportions. It shows highly competitive performance even if compared with fully supervised learners with 100% label proportions.
Submitted 25 June, 2021;
originally announced July 2021.
-
Unsupervised Continual Learning via Self-Adaptive Deep Clustering Approach
Authors:
Mahardhika Pratama,
Andri Ashfahani,
Edwin Lughofer
Abstract:
Unsupervised continual learning remains a relatively uncharted territory in the existing literature because the vast majority of existing works call for unlimited access of ground truth incurring expensive labelling cost. Another issue lies in the problem of task boundaries and task IDs which must be known for model's updates or model's predictions hindering feasibility for real-time deployment. Knowledge Retention in Self-Adaptive Deep Continual Learner (KIERA) is proposed in this paper. KIERA is developed from the notion of flexible deep clustering approach possessing an elastic network structure to cope with changing environments in a timely manner. The centroid-based experience replay is put forward to overcome the catastrophic forgetting problem. KIERA does not exploit any labelled samples for model updates while featuring a task-agnostic merit. The advantage of KIERA has been numerically validated in popular continual learning problems where it shows highly competitive performance compared to state-of-the-art approaches. Our implementation is available at https://github.com/ContinualAL/KIERA.
Submitted 28 June, 2021;
originally announced June 2021.
-
Autonomous Deep Quality Monitoring in Streaming Environments
Authors:
Andri Ashfahani,
Mahardhika Pratama,
Edwin Lughofer,
Edward Yapp Kien Yee
Abstract:
The common practice of quality monitoring in industry relies on manual inspection well-known to be slow, error-prone and operator-dependent. This issue raises strong demand for automated real-time quality monitoring developed from data-driven approaches thus alleviating operator dependence and adapting to various process uncertainties. Nonetheless, current approaches do not take into account the streaming nature of sensory information while relying heavily on hand-crafted features making them application-specific. This paper proposes the online quality monitoring methodology developed from recently developed deep learning algorithms for data streams, Neural Networks with Dynamically Evolved Capacity (NADINE), namely NADINE++. It features the integration of 1-D and 2-D convolutional layers to extract natural features of time-series and visual data streams captured from sensors and cameras of the injection molding machines from our own project. Real-time experiments have been conducted where the online quality monitoring task is simulated on the fly under the prequential test-then-train fashion - the prominent data stream evaluation protocol. Comparison with the state-of-the-art techniques clearly exhibits the advantage of NADINE++ with 4.68% improvement on average for the quality monitoring task in streaming environments. To support the reproducible research initiative, codes, results of NADINE++ along with supplementary materials and injection molding dataset are made available at https://github.com/ContinualAL/NADINE-IJCNN2021.
Submitted 26 June, 2021;
originally announced June 2021.
-
DEVDAN: Deep Evolving Denoising Autoencoder
Authors:
Andri Ashfahani,
Mahardhika Pratama,
Edwin Lughofer,
Yew Soon Ong
Abstract:
The Denoising Autoencoder (DAE) enhances the flexibility of the data stream method in exploiting unlabeled samples. Nonetheless, the feasibility of DAE for data stream analytics deserves an in-depth study because it characterizes a fixed network capacity that cannot adapt to rapidly changing environments. Deep evolving denoising autoencoder (DEVDAN) is proposed in this paper. It features an open structure in the generative phase and the discriminative phase where the hidden units can be automatically added and discarded on the fly. The generative phase refines the predictive performance of the discriminative model exploiting unlabeled data. Furthermore, DEVDAN is free of the problem-specific threshold and works fully in the single-pass learning fashion. We show that DEVDAN can find competitive network architecture compared with state-of-the-art methods on the classification task using ten prominent datasets simulated under the prequential test-then-train protocol.
Submitted 9 January, 2020; v1 submitted 8 October, 2019;
originally announced October 2019.
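The prequential test-then-train protocol named in the abstract above can be illustrated with a minimal sketch: every incoming sample is first used to test the current model and only afterwards to update it, so accuracy is always measured on unseen data. The `MajorityClass` learner and the `prequential_accuracy` helper are hypothetical stand-ins chosen for brevity; they are not DEVDAN and are not taken from the authors' code.

```python
from collections import Counter


class MajorityClass:
    """Toy online learner (illustrative only): predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first label arrives.
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, model):
    """Test-then-train: score each sample before the model is updated with it."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict(x) == y)  # test on the unseen sample
        model.learn(x, y)                      # then train on that same sample
        total += 1
    return correct / total if total else 0.0


stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
accuracy = prequential_accuracy(stream, MajorityClass())
```

Any learner exposing `predict` and `learn` could be dropped into the same loop; the single pass over the stream is the point of the protocol.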
-
ATL: Autonomous Knowledge Transfer from Many Streaming Processes
Authors:
Mahardhika Pratama,
Marcus de Carvalho,
Renchunzi Xie,
Edwin Lughofer,
Jie Lu
Abstract:
Transferring knowledge across many streaming processes remains an uncharted territory in the existing literature and features unique characteristics: no labelled instance of the target domain, covariate shift of source and target domain, different period of drifts in the source and target domains. Autonomous transfer learning (ATL) is proposed in this paper as a flexible deep learning approach for the online unsupervised transfer learning problem across many streaming processes. ATL offers an online domain adaptation strategy via the generative and discriminative phases coupled with the KL divergence based optimization strategy to produce a domain invariant network while putting forward an elastic network structure. It automatically evolves its network structure from scratch with/without the presence of ground truth to overcome independent concept drifts in the source and target domain. The rigorous numerical evaluation has been conducted along with a comparison against recently published works. ATL demonstrates improved performance while showing significantly faster training speed than its counterparts.
Submitted 19 October, 2019; v1 submitted 8 October, 2019;
originally announced October 2019.
-
PAC: A Novel Self-Adaptive Neuro-Fuzzy Controller for Micro Aerial Vehicles
Authors:
Md Meftahul Ferdaus,
Mahardhika Pratama,
Sreenatha G. Anavatti,
Matthew A. Garratt,
Edwin Lughofer
Abstract:
There exists an increasing demand for a flexible and computationally efficient controller for micro aerial vehicles (MAVs) due to a high degree of environmental perturbations. In this work, an evolving neuro-fuzzy controller, namely Parsimonious Controller (PAC) is proposed. It features fewer network parameters than conventional approaches due to the absence of rule premise parameters. PAC is built upon a recently developed evolving neuro-fuzzy system known as parsimonious learning machine (PALM) and adopts new rule growing and pruning modules derived from the approximation of bias and variance. These rule adaptation methods have no reliance on user-defined thresholds, thereby increasing the PAC's autonomy for real-time deployment. PAC adapts the consequent parameters with the sliding mode control (SMC) theory in the single-pass fashion. The boundedness and convergence of the closed-loop control system's tracking error and the controller's consequent parameters are confirmed by utilizing the LaSalle-Yoshizawa theorem. Lastly, the controller's efficacy is evaluated by observing various trajectory tracking performance from a bio-inspired flapping-wing micro aerial vehicle (BI-FWMAV) and a rotary wing micro aerial vehicle called hexacopter. Furthermore, it is compared to three distinctive controllers. Our PAC outperforms the linear PID controller and feed-forward neural network (FFNN) based nonlinear adaptive controller. Compared to its predecessor, G-controller, the tracking accuracy is comparable, but the PAC incurs significantly fewer parameters to attain similar or better performance than the G-controller.
Submitted 8 October, 2019; v1 submitted 8 November, 2018;
originally announced November 2018.
-
Autonomous Deep Learning: Incremental Learning of Denoising Autoencoder for Evolving Data Streams
Authors:
Mahardhika Pratama,
Andri Ashfahani,
Yew Soon Ong,
Savitha Ramasamy,
Edwin Lughofer
Abstract:
The generative learning phase of Autoencoder (AE) and its successor Denoising Autoencoder (DAE) enhances the flexibility of data stream method in exploiting unlabelled samples. Nonetheless, the feasibility of DAE for data stream analytics deserves in-depth study because it characterizes a fixed network capacity which cannot adapt to rapidly changing environments. An automated construction of a denoising autoencoder, namely deep evolving denoising autoencoder (DEVDAN), is proposed in this paper. DEVDAN features an open structure both in the generative phase and in the discriminative phase where input features can be automatically added and discarded on the fly. A network significance (NS) method is formulated in this paper and is derived from the bias-variance concept. This method is capable of estimating the statistical contribution of the network structure and its hidden units which precursors an ideal state to add or prune input features. Furthermore, DEVDAN is free of the problem-specific threshold and works fully in the single-pass learning fashion. The efficacy of DEVDAN is numerically validated using nine non-stationary data stream problems simulated under the prequential test-then-train protocol where DEVDAN is capable of delivering an improvement of classification accuracy to recently published online learning works while having flexibility in the automatic extraction of robust input features and in adapting to rapidly changing environments.
Submitted 24 September, 2018;
originally announced September 2018.
-
An Online RFID Localization in the Manufacturing Shopfloor
Authors:
Andri Ashfahani,
Mahardhika Pratama,
Edwin Lughofer,
Qing Cai,
Huang Sheng
Abstract:
Radio Frequency Identification technology has gained popularity for cheap and easy deployment. In the realm of manufacturing shopfloor, it can be used to track the location of manufacturing objects to achieve better efficiency. The underlying challenge of localization lies in the non-stationary characteristics of manufacturing shopfloor which calls for an adaptive life-long learning strategy in order to arrive at accurate localization results. This paper presents an evolving model based on a novel evolving intelligent system, namely evolving Type-2 Quantum Fuzzy Neural Network (eT2QFNN), which features an interval type-2 quantum fuzzy set with uncertain jump positions. The quantum fuzzy set possesses a graded membership degree which enables better identification of overlaps between classes. The eT2QFNN works fully in the evolving mode where all parameters including the number of rules are automatically adjusted and generated on the fly. The parameter adjustment scenario relies on decoupled extended Kalman filter method. Our numerical study shows that eT2QFNN is able to deliver comparable accuracy compared to state-of-the-art algorithms.
Submitted 29 May, 2019; v1 submitted 20 May, 2018;
originally announced May 2018.
-
Robust Unsupervised Domain Adaptation for Neural Networks via Moment Alignment
Authors:
Werner Zellinger,
Bernhard A. Moser,
Thomas Grubinger,
Edwin Lughofer,
Thomas Natschläger,
Susanne Saminger-Platz
Abstract:
A novel approach for unsupervised domain adaptation for neural networks is proposed. It relies on metric-based regularization of the learning process. The metric-based regularization aims at domain-invariant latent feature representations by means of maximizing the similarity between domain-specific activation distributions. The proposed metric results from modifying an integral probability metric such that it becomes less translation-sensitive on a polynomial function space. The metric has an intuitive interpretation in the dual space as the sum of differences of higher order central moments of the corresponding activation distributions. Under appropriate assumptions on the input distributions, error minimization is proven for the continuous case. As demonstrated by an analysis of standard benchmark experiments for sentiment analysis, object recognition and digit recognition, the outlined approach is robust regarding parameter changes and achieves higher classification accuracies than comparable approaches. The source code is available at https://github.com/wzell/mann.
Submitted 13 August, 2019; v1 submitted 16 November, 2017;
originally announced November 2017.
-
Online Tool Condition Monitoring Based on Parsimonious Ensemble+
Authors:
Mahardhika Pratama,
Eric Dimla,
Edwin Lughofer,
Witold Pedrycz,
Tegoeh Tjahjowidowo
Abstract:
Accurate diagnosis of tool wear in metal turning process remains an open challenge for both scientists and industrial practitioners because of inhomogeneities in workpiece material, nonstationary machining settings to suit production requirements, and nonlinear relations between measured variables and tool wear. Common methodologies for tool condition monitoring still rely on batch approaches which cannot cope with a fast sampling rate of metal cutting process. Furthermore they require a retraining process to be completed from scratch when dealing with a new set of machining parameters. This paper presents an online tool condition monitoring approach based on Parsimonious Ensemble+, pENsemble+. The unique feature of pENsemble+ lies in its highly flexible principle where both ensemble structure and base-classifier structure can automatically grow and shrink on the fly based on the characteristics of data streams. Moreover, the online feature selection scenario is integrated to actively sample relevant input attributes. The paper presents advancement of a newly developed ensemble learning algorithm, pENsemble+, where online active learning scenario is incorporated to reduce operator labelling effort. The ensemble merging scenario is proposed which allows reduction of ensemble complexity while retaining its diversity. Experimental studies utilising real-world manufacturing data streams and comparisons with well known algorithms were carried out. Furthermore, the efficacy of pENsemble was examined using benchmark concept drift data streams. It has been found that pENsemble+ incurs low structural complexity and results in a significant reduction of operator labelling effort.
Submitted 7 December, 2019; v1 submitted 6 November, 2017;
originally announced November 2017.
-
Evolving Ensemble Fuzzy Classifier
Authors:
Mahardhika Pratama,
Witold Pedrycz,
Edwin Lughofer
Abstract:
The concept of ensemble learning offers a promising avenue in learning from data streams under complex environments because it addresses the bias and variance dilemma better than its single model counterpart and features a reconfigurable structure, which is well suited to the given context. While various extensions of ensemble learning for mining non-stationary data streams can be found in the literature, most of them are crafted under a static base classifier and revisits preceding samples in the sliding window for a retraining step. This feature causes computationally prohibitive complexity and is not flexible enough to cope with rapidly changing environments. Their complexities are often demanding because it involves a large collection of offline classifiers due to the absence of structural complexities reduction mechanisms and lack of an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble pENsemble, is proposed in this paper. pENsemble differs from existing architectures in the fact that it is built upon an evolving classifier from data streams, termed Parsimonious Classifier pClass. pENsemble is equipped by an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into the pENsemble. This method allows for dynamic selection and deselection of input features on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision where it features a novel drift detection scenario to grow the ensemble structure. The efficacy of the pENsemble has been numerically demonstrated through rigorous numerical studies with dynamic and evolving data streams where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity.
Submitted 7 December, 2019; v1 submitted 18 May, 2017;
originally announced May 2017.
-
Metacognitive Learning Approach for Online Tool Condition Monitoring
Authors:
Mahardhika Pratama,
Eric Dimla,
Chow Yin Lai,
Edwin Lughofer
Abstract:
As manufacturing processes become increasingly automated, so should tool condition monitoring (TCM) as it is impractical to have human workers monitor the state of the tools continuously. Tool condition is crucial to ensure the good quality of products: Worn tools affect not only the surface quality but also the dimensional accuracy, which means higher reject rate of the products. Therefore, there is an urgent need to identify tool failures before it occurs on the fly. While various versions of intelligent tool condition monitoring have been proposed, most of them suffer from a cognitive nature of traditional machine learning algorithms. They focus on the how to learn process without paying attention to other two crucial issues: what to learn, and when to learn. The what to learn and the when to learn provide self regulating mechanisms to select the training samples and to determine time instants to train a model. A novel tool condition monitoring approach based on a psychologically plausible concept, namely the metacognitive scaffolding theory, is proposed and built upon a recently published algorithm, recurrent classifier (rClass). The learning process consists of three phases: what to learn, how to learn, when to learn and makes use of a generalized recurrent network structure as a cognitive component. Experimental studies with real-world manufacturing data streams were conducted where rClass demonstrated the highest accuracy while retaining the lowest complexity over its counterparts.
Submitted 6 May, 2017;
originally announced May 2017.
-
Parsimonious Random Vector Functional Link Network for Data Streams
Authors:
Mahardhika Pratama,
Plamen P. Angelov,
Edwin Lughofer
Abstract:
The theory of random vector functional link network (RVFLN) has provided a breakthrough in the design of neural networks (NNs) since it conveys solid theoretical justification of randomized learning. Existing works in RVFLN are hardly scalable for data stream analytics because they are inherent to the issue of complexity as a result of the absence of structural learning scenarios. A novel class of RVFLN, namely parsimonious random vector functional link network (pRVFLN), is proposed in this paper. pRVFLN features an open structure paradigm where its network structure can be built from scratch and can be automatically generated in accordance with degree of nonlinearity and time-varying property of system being modelled. pRVFLN is equipped with complexity reduction scenarios where inconsequential hidden nodes can be pruned and input features can be dynamically selected. pRVFLN puts into perspective an online active learning mechanism which expedites the training process and relieves operator labelling efforts. In addition, pRVFLN introduces a non-parametric type of hidden node, developed using an interval-valued data cloud. The hidden node completely reflects the real data distribution and is not constrained by a specific shape of the cluster. All learning procedures of pRVFLN follow a strictly single-pass learning mode, which is applicable for an online real-time deployment. The efficacy of pRVFLN was rigorously validated through numerous simulations and comparisons with state-of-the-art algorithms where it produced the most encouraging numerical results. Furthermore, the robustness of pRVFLN was investigated and a new conclusion is made to the scope of random parameters where it plays a vital role to the success of randomized learning.
Submitted 6 May, 2017; v1 submitted 10 April, 2017;
originally announced April 2017.
-
Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning
Authors:
Werner Zellinger,
Thomas Grubinger,
Edwin Lughofer,
Thomas Natschläger,
Susanne Saminger-Platz
Abstract:
The learning of domain-invariant representations in the context of domain adaptation with neural networks is considered. We propose a new regularization method that minimizes the discrepancy between domain-specific latent feature representations directly in the hidden activation space. Although some standard distribution matching approaches exist that can be interpreted as the matching of weighted sums of moments, e.g. Maximum Mean Discrepancy (MMD), an explicit order-wise matching of higher order moments has not been considered before. We propose to match the higher order central moments of probability distributions by means of order-wise moment differences. Our model does not require computationally expensive distance and kernel matrix computations. We utilize the equivalent representation of probability distributions by moment sequences to define a new distance function, called Central Moment Discrepancy (CMD). We prove that CMD is a metric on the set of probability distributions on a compact interval. We further prove that convergence of probability distributions on compact intervals w.r.t. the new metric implies convergence in distribution of the respective random variables. We test our approach on two different benchmark data sets for object recognition (Office) and sentiment analysis of product reviews (Amazon reviews). CMD achieves a new state-of-the-art performance on most domain adaptation tasks of Office and outperforms networks trained with MMD, Variational Fair Autoencoders and Domain Adversarial Neural Networks on Amazon reviews. In addition, a post-hoc parameter sensitivity analysis shows that the new approach is stable w.r.t. parameter changes in a certain interval. The source code of the experiments is publicly available.
Submitted 2 May, 2019; v1 submitted 28 February, 2017;
originally announced February 2017.
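The order-wise moment matching described in the abstract above can be illustrated with a minimal sketch. It assumes the empirical form of CMD as a sum of Euclidean distances between per-feature means and higher-order central moments, with the k-th term scaled by 1/|b-a|^k for activations bounded in [a, b]. The function names, the default `k_max=5` and the pure-Python layout are illustrative assumptions, not the authors' released code.

```python
import math


def central_moments(samples, k_max):
    """Return [mean, c_2, ..., c_k_max]; each entry is a per-feature list."""
    n, d = len(samples), len(samples[0])
    mean = [sum(row[j] for row in samples) / n for j in range(d)]
    moments = [mean]
    for k in range(2, k_max + 1):
        # k-th central moment of each feature, estimated from the samples
        moments.append(
            [sum((row[j] - mean[j]) ** k for row in samples) / n for j in range(d)]
        )
    return moments


def cmd(x, y, k_max=5, a=0.0, b=1.0):
    """Empirical central moment discrepancy between two samples of activations in [a, b]."""
    span = abs(b - a)
    pairs = zip(central_moments(x, k_max), central_moments(y, k_max))
    total = 0.0
    for k, (mx, my) in enumerate(pairs, start=1):
        # order-wise difference: k = 1 compares means, k >= 2 compares central moments
        total += math.dist(mx, my) / span ** k
    return total


x = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.9]]
y = [[0.2, 0.3], [0.4, 0.5], [0.6, 1.0]]  # x shifted by 0.1 in every feature
d_xy = cmd(x, y)
```

Because `y` is a pure shift of `x`, only the first-order (mean) term contributes here; the central moments are translation-invariant. No pairwise distance or kernel matrix is formed, which matches the abstract's remark about computational cost.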