-
Graph Mining under Data scarcity
Authors:
Appan Rakaraddi,
Lam Siew-Kei,
Mahardhika Pratama,
Marcus de Carvalho
Abstract:
Multitude of deep learning models have been proposed for node classification in graphs. However, they tend to perform poorly under labeled-data scarcity. Although Few-shot learning for graphs has been introduced to overcome this problem, the existing models are not easily adaptable for generic graph learning frameworks like Graph Neural Networks (GNNs). Our work proposes an Uncertainty Estimator f…
▽ More
Multitude of deep learning models have been proposed for node classification in graphs. However, they tend to perform poorly under labeled-data scarcity. Although Few-shot learning for graphs has been introduced to overcome this problem, the existing models are not easily adaptable for generic graph learning frameworks like Graph Neural Networks (GNNs). Our work proposes an Uncertainty Estimator framework that can be applied on top of any generic GNN backbone network (which are typically designed for supervised/semi-supervised node classification) to improve the node classification performance. A neural network is used to model the Uncertainty Estimator as a probability distribution rather than probabilistic discrete scalar values. We train these models under the classic episodic learning paradigm in the $n$-way, $k$-shot fashion, in an end-to-end setting. Our work demonstrates that implementation of the uncertainty estimator on a GNN backbone network improves the classification accuracy under Few-shot setting without any meta-learning specific architecture. We conduct experiments on multiple datasets under different Few-shot settings and different GNN-based backbone networks. Our method outperforms the baselines, which demonstrates the efficacy of the Uncertainty Estimator for Few-shot node classification on graphs with a GNN.
△ Less
Submitted 11 June, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
-
Decoding AI: The inside story of data analysis in ChatGPT
Authors:
Ozan Evkaya,
Miguel de Carvalho
Abstract:
As a result of recent advancements in generative AI, the field of Data Science is prone to various changes. This review critically examines the Data Analysis (DA) capabilities of ChatGPT assessing its performance across a wide range of tasks. While DA provides researchers and practitioners with unprecedented analytical capabilities, it is far from being perfect, and it is important to recognize an…
▽ More
As a result of recent advancements in generative AI, the field of Data Science is prone to various changes. This review critically examines the Data Analysis (DA) capabilities of ChatGPT assessing its performance across a wide range of tasks. While DA provides researchers and practitioners with unprecedented analytical capabilities, it is far from being perfect, and it is important to recognize and address its limitations.
△ Less
Submitted 12 April, 2024;
originally announced April 2024.
-
Towards Cross-Domain Continual Learning
Authors:
Marcus de Carvalho,
Mahardhika Pratama,
Jie Zhang,
Chua Haoyan,
Edward Yapp
Abstract:
Continual learning is a process that involves training learning agents to sequentially master a stream of tasks or classes without revisiting past data. The challenge lies in leveraging previously acquired knowledge to learn new tasks efficiently, while avoiding catastrophic forgetting. Existing methods primarily focus on single domains, restricting their applicability to specific problems.
In t…
▽ More
Continual learning is a process that involves training learning agents to sequentially master a stream of tasks or classes without revisiting past data. The challenge lies in leveraging previously acquired knowledge to learn new tasks efficiently, while avoiding catastrophic forgetting. Existing methods primarily focus on single domains, restricting their applicability to specific problems.
In this work, we introduce a novel approach called Cross-Domain Continual Learning (CDCL) that addresses the limitations of being limited to single supervised domains. Our method combines inter- and intra-task cross-attention mechanisms within a compact convolutional network. This integration enables the model to maintain alignment with features from previous tasks, thereby delaying the data drift that may occur between tasks, while performing unsupervised cross-domain (UDA) between related domains. By leveraging an intra-task-specific pseudo-labeling method, we ensure accurate input pairs for both labeled and unlabeled samples, enhancing the learning process. To validate our approach, we conduct extensive experiments on public UDA datasets, showcasing its positive performance on cross-domain continual learning challenges. Additionally, our work introduces incremental ideas that contribute to the advancement of this field.
We make our code and models available to encourage further exploration and reproduction of our results: \url{https://github.com/Ivsucram/CDCL}
△ Less
Submitted 19 February, 2024;
originally announced February 2024.
-
A parallelizable model-based approach for marginal and multivariate clustering
Authors:
Miguel de Carvalho,
Gabriel Martos Venturini,
Andrej Svetlošák
Abstract:
This paper develops a clustering method that takes advantage of the sturdiness of model-based clustering, while attempting to mitigate some of its pitfalls. First, we note that standard model-based clustering likely leads to the same number of clusters per margin, which seems a rather artificial assumption for a variety of datasets. We tackle this issue by specifying a finite mixture model per mar…
▽ More
This paper develops a clustering method that takes advantage of the sturdiness of model-based clustering, while attempting to mitigate some of its pitfalls. First, we note that standard model-based clustering likely leads to the same number of clusters per margin, which seems a rather artificial assumption for a variety of datasets. We tackle this issue by specifying a finite mixture model per margin that allows each margin to have a different number of clusters, and then cluster the multivariate data using a strategy game-inspired algorithm to which we call Reign-and-Conquer. Second, since the proposed clustering approach only specifies a model for the margins -- but leaves the joint unspecified -- it has the advantage of being partially parallelizable; hence, the proposed approach is computationally appealing as well as more tractable for moderate to high dimensions than a `full' (joint) model-based clustering approach. A battery of numerical experiments on artificial data indicate an overall good performance of the proposed methods in a variety of scenarios, and real datasets are used to showcase their application in practice.
△ Less
Submitted 7 December, 2022;
originally announced December 2022.
-
Class-Incremental Learning via Knowledge Amalgamation
Authors:
Marcus de Carvalho,
Mahardhika Pratama,
Jie Zhang,
Yajuan San
Abstract:
Catastrophic forgetting has been a significant problem hindering the deployment of deep learning algorithms in the continual learning setting. Numerous methods have been proposed to address the catastrophic forgetting problem where an agent loses its generalization power of old tasks while learning new tasks. We put forward an alternative strategy to handle the catastrophic forgetting with knowled…
▽ More
Catastrophic forgetting has been a significant problem hindering the deployment of deep learning algorithms in the continual learning setting. Numerous methods have been proposed to address the catastrophic forgetting problem where an agent loses its generalization power of old tasks while learning new tasks. We put forward an alternative strategy to handle the catastrophic forgetting with knowledge amalgamation (CFA), which learns a student network from multiple heterogeneous teacher models specializing in previous tasks and can be applied to current offline methods. The knowledge amalgamation process is carried out in a single-head manner with only a selected number of memorized samples and no annotations. The teachers and students do not need to share the same network structure, allowing heterogeneous tasks to be adapted to a compact or sparse data representation. We compare our method with competitive baselines from different strategies, demonstrating our approach's advantages.
△ Less
Submitted 5 September, 2022;
originally announced September 2022.
-
Reinforced Continual Learning for Graphs
Authors:
Appan Rakaraddi,
Siew Kei Lam,
Mahardhika Pratama,
Marcus De Carvalho
Abstract:
Graph Neural Networks (GNNs) have become the backbone for a myriad of tasks pertaining to graphs and similar topological data structures. While many works have been established in domains related to node and graph classification/regression tasks, they mostly deal with a single task. Continual learning on graphs is largely unexplored and existing graph continual learning approaches are limited to t…
▽ More
Graph Neural Networks (GNNs) have become the backbone for a myriad of tasks pertaining to graphs and similar topological data structures. While many works have been established in domains related to node and graph classification/regression tasks, they mostly deal with a single task. Continual learning on graphs is largely unexplored and existing graph continual learning approaches are limited to the task-incremental learning scenarios. This paper proposes a graph continual learning strategy that combines the architecture-based and memory-based approaches. The structural learning strategy is driven by reinforcement learning, where a controller network is trained in such a way to determine an optimal number of nodes to be added/pruned from the base network when new tasks are observed, thus assuring sufficient network capacities. The parameter learning strategy is underpinned by the concept of Dark Experience replay method to cope with the catastrophic forgetting problem. Our approach is numerically validated with several graph continual learning benchmark problems in both task-incremental learning and class-incremental learning settings. Compared to recently published works, our approach demonstrates improved performance in both the settings. The implementation code can be found at \url{https://github.com/codexhammer/gcl}.
△ Less
Submitted 4 September, 2022;
originally announced September 2022.
-
Autonomous Cross Domain Adaptation under Extreme Label Scarcity
Authors:
Weiwei Weng,
Mahardhika Pratama,
Choiru Za'in,
Marcus De Carvalho,
Rakaraddi Appan,
Andri Ashfahani,
Edward Yapp Kien Yee
Abstract:
A cross domain multistream classification is a challenging problem calling for fast domain adaptations to handle different but related streams in never-ending and rapidly changing environments. Notwithstanding that existing multistream classifiers assume no labelled samples in the target stream, they still incur expensive labelling cost since they require fully labelled samples of the source strea…
▽ More
A cross domain multistream classification is a challenging problem calling for fast domain adaptations to handle different but related streams in never-ending and rapidly changing environments. Notwithstanding that existing multistream classifiers assume no labelled samples in the target stream, they still incur expensive labelling cost since they require fully labelled samples of the source stream. This paper aims to attack the problem of extreme label shortage in the cross domain multistream classification problems where only very few labelled samples of the source stream are provided before process runs. Our solution, namely Learning Streaming Process from Partial Ground Truth (LEOPARD), is built upon a flexible deep clustering network where its hidden nodes, layers and clusters are added and removed dynamically in respect to varying data distributions. A deep clustering strategy is underpinned by a simultaneous feature learning and clustering technique leading to clustering-friendly latent spaces. A domain adaptation strategy relies on the adversarial domain adaptation technique where a feature extractor is trained to fool a domain classifier classifying source and target streams. Our numerical study demonstrates the efficacy of LEOPARD where it delivers improved performances compared to prominent algorithms in 15 of 24 cases. Source codes of LEOPARD are shared in \url{https://github.com/wengweng001/LEOPARD.git} to enable further study.
△ Less
Submitted 4 September, 2022;
originally announced September 2022.
-
ACDC: Online Unsupervised Cross-Domain Adaptation
Authors:
Marcus de Carvalho,
Mahardhika Pratama,
Jie Zhang,
Edward Yapp
Abstract:
We consider the problem of online unsupervised cross-domain adaptation, where two independent but related data streams with different feature spaces -- a fully labeled source stream and an unlabeled target stream -- are learned together. Unique characteristics and challenges such as covariate shift, asynchronous concept drifts, and contrasting data throughput arises. We propose ACDC, an adversaria…
▽ More
We consider the problem of online unsupervised cross-domain adaptation, where two independent but related data streams with different feature spaces -- a fully labeled source stream and an unlabeled target stream -- are learned together. Unique characteristics and challenges such as covariate shift, asynchronous concept drifts, and contrasting data throughput arises. We propose ACDC, an adversarial unsupervised domain adaptation framework that handles multiple data streams with a complete self-evolving neural network structure that reacts to these defiances. ACDC encapsulates three modules into a single model: A denoising autoencoder that extracts features, an adversarial module that performs domain conversion, and an estimator that learns the source stream and predicts the target stream. ACDC is a flexible and expandable framework with little hyper-parameter tunability. Our experimental results under the prequential test-then-train protocol indicate an improvement in target accuracy over the baseline methods, achieving more than a 10\% increase in some cases.
△ Less
Submitted 4 October, 2021;
originally announced October 2021.
-
Arc Flow Formulations Based on Dynamic Programming: Theoretical Foundations and Applications
Authors:
Vinícius L. de Lima,
Cláudio Alves,
François Clautiaux,
Manuel Iori,
José M. Valério de Carvalho
Abstract:
Network flow formulations are among the most successful tools to solve optimization problems. Such formulations correspond to determining an optimal flow in a network. One particular class of network flow formulations is the arc flow, where variables represent flows on individual arcs of the network. For $\mathcal{NP}$-hard problems, polynomial-sized arc flow models typically provide weak linear r…
▽ More
Network flow formulations are among the most successful tools to solve optimization problems. Such formulations correspond to determining an optimal flow in a network. One particular class of network flow formulations is the arc flow, where variables represent flows on individual arcs of the network. For $\mathcal{NP}$-hard problems, polynomial-sized arc flow models typically provide weak linear relaxations and may have too much symmetry to be efficient in practice. Instead, arc flow models with a pseudo-polynomial size usually provide strong relaxations and are efficient in practice. The interest in pseudo-polynomial arc flow formulations has grown considerably in the last twenty years, in which they have been used to solve many open instances of hard problems. A remarkable advantage of pseudo-polynomial arc flow models is the possibility to solve practical-sized instances directly by a Mixed Integer Linear Programming solver, avoiding the implementation of complex methods based on column generation.
In this survey, we present theoretical foundations of pseudo-polynomial arc flow formulations, by showing a relation between their network and Dynamic Programming (DP). This relation allows a better understanding of the strength of these formulations, through a link with models obtained by Dantzig-Wolfe decomposition. The relation with DP also allows a new perspective to relate state-space relaxation methods for DP with arc flow models. We also present a dual point of view to contrast the linear relaxation of arc flow models with that of models based on paths and cycles. To conclude, we review the main solution methods and applications of arc flow models based on DP in several domains such as cutting, packing, scheduling, and routing.
△ Less
Submitted 15 April, 2021; v1 submitted 1 October, 2020;
originally announced October 2020.
-
ATL: Autonomous Knowledge Transfer from Many Streaming Processes
Authors:
Mahardhika Pratama,
Marcus de Carvalho,
Renchunzi Xie,
Edwin Lughofer,
Jie Lu
Abstract:
Transferring knowledge across many streaming processes remains an uncharted territory in the existing literature and features unique characteristics: no labelled instance of the target domain, covariate shift of source and target domain, different period of drifts in the source and target domains. Autonomous transfer learning (ATL) is proposed in this paper as a flexible deep learning approach for…
▽ More
Transferring knowledge across many streaming processes remains an uncharted territory in the existing literature and features unique characteristics: no labelled instance of the target domain, covariate shift of source and target domain, different period of drifts in the source and target domains. Autonomous transfer learning (ATL) is proposed in this paper as a flexible deep learning approach for the online unsupervised transfer learning problem across many streaming processes. ATL offers an online domain adaptation strategy via the generative and discriminative phases coupled with the KL divergence based optimization strategy to produce a domain invariant network while putting forward an elastic network structure. It automatically evolves its network structure from scratch with/without the presence of ground truth to overcome independent concept drifts in the source and target domain. The rigorous numerical evaluation has been conducted along with a comparison against recently published works. ATL demonstrates improved performance while showing significantly faster training speed than its counterparts.
△ Less
Submitted 19 October, 2019; v1 submitted 8 October, 2019;
originally announced October 2019.
-
Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors
Authors:
Telma Pereira,
Sofia Pires,
Marta Gromicho,
Susana Pinto,
Mamede de Carvalho,
Sara C. Madeira
Abstract:
Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by a rapid motor decline, leading to respiratory failure and subsequently to death. In this context, researchers have sought for models to automatically predict disease progression to assisted ventilation in ALS patients. However, the clinical translation of such models is limited by the lack of insight 1) on the risk…
▽ More
Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by a rapid motor decline, leading to respiratory failure and subsequently to death. In this context, researchers have sought for models to automatically predict disease progression to assisted ventilation in ALS patients. However, the clinical translation of such models is limited by the lack of insight 1) on the risk of error for predictions at patient-level, and 2) on the most adequate time to administer the non-invasive ventilation. To address these issues, we combine Conformal Prediction (a machine learning framework that complements predictions with confidence measures) and a mixture experts into a prognostic model which not only predicts whether an ALS patient will suffer from respiratory insufficiency but also the most likely time window of occurrence, at a given reliability level. Promising results were obtained, with near 80% of predictions being correctly identified.
△ Less
Submitted 30 July, 2019;
originally announced July 2019.
-
Log Skeletons: A Classification Approach to Process Discovery
Authors:
H. M. W. Verbeek,
R. Medeiros de Carvalho
Abstract:
To test the effectiveness of process discovery algorithms, a Process Discovery Contest (PDC) has been set up. This PDC uses a classification approach to measure this effectiveness: The better the discovered model can classify whether or not a new trace conforms to the event log, the better the discovery algorithm is supposed to be. Unfortunately, even the state-of-the-art fully-automated discovery…
▽ More
To test the effectiveness of process discovery algorithms, a Process Discovery Contest (PDC) has been set up. This PDC uses a classification approach to measure this effectiveness: The better the discovered model can classify whether or not a new trace conforms to the event log, the better the discovery algorithm is supposed to be. Unfortunately, even the state-of-the-art fully-automated discovery algorithms score poorly on this classification. Even the best of these algorithms, the Inductive Miner, scored only 147 correct classified traces out of 200 traces on the PDC of 2017. This paper introduces the rule-based log skeleton model, which is closely related to the Declare constraint model, together with a way to classify traces using this model. This classification using log skeletons is shown to score better on the PDC of 2017 than state-of-the-art discovery algorithms: 194 out of 200. As a result, one can argue that the fully-automated algorithm to construct (or: discover) a log skeleton from an event log outperforms existing state-of-the-art fully-automated discovery algorithms.
△ Less
Submitted 21 June, 2018;
originally announced June 2018.
-
Fourier Analysis and Holographic Representations of 1D and 2D Signals
Authors:
G. A. Giraldi,
B. F. Moutinho,
D. M. L. de Carvalho,
J. C. de Oliveira
Abstract:
In this paper, we focus on Fourier analysis and holographic transforms for signal representation. For instance, in the case of image processing, the holographic representation has the property that an arbitrary portion of the transformed image enables reconstruction of the whole image with details missing. We focus on holographic representation defined through the Fourier Transforms. Thus, We fi…
▽ More
In this paper, we focus on Fourier analysis and holographic transforms for signal representation. For instance, in the case of image processing, the holographic representation has the property that an arbitrary portion of the transformed image enables reconstruction of the whole image with details missing. We focus on holographic representation defined through the Fourier Transforms. Thus, We firstly review some results in Fourier transform and Fourier series. Next, we review the Discrete Holographic Fourier Transform (DHFT) for image representation. Then, we describe the contributions of our work. We show a simple scheme for progressive transmission based on the DHFT. Next, we propose the Continuous Holographic Fourier Transform (CHFT) and discuss some theoretical aspects of it for 1D signals. Finally, some testes are presented in the experimental results
△ Less
Submitted 3 April, 2006; v1 submitted 29 March, 2006;
originally announced March 2006.