Search | arXiv e-print repository

Graph Mining under Data scarcity

Authors: Appan Rakaraddi, Lam Siew-Kei, Mahardhika Pratama, Marcus de Carvalho

Abstract: A multitude of deep learning models have been proposed for node classification in graphs. However, they tend to perform poorly under labeled-data scarcity. Although Few-shot learning for graphs has been introduced to overcome this problem, the existing models are not easily adaptable for generic graph learning frameworks like Graph Neural Networks (GNNs). Our work proposes an Uncertainty Estimator framework that can be applied on top of any generic GNN backbone network (which are typically designed for supervised/semi-supervised node classification) to improve the node classification performance. A neural network is used to model the Uncertainty Estimator as a probability distribution rather than probabilistic discrete scalar values. We train these models under the classic episodic learning paradigm in the $n$-way, $k$-shot fashion, in an end-to-end setting. Our work demonstrates that implementation of the uncertainty estimator on a GNN backbone network improves the classification accuracy under the Few-shot setting without any meta-learning specific architecture. We conduct experiments on multiple datasets under different Few-shot settings and different GNN-based backbone networks. Our method outperforms the baselines, which demonstrates the efficacy of the Uncertainty Estimator for Few-shot node classification on graphs with a GNN.

Submitted 11 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

Comments: 7 pages, 2 figures
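
Illustration (not from the paper): the abstract above refers to episodic training in the $n$-way, $k$-shot fashion. A minimal sketch of how one such episode could be drawn from a set of labeled nodes is given below. The function name sample_episode, the q_query parameter, and the dictionary-based label store are hypothetical choices made for this example; they are not the authors' implementation.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, q_query, rng=None):
    """Draw one n-way, k-shot episode from a {node_id: class_label} mapping.

    Returns (support, query), each a list of (node_id, episode_label) pairs,
    where episode_label is the class index local to this episode.
    All names here are hypothetical and used for illustration only.
    """
    rng = rng or random.Random()

    # Group node ids by their class label.
    by_class = defaultdict(list)
    for node, label in labels.items():
        by_class[label].append(node)

    # Only classes with enough labeled nodes can take part in an episode.
    eligible = [c for c, nodes in by_class.items() if len(nodes) >= k_shot + q_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + q_query labeled nodes")

    # Pick n classes, then k support and q query nodes per class without overlap.
    classes = rng.sample(sorted(eligible), n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        picked = rng.sample(by_class[c], k_shot + q_query)
        support += [(n, episode_label) for n in picked[:k_shot]]
        query += [(n, episode_label) for n in picked[k_shot:]]
    return support, query
```

In such a setup, a model would be fitted on the support nodes of each episode and scored on the query nodes; the sketch covers the sampling step only.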

arXiv:2404.08480

Decoding AI: The inside story of data analysis in ChatGPT

Authors: Ozan Evkaya, Miguel de Carvalho

Abstract: As a result of recent advancements in generative AI, the field of Data Science is prone to various changes. This review critically examines the Data Analysis (DA) capabilities of ChatGPT, assessing its performance across a wide range of tasks. While DA provides researchers and practitioners with unprecedented analytical capabilities, it is far from being perfect, and it is important to recognize and address its limitations.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 15 pages with figures and appendix

arXiv:2402.12490

Towards Cross-Domain Continual Learning

Authors: Marcus de Carvalho, Mahardhika Pratama, Jie Zhang, Chua Haoyan, Edward Yapp

Abstract: Continual learning is a process that involves training learning agents to sequentially master a stream of tasks or classes without revisiting past data. The challenge lies in leveraging previously acquired knowledge to learn new tasks efficiently, while avoiding catastrophic forgetting. Existing methods primarily focus on single domains, restricting their applicability to specific problems. In this work, we introduce a novel approach called Cross-Domain Continual Learning (CDCL) that addresses the limitation of being restricted to single supervised domains. Our method combines inter- and intra-task cross-attention mechanisms within a compact convolutional network. This integration enables the model to maintain alignment with features from previous tasks, thereby delaying the data drift that may occur between tasks, while performing unsupervised cross-domain adaptation (UDA) between related domains. By leveraging an intra-task-specific pseudo-labeling method, we ensure accurate input pairs for both labeled and unlabeled samples, enhancing the learning process. To validate our approach, we conduct extensive experiments on public UDA datasets, showcasing its positive performance on cross-domain continual learning challenges. Additionally, our work introduces incremental ideas that contribute to the advancement of this field. We make our code and models available to encourage further exploration and reproduction of our results: https://github.com/Ivsucram/CDCL

Submitted 19 February, 2024; originally announced February 2024.

Comments: 12 pages, 2 Figures, 4 Tables. To be published at the IEEE International Conference on Data Engineering (ICDE) 2024

arXiv:2212.04009

A parallelizable model-based approach for marginal and multivariate clustering

Authors: Miguel de Carvalho, Gabriel Martos Venturini, Andrej Svetlošák

Abstract: This paper develops a clustering method that takes advantage of the sturdiness of model-based clustering, while attempting to mitigate some of its pitfalls. First, we note that standard model-based clustering likely leads to the same number of clusters per margin, which seems a rather artificial assumption for a variety of datasets. We tackle this issue by specifying a finite mixture model per margin that allows each margin to have a different number of clusters, and then cluster the multivariate data using a strategy game-inspired algorithm that we call Reign-and-Conquer. Second, since the proposed clustering approach only specifies a model for the margins -- but leaves the joint unspecified -- it has the advantage of being partially parallelizable; hence, the proposed approach is computationally appealing as well as more tractable for moderate to high dimensions than a 'full' (joint) model-based clustering approach. A battery of numerical experiments on artificial data indicates an overall good performance of the proposed methods in a variety of scenarios, and real datasets are used to showcase their application in practice.

Submitted 7 December, 2022; originally announced December 2022.

arXiv:2209.02112

Class-Incremental Learning via Knowledge Amalgamation

Authors: Marcus de Carvalho, Mahardhika Pratama, Jie Zhang, Yajuan San

Abstract: Catastrophic forgetting has been a significant problem hindering the deployment of deep learning algorithms in the continual learning setting. Numerous methods have been proposed to address the catastrophic forgetting problem where an agent loses its generalization power on old tasks while learning new tasks. We put forward an alternative strategy to handle catastrophic forgetting with knowledge amalgamation (CFA), which learns a student network from multiple heterogeneous teacher models specializing in previous tasks and can be applied to current offline methods. The knowledge amalgamation process is carried out in a single-head manner with only a selected number of memorized samples and no annotations. The teachers and students do not need to share the same network structure, allowing heterogeneous tasks to be adapted to a compact or sparse data representation. We compare our method with competitive baselines from different strategies, demonstrating our approach's advantages.

Submitted 5 September, 2022; originally announced September 2022.

Comments: Paper accepted at ECML PKDD 2022

arXiv:2209.01556

Reinforced Continual Learning for Graphs

Authors: Appan Rakaraddi, Siew Kei Lam, Mahardhika Pratama, Marcus De Carvalho

Abstract: Graph Neural Networks (GNNs) have become the backbone for a myriad of tasks pertaining to graphs and similar topological data structures. While many works have been established in domains related to node and graph classification/regression tasks, they mostly deal with a single task. Continual learning on graphs is largely unexplored and existing graph continual learning approaches are limited to the task-incremental learning scenarios. This paper proposes a graph continual learning strategy that combines the architecture-based and memory-based approaches. The structural learning strategy is driven by reinforcement learning, where a controller network is trained to determine an optimal number of nodes to be added to or pruned from the base network when new tasks are observed, thus assuring sufficient network capacity. The parameter learning strategy is underpinned by the Dark Experience Replay method to cope with the catastrophic forgetting problem. Our approach is numerically validated with several graph continual learning benchmark problems in both task-incremental learning and class-incremental learning settings. Compared to recently published works, our approach demonstrates improved performance in both settings. The implementation code can be found at https://github.com/codexhammer/gcl.

Submitted 4 September, 2022; originally announced September 2022.

Comments: Accepted for publication as a long paper at the 31st ACM International Conference on Information and Knowledge Management (CIKM 22)

arXiv:2209.01548

doi 10.1109/TNNLS.2022.3183356

Autonomous Cross Domain Adaptation under Extreme Label Scarcity

Authors: Weiwei Weng, Mahardhika Pratama, Choiru Za'in, Marcus De Carvalho, Rakaraddi Appan, Andri Ashfahani, Edward Yapp Kien Yee

Abstract: Cross-domain multistream classification is a challenging problem calling for fast domain adaptations to handle different but related streams in never-ending and rapidly changing environments. Notwithstanding that existing multistream classifiers assume no labelled samples in the target stream, they still incur expensive labelling cost since they require fully labelled samples of the source stream. This paper aims to attack the problem of extreme label shortage in cross-domain multistream classification problems where only very few labelled samples of the source stream are provided before the process runs. Our solution, namely Learning Streaming Process from Partial Ground Truth (LEOPARD), is built upon a flexible deep clustering network where its hidden nodes, layers and clusters are added and removed dynamically with respect to varying data distributions. The deep clustering strategy is underpinned by a simultaneous feature learning and clustering technique leading to clustering-friendly latent spaces. The domain adaptation strategy relies on the adversarial domain adaptation technique where a feature extractor is trained to fool a domain classifier classifying source and target streams. Our numerical study demonstrates the efficacy of LEOPARD where it delivers improved performance compared to prominent algorithms in 15 of 24 cases. The source code of LEOPARD is shared at https://github.com/wengweng001/LEOPARD.git to enable further study.

Submitted 4 September, 2022; originally announced September 2022.

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2022

arXiv:2110.01326

ACDC: Online Unsupervised Cross-Domain Adaptation

Authors: Marcus de Carvalho, Mahardhika Pratama, Jie Zhang, Edward Yapp

Abstract: We consider the problem of online unsupervised cross-domain adaptation, where two independent but related data streams with different feature spaces -- a fully labeled source stream and an unlabeled target stream -- are learned together. Unique characteristics and challenges such as covariate shift, asynchronous concept drifts, and contrasting data throughput arise. We propose ACDC, an adversarial unsupervised domain adaptation framework that handles multiple data streams with a complete self-evolving neural network structure that reacts to these challenges. ACDC encapsulates three modules into a single model: a denoising autoencoder that extracts features, an adversarial module that performs domain conversion, and an estimator that learns the source stream and predicts the target stream. ACDC is a flexible and expandable framework with little hyper-parameter tunability. Our experimental results under the prequential test-then-train protocol indicate an improvement in target accuracy over the baseline methods, achieving more than a 10% increase in some cases.

Submitted 4 October, 2021; originally announced October 2021.

arXiv:2010.00558

doi 10.1016/j.ejor.2021.04.024

Arc Flow Formulations Based on Dynamic Programming: Theoretical Foundations and Applications

Authors: Vinícius L. de Lima, Cláudio Alves, François Clautiaux, Manuel Iori, José M. Valério de Carvalho

Abstract: Network flow formulations are among the most successful tools to solve optimization problems. Such formulations correspond to determining an optimal flow in a network. One particular class of network flow formulations is the arc flow, where variables represent flows on individual arcs of the network. For $\mathcal{NP}$-hard problems, polynomial-sized arc flow models typically provide weak linear relaxations and may have too much symmetry to be efficient in practice. Instead, arc flow models with a pseudo-polynomial size usually provide strong relaxations and are efficient in practice. The interest in pseudo-polynomial arc flow formulations has grown considerably in the last twenty years, in which they have been used to solve many open instances of hard problems. A remarkable advantage of pseudo-polynomial arc flow models is the possibility to solve practical-sized instances directly by a Mixed Integer Linear Programming solver, avoiding the implementation of complex methods based on column generation. In this survey, we present theoretical foundations of pseudo-polynomial arc flow formulations, by showing a relation between their network and Dynamic Programming (DP). This relation allows a better understanding of the strength of these formulations, through a link with models obtained by Dantzig-Wolfe decomposition. The relation with DP also allows a new perspective to relate state-space relaxation methods for DP with arc flow models. We also present a dual point of view to contrast the linear relaxation of arc flow models with that of models based on paths and cycles. To conclude, we review the main solution methods and applications of arc flow models based on DP in several domains such as cutting, packing, scheduling, and routing.

Submitted 15 April, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:1910.03434

ATL: Autonomous Knowledge Transfer from Many Streaming Processes

Authors: Mahardhika Pratama, Marcus de Carvalho, Renchunzi Xie, Edwin Lughofer, Jie Lu

Abstract: Transferring knowledge across many streaming processes remains uncharted territory in the existing literature and features unique characteristics: no labelled instances of the target domain, covariate shift between the source and target domains, and different periods of drift in the source and target domains. Autonomous transfer learning (ATL) is proposed in this paper as a flexible deep learning approach for the online unsupervised transfer learning problem across many streaming processes. ATL offers an online domain adaptation strategy via the generative and discriminative phases coupled with the KL divergence based optimization strategy to produce a domain invariant network while putting forward an elastic network structure. It automatically evolves its network structure from scratch with/without the presence of ground truth to overcome independent concept drifts in the source and target domain. A rigorous numerical evaluation has been conducted along with a comparison against recently published works. ATL demonstrates improved performance while showing significantly faster training speed than its counterparts.

Submitted 19 October, 2019; v1 submitted 8 October, 2019; originally announced October 2019.

Comments: This paper has been accepted for publication in CIKM 2019

arXiv:1907.13070

Predicting assisted ventilation in Amyotrophic Lateral Sclerosis using a mixture of experts and conformal predictors

Authors: Telma Pereira, Sofia Pires, Marta Gromicho, Susana Pinto, Mamede de Carvalho, Sara C. Madeira

Abstract: Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease characterized by a rapid motor decline, leading to respiratory failure and subsequently to death. In this context, researchers have sought models to automatically predict disease progression to assisted ventilation in ALS patients. However, the clinical translation of such models is limited by the lack of insight 1) on the risk of error for predictions at patient-level, and 2) on the most adequate time to administer the non-invasive ventilation. To address these issues, we combine Conformal Prediction (a machine learning framework that complements predictions with confidence measures) and a mixture of experts into a prognostic model which not only predicts whether an ALS patient will suffer from respiratory insufficiency but also the most likely time window of occurrence, at a given reliability level. Promising results were obtained, with nearly 80% of predictions being correctly identified.

Submitted 30 July, 2019; originally announced July 2019.

Journal ref: KDD 2019 Workshop on Applied Data Science for Healthcare
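
Illustration (not from the paper): the abstract above describes Conformal Prediction as a framework that complements predictions with confidence measures. The sketch below shows one common variant, split (inductive) conformal classification, under the assumption that a held-out calibration set of nonconformity scores is available. Which variant the authors used is not stated in the abstract, and the function names conformal_threshold and prediction_set, as well as the score 1 - probability, are hypothetical choices for this example.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Threshold on nonconformity scores targeting coverage of about 1 - alpha.

    cal_scores: nonconformity scores of the true labels on a calibration set
    (assumed exchangeable with future examples). Returns the
    ceil((n + 1) * (1 - alpha))-th smallest score, or infinity when the
    calibration set is too small for the requested level.
    """
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(cal_scores)[rank - 1]


def prediction_set(class_probs, threshold):
    """Keep every label whose nonconformity (1 - probability) is within the threshold."""
    return {label for label, p in class_probs.items() if 1.0 - p <= threshold}
```

A prediction set with a single label signals a confident prediction at the chosen level, while a larger set signals ambiguity; with an infinite threshold every label is kept.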

arXiv:1806.08247

Log Skeletons: A Classification Approach to Process Discovery

Authors: H. M. W. Verbeek, R. Medeiros de Carvalho

Abstract: To test the effectiveness of process discovery algorithms, a Process Discovery Contest (PDC) has been set up. This PDC uses a classification approach to measure this effectiveness: The better the discovered model can classify whether or not a new trace conforms to the event log, the better the discovery algorithm is supposed to be. Unfortunately, even the state-of-the-art fully-automated discovery algorithms score poorly on this classification. Even the best of these algorithms, the Inductive Miner, scored only 147 correctly classified traces out of 200 traces on the PDC of 2017. This paper introduces the rule-based log skeleton model, which is closely related to the Declare constraint model, together with a way to classify traces using this model. This classification using log skeletons is shown to score better on the PDC of 2017 than state-of-the-art discovery algorithms: 194 out of 200. As a result, one can argue that the fully-automated algorithm to construct (or: discover) a log skeleton from an event log outperforms existing state-of-the-art fully-automated discovery algorithms.

Submitted 21 June, 2018; originally announced June 2018.

Comments: 16 pages with 9 figures, followed by an appendix of 14 pages with 17 figures

MSC Class: 62H30; 93C65 ACM Class: I.5.3; H.3.3; J.1

arXiv:cs/0603116

Fourier Analysis and Holographic Representations of 1D and 2D Signals

Authors: G. A. Giraldi, B. F. Moutinho, D. M. L. de Carvalho, J. C. de Oliveira

Abstract: In this paper, we focus on Fourier analysis and holographic transforms for signal representation. For instance, in the case of image processing, the holographic representation has the property that an arbitrary portion of the transformed image enables reconstruction of the whole image with details missing. We focus on holographic representation defined through the Fourier Transforms. Thus, we first review some results in Fourier transform and Fourier series. Next, we review the Discrete Holographic Fourier Transform (DHFT) for image representation. Then, we describe the contributions of our work. We show a simple scheme for progressive transmission based on the DHFT. Next, we propose the Continuous Holographic Fourier Transform (CHFT) and discuss some theoretical aspects of it for 1D signals. Finally, some tests are presented in the experimental results.

Submitted 3 April, 2006; v1 submitted 29 March, 2006; originally announced March 2006.

Comments: 13 pages

ACM Class: I.4.10

Showing 1–13 of 13 results for author: De Carvalho, M