Search | arXiv e-print repository

arXiv:2310.09991 [pdf, other]

Applications of Machine Learning in Biopharmaceutical Process Development and Manufacturing: Current Trends, Challenges, and Opportunities

Authors: Thanh Tung Khuat, Robert Bassett, Ellen Otte, Alistair Grevis-James, Bogdan Gabrys

Abstract: While machine learning (ML) has made significant contributions to the biopharmaceutical field, its applications are still in the early stages in terms of providing direct support for quality-by-design based development and manufacturing of biopharmaceuticals, hindering the enormous potential for bioprocesses automation from their development to manufacturing. However, the adoption of ML-based models instead of conventional multivariate data analysis methods is significantly increasing due to the accumulation of large-scale production data. This trend is primarily driven by the real-time monitoring of process variables and quality attributes of biopharmaceutical products through the implementation of advanced process analytical technologies. Given the complexity and multidimensionality of a bioproduct design, bioprocess development, and product manufacturing data, ML-based approaches are increasingly being employed to achieve accurate, flexible, and high-performing predictive models to address the problems of analytics, monitoring, and control within the biopharma field. This paper aims to provide a comprehensive review of the current applications of ML solutions in a bioproduct design, monitoring, control, and optimisation of upstream, downstream, and product formulation processes. Finally, this paper thoroughly discusses the main challenges related to the bioprocesses themselves, process data, and the use of machine learning models in biopharmaceutical process development and manufacturing. Moreover, it offers further insights into the adoption of innovative machine learning methods and novel trends in the development of new digital biopharma solutions.

Submitted 15 October, 2023; originally announced October 2023.

Comments: 155 pages

ACM Class: A.1; J.3; I.2.0; I.2.6; I.2.m; I.5.4

arXiv:2310.09987 [pdf, other]

Network Disruption via Continuous Batch Removal: The Case of Sicilian Mafia

Authors: Mingshan Jia, Pasquale De Meo, Bogdan Gabrys, Katarzyna Musial

Abstract: Network disruption is pivotal in understanding the robustness and vulnerability of complex networks, which is instrumental in devising strategies for infrastructure protection, epidemic control, cybersecurity, and combating crime. In this paper, with a particular focus on disrupting criminal networks, we proposed to impose a within-the-largest-connected-component constraint in a continuous batch removal disruption process. Through a series of experiments on a recently released Sicilian Mafia network, we revealed that the constraint would enhance degree-based methods while weakening betweenness-based approaches. Moreover, based on the findings from the experiments using various disruption strategies, we propose a structurally-filtered greedy disruption strategy that integrates the effectiveness of greedy-like methods with the efficiency of structural-metric-based approaches. The proposed strategy significantly outperforms the longstanding state-of-the-art method of betweenness centrality while maintaining the same time complexity.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2309.13229 [pdf, other]

Heterogeneous Feature Representation for Digital Twin-Oriented Complex Networked Systems

Authors: Jiaqi Wen, Bogdan Gabrys, Katarzyna Musial

Abstract: Building models of Complex Networked Systems (CNS) that can accurately represent reality forms an important research area. To be able to reflect real world systems, the modelling needs to consider not only the intensity of interactions between the entities but also features of all the elements of the system. This study aims to improve the expressive power of node features in Digital Twin-Oriented Complex Networked Systems (DT-CNSs) with heterogeneous feature representation principles. This involves representing features with crisp feature values and fuzzy sets, each describing the objective and the subjective inductions of the nodes' features and feature differences. Our empirical analysis builds DT-CNSs to recreate realistic physical contact networks in different countries from real node feature distributions based on various representation principles and an optimised feature preference. We also investigate their respective disaster resilience to an epidemic outbreak starting from the most popular node. The results suggest that the increasing flexibility of feature representation with fuzzy sets improves the expressive power and enables more accurate modelling. In addition, the heterogeneous features influence the network structure and the speed of the epidemic outbreak, requiring various mitigation policies targeted at different people.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2308.11034 [pdf, other]

Digital Twin-Oriented Complex Networked Systems based on Heterogeneous Node Features and Interaction Rules

Authors: Jiaqi Wen, Bogdan Gabrys, Katarzyna Musial

Abstract: This study proposes an extendable modelling framework for Digital Twin-Oriented Complex Networked Systems (DT-CNSs) with a goal of generating networks that faithfully represent real systems. Modelling process focuses on (i) features of nodes and (ii) interaction rules for creating connections that are built based on individual node's preferences. We conduct experiments on simulation-based DT-CNSs that incorporate various features and rules about network growth and different transmissibilities related to an epidemic spread on these networks. We present a case study on disaster resilience of social networks given an epidemic outbreak by investigating the infection occurrence within specific time and social distance. The experimental results show how different levels of the structural and dynamics complexities, concerned with feature diversity and flexibility of interaction rules respectively, influence network growth and epidemic spread. The analysis revealed that, to achieve maximum disaster resilience, mitigation policies should be targeted at nodes with preferred features as they have higher infection risks and should be the focus of the epidemic control.

Submitted 22 September, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

arXiv:2306.15683 [pdf, other]

doi 10.1088/2632-2153/ad0d11

Harnessing Data Augmentation to Quantify Uncertainty in the Early Estimation of Single-Photon Source Quality

Authors: David Jacob Kedziora, Anna Musiał, Wojciech Rudno-Rudziński, Bogdan Gabrys

Abstract: Novel methods for rapidly estimating single-photon source (SPS) quality have been promoted in recent literature to address the expensive and time-consuming nature of experimental validation via intensity interferometry. However, the frequent lack of uncertainty discussions and reproducible details raises concerns about their reliability. This study investigates the use of data augmentation, a machine learning technique, to supplement experimental data with bootstrapped samples and quantify the uncertainty of such estimates. Eight datasets obtained from measurements involving a single InGaAs/GaAs epitaxial quantum dot serve as a proof-of-principle example. Analysis of one of the SPS quality metrics derived from efficient histogram fitting of the synthetic samples, i.e. the probability of multi-photon emission events, reveals significant uncertainty contributed by stochastic variability in the Poisson processes that describe detection rates. Ignoring this source of error risks severe overconfidence in both early quality estimates and claims for state-of-the-art SPS devices. Additionally, this study finds that standard least-squares fitting is comparable to using a Poisson likelihood, and expanding averages show some promise for early estimation. Also, reducing background counts improves fitting accuracy but does not address the Poisson-process variability. Ultimately, data augmentation demonstrates its value in supplementing physical experiments; its benefit here is to emphasise the need for a cautious assessment of SPS quality.

Submitted 9 January, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: Updated title and content

Journal ref: Mach. Learn.: Sci. Technol. 4 045042 (2023)

arXiv:2305.03234 [pdf, other]

Review and Assessment of Digital Twin-Oriented Social Network Simulators

Authors: Jiaqi Wen, Bogdan Gabrys, Katarzyna Musial

Abstract: The ability to faithfully represent real social networks is critical from the perspective of testing various what-if scenarios which are not feasible to be implemented in a real system as the system's state would be irreversibly changed. High fidelity simulators allow one to investigate the consequences of different actions before introducing them to the real system. For example, in the context of social systems, an accurate social network simulator can be a powerful tool used to guide policy makers, help companies plan their advertising campaigns or authorities to analyse fake news spread. In this study we explore different Social Network Simulators (SNSs) and assess to what extent they are able to mimic the real social networks. We conduct a critical review and assessment of existing Social Network Simulators under the Digital Twin-Oriented Modelling framework proposed in our previous study. We subsequently extend one of the most promising simulators from the evaluated ones, to facilitate generation of social networks of varied structural complexity levels. This extension brings us one step closer to a Digital Twin Oriented SNS (DT Oriented SNS). We also propose an approach to assess the similarity between real and simulated networks with the composite performance indexes based on both global and local structural measures, while taking runtime of the simulator as an indicator of its efficiency. We illustrate various characteristics of the proposed DT Oriented SNS using a well known Karate Club network as an example. While not considered to be of sufficient complexity, the simulator is intended as one of the first steps on a journey towards building a Digital Twin of a social network that perfectly mimics the reality.

Submitted 4 May, 2023; originally announced May 2023.

arXiv:2301.04824 [pdf, other]

doi 10.1109/ACCESS.2023.3268797

A Network Science perspective of Graph Convolutional Networks: A survey

Authors: Mingshan Jia, Bogdan Gabrys, Katarzyna Musial

Abstract: The mining and exploitation of graph structural information have been the focal points in the study of complex networks. Traditional structural measures in Network Science focus on the analysis and modelling of complex networks from the perspective of network structure, such as the centrality measures, the clustering coefficient, and motifs and graphlets, and they have become basic tools for studying and understanding graphs. In comparison, graph neural networks, especially graph convolutional networks (GCNs), are particularly effective at integrating node features into graph structures via neighbourhood aggregation and message passing, and have been shown to significantly improve the performances in a variety of learning tasks. These two classes of methods are, however, typically treated separately with limited references to each other. In this work, aiming to establish relationships between them, we provide a network science perspective of GCNs. Our novel taxonomy classifies GCNs from three structural information angles, i.e., the layer-wise message aggregation scope, the message content, and the overall learning scope. Moreover, as a prerequisite for reviewing GCNs via a network science perspective, we also summarise traditional structural measures and propose a new taxonomy for them. Finally and most importantly, we draw connections between traditional structural approaches and graph convolutional networks, and discuss potential directions for future research.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2211.04148 [pdf, other]

The Technological Emergence of AutoML: A Survey of Performant Software and Applications in the Context of Industry

Authors: Alexander Scriven, David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys

Abstract: With most technical fields, there exists a delay between fundamental academic research and practical industrial uptake. Whilst some sciences have robust and well-established processes for commercialisation, such as the pharmaceutical practice of regimented drug trials, other fields face transitory periods in which fundamental academic advancements diffuse gradually into the space of commerce and industry. For the still relatively young field of Automated/Autonomous Machine Learning (AutoML/AutonoML), that transitory period is under way, spurred on by a burgeoning interest from broader society. Yet, to date, little research has been undertaken to assess the current state of this dissemination and its uptake. Thus, this review makes two primary contributions to knowledge around this topic. Firstly, it provides the most up-to-date and comprehensive survey of existing AutoML tools, both open-source and commercial. Secondly, it motivates and outlines a framework for assessing whether an AutoML solution designed for real-world application is 'performant'; this framework extends beyond the limitations of typical academic criteria, considering a variety of stakeholder needs and the human-computer interactions required to service them. Thus, additionally supported by an extensive assessment and comparison of academic and commercial case-studies, this review evaluates mainstream engagement with AutoML in the early 2020s, identifying obstacles and opportunities for accelerating future uptake.

Submitted 8 November, 2022; originally announced November 2022.

arXiv:2210.02704 [pdf, other]

hyperbox-brain: A Toolbox for Hyperbox-based Machine Learning Algorithms

Authors: Thanh Tung Khuat, Bogdan Gabrys

Abstract: Hyperbox-based machine learning algorithms are an important and popular branch of machine learning in the construction of classifiers using fuzzy sets and logic theory and neural network architectures. This type of learning is characterised by many strong points of modern predictors such as a high scalability, explainability, online adaptation, effective learning from a small amount of data, native ability to deal with missing data and accommodating new classes. Nevertheless, there is no comprehensive existing package for hyperbox-based machine learning which can serve as a benchmark for research and allow non-expert users to apply these algorithms easily. hyperbox-brain is an open-source Python library implementing the leading hyperbox-based machine learning algorithms. This library exposes a unified API which closely follows and is compatible with the renowned scikit-learn and numpy toolboxes. The library may be installed from Python Package Index (PyPI) and the conda package manager and is distributed under the GPL-3 license. The source code, documentation, detailed tutorials, and the full descriptions of the API are available at https://uts-caslab.github.io/hyperbox-brain.

Submitted 6 October, 2022; originally announced October 2022.

Comments: 11 pages

MSC Class: 68T05; 68T30; 68T37; 68W27 ACM Class: I.2.1; I.2.4; I.2.5; I.2.6; I.5.1; I.5.3; I.5.4; I.5.5

arXiv:2208.04376 [pdf, other]

On Taking Advantage of Opportunistic Meta-knowledge to Reduce Configuration Spaces for Automated Machine Learning

Authors: David Jacob Kedziora, Tien-Dung Nguyen, Katarzyna Musial, Bogdan Gabrys

Abstract: The automated machine learning (AutoML) process can require searching through complex configuration spaces of not only machine learning (ML) components and their hyperparameters but also ways of composing them together, i.e. forming ML pipelines. Optimisation efficiency and the model accuracy attainable for a fixed time budget suffer if this pipeline configuration space is excessively large. A key research question is whether it is both possible and practical to preemptively avoid costly evaluations of poorly performing ML pipelines by leveraging their historical performance for various ML tasks, i.e. meta-knowledge. The previous experience comes in the form of classifier/regressor accuracy rankings derived from either (1) a substantial but non-exhaustive number of pipeline evaluations made during historical AutoML runs, i.e. 'opportunistic' meta-knowledge, or (2) comprehensive cross-validated evaluations of classifiers/regressors with default hyperparameters, i.e. 'systematic' meta-knowledge. Numerous experiments with the AutoWeka4MCPS package suggest that (1) opportunistic/systematic meta-knowledge can improve ML outcomes, typically in line with how relevant that meta-knowledge is, and (2) configuration-space culling is optimal when it is neither too conservative nor too radical. However, the utility and impact of meta-knowledge depend critically on numerous facets of its generation and exploitation, warranting extensive analysis; these are often overlooked/underappreciated within AutoML and meta-learning literature. In particular, we observe strong sensitivity to the 'challenge' of a dataset, i.e. whether specificity in choosing a predictor leads to significantly better performance. Ultimately, identifying 'difficult' datasets, thus defined, is crucial to both generating informative meta-knowledge bases and understanding optimal search-space reduction strategies.

Submitted 8 August, 2022; originally announced August 2022.

Comments: 71 pages

arXiv:2205.04139 [pdf, other]

The Roles and Modes of Human Interactions with Automated Machine Learning Systems

Authors: Thanh Tung Khuat, David Jacob Kedziora, Bogdan Gabrys

Abstract: As automated machine learning (AutoML) systems continue to progress in both sophistication and performance, it becomes important to understand the 'how' and 'why' of human-computer interaction (HCI) within these frameworks, both current and expected. Such a discussion is necessary for optimal system design, leveraging advanced data-processing capabilities to support decision-making involving humans, but it is also key to identifying the opportunities and risks presented by ever-increasing levels of machine autonomy. Within this context, we focus on the following questions: (i) What does HCI currently look like for state-of-the-art AutoML algorithms, especially during the stages of development, deployment, and maintenance? (ii) Do the expectations of HCI within AutoML frameworks vary for different types of users and stakeholders? (iii) How can HCI be managed so that AutoML solutions acquire human trust and broad acceptance? (iv) As AutoML systems become more autonomous and capable of learning from complex open-ended environments, will the fundamental nature of HCI evolve? To consider these questions, we project existing literature in HCI into the space of AutoML; this connection has, to date, largely been unexplored. In so doing, we review topics including user-interface design, human-bias mitigation, and trust in artificial intelligence (AI). Additionally, to rigorously gauge the future of HCI, we contemplate how AutoML may manifest in effectively open-ended environments. This discussion necessarily reviews projected developmental pathways for AutoML, such as the incorporation of reasoning, although the focus remains on how and why HCI may occur in such a framework rather than on any implementational details. Ultimately, this review serves to identify key research directions aimed at better facilitating the roles and modes of human interactions with both current and future AutoML systems.

Submitted 9 May, 2022; originally announced May 2022.

Comments: Submitted to Foundations and Trends in Human-Computer Interaction

ACM Class: A.1; I.2.0; I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.4

arXiv:2202.09363 [pdf, other]

Towards Digital Twin Oriented Modelling of Complex Networked Systems and Their Dynamics: A Comprehensive Survey

Authors: Jiaqi Wen, Bogdan Gabrys, Katarzyna Musial

Abstract: This paper aims to provide a comprehensive critical overview on how entities and their interactions in Complex Networked Systems (CNS) are modelled across disciplines as they approach their ultimate goal of creating a Digital Twin (DT) that perfectly matches the reality. We propose a new framework to conceptually compare diverse existing modelling paradigms from different perspectives and create unified assessment criteria to assess their respective capabilities of reaching such an ultimate goal. Using the proposed criteria, we also appraise how far the reviewed current state-of-the-art approaches are from the idealised DTs. We also identify and propose potential directions and ways of building a DT-orientated CNS based on the convergence and integration of CNS and DT utilising a variety of cross-disciplinary techniques.

Submitted 15 February, 2022; originally announced February 2022.

Comments: 36 pages, 13 figures

arXiv:2112.09245 [pdf, other]

Automated Deep Learning: Neural Architecture Search Is Not the End

Authors: Xuanyi Dong, David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys

Abstract: Deep learning (DL) has proven to be a highly effective approach for developing models in diverse contexts, including visual perception, speech recognition, and machine translation. However, the end-to-end process for applying DL is not trivial. It requires grappling with problem formulation and context understanding, data engineering, model development, deployment, continuous monitoring and maintenance, and so on. Moreover, each of these steps typically relies heavily on humans, in terms of both knowledge and interactions, which impedes the further advancement and democratization of DL. Consequently, in response to these issues, a new field has emerged over the last few years: automated deep learning (AutoDL). This endeavor seeks to minimize the need for human involvement and is best known for its achievements in neural architecture search (NAS), a topic that has been the focus of several surveys. That stated, NAS is not the be-all and end-all of AutoDL. Accordingly, this review adopts an overarching perspective, examining research efforts into automation across the entirety of an archetypal DL workflow. In so doing, this work also proposes a comprehensive set of ten criteria by which to assess existing work in both individual publications and broader research areas. These criteria are: novelty, solution quality, efficiency, stability, interpretability, reproducibility, engineering quality, scalability, generalizability, and eco-friendliness. Thus, ultimately, this review provides an evaluative overview of AutoDL in the early 2020s, identifying where future opportunities for progress may exist.

Submitted 16 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: 66 pages, 10 tables, 4 figures, 325 references; improve the old version with community feedback

arXiv:2105.00282 [pdf, other]

Exploring Opportunistic Meta-knowledge to Reduce Search Spaces for Automated Machine Learning

Authors: Tien-Dung Nguyen, David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys

Abstract: Machine learning (ML) pipeline composition and optimisation have been studied to seek multi-stage ML models, i.e. preprocessor-inclusive, that are both valid and well-performing. These processes typically require the design and traversal of complex configuration spaces consisting of not just individual ML components and their hyperparameters, but also higher-level pipeline structures that link these components together. Optimisation efficiency and resulting ML-model accuracy both suffer if this pipeline search space is unwieldy and excessively large; it becomes an appealing notion to avoid costly evaluations of poorly performing ML components ahead of time. Accordingly, this paper investigates whether, based on previous experience, a pool of available classifiers/regressors can be preemptively culled ahead of initiating a pipeline composition/optimisation process for a new ML problem, i.e. dataset. The previous experience comes in the form of classifier/regressor accuracy rankings derived, with loose assumptions, from a substantial but non-exhaustive number of pipeline evaluations; this meta-knowledge is considered 'opportunistic'. Numerous experiments with the AutoWeka4MCPS package, including ones leveraging similarities between datasets via the relative landmarking method, show that, despite its seeming unreliability, opportunistic meta-knowledge can improve ML outcomes. However, results also indicate that the culling of classifiers/regressors should not be too severe either. In effect, it is better to search through a 'top tier' of recommended predictors than to pin hopes onto one previously supreme performer.

Submitted 1 May, 2021; originally announced May 2021.

Journal ref: International Joint Conference on Neural Network 2021

arXiv:2101.02939 [pdf]

doi 10.1109/TSMC.2023.3244714

Application of Machine Learning to Performance Assessment for a class of PID-based Control Systems

Authors: Patryk Grelewicz, Thanh Tung Khuat, Jacek Czeczot, Pawel Nowak, Tomasz Klopot, Bogdan Gabrys

Abstract: In this paper, a novel machine learning derived control performance assessment (CPA) classification system is proposed. It is dedicated for a wide class of PID-based control industrial loops with processes exhibiting dynamical properties close to second order plus delay time (SOPDT). The proposed concept is very general and easy to configure to distinguish between acceptable and poor closed loop performance. This approach allows for determining the best (but also robust and practically achievable) closed loop performance based on very popular and intuitive closed loop quality factors. Training set can be automatically derived off-line using a number of different, diverse control performance indices (CPIs) used as discriminative features of the assessed control system. The proposed extended set of CPIs is discussed with comprehensive performance assessment of different machine learning based classification methods and practical application of the suggested solution. As a result, a general-purpose CPA system is derived that can be immediately applied in practice without any preliminary or additional learning stage during normal closed loop operation. It is verified by practical application to assess the control system for a laboratory heat exchange and distribution setup.

Submitted 4 April, 2022; v1 submitted 8 January, 2021; originally announced January 2021.

Comments: This work has been submitted to the IEEE Trans. On Systems, Man and Cybernetics: Systems, for possible publication. Compared to the previous version, this one is extended with new comparative results of the proposed method and a clear description of the motivation and novelty of this work

MSC Class: 93 ACM Class: I.2; I.5

Journal ref: IEEE Transactions on Systems, Man, and Cybernetics: Systems 2023

arXiv:2012.12600 [pdf, other]

AutonoML: Towards an Integrated Framework for Autonomous Machine Learning

Authors: David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys

Abstract: Over the last decade, the long-running endeavour to automate high-level processes in machine learning (ML) has risen to mainstream prominence, stimulated by advances in optimisation techniques and their impact on selecting ML models/algorithms. Central to this drive is the appeal of engineering a computational system that both discovers and deploys high-performance solutions to arbitrary ML problems with minimal human interaction. Beyond this, an even loftier goal is the pursuit of autonomy, which describes the capability of the system to independently adjust an ML solution over a lifetime of changing contexts. However, these ambitions are unlikely to be achieved in a robust manner without the broader synthesis of various mechanisms and theoretical frameworks, which, at the present time, remain scattered across numerous research threads. Accordingly, this review seeks to motivate a more expansive perspective on what constitutes an automated/autonomous ML system, alongside consideration of how best to consolidate those elements. In doing so, we survey developments in the following research areas: hyperparameter optimisation, multi-component models, neural architecture search, automated feature engineering, meta-learning, multi-level ensembling, dynamic adaptation, multi-objective evaluation, resource constraints, flexible user involvement, and the principles of generalisation. We also develop a conceptual framework throughout the review, augmented by each topic, to illustrate one possible way of fusing high-level mechanisms into an autonomous ML system.
Ultimately, we conclude that the notion of architectural integration deserves more discussion, without which the field of automated ML risks stifling both its technical advantages and general uptake.

Submitted 29 March, 2022; v1 submitted 23 December, 2020; originally announced December 2020.

Comments: Updated with feedback from ML community

arXiv:2011.11846 [pdf, other]

AutoWeka4MCPS-AVATAR: Accelerating Automated Machine Learning Pipeline Composition and Optimisation

Authors: Tien-Dung Nguyen, Bogdan Gabrys, Katarzyna Musial

Abstract: Automated machine learning pipeline (ML) composition and optimisation aim at automating the process of finding the most promising ML pipelines within allocated resources (i.e., time, CPU and memory). Existing methods, such as Bayesian-based and genetic-based optimisation, which are implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. Therefore, the pipeline composition and optimisation of these methods frequently require a tremendous amount of time that prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid in the first place, and attempting to execute them is a waste of time and resources. To address this issue, we propose a novel method to evaluate the validity of ML pipelines, without their execution, using a surrogate model (AVATAR). The AVATAR generates a knowledge base by automatically learning the capabilities and effects of ML algorithms on datasets' characteristics. This knowledge base is used for a simplified mapping from an original ML pipeline to a surrogate model which is a Petri net based pipeline. Instead of executing the original ML pipeline to evaluate its validity, the AVATAR evaluates its surrogate model constructed by capabilities and effects of the ML pipeline components and input/output simplified mappings. Evaluating this surrogate model is less resource-intensive than the execution of the original pipeline.
As a result, the AVATAR enables the pipeline composition and optimisation methods to evaluate more pipelines by quickly rejecting invalid pipelines. We integrate the AVATAR into the sequential model-based algorithm configuration (SMAC). Our experiments show that when SMAC employs AVATAR, it finds better solutions than on its own.

Submitted 21 November, 2020; originally announced November 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:2001.11158

arXiv:2011.10763 [pdf, other]

Measuring Quadrangle Formation in Complex Networks

Authors: Mingshan Jia, Bogdan Gabrys, Katarzyna Musial

Abstract: The classic clustering coefficient and the lately proposed closure coefficient quantify the formation of triangles from two different perspectives, with the focal node at the centre or at the end in an open triad respectively. As many networks are naturally rich in triangles, they become standard metrics to describe and analyse networks. However, the advantages of applying them can be limited in networks where there are relatively few triangles but which are rich in quadrangles, such as the protein-protein interaction networks, the neural networks and the food webs. This calls for other approaches that would leverage quadrangles in our journey to better understand local structures and their meaning in different types of networks. Here we propose two quadrangle coefficients, i.e., the i-quad coefficient and the o-quad coefficient, to quantify quadrangle formation in networks, and we further extend them to weighted networks. Through experiments on 16 networks from six different domains, we first reveal the density distribution of the two quadrangle coefficients, and then analyse their correlations with node degree. Finally, we demonstrate that at network-level, adding the average i-quad coefficient and the average o-quad coefficient leads to significant improvement in network classification, while at node-level, the i-quad and o-quad coefficients are useful features to improve link prediction.

Submitted 21 November, 2020; originally announced November 2020.

arXiv:2009.14670 [pdf, other]

An Online Learning Algorithm for a Neuro-Fuzzy Classifier with Mixed-Attribute Data

Authors: Thanh Tung Khuat, Bogdan Gabrys

Abstract: General fuzzy min-max neural network (GFMMNN) is one of the efficient neuro-fuzzy systems for data classification. However, one of the downsides of its original learning algorithms is the inability to handle and learn from the mixed-attribute data. While categorical features encoding methods can be used with the GFMMNN learning algorithms, they exhibit a lot of shortcomings. Other approaches proposed in the literature are not suitable for on-line learning as they require entire training data available in the learning phase. With the rapid change in the volume and velocity of streaming data in many application areas, it is increasingly required that the constructed models can learn and adapt to the continuous data changes in real-time without the need for their full retraining or access to the historical data. This paper proposes an extended online learning algorithm for the GFMMNN. The proposed method can handle the datasets with both continuous and categorical features. The extensive experiments confirmed superior and stable classification performance of the proposed approach in comparison to other relevant learning algorithms for the GFMM model.

Submitted 30 September, 2020; originally announced September 2020.

MSC Class: 68T30; 68T20; 68T37; 68W27 ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

arXiv:2009.00437 [pdf, other]

doi 10.1109/TPAMI.2021.3054824

NATS-Bench: Benchmarking NAS Algorithms for Architecture Topology and Size

Authors: Xuanyi Dong, Lu Liu, Katarzyna Musial, Bogdan Gabrys

Abstract: Neural architecture search (NAS) has attracted a lot of attention and has been illustrated to bring tangible benefits in a large number of applications in the past few years. Architecture topology and architecture size have been regarded as two of the most important aspects for the performance of deep learning models and the community has spawned lots of searching algorithms for both aspects of the neural architectures. However, the performance gain from these searching algorithms is achieved under different search spaces and training setups. This makes the overall performance of the algorithms to some extent incomparable and the improvement from a sub-module of the searching model unclear. In this paper, we propose NATS-Bench, a unified benchmark on searching for both topology and size, for (almost) any up-to-date NAS algorithm. NATS-Bench includes the search space of 15,625 neural cell candidates for architecture topology and 32,768 for architecture size on three datasets. We analyze the validity of our benchmark in terms of various criteria and performance comparison of all candidates in the search space. We also show the versatility of NATS-Bench by benchmarking 13 recent state-of-the-art NAS algorithms on it. All logs and diagnostic information trained using the same setup for each candidate are provided. This facilitates a much larger community of researchers to focus on developing better NAS algorithms in a more comparable and computationally cost friendly environment.
All codes are publicly available at: https://xuanyidong.com/assets/projects/NATS-Bench.

Submitted 25 January, 2021; v1 submitted 28 August, 2020; originally announced September 2020.

Comments: Accepted to IEEE TPAMI 2021, an extended version of NAS-Bench-201 (ICLR 2020) [arXiv:2001.00326]

arXiv:2009.00237 [pdf, other]

An in-depth comparison of methods handling mixed-attribute data for general fuzzy min-max neural network

Authors: Thanh Tung Khuat, Bogdan Gabrys

Abstract: A general fuzzy min-max (GFMM) neural network is one of the efficient neuro-fuzzy systems for classification problems. However, a disadvantage of most of the current learning algorithms for GFMM is that they can handle effectively numerical valued features only. Therefore, this paper provides some potential approaches to adapting GFMM learning algorithms for classification problems with mixed-type or only categorical features as they are very common in practical applications and often carry very useful information. We will compare and assess three main methods of handling datasets with mixed features, including the use of encoding methods, the combination of the GFMM model with other classifiers, and employing the specific learning algorithms for both types of features. The experimental results showed that the target and James-Stein are appropriate categorical encoding methods for learning algorithms of GFMM models, while the combination of GFMM neural networks and decision trees is a flexible way to enhance the classification performance of GFMM models on datasets with the mixed features. The learning algorithms with the mixed-type feature abilities are potential approaches to deal with mixed-attribute data in a natural way, but they need further improvement to achieve a better classification accuracy. Based on the analysis, we also identify the strong and weak points of different methods and propose potential research directions.

Submitted 1 September, 2020; originally announced September 2020.

MSC Class: 68T30; 68T20; 68T37; 68W27 ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

arXiv:2007.10818 [pdf, other]

A Review of Meta-level Learning in the Context of Multi-component, Multi-level Evolving Prediction Systems

Authors: Abbas Raza Ali, Marcin Budka, Bogdan Gabrys

Abstract: The exponential growth of volume, variety and velocity of data is raising the need for investigations of automated or semi-automated ways to extract useful patterns from the data. It requires deep expert knowledge and extensive computational resources to find the most appropriate mapping of learning methods for a given problem. It becomes a challenge in the presence of numerous configurations of learning algorithms on massive amounts of data. So there is a need for an intelligent recommendation engine that can advise what is the best learning algorithm for a dataset. The techniques that are commonly used by experts are based on a trial and error approach evaluating and comparing a number of possible solutions against each other, using their prior experience on a specific domain, etc. The trial and error approach combined with the expert's prior knowledge, though computationally and time expensive, have been often shown to work for stationary problems where the processing is usually performed off-line. However, this approach would not normally be feasible to apply to non-stationary problems where streams of data are continuously arriving. Furthermore, in a non-stationary environment, the manual analysis of data and testing of various methods whenever there is a change in the underlying data distribution would be very difficult or simply infeasible.
In that scenario and within an on-line predictive system, there are several tasks where Meta-learning can be used to effectively facilitate best recommendations including 1) pre-processing steps, 2) learning algorithms or their combination, 3) adaptivity mechanisms and their parameters, 4) recurring concept extraction, and 5) concept drift detection.

Submitted 17 July, 2020; originally announced July 2020.

arXiv:2006.12366 [pdf, other]

doi 10.1016/j.engappai.2020.103760

Scoring and Assessment in Medical VR Training Simulators with Dynamic Time Series Classification

Authors: Neil Vaughan, Bogdan Gabrys

Abstract: This research proposes and evaluates scoring and assessment methods for Virtual Reality (VR) training simulators. VR simulators capture detailed n-dimensional human motion data which is useful for performance analysis. Custom made medical haptic VR training simulators were developed and used to record data from 271 trainees of multiple clinical experience levels. DTW Multivariate Prototyping (DTW-MP) is proposed. VR data was classified as Novice, Intermediate or Expert. Accuracy of algorithms applied for time-series classification were: dynamic time warping 1-nearest neighbor (DTW-1NN) 60%, nearest centroid SoftDTW classification 77.5%, Deep Learning: ResNet 85%, FCN 75%, CNN 72.5% and MCDCNN 28.5%. Expert VR data recordings can be used for guidance of novices. Assessment feedback can help trainees to improve skills and consistency. Motion analysis can identify different techniques used by individuals. Mistakes can be detected dynamically in real-time, raising alarms to prevent injuries.

Submitted 11 June, 2020; originally announced June 2020.

ACM Class: I.2.0

Journal ref: Engineering Applications of Artificial Intelligence (2020) 103760

arXiv:2006.03656 [pdf, other]

AutoHAS: Efficient Hyperparameter and Architecture Search

Authors: Xuanyi Dong, Mingxing Tan, Adams Wei Yu, Daiyi Peng, Bogdan Gabrys, Quoc V. Le

Abstract: Efficient hyperparameter or architecture search methods have shown remarkable results, but each of them is only applicable to searching for either hyperparameters (HPs) or architectures. In this work, we propose a unified pipeline, AutoHAS, to efficiently search for both architectures and hyperparameters. AutoHAS learns to alternately update the shared network weights and a reinforcement learning (RL) controller, which learns the probability distribution for the architecture candidates and HP candidates. A temporary weight is introduced to store the updated weight from the selected HPs (by the controller), and a validation accuracy based on this temporary weight serves as a reward to update the controller. In experiments, we show AutoHAS is efficient and generalizable to different search spaces, baselines and datasets. In particular, AutoHAS can improve the accuracy over popular network architectures, such as ResNet and EfficientNet, on CIFAR-10/100, ImageNet, and four more other datasets.

Submitted 7 April, 2021; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: Accepted to 2nd Workshop on Neural Architecture Search at ICLR 2021

arXiv:2006.01963 [pdf, other]

Multi-level Graph Convolutional Networks for Cross-platform Anchor Link Prediction

Authors: Hongxu Chen, Hongzhi Yin, Xiangguo Sun, Tong Chen, Bogdan Gabrys, Katarzyna Musial

Abstract: Cross-platform account matching plays a significant role in social network analytics, and is beneficial for a wide range of applications. However, existing methods either heavily rely on high-quality user generated content (including user profiles) or suffer from data insufficiency problem if only focusing on network topology, which brings researchers into an insoluble dilemma of model selection. In this paper, to address this problem, we propose a novel framework that considers multi-level graph convolutions on both local network structure and hypergraph structure in a unified manner. The proposed method overcomes data insufficiency problem of existing work and does not necessarily rely on user demographic information. Moreover, to adapt the proposed method to be capable of handling large-scale social networks, we propose a two-phase space reconciliation mechanism to align the embedding spaces in both network partitioning based parallel training and account matching across different social networks. Extensive experiments have been conducted on two large-scale real-life social networks. The experimental results demonstrate that the proposed method outperforms the state-of-the-art models with a big margin.

Submitted 2 June, 2020; originally announced June 2020.

Comments: To appear in KDD'20

arXiv:2006.00695 [pdf, other]

doi 10.1109/TNNLS.2021.3104896

Random Hyperboxes

Authors: Thanh Tung Khuat, Bogdan Gabrys

Abstract: This paper proposes a simple yet powerful ensemble classifier, called Random Hyperboxes, constructed from individual hyperbox-based classifiers trained on the random subsets of sample and feature spaces of the training set. We also show a generalization error bound of the proposed classifier based on the strength of the individual hyperbox-based classifiers as well as the correlation among them. The effectiveness of the proposed classifier is analyzed using a carefully selected illustrative example and compared empirically with other popular single and ensemble classifiers via 20 datasets using statistical testing methods. The experimental results confirmed that our proposed method outperformed other fuzzy min-max neural networks, popular learning algorithms, and is competitive with other ensemble methods. Finally, we identify the existing issues related to the generalization error bounds of the real datasets and inform the potential research directions.

Submitted 4 April, 2022; v1 submitted 31 May, 2020; originally announced June 2020.

Comments: @20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

MSC Class: 68T30; 68T20; 68T37; 68W27 ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

Journal ref: IEEE Transactions on Neural Networks and Learning Systems (2021)

arXiv:2005.07496 [pdf, other]

doi 10.1109/ACCESS.2021.3082932

Foundations and modelling of dynamic networks using Dynamic Graph Neural Networks: A survey

Authors: Joakim Skarding, Bogdan Gabrys, Katarzyna Musial

Abstract: Dynamic networks are used in a wide range of fields, including social network analysis, recommender systems, and epidemiology. Representing complex networks as structures changing over time allows network models to leverage not only structural but also temporal patterns. However, as dynamic network literature stems from diverse fields and makes use of inconsistent terminology, it is challenging to navigate. Meanwhile, graph neural networks (GNNs) have gained a lot of attention in recent years for their ability to perform well on a range of network science tasks, such as link prediction and node classification. Despite the popularity of graph neural networks and the proven benefits of dynamic network models, there has been little focus on graph neural networks for dynamic networks. To address the challenges resulting from the fact that this research crosses diverse fields as well as to survey dynamic graph neural networks, this work is split into two main parts. First, to address the ambiguity of the dynamic network terminology we establish a foundation of dynamic networks with consistent, detailed terminology and notation. Second, we present a comprehensive survey of dynamic graph neural network models using the proposed terminology.

Submitted 13 June, 2021; v1 submitted 13 May, 2020; originally announced May 2020.

Comments: 28 pages, 9 figures, 8 tables

Journal ref: in IEEE Access, vol. 9, pp. 79143-79168, 2021

arXiv:2003.11333 [pdf, other]

Accelerated learning algorithms of general fuzzy min-max neural network using a novel hyperbox selection rule

Authors: Thanh Tung Khuat, Bogdan Gabrys

Abstract: This paper proposes a method to accelerate the training process of a general fuzzy min-max neural network. The purpose is to reduce the unsuitable hyperboxes selected as the potential candidates of the expansion step of existing hyperboxes to cover a new input pattern in the online learning algorithms or candidates of the hyperbox aggregation process in the agglomerative learning algorithms. Our proposed approach is based on the mathematical formulas to form a branch-and-bound solution aiming to remove the hyperboxes which are certain not to satisfy expansion or aggregation conditions, and in turn, decreasing the training time of learning algorithms. The efficiency of the proposed method is assessed over a number of widely used data sets. The experimental results indicated the significant decrease in training time of the proposed approach for both online and agglomerative learning algorithms. Notably, the training time of the online learning algorithms is reduced from 1.2 to 12 times when using the proposed method, while the agglomerative learning algorithms are accelerated from 7 to 37 times on average.

Submitted 19 May, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

arXiv:2001.11158 [pdf, other]

AVATAR -- Machine Learning Pipeline Evaluation Using Surrogate Model

Authors: Tien-Dung Nguyen, Tomasz Maszczyk, Katarzyna Musial, Marc-Andre Zöller, Bogdan Gabrys

Abstract: The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. The previous methods such as Bayesian-based and genetic-based optimisation, which are implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. Therefore, the pipeline composition and optimisation of these methods requires a tremendous amount of time that prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid, and it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method to evaluate the validity of ML pipelines using a surrogate model (AVATAR). The AVATAR enables to accelerate automatic ML pipeline composition and optimisation by quickly ignoring invalid pipelines. Our experiments show that the AVATAR is more efficient in evaluating complex pipelines in comparison with the traditional evaluation approaches requiring their execution.

Submitted 2 February, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

Comments: The Eighteenth International Symposium on Intelligent Data Analysis, IDA 2020

arXiv:2001.02391 [pdf, other]

An improved online learning algorithm for general fuzzy min-max neural network

Authors: Thanh Tung Khuat, Fang Chen, Bogdan Gabrys

Abstract: This paper proposes an improved version of the current online learning algorithm for a general fuzzy min-max neural network (GFMM) to tackle existing issues concerning expansion and contraction steps as well as the way of dealing with unseen data located on decision boundaries. These drawbacks lower its classification performance, so an improved algorithm is proposed in this study to address the above limitations. The proposed approach does not use the contraction process for overlapping hyperboxes, which is more likely to increase the error rate as shown in the literature. The empirical results indicated the improvement in the classification accuracy and stability of the proposed method compared to the original version and other fuzzy min-max classifiers. In order to reduce the sensitivity to the training samples presentation order of this new on-line learning algorithm, a simple ensemble method is also proposed.

Submitted 8 January, 2020; originally announced January 2020.

Comments: 9 pages, 8 tables, 6 figures

ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

arXiv:1907.13308 [pdf, other]

doi 10.1016/j.neucom.2019.12.090

A comparative study of general fuzzy min-max neural networks for pattern classification problems

Authors: Thanh Tung Khuat, Bogdan Gabrys

Abstract: General fuzzy min-max (GFMM) neural network is a generalization of fuzzy neural networks formed by hyperbox fuzzy sets for classification and clustering problems. Two principal algorithms are deployed to train this type of neural network, i.e., incremental learning and agglomerative learning. This paper presents a comprehensive empirical study of performance influencing factors, advantages, and drawbacks of the general fuzzy min-max neural network on pattern classification problems. The subjects of this study include (1) the impact of maximum hyperbox size, (2) the influence of the similarity threshold and measures on the agglomerative learning algorithm, (3) the effect of data presentation order, (4) comparative performance evaluation of the GFMM with other types of fuzzy min-max neural networks and prevalent machine learning algorithms. The experimental results on benchmark datasets widely used in machine learning showed overall strong and weak points of the GFMM classifier. These outcomes also informed potential research directions for this class of machine learning algorithms in the future.

Submitted 8 January, 2020; v1 submitted 31 July, 2019; originally announced July 2019.

Comments: 18 pages, 7 figures, 12 tables

MSC Class: 68T30; 68T20; 68T37; 68W27 ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

Journal ref: Neurocomputing, 2019

arXiv:1905.12170 [pdf, other]

doi 10.1109/TFUZZ.2019.2956917

An Effective Multi-Resolution Hierarchical Granular Representation based Classifier using General Fuzzy Min-Max Neural Network

Authors: Thanh Tung Khuat, Fang Chen, Bogdan Gabrys

Abstract: Motivated by the practical demands for simplification of data towards being consistent with human thinking and problem solving as well as tolerance of uncertainty, information granules are becoming important entities in data processing at different levels of data abstraction. This paper proposes a method to construct classifiers from multi-resolution hierarchical granular representations (MRHGRC) using hyperbox fuzzy sets. The proposed approach forms a series of granular inferences hierarchically through many levels of abstraction. An attractive characteristic of our classifier is that it can maintain relatively high accuracy at a low degree of granularity based on reusing the knowledge learned from lower levels of abstraction. In addition, our approach can reduce the data size significantly as well as handling the uncertainty and incompleteness associated with data in real-world applications. The construction process of the classifier consists of two phases. The first phase is to formulate the model at the greatest level of granularity, while the later stage aims to reduce the complexity of the constructed model and deduce it from data at higher abstraction levels. Experimental outcomes conducted comprehensively on both synthetic and real datasets indicated the efficiency of our method in terms of training time and predictive performance in comparison to other types of fuzzy min-max neural networks and common machine learning algorithms.

Submitted 3 December, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

Comments: @20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

MSC Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4 ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

Journal ref: IEEE Transactions on Fuzzy Systems, pp. 1-1, 2019

arXiv:1901.11303 [pdf, other]

Hyperbox based machine learning algorithms: A comprehensive survey

Authors: Thanh Tung Khuat, Dymitr Ruta, Bogdan Gabrys

Abstract: With the rapid development of digital information, the data volume generated by humans and machines is growing exponentially. Along with this trend, machine learning algorithms have been formed and evolved continuously to discover new information and knowledge from different data sources. Learning algorithms using hyperboxes as fundamental representational and building blocks are a branch of machine learning methods. These algorithms have enormous potential for high scalability and online adaptation of predictors built using hyperbox data representations to the dynamically changing environments and streaming data. This paper aims to give a comprehensive survey of literature on hyperbox-based machine learning models. In general, according to the architecture and characteristic features of the resulting models, the existing hyperbox-based learning algorithms may be grouped into three major categories: fuzzy min-max neural networks, hyperbox-based hybrid models, and other algorithms based on hyperbox representations. Within each of these groups, this paper shows a brief description of the structure of models, associated learning algorithms, and an analysis of their advantages and drawbacks. Main applications of these hyperbox-based models to the real-world problems are also described in this paper. Finally, we discuss some open problems and identify potential future research directions in this field.

Submitted 21 March, 2019; v1 submitted 31 January, 2019; originally announced January 2019.

Comments: 7 figures

MSC Class: 68T30; 68T20; 68T37; 68W27 ACM Class: I.2.1; I.2.6; I.2.m; I.5.0; I.5.1; I.5.2; I.5.3; I.5.4

arXiv:1812.10793 [pdf, other]

Automated Adaptation Strategies for Stream Learning

Authors: Rashid Bakirov, Bogdan Gabrys, Damien Fay

Abstract: Automation of machine learning model development is increasingly becoming an established research area. While automated model selection and automated data pre-processing have been studied in depth, there is, however, a gap concerning automated model adaptation strategies when multiple strategies are available. Manually developing an adaptation strategy can be time consuming and costly. In this paper we address this issue by proposing the use of flexible adaptive mechanism deployment for automated development of adaptation strategies. Experimental results after using the proposed strategies with five adaptive algorithms on 36 datasets confirm their viability. These strategies achieve better or comparable performance to the custom adaptation strategies and the repeated deployment of any single adaptive mechanism.

Submitted 30 April, 2021; v1 submitted 27 December, 2018; originally announced December 2018.

arXiv:1612.08789 [pdf, other]

doi 10.1109/TASE.2018.2876430

Automatic Composition and Optimization of Multicomponent Predictive Systems With an Extended Auto-WEKA

Authors: Manuel Martin Salvador, Marcin Budka, Bogdan Gabrys

Abstract: Composition and parameterization of multicomponent predictive systems (MCPSs) consisting of chains of data transformation steps are a challenging task. Auto-WEKA is a tool to automate the combined algorithm selection and hyperparameter (CASH) optimization problem. In this paper, we extend the CASH problem and Auto-WEKA to support the MCPS, including preprocessing steps for both classification and regression tasks. We define the optimization problem in which the search space consists of suitably parameterized Petri nets forming the sought MCPS solutions. In the experimental analysis, we focus on examining the impact of considerably extending the search space (from approximately 22,000 to 812 billion possible combinations of methods and categorical hyperparameters). In a range of extensive experiments, three different optimization strategies are used to automatically compose MCPSs for 21 publicly available data sets. The diversity of the composed MCPSs found is an indication that fully and automatically exploiting different combinations of data cleaning and preprocessing techniques is possible and highly beneficial for different predictive models. We also present the results on seven data sets from real chemical production processes. Our findings can have a major impact on the development of high-quality predictive models as well as their maintenance and scalability aspects needed in modern applications and deployment scenarios.

Submitted 1 February, 2019; v1 submitted 27 December, 2016; originally announced December 2016.

Journal ref: in IEEE Transactions on Automation Science and Engineering. (2018) 1-14

Showing 1–35 of 35 results for author: Gabrys, B