Search | arXiv e-print repository

Methods for Class-Imbalanced Learning with Support Vector Machines: A Review and an Empirical Evaluation

Authors: Salim Rezvani, Farhad Pourpanah, Chee Peng Lim, Q. M. Jonathan Wu

Abstract: This paper presents a review on methods for class-imbalanced learning with the Support Vector Machine (SVM) and its variants. We first explain the structure of SVM and its variants and discuss their inefficiency in learning with class-imbalanced data sets. We introduce a hierarchical categorization of SVM-based models with respect to class-imbalanced learning. Specifically, we categorize SVM-based… ▽ More This paper presents a review on methods for class-imbalanced learning with the Support Vector Machine (SVM) and its variants. We first explain the structure of SVM and its variants and discuss their inefficiency in learning with class-imbalanced data sets. We introduce a hierarchical categorization of SVM-based models with respect to class-imbalanced learning. Specifically, we categorize SVM-based models into re-sampling, algorithmic, and fusion methods, and discuss the principles of the representative models in each category. In addition, we conduct a series of empirical evaluations to compare the performances of various representative SVM-based models in each category using benchmark imbalanced data sets, ranging from low to high imbalanced ratios. Our findings reveal that while algorithmic methods are less time-consuming owing to no data pre-processing requirements, fusion methods, which combine both re-sampling and algorithmic approaches, generally perform the best, but with a higher computational load. A discussion on research gaps and future research directions is provided. △ Less

Submitted 11 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

Comments: Accepted in Soft Computing

arXiv:2405.13535 [pdf, other]

Generalized Laplace Approximation

Authors: Yinsong Chen, Samson S. Yu, Zhong Li, Chee Peng Lim

Abstract: In recent years, the inconsistency in Bayesian deep learning has garnered increasing attention. Tempered or generalized posterior distributions often offer a direct and effective solution to this issue. However, understanding the underlying causes and evaluating the effectiveness of generalized posteriors remain active areas of research. In this study, we introduce a unified theoretical framework… ▽ More In recent years, the inconsistency in Bayesian deep learning has garnered increasing attention. Tempered or generalized posterior distributions often offer a direct and effective solution to this issue. However, understanding the underlying causes and evaluating the effectiveness of generalized posteriors remain active areas of research. In this study, we introduce a unified theoretical framework to attribute Bayesian inconsistency to model misspecification and inadequate priors. We interpret the generalization of the posterior with a temperature factor as a correction for misspecified models through adjustments to the joint probability model, and the recalibration of priors by redistributing probability mass on models within the hypothesis space using data samples. Additionally, we highlight a distinctive feature of Laplace approximation, which ensures that the generalized normalizing constant can be treated as invariant, unlike the typical scenario in general Bayesian learning where this constant varies with model parameters post-generalization. Building on this insight, we propose the generalized Laplace approximation, which involves a simple adjustment to the computation of the Hessian matrix of the regularized loss function. This method offers a flexible and scalable framework for obtaining high-quality posterior distributions. We assess the performance and properties of the generalized Laplace approximation on state-of-the-art neural networks and real-world datasets. △ Less

Submitted 24 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

arXiv:2404.05196 [pdf, other]

HSViT: Horizontally Scalable Vision Transformer

Authors: Chenhao Xu, Chang-Tsun Li, Chee Peng Lim, Douglas Creighton

Abstract: While the Vision Transformer (ViT) architecture gains prominence in computer vision and attracts significant attention from multimedia communities, its deficiency in prior knowledge (inductive bias) regarding shift, scale, and rotational invariance necessitates pre-training on large-scale datasets. Furthermore, the growing layers and parameters in both ViT and convolutional neural networks (CNNs)… ▽ More While the Vision Transformer (ViT) architecture gains prominence in computer vision and attracts significant attention from multimedia communities, its deficiency in prior knowledge (inductive bias) regarding shift, scale, and rotational invariance necessitates pre-training on large-scale datasets. Furthermore, the growing layers and parameters in both ViT and convolutional neural networks (CNNs) impede their applicability to mobile multimedia services, primarily owing to the constrained computational resources on edge devices. To mitigate the aforementioned challenges, this paper introduces a novel horizontally scalable vision transformer (HSViT). Specifically, a novel image-level feature embedding allows ViT to better leverage the inductive bias inherent in the convolutional layers. Based on this, an innovative horizontally scalable architecture is designed, which reduces the number of layers and parameters of the models while facilitating collaborative training and inference of ViT models across multiple nodes. The experimental results depict that, without pre-training on large-scale datasets, HSViT achieves up to 10% higher top-1 accuracy than state-of-the-art schemes, ascertaining its superior preservation of inductive bias. The code is available at https://github.com/xuchenhao001/HSViT. △ Less

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2402.09975 [pdf]

Current and future roles of artificial intelligence in retinopathy of prematurity

Authors: Ali Jafarizadeh, Shadi Farabi Maleki, Parnia Pouya, Navid Sobhi, Mirsaeed Abdollahi, Siamak Pedrammehr, Chee Peng Lim, Houshyar Asadi, Roohallah Alizadehsani, Ru-San Tan, Sheikh Mohammad Shariful Islam, U. Rajendra Acharya

Abstract: Retinopathy of prematurity (ROP) is a severe condition affecting premature infants, leading to abnormal retinal blood vessel growth, retinal detachment, and potential blindness. While semi-automated systems have been used in the past to diagnose ROP-related plus disease by quantifying retinal vessel features, traditional machine learning (ML) models face challenges like accuracy and overfitting. R… ▽ More Retinopathy of prematurity (ROP) is a severe condition affecting premature infants, leading to abnormal retinal blood vessel growth, retinal detachment, and potential blindness. While semi-automated systems have been used in the past to diagnose ROP-related plus disease by quantifying retinal vessel features, traditional machine learning (ML) models face challenges like accuracy and overfitting. Recent advancements in deep learning (DL), especially convolutional neural networks (CNNs), have significantly improved ROP detection and classification. The i-ROP deep learning (i-ROP-DL) system also shows promise in detecting plus disease, offering reliable ROP diagnosis potential. This research comprehensively examines the contemporary progress and challenges associated with using retinal imaging and artificial intelligence (AI) to detect ROP, offering valuable insights that can guide further investigation in this domain. Based on 89 original studies in this field (out of 1487 studies that were comprehensively reviewed), we concluded that traditional methods for ROP diagnosis suffer from subjectivity and manual analysis, leading to inconsistent clinical decisions. AI holds great promise for improving ROP management. This review explores AI's potential in ROP detection, classification, diagnosis, and prognosis. △ Less

Submitted 15 February, 2024; originally announced February 2024.

Comments: 28 pages, 8 figures, 2 tables, 235 references, 1 supplementary table

ACM Class: J.3.2; J.3.3

arXiv:2310.12393 [pdf, other]

Deep Learning Techniques for Video Instance Segmentation: A Survey

Authors: Chenhao Xu, Chang-Tsun Li, Yongjian Hu, Chee Peng Lim, Douglas Creighton

Abstract: Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By tackling the video instance segmentation tasks through effective analysis and utilization of visual information in videos, a range of computer vision-enabled applicati… ▽ More Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By tackling the video instance segmentation tasks through effective analysis and utilization of visual information in videos, a range of computer vision-enabled applications (e.g., human action recognition, medical image processing, autonomous vehicle navigation, surveillance, etc) can be implemented. As deep-learning techniques take a dominant role in various computer vision areas, a plethora of deep-learning-based video instance segmentation schemes have been proposed. This survey offers a multifaceted view of deep-learning schemes for video instance segmentation, covering various architectural paradigms, along with comparisons of functional performance, model complexity, and computational overheads. In addition to the common architectural designs, auxiliary techniques for improving the performance of deep-learning models for video instance segmentation are compiled and discussed. Finally, we discuss a range of major challenges and directions for further investigations to help advance this promising research field. △ Less

Submitted 18 October, 2023; originally announced October 2023.

arXiv:2309.12560 [pdf, other]

Machine Learning Meets Advanced Robotic Manipulation

Authors: Saeid Nahavandi, Roohallah Alizadehsani, Darius Nahavandi, Chee Peng Lim, Kevin Kelly, Fernando Bello

Abstract: Automated industries lead to high quality production, lower manufacturing cost and better utilization of human resources. Robotic manipulator arms have major role in the automation process. However, for complex manipulation tasks, hard coding efficient and safe trajectories is challenging and time consuming. Machine learning methods have the potential to learn such controllers based on expert demo… ▽ More Automated industries lead to high quality production, lower manufacturing cost and better utilization of human resources. Robotic manipulator arms have major role in the automation process. However, for complex manipulation tasks, hard coding efficient and safe trajectories is challenging and time consuming. Machine learning methods have the potential to learn such controllers based on expert demonstrations. Despite promising advances, better approaches must be developed to improve safety, reliability, and efficiency of ML methods in both training and deployment phases. This survey aims to review cutting edge technologies and recent trends on ML methods applied to real-world manipulation tasks. After reviewing the related background on ML, the rest of the paper is devoted to ML applications in different domains such as industry, healthcare, agriculture, space, military, and search and rescue. The paper is closed with important research directions for future works. △ Less

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2305.14373 [pdf, other]

doi 10.1109/TETCI.2023.3285932

An Ensemble Semi-Supervised Adaptive Resonance Theory Model with Explanation Capability for Pattern Classification

Authors: Farhad Pourpanah, Chee Peng Lim, Ali Etemad, Q. M. Jonathan Wu

Abstract: Most semi-supervised learning (SSL) models entail complex structures and iterative training processes as well as face difficulties in interpreting their predictions to users. To address these issues, this paper proposes a new interpretable SSL model using the supervised and unsupervised Adaptive Resonance Theory (ART) family of networks, which is denoted as SSL-ART. Firstly, SSL-ART adopts an unsu… ▽ More Most semi-supervised learning (SSL) models entail complex structures and iterative training processes as well as face difficulties in interpreting their predictions to users. To address these issues, this paper proposes a new interpretable SSL model using the supervised and unsupervised Adaptive Resonance Theory (ART) family of networks, which is denoted as SSL-ART. Firstly, SSL-ART adopts an unsupervised fuzzy ART network to create a number of prototype nodes using unlabeled samples. Then, it leverages a supervised fuzzy ARTMAP structure to map the established prototype nodes to the target classes using labeled samples. Specifically, a one-to-many (OtM) map** scheme is devised to associate a prototype node with more than one class label. The main advantages of SSL-ART include the capability of: (i) performing online learning, (ii) reducing the number of redundant prototype nodes through the OtM map** scheme and minimizing the effects of noisy samples, and (iii) providing an explanation facility for users to interpret the predicted outcomes. In addition, a weighted voting strategy is introduced to form an ensemble SSL-ART model, which is denoted as WESSL-ART. Every ensemble member, i.e., SSL-ART, assigns {\color{black}a different weight} to each class based on its performance pertaining to the corresponding class. The aim is to mitigate the effects of training data sequences on all SSL-ART members and improve the overall performance of WESSL-ART. The experimental results on eighteen benchmark data sets, three artificially generated data sets, and a real-world case study indicate the benefits of the proposed SSL-ART and WESSL-ART models for tackling pattern classification problems. △ Less

Submitted 19 May, 2023; originally announced May 2023.

Comments: 13 pages, 8 figures

arXiv:2011.08641 [pdf, other]

doi 10.1109/TPAMI.2022.3191696

A Review of Generalized Zero-Shot Learning Methods

Authors: Farhad Pourpanah, Moloud Abdar, Yuxuan Luo, Xinlei Zhou, Ran Wang, Chee Peng Lim, Xi-Zhao Wang, Q. M. Jonathan Wu

Abstract: Generalized zero-shot learning (GZSL) aims to train a model for classifying data samples under the condition that some output classes are unknown during supervised learning. To address this challenging task, GZSL leverages semantic information of the seen (source) and unseen (target) classes to bridge the gap between both seen and unseen classes. Since its introduction, many GZSL models have been… ▽ More Generalized zero-shot learning (GZSL) aims to train a model for classifying data samples under the condition that some output classes are unknown during supervised learning. To address this challenging task, GZSL leverages semantic information of the seen (source) and unseen (target) classes to bridge the gap between both seen and unseen classes. Since its introduction, many GZSL models have been formulated. In this review paper, we present a comprehensive review on GZSL. Firstly, we provide an overview of GZSL including the problems and challenges. Then, we introduce a hierarchical categorization for the GZSL methods and discuss the representative methods in each category. In addition, we discuss the available benchmark data sets and applications of GZSL, along with a discussion on the research gaps and directions for future investigations. △ Less

Submitted 12 July, 2022; v1 submitted 17 November, 2020; originally announced November 2020.

Comments: 26 pages, 12 figures

arXiv:2011.05700 [pdf, other]

doi 10.1007/s10462-022-10214-4

A Review of the Family of Artificial Fish Swarm Algorithms: Recent Advances and Applications

Authors: Farhad Pourpanah, Ran Wang, Chee Peng Lim, Xi-Zhao Wang, Danial Yazdani

Abstract: The Artificial Fish Swarm Algorithm (AFSA) is inspired by the ecological behaviors of fish schooling in nature, viz., the preying, swarming and following behaviors. Owing to a number of salient properties, which include flexibility, fast convergence, and insensitivity to the initial parameter settings, the family of AFSA has emerged as an effective Swarm Intelligence (SI) methodology that has been… ▽ More The Artificial Fish Swarm Algorithm (AFSA) is inspired by the ecological behaviors of fish schooling in nature, viz., the preying, swarming and following behaviors. Owing to a number of salient properties, which include flexibility, fast convergence, and insensitivity to the initial parameter settings, the family of AFSA has emerged as an effective Swarm Intelligence (SI) methodology that has been widely applied to solve real-world optimization problems. Since its introduction in 2002, many improved and hybrid AFSA models have been developed to tackle continuous, binary, and combinatorial optimization problems. This paper aims to present a concise review of the continuous AFSA, encompassing the original ASFA, its improvements and hybrid models, as well as their associated applications. We focus on articles published in high-quality journals since 2013. Our review provides insights into AFSA parameters modifications, procedures and sub-functions. The main reasons for these enhancements and the comparison results with other hybrid methods are discussed. In addition, hybrid, multi-objective and dynamic AFSA models that have been proposed to solve continuous optimization problems are elucidated. We also analyse possible AFSA enhancements and highlight future research directions for advancing AFSA-based models. △ Less

Submitted 12 May, 2022; v1 submitted 11 November, 2020; originally announced November 2020.

Comments: 40 pages, 8 figures

arXiv:1807.05380 [pdf, other]

3D Hand Pose Estimation using Simulation and Partial-Supervision with a Shared Latent Space

Authors: Masoud Abdi, Ehsan Abbasnejad, Chee Peng Lim, Saeid Nahavandi

Abstract: Tremendous amounts of expensive annotated data are a vital ingredient for state-of-the-art 3d hand pose estimation. Therefore, synthetic data has been popularized as annotations are automatically available. However, models trained only with synthetic samples do not generalize to real data, mainly due to the gap between the distribution of synthetic and real data. In this paper, we propose a novel… ▽ More Tremendous amounts of expensive annotated data are a vital ingredient for state-of-the-art 3d hand pose estimation. Therefore, synthetic data has been popularized as annotations are automatically available. However, models trained only with synthetic samples do not generalize to real data, mainly due to the gap between the distribution of synthetic and real data. In this paper, we propose a novel method that seeks to predict the 3d position of the hand using both synthetic and partially-labeled real data. Accordingly, we form a shared latent space between three modalities: synthetic depth image, real depth image, and pose. We demonstrate that by carefully learning the shared latent space, we can find a regression model that is able to generalize to real data. As such, we show that our method produces accurate predictions in both semi-supervised and unsupervised settings. Additionally, the proposed model is capable of generating novel, meaningful, and consistent samples from all of the three domains. We evaluate our method qualitatively and quantitively on two highly competitive benchmarks (i.e., NYU and ICVL) and demonstrate its superiority over the state-of-the-art methods. The source code will be made available at https://github.com/masabdi/LSPS. △ Less

Submitted 14 July, 2018; originally announced July 2018.

Comments: Oral presentation at British Machine Vision Conference (BMVC) 2018

arXiv:1803.08067 [pdf]

doi 10.1109/JSYST.2019.2918283

A Review of Situation Awareness Assessment Approaches in Aviation Environments

Authors: Thanh Nguyen, Chee Peng Lim, Ngoc Duy Nguyen, Lee Gordon-Brown, Saeid Nahavandi

Abstract: Situation awareness (SA) is an important constituent in human information processing and essential in pilots' decision-making processes. Acquiring and maintaining appropriate levels of SA is critical in aviation environments as it affects all decisions and actions taking place in flights and air traffic control. This paper provides an overview of recent measurement models and approaches to establi… ▽ More Situation awareness (SA) is an important constituent in human information processing and essential in pilots' decision-making processes. Acquiring and maintaining appropriate levels of SA is critical in aviation environments as it affects all decisions and actions taking place in flights and air traffic control. This paper provides an overview of recent measurement models and approaches to establishing and enhancing SA in aviation environments. Many aspects of SA are examined including the classification of SA techniques into six categories, and different theoretical SA models from individual, to shared or team, and to distributed or system levels. Quantitative and qualitative perspectives pertaining to SA methods and issues of SA for unmanned vehicles are also addressed. Furthermore, future research directions regarding SA assessment approaches are raised to deal with shortcomings of the existing state-of-the-art methods in the literature. △ Less

Submitted 7 June, 2019; v1 submitted 6 March, 2018; originally announced March 2018.

Comments: IEEE Systems Journal, https://ieeexplore.ieee.org/document/8732669

arXiv:1803.02965 [pdf]

doi 10.1016/j.engappai.2020.103915

A Multi-Objective Deep Reinforcement Learning Framework

Authors: Thanh Thi Nguyen, Ngoc Duy Nguyen, Peter Vamplew, Saeid Nahavandi, Richard Dazeley, Chee Peng Lim

Abstract: This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We develop a high-performance MODRL framework that supports both single-policy and multi-policy strategies, as well as both linear and non-linear approaches to action selection. The experimental results on two benchmark problems (two-objective deep sea treasure environment a… ▽ More This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We develop a high-performance MODRL framework that supports both single-policy and multi-policy strategies, as well as both linear and non-linear approaches to action selection. The experimental results on two benchmark problems (two-objective deep sea treasure environment and three-objective Mountain Car problem) indicate that the proposed framework is able to find the Pareto-optimal solutions effectively. The proposed framework is generic and highly modularized, which allows the integration of different deep reinforcement learning algorithms in different complex problem domains. This therefore overcomes many disadvantages involved with standard multi-objective reinforcement learning methods in the current literature. The proposed framework acts as a testbed platform that accelerates the development of MODRL for solving increasingly complicated multi-objective problems. △ Less

Submitted 19 June, 2020; v1 submitted 7 March, 2018; originally announced March 2018.

Comments: 21 pages

Report number: Volume 96, November 2020, 103915

Journal ref: Engineering Applications of Artificial Intelligence, 2020

Showing 1–12 of 12 results for author: Lim, C P