-
Methods for Class-Imbalanced Learning with Support Vector Machines: A Review and an Empirical Evaluation
Authors:
Salim Rezvani,
Farhad Pourpanah,
Chee Peng Lim,
Q. M. Jonathan Wu
Abstract:
This paper presents a review of methods for class-imbalanced learning with the Support Vector Machine (SVM) and its variants. We first explain the structure of SVM and its variants and discuss their inefficiency in learning with class-imbalanced data sets. We introduce a hierarchical categorization of SVM-based models with respect to class-imbalanced learning. Specifically, we categorize SVM-based models into re-sampling, algorithmic, and fusion methods, and discuss the principles of the representative models in each category. In addition, we conduct a series of empirical evaluations to compare the performances of various representative SVM-based models in each category using benchmark imbalanced data sets, with imbalance ratios ranging from low to high. Our findings reveal that while algorithmic methods are less time-consuming owing to no data pre-processing requirements, fusion methods, which combine both re-sampling and algorithmic approaches, generally perform the best, but with a higher computational load. A discussion on research gaps and future research directions is provided.
Submitted 11 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
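The abstract above groups SVM-based models into re-sampling, algorithmic, and fusion methods. As a rough illustration of the algorithmic family only, the sketch below weights the hinge loss of a linear SVM by inverse class frequency. The weighting rule n / (k * n_c), the function names, and the toy data are assumptions chosen for illustration; they are not taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * n_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_hinge_loss(w, b, X, y, class_weight):
    """Mean class-weighted hinge loss of a linear SVM; labels are +1 / -1."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Signed margin y * (w . x + b) of one sample.
        margin = y_i * (sum(wj * xj for wj, xj in zip(w, x_i)) + b)
        total += class_weight[y_i] * max(0.0, 1.0 - margin)
    return total / len(y)


# Hypothetical toy data: four majority (-1) samples and one minority (+1) sample.
X = [[0.0], [0.2], [0.4], [0.6], [2.0]]
y = [-1, -1, -1, -1, 1]
weights = balanced_class_weights(y)
loss = weighted_hinge_loss([1.0], -1.0, X, y, weights)
```

With this rule, an error on the single minority sample costs four times as much as an error on a majority sample, and no re-sampling of the data is needed.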
-
Generalized Laplace Approximation
Authors:
Yinsong Chen,
Samson S. Yu,
Zhong Li,
Chee Peng Lim
Abstract:
In recent years, the inconsistency in Bayesian deep learning has garnered increasing attention. Tempered or generalized posterior distributions often offer a direct and effective solution to this issue. However, understanding the underlying causes and evaluating the effectiveness of generalized posteriors remain active areas of research. In this study, we introduce a unified theoretical framework to attribute Bayesian inconsistency to model misspecification and inadequate priors. We interpret the generalization of the posterior with a temperature factor as a correction for misspecified models through adjustments to the joint probability model, and the recalibration of priors by redistributing probability mass on models within the hypothesis space using data samples. Additionally, we highlight a distinctive feature of Laplace approximation, which ensures that the generalized normalizing constant can be treated as invariant, unlike the typical scenario in general Bayesian learning where this constant varies with model parameters post-generalization. Building on this insight, we propose the generalized Laplace approximation, which involves a simple adjustment to the computation of the Hessian matrix of the regularized loss function. This method offers a flexible and scalable framework for obtaining high-quality posterior distributions. We assess the performance and properties of the generalized Laplace approximation on state-of-the-art neural networks and real-world datasets.
Submitted 24 May, 2024; v1 submitted 22 May, 2024;
originally announced May 2024.
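To make the idea of a tempered posterior under a Laplace approximation concrete, here is a minimal one-dimensional sketch. It assumes a Gaussian likelihood with known noise variance, a zero-mean Gaussian prior, and tempering applied to the likelihood only (likelihood raised to the power 1/T). These choices and the function name are illustrative assumptions; the paper's specific Hessian adjustment is not reproduced here.

```python
def tempered_laplace_gaussian_mean(data, noise_var, prior_var, temperature):
    """Laplace approximation of a tempered posterior over a Gaussian mean.

    Assumed form: p_T(theta | D) is proportional to
    p(D | theta)^(1 / T) * N(theta; 0, prior_var).
    Returns the mode and the variance (inverse Hessian) of the approximation.
    """
    n = len(data)
    # Hessian of the tempered negative log joint (constant in theta here).
    hessian = n / (temperature * noise_var) + 1.0 / prior_var
    # Mode of the tempered negative log joint.
    mode = (sum(data) / (temperature * noise_var)) / hessian
    return mode, 1.0 / hessian


# Hypothetical observations.
data = [1.0, 2.0, 3.0]
mode, var = tempered_laplace_gaussian_mean(data, 1.0, 1.0, temperature=1.0)
cold_mode, cold_var = tempered_laplace_gaussian_mean(data, 1.0, 1.0, temperature=0.5)
```

In this toy model, lowering the temperature below 1 increases the curvature contributed by the data, so the approximate posterior becomes narrower and its mode moves toward the sample mean.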
-
HSViT: Horizontally Scalable Vision Transformer
Authors:
Chenhao Xu,
Chang-Tsun Li,
Chee Peng Lim,
Douglas Creighton
Abstract:
While the Vision Transformer (ViT) architecture gains prominence in computer vision and attracts significant attention from multimedia communities, its deficiency in prior knowledge (inductive bias) regarding shift, scale, and rotational invariance necessitates pre-training on large-scale datasets. Furthermore, the growing layers and parameters in both ViT and convolutional neural networks (CNNs) impede their applicability to mobile multimedia services, primarily owing to the constrained computational resources on edge devices. To mitigate the aforementioned challenges, this paper introduces a novel horizontally scalable vision transformer (HSViT). Specifically, a novel image-level feature embedding allows ViT to better leverage the inductive bias inherent in the convolutional layers. Based on this, an innovative horizontally scalable architecture is designed, which reduces the number of layers and parameters of the models while facilitating collaborative training and inference of ViT models across multiple nodes. The experimental results show that, without pre-training on large-scale datasets, HSViT achieves up to 10% higher top-1 accuracy than state-of-the-art schemes, confirming its superior preservation of inductive bias. The code is available at https://github.com/xuchenhao001/HSViT.
Submitted 8 April, 2024;
originally announced April 2024.
-
Current and future roles of artificial intelligence in retinopathy of prematurity
Authors:
Ali Jafarizadeh,
Shadi Farabi Maleki,
Parnia Pouya,
Navid Sobhi,
Mirsaeed Abdollahi,
Siamak Pedrammehr,
Chee Peng Lim,
Houshyar Asadi,
Roohallah Alizadehsani,
Ru-San Tan,
Sheikh Mohammad Shariful Islam,
U. Rajendra Acharya
Abstract:
Retinopathy of prematurity (ROP) is a severe condition affecting premature infants, leading to abnormal retinal blood vessel growth, retinal detachment, and potential blindness. While semi-automated systems have been used in the past to diagnose ROP-related plus disease by quantifying retinal vessel features, traditional machine learning (ML) models face challenges like accuracy and overfitting. Recent advancements in deep learning (DL), especially convolutional neural networks (CNNs), have significantly improved ROP detection and classification. The i-ROP deep learning (i-ROP-DL) system also shows promise in detecting plus disease, offering reliable ROP diagnosis potential. This research comprehensively examines the contemporary progress and challenges associated with using retinal imaging and artificial intelligence (AI) to detect ROP, offering valuable insights that can guide further investigation in this domain. Based on 89 original studies in this field (out of 1487 studies that were comprehensively reviewed), we concluded that traditional methods for ROP diagnosis suffer from subjectivity and manual analysis, leading to inconsistent clinical decisions. AI holds great promise for improving ROP management. This review explores AI's potential in ROP detection, classification, diagnosis, and prognosis.
Submitted 15 February, 2024;
originally announced February 2024.
-
Deep Learning Techniques for Video Instance Segmentation: A Survey
Authors:
Chenhao Xu,
Chang-Tsun Li,
Yongjian Hu,
Chee Peng Lim,
Douglas Creighton
Abstract:
Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By tackling the video instance segmentation tasks through effective analysis and utilization of visual information in videos, a range of computer vision-enabled applications (e.g., human action recognition, medical image processing, autonomous vehicle navigation, and surveillance) can be implemented. As deep-learning techniques take a dominant role in various computer vision areas, a plethora of deep-learning-based video instance segmentation schemes have been proposed. This survey offers a multifaceted view of deep-learning schemes for video instance segmentation, covering various architectural paradigms, along with comparisons of functional performance, model complexity, and computational overheads. In addition to the common architectural designs, auxiliary techniques for improving the performance of deep-learning models for video instance segmentation are compiled and discussed. Finally, we discuss a range of major challenges and directions for further investigations to help advance this promising research field.
Submitted 18 October, 2023;
originally announced October 2023.
-
Machine Learning Meets Advanced Robotic Manipulation
Authors:
Saeid Nahavandi,
Roohallah Alizadehsani,
Darius Nahavandi,
Chee Peng Lim,
Kevin Kelly,
Fernando Bello
Abstract:
Automated industries lead to high-quality production, lower manufacturing cost, and better utilization of human resources. Robotic manipulator arms play a major role in the automation process. However, for complex manipulation tasks, hard-coding efficient and safe trajectories is challenging and time-consuming. Machine learning (ML) methods have the potential to learn such controllers based on expert demonstrations. Despite promising advances, better approaches must be developed to improve the safety, reliability, and efficiency of ML methods in both training and deployment phases. This survey aims to review cutting-edge technologies and recent trends in ML methods applied to real-world manipulation tasks. After reviewing the related background on ML, the rest of the paper is devoted to ML applications in different domains such as industry, healthcare, agriculture, space, military, and search and rescue. The paper closes with important research directions for future work.
Submitted 21 September, 2023;
originally announced September 2023.
-
An Ensemble Semi-Supervised Adaptive Resonance Theory Model with Explanation Capability for Pattern Classification
Authors:
Farhad Pourpanah,
Chee Peng Lim,
Ali Etemad,
Q. M. Jonathan Wu
Abstract:
Most semi-supervised learning (SSL) models entail complex structures and iterative training processes as well as face difficulties in interpreting their predictions to users. To address these issues, this paper proposes a new interpretable SSL model using the supervised and unsupervised Adaptive Resonance Theory (ART) family of networks, which is denoted as SSL-ART. Firstly, SSL-ART adopts an unsupervised fuzzy ART network to create a number of prototype nodes using unlabeled samples. Then, it leverages a supervised fuzzy ARTMAP structure to map the established prototype nodes to the target classes using labeled samples. Specifically, a one-to-many (OtM) mapping scheme is devised to associate a prototype node with more than one class label. The main advantages of SSL-ART include the capability of: (i) performing online learning, (ii) reducing the number of redundant prototype nodes through the OtM mapping scheme and minimizing the effects of noisy samples, and (iii) providing an explanation facility for users to interpret the predicted outcomes. In addition, a weighted voting strategy is introduced to form an ensemble SSL-ART model, which is denoted as WESSL-ART. Every ensemble member, i.e., SSL-ART, assigns a different weight to each class based on its performance pertaining to the corresponding class. The aim is to mitigate the effects of training data sequences on all SSL-ART members and improve the overall performance of WESSL-ART. The experimental results on eighteen benchmark data sets, three artificially generated data sets, and a real-world case study indicate the benefits of the proposed SSL-ART and WESSL-ART models for tackling pattern classification problems.
Submitted 19 May, 2023;
originally announced May 2023.
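The weighted voting strategy described in the abstract above, where each ensemble member carries a separate weight per class, can be sketched roughly as follows. The function name, the weight values, and the tie-breaking behaviour are hypothetical; how WESSL-ART actually derives its weights from per-class performance is not specified here.

```python
def weighted_vote(predictions, class_weights):
    """Combine member predictions using per-member, per-class weights.

    predictions[m] is the label predicted by member m.
    class_weights[m][c] is the weight member m assigns to class c
    (assumed here to reflect its past performance on that class).
    """
    scores = {}
    for pred, weights in zip(predictions, class_weights):
        # A member's vote counts with its weight for the class it predicted.
        scores[pred] = scores.get(pred, 0.0) + weights[pred]
    # Ties resolve to the label that received a vote first.
    return max(scores, key=scores.get)


# Hypothetical ensemble of three members and two classes.
predictions = ["A", "B", "B"]
class_weights = [
    {"A": 0.9, "B": 0.4},
    {"A": 0.5, "B": 0.3},
    {"A": 0.6, "B": 0.5},
]
winner = weighted_vote(predictions, class_weights)
```

In this toy case a plain majority vote would choose "B", whereas the weighted vote favours the single member that is assumed to be reliable on class "A".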
-
A Review of Generalized Zero-Shot Learning Methods
Authors:
Farhad Pourpanah,
Moloud Abdar,
Yuxuan Luo,
Xinlei Zhou,
Ran Wang,
Chee Peng Lim,
Xi-Zhao Wang,
Q. M. Jonathan Wu
Abstract:
Generalized zero-shot learning (GZSL) aims to train a model for classifying data samples under the condition that some output classes are unknown during supervised learning. To address this challenging task, GZSL leverages semantic information of the seen (source) and unseen (target) classes to bridge the gap between both seen and unseen classes. Since its introduction, many GZSL models have been formulated. In this review paper, we present a comprehensive review on GZSL. Firstly, we provide an overview of GZSL including the problems and challenges. Then, we introduce a hierarchical categorization for the GZSL methods and discuss the representative methods in each category. In addition, we discuss the available benchmark data sets and applications of GZSL, along with a discussion on the research gaps and directions for future investigations.
Submitted 12 July, 2022; v1 submitted 17 November, 2020;
originally announced November 2020.
-
A Review of the Family of Artificial Fish Swarm Algorithms: Recent Advances and Applications
Authors:
Farhad Pourpanah,
Ran Wang,
Chee Peng Lim,
Xi-Zhao Wang,
Danial Yazdani
Abstract:
The Artificial Fish Swarm Algorithm (AFSA) is inspired by the ecological behaviors of fish schooling in nature, viz., the preying, swarming and following behaviors. Owing to a number of salient properties, which include flexibility, fast convergence, and insensitivity to the initial parameter settings, the family of AFSA has emerged as an effective Swarm Intelligence (SI) methodology that has been widely applied to solve real-world optimization problems. Since its introduction in 2002, many improved and hybrid AFSA models have been developed to tackle continuous, binary, and combinatorial optimization problems. This paper aims to present a concise review of the continuous AFSA, encompassing the original AFSA, its improvements and hybrid models, as well as their associated applications. We focus on articles published in high-quality journals since 2013. Our review provides insights into AFSA parameter modifications, procedures and sub-functions. The main reasons for these enhancements and the comparison results with other hybrid methods are discussed. In addition, hybrid, multi-objective and dynamic AFSA models that have been proposed to solve continuous optimization problems are elucidated. We also analyse possible AFSA enhancements and highlight future research directions for advancing AFSA-based models.
Submitted 12 May, 2022; v1 submitted 11 November, 2020;
originally announced November 2020.
-
3D Hand Pose Estimation using Simulation and Partial-Supervision with a Shared Latent Space
Authors:
Masoud Abdi,
Ehsan Abbasnejad,
Chee Peng Lim,
Saeid Nahavandi
Abstract:
Tremendous amounts of expensive annotated data are a vital ingredient for state-of-the-art 3D hand pose estimation. Therefore, synthetic data has been popularized as annotations are automatically available. However, models trained only with synthetic samples do not generalize to real data, mainly due to the gap between the distribution of synthetic and real data. In this paper, we propose a novel method that seeks to predict the 3D position of the hand using both synthetic and partially-labeled real data. Accordingly, we form a shared latent space between three modalities: synthetic depth image, real depth image, and pose. We demonstrate that by carefully learning the shared latent space, we can find a regression model that is able to generalize to real data. As such, we show that our method produces accurate predictions in both semi-supervised and unsupervised settings. Additionally, the proposed model is capable of generating novel, meaningful, and consistent samples from all of the three domains. We evaluate our method qualitatively and quantitatively on two highly competitive benchmarks (i.e., NYU and ICVL) and demonstrate its superiority over the state-of-the-art methods. The source code will be made available at https://github.com/masabdi/LSPS.
Submitted 14 July, 2018;
originally announced July 2018.
-
A Review of Situation Awareness Assessment Approaches in Aviation Environments
Authors:
Thanh Nguyen,
Chee Peng Lim,
Ngoc Duy Nguyen,
Lee Gordon-Brown,
Saeid Nahavandi
Abstract:
Situation awareness (SA) is an important constituent in human information processing and essential in pilots' decision-making processes. Acquiring and maintaining appropriate levels of SA is critical in aviation environments as it affects all decisions and actions taking place in flights and air traffic control. This paper provides an overview of recent measurement models and approaches to establishing and enhancing SA in aviation environments. Many aspects of SA are examined including the classification of SA techniques into six categories, and different theoretical SA models from individual, to shared or team, and to distributed or system levels. Quantitative and qualitative perspectives pertaining to SA methods and issues of SA for unmanned vehicles are also addressed. Furthermore, future research directions regarding SA assessment approaches are raised to deal with shortcomings of the existing state-of-the-art methods in the literature.
Submitted 7 June, 2019; v1 submitted 6 March, 2018;
originally announced March 2018.
-
A Multi-Objective Deep Reinforcement Learning Framework
Authors:
Thanh Thi Nguyen,
Ngoc Duy Nguyen,
Peter Vamplew,
Saeid Nahavandi,
Richard Dazeley,
Chee Peng Lim
Abstract:
This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We develop a high-performance MODRL framework that supports both single-policy and multi-policy strategies, as well as both linear and non-linear approaches to action selection. The experimental results on two benchmark problems (two-objective deep sea treasure environment and three-objective Mountain Car problem) indicate that the proposed framework is able to find the Pareto-optimal solutions effectively. The proposed framework is generic and highly modularized, which allows the integration of different deep reinforcement learning algorithms in different complex problem domains. This therefore overcomes many disadvantages involved with standard multi-objective reinforcement learning methods in the current literature. The proposed framework acts as a testbed platform that accelerates the development of MODRL for solving increasingly complicated multi-objective problems.
Submitted 19 June, 2020; v1 submitted 7 March, 2018;
originally announced March 2018.
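The abstract above mentions linear approaches to action selection in multi-objective reinforcement learning. A minimal sketch of linear scalarization is given below, assuming each action has a vector of Q-values (one per objective) and a fixed preference weight per objective. The function name and all numbers are hypothetical and not taken from the paper's framework.

```python
def linear_scalarized_action(q_values, weights):
    """Pick the greedy action under a weighted sum of per-objective Q-values.

    q_values[a] is the list of Q-values of action a, one entry per objective.
    weights holds one preference weight per objective.
    """
    scalarized = [
        sum(w * q for w, q in zip(weights, q_vec)) for q_vec in q_values
    ]
    # Index of the action with the largest scalarized value.
    return max(range(len(scalarized)), key=scalarized.__getitem__)


# Hypothetical two-objective problem: (reward collected, time penalty).
q_values = [[1.0, -5.0], [3.0, -9.0], [2.0, -6.0]]
reward_focused = linear_scalarized_action(q_values, (0.8, 0.2))
time_focused = linear_scalarized_action(q_values, (0.2, 0.8))
```

Changing the weight vector changes which action is greedy, which is how a single set of vector-valued Q-estimates can serve different preferences. Linear weighting can only reach solutions on the convex part of the Pareto front, which is one reason non-linear selection schemes are also used.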