Single Domain Generalization via Normalised Cross-correlation Based Convolutions

Authors: WeiQin Chuah, Ruwan Tennakoon, Reza Hoseinnezhad, David Suter, Alireza Bab-Hadiashar

Abstract: Deep learning techniques often perform poorly in the presence of domain shift, where the test data follows a different distribution than the training data. The most practically desirable approach to address this issue is Single Domain Generalization (S-DG), which aims to train robust models using data from a single source. Prior work on S-DG has primarily focused on using data augmentation techniques to generate diverse training data. In this paper, we explore an alternative approach by investigating the robustness of linear operators, such as convolution and dense layers commonly used in deep learning. We propose a novel operator called XCNorm that computes the normalized cross-correlation between weights and an input feature patch. This approach is invariant to both affine shifts and changes in energy within a local feature patch and eliminates the need for commonly used non-linear activation functions. We show that deep neural networks composed of this operator are robust to common semantic distribution shifts. Furthermore, our empirical results on single-domain generalization benchmarks demonstrate that our proposed technique performs comparably to the state-of-the-art methods.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 10 pages, 6 figures
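
The abstract above describes XCNorm as the normalized cross-correlation between weights and an input feature patch. The following is a minimal sketch of that idea on flattened vectors, not the authors' implementation: the function name `xcnorm`, the choice to mean-centre both the patch and the weights, and the `eps` stabiliser are assumptions made for illustration. It checks numerically that the response is unchanged when the patch is rescaled and offset.

```python
import math

def xcnorm(patch, weights, eps=1e-12):
    """Normalised cross-correlation of a flattened patch with a weight vector.

    Assumption: both inputs are mean-centred and divided by their L2 norms;
    the paper's exact formulation may differ.
    """
    n = len(patch)
    mean_p = sum(patch) / n
    mean_w = sum(weights) / n
    p = [v - mean_p for v in patch]      # remove the patch offset
    w = [v - mean_w for v in weights]    # remove the weight offset
    dot = sum(a * b for a, b in zip(p, w))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_w = math.sqrt(sum(b * b for b in w))
    return dot / (norm_p * norm_w + eps)  # eps guards against flat patches

# Illustrative values only (not from the paper).
patch = [0.2, 0.5, 0.1, 0.9]
weights = [1.0, -0.5, 0.3, 2.0]

# Apply an affine change (gain 3, offset 5) to the patch.
shifted = [3.0 * v + 5.0 for v in patch]

response = xcnorm(patch, weights)
response_shifted = xcnorm(shifted, weights)
```

Because the output is a correlation coefficient, it is bounded in [-1, 1], which is one plausible reading of why the abstract says a separate non-linear activation is not needed.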

arXiv:2210.12947 [pdf, other]

IT-RUDA: Information Theory Assisted Robust Unsupervised Domain Adaptation

Authors: Shima Rashidi, Ruwan Tennakoon, Aref Miri Rekavandi, Papangkorn Jessadatavornwong, Amanda Freis, Garret Huff, Mark Easton, Adrian Mouritz, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Abstract: Distribution shift between train (source) and test (target) datasets is a common problem encountered in machine learning applications. One approach to resolve this issue is to use the Unsupervised Domain Adaptation (UDA) technique that carries out knowledge transfer from a label-rich source domain to an unlabeled target domain. Outliers that exist in either source or target datasets can introduce additional challenges when using UDA in practice. In this paper, $α$-divergence is used as a measure to minimize the discrepancy between the source and target distributions while inheriting robustness, adjustable with a single parameter $α$, as the prominent feature of this measure. Here, it is shown that the other well-known divergence-based UDA techniques can be derived as special cases of the proposed method. Furthermore, a theoretical upper bound is derived for the loss in the target domain in terms of the source loss and the initial $α$-divergence between the two domains. The robustness of the proposed method is validated through testing on several benchmarked datasets in open-set and partial UDA setups where extra classes existing in target and source datasets are considered as outliers.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2204.08655 [pdf, other]

Interaction-Aware Labeled Multi-Bernoulli Filter

Authors: Nida Ishtiaq, Amirali Khodadadian Gostar, Alireza Bab-Hadiashar, Reza Hoseinnezhad

Abstract: Tracking multiple objects through time is an important part of an intelligent transportation system. Random finite set (RFS)-based filters are one of the emerging techniques for tracking multiple objects. In multi-object tracking (MOT), a common assumption is that each object moves independently of its surroundings. But in many real-world applications, target objects interact with one another and the environment. Such interactions, when considered for tracking, are usually modeled by an interactive motion model which is application specific. In this paper, we present a novel approach to incorporate target interactions within the prediction step of an RFS-based multi-target filter, i.e. the labeled multi-Bernoulli (LMB) filter. The method has been developed for two practical applications: tracking a coordinated swarm and tracking vehicles. The method has been tested on a complex vehicle tracking dataset and compared with the LMB filter through the OSPA and OSPA$^{(2)}$ metrics. The results demonstrate that the proposed interaction-aware method delivers a considerable performance improvement over the LMB filter in terms of the selected metrics.

Submitted 19 April, 2022; originally announced April 2022.

Comments: 13 pages including references, 9 figures, submitted and undergoing second round of review with IEEE Transactions on Intelligent Transportation Systems (ITS)

arXiv:2201.02263 [pdf, other]

ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks

Authors: WeiQin Chuah, Ruwan Tennakoon, Reza Hoseinnezhad, Alireza Bab-Hadiashar, David Suter

Abstract: State-of-the-art stereo matching networks trained only on synthetic data often fail to generalize to more challenging real data domains. In this paper, we attempt to uncover, through the lens of shortcut learning, an important factor that hinders the networks from generalizing across domains. We demonstrate that the learning of feature representations in stereo matching networks is heavily influenced by synthetic data artefacts (shortcut attributes). To mitigate this issue, we propose an Information-Theoretic Shortcut Avoidance (ITSA) approach to automatically restrict shortcut-related information from being encoded into the feature representations. As a result, our proposed method learns robust and shortcut-invariant features by minimizing the sensitivity of latent features to input variations. To avoid the prohibitive computational cost of direct input sensitivity optimization, we propose an effective yet feasible algorithm to achieve robustness. We show that using this method, state-of-the-art stereo matching networks that are trained purely on synthetic data can effectively generalize to challenging and previously unseen real data scenarios. Importantly, the proposed method enhances the robustness of the synthetic trained networks to the point that they outperform their fine-tuned counterparts (on real data) for challenging out-of-domain stereo datasets.

Submitted 3 March, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

Comments: 11 pages, 4 figures. Accepted by CVPR2022

arXiv:2108.12159 [pdf, other]

Anomaly Detection of Defect using Energy of Point Pattern Features within Random Finite Set Framework

Authors: Ammar Mansoor Kamoona, Amirali Khodadadian Gostar, Alireza Bab-Hadiashar, Reza Hoseinnezhad

Abstract: In this paper, we propose an efficient approach for industrial defect detection that is modeled based on anomaly detection using point pattern data. Most recent works use global features for feature extraction to summarize image content. However, global features are not robust against lighting and viewpoint changes and do not describe the image's geometrical information in a way that can be fully utilized in the manufacturing industry. To the best of our knowledge, we are the first to propose using transfer learning of local/point pattern features to overcome these limitations and capture geometrical information of the image regions. We model these local/point pattern features as a random finite set (RFS). In addition, we propose RFS energy, in contrast to RFS likelihood, as the anomaly score. The similarity distribution of point pattern features of the normal sample has been modeled as a multivariate Gaussian. Parameter learning of the proposed RFS energy does not require any heavy computation. We evaluate the proposed approach on the MVTec AD dataset, a multi-object defect detection dataset. Experimental results show the outstanding performance of our proposed approach compared to the state-of-the-art methods, and the proposed RFS energy outperforms the state-of-the-art in the few shot learning settings.

Submitted 27 August, 2021; originally announced August 2021.

Comments: to be submitted to TII journal, 17 pages

arXiv:2106.10850 [pdf, other]

Robust Pooling through the Data Mode

Authors: Ayman Mukhaimar, Ruwan Tennakoon, Chow Yin Lai, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Abstract: The task of learning from point cloud data is always challenging due to the frequent occurrence of noise and outliers in the data. Such data inaccuracies can significantly influence the performance of state-of-the-art deep learning networks and their ability to classify or segment objects. While there are some robust deep learning approaches, they are computationally too expensive for real-time applications. This paper proposes a deep learning solution that includes a novel robust pooling layer which greatly enhances network robustness and performs significantly faster than state-of-the-art approaches. The proposed pooling layer looks for a data mode/cluster using two methods, RANSAC and histogram, as clusters are indicative of models. We tested the pooling layer in frameworks such as point-based and graph-based neural networks, and the tests showed enhanced robustness as compared to robust state-of-the-art methods.

Submitted 21 June, 2021; originally announced June 2021.

Comments: under consideration at Computer Vision and Image Understanding
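
The abstract above mentions a pooling layer that looks for the mode of the data using a histogram. As a rough illustration of why pooling through the mode can resist outliers where max pooling does not, here is a minimal sketch for a single feature channel. The function name `histogram_mode_pool`, the bin count, and the synthetic values are assumptions for illustration; the listing does not give the details of the layer proposed in the paper.

```python
def histogram_mode_pool(values, bins=50):
    """Return the centre of the most populated histogram bin.

    Hypothetical sketch of mode-based pooling over one feature channel;
    not the paper's layer.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical, nothing to bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        k = min(int((v - lo) / width), bins - 1)
        counts[k] += 1
    best = counts.index(max(counts))
    return lo + (best + 0.5) * width

# 95 inlier responses clustered around 1.0 and 5 gross outliers (made-up data).
inliers = [0.9 + 0.2 * i / 94 for i in range(95)]
outliers = [100.0] * 5
feature = inliers + outliers

pooled_mode = histogram_mode_pool(feature)  # stays near the inlier cluster
pooled_max = max(feature)                   # dominated by the outliers
```

The resolution of the estimate is limited by the bin width, so in this sketch the pooled value is only as precise as the histogram allows.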

arXiv:2102.01882 [pdf]

doi 10.1109/ACCESS.2021.3130261

Evaluation of Point Pattern Features for Anomaly Detection of Defect within Random Finite Set Framework

Authors: Ammar Mansoor Kamoona, Amirali Khodadadian Gostar, Alireza Bab-Hadiashar, Reza Hoseinnezhad

Abstract: Defect detection in the manufacturing industry is of utmost importance for product quality inspection. Recently, optical defect detection has been investigated as an anomaly detection problem using different deep learning methods. However, recent works do not explore the use of point pattern features, such as SIFT, for anomaly detection using the recently developed set-based methods. In this paper, we present an evaluation of different point pattern feature detectors and descriptors for defect detection applications. The evaluation is performed within the random finite set framework. Handcrafted point pattern features, such as SIFT, as well as deep features are used in this evaluation. Random finite set-based defect detection is compared with state-of-the-art anomaly detection methods. The results show that using point pattern features, such as SIFT, as data points for random finite set-based anomaly detection achieves the most consistent defect detection accuracy on the MVTec-AD dataset.

Submitted 3 February, 2021; originally announced February 2021.

Comments: under review. 6 pages

arXiv:2009.04629 [pdf, other]

Adjusting Bias in Long Range Stereo Matching: A semantics guided approach

Authors: WeiQin Chuah, Ruwan Tennakoon, Reza Hoseinnezhad, Alireza Bab-Hadiashar, David Suter

Abstract: Stereo vision generally involves the computation of pixel correspondences and estimation of disparities between rectified image pairs. In many applications, including simultaneous localization and mapping (SLAM) and 3D object detection, the disparities are primarily needed to calculate depth values and the accuracy of depth estimation is often more compelling than disparity estimation. The accuracy of disparity estimation, however, does not directly translate to the accuracy of depth estimation, especially for faraway objects. In the context of learning-based stereo systems, this is largely due to biases imposed by the choices of the disparity-based loss function and the training data. Consequently, the learning algorithms often produce unreliable depth estimates of foreground objects, particularly at large distances ($>50$ m). To resolve this issue, we first analyze the effect of those biases and then propose a pair of novel depth-based loss functions for foreground and background, separately. These loss functions are tunable and can balance the inherent bias of the stereo learning algorithms. The efficacy of our solution is demonstrated by an extensive set of experiments, which are benchmarked against the state of the art. We show on the KITTI 2015 benchmark that our proposed solution yields substantial improvements in disparity and depth estimation, particularly for objects located at distances beyond 50 meters, outperforming the previous state of the art by $10\%$.

Submitted 9 November, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: 10 pages, 8 figures
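
The abstract above notes that disparity accuracy does not translate directly into depth accuracy for faraway objects. A short worked example makes the reason concrete, using the standard rectified-stereo relation depth = focal length × baseline / disparity. The focal length and baseline below are illustrative values chosen for the sketch, and the function names are hypothetical; none of this comes from the paper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard rectified-stereo relation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

def depth_error(depth_m, focal_px, baseline_m, disparity_error_px):
    """Depth error caused by under-estimating the true disparity by a fixed amount."""
    true_disparity = focal_px * baseline_m / depth_m
    estimated = true_disparity - disparity_error_px
    if estimated <= 0:
        raise ValueError("disparity error exceeds the true disparity")
    return abs(depth_from_disparity(estimated, focal_px, baseline_m) - depth_m)

# Illustrative camera parameters (assumed, not taken from the paper).
FOCAL_PX = 721.0
BASELINE_M = 0.54

# The same 0.1-pixel disparity error at two different ranges.
err_near = depth_error(10.0, FOCAL_PX, BASELINE_M, 0.1)
err_far = depth_error(50.0, FOCAL_PX, BASELINE_M, 0.1)
```

For a small disparity error the depth error grows roughly with the square of the distance, so under these assumed parameters the same sub-pixel error costs centimetres at 10 m but a sizeable fraction of a metre at 50 m. A loss defined on disparity therefore weights distant objects far less than a loss defined on depth.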

arXiv:2009.01369 [pdf, other]

Robust Object Classification Approach using Spherical Harmonics

Authors: Ayman Mukhaimar, Ruwan Tennakoon, Chow Yin Lai, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Abstract: In this paper, we present a robust spherical harmonics approach for the classification of point cloud-based objects. Spherical harmonics have been used for classification over the years, with several frameworks existing in the literature. These approaches use a variety of spherical harmonics based descriptors to classify objects. We first investigated these frameworks' robustness against data augmentation, such as outliers and noise, as it has not been studied before. Then we propose a spherical convolution neural network framework for robust object classification. The proposed framework uses the voxel grid of concentric spheres to learn features over the unit ball. Our proposed model learns features that are less sensitive to data augmentation due to the selected sampling strategy and the designed convolution operation. We tested our proposed model against several types of data augmentation, such as noise and outliers. Our results show that the proposed model outperforms the state-of-the-art networks in terms of robustness to data augmentation.

Submitted 2 September, 2020; originally announced September 2020.

arXiv:2007.01548 [pdf, other]

Multiple Instance-Based Video Anomaly Detection using Deep Temporal Encoding-Decoding

Authors: Ammar Mansoor Kamoona, Amirali Khodadadian Gostar, Alireza Bab-Hadiashar, Reza Hoseinnezhad

Abstract: In this paper, we propose a weakly supervised deep temporal encoding-decoding solution for anomaly detection in surveillance videos using multiple instance learning. The proposed approach uses both abnormal and normal video clips during the training phase which is developed in the multiple instance framework where we treat video as a bag and video clips as instances in the bag. Our main contribution lies in the proposed novel approach to consider temporal relations between video instances. We deal with video instances (clips) as sequential visual data rather than independent instances. We employ a deep temporal and encoder network that is designed to capture spatial-temporal evolution of video instances over time. We also propose a new loss function that is smoother than similar loss functions recently presented in the computer vision literature, and therefore enjoys faster convergence and improved tolerance to local minima during the training phase. The proposed temporal encoding-decoding approach with modified loss is benchmarked against the state-of-the-art in simulation studies. The results show that the proposed method performs similarly to or better than the state-of-the-art solutions for anomaly detection in video surveillance applications.

Submitted 5 January, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

Comments: The paper is under review

arXiv:1705.09437 [pdf, other]

doi 10.1109/TIP.2018.2834821

Effective Sampling: Fast Segmentation Using Robust Geometric Model Fitting

Authors: Ruwan Tennakoon, Alireza Sadri, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Abstract: Identifying the underlying models in a set of data points contaminated by noise and outliers leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by the projection of higher order affinities between data points into a graph, which can then be clustered using spectral clustering. Calculating all possible higher order affinities is computationally expensive. Hence in most cases only a subset is used. In this paper, we propose an effective sampling method to obtain a highly accurate approximation of the full graph required to solve multi-structural model fitting problems in computer vision. The proposed method is based on the observation that the usefulness of a graph for segmentation improves as the distribution of hypotheses (used to build the graph) approaches the distribution of actual parameters for the given data. In this paper, we approximate this actual parameter distribution using a k-th order statistics based cost function and the samples are generated using a greedy algorithm coupled with a data sub-sampling strategy. The experimental analysis shows that the proposed method is both accurate and computationally efficient compared to the state-of-the-art robust multi-model fitting techniques. The code is publicly available from https://github.com/RuwanT/model-fitting-cbs.

Submitted 26 May, 2017; originally announced May 2017.

arXiv:1604.05966

Labeled Multi-Bernoulli Tracking for Industrial Mobile Platform Safety

Authors: Tharindu Rathnayake, Reza Hoseinnezhad, Ruwan Tennakoon, Alireza Bab-Hadiashar

Abstract: This paper presents a track-before-detect labeled multi-Bernoulli filter tailored for industrial mobile platform safety applications. We derive two application-specific separable likelihood functions that capture the geometric shape and colour information of the human targets who are wearing a high-visibility vest. These likelihoods are then used in a labeled multi-Bernoulli filter with a novel two-step Bayesian update. Preliminary simulation results show that the proposed solution can successfully track human workers wearing a luminous yellow colour vest in an industrial environment.

Submitted 10 May, 2016; v1 submitted 20 April, 2016; originally announced April 2016.

Comments: The conference to which this paper was submitted has rejected it. Thus, we are in the process of enhancing the content of the paper and will submit it to another conference/journal

arXiv:1503.07276 [pdf, other]

Multi-Bernoulli Sensor-Control via Minimization of Expected Estimation Errors

Authors: Amirali K. Gostar, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Abstract: This paper presents a sensor-control method for choosing the best next state of the sensor(s) that provide(s) accurate estimation results in a multi-target tracking application. The proposed solution is formulated for a multi-Bernoulli filter and works via minimization of a new estimation error-based cost function. Simulation results demonstrate that the proposed method can outperform the state-of-the-art methods in terms of computation time and robustness to clutter while delivering similar accuracy.

Submitted 25 March, 2015; originally announced March 2015.

arXiv:1502.01066 [pdf, other]

Information theoretic approach to robust multi-Bernoulli sensor control

Authors: Amirali K. Gostar, Reza Hoseinnezhad, Alireza Bab-Hadiashar

Abstract: A novel sensor control solution is presented, formulated within a Multi-Bernoulli-based multi-target tracking framework. The proposed method is especially designed for the general multi-target tracking case, where no prior knowledge of the clutter distribution or the probability of detection profile is available. In an information theoretic approach, our method makes use of Rényi divergence as the reward function to be maximized for finding the optimal sensor control command at each step. We devise a Monte Carlo sampling method for computation of the reward. Simulation results demonstrate successful performance of the proposed method in a challenging scenario involving five targets maneuvering in a relatively uncertain space with unknown distance-dependent clutter rate and probability of detection.

Submitted 3 February, 2015; originally announced February 2015.
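
The abstract above uses Rényi divergence as the reward to be maximized when choosing a sensor command. As a reference point, the sketch below evaluates the standard Rényi divergence of order α between two discrete distributions, D_α(P‖Q) = log(Σ p_i^α q_i^(1-α)) / (α - 1). It is only the textbook definition on made-up probability vectors: the function name `renyi_divergence` is hypothetical, and the paper's Monte Carlo computation over multi-Bernoulli densities is not reproduced here.

```python
import math

def renyi_divergence(p, q, alpha):
    """Rényi divergence of order alpha between discrete distributions p and q.

    Assumes alpha > 0, alpha != 1, and q[i] > 0 wherever p[i] > 0.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(
        (pi ** alpha) * (qi ** (1.0 - alpha))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with p[i] == 0 contribute nothing
    )
    return math.log(total) / (alpha - 1.0)

# Made-up distributions standing in for a predicted and an updated belief.
prior = [0.5, 0.3, 0.2]
posterior = [0.2, 0.3, 0.5]

no_change = renyi_divergence(prior, prior, alpha=0.5)          # no information gained
information_gain = renyi_divergence(posterior, prior, alpha=0.5)
```

In a sensor-control loop of the kind the abstract describes, a quantity like `information_gain` would be estimated for each candidate command and the command with the largest value selected; how that expectation is approximated is specific to the paper.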

Showing 1–14 of 14 results for author: Hoseinnezhad, R