Search | arXiv e-print repository

Stress and Adaptation: Applying Anna Karenina Principle in Deep Learning for Image Classification

Authors: Nesma Mahmoud, Hanna Antson, Jaesik Choi, Osamu Shimmi, Kallol Roy

Abstract: Image classification with deep neural networks has reached the state of the art with high accuracy. This success is attributed to good internal representation features that bypass the difficulties of non-convex optimization problems. We have little understanding of these internal representations, let alone ways to quantify them. Recent research efforts have focused on alternative theories and explanations of the generalizability of these deep networks. We propose that the alternative perturbation of deep models during their training induces changes that lead to transitions to different families. The result is an Anna Karenina Principle (AKP) for deep learning, in which less generalizable models (unhappy families) vary more in their representation than more generalizable models (happy families), paralleling Leo Tolstoy's dictum that all happy families look alike, while each unhappy family is unhappy in its own way. The Anna Karenina principle has been found in a wide range of systems, from the surface of endangered corals exposed to harsh weather to the lungs of patients suffering from fatal diseases of AIDS. In our paper, we generate artificial perturbations to our model by hot-swapping the activation and loss functions during training. We build a model to classify cancer cells from non-cancer ones. We give a theoretical proof that the internal representations of generalizable (happy) models are similar in the asymptotic limit. Our experiments verify similar representations of generalizable models.

Submitted 22 February, 2023; originally announced February 2023.

Report number: 12nesma

arXiv:2208.11036 [pdf]

doi 10.1177/03611981231185768

CitySim: A Drone-Based Vehicle Trajectory Dataset for Safety Oriented Research and Digital Twins

Authors: Ou Zheng, Mohamed Abdel-Aty, Lishengsa Yue, Amr Abdelraouf, Zijin Wang, Nada Mahmoud

Abstract: The development of safety-oriented research and applications requires fine-grained vehicle trajectories that not only have high accuracy, but also capture substantial safety-critical events. However, the available vehicle trajectory datasets do not have the capacity to satisfy both requirements. This paper introduces the CitySim dataset, which has the core objective of facilitating safety-oriented research and applications. CitySim has vehicle trajectories extracted from 1140 minutes of drone videos recorded at 12 locations. It covers a variety of road geometries including freeway basic segments, signalized intersections, stop-controlled intersections, and control-free intersections. CitySim was generated through a five-step procedure that ensured trajectory accuracy. The five-step procedure included video stabilization, object filtering, multi-video stitching, object detection and tracking, and enhanced error filtering. Furthermore, CitySim provides the rotated bounding box information of a vehicle, which was demonstrated to improve safety evaluations. Compared with other video-based critical events, including cut-in, merge, and diverge events, which were validated by distributions of both minimum time-to-collision and minimum post-encroachment time. In addition, CitySim has the capability to facilitate digital-twin-related research by providing relevant assets, such as the recording locations' three-dimensional base maps and signal timings.

Submitted 31 July, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Transportation Research Record (2023)

arXiv:2112.04021 [pdf, other]

A Robust Completed Local Binary Pattern (RCLBP) for Surface Defect Detection

Authors: Nana Kankam Gyimah, Abenezer Girma, Mahmoud Nabil Mahmoud, Shamila Nateghi, Abdollah Homaifar, Daniel Opoku

Abstract: In this paper, we present a Robust Completed Local Binary Pattern (RCLBP) framework for a surface defect detection task. Our approach uses a combination of a Non-Local (NL) means filter with wavelet thresholding and Completed Local Binary Pattern (CLBP) to extract robust features, which are fed into classifiers for surface defect detection. This paper combines three components. First, a denoising technique based on the Non-Local (NL) means filter with wavelet thresholding is established to denoise the noisy image while preserving the textures and edges. Second, discriminative features are extracted using the CLBP technique. Finally, the discriminative features are fed into the classifiers to build the detection model and evaluate the performance of the proposed framework. The performance of the defect detection models is evaluated using a real-world steel surface defect database from Northeastern University (NEU). Experimental results demonstrate that the proposed RCLBP approach is noise robust and can be applied for surface defect detection under varying conditions of intra-class and inter-class changes and with illumination changes.

Submitted 7 December, 2021; originally announced December 2021.

Comments: Accepted to IEEE SMC 2021 as a special invited session paper

arXiv:2111.13157 [pdf, other]

DA$^{2}$-Net: Diverse & Adaptive Attention Convolutional Neural Network

Authors: Abenezer Girma, Abdollah Homaifar, M Nabil Mahmoud, Xuyang Yan, Mrinmoy Sarkar

Abstract: Standard Convolutional Neural Network (CNN) designs rarely focus on the importance of explicitly capturing diverse features to enhance the network's performance. Instead, most existing methods follow an indirect approach of increasing or tuning the networks' depth and width, which in many cases significantly increases the computational cost. Inspired by a biological visual system, we propose a Diverse and Adaptive Attention Convolutional Network (DA$^{2}$-Net), which enables any feed-forward CNN to explicitly capture diverse features and adaptively select and emphasize the most informative features to efficiently boost the network's performance. DA$^{2}$-Net incurs negligible computational overhead and is designed to be easily integrated with any CNN architecture. We extensively evaluated DA$^{2}$-Net on benchmark datasets, including CIFAR100, SVHN, and ImageNet, with various CNN architectures. The experimental results show that DA$^{2}$-Net provides a significant performance improvement with minimal computational overhead.

Submitted 25 November, 2021; originally announced November 2021.

arXiv:1705.09107 [pdf, other]

SLAM based Quasi Dense Reconstruction For Minimally Invasive Surgery Scenes

Authors: Nader Mahmoud, Alexandre Hostettler, Toby Collins, Luc Soler, Christophe Doignon, J. M. M. Montiel

Abstract: Recovering surgical scene structure in laparoscope surgery is a crucial step for surgical guidance and augmented reality applications. In this paper, a quasi dense reconstruction algorithm for surgical scenes is proposed. It is based on a state-of-the-art SLAM system and exploits the initial exploration phase that is typically performed by the surgeon at the beginning of the surgery. We show how to convert the sparse SLAM map to a quasi dense scene reconstruction, using pairs of keyframe images and correlation-based featureless patch matching. We have validated the approach with a live porcine experiment using Computed Tomography as ground truth, yielding a Root Mean Squared Error of 4.9 mm.

Submitted 25 May, 2017; originally announced May 2017.

Comments: ICRA 2017 workshop C4 Surgical Robots: Compliant, Continuum, Cognitive, and Collaborative

arXiv:1608.08149 [pdf, other]

ORBSLAM-based Endoscope Tracking and 3D Reconstruction

Authors: Nader Mahmoud, Iñigo Cirauqui, Alexandre Hostettler, Christophe Doignon, Luc Soler, Jacques Marescaux, J. M. M. Montiel

Abstract: We aim to track the endoscope location inside the surgical scene and provide 3D reconstruction, in real time, from the sole input of the image sequence captured by the monocular endoscope. This information offers new possibilities for developing surgical navigation and augmented reality applications. The main benefit of this approach is the lack of extra tracking elements, which can disturb the surgeon's performance in the clinical routine. Our first contribution is to exploit ORBSLAM, one of the best performing monocular SLAM algorithms, to estimate both the endoscope location and the 3D structure of the surgical scene. However, the reconstructed 3D map poorly describes textureless soft organ surfaces such as the liver. Our second contribution is to extend ORBSLAM to be able to reconstruct a semi-dense map of soft organs. Experimental results on in vivo pigs show robust endoscope tracking even with organ deformations and partial instrument occlusions. They also show the reconstruction density and accuracy against the ground truth surface obtained from CT.

Submitted 29 August, 2016; originally announced August 2016.

Showing 1–6 of 6 results for author: Mahmoud, N