Showing 1–26 of 26 results for author: Rashwan, A

  1. arXiv:2405.16759  [pdf, other]

    cs.CV cs.LG

    Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

    Authors: Cristina N. Vasconcelos, Abdullah Rashwan, Austin Waters, Trevor Walker, Keyang Xu, Jimmy Yan, Rui Qian, Shixin Luo, Zarana Parekh, Andrew Bunner, Hongliang Fei, Roopal Garg, Mandy Guo, Ivana Kajic, Yeqing Li, Henna Nandwani, Jordi Pont-Tuset, Yasumasa Onoe, Sarah Rosston, Su Wang, Wenlei Zhou, Kevin Swersky, David J. Fleet, Jason M. Baldridge, Oliver Wang

    Abstract: We address the long-standing problem of how to learn effective pixel-based image diffusion models at scale, introducing a remarkably simple greedy growing method for stable training of large-scale, high-resolution models, without the need for cascaded super-resolution components. The key insight stems from careful pre-training of core components, namely, those responsible for text-to-image alignm…

    Submitted 26 May, 2024; originally announced May 2024.

  2. arXiv:2404.02738  [pdf, other]

    cs.CV

    Adaptive Affinity-Based Generalization For MRI Imaging Segmentation Across Resource-Limited Settings

    Authors: Eddardaa B. Loussaief, Mohammed Ayad, Domenc Puig, Hatem A. Rashwan

    Abstract: The joint utilization of diverse data sources for medical imaging segmentation has emerged as a crucial area of research, aiming to address challenges such as data heterogeneity, domain shift, and data quality discrepancies. Integrating information from multiple data domains has shown promise in improving model generalizability and adaptability. However, this approach often demands substantial com…

    Submitted 3 April, 2024; originally announced April 2024.

  3. arXiv:2312.06052  [pdf, other]

    cs.CV cs.AI

    MaskConver: Revisiting Pure Convolution Model for Panoptic Segmentation

    Authors: Abdullah Rashwan, Jiageng Zhang, Ali Taalimi, Fan Yang, Xingyi Zhou, Chaochao Yan, Liang-Chieh Chen, Yeqing Li

    Abstract: In recent years, transformer-based models have dominated panoptic segmentation, thanks to their strong modeling capabilities and their unified representation for both semantic and instance classes as global binary masks. In this paper, we revisit pure convolution model and propose a novel panoptic architecture named MaskConver. MaskConver proposes to fully unify things and stuff representation by…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: 11 pages, 5 figures

  4. arXiv:2309.16139  [pdf, other]

    cs.CV cs.LG

    Two-Step Active Learning for Instance Segmentation with Uncertainty and Diversity Sampling

    Authors: Ke Yu, Stephen Albro, Giulia DeSalvo, Suraj Kothawade, Abdullah Rashwan, Sasan Tavakkol, Kayhan Batmanghelich, Xiaoqi Yin

    Abstract: Training high-quality instance segmentation models requires an abundance of labeled images with instance masks and classifications, which is often expensive to procure. Active learning addresses this challenge by striving for optimum performance with minimal labeling cost by selecting the most informative and representative images for labeling. Despite its potential, active learning has been less…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: UNCV ICCV 2023

  5. arXiv:2306.01736  [pdf, other]

    cs.CV cs.AI cs.LG

    DaTaSeg: Taming a Universal Multi-Dataset Multi-Task Segmentation Model

    Authors: Xiuye Gu, Yin Cui, Jonathan Huang, Abdullah Rashwan, Xuan Yang, Xingyi Zhou, Golnaz Ghiasi, Weicheng Kuo, Huizhong Chen, Liang-Chieh Chen, David A Ross

    Abstract: Observing the close relationship among panoptic, semantic and instance segmentation tasks, we propose to train a universal multi-dataset multi-task segmentation model: DaTaSeg. We use a shared representation (mask proposals with class predictions) for all tasks. To tackle task discrepancy, we adopt different merge operations and post-processing for different tasks. We also leverage weak-supervision…

    Submitted 2 June, 2023; originally announced June 2023.

  6. arXiv:2211.07521  [pdf, other]

    cs.CV

    PKCAM: Previous Knowledge Channel Attention Module

    Authors: Eslam Mohamed Bakr, Ahmad El Sallab, Mohsen A. Rashwan

    Abstract: Recently, attention mechanisms have been explored with ConvNets, both across the spatial and channel dimensions. However, to our knowledge, all the existing methods devote the attention modules to capture local interactions from a uni-scale. In this paper, we propose a Previous Knowledge Channel Attention Module (PKCAM), that captures channel-wise relations across different layers to model the gl…

    Submitted 25 November, 2022; v1 submitted 14 November, 2022; originally announced November 2022.

  7. arXiv:2112.06782  [pdf, other]

    cs.CV

    GCNDepth: Self-supervised Monocular Depth Estimation based on Graph Convolutional Network

    Authors: Armin Masoumian, Hatem A. Rashwan, Saddam Abdulwahab, Julian Cristiano, Domenec Puig

    Abstract: Depth estimation is a challenging task of 3D reconstruction to enhance the accuracy sensing of environment awareness. This work brings a new solution with a set of improvements, which increase the quantitative and qualitative understanding of depth maps compared to existing methods. Recently, a convolutional neural network (CNN) has demonstrated its extraordinary ability in estimating depth maps f…

    Submitted 13 December, 2021; originally announced December 2021.

    Comments: 10 pages, Submitted to IEEE transactions on intelligent transportation systems

  8. Designing and Analyzing the PID and Fuzzy Control System for an Inverted Pendulum

    Authors: Armin Masoumian, Pezhman kazemi, Mohammad Chehreghani Montazer, Hatem A. Rashwan, Domenec Puig Valls

    Abstract: The inverted pendulum is a non-linear unbalanced system that needs to be controlled using motors to achieve stability and equilibrium. The inverted pendulum is constructed with Lego and using the Lego Mindstorm NXT, which is a programmable robot capable of completing many different functions. In this paper, an initial design of the inverted pendulum is proposed and the performance of different sen…

    Submitted 9 November, 2021; originally announced November 2021.

    Comments: 5 pages, Accepted for The 6th International Conference on Mechatronics and Robotics Engineering (ICMRE 2020)

  9. Using The Feedback of Dynamic Active-Pixel Vision Sensor (Davis) to Prevent Slip in Real Time

    Authors: Armin Masoumian, Pezhman kazemi, Mohammad Chehreghani Montazer, Hatem A. Rashwan, Domenec Puig Valls

    Abstract: The objective of this paper is to describe an approach to detect the slip and contact force in real-time feedback. In this novel approach, the DAVIS camera is used as a vision tactile sensor due to its fast process speed and high resolution. Two hundred experiments were performed on four objects with different shapes, sizes, weights, and materials to compare the accuracy and response of the Baxter…

    Submitted 9 November, 2021; originally announced November 2021.

    Comments: 5 pages, Accepted for The 6th International Conference on Mechatronics and Robotics Engineering (ICMRE 2020)

  10. Absolute distance prediction based on deep learning object detection and monocular depth estimation models

    Authors: Armin Masoumian, David G. F. Marei, Saddam Abdulwahab, Julian Cristiano, Domenec Puig, Hatem A. Rashwan

    Abstract: Determining the distance between the objects in a scene and the camera sensor from 2D images is feasible by estimating depth images using stereo cameras or 3D cameras. The outcome of depth estimation is relative distances that can be used to calculate absolute distances to be applicable in reality. However, distance estimation is very challenging using 2D monocular cameras. This paper presents a d…

    Submitted 2 November, 2021; originally announced November 2021.

    Comments: 10 pages, Submitted to 23rd International Conference of the Catalan Association for Artificial Intelligence (CCIA 2021)

  11. arXiv:2103.12270  [pdf, other]

    cs.CV cs.AI

    Dilated SpineNet for Semantic Segmentation

    Authors: Abdullah Rashwan, Xianzhi Du, Xiaoqi Yin, Jing Li

    Abstract: Scale-permuted networks have shown promising results on object bounding box detection and instance segmentation. Scale permutation and cross-scale fusion of features enable the network to capture multi-scale semantics while preserving spatial resolution. In this work, we evaluate this meta-architecture design on semantic segmentation - another vision task that benefits from high spatial resolution…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: 8 pages

  12. arXiv:2002.10631  [pdf, other]

    cs.LG cs.CV stat.ML

    Batch norm with entropic regularization turns deterministic autoencoders into generative models

    Authors: Amur Ghose, Abdullah Rashwan, Pascal Poupart

    Abstract: The variational autoencoder is a well defined deep generative model that utilizes an encoder-decoder framework where an encoding neural network outputs a non-deterministic code for reconstructing an input. The encoder achieves this by sampling from a distribution for every input, instead of outputting a deterministic code per input. The great advantage of this process is that it allows the use of…

    Submitted 21 September, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

    Journal ref: Published in the Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI), 2020

  13. arXiv:2001.03194  [pdf, other]

    cs.CV

    MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection

    Authors: Abdullah Rashwan, Rishav Agarwal, Agastya Kalra, Pascal Poupart

    Abstract: We present MatrixNets (xNets), a new deep architecture for object detection. xNets map objects with similar sizes and aspect ratios into many specialized layers, allowing xNets to provide a scale and aspect ratio aware architecture. We leverage xNets to enhance single-stage object detection frameworks. First, we apply xNets on anchor-based object detection, for which we predict object centers and…

    Submitted 9 January, 2020; originally announced January 2020.

    Comments: This is the full paper for arXiv:1908.04646 with more applications, experiments, and ablation study

  14. arXiv:1908.04646  [pdf, other]

    cs.CV

    Matrix Nets: A New Deep Architecture for Object Detection

    Authors: Abdullah Rashwan, Agastya Kalra, Pascal Poupart

    Abstract: We present Matrix Nets (xNets), a new deep architecture for object detection. xNets map objects with different sizes and aspect ratios into layers where the sizes and the aspect ratios of the objects within their layers are nearly uniform. Hence, xNets provide a scale and aspect ratio aware architecture. We leverage xNets to enhance key-points based object detection. Our architecture achieves mAP…

    Submitted 14 August, 2019; v1 submitted 13 August, 2019; originally announced August 2019.

    Comments: Short paper, stay tuned for the full paper!

  15. arXiv:1907.02742  [pdf, other]

    eess.IV cs.CV

    Adversarial Learning with Multiscale Features and Kernel Factorization for Retinal Blood Vessel Segmentation

    Authors: Farhan Akram, Vivek Kumar Singh, Hatem A. Rashwan, Mohamed Abdel-Nasser, Md. Mostafa Kamal Sarker, Nidhi Pandey, Domenec Puig

    Abstract: In this paper, we propose an efficient blood vessel segmentation method for the eye fundus images using adversarial learning with multiscale features and kernel factorization. In the generator network of the adversarial framework, spatial pyramid pooling, kernel factorization and squeeze excitation block are employed to enhance the feature representation in spatial domain on different scales with…

    Submitted 5 July, 2019; originally announced July 2019.

    Comments: 9 pages, 4 figures

  16. arXiv:1907.00887  [pdf, other]

    eess.IV cs.CV

    An Efficient Solution for Breast Tumor Segmentation and Classification in Ultrasound Images Using Deep Adversarial Learning

    Authors: Vivek Kumar Singh, Hatem A. Rashwan, Mohamed Abdel-Nasser, Md. Mostafa Kamal Sarker, Farhan Akram, Nidhi Pandey, Santiago Romani, Domenec Puig

    Abstract: This paper proposes an efficient solution for tumor segmentation and classification in breast ultrasound (BUS) images. We propose to add an atrous convolution layer to the conditional generative adversarial network (cGAN) segmentation model to learn tumor features at different resolutions of BUS images. To automatically re-balance the relative impact of each of the highest level encoded features,…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: 9 pages

  17. arXiv:1907.00856  [pdf, other]

    eess.IV cs.CV

    SLSNet: Skin lesion segmentation using a lightweight generative adversarial network

    Authors: Md. Mostafa Kamal Sarker, Hatem A. Rashwan, Farhan Akram, Vivek Kumar Singh, Syeda Furruka Banu, Forhad U H Chowdhury, Kabir Ahmed Choudhury, Sylvie Chambon, Petia Radeva, Domenec Puig, Mohamed Abdel-Nasser

    Abstract: The determination of precise skin lesion boundaries in dermoscopic images using automated methods faces many challenges, most importantly, the presence of hair, inconspicuous lesion edges and low contrast in dermoscopic images, and variability in the color, texture and shapes of skin lesions. Existing deep learning-based skin lesion segmentation algorithms are expensive in terms of computational t…

    Submitted 17 June, 2021; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: Accepted in Expert Systems with Applications

  18. arXiv:1809.06663  [pdf, other]

    cs.CV

    Support Vector Machine (SVM) Recognition Approach adapted to Individual and Touching Moths Counting in Trap Images

    Authors: Mohamed Chafik Bakkay, Sylvie Chambon, Hatem A. Rashwan, Christian Lubat, Sébastien Barsotti

    Abstract: This paper aims at developing an automatic algorithm for moth recognition from trap images in real-world conditions. This method uses our previous work for detection [1] and introduces an adapted classification step. More precisely, SVM classifier is trained with a multi-scale descriptor, Histogram Of Curviness Saliency (HCS). This descriptor is robust to illumination changes and is able to detect…

    Submitted 18 September, 2018; originally announced September 2018.

  19. arXiv:1809.01687  [pdf, other]

    cs.CV

    Breast Tumor Segmentation and Shape Classification in Mammograms using Generative Adversarial and Convolutional Neural Network

    Authors: Vivek Kumar Singh, Hatem A. Rashwan, Santiago Romani, Farhan Akram, Nidhi Pandey, Md. Mostafa Kamal Sarker, Adel Saleh, Meritexell Arenas, Miguel Arquez, Domenec Puig, Jordina Torrents-Barrena

    Abstract: Mammogram inspection in search of breast tumors is a tough assignment that radiologists must carry out frequently. Therefore, image analysis methods are needed for the detection and delineation of breast masses, which portray crucial morphological information that will support reliable diagnosis. In this paper, we proposed a conditional Generative Adversarial Network (cGAN) devised to segment a br…

    Submitted 23 October, 2018; v1 submitted 5 September, 2018; originally announced September 2018.

    Comments: 33 pages, Submitted to Expert Systems with Applications

  20. arXiv:1808.09829  [pdf, other]

    cs.CV

    MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-streams

    Authors: Md. Mostafa Kamal Sarker, Hatem A. Rashwan, Estefania Talavera, Syeda Furruka Banu, Petia Radeva, Domenec Puig

    Abstract: First-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people, and scenes reflecting his personal and relational tendencies. One of the preferences of people is their interaction with food events. The regulation of food intake and its duration has a great importance to protect against diseases. Consequently, this work aims to develop a smart mo…

    Submitted 29 August, 2018; originally announced August 2018.

    Comments: 10 pages, accepted in ECCV at EPIC 2018

  21. arXiv:1807.11433  [pdf, other]

    cs.CV

    REFUGE CHALLENGE 2018-Task 2:Deep Optic Disc and Cup Segmentation in Fundus Images Using U-Net and Multi-scale Feature Matching Networks

    Authors: Vivek Kumar Singh, Hatem A. Rashwan, Adel Saleh, Farhan Akram, Md Mostafa Kamal Sarker, Nidhi Pandey, Saddam Abdulwahab

    Abstract: In this paper, an optic disc and cup segmentation method is proposed using U-Net followed by a multi-scale feature matching network. The proposed method targets task 2 of the REFUGE challenge 2018. In order to solve the segmentation problem of task 2, we firstly crop the input image using single shot multibox detector (SSD). The cropped image is then passed to an encoder-decoder network with skip…

    Submitted 30 July, 2018; originally announced July 2018.

    Comments: EYE REFUGE CHALLENGE 2018, submitted 7 Pages

  22. arXiv:1805.12081  [pdf, other]

    cs.CV

    CuisineNet: Food Attributes Classification using Multi-scale Convolution Network

    Authors: Md. Mostafa Kamal Sarker, Mohammed Jabreel, Hatem A. Rashwan, Syeda Furruka Banu, Antonio Moreno, Petia Radeva, Domenec Puig

    Abstract: Diversity of food and its attributes represents the culinary habits of peoples from different countries. Thus, this paper addresses the problem of identifying food culture of people around the world and its flavor by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input…

    Submitted 8 June, 2018; v1 submitted 30 May, 2018; originally announced May 2018.

    Comments: 8 pages, Submitted in CCIA 2018

  23. arXiv:1805.10241  [pdf, other]

    cs.CV

    SLSDeep: Skin Lesion Segmentation Based on Dilated Residual and Pyramid Pooling Networks

    Authors: Md. Mostafa Kamal Sarker, Hatem A. Rashwan, Farhan Akram, Syeda Furruka Banu, Adel Saleh, Vivek Kumar Singh, Forhad U H Chowdhury, Saddam Abdulwahab, Santiago Romani, Petia Radeva, Domenec Puig

    Abstract: Skin lesion segmentation (SLS) in dermoscopic images is a crucial task for automated diagnosis of melanoma. In this paper, we present a robust deep learning SLS model, so-called SLSDeep, which is represented as an encoder-decoder network. The encoder network is constructed by dilated residual layers, in turn, a pyramid pooling network followed by three convolution layers is used for the decoder. U…

    Submitted 30 May, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

    Comments: Accepted in MICCAI 2018, 9 pages

  24. arXiv:1805.10207  [pdf, other]

    cs.CV

    Conditional Generative Adversarial and Convolutional Networks for X-ray Breast Mass Segmentation and Shape Classification

    Authors: Vivek Kumar Singh, Santiago Romani, Hatem A. Rashwan, Farhan Akram, Nidhi Pandey, Md. Mostafa Kamal Sarker, Jordina Torrents Barrena, Saddam Abdulwahab, Adel Saleh, Miguel Arquez, Meritxell Arenas, Domenec Puig

    Abstract: This paper proposes a novel approach based on conditional Generative Adversarial Networks (cGAN) for breast mass segmentation in mammography. We hypothesized that the cGAN structure is well-suited to accurately outline the mass area, especially when the training data is limited. The generative network learns intrinsic features of tumors while the adversarial network enforces segmentations to be si…

    Submitted 10 June, 2018; v1 submitted 25 May, 2018; originally announced May 2018.

    Comments: 8 pages, Accepted at Medical Image Computing and Computer Assisted Intervention (MICCAI) 2018

  25. Using Curvilinear Features in Focus for Registering a Single Image to a 3D Object

    Authors: Hatem A. Rashwan, Sylvie Chambon, Pierre Gurdjos, Géraldine Morin, Vincent Charvillat

    Abstract: In the context of 2D/3D registration, this paper introduces an approach that allows to match features detected in two different modalities: photographs and 3D models, by using a common 2D representation. More precisely, 2D images are matched with a set of depth images, representing the 3D model. After introducing the concept of curvilinear saliency, related to curvature estimation, we propose a n…

    Submitted 26 February, 2018; originally announced February 2018.

  26. arXiv:1709.08271  [pdf]

    cs.CV

    3D Camouflaging Object using RGB-D Sensors

    Authors: Ahmed M. Siddek, Mohsen A. Rashwan, Islam A. Eshrah

    Abstract: This paper proposes a new optical camouflage system that uses RGB-D cameras for acquiring a point cloud of the background scene and tracking the observer's eyes. This system enables a user to conceal an object located behind a display that is surrounded by 3D objects. If we consider the tracked point of the observer's eyes to be a light source, the system will work on estimating shadow shape of the display de…

    Submitted 24 September, 2017; originally announced September 2017.

    Comments: 6 pages, 12 figures, 2017 IEEE International Conference on SMC