Search | arXiv e-print repository

SegLoc: Visual Self-supervised Learning Scheme for Dense Prediction Tasks of Security Inspection X-ray Images

Authors: Shervin Halat, Mohammad Rahmati, Ehsan Nazerfard

Abstract: Lately, remarkable advancements of artificial intelligence have been attributed to the integration of self-supervised learning (SSL) schemes. Despite impressive achievements within natural language processing (NLP), SSL in computer vision has not been able to stay on track comparatively. Recently, integration of contrastive learning on top of existing visual SSL models has established considerable progress, thereby being able to outperform supervised counterparts. Nevertheless, the improvements were mostly limited to classification tasks; moreover, few studies have evaluated visual SSL models in real-world scenarios, while the majority considered datasets containing class-wise portrait images, notably ImageNet. Thus, here, we have considered dense prediction tasks on security inspection x-ray images to evaluate our proposed model Segmentation Localization (SegLoc). Based upon the model Instance Localization (InsLoc), our model has managed to address one of the most challenging downsides of contrastive learning, i.e., false negative pairs of query embeddings. To do so, our pre-training dataset is synthesized by cutting, transforming, then pasting labeled segments, as foregrounds, from an already existing labeled dataset (PIDray) onto instances, as backgrounds, of an unlabeled dataset (SIXray); further, we fully harness the labels through integration of the notion, one queue per class, into the MoCo-v2 memory bank, avoiding false negative pairs. Regarding the task in question, our approach has outperformed the random initialization method by 3% to 6%, while having underperformed supervised initialization, in AR and AP metrics at different IoU values for 20 to 30 pre-training epochs.

Submitted 21 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

arXiv:2309.17048 [pdf, other]

On Continuity of Robust and Accurate Classifiers

Authors: Ramin Barati, Reza Safabakhsh, Mohammad Rahmati

Abstract: The reliability of a learning model is key to the successful deployment of machine learning in various applications. Creating a robust model, particularly one unaffected by adversarial attacks, requires a comprehensive understanding of the adversarial examples phenomenon. However, it is difficult to describe the phenomenon due to the complicated nature of the problems in machine learning. It has been shown that adversarial training can improve the robustness of the hypothesis. However, this improvement comes at the cost of decreased performance on natural samples. Hence, it has been suggested that robustness and accuracy of a hypothesis are at odds with each other. In this paper, we put forth the alternative proposal that it is the continuity of a hypothesis that is incompatible with its robustness and accuracy. In other words, a continuous function cannot effectively learn the optimal robust hypothesis. To this end, we will introduce a framework for a rigorous study of harmonic and holomorphic hypotheses in learning theory terms and provide empirical evidence that continuous hypotheses do not perform as well as discontinuous hypotheses in some common machine learning tasks. From a practical point of view, our results suggest that a robust and accurate learning rule would train different continuous hypotheses for different regions of the domain. From a theoretical perspective, our analysis explains the adversarial examples phenomenon as a conflict between the continuity of a sequence of functions and its uniform convergence to a discontinuous function.

Submitted 29 September, 2023; originally announced September 2023.

arXiv:2304.07769 [pdf, other]

Spot The Odd One Out: Regularized Complete Cycle Consistent Anomaly Detector GAN

Authors: Zahra Dehghanian, Saeed Saravani, Maryam Amirmazlaghani, Mohammad Rahmati

Abstract: This study presents an adversarial method for anomaly detection in real-world applications, leveraging the power of generative adversarial neural networks (GANs) through cycle consistency in reconstruction error. Previous methods suffer from high variance in class-wise accuracy, which makes them inapplicable to some types of anomalies. The proposed method, named RCALAD, tries to solve this problem by introducing a novel discriminator to the structure, which results in a more efficient training process. Additionally, RCALAD employs a supplementary distribution in the input space to steer reconstructions toward the normal data distribution, effectively separating anomalous samples from their reconstructions and facilitating more accurate anomaly detection. To further enhance the performance of the model, two novel anomaly scores are introduced. The proposed model has been thoroughly evaluated through extensive experiments on six different datasets, yielding results that demonstrate its superiority over existing state-of-the-art models. The code is readily available to the research community at https://github.com/zahraDehghanian97/RCALAD.

Submitted 30 April, 2024; v1 submitted 16 April, 2023; originally announced April 2023.

arXiv:2205.13502 [pdf, ps, other]

An Analytic Framework for Robust Training of Artificial Neural Networks

Authors: Ramin Barati, Reza Safabakhsh, Mohammad Rahmati

Abstract: The reliability of a learning model is key to the successful deployment of machine learning in various industries. Creating a robust model, particularly one unaffected by adversarial attacks, requires a comprehensive understanding of the adversarial examples phenomenon. However, it is difficult to describe the phenomenon due to the complicated nature of the problems in machine learning. Consequently, many studies investigate the phenomenon by proposing a simplified model of how adversarial examples occur and validate it by predicting some aspect of the phenomenon. While these studies cover many different characteristics of the adversarial examples, they have not reached a holistic approach to the geometric and analytic modeling of the phenomenon. This paper proposes a formal framework to study the phenomenon in learning theory and makes use of complex analysis and holomorphicity to offer a robust learning rule for artificial neural networks. With the help of complex analysis, we can effortlessly move between geometric and analytic perspectives of the phenomenon and offer further insights on the phenomenon by revealing its connection with harmonic functions. Using our model, we can explain some of the most intriguing characteristics of adversarial examples, including transferability of adversarial examples, and pave the way for novel approaches to mitigate the effects of the phenomenon.

Submitted 13 August, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

arXiv:2203.03621 [pdf]

Triple Motion Estimation and Frame Interpolation based on Adaptive Threshold for Frame Rate Up-Conversion

Authors: Hanieh Naderi, Mohammad Rahmati

Abstract: In this paper, we propose a novel motion-compensated frame rate up-conversion (MC-FRUC) algorithm. The proposed algorithm creates interpolated frames by first estimating motion vectors using unilateral (jointing forward and backward) and bilateral motion estimation. Then motion vectors are combined based on an adaptive threshold, in order to create high-quality interpolated frames and reduce block artifacts. Since motion-compensated frame interpolation along unilateral motion trajectories yields holes, a new algorithm is introduced to resolve this problem. The experimental results show that the quality of the interpolated frames using the proposed algorithm is much higher than that of the existing algorithms.

Submitted 4 March, 2022; originally announced March 2022.

Comments: Frame rate up-conversion, frame interpolation, motion estimation, motion compensation

arXiv:2112.13953 [pdf, ps, other]

Source Feature Compression for Object Classification in Vision-Based Underwater Robotics

Authors: Xueyuan Zhao, Mehdi Rahmati, Dario Pompili

Abstract: New efficient source feature compression solutions are proposed based on a two-stage Walsh-Hadamard Transform (WHT) for Convolutional Neural Network (CNN)-based object classification in underwater robotics. The object images are first transformed by WHT following a two-stage process. The transform-domain tensors have large values concentrated in the upper left corner of the matrices in the RGB channels. By observing this property, the transform-domain matrix is partitioned into inner and outer regions. Consequently, two novel partitioning methods are proposed in this work: (i) fixing the size of inner and outer regions; and (ii) adjusting the size of inner and outer regions adaptively per image. The proposals are evaluated with an underwater object dataset captured from the Raritan River in New Jersey, USA. It is demonstrated and verified that the proposals reduce the training time effectively for learning-based underwater object classification tasks and increase the accuracy compared with the competing methods. Object classification is an essential part of a vision-based underwater robot that can sense the environment and navigate autonomously. Therefore, the proposed method is well-suited for efficient computer vision-based tasks in underwater robotics applications.

Submitted 27 December, 2021; originally announced December 2021.

arXiv:2111.10665 [pdf, other]

Location-aware Beamforming for MIMO-enabled UAV Communications: An Unknown Input Observer Approach

Authors: Alireza Mohammadi, Mehdi Rahmati, Hafiz Malik

Abstract: Numerous communications and networking challenges prevent deploying unmanned aerial vehicles (UAVs) in extreme environments where the existing wireless technologies are mainly ground-focused; and, as a consequence, the air-to-air channel for UAVs is not fully covered. In this paper, a novel spatial estimation for beamforming is proposed to address UAV-based joint sensing and communications (JSC). The proposed spatial estimation algorithm relies on using a delay-tolerant observer-based predictor, which can accurately predict the positions of the target UAVs in the presence of uncertainties due to factors such as wind gusts. The solution, which uses discrete-time unknown input observers (UIOs), reduces the joint target detection and communication complication notably by operating on the same device and performs reliably in the presence of channel blockage and interference. The effectiveness of the proposed approach is demonstrated using simulation results.

Submitted 20 November, 2021; originally announced November 2021.

Comments: 9 pages, 10 figures, under review in IEEE Sensors Journal

arXiv:2111.10634 [pdf, other]

Identity-Preserving Pose-Robust Face Hallucination Through Face Subspace Prior

Authors: Ali Abbasi, Mohammad Rahmati

Abstract: Over the past few decades, numerous attempts have been made to address the problem of recovering a high-resolution (HR) facial image from its corresponding low-resolution (LR) counterpart, a task commonly referred to as face hallucination. Despite the impressive performance achieved by position-patch and deep learning-based methods, most of these techniques are still unable to recover identity-specific features of faces. The former group of algorithms often produces blurry and oversmoothed outputs, particularly in the presence of higher levels of degradation, whereas the latter generates faces which sometimes by no means resemble the individuals in the input images. In this paper, a novel face super-resolution approach will be introduced, in which the hallucinated face is forced to lie in a subspace spanned by the available training faces. Therefore, in contrast to the majority of existing face hallucination techniques and thanks to this face subspace prior, the reconstruction is performed in favor of recovering person-specific facial features, rather than merely increasing image quantitative scores. Furthermore, inspired by recent advances in the area of 3D face reconstruction, an efficient 3D dictionary alignment scheme is also presented, through which the algorithm becomes capable of dealing with low-resolution faces taken in uncontrolled conditions. In extensive experiments carried out on several well-known face datasets, the proposed algorithm shows remarkable performance by generating detailed and close-to-ground-truth results which outperform the state-of-the-art face hallucination algorithms by significant margins both in quantitative and qualitative evaluations.

Submitted 20 November, 2021; originally announced November 2021.

Comments: A shorter version of this paper has been submitted to IEEE Transactions on Image Processing

arXiv:2111.10233 [pdf, other]

Xp-GAN: Unsupervised Multi-object Controllable Video Generation

Authors: Bahman Rouhani, Mohammad Rahmati

Abstract: Video Generation is a relatively new and yet popular subject in machine learning due to its vast variety of potential applications and its numerous challenges. Current methods in Video Generation provide the user with little or no control over the exact specification of how the objects in the generated video are to be moved and located at each frame; that is, the user can't explicitly control how each object in the video should move. In this paper we propose a novel method that allows the user to move any number of objects of a single initial frame just by drawing bounding boxes over those objects and then moving those boxes in the desired path. Our model utilizes two Autoencoders to fully decompose the motion and content information in a video and achieves results comparable to well-known baseline and state-of-the-art methods.

Submitted 19 November, 2021; originally announced November 2021.

Comments: 8 pages, 9 figures

arXiv:2107.10599 [pdf, other]

doi 10.1109/ICPR48806.2021.9412367

Towards Explaining Adversarial Examples Phenomenon in Artificial Neural Networks

Authors: Ramin Barati, Reza Safabakhsh, Mohammad Rahmati

Abstract: In this paper, we study the existence of adversarial examples and adversarial training from the standpoint of convergence and provide evidence that pointwise convergence in ANNs can explain these observations. The main contribution of our proposal is that it relates the objective of the evasion attacks and adversarial training with concepts already defined in learning theory. Also, we extend and unify some of the other proposals in the literature and provide alternative explanations of the observations made in those proposals. Through different experiments, we demonstrate that the framework is valuable in the study of the phenomenon and is applicable to real-world problems.

Submitted 22 July, 2021; originally announced July 2021.

Comments: submitted to 25th International Conference on Pattern Recognition (ICPR)

ACM Class: I.5.1

Journal ref: In 2020 25th International Conference on Pattern Recognition (ICPR)

arXiv:2107.01410 [pdf, other]

Maximum Entropy Weighted Independent Set Pooling for Graph Neural Networks

Authors: Amirhossein Nouranizadeh, Mohammadjavad Matinkia, Mohammad Rahmati, Reza Safabakhsh

Abstract: In this paper, we propose a novel pooling layer for graph neural networks based on maximizing the mutual information between the pooled graph and the input graph. Since the maximum mutual information is difficult to compute, we employ the Shannon capacity of a graph as an inductive bias to our pooling method. More precisely, we show that the input graph to the pooling layer can be viewed as a representation of a noisy communication channel. For such a channel, sending the symbols belonging to an independent set of the graph yields a reliable and error-free transmission of information. We show that reaching the maximum mutual information is equivalent to finding a maximum weight independent set of the graph where the weights convey entropy contents. Through this communication theoretic standpoint, we provide a distinct perspective for posing the problem of graph pooling as maximizing the information transmission rate across a noisy communication channel, implemented by a graph neural network. We evaluate our method, referred to as Maximum Entropy Weighted Independent Set Pooling (MEWISPool), on graph classification tasks and the combinatorial optimization problem of the maximum independent set. Empirical results demonstrate that our method achieves state-of-the-art and competitive results on graph classification tasks and the maximum independent set problem in several benchmark datasets.

Submitted 3 July, 2021; originally announced July 2021.

Comments: 21 pages, 12 figures, under review in 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

arXiv:2011.09619 [pdf, other]

Abnormal Event Detection in Urban Surveillance Videos Using GAN and Transfer Learning

Authors: Ali Atghaei, Soroush Ziaeinejad, Mohammad Rahmati

Abstract: Abnormal event detection (AED) in urban surveillance videos has multiple challenges. Unlike other computer vision problems, the AED is not solely dependent on the content of frames. It also depends on the appearance of the objects and their movements in the scene. Various methods have been proposed to address the AED problem. Among those, deep-learning-based methods show the best results. This paper is based on deep learning methods and provides an effective way to detect and locate abnormal events in videos by handling spatio-temporal data. This paper uses generative adversarial networks (GANs) and performs transfer learning algorithms on a pre-trained convolutional neural network (CNN), which results in an accurate and efficient model. The efficiency of the model is further improved by processing the optical flow information of the video. This paper runs experiments on two benchmark datasets for the AED problem (UCSD Peds1 and UCSD Peds2) and compares the results with other previous methods. The comparisons are based on various criteria such as area under curve (AUC) and true positive rate (TPR). Experimental results show that the proposed method can effectively detect and locate abnormal events in crowd scenes.

Submitted 18 November, 2020; originally announced November 2020.

Comments: 7 pages, 9 figures, 3 tables

arXiv:1911.04096 [pdf, other]

doi 10.1145/3366486.3366533

UW-MARL: Multi-Agent Reinforcement Learning for Underwater Adaptive Sampling using Autonomous Vehicles

Authors: Mehdi Rahmati, Mohammad Nadeem, Vidyasagar Sadhu, Dario Pompili

Abstract: Near-real-time water-quality monitoring in uncertain environments such as rivers, lakes, and water reservoirs of different variables is critical to protect the aquatic life and to prevent further propagation of the potential pollution in the water. In order to measure the physical values in a region of interest, adaptive sampling is helpful as an energy- and time-efficient technique since an exhaustive search of an area is not feasible with a single vehicle. We propose an adaptive sampling algorithm using multiple autonomous vehicles, which are well-trained, as agents, in a Multi-Agent Reinforcement Learning (MARL) framework to make an efficient sequence of decisions on the adaptive sampling procedure. The proposed solution is evaluated using experimental data, which is fed into a simulation framework. Experiments were conducted in the Raritan River, Somerset and in Carnegie Lake, Princeton, NJ during July 2019.

Submitted 11 November, 2019; originally announced November 2019.

arXiv:1911.04072 [pdf, other]

doi 10.1145/3366486.3366488

Compressed Underwater Acoustic Communications for Dynamic Interaction with Underwater Vehicles

Authors: Mehdi Rahmati, Archana Arjula, Dario Pompili

Abstract: Underwater vehicles are utilized in various applications including underwater data-collection missions. The tethered connection constrains the mission both in distance traveled and number of vehicles that can run in the same area, while the addition of acoustic communications onto the vehicles grants them several functionalities. However, due to the low bandwidth of the underwater acoustic channel (which leads to low data rates) and the time overhead imposed by both the channel propagation delay and the processing delay by the acoustic modems, efficient protocols are required. In this paper, an implicit data-compression and transmission protocol is proposed to carry out environmental monitoring missions such as adaptive sampling of physical and chemical parameters in the water. In a semi-autonomous manner between the vehicle and the control center, both sides keep silent in data transmission as long as they can estimate and predict the actions of the other side, unless environmental data and/or kinematic data are found to be unpredictable. Our design puts the human in the loop to send high-level control commands. Experiments were conducted using an autonomous vehicle with WHOI micro-modems in the Raritan River, Somerset, Carnegie Lake in Princeton, and in the Marine Park in Red Bank, all in New Jersey.

Submitted 10 November, 2019; originally announced November 2019.

arXiv:1910.08844 [pdf, other]

UW-SVC: Scalable Video Coding Transmission for In-network Underwater Imagery Analysis

Authors: Mehdi Rahmati, Dario Pompili

Abstract: Underwater imagery has enabled numerous civilian applications in various domains, ranging from academia to industry, and from industrial surveillance and maintenance to environmental protection and studies of the behavior of marine creatures. The accumulation of litter and plastic debris at the seafloor and the bottom of rivers is extremely harmful to the aquatic life. We propose a solution for this problem using a team of Autonomous Underwater Vehicles (AUVs) to exchange the recorded video in order to reconstruct the seafloor regions of interest. However, underwater video transmission is a challenge in the harsh environment in which radio-frequency waves are absorbed for distances above a few tens of meters, optical waves require narrow laser beams and suffer from scattering and ocean wave motion, and acoustic waves, while long range, provide a very low bandwidth and unreliable channel for communication. In our solution, the scalable coded video of each vehicle is shared with a selected group of receiving videos, pseudo-multicasting, through the acoustic channel. Presented evaluations, including both simulations and experiments, confirm the efficiency and flexibility of the proposed solution using acoustic software-defined modems.

Submitted 19 October, 2019; originally announced October 2019.

arXiv:1909.12746 [pdf]

Cross-domain recommender system using Generalized Canonical Correlation Analysis

Authors: Seyed Mohammad Hashemi, Mohammad Rahmati

Abstract: Recommender systems provide personalized recommendations to the users from a large number of possible options in online stores. Matrix factorization is a well-known and accurate collaborative filtering approach for recommender systems, which suffers from the cold-start problem for new users and items. Whenever a new user joins the system, there are not enough interactions with the system; therefore, there are not enough ratings in the user-item matrix to learn the matrix factorization model. Using auxiliary data such as user demographics, ratings and reviews in relevant domains is an effective solution to reduce the new user problem. In this paper, we use data of users from other domains and build a common space to represent the latent factors of users from different domains. In this representation we propose an iterative method which applies MAX-VAR generalized canonical correlation analysis (GCCA) on users' latent factors learned from matrix factorization on each domain. Also, to improve the capability of GCCA to learn latent factors for new users, we propose the generalized canonical correlation analysis by inverse sum of selection matrices (GCCA-ISSM) approach, which provides better recommendations in cold-start scenarios. The proposed approach is extended using content-based features from topic modeling extracted from users' reviews. We demonstrate the accuracy and effectiveness of the proposed approaches on cross-domain ratings predictions using comprehensive experiments on Amazon and MovieLens datasets.

Submitted 15 September, 2019; originally announced September 2019.

arXiv:1905.07220 [pdf, other]

doi 10.1016/j.ins.2019.11.031

Neither Global Nor Local: A Hierarchical Robust Subspace Clustering For Image Data

Authors: Maryam Abdolali, Mohammad Rahmati

Abstract: In this paper, we consider the problem of subspace clustering in the presence of contiguous noise, occlusion and disguise. We argue that self-expressive representation of data in current state-of-the-art approaches is severely sensitive to occlusions and complex real-world noises. To alleviate this problem, we propose a hierarchical framework that brings robustness of local patches-based representations and discriminant property of global representations together. This approach consists of 1) a top-down stage, in which the input data is subject to repeated division to smaller patches and 2) a bottom-up stage, in which the low rank embedding of local patches in field of view of a corresponding patch in upper level are merged on a Grassmann manifold. This summarized information provides two key pieces of information for the corresponding patch on the upper level: cannot-links and recommended-links. This information is employed for computing a self-expressive representation of each patch at upper levels using a weighted sparse group lasso optimization problem. Numerical results on several real data sets confirm the efficiency of our approach.

Submitted 17 May, 2019; originally announced May 2019.

Journal ref: Information Sciences 514, pp. 333-353, 2020

arXiv:1802.07648 [pdf, other]

doi 10.1016/j.sigpro.2019.05.017

Scalable and Robust Sparse Subspace Clustering Using Randomized Clustering and Multilayer Graphs

Authors: Maryam Abdolali, Nicolas Gillis, Mohammad Rahmati

Abstract: Sparse subspace clustering (SSC) is one of the current state-of-the-art methods for partitioning data points into the union of subspaces, with strong theoretical guarantees. However, it is not practical for large data sets as it requires solving a LASSO problem for each data point, where the number of variables in each LASSO problem is the number of data points. To improve the scalability of SSC, we propose to select a few sets of anchor points using a randomized hierarchical clustering method, and, for each set of anchor points, solve the LASSO problems for each data point allowing only anchor points to have a non-zero weight (this drastically reduces the number of variables). This generates a multilayer graph where each layer corresponds to a different set of anchor points. Using the Grassmann manifold of orthogonal matrices, the shared connectivity among the layers is summarized within a single subspace. Finally, we use $k$-means clustering within that subspace to cluster the data points, similarly as done by spectral clustering in SSC. We show on both synthetic and real-world data sets that the proposed method not only allows SSC to scale to large-scale data sets, but that it is also much more robust as it performs significantly better on noisy data and on data with close subspaces and outliers, while it is not prone to oversegmentation.

Submitted 23 February, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

Comments: 25 pages, v2: typos corrected

Journal ref: Signal Processing 163, pp. 166-180, 2019

arXiv:1709.04393 [pdf]

An Efficient Evolutionary Based Method For Image Segmentation

Authors: Roohollah Aslanzadeh, Kazem Qazanfari, Mohammad Rahmati

Abstract: The goal of this paper is to present a new efficient image segmentation method based on evolutionary computation, which is a model inspired by human behavior. Based on this model, a four-layer process for image segmentation is proposed using the split/merge approach. In the first layer, an image is split into numerous regions using the watershed algorithm. In the second layer, a co-evolutionary process is applied to form the centers of the final segments by merging similar primary regions. In the third layer, a meta-heuristic process uses two operators to connect the residual regions to their corresponding determined centers. In the final layer, an evolutionary algorithm is used to combine the resulting similar and neighboring regions. The different layers of the algorithm are totally independent; therefore, for certain applications a specific layer can be changed without having to change the other layers. Some properties of this algorithm, such as the flexibility of its method, the ability to use different feature vectors for segmentation (grayscale, color, texture, etc.), the ability to control uniformity and the number of final segments using free parameters, and the preservation of small regions, make it possible to apply the algorithm to different applications. Moreover, the independence of each region from other regions in the second layer, and the independence of centers in the third layer, make parallel implementation possible. As a result, the algorithm speed will increase.
The presented algorithm was tested on a standard dataset (BSDS 300) of images, and the region boundaries were compared with the segmentation contours drawn by different people. Results show the efficiency of the algorithm and its improvement over similar methods. For instance, in 70% of the tested images the results are better than those of the ACT algorithm, and in 100% of the tested images we obtained better results than the VSP algorithm.

Submitted 6 December, 2017; v1 submitted 13 September, 2017; originally announced September 2017.

Comments: 17 pages

arXiv:1706.06247 [pdf, other]

Low Resolution Face Recognition Using a Two-Branch Deep Convolutional Neural Network Architecture

Authors: Erfan Zangeneh, Mohammad Rahmati, Yalda Mohsenzadeh

Abstract: We propose a novel couple mappings method for low resolution face recognition using deep convolutional neural networks (DCNNs). The proposed architecture consists of two branches of DCNNs to map the high and low resolution face images into a common space with nonlinear transformations. The branch corresponding to the transformation of high resolution images consists of 14 layers, and the other branch, which maps the low resolution face images to the common space, includes a 5-layer super-resolution network connected to a 14-layer network. The distance between the features of corresponding high and low resolution images is backpropagated to train the networks. Our proposed method is evaluated on the FERET data set and compared with state-of-the-art competing methods. Our extensive experimental results show that the proposed method significantly improves the recognition performance, especially for very low resolution probe face images (11.4% improvement in recognition accuracy). Furthermore, it can reconstruct a high resolution image from its corresponding low resolution probe image which is comparable with state-of-the-art super-resolution methods in terms of visual quality.

Submitted 19 June, 2017; originally announced June 2017.

Comments: 11 pages, 8 figures

arXiv:1301.6599 [pdf, ps, other]

doi 10.1109/ISIT.2013.6620764

An Upper Bound on the Capacity of non-Binary Deletion Channels

Authors: Mojtaba Rahmati, Tolga M. Duman

Abstract: We derive an upper bound on the capacity of non-binary deletion channels. Although binary deletion channels have received significant attention over the years, and many upper and lower bounds on their capacity have been derived, such studies for the non-binary case are largely missing. The state of the art is the following: as a trivial upper bound, capacity of an erasure channel with the same input alphabet as the deletion channel can be used, and as a lower bound the results by Diggavi and Grossglauser are available. In this paper, we derive the first non-trivial non-binary deletion channel capacity upper bound and reduce the gap with the existing achievable rates. To derive the results we first prove an inequality between the capacity of a 2K-ary deletion channel with deletion probability $d$, denoted by $C_{2K}(d)$, and the capacity of the binary deletion channel with the same deletion probability, $C_2(d)$, that is, $C_{2K}(d)\leq C_2(d)+(1-d)\log(K)$. Then by employing some existing upper bounds on the capacity of the binary deletion channel, we obtain upper bounds on the capacity of the 2K-ary deletion channel. We illustrate via examples the use of the new bounds and discuss their asymptotic behavior as $d \rightarrow 0$.

Submitted 8 May, 2013; v1 submitted 28 January, 2013; originally announced January 2013.

Comments: accepted for presentation in ISIT 2013

arXiv:1211.2497 [pdf, ps, other]

A Note on the Deletion Channel Capacity

Authors: Mojtaba Rahmati, Tolga M. Duman

Abstract: Memoryless channels with deletion errors as defined by a stochastic channel matrix allowing for bit drop outs are considered in which transmitted bits are either independently deleted with probability $d$ or unchanged with probability $1-d$. Such channels are information stable, hence their Shannon capacity exists. However, computation of the channel capacity is formidable, and only some upper and lower bounds on the capacity exist. In this paper, we first show a simple result that the parallel concatenation of two different independent deletion channels with deletion probabilities $d_1$ and $d_2$, in which every input bit is either transmitted over the first channel with probability of $λ$ or over the second one with probability of $1-λ$, is nothing but another deletion channel with deletion probability of $d=λd_1+(1-λ)d_2$. We then provide an upper bound on the concatenated deletion channel capacity $C(d)$ in terms of the weighted average of $C(d_1)$, $C(d_2)$ and the parameters of the three channels. An interesting consequence of this bound is that $C(λd_1+(1-λ))\leq λC(d_1)$ which enables us to provide an improved upper bound on the capacity of the i.i.d. deletion channels, i.e., $C(d)\leq 0.4143(1-d)$ for $d\geq 0.65$. This generalizes the asymptotic result by Dalai as it remains valid for all $d\geq 0.65$. Using the same approach we are also able to improve upon existing upper bounds on the capacity of the deletion/substitution channel.

Submitted 11 November, 2012; originally announced November 2012.

Comments: Submitted to the IEEE Transactions on Information Theory

arXiv:1203.6396 [pdf, ps, other]

Achievable Rates for Noisy Channels with Synchronization Errors

Authors: Mojtaba Rahmati, Tolga M. Duman

Abstract: We develop several lower bounds on the capacity of binary input symmetric output channels with synchronization errors which also suffer from other types of impairments such as substitutions, erasures, additive white Gaussian noise (AWGN) etc. More precisely, we show that if the channel with synchronization errors can be decomposed into a cascade of two channels where only the first one suffers from synchronization errors and the second one is a memoryless channel, a lower bound on the capacity of the original channel in terms of the capacity of the synchronization error-only channel can be derived. To accomplish this, we derive lower bounds on the mutual information rate between the transmitted and received sequences (for the original channel) for an arbitrary input distribution, and then relate this result to the channel capacity. The results apply without the knowledge of the exact capacity achieving input distributions. A primary application of our results is that we can employ any lower bound derived on the capacity of the first channel (synchronization error channel in the decomposition) to find lower bounds on the capacity of the (original) noisy channel with synchronization errors. We apply the general ideas to several specific classes of channels such as synchronization error channels with erasures and substitutions, with symmetric q-ary outputs and with AWGN explicitly, and obtain easy-to-compute bounds.
We illustrate that, with our approach, it is possible to derive tighter capacity lower bounds compared to the currently available bounds in the literature for certain classes of channels, e.g., deletion/substitution channels and deletion/AWGN channels (for certain signal to noise ratio (SNR) ranges).

Submitted 28 March, 2012; originally announced March 2012.

Comments: Submitted to the IEEE Transactions on Information Theory

arXiv:1101.1310 [pdf, ps, other]

doi 10.1109/TIT.2013.2262019

Bounds on the Capacity of Random Insertion and Deletion-Additive Noise Channels

Authors: Mojtaba Rahmati, Tolga M. Duman

Abstract: We develop several analytical lower bounds on the capacity of binary insertion and deletion channels by considering independent uniformly distributed (i.u.d.) inputs and computing lower bounds on the mutual information between the input and output sequences. For the deletion channel, we consider two different models: the independent and identically distributed (i.i.d.) deletion-substitution channel and the i.i.d. deletion channel with additive white Gaussian noise (AWGN). These two models are considered to incorporate the effects of the channel noise along with the synchronization errors. For the insertion channel case, we consider Gallager's model, in which the transmitted bits are replaced with two random bits, uniform over the four possibilities and independent of any other insertion events. The general approach taken is similar in all cases; however, the specific computations differ. Furthermore, the approach yields a useful lower bound on the capacity for a wide range of deletion probabilities for the deletion channels, while it provides a beneficial bound only for small insertion probabilities (less than 0.25) for the insertion model adopted. We emphasize the importance of these results by noting that 1) our results are the first analytical bounds on the capacity of deletion-AWGN channels, 2) the results developed are the best available analytical lower bounds on the deletion-substitution case, and 3) for the Gallager insertion channel model, the new lower bound improves the existing results for small insertion probabilities.

Submitted 2 May, 2013; v1 submitted 6 January, 2011; originally announced January 2011.

Comments: Accepted for publication in IEEE Transactions on Information Theory

Journal ref: IEEE Transactions on Information Theory, vol. 59, no. 9, pp. 5534-5546, Sept. 2013

Showing 1–24 of 24 results for author: Rahmati, M