OPSD: an Offensive Persian Social media Dataset and its baseline evaluations

Authors: Mehran Safayani, Amir Sartipi, Amir Hossein Ahmadi, Parniyan Jalali, Amir Hossein Mansouri, Mohammad Bisheh-Niasar, Zahra Pourbahman

Abstract: Hate speech and offensive comments on social media have become increasingly prevalent due to user activities. Such comments can have detrimental effects on individuals' psychological well-being and social behavior. While numerous datasets in the English language exist in this domain, few equivalent resources are available for the Persian language. To address this gap, this paper introduces two offensive datasets. The first dataset comprises annotations provided by domain experts, while the second consists of a large collection of unlabeled data obtained through web crawling for unsupervised learning purposes. To ensure the quality of the former dataset, a meticulous three-stage labeling process was conducted, and kappa measures were computed to assess inter-annotator agreement. Furthermore, experiments were performed on the dataset using state-of-the-art language models, both with and without employing masked language modeling techniques, as well as machine learning algorithms, in order to establish baselines for the dataset. The obtained F1-scores for the three-class and two-class versions of the dataset were 76.9% and 89.9% for XLM-RoBERTa, respectively.

Submitted 8 April, 2024; originally announced April 2024.

Comments: 16 pages, 5 figures, 8 tables
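
The inter-annotator agreement mentioned in the OPSD abstract above can be illustrated with a minimal sketch. The abstract does not say which kappa variant was used, so this assumes Cohen's kappa for two annotators; the function name `cohen_kappa` and the example labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items (assumed variant)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical three-class labels, invented for illustration only.
annotator_1 = ["offensive", "neutral", "hate", "neutral", "offensive", "neutral"]
annotator_2 = ["offensive", "neutral", "offensive", "neutral", "offensive", "hate"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual choice instead.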

arXiv:2401.11798 [pdf, other]

Knowledge Distillation on Spatial-Temporal Graph Convolutional Network for Traffic Prediction

Authors: Mohammad Izadi, Mehran Safayani, Abdolreza Mirzaei

Abstract: Efficient real-time traffic prediction is crucial for reducing transportation time. To predict traffic conditions, we employ a spatio-temporal graph neural network (ST-GNN) to model our real-time traffic data as temporal graphs. Despite its capabilities, it often encounters challenges in delivering efficient real-time predictions for real-world traffic data. Recognizing the significance of timely prediction due to the dynamic nature of real-time data, we employ knowledge distillation (KD) as a solution to reduce the execution time of ST-GNNs for traffic prediction. In this paper, we introduce a cost function designed to train a network with fewer parameters (the student) using distilled data from a complex network (the teacher) while maintaining its accuracy close to that of the teacher. We use knowledge distillation, incorporating spatial-temporal correlations from the teacher network to enable the student to learn the complex patterns perceived by the teacher. However, a challenge arises in determining the student network architecture rather than considering it inadvertently. To address this challenge, we propose an algorithm that utilizes the cost function to calculate pruning scores, addressing small network architecture search issues, and jointly fine-tunes the network resulting from each pruning stage using KD. Ultimately, we evaluate our proposed ideas on two real-world datasets, PeMSD7 and PeMSD8. The results indicate that our method can maintain the student's accuracy close to that of the teacher, even with the retention of only 3% of network parameters.

Submitted 28 January, 2024; v1 submitted 22 January, 2024; originally announced January 2024.
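
The teacher-student idea in the abstract above can be sketched with a generic distillation objective for regression. This is not the authors' cost function, which the abstract does not specify and which also incorporates spatial-temporal correlations; the function `distillation_loss`, the weight `alpha`, and the numbers below are assumptions made for illustration.

```python
import numpy as np


def distillation_loss(student_pred, teacher_pred, target, alpha=0.5):
    """Generic KD loss for regression: weighted ground-truth and teacher terms."""
    student_pred = np.asarray(student_pred, dtype=float)
    # Hard term: squared error of the student against the ground truth.
    hard = np.mean((student_pred - np.asarray(target, dtype=float)) ** 2)
    # Soft term: squared error of the student against the teacher's output.
    soft = np.mean((student_pred - np.asarray(teacher_pred, dtype=float)) ** 2)
    return alpha * hard + (1.0 - alpha) * soft


# Hypothetical predictions for two sensors, invented for illustration.
loss = distillation_loss(
    student_pred=[1.0, 2.0],
    teacher_pred=[1.0, 4.0],
    target=[2.0, 2.0],
    alpha=0.25,
)
```

A smaller `alpha` pushes the student toward imitating the teacher, while `alpha = 1` reduces to ordinary supervised training.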

arXiv:2311.02903 [pdf, other]

HDGL: A hierarchical dynamic graph representation learning model for brain disorder classification

Authors: Parniyan Jalali, Mehran Safayani

Abstract: The human brain can be considered as a complex network, composed of various regions that continuously exchange information with each other, forming the brain network graph, from which nodes and edges are extracted using resting-state functional magnetic resonance imaging (rs-fMRI). Therefore, this graph can potentially depict abnormal patterns that have emerged under the influence of brain disorders. So far, numerous studies have attempted to find embeddings for brain network graphs and subsequently classify samples with brain disorders from healthy ones, but they have limitations such as not considering the relationship between samples, not utilizing phenotype information, lacking temporal analysis, using static functional connectivity (FC) instead of dynamic FC, and using a fixed graph structure. We propose a hierarchical dynamic graph representation learning (HDGL) model, which is the first model designed to address all the aforementioned challenges. HDGL consists of two levels, where at the first level, it constructs brain network graphs and learns their spatial and temporal embeddings, and at the second level, it forms population graphs and performs classification after embedding learning. Furthermore, based on how these two levels are trained, four methods have been introduced, some of which are suggested for reducing memory complexity. We evaluated the performance of the proposed model on the ABIDE and ADHD-200 datasets, and the results indicate the improvement of this model compared to several state-of-the-art models in terms of various evaluation metrics.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2303.13665 [pdf, other]

Clustering based on Mixtures of Sparse Gaussian Processes

Authors: Zahra Moslehi, Abdolreza Mirzaei, Mehran Safayani

Abstract: Creating low dimensional representations of a high dimensional data set is an important component in many machine learning applications. How to cluster data using their low dimensional embedded space is still a challenging problem in machine learning. In this article, we focus on proposing a joint formulation for both clustering and dimensionality reduction. When a probabilistic model is desired, one possible solution is to use mixture models in which both the cluster indicator and the low dimensional space are learned. Our algorithm is based on a mixture of sparse Gaussian processes, which is called Sparse Gaussian Process Mixture Clustering (SGP-MIC). The main advantages of our approach are that its probabilistic nature offers benefits over existing deterministic methods, it is straightforward to construct non-linear generalizations of the model, and applying a sparse model and an efficient variational EM approximation helps to speed up the algorithm.

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2205.05168 [pdf, other]

Deep Graph Clustering via Mutual Information Maximization and Mixture Model

Authors: Maedeh Ahmadi, Mehran Safayani, Abdolreza Mirzaei

Abstract: Attributed graph clustering, or community detection, which learns to cluster the nodes of a graph, is a challenging task in graph analysis. In this paper, we introduce a contrastive learning framework for learning clustering-friendly node embedding. Although graph contrastive learning has shown outstanding performance in self-supervised graph learning, using it for graph clustering is not well explored. We propose Gaussian mixture information maximization (GMIM) which utilizes a mutual information maximization approach for node embedding. Meanwhile, it assumes that the representation space follows a Mixture of Gaussians (MoG) distribution. The clustering part of our objective tries to fit a Gaussian distribution to each community. The node embedding is jointly optimized with the parameters of MoG in a unified framework. Experiments on real-world datasets demonstrate the effectiveness of our method in community detection.

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2110.11870 [pdf]

Biomedical text summarization using Conditional Generative Adversarial Network (CGAN)

Authors: Seyed Vahid Moravvej, Abdolreza Mirzaei, Mehran Safayani

Abstract: Text summarization in medicine can help doctors reduce the time needed to access important information from countless documents. The paper offers a supervised extractive summarization method based on conditional generative adversarial networks using convolutional neural networks. Unlike previous models, which often use greedy methods to select sentences, we use a new approach for selecting sentences. Moreover, we provide a network for biomedical word embedding, which improves summarization. An essential contribution of the paper is introducing a new loss function for the discriminator, making the discriminator perform better. The proposed model achieves results comparable to the state-of-the-art approaches, as determined by the ROUGE metric. Experiments on the medical dataset show that the proposed method works on average 5% better than the competing models and is more similar to the reference summaries.

Submitted 17 September, 2021; originally announced October 2021.

Comments: 12 pages, to appear in artificial intelligence in medicine journal

arXiv:1711.03736 [pdf, other]

doi 10.1007/s11042-019-7427-5

Joint Sentiment/Topic Modeling on Text Data Using Boosted Restricted Boltzmann Machine

Authors: Masoud Fatemi, Mehran Safayani

Abstract: With the recent development of the Internet and the Web, different types of social media such as web blogs have become an immense source of text data. By processing these data, it is possible to discover practical information about different topics and individuals' opinions, and to gain a thorough understanding of society. Therefore, applying models which can automatically extract subjective information from documents would be efficient and helpful. Topic modeling and sentiment analysis are among the most discussed topics in the natural language processing and text mining fields. In this paper, a new structure for joint sentiment-topic modeling based on the Restricted Boltzmann Machine (RBM), a type of neural network, is proposed. By modifying the structure of the RBM and appending a layer analogous to the sentiment of the text data, we propose a generative structure for joint sentiment-topic modeling based on neural networks. The proposed method is supervised and trained by the Contrastive Divergence algorithm. The newly attached layer in the proposed model has a multinomial probability distribution and can be used in text data sentiment classification or any other supervised application. The proposed model is compared with existing models in experiments such as evaluation as a generative model, sentiment classification, and information retrieval, and the corresponding results demonstrate the efficiency of the method.

Submitted 10 November, 2017; originally announced November 2017.

arXiv:1708.01519 [pdf, other]

doi 10.1007/s00500-020-04906-8

A Latent Variable Model for Two-Dimensional Canonical Correlation Analysis and its Variational Inference

Authors: Mehran Safayani, Saeid Momenzadeh

Abstract: Describing dimension reduction (DR) techniques by means of probabilistic models has recently been given special attention. Probabilistic models, in addition to giving better interpretability of the DR methods, provide a framework for further extensions of such algorithms. One of the new approaches to probabilistic DR methods is to preserve the internal structure of the data. That is, the data need not first be converted from the matrix or tensor format to the vector format in the process of dimensionality reduction. In this paper, a latent variable model for matrix-variate data for canonical correlation analysis (CCA) is proposed. Since in general there is no analytical maximum likelihood solution for this model, we present two approaches for learning the parameters. The proposed methods are evaluated using synthetic data in terms of convergence and quality of mappings. Also, a real data set is employed for assessing the proposed methods against several probabilistic and non-probabilistic CCA based approaches. The results confirm the superiority of the proposed methods with respect to the competing algorithms. Moreover, this model can be considered as a framework for further extensions.

Submitted 4 August, 2017; originally announced August 2017.

arXiv:1702.07884 [pdf, other]

doi 10.1007/s10489-017-1012-2

An EM Based Probabilistic Two-Dimensional CCA with Application to Face Recognition

Authors: Mehran Safayani, Seyed Hashem Ahmadi, Homayun Afrabandpey, Abdolreza Mirzaei

Abstract: Recently, two-dimensional canonical correlation analysis (2DCCA) has been successfully applied for image feature extraction. Instead of concatenating the columns of the images into one-dimensional vectors, the method works directly with two-dimensional image matrices. Although 2DCCA works well in different recognition tasks, it lacks a probabilistic interpretation. In this paper, we present a probabilistic framework for 2DCCA called probabilistic 2DCCA (P2DCCA) and an iterative EM based algorithm for optimizing the parameters. Experimental results on synthetic and real data demonstrate superior performance in loading factor estimation for P2DCCA compared to 2DCCA. For real data, three subsets of the AR face database and also the UMIST face database confirm the robustness of the proposed algorithm in face recognition tasks with different illumination conditions, facial expressions, poses and occlusions.

Submitted 25 February, 2017; originally announced February 2017.

arXiv:1701.00474 [pdf, other]

Duplicate matching and estimating features for detection of copy-move images forgery

Authors: Ghassem Alikhajeh, Abdolreza Mirzaei, Mehran Safayani, Meysam Ghaffari

Abstract: Copy-move forgery is the most popular and simplest image manipulation method. In this type of forgery, an area of the image is copied and then, after post-processing such as rotation and scaling, placed at the destination. The goal of copy-move forgery is to hide or duplicate one or more objects in the image. Key-point based copy-move forgery detection methods have five main steps: preprocessing, feature extraction, matching, transform estimation, and post-processing, of which matching and transform estimation have an important effect on detection. Moreover, errors can occur in some steps due to noise. Existing methods process these steps separately, so an error in one step can propagate to the following steps and affect detection. To solve this problem, in this paper the steps of the detection system interact with each other, and if an error happens in a step, the following steps try to detect and correct it. We formulate this interaction by defining and optimizing a cost function that includes the matching and transform estimation steps. Then, in an iterative procedure, the steps are executed and any detected error is corrected. The efficiency of the proposed method is analyzed in diverse cases such as pixel-level precision on simple forgery images, robustness to rotation and scaling, detection of professional forgery images, and the precision of the transformation matrix. The results indicate the better efficiency of the proposed method.

Submitted 2 January, 2017; originally announced January 2017.

arXiv:1010.4951

Local Component Analysis for Nonparametric Bayes Classifier

Authors: Mahmoud Khademi, Mohammad T. Manzuri-Shalmani, Mehran Safayani

Abstract: The decision boundaries of the Bayes classifier are optimal because they lead to the maximum probability of a correct decision. That is, if we knew the prior probabilities and the class-conditional densities, we could design a classifier which gives the lowest probability of error. However, in classification based on nonparametric density estimation methods such as Parzen windows, the decision regions depend on the choice of parameters such as window width. Moreover, these methods suffer from the curse of dimensionality of the feature space and the small sample size problem, which severely restrict their practical applications. In this paper, we address these problems by introducing a novel dimension reduction and classification method based on local component analysis. In this method, by adopting an iterative cross-validation algorithm, we simultaneously estimate the optimal transformation matrices (for dimension reduction) and classifier parameters based on local information. The proposed method can classify data with complicated boundaries and also alleviate the curse of dimensionality dilemma. Experiments on real data show the superiority of the proposed algorithm in terms of classification accuracy for pattern classification applications like age, facial expression and character recognition. Keywords: Bayes classifier, curse of dimensionality dilemma, Parzen window, pattern classification, subspace learning.

Submitted 19 July, 2012; v1 submitted 24 October, 2010; originally announced October 2010.

Comments: This paper has been withdrawn by the author due to an error in experimental results

arXiv:1004.0755 [pdf]

Extended Two-Dimensional PCA for Efficient Face Representation and Recognition

Authors: Mehran Safayani, Mohammad T. Manzuri-Shalmani, Mahmoud Khademi

Abstract: In this paper a novel method called Extended Two-Dimensional PCA (E2DPCA) is proposed which is an extension to the original 2DPCA. We state that the covariance matrix of 2DPCA is equivalent to the average of the main diagonal of the covariance matrix of PCA. This implies that 2DPCA eliminates some covariance information that can be useful for recognition. Instead of just using the main diagonal, E2DPCA considers a radius of r diagonals around it and expands the averaging so as to include the covariance information within those diagonals. The parameter r unifies PCA and 2DPCA: r = 1 produces the covariance of 2DPCA, and r = n that of PCA. Hence, by controlling r it is possible to control the trade-offs between recognition accuracy and energy compression (fewer coefficients), and between training and recognition complexity. Experiments on the ORL face database show improvement in both recognition accuracy and recognition time over the original 2DPCA.

Submitted 5 April, 2010; originally announced April 2010.

Comments: Proc. of 4th International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, pp. 295--298, 2008.
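
The relation between the 2DPCA and PCA covariance matrices stated in the abstract above can be checked numerically. The sketch below is an illustration under assumptions, not the paper's code: it reads "main diagonal" as the diagonal blocks of the PCA covariance of row-major vectorized images, uses random data with hypothetical sizes, and does not implement the extended averaging controlled by r.

```python
import numpy as np

rng = np.random.default_rng(0)
num_images, m, n = 20, 4, 3  # hypothetical sizes: 20 images of 4 x 3 pixels
images = rng.normal(size=(num_images, m, n))
centered = images - images.mean(axis=0)

# 2DPCA image covariance: (1/N) * sum_i A_i^T A_i, an n x n matrix.
cov_2dpca = np.einsum("imj,imk->jk", centered, centered) / num_images

# PCA covariance of the row-major vectorized images: (m*n) x (m*n).
vectors = centered.reshape(num_images, m * n)
cov_pca = vectors.T @ vectors / num_images

# Sum of the m diagonal n x n blocks of the PCA covariance.
block_sum = sum(cov_pca[k * n:(k + 1) * n, k * n:(k + 1) * n] for k in range(m))

# The two agree, so the 2DPCA covariance is m times the block average.
matches = np.allclose(cov_2dpca, block_sum)
```

Under this reading, the off-diagonal blocks of the PCA covariance are exactly the information that 2DPCA discards.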

arXiv:1004.0517 [pdf]

Multilinear Biased Discriminant Analysis: A Novel Method for Facial Action Unit Representation

Authors: Mahmoud Khademi, Mehran Safayani, Mohammad T. Manzuri-Shalmani

Abstract: In this paper a novel efficient method for representation of facial action units by encoding an image sequence as a fourth-order tensor is presented. The multilinear tensor-based extension of the biased discriminant analysis (BDA) algorithm, called multilinear biased discriminant analysis (MBDA), is first proposed. Then, we apply the MBDA and two-dimensional BDA (2DBDA) algorithms, as the dimensionality reduction techniques, to Gabor representations and the geometric features of the input image sequence respectively. The proposed scheme can deal with the asymmetry between positive and negative samples as well as the curse of dimensionality dilemma. Extensive experiments on the Cohn-Kanade database show the superiority of the proposed method for representation of the subtle changes and the temporal information involved in the formation of facial expressions. As an accurate tool, this representation can be applied to many areas such as recognition of spontaneous and deliberate facial expressions, multi modal/media human computer interaction and lie detection efforts.

Submitted 4 April, 2010; originally announced April 2010.

Comments: Proc. of 16th Korea-Japan Joint Workshop on Frontiers of Computer Vision, Hiroshima, Japan, 2010.

arXiv:1004.0378

Facial Expression Representation and Recognition Using 2DHLDA, Gabor Wavelets, and Ensemble Learning

Authors: Mahmoud Khademi, Mohammad H. Kiapour, Mehran Safayani, Mohammad T. Manzuri, M. Shojaei

Abstract: In this paper, a novel method for representation and recognition of facial expressions in two-dimensional image sequences is presented. We apply a variation of the two-dimensional heteroscedastic linear discriminant analysis (2DHLDA) algorithm, as an efficient dimensionality reduction technique, to the Gabor representation of the input sequence. 2DHLDA is an extension of the two-dimensional linear discriminant analysis (2DLDA) approach and it removes the assumption of equal within-class covariance. By applying 2DHLDA in two directions, we eliminate the correlations between both image columns and image rows. Then, we perform a one-dimensional LDA on the new features. This combined method can alleviate the small sample size problem and instability encountered by HLDA. Also, employing both geometric and appearance features and using an ensemble learning scheme based on data fusion, we create a classifier which can efficiently classify the facial expressions. The proposed method is robust to illumination changes and it can properly represent temporal information as well as subtle changes in facial muscles. We provide experiments on the Cohn-Kanade database that show the superiority of the proposed method. Keywords: two-dimensional heteroscedastic linear discriminant analysis (2DHLDA), subspace learning, facial expression analysis, Gabor wavelets, ensemble learning.

Submitted 19 July, 2012; v1 submitted 2 April, 2010; originally announced April 2010.

Comments: This paper has been withdrawn by the author due to an error in experimental results

ACM Class: I.5

Showing 1–14 of 14 results for author: safayani, M