doi 10.1016/j.cam.2018.07.045

The Gamma Generalized Normal Distribution: A Descriptor of SAR Imagery

Authors: G. M. Cordeiro, R. J. Cintra, L. C. Rêgo, A. D. C. Nascimento

Abstract: We propose a new four-parameter distribution for modeling synthetic aperture radar (SAR) imagery, named the gamma generalized normal (GGN) distribution, obtained by combining the gamma and generalized normal distributions. A mathematical characterization of the new distribution is provided by identifying the limit behavior and by calculating the density and moment expansions. The GGN model performance is evaluated on both synthetic and actual data and, for that, maximum likelihood estimation and random number generation are discussed. The proposed distribution is compared with the beta generalized normal (BGN) distribution, which has already been shown to appropriately represent SAR imagery. The performance of these two distributions is measured by means of statistics which provide evidence that the GGN can outperform the BGN distribution in some contexts.

Submitted 3 June, 2022; originally announced June 2022.

Comments: 21 pages, 6 figures, 6 tables

Journal ref: Journal of Computational and Applied Mathematics, vol. 347, pages 257-272, February 2019

arXiv:2003.05703 [pdf, other]

Inline Detection of DGA Domains Using Side Information

Authors: Raaghavi Sivaguru, Jonathan Peck, Femi Olumofin, Anderson Nascimento, Martine De Cock

Abstract: Malware applications typically use a command and control (C&C) server to manage bots to perform malicious activities. Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names that can be used to establish communication between an infected bot and the C&C server. In recent years, machine learning based systems have been widely used to detect DGAs. There are several well known state-of-the-art classifiers in the literature that can detect DGA domain names in real-time applications with high predictive performance. However, these DGA classifiers are highly vulnerable to adversarial attacks in which adversaries purposely craft domain names to evade DGA detection classifiers. In our work, we focus on hardening DGA classifiers against adversarial attacks. To this end, we train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself. Additionally, the side information features are selected such that they are easily obtainable in practice to perform inline DGA detection. The performance and robustness of these models are assessed by exposing them to one day of real-traffic data as well as domains generated by adversarial attack algorithms. We found that the DGA classifiers that rely on both the domain name and side information have high performance and are more robust against adversaries.

Submitted 12 March, 2020; originally announced March 2020.

arXiv:1905.01078 [pdf, other]

CharBot: A Simple and Effective Method for Evading DGA Classifiers

Authors: Jonathan Peck, Claire Nie, Raaghavi Sivaguru, Charles Grumer, Femi Olumofin, Bin Yu, Anderson Nascimento, Martine De Cock

Abstract: Domain generation algorithms (DGAs) are commonly leveraged by malware to create lists of domain names which can be used for command and control (C&C) purposes. Approaches based on machine learning have recently been developed to automatically detect generated domain names in real-time. In this work, we present a novel DGA called CharBot which is capable of producing large numbers of unregistered domain names that are not detected by state-of-the-art classifiers for real-time detection of DGAs, including the recently published methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep learning approach). CharBot is very simple, effective and requires no knowledge of the targeted DGA classifiers. We show that retraining the classifiers on CharBot samples is not a viable defense strategy. We believe these findings show that DGA classifiers are inherently vulnerable to adversarial attacks if they rely only on the domain name string to make a decision. Designing a robust DGA classifier may, therefore, necessitate the use of additional information besides the domain name alone. To the best of our knowledge, CharBot is the simplest and most efficient black-box adversarial attack against DGA classifiers proposed to date.

Submitted 30 May, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

arXiv:1712.01824 [pdf, ps, other]

A new extended Cardioid model: an application to wind data

Authors: Fernanda V. Paula, Abraão D. C. Nascimento, Getúlio J. A. Amaral

Abstract: The Cardioid distribution is a relevant model for circular data. However, this model is not suitable for scenarios where there is asymmetry or multimodality. In order to overcome this gap, an extended Cardioid model is proposed, which is called the Exponentiated Cardioid (EC) distribution. Besides, some of its properties are derived, such as trigonometric moments, kurtosis and skewness. A discussion about the modality and expressions for the quantiles through approximations of the studied model are also presented. To fit the EC model, two estimation methods are presented based on maximum likelihood and quantile least squares procedures. The performance of the proposed estimators is evaluated in a Monte Carlo simulation study, adopting both bias and mean square error as comparison criteria. Finally, the proposed model is applied to a dataset in the wind direction context. Results indicate that the EC distribution may outperform the Cardioid and von Mises distributions.

Submitted 5 December, 2017; originally announced December 2017.

arXiv:1404.4880 [pdf, other]

doi 10.1109/TGRS.2013.2285927

Bias Correction and Modified Profile Likelihood under the Wishart Complex Distribution

Authors: Abraão D. C. Nascimento, Alejandro C. Frery, Renato J. Cintra

Abstract: This paper proposes improved methods for the maximum likelihood (ML) estimation of the equivalent number of looks $L$. This parameter has a meaningful interpretation in the context of polarimetric synthetic aperture radar (PolSAR) images. Due to the presence of coherent illumination in their processing, PolSAR systems generate images which present a granular noise called speckle. As a potential solution for reducing such interference, the parameter $L$ controls the signal-to-noise ratio. Thus, the proposal of efficient estimation methodologies for $L$ has been sought. To that end, we first consider that a PolSAR image is well described by the scaled complex Wishart distribution. In recent years, Anfinsen et al. derived and analyzed estimation methods based on the ML and on trace statistical moments for obtaining the parameter $L$ of the unscaled version of such probability law. This paper generalizes that approach. We present the second-order bias expression proposed by Cox and Snell for the ML estimator of this parameter. Moreover, the formula of the profile likelihood modified by Barndorff-Nielsen in terms of $L$ is discussed. Such derivations yield two new ML estimators for the parameter $L$, which are compared to the estimators proposed by Anfinsen et al. The performance of these estimators is assessed by means of Monte Carlo experiments, adopting three statistical measures as comparison criteria: the mean square error, the bias, and the coefficient of variation. In agreement with the simulation study, an application to actual PolSAR data concludes that the proposed estimators outperform all the others in homogeneous scenarios.

Submitted 18 April, 2014; originally announced April 2014.

Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 52, issue 8, pages 4932-4941, August 2014

arXiv:1402.1876 [pdf, other]

Information Theory and Image Understanding: An Application to Polarimetric SAR Imagery

Authors: A. C. Frery, A. D. C. Nascimento, R. J. Cintra

Abstract: This work presents a comprehensive examination of the use of information theory for understanding Polarimetric Synthetic Aperture Radar (PolSAR) images by means of contrast measures that can be used as test statistics. Due to the phenomenon called 'speckle', common to all images obtained with coherent illumination such as PolSAR imagery, accurate modelling is required in their processing and analysis. The scaled multilook complex Wishart distribution has proven to be a successful approach for modelling radar backscatter from forest and pasture areas. Classification, segmentation, and image analysis techniques which depend on this model have been devised, and many of them employ some kind of dissimilarity measure. Specifically, we introduce statistical tests for analyzing contrast in such images. These tests are based on the chi-square, Kullback-Leibler, Rényi, Bhattacharyya, and Hellinger distances. Results obtained by Monte Carlo experiments reveal the Kullback-Leibler distance as the best one with respect to the empirical test sizes under several situations which include pure and contaminated data. The proposed methodology was applied to actual data, obtained by an E-SAR sensor over surroundings of Weßling, Bavaria, Germany.

Submitted 8 February, 2014; originally announced February 2014.

Comments: 15 pages, 11 figures

Journal ref: Chilean Journal of Statistics, Vol. 2, No. 2, September 2011, 81-100

arXiv:1304.5417 [pdf, other]

doi 10.1109/TGRS.2013.2248737

Analytic Expressions for Stochastic Distances Between Relaxed Complex Wishart Distributions

Authors: Alejandro C. Frery, Abraão D. C. Nascimento, Renato J. Cintra

Abstract: The scaled complex Wishart distribution is a widely used model for multilook full polarimetric SAR data whose adequacy has been attested in the literature. Classification, segmentation, and image analysis techniques which depend on this model have been devised, and many of them employ some type of dissimilarity measure. In this paper we derive analytic expressions for four stochastic distances between relaxed scaled complex Wishart distributions in their most general form and in important particular cases. Using these distances, inequalities are obtained which lead to new ways of deriving the Bartlett and revised Wishart distances. The expressiveness of the four analytic distances is assessed with respect to the variation of parameters. Such distances are then used for deriving new test statistics, which are proved to have an asymptotic chi-square distribution. Adopting the test size as a comparison criterion, a sensitivity study is performed by means of Monte Carlo experiments suggesting that the Bhattacharyya statistic outperforms all the others. The power of the tests is also assessed. Applications to actual data illustrate the discrimination and homogeneity identification capabilities of these distances.

Submitted 19 April, 2013; originally announced April 2013.

Comments: Accepted for publication in the IEEE Transactions on Geoscience and Remote Sensing journal

arXiv:1210.4154 [pdf, other]

doi 10.1109/TGRS.2012.2222029

Entropy-based Statistical Analysis of PolSAR Data

Authors: Alejandro C. Frery, Renato J. Cintra, Abraão D. C. Nascimento

Abstract: Images obtained from coherent illumination processes are contaminated with speckle noise, with polarimetric synthetic aperture radar (PolSAR) imagery as a prominent example. With an adequacy widely attested in the literature, the scaled complex Wishart distribution is an acceptable model for PolSAR data. In this perspective, we derive analytic expressions for the Shannon, Rényi, and restricted Tsallis entropies under this model. Relationships between the derived measures and the parameters of the scaled Wishart law (i.e., the equivalent number of looks and the covariance matrix) are discussed. In addition, we obtain the asymptotic variances of the Shannon and Rényi entropies when replacing distribution parameters by maximum likelihood estimators. As a consequence, confidence intervals based on these two entropies are also derived and proposed as new ways of capturing contrast. New hypothesis tests are additionally proposed using these results, and their performance is assessed using simulated and real data. In general terms, the test based on the Shannon entropy outperforms those based on Rényi's.

Submitted 15 October, 2012; originally announced October 2012.

Comments: Accepted for publication in IEEE Transactions on Geoscience and Remote Sensing

arXiv:1207.2959 [pdf, other]

doi 10.1109/TGRS.2009.2025498

Hypothesis Testing in Speckled Data with Stochastic Distances

Authors: Abraão D. C. Nascimento, Renato J. Cintra, Alejandro C. Frery

Abstract: Images obtained with coherent illumination, as is the case of sonar, ultrasound-B, laser and synthetic aperture radar (SAR), are affected by speckle noise which reduces the ability to extract information from the data. Specialized techniques are required to deal with such imagery, which has been modeled by the G0 distribution and under which regions with different degrees of roughness and mean brightness can be characterized by two parameters; a third parameter, the number of looks, is related to the overall signal-to-noise ratio. Assessing distances between samples is an important step in image analysis; they provide grounds for assessing the separability and, therefore, the performance of classification procedures. This work derives and compares eight stochastic distances and assesses the performance of hypothesis tests that employ them and maximum likelihood estimation. We conclude that tests based on the triangular distance have the closest empirical size to the theoretical one, while those based on the arithmetic-geometric distances have the best power. Since the power of tests based on the triangular distance is close to optimum, we conclude that the safest choice is using this distance for hypothesis testing, even when compared with classical distances such as Kullback-Leibler and Bhattacharyya.

Submitted 12 July, 2012; originally announced July 2012.

Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 48, p. 373-385, 2010
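
Illustrative note: the abstract above compares hypothesis tests built on stochastic distances such as the triangular, Kullback-Leibler and Bhattacharyya distances. The sketch below is not the authors' method, which works with continuous G0 densities and maximum likelihood estimates. It only evaluates commonly used forms of four such distances on two discrete probability vectors, an assumption made here to keep the example short. The function name `stochastic_distances` and the dictionary keys are hypothetical.

```python
import math


def stochastic_distances(p, q):
    """Distances between two discrete distributions on the same support.

    Assumption: p and q are sequences of strictly positive probabilities
    of equal length, each summing to 1. Sums stand in for the integrals
    that appear in the continuous definitions.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    if any(a <= 0 or b <= 0 for a, b in zip(p, q)):
        raise ValueError("all probabilities must be strictly positive")

    # Symmetrized Kullback-Leibler distance: 1/2 * sum (p - q) * log(p / q)
    kl = 0.5 * sum((a - b) * math.log(a / b) for a, b in zip(p, q))

    # Bhattacharyya coefficient, clipped at 1 to guard against rounding
    bc = min(sum(math.sqrt(a * b) for a, b in zip(p, q)), 1.0)

    # Triangular distance: sum (p - q)^2 / (p + q)
    tri = sum((a - b) ** 2 / (a + b) for a, b in zip(p, q))

    return {
        "kullback_leibler": kl,
        "bhattacharyya": -math.log(bc),
        "hellinger": 1.0 - bc,
        "triangular": tri,
    }


if __name__ == "__main__":
    d = stochastic_distances([0.2, 0.8], [0.6, 0.4])
    for name, value in d.items():
        print(f"{name}: {value:.4f}")
```

All four quantities are zero when the two distributions coincide and are symmetric in their arguments, which is what allows them to serve as contrast measures between image regions.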

arXiv:1207.2378 [pdf, other]

doi 10.1007/s10044-011-0249-3

Parametric and Nonparametric Tests for Speckled Imagery

Authors: Renato J. Cintra, Abraão D. C. Nascimento, Alejandro C. Frery

Abstract: Synthetic aperture radar (SAR) has a pivotal role as a remote imaging method. Obtained by means of coherent illumination, SAR images are contaminated with speckle noise. The statistical modeling of such contamination is well described according to the multiplicative model and its implied G0 distribution. The understanding of SAR imagery and scene element identification is an important objective in the field. In particular, reliable image contrast tools are sought. Aiming to propose new tools for evaluating SAR image contrast, we investigated new methods based on stochastic divergence. We propose several divergence measures specifically tailored for G0 distributed data. We also introduce a nonparametric approach based on the Kolmogorov-Smirnov distance for G0 data. We devised and assessed tests based on such measures, and their performances were quantified according to their test sizes and powers. Using Monte Carlo simulation, we present a robustness analysis of test statistics and of maximum likelihood estimators for several degrees of innovative contamination. It was identified that the proposed tests based on triangular and arithmetic-geometric measures outperformed the Kolmogorov-Smirnov methodology.

Submitted 10 July, 2012; originally announced July 2012.

Comments: Accepted for publication in the Pattern Analysis and Applications journal

Showing 1–10 of 10 results for author: Nascimento, A