Spectral Clustering of Categorical and Mixed-type Data via Extra Graph Nodes

Authors: Dylan Soemitro, Jeova Farias Sales Rocha Neto

Abstract: Clustering data objects into homogeneous groups is one of the most important tasks in data mining. Spectral clustering is arguably one of the most important algorithms for clustering, as it is appealing for its theoretical soundness and is adaptable to many real-world data settings. For example, mixed data, where the data is composed of numerical and categorical features, is typically handled via numerical discretization, dummy coding, or similarity computation that takes into account both data types. This paper explores a more natural way to incorporate both numerical and categorical information into the spectral clustering algorithm, avoiding the need for data preprocessing or the use of sophisticated similarity functions. We propose adding extra nodes corresponding to the different categories the data may belong to and show that this leads to an interpretable clustering objective function. Furthermore, we demonstrate that this simple framework leads to a linear-time spectral clustering algorithm for categorical-only data. Finally, we compare the performance of our algorithms against other related methods and show that they provide a competitive alternative in terms of performance and runtime.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2402.09786

Examining Pathological Bias in a Generative Adversarial Network Discriminator: A Case Study on a StyleGAN3 Model

Authors: Alvin Grissom II, Ryan F. Lei, Matt Gusdorff, Jeova Farias Sales Rocha Neto, Bailey Lin, Ryan Trotter

Abstract: Generative adversarial networks (GANs) generate photorealistic faces that are often indistinguishable by humans from real faces. While biases in machine learning models are often assumed to be due to biases in training data, we find pathological internal color and luminance biases in the discriminator of a pre-trained StyleGAN3-r model that are not explicable by the training data. We also find that the discriminator systematically stratifies scores by both image- and face-level qualities and that this disproportionately affects images across gender, race, and other categories. We examine axes common in research on stereotyping in social psychology.

Submitted 12 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

arXiv:2311.01475

Patch-Based Deep Unsupervised Image Segmentation using Graph Cuts

Authors: Isaac Wasserman, Jeova Farias Sales Rocha Neto

Abstract: Unsupervised image segmentation aims at grouping different semantic patterns in an image without the use of human annotation. Similarly, image clustering searches for groupings of images based on their semantic content without supervision. Classically, both problems have captivated researchers as they drew from sound mathematical concepts to produce concrete applications. With the emergence of deep learning, the scientific community turned its attention to complex neural network-based solvers that achieved impressive results in those domains but rarely leveraged the advances made by classical methods. In this work, we propose a patch-based unsupervised image segmentation strategy that bridges advances in unsupervised feature extraction from deep clustering methods with the algorithmic help of classical graph-based methods. We show that a simple convolutional neural network, trained to classify image patches and iteratively regularized using graph cuts, naturally leads to a state-of-the-art fully-convolutional unsupervised pixel-level segmenter. Furthermore, we demonstrate that this is the ideal setting for leveraging the patch-level pairwise features generated by vision transformer models. Our results on real image data demonstrate the effectiveness of our proposed methodology.

Submitted 15 January, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

arXiv:2309.03351

Using Neural Networks for Fast SAR Roughness Estimation of High Resolution Images

Authors: Li Fan, Jeova Farias Sales Rocha Neto

Abstract: The analysis of Synthetic Aperture Radar (SAR) imagery is an important step in remote sensing applications, and it is a challenging problem due to its inherent speckle noise. One typical solution is to model the data using the $G_I^0$ distribution and extract its roughness information, which in turn can be used in posterior imaging tasks, such as segmentation, classification and interpretation. This leads to the need for quick and reliable estimation of the roughness parameter from SAR data, especially with high resolution images. Unfortunately, traditional parameter estimation procedures are slow and prone to estimation failures. In this work, we propose a neural network-based estimation framework that first learns how to predict underlying parameters of $G_I^0$ samples and then can be used to estimate the roughness of unseen data. We show that this approach leads to an estimator that is quicker, yields less estimation error and is less prone to failures than the traditional estimation procedures for this problem, even when we use a simple network. More importantly, we show that this same methodology can be generalized to handle image inputs and, even if trained on purely synthetic data for a few seconds, is able to perform real time pixel-wise roughness estimation for high resolution real SAR imagery.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2306.13200

Improving Log-Cumulant Based Estimation of Roughness Information in SAR imagery

Authors: Jeova Farias Sales Rocha Neto, Francisco Alixandre Avila Rodrigues

Abstract: Synthetic Aperture Radar (SAR) image understanding is crucial in remote sensing applications, but it is hindered by its intrinsic noise contamination, called speckle. Sophisticated statistical models, such as the $\mathcal{G}^0$ family of distributions, have been applied to SAR data, and many of the current advancements in processing this imagery have been accomplished through extracting information from these models. In this paper, we propose improvements to parameter estimation in $\mathcal{G}^0$ distributions using the Method of Log-Cumulants. First, using Bayesian modeling, we construct estimators that regularly produce reliable roughness estimates under both $\mathcal{G}^0_A$ and $\mathcal{G}^0_I$ models. Second, we make use of an approximation of the Trigamma function to compute the estimated roughness in constant time, making it considerably faster than the existing method for this task. Finally, we show how we can use this method to achieve fast and reliable SAR image understanding based on roughness information.

Submitted 22 June, 2023; originally announced June 2023.

arXiv:2306.13166

A Sparse Graph Formulation for Efficient Spectral Image Segmentation

Authors: Rahul Palnitkar, Jeova Farias Sales Rocha Neto

Abstract: Spectral Clustering is one of the most traditional methods to solve segmentation problems. Based on Normalized Cuts, it aims at partitioning an image using an objective function defined by a graph. Despite their mathematical attractiveness, spectral approaches are traditionally neglected by the scientific community due to their practical issues and underperformance. In this paper, we adopt a sparse graph formulation based on the addition of extra nodes to a simple grid graph. While the grid encodes the pixel spatial disposition, the extra nodes account for the pixel color data. Applying the original Normalized Cuts algorithm to this graph leads to a simple and scalable method for spectral image segmentation, with an interpretable solution. Our experiments also demonstrate that our proposed methodology outperforms both traditional and modern unsupervised algorithms for segmentation in both real and synthetic data.

Submitted 7 June, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

arXiv:2208.07853

Estimating Appearance Models for Image Segmentation via Tensor Factorization

Authors: Jeova Farias Sales Rocha Neto

Abstract: Image Segmentation is one of the core tasks in Computer Vision, and solving it often depends on modeling the image appearance data via the color distributions of each of its constituent regions. Whereas many segmentation algorithms handle the appearance model dependence using alternation or implicit methods, we propose here a new approach to directly estimate them from the image without prior information on the underlying segmentation. Our method uses local high order color statistics from the image as an input to a tensor factorization-based estimator for latent variable models. This approach is able to estimate models in multiregion images and automatically output the region proportions without prior user interaction, overcoming the drawbacks of a prior attempt at this problem. We also demonstrate the performance of our proposed method in many challenging synthetic and real imaging scenarios and show that it leads to an efficient segmentation algorithm.

Submitted 15 November, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

arXiv:2102.11121

Direct Estimation of Appearance Models for Segmentation

Authors: Jeova F. S. Rocha Neto, Pedro Felzenszwalb, Marilyn Vazquez

Abstract: Image segmentation algorithms often depend on appearance models that characterize the distribution of pixel values in different image regions. We describe a new approach for estimating appearance models directly from an image, without explicit consideration of the pixels that make up each region. Our approach is based on novel algebraic expressions that relate local image statistics to the appearance of spatially coherent regions. We describe two algorithms that can use the aforementioned algebraic expressions to estimate appearance models directly from an image. The first algorithm solves a system of linear and quadratic equations using a least squares formulation. The second algorithm is a spectral method based on an eigenvector computation. We present experimental results that demonstrate the proposed methods work well in practice and lead to effective image segmentation algorithms.

Submitted 15 September, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

Comments: To appear in the SIAM Journal on Imaging Sciences (SIIMS)

MSC Class: 68U10; 62M05; 62H30; 65C20

arXiv:2006.06573

Spectral Image Segmentation with Global Appearance Modeling

Authors: Jeova F. S. Rocha Neto, Pedro F. Felzenszwalb

Abstract: We introduce a new spectral method for image segmentation that incorporates long range relationships for global appearance modeling. The approach combines two different graphs: a sparse graph that captures spatial relationships between nearby pixels and a dense graph that captures pairwise similarity between all pairs of pixels. We extend the spectral method for Normalized Cuts to this setting by combining the transition matrices of Markov chains associated with each graph. We also derive an efficient method for sparsifying the dense graph of appearance relationships. This leads to a practical algorithm for segmenting high-resolution images. The resulting method can segment challenging images without any filtering or pre-processing.

Submitted 6 October, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

ACM Class: I.4; I.5

Showing 1–9 of 9 results for author: Neto, J F S R