Showing 1–22 of 22 results for author: Prates, M O

  1. arXiv:2405.02666  [pdf, other]

    stat.ME

    The Analysis of Criminal Recidivism: A Hierarchical Model-Based Approach for the Analysis of Zero-Inflated, Spatially Correlated recurrent events Data

    Authors: Alisson C. C. Silva, Fábio N. Demarqui, Bráulio F. Silva, Marcos O. Prates

    Abstract: The life course perspective in criminology has become prominent in recent years, offering valuable insights into various patterns of criminal offending and pathways. The study of criminal trajectories aims to understand the beginning, persistence and desistance in crime, providing intriguing explanations about these moments in life. Central to this analysis is the identification of patterns in the freq…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 23 pages, 12 figures and 4 tables

  2. arXiv:2308.06677  [pdf, other]

    stat.ME

    Imputation of missing data using multivariate Gaussian Linear Cluster-Weighted Modeling

    Authors: Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

    Abstract: Missing data arises when certain values are not recorded or observed for variables of interest. However, most statistical theory assumes complete data availability. To address incomplete databases, one approach is to fill the gaps corresponding to the missing information based on specific criteria, known as imputation. In this study, we propose a novel imputation methodology for databases wi…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 23 pages, 9 figures

  3. arXiv:2304.10283  [pdf, other]

    cs.CL stat.ML

    Is augmentation effective to improve prediction in imbalanced text datasets?

    Authors: Gabriel O. Assunção, Rafael Izbicki, Marcos O. Prates

    Abstract: Imbalanced datasets present a significant challenge for machine learning models, often leading to biased predictions. To address this issue, data augmentation techniques are widely used in natural language processing (NLP) to generate new samples for the minority class. However, in this paper, we challenge the common assumption that data augmentation is always necessary to improve predictions on i…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 21 pages, 5 figures

  4. arXiv:2208.07900  [pdf, other]

    stat.ME

    An unified framework for point-level, areal, and mixed spatial data: the Hausdorff-Gaussian Process

    Authors: Lucas da Cunha Godoy, Marcos Oliveira Prates, Jun Yan

    Abstract: More realistic models can be built taking into account spatial dependence when analyzing areal data. Most of the models for areal data employ adjacency matrices to assess the spatial structure of the data. Such methodologies impose some limitations. Remarkably, spatial polygons of different shapes and sizes are not treated differently, and it becomes difficult, if not impractical, to compute predi…

    Submitted 16 August, 2022; originally announced August 2022.

  5. arXiv:2203.06437  [pdf, other]

    stat.ME

    Beyond Gaussian processes: Flexible Bayesian modeling and inference for geostatistical processes

    Authors: F. B. Gonçalves, M. O. Prates, G. A. S. Aguilar

    Abstract: This paper proposes a novel family of geostatistical models to account for features that cannot be properly accommodated by traditional Gaussian processes. The family is specified hierarchically and combines the infinite-dimensional dynamics of Gaussian processes with that of any multivariate continuous distribution. This combination is stochastically defined through a latent Poisson process and t…

    Submitted 6 April, 2023; v1 submitted 12 March, 2022; originally announced March 2022.

  6. arXiv:2110.12514  [pdf, other]

    stat.ME

    Imputation of Missing Data Using Linear Gaussian Cluster-Weighted Modeling

    Authors: Luis Alejandro Masmela-Caita, Thais Paiva Galletti, Marcos Oliveira Prates

    Abstract: Missing data theory deals with statistical methods for handling missing data. Missing data occurs when some values are not stored or observed for variables of interest. However, most statistical theory assumes that data is fully observed. One way to deal with incomplete databases is to fill in the spaces corresponding to the missing information based on some criteria, thi…

    Submitted 24 October, 2021; originally announced October 2021.

  7. arXiv:2008.06911  [pdf, other]

    stat.ME stat.AP

    Alleviating Spatial Confounding in Spatial Frailty Models

    Authors: Douglas Roberto Mesquita Azevedo, Marcos Oliveira Prates, Dipankar Bandyopadhyay

    Abstract: Spatial confounding is the name given to the confounding between fixed and spatial random effects. It has been widely studied and has gained attention in recent years in the spatial statistics literature, as it may generate unexpected results in modeling. The projection-based approach, also known as restricted models, appears as a good alternative to overcome the spatial confounding in generalized li…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: 21 pages, 5 figures, 4 tables

  8. arXiv:2007.00848  [pdf, other]

    stat.AP

    A robust nonlinear mixed-effects model for COVID-19 deaths data

    Authors: Fernanda L. Schumacher, Clecio S. Ferreira, Marcos O. Prates, Alberto Lachos, Victor H. Lachos

    Abstract: The analysis of complex longitudinal data such as COVID-19 deaths is challenging due to several inherent features: (i) Similarly-shaped profiles with different decay patterns; (ii) Unexplained variation among repeated measurements within each country, where these repeated measurements may be viewed as clustered data since they are taken in the same country at roughly the same time; (iii) Skewness, outli…

    Submitted 1 August, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: 11 pages, 2 figures, 4 tables

  9. arXiv:2006.08036  [pdf, other]

    stat.ME stat.CO

    Heckman selection-t model: parameter estimation via the EM-algorithm

    Authors: Victor H. Lachos Davila, Marcos O. Prates, Dipak K. Dey

    Abstract: The Heckman selection model is perhaps the most popular econometric model in the analysis of data with sample selection. The analyses of this model are based on the normality assumption for the error terms. However, in some applications, the distribution of the error term departs significantly from normality, for instance, in the presence of heavy tails and/or atypical observations. In this paper, we e…

    Submitted 14 June, 2020; originally announced June 2020.

    Comments: 19 pages, 5 Tables, 4 Figures

  10. arXiv:2005.05464  [pdf, other]

    stat.ME stat.AP

    Non-Separable Spatio-temporal Models via Transformed Gaussian Markov Random Fields

    Authors: Douglas R. M. Azevedo, Marcos O. Prates, Michael R. Willig

    Abstract: Models that capture spatial and temporal dynamics are applicable in many scientific fields. Non-separable spatio-temporal models were introduced in the literature to capture these features. However, these models are generally complicated in construction and interpretation. We introduce a class of non-separable Transformed Gaussian Markov Random Fields (TGMRF) in which the dependence structure is…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: 15 pages, 3 figures, 4 tables

  11. arXiv:2004.04341  [pdf, ps, other]

    math.ST

    Objective Bayesian analysis for spatial Student-t regression models

    Authors: Jose A. Ordoñez, Marcos O. Prates, Larissa A. Matos, Victor H. Lachos

    Abstract: The choice of the prior distribution is a key aspect of Bayesian analysis. For the spatial regression setting, a subjective prior choice for the parameters may not be trivial. From this perspective, using the objective Bayesian analysis framework, a reference is introduced for the spatial Student-t regression model with unknown degrees of freedom. The spatial Student-t regression model poses two mai…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: 21 pages, 3 tables

  12. arXiv:1908.06437  [pdf, other]

    stat.ME

    Fast Bayesian inference of Block Nearest Neighbor Gaussian process for large data

    Authors: Zaida C. Quiroz, Marcos O. Prates, Dipak K. Dey, Håvard Rue

    Abstract: This paper presents the development of a spatial block-Nearest Neighbor Gaussian process (block-NNGP) for location-referenced large spatial data. The key idea behind this approach is to divide the spatial domain into several blocks which are dependent under some constraints. The cross-blocks capture the large-scale spatial dependence, while each block captures the small-scale spatial dependence. T…

    Submitted 4 February, 2021; v1 submitted 18 August, 2019; originally announced August 2019.

    Comments: 60 pages, 20 figures (including the ones in the Supplementary Material), 4 tables

  13. arXiv:1906.05399  [pdf, ps, other]

    stat.AP eess.SP stat.CO

    Dynamic Time Scan Forecasting

    Authors: Marcelo Azevedo Costa, Leandro Brioschi Mineti, Marcos Oliveira Prates, Ramiro Ruiz Cardenas

    Abstract: The dynamic time scan forecasting method relies on the premise that the most important pattern in a time series precedes the forecasting window, i.e., the last observed values. Thus, a scan procedure is applied to identify similar patterns, or best matches, throughout the time series. As opposed to Euclidean distance, or any distance function, a similarity function is dynamically estimated in order…

    Submitted 12 June, 2019; originally announced June 2019.

    Comments: 15 pages, 7 figures, working paper, version 1

  14. arXiv:1901.07984  [pdf, other]

    cs.LG cs.AI stat.ML

    Typed Graph Networks

    Authors: Marcelo O. R. Prates, Pedro H. C. Avelar, Henrique Lemos, Marco Gori, Luis Lamb

    Abstract: Recently, the deep learning community has given growing attention to neural architectures engineered to learn problems in relational domains. Convolutional Neural Networks employ parameter sharing over the image domain, tying the weights of neural connections on a grid topology and thus enforcing the learning of a number of convolutional kernels. By instantiating trainable neural modules and assem…

    Submitted 24 February, 2019; v1 submitted 23 January, 2019; originally announced January 2019.

    Comments: Under submission

  15. arXiv:1809.07695  [pdf, other]

    cs.SI cs.AI cs.LG cs.NE stat.ML

    Multitask Learning on Graph Neural Networks: Learning Multiple Graph Centrality Measures with a Unified Network

    Authors: Pedro H. C. Avelar, Henrique Lemos, Marcelo O. R. Prates, Luis Lamb

    Abstract: The application of deep learning to symbolic domains remains an active research endeavour. Graph neural networks (GNN), consisting of trained neural modules which can be arranged in different topologies at run time, are sound alternatives to tackle relational problems which lend themselves to graph representations. In this paper, we show that GNNs are capable of multitask learning, which can be na…

    Submitted 28 November, 2019; v1 submitted 11 September, 2018; originally announced September 2018.

    Comments: Published at ICANN2019. 10 pages, 3 Figures

  16. arXiv:1809.02721  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Learning to Solve NP-Complete Problems - A Graph Neural Network for Decision TSP

    Authors: Marcelo O. R. Prates, Pedro H. C. Avelar, Henrique Lemos, Luis Lamb, Moshe Vardi

    Abstract: Graph Neural Networks (GNN) are a promising technique for bridging differential programming and combinatorial domains. GNNs employ trainable modules which can be assembled in different configurations that reflect the relational structure of each problem instance. In this paper, we show that GNNs can learn to solve, with very little supervision, the decision variant of the Traveling Salesperson Pro…

    Submitted 16 November, 2018; v1 submitted 7 September, 2018; originally announced September 2018.

    Comments: Accepted for presentation at AAAI 2019

  17. arXiv:1809.02208  [pdf, other]

    cs.CY cs.CL

    Assessing Gender Bias in Machine Translation -- A Case Study with Google Translate

    Authors: Marcelo O. R. Prates, Pedro H. C. Avelar, Luis Lamb

    Abstract: Recently there has been a growing concern about machine bias, where trained statistical models grow to reflect controversial societal asymmetries, such as gender or racial bias. A significant number of AI tools have recently been suggested to be harmfully biased towards some minority, with reports of racist criminal behavior predictors, iPhone X failing to differentiate between two Asian people an…

    Submitted 11 March, 2019; v1 submitted 6 September, 2018; originally announced September 2018.

    Comments: Accepted for publication on Neural Computing and Applications; 33 pages, 14 figures, 12 tables

  18. arXiv:1711.04376  [pdf, ps, other]

    stat.ME

    Bayesian linear regression models with flexible error distributions

    Authors: Nívea B. da Silva, Marcos O. Prates, Flávio B. Gonçalves

    Abstract: This work introduces a novel methodology based on finite mixtures of Student-t distributions to model the errors' distribution in linear regression models. The novelty lies in a particular hierarchical structure for the mixture distribution in which the first level models the number of modes, responsible for accommodating multimodality and skewness features, and the second level models tail behavior.…

    Submitted 12 November, 2017; originally announced November 2017.

  19. arXiv:1509.00331  [pdf, ps, other]

    stat.ME

    Robust Bayesian model selection for heavy-tailed linear regression using finite mixtures

    Authors: Flávio B Gonçalves, Marcos O. Prates, Victor H. Lachos

    Abstract: In this paper we present a novel methodology to perform Bayesian model selection in linear models with heavy-tailed distributions. We consider a finite mixture of distributions to model a latent variable where each component of the mixture corresponds to one possible model within the symmetrical class of normal independent distributions. Naturally, the Gaussian model is one of the possibilities. T…

    Submitted 17 August, 2017; v1 submitted 1 September, 2015; originally announced September 2015.

  20. arXiv:1407.5363  [pdf, other]

    stat.ME

    Where geography lives? A projection approach for spatial confounding

    Authors: Marcos O. Prates, Erica C. Rodrigues, Renato M. Assunção

    Abstract: Spatial confounding between the spatial random effects and fixed effects covariates was recently discovered and shown to potentially lead to misleading interpretation of the model results. Solutions to alleviate this problem are based on decomposing the spatial random effect and fitting a restricted spatial regression. In this paper, we propose a different approach: a transformation of the geogra…

    Submitted 16 May, 2016; v1 submitted 20 July, 2014; originally announced July 2014.

  21. arXiv:1312.6896  [pdf, other]

    stat.AP

    Inference on Dynamic Models for non-Gaussian Random Fields using INLA: A Homicide Rate Analysis of Brazilian Cities

    Authors: Renan Xavier Cortes, Thiago Guerrera Martins, Marcos Oliveira Prates, Bráulio Figueiredo Alves da Silva

    Abstract: Robust time series analysis is an important subject in statistical modeling. Models based on the Gaussian distribution are sensitive to outliers, which may imply a significant degradation in estimation performance as well as in prediction accuracy. State-space models, also referred to as Dynamic Models, are a very useful way to describe the evolution of a time series variable through a structured laten…

    Submitted 13 February, 2015; v1 submitted 24 December, 2013; originally announced December 2013.

    Comments: 26 pages, 4 figures

  22. Transformed Gaussian Markov Random Fields and Spatial Modeling

    Authors: Marcos O. Prates, Dipak K. Dey, Michael R. Willig, Jun Yan

    Abstract: The Gaussian random field (GRF) and the Gaussian Markov random field (GMRF) have been widely used to accommodate spatial dependence under the generalized linear mixed model framework. These models have limitations rooted in the symmetry and thin tail of the Gaussian distribution. We introduce a new class of random fields, termed transformed GRF (TGRF), and a new class of Markov random fields, term…

    Submitted 24 May, 2012; originally announced May 2012.

    Comments: 19 pages, 2 figures, 6 tables

    Journal ref: Spatial Statistics 14 (2015): 382-399