2-Cats: 2D Copula Approximating Transforms

Authors: Flavio Figueiredo, José Geraldo Fernandes, Jackson Silva, Renato M. Assunção

Abstract: Copulas are powerful statistical tools for capturing dependencies across data dimensions. Applying Copulas involves estimating independent marginals, a straightforward task, followed by the much more challenging task of determining a single copulating function, $C$, that links these marginals. For bivariate data, a copula takes the form of a two-increasing function $C: (u,v)\in \mathbb{I}^2 \rightarrow \mathbb{I}$, where $\mathbb{I} = [0, 1]$. This paper proposes 2-Cats, a Neural Network (NN) model that learns two-dimensional Copulas without relying on specific Copula families (e.g., Archimedean). Furthermore, via both theoretical properties of the model and a Lagrangian training approach, we show that 2-Cats meets the desiderata of Copula properties. Moreover, inspired by the literature on Physics-Informed Neural Networks and Sobolev Training, we further extend our training strategy to learn not only the output of a Copula but also its derivatives. Our proposed method exhibits superior performance compared to the state-of-the-art across various datasets while respecting (provably for most and approximately for a single other) properties of $C$.

Submitted 28 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

arXiv:2109.13734

Cooperative Object Transportation using Gibbs Random Fields

Authors: Paulo Rezeck, Renato M. Assunção, Luiz Chaimowicz

Abstract: This paper presents a novel methodology that allows a swarm of robots to perform a cooperative transportation task. Our approach consists of modeling the swarm as a Gibbs Random Field (GRF), taking advantage of this framework's locality properties. By setting appropriate potential functions, robots can dynamically navigate, form groups, and perform cooperative transportation in a completely decentralized fashion. Moreover, these behaviors emerge from the local interactions without the need for explicit communication or coordination. To evaluate our methodology, we perform a series of simulations and proof-of-concept experiments in different scenarios. Our results show that the method is scalable, adaptable, and robust to failures and changes in the environment.

Submitted 28 September, 2021; originally announced September 2021.

Comments: 8 pages, 9 figures, accepted by IROS 2021

arXiv:2105.02203

Bayesian Dynamic Estimation of Mortality Schedules in Small Areas

Authors: Guilherme Lopes de Oliveira, Rosangela Helena Loschi, Renato Martins Assunção

Abstract: The determination of the shapes of mortality curves, the estimation and projection of mortality patterns over time, and the investigation of differences in mortality patterns across different small underdeveloped populations have received special attention in recent years. The challenges involved in this type of problem are the common sparsity and the unstable behavior of observed death counts in small areas (populations). These features impose many difficulties in the estimation of reasonable mortality schedules. In this chapter, we present a discussion about this problem and we introduce the use of relational Bayesian dynamic models for estimating and smoothing mortality schedules by age and sex. Preliminary results are presented, including a comparison with a methodology recently proposed in the literature. The analyses are based on simulated data as well as mortality data observed in some Brazilian municipalities.

Submitted 5 May, 2021; originally announced May 2021.

Comments: 25 pages, 6 figures, 1 table

arXiv:2104.10814

Flocking-Segregative Swarming Behaviors using Gibbs Random Fields

Authors: Paulo Rezeck, Renato M. Assuncao, Luiz Chaimowicz

Abstract: This paper presents a novel approach that allows a swarm of heterogeneous robots to produce simultaneously segregative and flocking behaviors using only local sensing. These behaviors have been widely studied in swarm robotics and their combination allows the execution of several complex tasks, ranging from surveillance and reconnaissance, to search and rescue, to transport, and to foraging. Although there are several works in the literature proposing different strategies to achieve these behaviors, to the best of our knowledge, this paper is the first to propose an algorithm that produces these behaviors simultaneously and does not rely on global information or communication. Our approach consists of modeling the swarm as a Gibbs Random Field (GRF) and using appropriate potential functions to reach segregation, cohesion and consensus on the velocity of the swarm. Simulations and proof-of-concept experiments using real robots are presented to evaluate the performance of our methodology in comparison to some of the state-of-the-art works that tackle segregative behaviors.

Submitted 21 April, 2021; originally announced April 2021.

Comments: 7 pages, 11 figures, accepted by ICRA 2021

arXiv:1807.04595

Fast Estimation of Causal Interactions using Wold Processes

Authors: Flavio Figueiredo, Guilherme Borges, Pedro O. S. Vaz de Melo, Renato M. Assunção

Abstract: We here focus on the task of learning Granger causality matrices for multivariate point processes. In order to accomplish this task, our work is the first to explore the use of Wold processes. By doing so, we are able to develop asymptotically fast MCMC learning algorithms. With $N$ being the total number of events and $K$ the number of processes, our learning algorithm has a $O(N(\,\log(N)\,+\,\log(K)))$ cost per iteration. This is much faster than the $O(N^3\,K^2)$ or $O(K^3)$ for the state of the art. Our approach, called GrangerBusca, is validated on nine datasets. This is an advance in relation to most prior efforts which focus mostly on subsets of the Memetracker data. Regarding accuracy, GrangerBusca is three times more accurate (in Precision@10) than the state of the art for the commonly explored subsets of Memetracker. Due to GrangerBusca's much lower training complexity, our approach is the only one able to train models for larger, full, sets of data.

Submitted 2 December, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

Comments: 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada

arXiv:1706.02447

doi 10.1145/3097983.3098045

Luck is Hard to Beat: The Difficulty of Sports Prediction

Authors: Raquel YS Aoki, Renato M Assuncao, Pedro OS Vaz de Melo

Abstract: Predicting the outcome of sports events is a hard task. We quantify this difficulty with a coefficient that measures the distance between the observed final results of sports leagues and idealized perfectly balanced competitions in terms of skill. This indicates the relative presence of luck and skill. We collected and analyzed all games from 198 sports leagues comprising 1503 seasons from 84 countries of 4 different sports: basketball, soccer, volleyball and handball. We measured the competitiveness by countries and sports. We also identify in each season which teams, if removed from their league, result in a completely random tournament. Surprisingly, not many of them are needed. As another contribution of this paper, we propose a probabilistic graphical model to learn about the teams' skills and to decompose the relative weights of luck and skill in each game. We break down the skill component into factors associated with the teams' characteristics. The model also allows us to estimate as 0.36 the probability that an underdog team wins in the NBA league, with a home advantage adding 0.09 to this probability. As shown in the first part of the paper, luck is substantially present even in the most competitive championships, which partially explains why sophisticated and complex feature-based models hardly beat simple models in the task of forecasting sports' outcomes.

Submitted 7 June, 2017; originally announced June 2017.

Comments: 10 pages, KDD2017, Applied Data Science track

Journal ref: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017

arXiv:1703.03895

Antagonism also Flows through Retweets: The Impact of Out-of-Context Quotes in Opinion Polarization Analysis

Authors: Pedro Calais Guerra, Roberto C. S. N. P. Souza, Renato M. Assunção, Wagner Meira Jr

Abstract: In this paper, we study the implications of the commonplace assumption that most social media studies make with respect to the nature of message shares (such as retweets) as a predominantly positive interaction. By analyzing two large longitudinal Brazilian Twitter datasets containing 5 years of conversations on two polarizing topics - Politics and Sports - we empirically demonstrate that groups holding antagonistic views can actually retweet each other more often than they retweet other groups. We show that assuming retweets as endorsement interactions can lead to misleading conclusions with respect to the level of antagonism among social communities, and that this apparent paradox is explained in part by the use of retweets to quote the original content creator out of the message's original temporal context, for humor and criticism purposes. As a consequence, messages diffused on online media can have their polarity reversed over time, which poses challenges for social and computer scientists aiming to classify and track opinion groups on online media. On the other hand, we found that the time users take to retweet a message after it has been originally posted can be a useful signal to infer antagonism in social platforms, and that surges of out-of-context retweets correlate with sentiment drifts triggered by real-world events. We also discuss how such evidence can be embedded in sentiment analysis models.

Submitted 10 March, 2017; originally announced March 2017.

Comments: This is an extended version of the short paper published at ICWSM 2017

arXiv:1510.05981

A latent shared-component generative model for real-time disease surveillance using Twitter data

Authors: Roberto C. S. N. P. Souza, Denise E. F de Brito, Renato M. Assunção, Wagner Meira Jr

Abstract: Exploiting the large amount of available data for addressing relevant social problems has been one of the key challenges in data mining. Such efforts have been recently named "data science for social good" and attracted the attention of several researchers and institutions. In this paper, we contribute to this objective by considering a difficult public health problem, the timely monitoring of dengue epidemics in small geographical areas. We develop a simple yet effective generative model to connect the fluctuations of disease cases and disease-related Twitter posts. We considered a hidden Markov process driving both the fluctuations in dengue reported cases and the tweets issued in each region. We add a stable but random source of tweets to represent the posts when no disease cases are recorded. The model is learned through a Markov chain Monte Carlo algorithm that produces the posterior distribution of the relevant parameters. Using data from a significant number of large Brazilian towns, we demonstrate empirically that our model is able to predict well the next weeks of the disease counts using the tweets and disease cases jointly.

Submitted 20 October, 2015; originally announced October 2015.

Comments: Appears in 2nd ACM SIGKDD Workshop on Connected Health at Big Data Era (BigCHat)

arXiv:1407.5363

Where geography lives? A projection approach for spatial confounding

Authors: Marcos O. Prates, Erica C. Rodrigues, Renato M. Assunção

Abstract: Spatial confounding between the spatial random effects and fixed effects covariates has been recently discovered and shown to potentially lead to misleading interpretations of the model results. Solutions to alleviate this problem are based on decomposing the spatial random effect and fitting a restricted spatial regression. In this paper, we propose a different approach: a transformation of the geographic space to ensure that the unobserved spatial random effect added to the regression is orthogonal to the fixed effects covariates. Our approach, named SPOCK, has the additional benefit of providing a fast and simple computational method to estimate the parameters. Furthermore, it does not constrain the distribution class assumed for the spatial error term. A simulation study and a real data analysis are presented to better understand the advantages of the new method in comparison with the existing ones.

Submitted 16 May, 2016; v1 submitted 20 July, 2014; originally announced July 2014.

arXiv:math/0103104

doi 10.1007/s11749-006-0012-z

Detection of spatial pattern through independence of thinned processes

Authors: Renato M. Assuncao, Pablo A. Ferrari

Abstract: Let N, N' and N'' be point processes such that N' is obtained from N by homogeneous independent thinning and N''= N- N'. We give a new elementary proof that N' and N'' are independent if and only if N is a Poisson point process. We present some applications of this result to test if a homogeneous point process is a Poisson point process.

Submitted 16 March, 2001; originally announced March 2001.

Comments: 11 pages, one figure

MSC Class: 60G55

Showing 1–10 of 10 results for author: Assunção, R M