-
A Generalized Multiscale Bundle-Based Hyperspectral Sparse Unmixing Algorithm
Authors:
Luciano Carvalho Ayres,
Ricardo Augusto Borsoi,
José Carlos Moreira Bermudez,
Sérgio José Melo de Almeida
Abstract:
In hyperspectral sparse unmixing, a successful approach employs spectral bundles to address the variability of the endmembers in the spatial domain. However, the regularization penalties usually employed add substantial computational complexity, and the solutions are very noise-sensitive. We generalize a multiscale spatial regularization approach to solve the unmixing problem by incorporating group sparsity-inducing mixed norms. Then, we propose a noise-robust method that can take advantage of the bundle structure to deal with endmember variability while ensuring inter- and intra-class sparsity in abundance estimation with reasonable computational cost. We also present a general heuristic to select the most representative abundance estimation over multiple runs of the unmixing process, yielding a solution that is robust and highly reproducible. Experiments illustrate the robustness and consistency of the results when compared to related methods.
Submitted 23 January, 2024;
originally announced January 2024.
-
Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks
Authors:
Giordano Paoletti,
Luca Gioacchini,
Marco Mellia,
Luca Vassio,
Jussara M. Almeida
Abstract:
In dynamic complex networks, entities interact and form network communities that evolve over time. Among the many static Community Detection (CD) solutions, the modularity-based Louvain, or Greedy Modularity Algorithm (GMA), is widely employed in real-world applications due to its intuitiveness and scalability. Nevertheless, addressing CD in dynamic graphs remains an open problem, since the evolution of the network connections may poison the identification of communities, which may be evolving at a slower pace. Hence, naively applying GMA to successive network snapshots may lead to temporal inconsistencies in the communities. Two evolutionary adaptations of GMA, sGMA and $α$GMA, have been proposed to tackle this problem. Yet, evaluating the performance of these methods and understanding which scenarios each one is better suited to is challenging because of the lack of a comprehensive set of metrics and a consistent ground truth. To address these challenges, we propose (i) a benchmarking framework for evolutionary CD algorithms in dynamic networks and (ii) a generalised modularity-based approach (NeGMA). Our framework allows us to generate synthetic community-structured graphs and design evolving scenarios with nine basic graph transformations occurring at different rates. We evaluate performance through three metrics we define, i.e. Correctness, Delay, and Stability. Our findings reveal that $α$GMA is well-suited for detecting intermittent transformations, but struggles with abrupt changes; sGMA achieves superior stability, but fails to detect emerging communities; and NeGMA appears to be a well-balanced solution, excelling in responsiveness and in detecting instantaneous transformations.
Submitted 11 January, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Helping Fact-Checkers Identify Fake News Stories Shared through Images on WhatsApp
Authors:
Julio C. S. Reis,
Philipe Melo,
Fabiano Belém,
Fabricio Murai,
Jussara M. Almeida,
Fabricio Benevenuto
Abstract:
WhatsApp has introduced a novel avenue for smartphone users to engage with and disseminate news stories. The convenience of forming interest-based groups and seamlessly sharing content has rendered WhatsApp susceptible to the exploitation of misinformation campaigns. While the process of fact-checking remains a potent tool in identifying fabricated news, its efficacy falters in the face of the unprecedented deluge of information generated on the Internet today. In this work, we explore automatic ranking-based strategies to propose a "fakeness score" model as a means to help fact-checking agencies identify fake news stories shared through images on WhatsApp. Based on the results, we design a tool and integrate it into a real system that has been used extensively for monitoring content during the 2018 Brazilian general election. Our experimental evaluation shows that this tool can reduce by up to 40% the amount of effort required to identify 80% of the fake news in the data when compared to current mechanisms practiced by the fact-checking agencies for the selection of news stories to be checked.
Submitted 28 August, 2023;
originally announced August 2023.
-
An explainable model to support the decision about the therapy protocol for AML
Authors:
Jade M. Almeida,
Giovanna A. Castro,
João A. Machado-Neto,
Tiago A. Almeida
Abstract:
Acute Myeloid Leukemia (AML) is one of the most aggressive types of hematological neoplasm. To support the specialists' decision about the appropriate therapy, patients with AML receive a prognosis according to their cytogenetic and molecular characteristics, often divided into three risk categories: favorable, intermediate, and adverse. However, the current risk classification has known problems, such as the heterogeneity between patients of the same risk group and the lack of a clear definition of the intermediate risk category. Moreover, as most patients with AML receive an intermediate-risk classification, specialists often demand other tests and analyses, leading to delayed treatment and worsening of the patient's clinical condition. This paper presents the data analysis and an explainable machine-learning model to support the decision about the most appropriate therapy protocol according to the patient's survival prediction. In addition to the prediction model being explainable, the results obtained are promising and indicate that it is possible to use it to support the specialists' decisions safely. Most importantly, the findings offered in this study have the potential to open new avenues of research toward better treatments and prognostic markers.
Submitted 15 July, 2023; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Impact of User Privacy and Mobility on Edge Offloading
Authors:
João Paulo Esper,
Nadjib Achir,
Kleber Vieira Cardoso,
Jussara M. Almeida
Abstract:
Offloading highly demanding applications to the edge provides better quality of experience (QoE) for users with limited hardware devices. However, to maintain a competitive QoE, infrastructure and service providers must adapt to users' different mobility patterns, which can be challenging, especially for location-based services (LBS). Another issue that needs to be tackled is the increasing demand for user privacy protection. With less (accurate) information regarding user location, preferences, and usage patterns, forecasting the performance of offloading mechanisms becomes even more challenging. This work discusses the impacts of users' privacy and mobility when offloading to the edge. Different privacy and mobility scenarios are simulated and discussed to shed light on the trade-offs (e.g., privacy protection at the cost of increased latency) among privacy protection, mobility, and offloading performance.
Submitted 27 June, 2023;
originally announced June 2023.
-
Localized Nitrogen-Vacancy centers generated by low-repetition rate fs-laser pulses
Authors:
Charlie Oncebay,
Juliana M. P. Almeida,
Gustavo F. B. Almeida,
Sergio R. Muniz,
Cleber R. Mendonça
Abstract:
Among hundreds of impurities and defects in diamond, the nitrogen-vacancy (NV) center is one of the most interesting to be used as a platform for quantum technologies and nanosensing. Traditionally, synthetic diamond is irradiated with high-energy electrons or nitrogen ions to generate these color-centers. For precise positioning of the NV centers, fs-laser irradiation has been proposed as an alternative approach to produce spatially localized NV centers in diamond. However, most of the studies reported so far used high-repetition rate fs-laser systems. Here, we studied the influence of the irradiation conditions on the generation of NV$^-$. Specifically, we varied pulse fluence, laser focusing, and the number of pulses upon irradiation with 150 fs pulses at 775 nm from a Ti:sapphire laser amplifier operating at 1 kHz repetition rate. Optically Detected Magnetic Resonance (ODMR) was used to investigate the produced NV centers, revealing a sizeable zero-field splitting in the spectra and indicating the conditions in which the lattice strain produced in the ablation process may be deleterious for quantum information applications.
Submitted 14 October, 2022;
originally announced October 2022.
-
Understanding mobility in networks: A node embedding approach
Authors:
Matheus F. C. Barros,
Carlos H. G. Ferreira,
Bruno Pereira dos Santos,
Lourenço A. P. Júnior,
Marco Mellia,
Jussara M. Almeida
Abstract:
Motivated by the growing number of mobile devices capable of connecting and exchanging messages, we propose a methodology aiming to model and analyze node mobility in networks. We note that many existing solutions in the literature rely on topological measurements calculated directly on the graph of node contacts, aiming to capture the notion of the node's importance in terms of connectivity and mobility patterns beneficial for prototyping, design, and deployment of mobile networks. However, each measure has its specificity and fails to generalize the node importance notions that ultimately change over time. Unlike previous approaches, our methodology is based on a node embedding method that models and unveils the nodes' importance in mobility and connectivity patterns while preserving their spatial and temporal characteristics. We focus on a case study based on a trace of group meetings. The results show that our methodology provides a rich representation for extracting different mobility and connectivity patterns, which can be helpful for various applications and services in mobile networks.
Submitted 11 November, 2021;
originally announced November 2021.
-
A Hierarchical Network-Oriented Analysis of User Participation in Misinformation Spread on WhatsApp
Authors:
Gabriel Peres Nobre,
Carlos H. G. Ferreira,
Jussara M. Almeida
Abstract:
WhatsApp has emerged as a major communication platform in many countries in recent years. Despite offering only one-to-one and small group conversations, WhatsApp has been shown to enable the formation of a rich underlying network, crossing the boundaries of existing groups, and with structural properties that favor information dissemination at large. Indeed, WhatsApp has reportedly been used as a forum of misinformation campaigns with significant social, political and economic consequences in several countries. In this article, we aim at complementing recent studies on misinformation spread on WhatsApp, mostly focused on content properties and propagation dynamics, by looking into the network that connects users sharing the same piece of content. Specifically, we present a hierarchical network-oriented characterization of the users engaged in misinformation spread by focusing on three perspectives: individuals, WhatsApp groups and user communities, i.e., groupings of users who, intentionally or not, share the same content disproportionately often. By analyzing sharing and network topological properties, our study offers valuable insights into how WhatsApp users leverage the underlying network connecting different groups to gain large reach in the spread of misinformation on the platform.
Submitted 21 September, 2021;
originally announced September 2021.
-
On the Dynamics of Political Discussions on Instagram: A Network Perspective
Authors:
Carlos H. G. Ferreira,
Fabricio Murai,
Ana P. C. Silva,
Jussara M. Almeida,
Martino Trevisan,
Luca Vassio,
Marco Mellia,
Idilio Drago
Abstract:
Instagram has been increasingly used as a source of information especially among the youth. As a result, political figures now leverage the platform to spread opinions and political agenda. We here analyze online discussions on Instagram, notably in political topics, from a network perspective. Specifically, we investigate the emergence of communities of co-commenters, that is, groups of users who often interact by commenting on the same posts and may be driving the ongoing online discussions. In particular, we are interested in salient co-interactions, i.e., interactions of co-commenters that occur more often than expected by chance and under independent behavior. Unlike casual and accidental co-interactions which normally happen in large volumes, salient co-interactions are key elements driving the online discussions and, ultimately, the information dissemination. We base our study on the analysis of 10 weeks of data centered around major elections in Brazil and Italy, following both politicians and other celebrities. We extract and characterize the communities of co-commenters in terms of topological structure, properties of the discussions carried out by community members, and how some community properties, notably community membership and topics, evolve over time. We show that communities discussing political topics tend to be more engaged in the debate by writing longer comments, using more emojis, hashtags and negative words than in other subjects. Also, communities built around political discussions tend to be more dynamic, although top commenters remain active and preserve community membership over time. Moreover, we observe a great diversity in discussed topics over time: whereas some topics attract attention only momentarily, others, centered around more fundamental political discussions, remain consistently active over time.
Submitted 13 September, 2022; v1 submitted 19 September, 2021;
originally announced September 2021.
-
Machine Learning for Performance Prediction of Spark Cloud Applications
Authors:
Alexandre Maros,
Fabricio Murai,
Ana Paula Couto da Silva,
Jussara M. Almeida,
Marco Lattuada,
Eugenio Gianniti,
Marjan Hosseini,
Danilo Ardagna
Abstract:
Big data applications and analytics are employed in many sectors for a variety of goals: improving customer satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that are often run on cloud systems. Predicting execution times is important for estimating the cost of cloud services and for effectively managing the underlying resources at runtime. Machine Learning (ML), providing black box solutions to model the relationship between application performance and system configuration without requiring detailed knowledge of the system, has become a popular way of predicting the performance of big data applications. We investigate the cost-benefits of using supervised ML models for predicting the performance of applications on Spark, one of today's most widely used frameworks for big data analysis. We compare our approach with Ernest (an ML-based technique proposed in the literature by the Spark inventors) on a range of scenarios, application workloads, and cloud system configurations. Our experiments show that Ernest can accurately estimate the performance of very regular applications, but it fails when applications exhibit more irregular patterns and/or when extrapolating on bigger data set sizes. Results show that our models match or exceed Ernest's performance, sometimes enabling us to reduce the prediction error from 126-187% to only 5-19%.
Submitted 27 August, 2021;
originally announced August 2021.
-
Electronic structure of water from Koopmans-compliant functionals
Authors:
James Moraes de Almeida,
Ngoc Linh Nguyen,
Nicola Colonna,
Wei Chen,
Caetano Rodrigues Miranda,
Alfredo Pasquarello,
Nicola Marzari
Abstract:
Obtaining a precise theoretical description of the spectral properties of liquid water poses challenges for both molecular dynamics (MD) and electronic structure methods. The lower computational cost of the Koopmans-compliant functionals with respect to Green's function methods allows the simulation of many MD trajectories, with a description close to the state-of-the-art quasi-particle self-consistent GW plus vertex corrections method (QSGW+f$_{xc}$). Thus, we explore water spectral properties when different MD approaches are used, ranging from classical MD to first-principles MD, and including nuclear quantum effects. We have observed that the different MD approaches lead to up to 1 eV change in the average band gap; thus, we focused on the dependence of the band gap on the geometrical properties of the system to explain such spread. We have evaluated the changes in the band gap due to variations in the intramolecular O-H bond distance and HOH angle, as well as the intermolecular hydrogen bond O$\cdot\cdot\cdot$O distance and the OHO angles. We have observed that the dominant contribution comes from the O-H bond length; the O$\cdot\cdot\cdot$O distance plays a secondary role, and the other geometrical properties do not significantly influence the gap. Furthermore, we analyze the electronic density of states (DOS), where the KIPZ functional shows a good agreement with the DOS obtained with state-of-the-art approaches employing quasi-particle self-consistent GW plus vertex corrections. The O-H bond length also significantly influences the DOS. When nuclear quantum effects are considered, a broadening of the peaks driven by the broader distribution of the O-H bond lengths is observed, leading to a closer agreement with the experimental photoemission spectra.
Submitted 22 June, 2021;
originally announced June 2021.
-
Tech Report: A Homogeneity-Based Multiscale Hyperspectral Image Representation for Sparse Spectral Unmixing
Authors:
L. C. Ayres,
S. J. M. de Almeida,
J. C. M. Bermudez,
R. A. Borsoi
Abstract:
Several approaches have been proposed to solve the spectral unmixing problem in hyperspectral image analysis. Among them, the use of sparse regression techniques aims to characterize the abundances in pixels based on a large library of spectral signatures known a priori. Recently, the integration of image spatial-contextual information significantly enhanced the performance of sparse unmixing. In this work, we propose a computationally efficient multiscale representation method for hyperspectral data adapted to the unmixing problem. The proposed method is based on a hierarchical extension of the SLIC oversegmentation algorithm constructed using a robust homogeneity test. The image is subdivided into a set of spectrally homogeneous regions formed by pixels with similar characteristics (superpixels). This representation is then used to provide prior spatial regularity information for the abundances of materials present in the scene, improving the conditioning of the unmixing problem. Simulation results illustrate that the method is capable of estimating abundances with high quality and low computational cost, especially in noisy scenarios.
Submitted 10 February, 2021;
originally announced February 2021.
-
A Dataset of Fact-Checked Images Shared on WhatsApp During the Brazilian and Indian Elections
Authors:
Julio C. S. Reis,
Philipe de Freitas Melo,
Kiran Garimella,
Jussara M. Almeida,
Dean Eckles,
Fabrício Benevenuto
Abstract:
Recently, messaging applications, such as WhatsApp, have been reportedly abused by misinformation campaigns, especially in Brazil and India. A notable form of abuse in WhatsApp relies on several manipulated images and memes containing all kinds of fake stories. In this work, we performed an extensive data collection from a large set of WhatsApp publicly accessible groups and fact-checking agency websites. This paper opens a novel dataset to the research community containing fact-checked fake images shared through WhatsApp for two distinct scenarios known for the spread of fake news on the platform: the 2018 Brazilian elections and the 2019 Indian elections.
Submitted 5 May, 2020;
originally announced May 2020.
-
Towards Understanding Political Interactions on Instagram
Authors:
Martino Trevisan,
Luca Vassio,
Idilio Drago,
Marco Mellia,
Fabricio Murai,
Flavio Figueiredo,
Ana Paula Couto da Silva,
Jussara M. Almeida
Abstract:
Online Social Networks (OSNs) allow personalities and companies to communicate directly with the public, bypassing the filters of traditional media. As people rely on OSNs to stay up-to-date, the political debate has moved online too. We witness the sudden explosion of harsh political debates and the dissemination of rumours in OSNs. Identifying such behaviour requires a deep understanding of how people interact via OSNs during political debates. We present a preliminary study of interactions in a popular OSN, namely Instagram. We take Italy as a case study in the period before the 2019 European Elections. We observe the activity of top Italian Instagram profiles in different categories: politics, music, sport and show. We record their posts for more than two months, tracking "likes" and comments from users. Results suggest that profiles of politicians attract markedly different interactions than other categories. People tend to comment more, with longer comments, debating for a longer time, with a large number of replies, most of which are not explicitly solicited. Moreover, comments tend to come from a small group of very active users. Finally, we witness substantial differences when comparing profiles of different parties.
Submitted 4 May, 2021; v1 submitted 26 April, 2019;
originally announced April 2019.
-
Analyzing Ideological Communities in Congressional Voting Networks
Authors:
Carlos H. G. Ferreira,
Breno de Souza Matos,
Jussara M. Almeida
Abstract:
We here study the behavior of political party members aiming at identifying how ideological communities are created and evolve over time in diverse (fragmented and non-fragmented) party systems. Using public voting data of both Brazil and the US, we propose a methodology to identify and characterize ideological communities, their member polarization, and how such communities evolve over time, covering a 15-year period. Our results reveal very distinct patterns across the two case studies, in terms of both structural and dynamic properties.
Submitted 29 October, 2018;
originally announced October 2018.
-
Nonlinear optical spectrum of diamond at femtosecond regime
Authors:
Juliana M. P. Almeida,
Charlie Oncebay,
Jonathas P. Siqueira,
Sergio R. Muniz,
Leonardo De Boni,
Cleber R. Mendonça
Abstract:
Although diamond photonics has driven considerable interest and useful applications, as shown in frequency generation devices and single photon emitters, fundamental studies on the third-order optical nonlinearities of diamond are still scarce, stalling the development of an integrated platform for nonlinear and quantum optics. The purpose of this paper is to contribute to those studies by measuring the spectra of the two-photon absorption coefficient ($β$) and the nonlinear index of refraction (n$_2$) of diamond using femtosecond laser pulses, in a wide spectral range. These measurements show the magnitude of $β$ increasing from 0.07 to 0.23 cm/GW, as it approaches the bandgap energy, in the region from 3.18 to 4.77 eV (390 - 260 nm), whereas n$_2$ varies from zero to $1.7 \times 10^{-19}$ m$^2$/W in the full measured range, from 0.83 to 4.77 eV (1500 - 260 nm). The experimental results are compared with theoretical models for nonlinear absorption and refraction in indirect gap semiconductors, indicating the two-photon absorption as the dominant effect in the dispersion of the third-order nonlinear susceptibility. These data, together with optical Kerr gate measurements, also provided here, are of foremost relevance to the understanding of ultrafast optical processes in diamond and its nonlinear properties.
Submitted 9 September, 2017;
originally announced September 2017.
-
Gender Matters! Analyzing Global Cultural Gender Preferences for Venues Using Social Sensing
Authors:
Willi Mueller,
Thiago H Silva,
Jussara M Almeida,
Antonio A F Loureiro
Abstract:
Gender differences is a phenomenon around the world actively researched by social scientists. Traditionally, the data used to support such studies is manually obtained, often through surveys with volunteers. However, due to their inherent high costs because of manual steps, such traditional methods do not quickly scale to large-size studies. We here investigate a particular aspect of gender differ…
▽ More
Gender differences is a phenomenon around the world actively researched by social scientists. Traditionally, the data used to support such studies is manually obtained, often through surveys with volunteers. However, due to their inherent high costs because of manual steps, such traditional methods do not quickly scale to large-size studies. We here investigate a particular aspect of gender differences: preferences for venues. To that end we explore the use of check-in data collected from Foursquare to estimate cultural gender preferences for venues in the physical world. For that, we first demonstrate that by analyzing the check-in data in various regions of the world we can find significant differences in preferences for specific venues between gender groups. Some of these significant differences reflect well-known cultural patterns. Moreover, we also gathered evidence that our methodology offers useful information about gender preference for venues in a given region in the real world. This suggests that gender and venue preferences observed may not be independent. Our results suggests that our proposed methodology could be a promising tool to support studies on gender preferences for venues at different spatial granularities around the world, being faster and cheaper than traditional methods, besides quickly capturing changes in the real world.
Submitted 18 March, 2017;
originally announced March 2017.
-
Understanding Video-Ad Consumption on YouTube: A Measurement Study on User Behavior, Popularity, and Content Properties
Authors:
Mariana Arantes,
Flavio Figueiredo,
Jussara M. Almeida
Abstract:
Faced with the challenge of attracting user attention and revenue, social media websites have turned to video advertisements (video-ads). While in traditional media the video-ad market is mostly based on an interaction between content providers and marketers, the use of video-ads in social media has enabled a more complex interaction that also includes content creator and viewer preferences. To better understand this novel setting, we present the first data-driven analysis of video-ad exhibitions on YouTube.
Submitted 26 April, 2016;
originally announced April 2016.
-
Improving the Effectiveness of Content Popularity Prediction Methods using Time Series Trends
Authors:
Flavio Figueiredo,
Marcos André Gonçalves,
Jussara M. Almeida
Abstract:
We here present a simple and effective model to predict the popularity of web content. Our solution, which is the winner of two of the three tasks of the ECML/PKDD 2014 Predictive Analytics Challenge, aims at predicting user engagement metrics, such as number of visits and social network engagement, that a web page will achieve 48 hours after its upload, using only information available in the first hour after upload. Our model is based on two steps. We first use time series clustering techniques to extract common temporal trends of content popularity. Next, we use linear regression models, exploiting as predictors both content features (e.g., numbers of visits and mentions on online social networks) and metrics that capture the distance between the popularity time series and the trends extracted in the first step. We discuss why this model is effective and show its gains over state-of-the-art alternatives.
Submitted 29 August, 2014;
originally announced August 2014.
-
Revisit Behavior in Social Media: The Phoenix-R Model and Discoveries
Authors:
Flavio Figueiredo,
Jussara M. Almeida,
Yasuko Matsubara,
Bruno Ribeiro,
Christos Faloutsos
Abstract:
How many listens will an artist receive on an online radio? How about plays on a YouTube video? How many of these visits are new or returning users? Modeling and mining popularity dynamics of social activity has important implications for researchers, content creators and providers. We here investigate the effect of revisits (successive visits from a single user) on content popularity. Using four datasets of social activity, with up to tens of millions of media objects (e.g., YouTube videos, Twitter hashtags or LastFM artists), we show the effect of revisits in the popularity evolution of such objects. Secondly, we propose the Phoenix-R model, which captures the popularity dynamics of individual objects. Phoenix-R has the desired properties of being: (1) parsimonious, being based on the minimum description length principle, and achieving lower root mean squared error than state-of-the-art baselines; (2) applicable, the model is effective for predicting future popularity values of objects.
Submitted 22 June, 2014; v1 submitted 6 May, 2014;
originally announced May 2014.
-
TrendLearner: Early Prediction of Popularity Trends of User Generated Content
Authors:
Flavio Figueiredo,
Jussara M. Almeida,
Marcos André Gonçalves,
Fabrício Benevenuto
Abstract:
We here focus on the problem of predicting the popularity trend of user generated content (UGC) as early as possible. Taking YouTube videos as a case study, we propose a novel two-step learning approach that: (1) extracts popularity trends from previously uploaded objects, and (2) predicts trends for new content. Unlike previous work, our solution explicitly addresses the inherent tradeoff between prediction accuracy and remaining interest in the content after prediction, solving it on a per-object basis. Our experimental results show great improvements of our solution over alternatives, and its applicability to improve the accuracy of state-of-the-art popularity prediction methods.
Submitted 14 February, 2016; v1 submitted 10 February, 2014;
originally announced February 2014.
-
On the Dynamics of Social Media Popularity: A YouTube Case Study
Authors:
Flavio Figueiredo,
Jussara M. Almeida,
Marcos André Gonçalves,
Fabrício Benevenuto
Abstract:
Understanding the factors that impact the popularity dynamics of social media can drive the design of effective information services, besides providing valuable insights to content generators and online advertisers. Taking YouTube as a case study, we analyze how video popularity evolves since upload, extracting popularity trends that characterize groups of videos. We also analyze the referrers that lead users to videos, correlating them, along with features of the video and early popularity measures, with the popularity trend and total observed popularity the video will experience. Our findings provide fundamental knowledge about popularity dynamics and its implications for services such as advertising and search.
Submitted 17 October, 2014; v1 submitted 7 February, 2014;
originally announced February 2014.
-
Is Learning to Rank Worth It? A Statistical Analysis of Learning to Rank Methods
Authors:
Guilherme de Castro Mendes Gomes,
Vitor Campos de Oliveira,
Jussara Marques de Almeida,
Marcos André Gonçalves
Abstract:
The Learning to Rank (L2R) research field has experienced fast-paced growth over the last few years, with a wide variety of benchmark datasets and baselines available for experimentation. We here investigate the main assumption behind this field, which is that the use of sophisticated L2R algorithms and models produces significant gains over more traditional and simple information retrieval approaches. Our experimental results surprisingly indicate that many L2R algorithms, when put up against the best individual features of each dataset, may not produce statistically significant differences, even if the absolute gains may seem large. We also find that most of the reported baselines are statistically tied, with no clear winner.
Submitted 9 March, 2013;
originally announced March 2013.
-
Spin-filtering and Disorder Induced Giant Magnetoresistance in Carbon Nanotubes: Ab Initio Calculations
Authors:
J. M. de Almeida,
A. R. Rocha,
A. J. R. da Silva,
A. Fazzio
Abstract:
Nitrogen-doped carbon nanotubes can provide reactive sites on the porphyrin-like defects. It is well known that many porphyrins have transition metal atoms, and we have explored transition metal atoms bonded to those porphyrin-like defects in N-doped carbon nanotubes. The electronic structure and transport are analyzed by means of a combination of density functional theory and recursive Green's function methods. The results determined the Heme B-like defect (an iron atom bonded to four nitrogens) as the most stable and with a higher polarization current for a single defect. With randomly positioned Heme B-like defects in nanotubes a few hundred nanometers long, the polarization reaches nearly 100%, meaning an effective spin filter. A disorder-induced magnetoresistance effect is also observed in those long nanotubes; values as high as 20000% are calculated with non-magnetic electrodes.
Submitted 25 July, 2011; v1 submitted 26 March, 2011;
originally announced March 2011.
-
Action Recognition in Videos: from Motion Capture Labs to the Web
Authors:
Ana Paula Brandão Lopes,
Eduardo Alves do Valle Jr.,
Jussara Marques de Almeida,
Arnaldo Albuquerque de Araújo
Abstract:
This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and, thus, the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing the newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area.
Submitted 17 June, 2010;
originally announced June 2010.
-
Dissipation due to fermions in inflaton equations of motion
Authors:
Javier Moreno Almeida,
Ian D. Lawrie
Abstract:
According to quantum field theory, the inflaton equation of motion does not have the local form that is generally assumed for cosmological purposes. In particular, earlier investigations of the nonequilibrium dynamics of an inflaton that decays into scalar particles suggest that the loss of inflaton energy is not well approximated by the local friction term derived from linear response theory. We extend this analysis to the case of an inflaton that decays into fermions, and reach broadly the same conclusion.
Submitted 23 November, 2007;
originally announced November 2007.
-
Quantifying social vs. antisocial behavior in email networks
Authors:
Luiz H. Gomes,
Luis M. A. Bettencourt,
Virgilio A. F. Almeida,
Jussara M. Almeida,
Fernando D. O. Castro
Abstract:
Email graphs have been used to illustrate general properties of social networks of communication and collaboration. However, increasingly, the majority of email traffic reflects opportunistic, rather than symbiotic social relations. Here we use email data drawn from a large university to construct directed graphs of email exchange that quantify the differences between social and antisocial behaviors in networks of communication. We show that while structural characteristics typical of other social networks are shared to a large extent by the legitimate component, they are not characteristic of antisocial traffic. Interestingly, opportunistic patterns of behavior do create nontrivial graphs with certain general characteristics that we identify. To complement the graph analysis, which suffers from incomplete knowledge of users external to the domain, we study temporal patterns of communication to show that the dynamical properties of email traffic can, in principle, distinguish different types of social relations.
Submitted 27 November, 2006; v1 submitted 19 January, 2006;
originally announced January 2006.
-
Comparative Graph Theoretical Characterization of Networks of Spam and Legitimate Email
Authors:
Luiz H. Gomes,
Rodrigo B. Almeida,
Luis M. A. Bettencourt,
Virgilio Almeida,
Jussara M. Almeida
Abstract:
Email is an increasingly important and ubiquitous means of communication, both facilitating contact between private individuals and enabling rises in the productivity of organizations. However, the relentless rise of automatic unauthorized emails, a.k.a. spam, is eroding away much of the attractiveness of email communication. Most of the attention dedicated to date to spam detection has focused on the content of the emails or on the addresses or domains associated with spam senders. Although methods based on these easily changeable identifiers work reasonably well, they miss the fundamental nature of spam as an opportunistic relationship, very different from the normal mutual relations between senders and recipients of legitimate email. Here we present a comprehensive graph theoretical analysis of email traffic that captures these properties quantitatively. We identify several simple metrics that serve both to distinguish between spam and legitimate email and to provide a statistical basis for models of spam traffic.
Submitted 4 April, 2005;
originally announced April 2005.
-
Improving Spam Detection Based on Structural Similarity
Authors:
Luiz H. Gomes,
Fernando D. O. Castro,
Rodrigo B. Almeida,
Luis M. A. Bettencourt,
Virgilio A. F. Almeida,
Jussara M. Almeida
Abstract:
We propose a new detection algorithm that uses structural relationships between senders and recipients of email as the basis for the identification of spam messages. Users and receivers are represented as vectors in their reciprocal spaces. A measure of similarity between vectors is constructed and used to group users into clusters. Knowledge of their classification as past senders/receivers of spam or legitimate mail, coming from an auxiliary detection algorithm, is then used to label these clusters probabilistically. The measure of similarity between the sender and receiver sets of a new message to the center vector of clusters is then used to assess the possibility of that message being legitimate or spam. We show that the proposed algorithm is able to correct part of the false positives (legitimate messages classified as spam) using a testbed of a one-week SMTP log.
Submitted 5 April, 2005;
originally announced April 2005.