MPS: A New Method for Selecting the Stable Closed-Loop Equilibrium Attitude-Error Quaternion of a UAV During Flight

Authors: Francisco M. F. R. Gonçalves, Ryan M. Bena, Konstantin I. Matveev, Néstor O. Pérez-Arancibia

Abstract: We present model predictive selection (MPS), a new method for selecting the stable closed-loop (CL) equilibrium attitude-error quaternion (AEQ) of an uncrewed aerial vehicle (UAV) during the execution of high-speed yaw maneuvers. In this approach, we minimize the cost of yawing measured with a performance figure of merit (PFM) that takes into account both the aerodynamic-torque control input and attitude-error state of the UAV. Specifically, this method uses a control law with a term whose sign is dynamically switched in real time to select, between two options, the torque associated with the lesser cost of rotation as predicted by a dynamical model of the UAV derived from first principles. This problem is relevant because the selection of the stable CL equilibrium AEQ significantly impacts the performance of a UAV during high-speed rotational flight, from both the power and control-error perspectives. To test and demonstrate the functionality and performance of the proposed method, we present data collected during one hundred real-time high-speed yaw-tracking flight experiments. These results highlight the superior capabilities of the proposed MPS-based scheme when compared to a benchmark controller commonly used in aerial robotics, as the PFM used to quantify the cost of flight is reduced by 60.30%, on average. To the best of our knowledge, these are the first flight-test results that thoroughly demonstrate, evaluate, and compare the performance of a real-time controller capable of selecting the stable CL equilibrium AEQ during operation.

Submitted 11 March, 2024; originally announced March 2024.

Comments: ICRA 2024
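
The sign-selection idea described in the abstract above can be illustrated with a toy single-axis sketch. This is not the authors' controller or model: the yaw dynamics, the gains `J`, `k`, `c`, the weight `w`, the horizon, and the function names are all hypothetical, chosen only to show how a model-predicted cost could decide between the two equilibria of the error quaternion.

```python
import math

def predicted_cost(theta0, omega0, sigma, J=1.0, k=4.0, c=4.0, w=1.0,
                   dt=1e-3, horizon=5.0):
    """Integrate a toy single-axis yaw model and return the accumulated cost.

    theta is the yaw-error angle, so the error quaternion is
    (m, n) = (cos(theta/2), sin(theta/2)). sigma = +1 makes m = +1 the
    stable equilibrium; sigma = -1 makes m = -1 the stable one.
    All parameters are assumed values, not taken from the paper.
    """
    theta, omega, cost = theta0, omega0, 0.0
    for _ in range(int(horizon / dt)):
        n = math.sin(theta / 2.0)
        tau = -sigma * k * n - c * omega      # sign-switched proportional term plus damping
        cost += (tau * tau + w * n * n) * dt  # torque effort plus attitude error
        omega += (tau / J) * dt               # Euler-type integration step
        theta += omega * dt
    return cost

def select_sign(theta0, omega0):
    """Return the sign (+1 or -1) whose predicted cost is lower."""
    cost_plus = predicted_cost(theta0, omega0, +1)
    cost_minus = predicted_cost(theta0, omega0, -1)
    return +1 if cost_plus <= cost_minus else -1
```

In this toy model, a vehicle at rest with a 90-degree yaw error is steered back through the short way (`sigma = +1`), while a 270-degree error selects the other equilibrium (`sigma = -1`), which represents the same physical attitude.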

arXiv:2402.03337 [pdf, other]

Reinforcement-learning robotic sailboats: simulator and preliminary results

Authors: Eduardo Charles Vasconcellos, Ronald M Sampaio, André P D Araújo, Esteban Walter Gonzales Clua, Philippe Preux, Raphael Guerra, Luiz M G Gonçalves, Luis Martí, Hernan Lira, Nayat Sanchez-Pi

Abstract: This work focuses on the main challenges and problems in developing a virtual oceanic environment reproducing real experiments using Unmanned Surface Vehicles (USV) digital twins. We introduce the key features for building virtual worlds, considering using Reinforcement Learning (RL) agents for autonomous navigation and control. With this in mind, the main problems concern the definition of the simulation equations (physics and mathematics), their effective implementation, and how to include strategies for simulated control and perception (sensors) to be used with RL. We present the modeling, implementation steps, and challenges required to create a functional digital twin based on a real robotic sailing vessel. The application is immediate for developing navigation algorithms based on RL to be applied on real boats.

Submitted 16 January, 2024; originally announced February 2024.

Journal ref: NeurIPS 2023 Workshop on Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models, Dec 2023, New Orleans, United States

arXiv:2311.07216 [pdf, other]

Few Shot Learning for the Classification of Confocal Laser Endomicroscopy Images of Head and Neck Tumors

Authors: Marc Aubreville, Zhaoya Pan, Matti Sievert, Jonas Ammeling, Jonathan Ganz, Nicolai Oetter, Florian Stelzle, Ann-Kathrin Frenken, Katharina Breininger, Miguel Goncalves

Abstract: The surgical removal of head and neck tumors requires safe margins, which are usually confirmed intraoperatively by means of frozen sections. This method is, in itself, an oversampling procedure, which has a relatively low sensitivity compared to the definitive tissue analysis on paraffin-embedded sections. Confocal laser endomicroscopy (CLE) is an in-vivo imaging technique that has shown its potential in the live optical biopsy of tissue. An automated analysis of this notoriously difficult-to-interpret modality would help surgeons. However, the images of CLE show a wide variability of patterns, caused by individual factors and, most strongly, by the anatomical structures of the imaged tissue, making it a challenging pattern recognition task. In this work, we evaluate four popular few shot learning (FSL) methods towards their capability of generalizing to unseen anatomical domains in CLE images. We evaluate this on images of sinunasal tumors (SNT) from five patients and on images of the vocal folds (VF) from 11 patients using a cross-validation scheme. The best respective approach reached a median accuracy of 79.6% on the rather homogeneous VF dataset, but only of 61.6% for the highly diverse SNT dataset. Our results indicate that FSL on CLE images is viable, but strongly affected by the number of patients, as well as the diversity of anatomical patterns.

Submitted 13 November, 2023; originally announced November 2023.

Comments: 6 pages

arXiv:2310.03491 [pdf, other]

TPDR: A Novel Two-Step Transformer-based Product and Class Description Match and Retrieval Method

Authors: Washington Cunha, Celso França, Leonardo Rocha, Marcos André Gonçalves

Abstract: There is a niche of companies responsible for intermediating the purchase of large batches of varied products for other companies, for which the main challenge is to perform product description standardization, i.e., matching an item described by a client with a product described in a catalog. The problem is complex since the client's product description may be: (1) potentially noisy; (2) short and uninformative (e.g., missing information about model and size); and (3) cross-language. In this paper, we formalize this problem as a ranking task: given an initial client product specification (query), return the most appropriate standardized descriptions (response). We propose TPDR, a two-step Transformer-based Product and Class Description Retrieval method that is able to explore the semantic correspondence between IS and SD, by exploiting attention mechanisms and contrastive learning. First, TPDR employs the transformers as two encoders sharing the embedding vector space: one for encoding the IS and another for the SD, in which corresponding pairs (IS, SD) must be close in the vector space. Closeness is further enforced by a contrastive learning mechanism leveraging a specialized loss function. TPDR also exploits a (second) re-ranking step based on syntactic features that are very important for the exact matching (model, dimension) of certain products that may have been neglected by the transformers. To evaluate our proposal, we consider 11 datasets from a real company, covering different application contexts. Our solution was able to retrieve the correct standardized product before the 5th ranking position in 71% of the cases and its correct category in the first position in 80% of the situations. Moreover, the effectiveness gains over purely syntactic or semantic baselines reach up to 3.7 times, solving cases that none of the approaches can solve in isolation.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 10 pages, 8 figures, 5 tables

arXiv:2307.01373 [pdf, other]

A numerical variability approach to results stability tests and its application to neuroimaging

Authors: Yohan Chatelain, Loïc Tetrel, Christopher J. Markiewicz, Mathias Goncalves, Gregory Kiar, Oscar Esteban, Pierre Bellec, Tristan Glatard

Abstract: Ensuring the long-term reproducibility of data analyses requires results stability tests to verify that analysis results remain within acceptable variation bounds despite inevitable software updates and hardware evolutions. This paper introduces a numerical variability approach for results stability tests, which determines acceptable variation bounds using random rounding of floating-point calculations. By applying the resulting stability test to fMRIPrep, a widely used neuroimaging tool, we show that the test is sensitive enough to detect subtle updates in image processing methods while remaining specific enough to accept numerical variations within a reference version of the application. This result contributes to enhancing the reliability and reproducibility of data analyses by providing a robust and flexible method for stability testing.

Submitted 10 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

ACM Class: D.2.5
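
As a rough illustration of the idea in the abstract above (deriving acceptable variation bounds from randomly rounded floating-point calculations), the following minimal sketch perturbs a toy computation with stochastic rounding and accepts a result only if it falls within the sampled spread. The reduced precision (`bits`), the number of runs, the `k`-sigma bound, and every function name are assumptions made for illustration; the paper's actual tooling and acceptance criterion are not described in this listing.

```python
import math
import random
import statistics

def random_round(x, rng, bits=20):
    """Round x to `bits` significant bits, up or down at random (stochastic rounding)."""
    if x == 0.0:
        return 0.0
    mantissa, exponent = math.frexp(x)        # x = mantissa * 2**exponent
    scaled = mantissa * (1 << bits)
    low = math.floor(scaled)
    # Round up with probability equal to the fractional part
    rounded = low + (1 if rng.random() < scaled - low else 0)
    return math.ldexp(rounded, exponent - bits)

def noisy_mean(values, rng):
    """Toy 'analysis': a running sum with random rounding after each operation."""
    total = 0.0
    for v in values:
        total = random_round(total + v, rng)
    return random_round(total / len(values), rng)

def variation_bounds(values, runs=200, k=3.0, seed=0):
    """Sample the perturbed analysis and return (mean - k*std, mean + k*std)."""
    rng = random.Random(seed)
    samples = [noisy_mean(values, rng) for _ in range(runs)]
    mu, sigma = statistics.mean(samples), statistics.stdev(samples)
    return mu - k * sigma, mu + k * sigma

def is_stable(result, bounds):
    """A result passes the stability test if it lies within the variation bounds."""
    low, high = bounds
    return low <= result <= high
```

A result produced by an updated version of the analysis would then be checked with `is_stable(new_result, variation_bounds(reference_inputs))`, where both names are placeholders.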

arXiv:2111.06166 [pdf, other]

G-GPU: A Fully-Automated Generator of GPU-like ASIC Accelerators

Authors: Tiago Diadami Perez, Márcio M. Gonçalves, José Rodrigo Azambuja, Leonardo Gobatto, Marcelo Brandalero, Samuel Pagliarini

Abstract: Modern Systems on Chip (SoC), almost as a rule, require accelerators for achieving energy efficiency and high performance for specific tasks that are not necessarily well suited for execution in standard processing units. Considering the broad range of applications and necessity for specialization, the design of SoCs has thus become considerably more challenging. In this paper, we put forward the concept of G-GPU, a general-purpose GPU-like accelerator that is not application-specific but still gives benefits in energy efficiency and throughput. Furthermore, we have identified an existing gap for these accelerators in ASIC, for which no known automated generation platform/tool exists. Our solution, called GPUPlanner, is an open-source generator of accelerators, from RTL to GDSII, that addresses this gap. Our analysis results show that our automatically generated G-GPU designs are remarkably efficient when compared against the popular CPU architecture RISC-V, presenting speed-ups of up to 223 times in raw performance and up to 11 times when the metric is performance derated by area. These results are achieved by executing a design space exploration of the GPU-like accelerators, where the memory hierarchy is broken in a smart fashion and the logic is pipelined on demand. Finally, tapeout-ready layouts of the G-GPU in 65nm CMOS are presented.

Submitted 6 December, 2021; v1 submitted 11 November, 2021; originally announced November 2021.

arXiv:2010.05681 [pdf, other]

From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering

Authors: Nuno Mota Goncalves, Ioana Giurgiu, Anika Schumann

Abstract: Unsupervised clustering of temporal data is both challenging and crucial in machine learning. In this paper, we show that neither traditional clustering methods, time-series-specific methods, nor even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data. We propose a novel approach to temporal clustering, in which we (1) transform the input time series into a distance-based projected representation by using similarity measures suitable for dealing with temporal data, (2) feed these projections into a multi-layer CNN-GRU autoencoder to generate meaningful domain-aware latent representations, which ultimately (3) allow for a natural separation of clusters beneficial for most important traditional clustering algorithms. We evaluate our approach on time series datasets from various domains and show that it not only outperforms existing methods in all cases, by up to 32%, but is also robust and incurs negligible computation overheads.

Submitted 2 October, 2020; originally announced October 2020.

Comments: 8 pages
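
Step (1) of the abstract above, the distance-based projection, can be sketched as follows. The choice of dynamic time warping (DTW) as the similarity measure, the use of a fixed set of reference series, and the function names are assumptions for illustration only; this listing does not say which measures or references the authors use.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)             # row 0 of the cumulative cost table
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Local cost plus the cheapest of the three admissible predecessors
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]

def project(series, references):
    """Map a variable-length series to a fixed-length vector of distances."""
    return [dtw_distance(series, ref) for ref in references]
```

Because DTW tolerates different lengths, series sampled at different rates map to vectors of the same dimension (one entry per reference), which is the kind of Euclidean representation a downstream autoencoder or clustering algorithm can consume.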

arXiv:2009.14178 [pdf, other]

Robust Detection of Objects under Periodic Motion with Gaussian Process Filtering

Authors: Joris Guerin, Anne Magaly de Paula Canuto, Luiz Marcos Garcia Goncalves

Abstract: Object Detection (OD) is an important task in Computer Vision with many practical applications. For some use cases, OD must be done on videos, where the object of interest has a periodic motion. In this paper, we formalize the problem of periodic OD, which consists in improving the performance of an OD model in the specific case where the object of interest is repeating similar spatio-temporal trajectories with respect to the video frames. The proposed approach is based on training a Gaussian Process to model the periodic motion, and using it to filter out the erroneous predictions of the OD model. By simulating various OD models and periodic trajectories, we demonstrate that this filtering approach, which is entirely data-driven, improves the detection performance by a large margin.

Submitted 29 September, 2020; originally announced September 2020.

Comments: 8 pages, 11 figures, 1 table. Accepted as a full paper at ICMLA 2020 (19th IEEE International Conference On Machine Learning And Applications)

arXiv:1905.00825 [pdf, other]

doi 10.1145/3292522.3326018

Characterizing Attention Cascades in WhatsApp Groups

Authors: Josemar Alves Caetano, Gabriel Magno, Marcos Gonçalves, Jussara Almeida, Humberto T. Marques-Neto, Virgílio Almeida

Abstract: An important political and social phenomenon discussed in several countries, like India and Brazil, is the use of WhatsApp to spread false or misleading content. However, little is known about the information dissemination process in WhatsApp groups. Attention affects the dissemination of information in WhatsApp groups, determining what topics or subjects are more attractive to participants of a group. In this paper, we characterize and analyze how attention propagates among the participants of a WhatsApp group. An attention cascade begins when a user asserts a topic in a message to the group, which could include written text, photos, or links to articles online. Others then propagate the information by responding to it. We analyzed attention cascades in more than 1.7 million messages posted in 120 groups over one year. Our analysis focused on the structural and temporal evolution of attention cascades as well as on the behavior of users that participate in them. We found specific characteristics in cascades associated with groups that discuss political subjects and false information. For instance, we observe that cascades with false information tend to be deeper, reach more users, and last longer in political groups than in non-political groups.

Submitted 3 May, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

Comments: Accepted as a full paper at the 11th International ACM Web Science Conference (WebSci 2019). Please cite the WebSci version

arXiv:1902.08985 [pdf, other]

doi 10.1007/978-3-030-29196-9_4

Transferability of Deep Learning Algorithms for Malignancy Detection in Confocal Laser Endomicroscopy Images from Different Anatomical Locations of the Upper Gastrointestinal Tract

Authors: Marc Aubreville, Miguel Goncalves, Christian Knipfer, Nicolai Oetter, Helmut Neumann, Florian Stelzle, Christopher Bohr, Andreas Maier

Abstract: Squamous Cell Carcinoma (SCC) is the most common cancer type of the epithelium and is often detected at a late stage. Besides invasive diagnosis of SCC by means of biopsy and histo-pathologic assessment, Confocal Laser Endomicroscopy (CLE) has emerged as a noninvasive method that was successfully used to diagnose SCC in vivo. For interpretation of CLE images, however, extensive training is required, which limits the applicability and use of the method in clinical practice. To aid diagnosis of SCC in a broader scope, automatic detection methods have been proposed. This work compares two methods with regard to their applicability in a transfer learning sense, i.e. training on one tissue type (from one clinical team) and applying the learnt classification system to another entity (different anatomy, different clinical team). Besides a previously proposed, patch-based method based on convolutional neural networks, a novel classification method on image level (based on a pre-trained Inception V.3 network with dedicated preprocessing and interpretation of class activation maps) is proposed and evaluated. The newly presented approach improves recognition performance, yielding accuracies of 91.63% on the first data set (oral cavity) and 92.63% on a joint data set. The generalization from oral cavity to the second data set (vocal folds) led to area-under-the-ROC-curve values similar to those of direct training on the vocal folds data set, indicating good generalization.

Submitted 3 January, 2020; v1 submitted 24 February, 2019; originally announced February 2019.

Comments: Erratum for version 1, correcting the number of CLE image sequences used in one data set

Journal ref: BIOSTEC 2018: Biomedical Engineering Systems and Technologies

arXiv:1808.06900 [pdf, other]

doi 10.1109/LCN.Workshops.2017.71

Defending against Intrusion of Malicious UAVs with Networked UAV Defense Swarms

Authors: Matthias R. Brust, Grégoire Danoy, Pascal Bouvry, Dren Gashi, Himadri Pathak, Mike P. Gonçalves

Abstract: Nowadays, companies such as Amazon, Alibaba, and even pizza chains are pushing forward to use drones, also called UAVs (Unmanned Aerial Vehicles), for service provision, such as package and food delivery. As governments intend to use these immense economic benefits that UAVs have to offer, urban planners are moving forward to incorporate so-called UAV flight zones and UAV highways in their smart city designs. However, the high-speed mobility and behavior dynamics of UAVs need to be monitored to detect and, subsequently, to deal with intruders, rogue drones, and UAVs with malicious intent. This paper proposes a UAV defense system for the purpose of intercepting and escorting a malicious UAV outside the flight zone. The proposed UAV defense system consists of a defense UAV swarm, which is capable of self-organizing its defense formation in the event of intruder detection and chasing the malicious UAV as a networked swarm. Modular design principles have been used for our fully localized approach. We developed an innovative auto-balanced clustering process to realize the intercept- and capture-formation. As it turned out, the resulting networked defense UAV swarm is resilient against communication losses. Finally, a prototype UAV simulator has been implemented. Through extensive simulations, we show the feasibility and performance of our approach.

Submitted 2 September, 2018; v1 submitted 21 August, 2018; originally announced August 2018.

Comments: IEEE Conference on Local Computer Networks (LCN), 2017

arXiv:1711.07915 [pdf, ps, other]

10Sent: A Stable Sentiment Analysis Method Based on the Combination of Off-The-Shelf Approaches

Authors: Philipe F. Melo, Daniel H. Dalip, Manoel M. Junior, Marcos A. Gonçalves, Fabrício Benevenuto

Abstract: Sentiment analysis has become a very important tool for analysis of social media data. There are several methods developed for this research field, many of them working very differently from each other, covering distinct aspects of the problem and disparate strategies. Despite the large number of existent techniques, there is no single one which fits well in all cases or for all data sources. Supervised approaches may be able to adapt to specific situations but they require manually labeled training data, which is very cumbersome and expensive to acquire, mainly for a new application. In this context, we propose to combine several very popular and effective state-of-the-practice sentiment analysis methods, by means of an unsupervised bootstrapped strategy for polarity classification. One of our main goals is to reduce the large variability (lack of stability) of the unsupervised methods across different domains (datasets). Our solution was thoroughly tested considering thirteen different datasets in several domains such as opinions, comments, and social media. The experimental results demonstrate that our combined method (aka, 10SENT) improves the effectiveness of the classification task, but more importantly, it solves a key problem in the field. It is consistently among the best methods in many data types, meaning that it can produce the best (or close to best) results in almost all considered contexts, without any additional costs (e.g., manual labeling). Our self-learning approach is also very independent of the base methods, which means that it is highly extensible to incorporate any new additional method that can be envisioned in the future. Finally, we also investigate a transfer learning approach for sentiment analysis as a means to gather additional (unsupervised) information for the proposed approach and we show the potential of this technique to improve our results.

Submitted 21 November, 2017; originally announced November 2017.

arXiv:1707.08149 [pdf, other]

doi 10.5220/0006534700270034

Patch-based Carcinoma Detection on Confocal Laser Endomicroscopy Images -- A Cross-Site Robustness Assessment

Authors: Marc Aubreville, Miguel Goncalves, Christian Knipfer, Nicolai Oetter, Tobias Wuerfl, Helmut Neumann, Florian Stelzle, Christopher Bohr, Andreas Maier

Abstract: Deep learning technologies such as convolutional neural networks (CNN) provide powerful methods for image recognition and have recently been employed in the field of automated carcinoma detection in confocal laser endomicroscopy (CLE) images. CLE is a (sub-)surface microscopic imaging technique that reaches magnifications of up to 1000x and is thus suitable for in vivo structural tissue analysis. In this work, we aim to evaluate the prospects of a previously developed deep learning-based algorithm targeted at the identification of oral squamous cell carcinoma with regard to its generalization to further anatomic locations of squamous cell carcinomas in the area of head and neck. We applied the algorithm on images acquired from the vocal fold area of five patients with histologically verified squamous cell carcinoma and presumably healthy control images of the clinically normal contra-lateral vocal cord. We find that the network trained on the oral cavity data reaches an accuracy of 89.45% and an area-under-the-curve (AUC) value of 0.955, when applied on the vocal cords data. Compared to the state of the art, we achieve very similar results, yet with an algorithm that was trained on a completely disjoint data set. Concatenating both data sets yielded further improvements in cross-validation with an accuracy of 90.81% and AUC of 0.970. In this study, for the first time to our knowledge, a deep learning mechanism for the identification of oral carcinomas using CLE images could be applied to other disciplines in the area of head and neck. This study shows the prospect of the algorithmic approach to generalize well on other malignant entities of the head and neck, regardless of the anatomical location and furthermore in an examiner-independent manner.

Submitted 3 January, 2020; v1 submitted 25 July, 2017; originally announced July 2017.

Comments: Erratum: In the previous version, the number of CLE sequences in the vocal folds data set was inadequately reported

Journal ref: Proceedings of BIOIMAGING 2018, ISBN: 978-989-758-278-3

arXiv:1512.01818 [pdf, other]

SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

Authors: Filipe Nunes Ribeiro, Matheus Araújo, Pollyanna Gonçalves, Fabrício Benevenuto, Marcos André Gonçalves

Abstract: In the last few years thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiments, including lexical-based and supervised machine learning methods. Despite the vast interest in the theme and wide popularity of some methods, it is unclear which one is better for identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need to conduct a thorough apples-to-apples comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originating from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims at filling this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results highlight the extent to which the prediction performance of these methods varies considerably across datasets. Aiming at boosting the development of this research area, we open the methods' codes and datasets used in this article, deploying them in a benchmark system, which provides an open API for accessing and comparing sentence-level sentiment analysis methods.

Submitted 14 July, 2016; v1 submitted 6 December, 2015; originally announced December 2015.

arXiv:1408.7094 [pdf, other]

Improving the Effectiveness of Content Popularity Prediction Methods using Time Series Trends

Authors: Flavio Figueiredo, Marcos André Gonçalves, Jussara M. Almeida

Abstract: We here present a simple and effective model to predict the popularity of web content. Our solution, which is the winner of two of the three tasks of the ECML/PKDD 2014 Predictive Analytics Challenge, aims at predicting user engagement metrics, such as number of visits and social network engagement, that a web page will achieve 48 hours after its upload, using only information available in the first hour after upload. Our model is based on two steps. We first use time series clustering techniques to extract common temporal trends of content popularity. Next, we use linear regression models, exploiting as predictors both content features (e.g., numbers of visits and mentions on online social networks) and metrics that capture the distance between the popularity time series and the trends extracted in the first step. We discuss why this model is effective and show its gains over state-of-the-art alternatives.

Submitted 29 August, 2014; originally announced August 2014.

Comments: Presented at the ECML/PKDD Discovery Challenge on Predictive Analytics. Winner of two out of three tasks of the Predictive Analytics Discovery Challenge

ACM Class: H.3.5

arXiv:1402.2351 [pdf, other]

TrendLearner: Early Prediction of Popularity Trends of User Generated Content

Authors: Flavio Figueiredo, Jussara M. Almeida, Marcos André Gonçalves, Fabrício Benevenuto

Abstract: We here focus on the problem of predicting the popularity trend of user generated content (UGC) as early as possible. Taking YouTube videos as case study, we propose a novel two-step learning approach that: (1) extracts popularity trends from previously uploaded objects, and (2) predicts trends for new content. Unlike previous work, our solution explicitly addresses the inherent tradeoff between prediction accuracy and remaining interest in the content after prediction, solving it on a per-object basis. Our experimental results show great improvements of our solution over alternatives, and its applicability to improve the accuracy of state-of-the-art popularity prediction methods.

Submitted 14 February, 2016; v1 submitted 10 February, 2014; originally announced February 2014.

Comments: To appear at Elsevier Information Sciences Journal

arXiv:1402.1777 [pdf, other]

On the Dynamics of Social Media Popularity: A YouTube Case Study

Authors: Flavio Figueiredo, Jussara M. Almeida, Marcos André Gonçalves, Fabrício Benevenuto

Abstract: Understanding the factors that impact the popularity dynamics of social media can drive the design of effective information services, besides providing valuable insights to content generators and online advertisers. Taking YouTube as case study, we analyze how video popularity evolves since upload, extracting popularity trends that characterize groups of videos. We also analyze the referrers that lead users to videos, correlating them, features of the video and early popularity measures with the popularity trend and total observed popularity the video will experience. Our findings provide fundamental knowledge about popularity dynamics and its implications for services such as advertising and search.

Submitted 17 October, 2014; v1 submitted 7 February, 2014; originally announced February 2014.

Comments: Extended version of a paper published in ACM WSDM 2011. Pre-print of the paper accepted for publication in the ACM Transactions on Internet Technology

arXiv:1303.2277 [pdf, ps, other]

Is Learning to Rank Worth It? A Statistical Analysis of Learning to Rank Methods

Authors: Guilherme de Castro Mendes Gomes, Vitor Campos de Oliveira, Jussara Marques de Almeida, Marcos André Gonçalves

Abstract: The Learning to Rank (L2R) research field has experienced fast-paced growth over the last few years, with a wide variety of benchmark datasets and baselines available for experimentation. We here investigate the main assumption behind this field, which is that the use of sophisticated L2R algorithms and models produces significant gains over more traditional and simple information retrieval approaches. Our experimental results surprisingly indicate that many L2R algorithms, when put up against the best individual features of each dataset, may not produce statistically significant differences, even if the absolute gains may seem large. We also find that most of the reported baselines are statistically tied, with no clear winner.

Submitted 9 March, 2013; originally announced March 2013.

Comments: 7 pages, 10 tables, 14 references. Original (short) paper published in the Brazilian Symposium on Databases, 2012 (SBBD2012). Current revision submitted to the Journal of Information and Data Management (JIDM)

ACM Class: H.3

arXiv:cs/0205059 [pdf, ps, other]

A Connection-Centric Survey of Recommender Systems Research

Authors: Saverio Perugini, Marcos Andre Goncalves, Edward A. Fox

Abstract: Recommender systems attempt to reduce information overload and retain customers by selecting a subset of items from a universal set based on user preferences. While research in recommender systems grew out of information retrieval and filtering, the topic has steadily advanced into a legitimate and challenging research area of its own. Recommender systems have traditionally been studied from a content-based filtering vs. collaborative design perspective. Recommendations, however, are not delivered within a vacuum, but rather cast within an informal community of users and social context. Therefore, ultimately all recommender systems make connections among people and thus should be surveyed from such a perspective. This viewpoint is under-emphasized in the recommender systems literature. We therefore take a connection-oriented viewpoint toward recommender systems research. We posit that recommendation has an inherently social element and is ultimately intended to connect people either directly as a result of explicit user modeling or indirectly through the discovery of relationships implicit in extant data. Thus, recommender systems are characterized by how they model users to bring people together: explicitly or implicitly. Finally, user modeling and the connection-centric viewpoint raise broadening and social issues--such as evaluation, targeting, and privacy and trust--which we also briefly address.

Submitted 29 July, 2003; v1 submitted 22 May, 2002; originally announced May 2002.

Comments: Based on the comments from reviewers, we have made modifications to our article, including the following: shifted the focus of the survey completely to recommender system research rather than recommendation and personalization, and subsequently changed the title to "A Connection-Centric Survey of Recommender Systems Research." We now cite only the most seminal works in this area and as a result have reduced the references significantly, from over 200 to 120

ACM Class: A.1; H.1.0; H.1.2; H.3.0; H.3.3; H.3.4; H.3.5; H.4.2; H.5.2; H.5.4

Showing 1–19 of 19 results for author: Goncalves, M