Search | arXiv e-print repository

OLGA: One-cLass Graph Autoencoder

Authors: M. P. S. Gôlo, J. G. B. M. Junior, D. F. Silva, R. M. Marcacini

Abstract: One-class learning (OCL) comprises a set of techniques applied when real-world problems have a single class of interest. The usual procedure for OCL is learning a hypersphere that comprises instances of this class and, ideally, repels unseen instances from any other classes. Besides, several OCL algorithms for graphs have been proposed since graph representation learning has succeeded in various fields. These methods may use a two-step strategy, initially representing the graph and, in a second step, classifying its nodes. On the other hand, end-to-end methods learn the node representations while classifying the nodes in one learning process. We highlight three main gaps in the literature on OCL for graphs: (i) non-customized representations for OCL; (ii) the lack of constraints on hypersphere parameters learning; and (iii) the methods' lack of interpretability and visualization. We propose One-cLass Graph Autoencoder (OLGA). OLGA is end-to-end and learns the representations for the graph nodes while encapsulating the interest instances by combining two loss functions. We propose a new hypersphere loss function to encapsulate the interest instances. OLGA combines this new hypersphere loss with the graph autoencoder reconstruction loss to improve model learning. OLGA achieved state-of-the-art results and outperformed six other methods with a statistically significant difference from five methods. Moreover, OLGA learns low-dimensional representations maintaining the classification performance with an interpretable model representation learning and results.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.09073 [pdf, other]

Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition

Authors: Eleni Triantafillou, Peter Kairouz, Fabian Pedregosa, Jamie Hayes, Meghdad Kurmanji, Kairan Zhao, Vincent Dumoulin, Julio Jacques Junior, Ioannis Mitliagkas, Jun Wan, Lisheng Sun Hosoya, Sergio Escalera, Gintare Karolina Dziugaite, Peter Triantafillou, Isabelle Guyon

Abstract: We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In this paper, we analyze top solutions and delve into discussions on benchmarking unlearning, which itself is a research problem. The evaluation methodology we developed for the competition measures forgetting quality according to a formal notion of unlearning, while incorporating model utility for a holistic evaluation. We analyze the effectiveness of different instantiations of this evaluation framework vis-a-vis the associated compute cost, and discuss implications for standardizing evaluation. We find that the ranking of leading methods remains stable under several variations of this framework, pointing to avenues for reducing the cost of evaluation. Overall, our findings indicate progress in unlearning, with top-performing competition entries surpassing existing algorithms under our evaluation framework. We analyze trade-offs made by different algorithms and strengths or weaknesses in terms of generalizability to new datasets, paving the way for advancing both benchmarking and algorithm development in this important area.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2405.05129 [pdf, other]

Web Intelligence Journal in perspective: an analysis of its two decades trajectory

Authors: Diogenes Ademir Domingos, Victor Emanuel Santos Moura, Antonio Fernando Lavareda Jacob Junior, Fabio Manoel Franca Lobato

Abstract: The evolution of a thematic area undergoes various changes of perspective and adopts new theoretical approaches that arise from the interactions of the community and a wide range of social needs. The advent of digital technologies, such as social networks, underlines this factor by spreading knowledge and forging links between different communities. Web intelligence is now on the verge of raising questions that broaden the understanding of how artificial intelligence impacts the Web of People, Data, and Things, among other factors. To the best of our knowledge, there is no study that has conducted a longitudinal analysis of the evolution of this community. Thus, we investigate in this paper how Web intelligence has evolved in the last twenty years by carrying out a literature review and bibliometric analysis. Concerning the impact of this research study, increasing attention is devoted to determining which are the most influential papers in the community by referring to citation networks and discovering the most popular and pressing topics through a co-citation analysis and the keywords co-occurrence. The results obtained can guide the direction of new research projects in the area and update the scope and places of interest found in current trends and the relevant journals.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2404.09703 [pdf, other]

AI Competitions and Benchmarks: Dataset Development

Authors: Romain Egele, Julio C. S. Jacques Junior, Jan N. van Rijn, Isabelle Guyon, Xavier Baró, Albert Clapés, Prasanna Balaprakash, Sergio Escalera, Thomas Moeslund, Jun Wan

Abstract: Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual data preparation. The haste in developing new models can frequently result in various shortcomings, potentially posing risks when deployed in real-world scenarios (e.g., social discrimination, critical failures), leading to the failure or substantial escalation of costs in AI-based projects. This chapter provides a comprehensive overview of established methodological tools, enriched by our practical experience, in the development of datasets for machine learning. Initially, we develop the tasks involved in dataset development and offer insights into their effective management (including requirements, design, implementation, evaluation, distribution, and maintenance). Then, we provide more details about the implementation process which includes data collection, transformation, and quality evaluation. Finally, we address practical considerations regarding dataset distribution and maintenance.

Submitted 15 April, 2024; originally announced April 2024.

Comments: Preprint version of the 3rd Chapter of the book: Competitions and Benchmarks, the science behind the contests (https://sites.google.com/chalearn.org/book/home)

arXiv:2404.01552 [pdf]

The use of the open innovation paradigm in the public sector: a systematic review of published studies

Authors: Joel Alves de Lima Júnior, Kiev Gama, Jorge da Silva Correia Neto

Abstract: The use of the open innovation paradigm has been, over the past years, getting special attention in the public sector. Motivated by an urban environment that is increasingly more complex and challenging, several government agencies have been allocating financial resources and efforts to promote open and participative government initiatives. As a way to try and understand this scenario, a systematic review of the literature was conducted, to provide a comprehensive analysis of the scientific papers that were published, seeking to capture, classify, evaluate and synthesize how the use of this paradigm has been put into practice in the public sector. In total, 4,741 preliminary studies were analyzed. From this number, only 37 articles were classified as potentially relevant and moved forward, going through the process of data extraction and analysis. From the data obtained, it was possible to verify that the use of this paradigm started to be reported with a higher frequency in the literature since 2013 and, among the main findings, we highlight the reports of experiences, approach propositions, of understanding how the phenomenon occurs and theoretical reflections. It was also possible to verify that the use of open innovation through social media was one of the pioneer techniques of engagement between the public sector and citizens.
In conclusion, the reports confirm that the main challenges of this paradigm applied to the public sector are associated with their respective bureaucratic aspects, therefore lacking a bigger reflection on the procedures and methods to be used in the public sphere.

Submitted 8 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: 33 pages, 6 figures and 18 tables

arXiv:2402.15470 [pdf, other]

Some results involving the $A_α$-eigenvalues for graphs and line graphs

Authors: Joao Domingos Gomes da Silva Junior, Carla Silva Oliveira, Liliana Manuela Gaspar C. da Costa

Abstract: Let $G$ be a simple graph with adjacency matrix $A(G)$, signless Laplacian matrix $Q(G)$, degree diagonal matrix $D(G)$ and let $l(G)$ be the line graph of $G$. In 2017, Nikiforov defined the $A_α$-matrix of $G$, $A_α(G)$, as a linear convex combination of $A(G)$ and $D(G)$, the following way, $A_α(G):=αA(G)+(1-α)D(G),$ where $α\in[0,1]$. In this paper, we present some bounds for the eigenvalues of $A_α(G)$ and for the largest and smallest eigenvalues of $A_α(l(G))$. Extremal graphs attaining some of these bounds are characterized.

Submitted 23 February, 2024; originally announced February 2024.

Comments: 18 pages, 5 figures, 3 tables

MSC Class: 05C05
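
The $A_α$-matrix defined in the abstract above can be illustrated numerically. The following is a minimal sketch, not the authors' code: the helper name `a_alpha` and the choice of the path graph $P_3$ are assumptions made only for illustration. It follows the formula exactly as written in the abstract; Nikiforov's original paper attaches $α$ to $D(G)$ instead, so the two conventions differ by swapping $α$ and $1-α$. At $α = 1/2$ both give $Q(G)/2$, which is the case computed here.

```python
import numpy as np


def a_alpha(adjacency: np.ndarray, alpha: float) -> np.ndarray:
    """Return alpha*A(G) + (1 - alpha)*D(G), as written in the abstract.

    `a_alpha` is a hypothetical helper name used only for this sketch.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # D(G): diagonal matrix of vertex degrees (row sums of A(G)).
    degree = np.diag(adjacency.sum(axis=1))
    return alpha * adjacency + (1.0 - alpha) * degree


# Adjacency matrix of the path graph P_3 (vertices 0 - 1 - 2).
A = np.array(
    [
        [0.0, 1.0, 0.0],
        [1.0, 0.0, 1.0],
        [0.0, 1.0, 0.0],
    ]
)

# At alpha = 1/2 the matrix equals Q(G)/2 under either convention.
M = a_alpha(A, 0.5)

# The matrix is symmetric, so eigvalsh applies; eigenvalues come back ascending.
eigenvalues = np.linalg.eigvalsh(M)
print(eigenvalues)  # approximately [0, 0.5, 1.5]
```

Since $P_3$ is bipartite, its signless Laplacian has the same spectrum as its Laplacian, $\{0, 1, 3\}$, which is why halving gives the values printed above.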

arXiv:2311.04253 [pdf, ps, other]

Blind Federated Learning via Over-the-Air q-QAM

Authors: Saeed Razavikia, José Mairton Barros Da Silva Júnior, Carlo Fischione

Abstract: In this work, we investigate federated edge learning over a fading multiple access channel. To alleviate the communication burden between the edge devices and the access point, we introduce a pioneering digital over-the-air computation strategy employing q-ary quadrature amplitude modulation, culminating in a low latency communication scheme. Indeed, we propose a new federated edge learning framework in which edge devices use digital modulation for over-the-air uplink transmission to the edge server while they have no access to the channel state information. Furthermore, we incorporate multiple antennas at the edge server to overcome the fading inherent in wireless communication. We analyze the number of antennas required to mitigate the fading impact effectively. We prove a non-asymptotic upper bound for the mean squared error for the proposed federated learning with digital over-the-air uplink transmissions under both noisy and fading conditions. Leveraging the derived upper bound, we characterize the convergence rate of the learning process of a non-convex loss function in terms of the mean square error of gradients due to the fading channel. Furthermore, we substantiate the theoretical assurances through numerical experiments concerning mean square error and the convergence efficacy of the digital federated edge learning framework. Notably, the results demonstrate that augmenting the number of antennas at the edge server and adopting higher-order modulations improve the model accuracy up to 60\%.

Submitted 19 April, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

arXiv:2310.20504 [pdf, ps, other]

SumComp: Coding for Digital Over-the-Air Computation via the Ring of Integers

Authors: Saeed Razavikia, José Mairton Barros Da Silva Júnior, Carlo Fischione

Abstract: Communication and computation are traditionally treated as separate entities, allowing for individual optimizations. However, many applications focus on local information's functionality rather than the information itself. For such cases, harnessing interference for computation in a multiple access channel through digital over-the-air computation can notably increase the computation, as established by the ChannelComp method. However, the coding scheme originally proposed in ChannelComp may suffer from high computational complexity because it is general and is not optimized for specific modulation categories. Therefore, this study considers a specific category of digital modulations for over-the-air computations, QAM and PAM, for which we introduce a novel coding scheme called SumComp. Furthermore, we derive an MSE analysis for SumComp coding in the computation of the arithmetic mean function and establish an upper bound on the MAE for a set of nomographic functions. Simulation results affirm the superior performance of SumComp coding compared to traditional analog over-the-air computation and the original coding in ChannelComp approaches regarding both MSE and MAE over a noisy multiple access channel. Specifically, SumComp coding shows approximately $10$ dB improvements for computing arithmetic and geometric mean on the normalized MSE for low noise scenarios.

Submitted 27 June, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

arXiv:2310.06532 [pdf, ps, other]

ChannelComp: A General Method for Computation by Communications

Authors: Saeed Razavikia, José Mairton Barros Da Silva Júnior, Carlo Fischione

Abstract: Over-the-air computation (AirComp) is a well-known technique by which several wireless devices transmit by analog amplitude modulation to achieve a sum of their transmit signals at a common receiver. The underlying physical principle is the superposition property of the radio waves. Since such superposition is analog and in amplitude, it is natural that AirComp uses analog amplitude modulations. Unfortunately, this is impractical because most wireless devices today use digital modulations. It would be highly desirable to use digital communications because of their numerous benefits, such as error correction, synchronization, acquisition of channel state information, and widespread use. However, when we use digital modulations for AirComp, a general belief is that the superposition property of the radio waves returns a meaningless overlapping of the digital signals. In this paper, we break through such beliefs and propose an entirely new digital channel computing method named ChannelComp, which can use digital as well as analog modulations. We propose a feasibility optimization problem that ascertains the optimal modulation for computing arbitrary functions over-the-air. Additionally, we propose pre-coders to adapt existing digital modulation schemes for computing the function over the multiple access channel. The simulation results verify the superior performance of ChannelComp compared to AirComp, particularly for the product functions, with more than 10 dB improvement of the computation error.

Submitted 10 October, 2023; originally announced October 2023.

arXiv:2309.02449 [pdf]

doi 10.21528/CBIC2023-161

League of Legends: Real-Time Result Prediction

Authors: Jailson B. S. Junior, Claudio E. C. Campelo

Abstract: This paper presents a study on the prediction of outcomes in matches of the electronic game League of Legends (LoL) using machine learning techniques. With the aim of exploring the ability to predict real-time results, considering different variables and stages of the match, we highlight the use of unpublished data as a fundamental part of this process. With the increasing popularity of LoL and the emergence of tournaments, betting related to the game has also emerged, making the investigation in this area even more relevant. A variety of models were evaluated and the results were encouraging. A model based on LightGBM showed the best performance, achieving an average accuracy of 81.62\% in intermediate stages of the match when the percentage of elapsed time was between 60\% and 80\%. On the other hand, the Logistic Regression and Gradient Boosting models proved to be more effective in early stages of the game, with promising results. This study contributes to the field of machine learning applied to electronic games, providing valuable insights into real-time prediction in League of Legends. The results obtained may be relevant for both players seeking to improve their strategies and the betting industry related to the game.

Submitted 1 September, 2023; originally announced September 2023.

Comments: 8 pages

arXiv:2306.16623 [pdf, other]

The Segment Anything Model (SAM) for Remote Sensing Applications: From Zero to One Shot

Authors: Lucas Prado Osco, Qiusheng Wu, Eduardo Lopes de Lemos, Wesley Nunes Gonçalves, Ana Paula Marques Ramos, Jonathan Li, José Marcato Junior

Abstract: Segmentation is an essential step for remote sensing image processing. This study aims to advance the application of the Segment Anything Model (SAM), an innovative image segmentation model by Meta AI, in the field of remote sensing image analysis. SAM is known for its exceptional generalization capabilities and zero-shot learning, making it a promising approach to processing aerial and orbital images from diverse geographical contexts. Our exploration involved testing SAM across multi-scale datasets using various input prompts, such as bounding boxes, individual points, and text descriptors. To enhance the model's performance, we implemented a novel automated technique that combines a text-prompt-derived general example with one-shot training. This adjustment resulted in an improvement in accuracy, underscoring SAM's potential for deployment in remote sensing imagery and reducing the need for manual annotation. Despite the limitations encountered with lower spatial resolution images, SAM exhibits promising adaptability to remote sensing data analysis. We recommend future research to enhance the model's proficiency through integration with supplementary fine-tuning techniques and other networks. Furthermore, we provide the open-source code of our modifications on online repositories, encouraging further and broader adaptations of SAM to the remote sensing domain.

Submitted 31 October, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: 20 pages, 9 figures

arXiv:2305.03431 [pdf, other]

Hearing the voice of experts: Unveiling Stack Exchange communities' knowledge of test smells

Authors: Luana Martins, Denivan Campos, Railana Santana, Joselito Mota Junior, Heitor Costa, Ivan Machado

Abstract: Refactorings are transformations to improve the code design without changing overall functionality and observable behavior. During the refactoring process of smelly test code, practitioners may struggle to identify refactoring candidates and define and apply corrective strategies. This paper reports on an empirical study aimed at understanding how test smells and test refactorings are discussed on the Stack Exchange network. Developers commonly count on Stack Exchange to pick the brains of the wise, i.e., to `look up' how others are completing similar tasks. Therefore, in light of data from the Stack Exchange discussion topics, we could examine how developers understand and perceive test smells, the corrective actions they take to handle them, and the challenges they face when refactoring test code aiming to fix test smells. We observed that developers are interested in others' perceptions and hands-on experience handling test code issues. Besides, there is a clear indication that developers often ask whether test smells or anti-patterns are either good or bad testing practices than code-based refactoring recommendations.

Submitted 5 May, 2023; originally announced May 2023.

Comments: Preprint of the manuscript accepted for publication at CHASE 2023

arXiv:2305.02813 [pdf, other]

MTLSegFormer: Multi-task Learning with Transformers for Semantic Segmentation in Precision Agriculture

Authors: Diogo Nunes Goncalves, Jose Marcato Junior, Pedro Zamboni, Hemerson Pistori, Jonathan Li, Keiller Nogueira, Wesley Nunes Goncalves

Abstract: Multi-task learning has proven to be effective in improving the performance of correlated tasks. Most of the existing methods use a backbone to extract initial features with independent branches for each task, and the exchange of information between the branches usually occurs through the concatenation or sum of the feature maps of the branches. However, this type of information exchange does not directly consider the local characteristics of the image nor the level of importance or correlation between the tasks. In this paper, we propose a semantic segmentation method, MTLSegFormer, which combines multi-task learning and attention mechanisms. After the backbone feature extraction, two feature maps are learned for each task. The first map is proposed to learn features related to its task, while the second map is obtained by applying learned visual attention to locally re-weigh the feature maps of the other tasks. In this way, weights are assigned to local regions of the image of other tasks that have greater importance for the specific task. Finally, the two maps are combined and used to solve a task. We tested the performance in two challenging problems with correlated tasks and observed a significant improvement in accuracy, mainly in tasks with high dependence on the others.

Submitted 4 May, 2023; originally announced May 2023.

Comments: Accepted 4th Agriculture-Vision Workshop - CVPRW

arXiv:2304.13009 [pdf, other]

The Potential of Visual ChatGPT For Remote Sensing

Authors: Lucas Prado Osco, Eduardo Lopes de Lemos, Wesley Nunes Gonçalves, Ana Paula Marques Ramos, José Marcato Junior

Abstract: Recent advancements in Natural Language Processing (NLP), particularly in Large Language Models (LLMs), associated with deep learning-based computer vision techniques, have shown substantial potential for automating a variety of tasks. One notable model is Visual ChatGPT, which combines ChatGPT's LLM capabilities with visual computation to enable effective image analysis. The model's ability to process images based on textual inputs can revolutionize diverse fields. However, its application in the remote sensing domain remains unexplored. This is the first paper to examine the potential of Visual ChatGPT, a cutting-edge LLM founded on the GPT architecture, to tackle the aspects of image processing related to the remote sensing domain. Among its current capabilities, Visual ChatGPT can generate textual descriptions of images, perform canny edge and straight line detection, and conduct image segmentation. These offer valuable insights into image content and facilitate the interpretation and extraction of information. By exploring the applicability of these techniques within publicly available datasets of satellite images, we demonstrate the current model's limitations in dealing with remote sensing images, highlighting its challenges and future prospects. Although still in early development, we believe that the combination of LLMs and visual models holds a significant potential to transform remote sensing image processing, creating accessible and practical application opportunities in the field.

Submitted 5 July, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

arXiv:2210.17474 [pdf, ps, other]

A-LAQ: Adaptive Lazily Aggregated Quantized Gradient

Authors: Afsaneh Mahmoudi, José Mairton Barros Da Silva Júnior, Hossein S. Ghadikolaei, Carlo Fischione

Abstract: Federated Learning (FL) plays a prominent role in solving machine learning problems with data distributed across clients. In FL, to reduce the communication overhead of data between clients and the server, each client communicates the local FL parameters instead of the local data. However, when a wireless network connects clients and the server, the communication resource limitations of the clients may prevent completing the training of the FL iterations. Therefore, communication-efficient variants of FL have been widely investigated. Lazily Aggregated Quantized Gradient (LAQ) is one of the promising communication-efficient approaches to lower resource usage in FL. However, LAQ assigns a fixed number of bits for all iterations, which may be communication-inefficient when the number of iterations is medium to high or convergence is approaching. This paper proposes Adaptive Lazily Aggregated Quantized Gradient (A-LAQ), which is a method that significantly extends LAQ by assigning an adaptive number of communication bits during the FL iterations. We train FL in an energy-constraint condition and investigate the convergence analysis for A-LAQ. The experimental results highlight that A-LAQ outperforms LAQ by up to a $50$% reduction in spent communication energy and an $11$% increase in test accuracy.

Submitted 31 October, 2022; originally announced October 2022.

arXiv:2209.04230 [pdf, ps, other]

Survey on Deep Fuzzy Systems in regression applications: a view on interpretability

Authors: Jorge S. S. Júnior, Jérôme Mendes, Francisco Souza, Cristiano Premebida

Abstract: Regression problems have been more and more embraced by deep learning (DL) techniques. The increasing number of papers recently published in this domain, including surveys and reviews, shows that deep regression has captured the attention of the community due to efficiency and good accuracy in systems with high-dimensional data. However, many DL methodologies have complex structures that are not readily transparent to human users. Accessing the interpretability of these models is an essential factor for addressing problems in sensitive areas such as cyber-security systems, medical, financial surveillance, and industrial processes. Fuzzy logic systems (FLS) are inherently interpretable models, well known in the literature, capable of using nonlinear representations for complex systems through linguistic terms with membership degrees mimicking human thought. Within an atmosphere of explainable artificial intelligence, it is necessary to consider a trade-off between accuracy and interpretability for developing intelligent models. This paper aims to investigate the state-of-the-art on existing methodologies that combine DL and FLS, namely deep fuzzy systems, to address regression problems, configuring a topic that is currently not sufficiently explored in the literature and thus deserves a comprehensive survey.

Submitted 9 September, 2022; originally announced September 2022.

arXiv:2206.05600 [pdf, other]

Narratives: the Unforeseen Influencer of Privacy Concerns

Authors: Ze Shi Li, Manish Sihag, Nowshin Nawar Arony, Joao Bezerra Junior, Thanh Phan, Neil Ernst, Daniela Damian

Abstract: Privacy requirements are increasingly growing in importance as new privacy regulations are enacted. To adequately manage privacy requirements, organizations not only need to comply with privacy regulations, but also consider user privacy concerns. In this exploratory study, we used Reddit as a source to understand users' privacy concerns regarding software applications. We collected 4.5 million posts from Reddit and classified 129075 privacy related posts, which is a non-negligible number of privacy discussions. Next, we clustered these posts and identified 9 main areas of privacy concerns. We use the concept of narratives from economics (i.e., posts that can go viral) to explain the phenomenon of what and when users change in their discussion of privacy. We further found that privacy discussions change over time and privacy regulatory events have a short term impact on such discussions. However, narratives have a notable impact on what and when users discussed about privacy. Considering narratives could guide software organizations in eliciting the relevant privacy concerns before developing them as privacy requirements.

Submitted 11 June, 2022; originally announced June 2022.

Comments: 13 pages, to be published in 30th IEEE International Requirements Engineering Conference (RE'22)

arXiv:2206.02300 [pdf, other]

On the horizontal compression of dag-derivations in minimal purely implicational logic

Authors: Edward Hermann Haeusler, José Flávio Cavalcante Barros Junior, Robinson

Abstract: This report defines (plain) Dag-like derivations in the purely implicational fragment of minimal logic $M_{\imply}$. Introduce the horizontal collapsing set of rules and the algorithm {\bf HC}. Explain why {\bf HC} can transform any polynomial height-bounded tree-like proof of a $M_{\imply}$ tautology into a smaller dag-like proof. Sketch a proof that {\bf HC} preserves the soundness of any tree-like ND in $M_{\imply}$ in its dag-like version after the horizontal collapsing application. We show some experimental results about applying the compression method to a class of (huge) propositional proofs and an example, with non-hamiltonian graphs, for qualitative analysis. The contributions include the comprehensive presentation of the set of horizontal compression (HC), the (sketch) of the proof that HC rules preserve soundness and the demonstration that the compressed dag-like proofs are polynomially upper-bounded when the submitted tree-like proof is height and foundation poly-bounded. Finally, in the appendix, we outline an algorithm that verifies in polynomial time on the size of the dag-like proofs whether they are valid proofs of their conclusions.

Submitted 12 April, 2023; v1 submitted 5 June, 2022; originally announced June 2022.

Comments: We updated the set of compression rules and provided a more detailed proof of soundness preservation. We added an example that helps to understand the compression rules and strategy. It serves as the fundamental presentation of the compression and is used by authors of the group to formalize it in ITPs. Robinson Callou was added as author

Journal ref: Part of a previous version of this compression algorithm was published in Studia Logica 107(1), pp.55-83, 2019. Part of the paper was published in pre-proceedings of LSFA2022

arXiv:2204.07773 [pdf, ps, other]

doi 10.1109/TWC.2024.3378351

FedCau: A Proactive Stop Policy for Communication and Computation Efficient Federated Learning

Authors: Afsaneh Mahmoudi, Hossein S. Ghadikolaei, José Mairton Barros Da Silva Júnior, Carlo Fischione

Abstract: This paper investigates efficient distributed training of a Federated Learning (FL) model over a wireless network of wireless devices. The communication iterations of the distributed training algorithm may be substantially deteriorated or even blocked by the effects of the devices' background traffic, packet losses, congestion, or latency. We abstract the communication-computation impacts as an 'iteration cost' and propose a cost-aware causal FL algorithm (FedCau) to tackle this problem. We propose an iteration-termination method that trade-offs the training performance and networking costs. We apply our approach when clients use the slotted-ALOHA, the carrier-sense multiple access with collision avoidance (CSMA/CA), and the orthogonal frequency-division multiple access (OFDMA) protocols. We show that, given a total cost budget, the training performance degrades as either the background communication traffic or the dimension of the training problem increases. Our results demonstrate the importance of proactively designing optimal cost-efficient stopping criteria to avoid unnecessary communication-computation costs to achieve only a marginal FL training improvement. We validate our method by training and testing FL over the MNIST dataset. Finally, we apply our approach to existing communication efficient FL methods from the literature, achieving further efficiency. We conclude that cost-efficient stopping criteria are essential for the success of practical FL over wireless networks.

Submitted 26 March, 2024; v1 submitted 16 April, 2022; originally announced April 2022.

arXiv:2202.02415 [pdf, other]

On the Efficiency and Quality of Protection of Preprovisioning in Elastic Optical Networks

Authors: Paulo José S. Júnior, Lucas R. Costa, André C. Drummond

Abstract: The study of protection techniques, such as pre-provisioning (off-line) and provisioning (on-line), has been explored in several ways in the optical network literature. In the new Elastic Optical Network (EON) paradigm, the pre-provisioning techniques were still little explored. Preprovisioning implies the prior allocation of resources in the network for the transport and protection of future connection demands, while the provisioning implies the allocation of resources when the demand arrives in the network. Applying preprovisioning reduces the downtime experienced by a connection after a failure, which will reduce unavailability and potentially avoid penalties for violation of Service Level Agreements (SLA) established with client networks. This work aims to explore the main protection techniques and evaluate their efficiency in the EON scenario. The performance evaluation shows that the use of preprovisioning techniques is more efficient, significantly reducing the network unavailability and bandwidth usage in EON networks. Our solution has an unavailability 40 times lower than shared solutions being only 4% above the optimum.

Submitted 4 February, 2022; originally announced February 2022.

arXiv:2201.03801 [pdf, other]

Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019

Authors: Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Fabio Ferreira, Isabelle Guyon, Sirui Hong, Frank Hutter, Rongrong Ji, Julio C. S. Jacques Junior, Ge Li, Marius Lindauer, Zhipeng Luo, Meysam Madadi, Thomas Nierhoff, Kangning Niu, Chunguang Pan, Danny Stoll, Sebastien Treguer, ** Wang, Peng Wang, Chenglin Wu, Youcheng Xiong, Arber Zela, Yang Zhang

Abstract: This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".

Submitted 11 January, 2022; originally announced January 2022.

Comments: The first three authors contributed equally; This is only a draft version

Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI) 2021

arXiv:2110.01328 [pdf, other]

Label it be! A large-scale study of issue labeling in modern open-source repositories

Authors: Joselito Júnior, Gláucya Boechat, Ivan Machado

Abstract: In a wave of growth, open-source projects need to modernize and change how they deal with processes, methods, and communication with their contributors. We could observe that open-source projects are constantly evolving to improve their management of the entire community. Starting with community communication, software development, managing open-source projects faces crucial challenges. One of the enabling environments that open-source communities found to achieve community communications objectives was code repositories with integration with issue trackers. Using issue trackers in their projects should encompass an infrastructure capable of hosting the project source code and community participation. Some issue trackers use a structure in which the issue's title and description are the key information. However, we have observed a slight change in this strategy over the years, as more and more data are fundamental to solving the issue. For example, labeling the issues could enable users to provide the issue with more contextual information. By understanding how modern issue trackers handle issue labeling, this study analyzes the impact, engagement, and influence that labels have on the Github repositories, based on a database of 10,673,459 issues mined from 13,280 repositories in 180 Github featured topics. We found that 78.75% of the repositories label their issues, with more adherence from those repositories that contain big numbers of issues. The labeling practice is essential and prioritized as a first step in the issue resolution process in 65.91% of the first events. Issues with labels draw more attention and impact by collecting more subscribers, assigns, and comments, helping to engage contributors to the resolution.

Submitted 4 October, 2021; originally announced October 2021.

Journal ref: XXIV Ibero-American Conference on Software Engineering (CIbSE 2021)

arXiv:2109.09487 [pdf]

Dyadformer: A Multi-modal Transformer for Long-Range Modeling of Dyadic Interactions

Authors: David Curto, Albert Clapés, Javier Selva, Sorina Smeureanu, Julio C. S. Jacques Junior, David Gallardo-Pujol, Georgina Guilera, David Leiva, Thomas B. Moeslund, Sergio Escalera, Cristina Palmero

Abstract: Personality computing has become an emerging topic in computer vision, due to the wide range of applications it can be used for. However, most works on the topic have focused on analyzing the individual, even when applied to interaction scenarios, and for short periods of time. To address these limitations, we present the Dyadformer, a novel multi-modal multi-subject Transformer architecture to model individual and interpersonal features in dyadic interactions using variable time windows, thus allowing the capture of long-term interdependencies. Our proposed cross-subject layer allows the network to explicitly model interactions among subjects through attentional operations. This proof-of-concept approach shows how multi-modality and joint modeling of both interactants for longer periods of time helps to predict individual attributes. With Dyadformer, we improve state-of-the-art self-reported personality inference results on individual subjects on the UDIVA v0.5 dataset.

Submitted 20 September, 2021; originally announced September 2021.

Comments: Accepted to the 2021 ICCV Workshop on Understanding Social Behavior in Dyadic and Small Group Interactions

arXiv:2109.01693 [pdf, other]

Weakly Supervised Few-Shot Segmentation Via Meta-Learning

Authors: Pedro H. T. Gama, Hugo Oliveira, José Marcato Junior, Jefersson A. dos Santos

Abstract: Semantic segmentation is a classic computer vision task with multiple applications, which includes medical and remote sensing image analysis. Despite recent advances with deep-based approaches, labeling samples (pixels) for training models is laborious and, in some cases, unfeasible. In this paper, we present two novel meta learning methods, named WeaSeL and ProtoSeg, for the few-shot semantic segmentation task with sparse annotations. We conducted extensive evaluation of the proposed methods in different applications (12 datasets) in medical imaging and agricultural remote sensing, which are very distinct fields of knowledge and usually subject to data scarcity. The results demonstrated the potential of our method, achieving suitable results for segmenting both coffee/orange crops and anatomical parts of the human body in comparison with full dense annotation.

Submitted 3 September, 2021; originally announced September 2021.

arXiv:2105.05066 [pdf, other]

ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research

Authors: Ozge Mercanoglu Sincan, Julio C. S. Jacques Junior, Sergio Escalera, Hacer Yalim Keles

Abstract: The performances of Sign Language Recognition (SLR) systems have improved considerably in recent years. However, several open challenges still need to be solved to allow SLR to be useful in practice. The research in the field is in its infancy in regards to the robustness of the models to a large diversity of signs and signers, and to fairness of the models to performers from different demographics. This work summarises the ChaLearn LAP Large Scale Signer Independent Isolated SLR Challenge, organised at CVPR 2021 with the goal of overcoming some of the aforementioned challenges. We analyse and discuss the challenge design, top winning solutions and suggestions for future research. The challenge attracted 132 participants in the RGB track and 59 in the RGB+Depth track, receiving more than 1.5K submissions in total. Participants were evaluated using a new large-scale multi-modal Turkish Sign Language (AUTSL) dataset, consisting of 226 sign labels and 36,302 isolated sign video samples performed by 43 different signers. Winning teams achieved more than 96% recognition rate, and their approaches benefited from pose/hand/face estimation, transfer learning, external data, fusion/ensemble of modalities and different strategies to model spatio-temporal information. However, methods still fail to distinguish among very similar signs, in particular those sharing similar hand trajectories.

Submitted 11 May, 2021; originally announced May 2021.

Comments: Preprint of the accepted paper at ChaLearn Looking at People Sign Language Recognition in the Wild Workshop at CVPR 2021

arXiv:2102.04566 [pdf, other]

doi 10.1016/j.jag.2022.102690

Semantic Segmentation with Labeling Uncertainty and Class Imbalance

Authors: Patrik Olã Bressan, José Marcato Junior, José Augusto Correa Martins, Diogo Nunes Gonçalves, Daniel Matte Freitas, Lucas Prado Osco, Jonathan de Andrade Silva, Zhipeng Luo, Jonathan Li, Raymundo Cordero Garcia, Wesley Nunes Gonçalves

Abstract: Recently, methods based on Convolutional Neural Networks (CNN) achieved impressive success in semantic segmentation tasks. However, challenges such as the class imbalance and the uncertainty in the pixel-labeling process are not completely addressed. As such, we present a new approach that calculates a weight for each pixel considering its class and uncertainty during the labeling process. The pixel-wise weights are used during training to increase or decrease the importance of the pixels. Experimental results show that the proposed approach leads to significant improvements in three challenging segmentation tasks in comparison to baseline methods. It was also proved to be more invariant to noise. The approach presented here may be used within a wide range of semantic segmentation methods to improve their robustness.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 15 pages, 9 figures, 3 tables

MSC Class: 68T07; ACM Class: I.2.1

Journal ref: International Journal of Applied Earth Observation and Geoinformation, 2022

arXiv:2102.04366 [pdf, other]

doi 10.1016/j.eswa.2022.116555

Counting and Locating High-Density Objects Using Convolutional Neural Network

Authors: Mauro dos Santos de Arruda, Lucas Prado Osco, Plabiany Rodrigo Acosta, Diogo Nunes Gonçalves, José Marcato Junior, Ana Paula Marques Ramos, Edson Takashi Matsubara, Zhipeng Luo, Jonathan Li, Jonathan de Andrade Silva, Wesley Nunes Gonçalves

Abstract: This paper presents a Convolutional Neural Network (CNN) approach for counting and locating objects in high-density imagery. To the best of our knowledge, this is the first object counting and locating method based on a feature map enhancement and a Multi-Stage Refinement of the confidence map. The proposed method was evaluated in two counting datasets: tree and car. For the tree dataset, our method returned a mean absolute error (MAE) of 2.05, a root-mean-squared error (RMSE) of 2.87 and a coefficient of determination (R$^2$) of 0.986. For the car dataset (CARPK and PUCPR+), our method was superior to state-of-the-art methods. In these datasets, our approach achieved an MAE of 4.45 and 3.16, an RMSE of 6.18 and 4.39, and an R$^2$ of 0.975 and 0.999, respectively. The proposed method is suitable for dealing with high object-density, returning a state-of-the-art performance for counting and locating objects.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 15 pages, 10 figures, 8 tables

MSC Class: 68T07; ACM Class: I.2.1

Journal ref: Expert Systems with Applications, 2022

arXiv:2102.03213 [pdf, other]

A Deep Learning Approach Based on Graphs to Detect Plantation Lines

Authors: Diogo Nunes Gonçalves, Mauro dos Santos de Arruda, Hemerson Pistori, Vanessa Jordão Marcato Fernandes, Ana Paula Marques Ramos, Danielle Elis Garcia Furuya, Lucas Prado Osco, Hongjie He, Jonathan Li, José Marcato Junior, Wesley Nunes Gonçalves

Abstract: Deep learning-based networks are among the most prominent methods to learn linear patterns and extract this type of information from diverse imagery conditions. Here, we propose a deep learning approach based on graphs to detect plantation lines in UAV-based RGB imagery presenting a challenging scenario containing spaced plants. The first module of our method extracts a feature map throughout the backbone, which consists of the initial layers of the VGG16. This feature map is used as an input to the Knowledge Estimation Module (KEM), organized in three concatenated branches for detecting 1) the plant positions, 2) the plantation lines, and 3) for the displacement vectors between the plants. A graph modeling is applied considering each plant position on the image as vertices, and edges are formed between two vertices (i.e. plants). Finally, the edge is classified as pertaining to a certain plantation line based on three probabilities (higher than 0.5): i) in visual features obtained from the backbone; ii) a chance that the edge pixels belong to a line, from the KEM step; and iii) an alignment of the displacement vectors with the edge, also from KEM. Experiments were conducted in corn plantations with different growth stages and patterns with aerial RGB imagery. A total of 564 patches with 256 x 256 pixels were used and randomly divided into training, validation, and testing sets in a proportion of 60%, 20%, and 20%, respectively. The proposed method was compared against state-of-the-art deep learning methods, and achieved superior performance with a significant margin, returning precision, recall, and F1-score of 98.7%, 91.9%, and 95.1%, respectively. This approach is useful in extracting lines with spaced plantation patterns and could be implemented in scenarios where plantation gaps occur, generating lines with few-to-none interruptions.

Submitted 5 February, 2021; originally announced February 2021.

Comments: 19 pages, 11 figures, 4 tables

MSC Class: 68Txx

arXiv:2101.10861 [pdf, other]

doi 10.1016/j.jag.2021.102456

A Review on Deep Learning in UAV Remote Sensing

Authors: Lucas Prado Osco, José Marcato Junior, Ana Paula Marques Ramos, Lúcio André de Castro Jorge, Sarah Narges Fatholahi, Jonathan de Andrade Silva, Edson Takashi Matsubara, Hemerson Pistori, Wesley Nunes Gonçalves, Jonathan Li

Abstract: Deep Neural Networks (DNNs) learn representation from data with an impressive capability, and brought important breakthroughs for processing images, time-series, natural language, audio, video, and many others. In the remote sensing field, surveys and literature revisions specifically involving DNNs algorithms' applications have been conducted in an attempt to summarize the amount of information produced in its subfields. Recently, Unmanned Aerial Vehicles (UAV) based applications have dominated aerial sensing research. However, a literature revision that combines both "deep learning" and "UAV remote sensing" thematics has not yet been conducted. The motivation for our work was to present a comprehensive review of the fundamentals of Deep Learning (DL) applied in UAV-based imagery. We focused mainly on describing classification and regression techniques used in recent applications with UAV-acquired data. For that, a total of 232 papers published in international scientific journal databases was examined. We gathered the published material and evaluated their characteristics regarding application, sensor, and technique used. We relate how DL presents promising results and has the potential for processing tasks associated with UAV-based image data. Lastly, we project future perspectives, commentating on prominent DL paths to be explored in the UAV remote sensing field. Our revision consists of a friendly-approach to introduce, commentate, and summarize the state-of-the-art in UAV-based image applications with DNNs algorithms in diverse subfields of remote sensing, grouping it in the environmental, urban, and agricultural contexts.

Submitted 20 August, 2023; v1 submitted 22 January, 2021; originally announced January 2021.

Comments: 27 pages, 10 figures

Journal ref: International Journal of Applied Earth Observation and Geoinformation, 2022

arXiv:2012.15827 [pdf, other]

doi 10.1016/j.isprsjprs.2021.01.024

A CNN Approach to Simultaneously Count Plants and Detect Plantation-Rows from UAV Imagery

Authors: Lucas Prado Osco, Mauro dos Santos de Arruda, Diogo Nunes Gonçalves, Alexandre Dias, Juliana Batistoti, Mauricio de Souza, Felipe David Georges Gomes, Ana Paula Marques Ramos, Lúcio André de Castro Jorge, Veraldo Liesenberg, Jonathan Li, Lingfei Ma, José Marcato Junior, Wesley Nunes Gonçalves

Abstract: In this paper, we propose a novel deep learning method based on a Convolutional Neural Network (CNN) that simultaneously detects and geolocates plantation-rows while counting its plants considering highly-dense plantation configurations. The experimental setup was evaluated in a cornfield with different growth stages and in a Citrus orchard. Both datasets characterize different plant density scenarios, locations, types of crops, sensors, and dates. A two-branch architecture was implemented in our CNN method, where the information obtained within the plantation-row is updated into the plant detection branch and retro-feed to the row branch; which are then refined by a Multi-Stage Refinement method. In the corn plantation datasets (with both growth phases, young and mature), our approach returned a mean absolute error (MAE) of 6.224 plants per image patch, a mean relative error (MRE) of 0.1038, precision and recall values of 0.856, and 0.905, respectively, and an F-measure equal to 0.876. These results were superior to the results from other deep networks (HRNet, Faster R-CNN, and RetinaNet) evaluated with the same task and dataset. For the plantation-row detection, our approach returned precision, recall, and F-measure scores of 0.913, 0.941, and 0.925, respectively. To test the robustness of our model with a different type of agriculture, we performed the same task in the citrus orchard dataset. It returned an MAE equal to 1.409 citrus-trees per patch, MRE of 0.0615, precision of 0.922, recall of 0.911, and F-measure of 0.965. For citrus plantation-row detection, our approach resulted in precision, recall, and F-measure scores equal to 0.965, 0.970, and 0.964, respectively. The proposed method achieved state-of-the-art performance for counting and geolocating plants and plant-rows in UAV images from different types of crops.

Submitted 14 February, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: 27 pages, 12 figures, 9 tables

ACM Class: J.2

Journal ref: ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING 174 (2021) 1-17

arXiv:2012.14259 [pdf, other]

Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset

Authors: Cristina Palmero, Javier Selva, Sorina Smeureanu, Julio C. S. Jacques Junior, Albert Clapés, Alexa Moseguí, Zejian Zhang, David Gallardo, Georgina Guilera, David Leiva, Sergio Escalera

Abstract: This paper introduces UDIVA, a new non-acted dataset of face-to-face dyadic interactions, where interlocutors perform competitive and collaborative tasks with different behavior elicitation and cognitive workload. The dataset consists of 90.5 hours of dyadic interactions among 147 participants distributed in 188 sessions, recorded using multiple audiovisual and physiological sensors. Currently, it includes sociodemographic, self- and peer-reported personality, internal state, and relationship profiling from participants. As an initial analysis on UDIVA, we propose a transformer-based method for self-reported personality inference in dyadic scenarios, which uses audiovisual data and different sources of context from both interlocutors to regress a target person's personality traits. Preliminary results from an incremental study show consistent improvements when using all available context information.

Submitted 28 December, 2020; originally announced December 2020.

Comments: Accepted to the 11th International Workshop on Human Behavior Understanding at the Winter Conference on Applications of Computer Vision 2021

arXiv:2011.14906 [pdf, other]

Person Perception Biases Exposed: Revisiting the First Impressions Dataset

Authors: Julio C. S. Jacques Junior, Agata Lapedriza, Cristina Palmero, Xavier Baró, Sergio Escalera

Abstract: This work revisits the ChaLearn First Impressions database, annotated for personality perception using pairwise comparisons via crowdsourcing. We analyse for the first time the original pairwise annotations, and reveal existing person perception biases associated with perceived attributes like gender, ethnicity, age and face attractiveness. We show how person perception bias can influence data labelling of a subjective task, which has received little attention from the computer vision and machine learning communities so far. We further show that the mechanism used to convert pairwise annotations to continuous values may magnify the biases if no special treatment is considered. The findings of this study are relevant for the computer vision community, which is still creating new datasets on subjective tasks and using them for practical applications while ignoring these perceptual biases.

Submitted 30 November, 2020; originally announced November 2020.

Comments: accepted on 11th International Workshop on Human Behavior Understanding (HBU), organized as part of WACV 2021

arXiv:2009.07838 [pdf, other]

FairFace Challenge at ECCV 2020: Analyzing Bias in Face Recognition

Authors: Tomáš Sixta, Julio C. S. Jacques Junior, Pau Buch-Cardona, Neil M. Robertson, Eduard Vazquez, Sergio Escalera

Abstract: This work summarizes the 2020 ChaLearn Looking at People Fair Face Recognition and Analysis Challenge and provides a description of the top-winning solutions and analysis of the results. The aim of the challenge was to evaluate accuracy and bias in gender and skin colour of submitted algorithms on the task of 1:1 face verification in the presence of other confounding attributes. Participants were evaluated using an in-the-wild dataset based on reannotated IJB-C, further enriched by 12.5K new images and additional labels. The dataset is not balanced, which simulates a real world scenario where AI-based models supposed to present fair outcomes are trained and evaluated on imbalanced data. The challenge attracted 151 participants, who made more than 1.8K submissions in total. The final phase of the challenge attracted 36 active teams out of which 10 exceeded 0.999 AUC-ROC while achieving very low scores in the proposed bias metrics. Common strategies by the participants were face pre-processing, homogenization of data distributions, the use of bias aware loss functions and ensemble models. The analysis of top-10 teams shows higher false positive rates (and lower false negative rates) for females with dark skin tone as well as the potential of eyeglasses and young age to increase the false positive rates too.

Submitted 2 December, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

Comments: accepted on ECCV'2020 Fair Face Recognition and Analysis Workshop

arXiv:2008.12624 [pdf, other]

A Framework for Studying Reinforcement Learning and Sim-to-Real in Robot Soccer

Authors: Hansenclever F. Bassani, Renie A. Delgado, José Nilton de O. Lima Junior, Heitor R. Medeiros, Pedro H. M. Braga, Mateus G. Machado, Lucas H. C. Santos, Alain Tapp

Abstract: This article introduces an open framework, called VSSS-RL, for studying Reinforcement Learning (RL) and sim-to-real in robot soccer, focusing on the IEEE Very Small Size Soccer (VSSS) league. We propose a simulated environment in which continuous or discrete control policies can be trained to control the complete behavior of soccer agents and a sim-to-real method based on domain adaptation to adapt the obtained policies to real robots. Our results show that the trained policies learned a broad repertoire of behaviors that are difficult to implement with handcrafted control policies. With VSSS-RL, we were able to beat human-designed policies in the 2019 Latin American Robotics Competition (LARC), achieving 4th place out of 21 teams, being the first to apply Reinforcement Learning (RL) successfully in this competition. Both environment and hardware specifications are available open-source to allow reproducibility of our results and further studies.

Submitted 18 August, 2020; originally announced August 2020.

arXiv:2007.05643 [pdf, other]

Learning Local Complex Features using Randomized Neural Networks for Texture Analysis

Authors: Lucas C. Ribas, Leonardo F. S. Scabini, Jarbas Joaci de Mesquita Sá Junior, Odemir M. Bruno

Abstract: Texture is a visual attribute largely used in many problems of image analysis. Currently, many methods that use learning techniques have been proposed for texture discrimination, achieving improved performance over previous handcrafted methods. In this paper, we present a new approach that combines a learning technique and the Complex Network (CN) theory for texture analysis. This method takes advantage of the representation capacity of CN to model a texture image as a directed network and uses the topological information of vertices to train a randomized neural network. This neural network has a single hidden layer and uses a fast learning algorithm, which is able to learn local CN patterns for texture characterization. Thus, we use the weights of the trained neural network to compose a feature vector. These feature vectors are evaluated in a classification experiment in four widely used image databases. Experimental results show a high classification performance of the proposed method when compared to other methods, indicating that our approach can be used in many image analysis problems.

Submitted 17 August, 2020; v1 submitted 10 July, 2020; originally announced July 2020.

arXiv:2004.09397 [pdf, other]

Multi-label Stream Classification with Self-Organizing Maps

Authors: Ricardo Cerri, Joel David Costa Júnior, Elaine Ribeiro de Faria Paiva, João Manuel Portela da Gama

Abstract: Several learning algorithms have been proposed for offline multi-label classification. However, applications in areas such as traffic monitoring, social networks, and sensors produce data continuously, the so-called data streams, posing challenges to batch multi-label learning. With the lack of stationarity in the distribution of data streams, new algorithms are needed to adapt online to such changes (concept drift). Also, in realistic applications, changes occur in scenarios of infinitely delayed labels, where the true classes of the arriving instances are never available. We propose an online unsupervised incremental method based on self-organizing maps for multi-label stream classification with infinitely delayed labels. In the classification phase, we use a k-nearest neighbors strategy to compute the winning neurons in the maps, adapting to concept drift by adjusting neuron weight vectors and dataset label cardinality online. We predict labels for each instance using the Bayes rule and the outputs of each neuron, adapting the probabilities and conditional probabilities of the classes in the stream. Experiments using synthetic and real datasets show that our method is highly competitive with several ones from the literature, in both stationary and concept drift scenarios.

Submitted 20 April, 2020; originally announced April 2020.

Comments: 7 pages, 14 figures

ACM Class: I.2.6

arXiv:2003.12463 [pdf, other]

doi 10.3390/s22207851

Reconfigurable Computing Applied to Latency Reduction for the Tactile Internet

Authors: José C. V. S. Junior, Matheus F. Torquato, Toktam Mahmoodi, Mischa Dohler, Marcelo A. C. Fernandes

Abstract: Tactile internet applications allow robotic devices to be remotely controlled over a communication medium with an unnoticeable time delay. In a bilateral communication, the acceptable round trip latency is usually in the order of 1ms up to 10ms depending on the application requirements. It is estimated that 70% of the total latency is generated by the communication network, and the remaining 30% is produced by master and slave devices. Thus, this paper aims to propose a strategy to reduce the 30% of the total latency that is produced by such devices. The strategy is to apply reconfigurable computation using FPGAs to minimize the execution time of device-associated algorithms. With this in mind, this work presents a hardware reference model for modules that implement nonlinear positioning and force calculations as well as a tactile system formed by two robotic manipulators. In addition to presenting the implementation details, simulations and experimental tests are performed in order to validate the proposed model. Results associated with the FPGA sampling rate, throughput, latency, and post-synthesis occupancy area are analyzed.

Submitted 11 March, 2020; originally announced March 2020.

Comments: 20 pages, 32 Figures

arXiv:2003.11102 [pdf, other]

Learning to Play Soccer by Reinforcement and Applying Sim-to-Real to Compete in the Real World

Authors: Hansenclever F. Bassani, Renie A. Delgado, Jose Nilton de O. Lima Junior, Heitor R. Medeiros, Pedro H. M. Braga, Alain Tapp

Abstract: This work presents an application of Reinforcement Learning (RL) for the complete control of real soccer robots of the IEEE Very Small Size Soccer (VSSS), a traditional league in the Latin American Robotics Competition (LARC). In the VSSS league, two teams of three small robots play against each other. We propose a simulated environment in which continuous or discrete control policies can be trained, and a Sim-to-Real method to allow using the obtained policies to control a robot in the real world. The results show that the learned policies display a broad repertoire of behaviors that are difficult to specify by hand. This approach, called VSSS-RL, was able to beat the human-designed policy for the striker of the team ranked 3rd place in the 2018 LARC, in 1-vs-1 matches.

Submitted 24 March, 2020; originally announced March 2020.

Journal ref: LatinX in AI Research Workshop at NeurIPS 2019

arXiv:1909.05568 [pdf, other]

On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals

Authors: Ricardo Darío Pérez Principi, Cristina Palmero, Julio C. S. Jacques Junior, Sergio Escalera

Abstract: Personality perception is implicitly biased due to many subjective factors, such as cultural, social, contextual, gender and appearance. Approaches developed for automatic personality perception are not expected to predict the real personality of the target, but the personality external observers attributed to it. Hence, they have to deal with human bias, inherently transferred to the training data. However, bias analysis in personality computing is an almost unexplored area. In this work, we study different possible sources of bias affecting personality perception, including emotions from facial expressions, attractiveness, age, gender, and ethnicity, as well as their influence on prediction ability for apparent personality estimation. To this end, we propose a multi-modal deep neural network that combines raw audio and visual information alongside predictions of attribute-specific models to regress apparent personality. We also analyse spatio-temporal aggregation schemes and the effect of different time intervals on first impressions. We base our study on the ChaLearn First Impressions dataset, consisting of one-person conversational videos. Our model shows state-of-the-art results regressing apparent personality based on the Big-Five model. Furthermore, given the interpretable nature of our network design, we provide an incremental analysis on the impact of each possible source of bias on final network predictions.

Submitted 28 November, 2019; v1 submitted 12 September, 2019; originally announced September 2019.

Comments: Accepted in IEEE Transactions on Affective Computing (TAC)

arXiv:1908.08651 [pdf]

Trajectory-Based Urban Air Mobility (UAM) Operations Simulator (TUS)

Authors: Euclides C. Pinto Neto, Derick M. Baum, Jorge Rady de Almeida Junior, João Batista Camargo Junior, Paulo Sérgio Cugnasca

Abstract: Nowadays, the demand for optimized services in urban environments to provide better society wellness is increasing. In this sense, ground transportation in dense urban environments has been facing challenges for many years (e.g., congestion and resilience). One important outcome of the effort made toward the creation of new concepts for enhancing urban transportation is the Urban Air Mobility (UAM) concept. UAM aims at enhancing city transportation services using manned and unmanned vehicles. However, these operations bring many challenges to be faced, e.g., the interaction between the controller agent and autonomous vehicles. Furthermore, trajectory planning is not a simple task due to several factors. Firstly, the trajectories must consider a reduced minimum separation as eVTOL vehicles are expected to operate in complex urban environments. This leads the trajectory planning process to observe safety primitives more restrictively, since the airspace is expected to accommodate many vehicles that follow small minimum separation standards. Thereupon, the main goal of the Trajectory-Based UAM Operations Simulator (TUS) is to simulate the Trajectory-Based UAM operations in urban environments considering the presence of both manned and unmanned eVTOL vehicles. For this, a Discrete Event Simulation (DES) approach is adopted, which considers an input (i.e., the eVTOL vehicles, their origin and destination, and their respective trajectories) and produces an output (which describes if the trajectories are safe and the elapsed operation time).
The main contribution of this simulation tool is to provide a simulated environment for testing and measuring the effectiveness (e.g., flight duration) of trajectories planned for eVTOL vehicles.

Submitted 22 August, 2019; originally announced August 2019.

arXiv:1902.07653 [pdf, other]

On the effect of age perception biases for real age regression

Authors: Julio C. S. Jacques Junior, Cagri Ozcinar, Marina Marjanovic, Xavier Baró, Gholamreza Anbarjafari, Sergio Escalera

Abstract: Automatic age estimation from facial images represents an important task in computer vision. This paper analyses the effect of gender, age, ethnic, makeup and expression attributes of faces as sources of bias to improve deep apparent age prediction. Following recent works where it is shown that apparent age labels benefit real age estimation, rather than direct real to real age regression, our main contribution is the integration, in an end-to-end architecture, of face attributes for apparent age prediction with an additional loss for real age regression. Experimental results on the APPA-REAL dataset indicate the proposed network successfully takes advantage of the adopted attributes to improve both apparent and real age estimation. Our model outperformed a state-of-the-art architecture proposed to separately address apparent and real age regression. Finally, we present preliminary results and discussion of a proof of concept application using the proposed model to regress the apparent age of an individual based on the gender of an external observer.

Submitted 20 February, 2019; originally announced February 2019.

Comments: Accepted in the 14th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2019)

arXiv:1811.11121 [pdf]

doi 10.5748/9788599693148-15CONTECSI/PS-5903

Um Sistema de Aquisição e Análise de Dados para Extração de Conhecimento da Plataforma Ebit

Authors: Marcelo Augusto Muniz Teixeira, Fábio Manoel França Lobato, Beatriz Nery Rodrigues Chagas, Antonio Fernando Lavareda Jacob Junior

Abstract: The development of the internet and the consequent change in forms of communication have strengthened online social networks, increasing people's involvement with this medium and making consumers of products and services more informed and more demanding of companies. This context has given rise to Social CRM, which can be put into practice by means of electronic word-of-mouth platforms that enable the web sharing of comments and evaluations about companies, defining their reputation. However, most electronic word-of-mouth platforms do not provide a means of extracting their information, making it difficult to analyze the data. To fill this gap, a system was developed to capture and automatically summarize the data of the companies registered on the eBit platform.

Submitted 27 November, 2018; originally announced November 2018.

Comments: in Portuguese, Paper presented at the 15th International Conference On Information Systems & Technology Management

arXiv:1811.10605 [pdf]

doi 10.5748/9788599693148-15CONTECSI/PS-5891

GRSUS: Gerenciamento De Recursos De Saúde, Um Estudo Sob A Ótica Da Portaria GM/MS 1631/2015 No Estado do Pará

Authors: Paulo Sérgio Viegas Bernardino da Silva, Lucas Vinícius Araújo Caldas, Antônio Fernando Lavareda Jacob Junior, Fábio Manoel França Lobato

Abstract: Investments in public health increased by about R$ 20 bi in recent years. Even with the dynamism of the Unified Health System (SUS), the criteria and parameters for the planning and programming of health services were updated only after 13 years. The calculations for health resources division are complex due to the nature of the SUS administrative organization, which has three administrative levels. Although the criteria and parameters for the calculations were provided, no information system was provided that would automate this process and supply reliable information for decision making. In order to fill this gap, this paper presents a system for health resource management from the perspective of the GM/MS 1631/2015 ordinance. The tool has been validated using two municipalities in the interior of the state of Pará as case studies. The results were promising, with latent market potential, making it possible to simulate various scenarios for medium- and long-term predictions.

Submitted 26 November, 2018; originally announced November 2018.

Comments: Paper presented at the 15th International Conference On Information Systems & Technology Management - Contecsi - 2018

arXiv:1806.09170 [pdf, other]

Fusion of complex networks and randomized neural networks for texture analysis

Authors: Lucas C. Ribas, Jarbas J. M. Sa Junior, Leonardo F. S. Scabini, Odemir M. Bruno

Abstract: This paper presents a highly discriminative texture analysis method based on the fusion of complex networks and randomized neural networks. In this approach, the input image is modeled as a complex network, and its topological properties as well as the image pixels are used to train randomized neural networks in order to create a signature that represents the deep characteristics of the texture. The results obtained surpassed the accuracies of many methods available in the literature. This performance demonstrates that our proposed approach opens a promising source of research, which consists of exploring the synergy of neural networks and complex networks in the texture analysis field.

Submitted 17 August, 2020; v1 submitted 24 June, 2018; originally announced June 2018.

Comments: 13 pages, 4 figures

arXiv:1804.08046 [pdf, other]

First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis

Authors: Julio C. S. Jacques Junior, Yağmur Güçlütürk, Marc Pérez, Umut Güçlü, Carlos Andujar, Xavier Baró, Hugo Jair Escalante, Isabelle Guyon, Marcel A. J. van Gerven, Rob van Lier, Sergio Escalera

Abstract: Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have been by far the most considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of method could have in society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects on the subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push the research on the field are reviewed.

Submitted 17 July, 2019; v1 submitted 21 April, 2018; originally announced April 2018.

Comments: Accepted on IEEE Transactions on Affective Computing (TAC)

arXiv:1804.04419 [pdf, other]

Exploiting feature representations through similarity learning, post-ranking and ranking aggregation for person re-identification

Authors: Julio C. S. Jacques Junior, Xavier Baró, Sergio Escalera

Abstract: Person re-identification has received special attention by the human analysis community in the last few years. To address the challenges in this field, many researchers have proposed different strategies, which basically exploit either cross-view invariant features or cross-view robust metrics. In this work, we propose to exploit a post-ranking approach and combine different feature representations through ranking aggregation. Spatial information, which potentially benefits the person matching, is represented using a 2D body model, from which color and texture information are extracted and combined. We also consider background/foreground information, automatically extracted via Deep Decompositional Network, and the usage of Convolutional Neural Network (CNN) features. To describe the matching between images we use the polynomial feature map, also taking into account local and global information. The Discriminant Context Information Analysis based post-ranking approach is used to improve initial ranking lists. Finally, the Stuart ranking aggregation method is employed to combine complementary ranking lists obtained from different feature representations. Experimental results demonstrated that we improve the state-of-the-art on VIPeR and PRID450s datasets, achieving 67.21% and 75.64% on top-1 rank recognition rate, respectively, as well as obtaining competitive results on CUHK01 dataset.

Submitted 12 April, 2018; originally announced April 2018.

Comments: Preprint submitted to Image and Vision Computing

arXiv:1802.00745 [pdf, other]

Explaining First Impressions: Modeling, Recognizing, and Explaining Apparent Personality from Videos

Authors: Hugo Jair Escalante, Heysem Kaya, Albert Ali Salah, Sergio Escalera, Yagmur Gucluturk, Umut Guclu, Xavier Baro, Isabelle Guyon, Julio Jacques Junior, Meysam Madadi, Stephane Ayache, Evelyne Viegas, Furkan Gurpinar, Achmadnoer Sukma Wicaksana, Cynthia C. S. Liem, Marcel A. J. van Gerven, Rob van Lier

Abstract: Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of computer vision with an emphasis on looking at people tasks. Specifically, we review and study those mechanisms in the context of first impressions analysis. To the best of our knowledge, this is the first effort in this direction. Additionally, we describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, the evaluation protocol, and summarize the results of the challenge. Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field.

Submitted 28 September, 2019; v1 submitted 2 February, 2018; originally announced February 2018.

Comments: Preprint submitted to TAC

Showing 1–47 of 47 results for author: Júnior, J