Search | arXiv e-print repository

MLinter: Learning Coding Practices from Examples-Dream or Reality?

Authors: Corentin Latappy, Quentin Perez, Thomas Degueule, Jean-Rémy Falleri, Christelle Urtado, Sylvain Vauttier, Xavier Blanc, Cédric Teyton

Abstract: Coding practices are increasingly used by software companies. Their use promotes consistency, readability, and maintainability, which contribute to software quality. Coding practices were initially enforced by general-purpose linters, but companies now tend to design and adopt their own company-specific practices. However, these company-specific practices are often not automated, making it challen… ▽ More Coding practices are increasingly used by software companies. Their use promotes consistency, readability, and maintainability, which contribute to software quality. Coding practices were initially enforced by general-purpose linters, but companies now tend to design and adopt their own company-specific practices. However, these company-specific practices are often not automated, making it challenging to ensure they are shared and used by developers. Converting these practices into linter rules is a complex task that requires extensive static analysis and language engineering expertise. In this paper, we seek to answer the following question: can coding practices be learned automatically from examples manually tagged by developers? We conduct a feasibility study using CodeBERT, a state-of-the-art machine learning approach, to learn linter rules. Our results show that, although the resulting classifiers reach high precision and recall scores when evaluated on balanced synthetic datasets, their application on real-world, unbalanced codebases, while maintaining excellent recall, suffers from a severe drop in precision that hinders their usability. △ Less

Submitted 24 January, 2023; originally announced January 2023.

Journal ref: 30th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Mar 2023, Macao SAR, Macau SAR China

arXiv:2212.03169 [pdf, other]

When Brain-Computer Interfaces Meet the Metaverse: Landscape, Demonstrator, Trends, Challenges, and Concerns

Authors: Sergio López Bernal, Mario Quiles Pérez, Enrique Tomás Martínez Beltrán, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: The metaverse has gained tremendous popularity in recent years, allowing the interconnection of users worldwide. However, current systems in metaverse scenarios, such as virtual reality glasses, offer a partial immersive experience. In this context, Brain-Computer Interfaces (BCIs) can introduce a revolution in the metaverse, although a study of the applicability and implications of BCIs in these… ▽ More The metaverse has gained tremendous popularity in recent years, allowing the interconnection of users worldwide. However, current systems in metaverse scenarios, such as virtual reality glasses, offer a partial immersive experience. In this context, Brain-Computer Interfaces (BCIs) can introduce a revolution in the metaverse, although a study of the applicability and implications of BCIs in these virtual scenarios is required. Based on the absence of literature, this work reviews, for the first time, the applicability of BCIs in the metaverse, analyzing the current status of this integration based on different categories related to virtual worlds and the evolution of BCIs in these scenarios in the medium and long term. This work also proposes the design and implementation of a general framework that integrates BCIs with different data sources from sensors and actuators (e.g., VR glasses) based on a modular design to be easily extended. This manuscript also validates the framework in a demonstrator consisting of driving a car within a metaverse, using a BCI for neural data acquisition, a VR headset to provide realism, and a steering wheel and pedals. Four use cases (UCs) are selected, focusing on cognitive and emotional assessment of the driver, detection of drowsiness, and driver authentication while using the vehicle. Moreover, this manuscript offers an analysis of BCI trends in the metaverse, also identifying future challenges that the intersection of these technologies will face. Finally, it reviews the concerns that using BCIs in virtual world applications could generate according to different categories: accessibility, user inclusion, privacy, cybersecurity, physical safety, and ethics. △ Less

Submitted 16 November, 2023; v1 submitted 6 December, 2022; originally announced December 2022.

arXiv:2211.08413 [pdf, other]

doi 10.1109/COMST.2023.3315746

Decentralized Federated Learning: Fundamentals, State of the Art, Frameworks, Trends, and Challenges

Authors: Enrique Tomás Martínez Beltrán, Mario Quiles Pérez, Pedro Miguel Sánchez Sánchez, Sergio López Bernal, Gérôme Bovet, Manuel Gil Pérez, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustwo… ▽ More In recent years, Federated Learning (FL) has gained relevance in training collaborative models without sharing sensitive data. Since its birth, Centralized FL (CFL) has been the most common approach in the literature, where a central entity creates a global model. However, a centralized approach leads to increased latency due to bottlenecks, heightened vulnerability to system failures, and trustworthiness concerns affecting the entity responsible for the global model creation. Decentralized Federated Learning (DFL) emerged to address these concerns by promoting decentralized model aggregation and minimizing reliance on centralized architectures. However, despite the work done in DFL, the literature has not (i) studied the main aspects differentiating DFL and CFL; (ii) analyzed DFL frameworks to create and evaluate new solutions; and (iii) reviewed application scenarios using DFL. Thus, this article identifies and analyzes the main fundamentals of DFL in terms of federation architectures, topologies, communication mechanisms, security approaches, and key performance indicators. Additionally, the paper at hand explores existing mechanisms to optimize critical DFL fundamentals. Then, the most relevant features of the current DFL frameworks are reviewed and compared. After that, it analyzes the most used DFL application scenarios, identifying solutions based on the fundamentals and frameworks previously defined. Finally, the evolution of existing DFL solutions is studied to provide a list of trends, lessons learned, and open challenges. △ Less

Submitted 13 September, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

arXiv:2209.04048 [pdf, other]

Studying Drowsiness Detection Performance while Driving through Scalable Machine Learning Models using Electroencephalography

Authors: José Manuel Hidalgo Rogel, Enrique Tomás Martínez Beltrán, Mario Quiles Pérez, Sergio López Bernal, Gregorio Martínez Pérez, Alberto Huertas Celdrán

Abstract: - Background / Introduction: Driver drowsiness is a significant concern and one of the leading causes of traffic accidents. Advances in cognitive neuroscience and computer science have enabled the detection of drivers' drowsiness using Brain-Computer Interfaces (BCIs) and Machine Learning (ML). However, the literature lacks a comprehensive evaluation of drowsiness detection performance using a het… ▽ More - Background / Introduction: Driver drowsiness is a significant concern and one of the leading causes of traffic accidents. Advances in cognitive neuroscience and computer science have enabled the detection of drivers' drowsiness using Brain-Computer Interfaces (BCIs) and Machine Learning (ML). However, the literature lacks a comprehensive evaluation of drowsiness detection performance using a heterogeneous set of ML algorithms, and it is necessary to study the performance of scalable ML models suitable for groups of subjects. - Methods: To address these limitations, this work presents an intelligent framework employing BCIs and features based on electroencephalography for detecting drowsiness in driving scenarios. The SEED-VIG dataset is used to evaluate the best-performing models for individual subjects and groups. - Results: Results show that Random Forest (RF) outperformed other models used in the literature, such as Support Vector Machine (SVM), with a 78% f1-score for individual models. Regarding scalable models, RF reached a 79% f1-score, demonstrating the effectiveness of these approaches. This publication highlights the relevance of exploring a diverse set of ML algorithms and scalable approaches suitable for groups of subjects to improve drowsiness detection systems and ultimately reduce the number of accidents caused by driver fatigue. - Conclusions: The lessons learned from this study show that not only SVM but also other models not sufficiently explored in the literature are relevant for drowsiness detection. Additionally, scalable approaches are effective in detecting drowsiness, even when new subjects are evaluated. Thus, the proposed framework presents a novel approach for detecting drowsiness in driving scenarios using BCIs and ML. △ Less

Submitted 30 October, 2023; v1 submitted 8 September, 2022; originally announced September 2022.

arXiv:2209.00993 [pdf, other]

Data Fusion in Neuromarketing: Multimodal Analysis of Biosignals, Lifecycle Stages, Current Advances, Datasets, Trends, and Challenges

Authors: Mario Quiles Pérez, Enrique Tomás Martínez Beltrán, Sergio López Bernal, Eduardo Horna Prat, Luis Montesano Del Campo, Lorenzo Fernández Maimó, Alberto Huertas Celdrán

Abstract: The primary goal of any company is to increase its profits by improving both the quality of its products and how they are advertised. In this context, neuromarketing seeks to enhance the promotion of products and generate a greater acceptance on potential buyers. Traditionally, neuromarketing studies have relied on a single biosignal to obtain feedback from presented stimuli. However, thanks to ne… ▽ More The primary goal of any company is to increase its profits by improving both the quality of its products and how they are advertised. In this context, neuromarketing seeks to enhance the promotion of products and generate a greater acceptance on potential buyers. Traditionally, neuromarketing studies have relied on a single biosignal to obtain feedback from presented stimuli. However, thanks to new devices and technological advances studying this area of knowledge, recent trends indicate a shift towards the fusion of diverse biosignals. An example is the usage of electroencephalography for understanding the impact of an advertisement at the neural level and visual tracking to identify the stimuli that induce such impacts. This emerging pattern determines which biosignals to employ for achieving specific neuromarketing objectives. Furthermore, the fusion of data from multiple sources demands advanced processing methodologies. Despite these complexities, there is a lack of literature that adequately collates and organizes the various data sources and the applied processing techniques for the research objectives pursued. To address these challenges, the current paper conducts a comprehensive analysis of the objectives, biosignals, and data processing techniques employed in neuromarketing research. This study provides both the technical definition and a graphical distribution of the elements under revision. Additionally, it presents a categorization based on research objectives and provides an overview of the combinatory methodologies employed. After this, the paper examines primary public datasets designed for neuromarketing research together with others whose main purpose is not neuromarketing, but can be used for this matter. Ultimately, this work provides a historical perspective on the evolution of techniques across various phases over recent years and enumerates key lessons learned. △ Less

Submitted 21 August, 2023; v1 submitted 30 August, 2022; originally announced September 2022.

Comments: 26 pages, 15 figures

arXiv:2103.12218 [pdf, other]

Bug or not bug? That is the question

Authors: Quentin Perez, Pierre-Antoine Jean, Christelle Urtado, Sylvain Vauttier

Abstract: Nowadays, development teams often rely on tools such as Jira or Bugzilla to manage backlogs of issues to be solved to develop or maintain software. Although they relate to many different concerns (e.g., bug fixing, new feature development, architecture refactoring), few means are proposed to identify and classify these different kinds of issues, except for non mandatory labels that can be manually… ▽ More Nowadays, development teams often rely on tools such as Jira or Bugzilla to manage backlogs of issues to be solved to develop or maintain software. Although they relate to many different concerns (e.g., bug fixing, new feature development, architecture refactoring), few means are proposed to identify and classify these different kinds of issues, except for non mandatory labels that can be manually associated to them. This may lead to a lack of issue classification or to issue misclassification that may impact automatic issue management (planning, assignment) or issue-derived metrics. Automatic issue classification thus is a relevant topic for assisting backlog management. This paper proposes a binary classification solution for discriminating bug from non bug issues. This solution combines natural language processing (TF-IDF) and classification (multi-layer perceptron) techniques, selected after comparing commonly used solutions to classify issues. Moreover, hyper-parameters of the neural network are optimized using a genetic algorithm. The obtained results, as compared to existing works on a commonly used benchmark, show significant improvements on the F1 measure for all datasets. △ Less

Submitted 22 March, 2021; originally announced March 2021.

Comments: Accepted as Research track at ICPC 2021

arXiv:1405.4709 [pdf]

YouTube QoE Evaluation Tool for Android Wireless Terminals

Authors: Gerardo Gomez, Lorenzo Hortiguela, Quiliano Perez, Javier Lorca, Raquel Garcia, Mari Carmen Aguayo-Torres

Abstract: In this paper, we present an Android application which is able to evaluate and analyze the perceived Quality of Experience (QoE) for YouTube service in wireless terminals. To achieve this goal, the application carries out measurements of objective Quality of Service (QoS) parameters, which are then mapped onto subjective QoE (in terms of Mean Opinion Score, MOS) by means of a utility function. Our… ▽ More In this paper, we present an Android application which is able to evaluate and analyze the perceived Quality of Experience (QoE) for YouTube service in wireless terminals. To achieve this goal, the application carries out measurements of objective Quality of Service (QoS) parameters, which are then mapped onto subjective QoE (in terms of Mean Opinion Score, MOS) by means of a utility function. Our application also informs the user about potential causes that lead to a low MOS as well as provides some hints to improve it. After each YouTube session, the users may optionally qualify the session through an online opinion survey. This information has been used in a pilot experience to correlate the theoretical QoE model with real user feedback. Results from such an experience have shown that the theoretical model (taken from the literature) provides slightly more pessimistic results compared to user feedback. Users seem to be more indulgent with wireless connections, increasing the MOS from the opinion survey in about 20% compared to the theoretical model, which was obtained from wired scenarios. △ Less

Submitted 19 May, 2014; originally announced May 2014.

Showing 1–7 of 7 results for author: Perez, Q