
The Role of Generative AI in Software Development Productivity: A Pilot Case Study

Authors: Mariana Coutinho, Lorena Marques, Anderson Santos, Marcio Dahia, Cesar Franca, Ronnie de Souza Santos

Abstract: With software development increasingly reliant on innovative technologies, there is a growing interest in exploring the potential of generative AI tools to streamline processes and enhance productivity. In this scenario, this paper investigates the integration of generative AI tools within software development, focusing on understanding their uses, benefits, and challenges to software professionals, in particular, looking at aspects of productivity. Through a pilot case study involving software practitioners working in different roles, we gathered valuable experiences on the integration of generative AI tools into their daily work routines. Our findings reveal a generally positive perception of these tools in individual productivity while also highlighting the need to address identified limitations. Overall, our research sets the stage for further exploration into the evolving landscape of software development practices with the integration of generative AI tools.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.02490 [pdf]

Software Fairness Debt

Authors: Ronnie de Souza Santos, Felipe Fronchetti, Savio Freire, Rodrigo Spinola

Abstract: As software systems continue to play a significant role in modern society, ensuring their fairness has become a critical concern in software engineering. Motivated by this scenario, this paper focused on exploring the multifaceted nature of bias in software systems, aiming to provide a comprehensive understanding of its origins, manifestations, and impacts. Through a scoping study, we identified the primary causes of fairness deficiency in software development and highlighted their adverse effects on individuals and communities, including instances of discrimination and the perpetuation of inequalities. Our investigation culminated in the introduction of the concept of software fairness debt, which complements the notions of technical and social debt, encapsulating the accumulation of biases in software engineering practices while emphasizing the societal ramifications of bias embedded within software systems. Our study contributes to a deeper understanding of fairness in software engineering and paves the way for the development of more equitable and socially responsible software systems.

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2404.13464 [pdf]

Paths to Testing: Why Women Enter and Remain in Software Testing?

Authors: Kleice Silva, Ann Barcomb, Ronnie de Souza Santos

Abstract: Background. Women bring unique problem-solving skills to software development, often favoring a holistic approach and attention to detail. In software testing, precision and attention to detail are essential as professionals explore system functionalities to identify defects. Recognizing the alignment between these skills and women's strengths can derive strategies for enhancing diversity in software engineering. Goal. This study investigates the motivations behind women choosing careers in software testing, aiming to provide insights into their reasons for entering and remaining in the field. Method. This study used a cross-sectional survey methodology following established software engineering guidelines, collecting data from women in software testing to explore their motivations, experiences, and perspectives. Findings. The findings reveal that women enter software testing due to increased entry-level job opportunities, work-life balance, and even fewer gender stereotypes. Their motivations to stay include the impact of delivering high-quality software, continuous learning opportunities, and the challenges the activities bring to them. However, inclusiveness and career development in the field need improvement for sustained diversity. Conclusion. Preliminary yet significant, these findings offer interesting insights for researchers and practitioners towards the understanding of women's diverse motivations in software testing and how this understanding is important for fostering professional growth and creating a more inclusive and equitable industry landscape.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.13462 [pdf]

Exploring Hybrid Work Realities: A Case Study with Software Professionals From Underrepresented Groups

Authors: Ronnie de Souza Santos, Cleyton Magalhes, Robson Santons, Jorge Correia-Neto

Abstract: Context. In the post-pandemic era, software professionals resist returning to office routines, favoring the flexibility gained from remote work. Hybrid work structures, then, become popular within software companies, allowing them to choose not to work in the office every day, preserving flexibility, and creating several benefits, including an increase in the support for underrepresented groups in software development. Goal. We investigated how software professionals from underrepresented groups are experiencing post-pandemic hybrid work. In particular, we analyzed the experiences of neurodivergents, LGBTQIA+ individuals, and people with disabilities working in the software industry. Method. We conducted a case study focusing on the underrepresented groups within a well-established South American software company. Results. Hybrid work is preferred by software professionals from underrepresented groups in the post-pandemic era. Advantages include improved focus at home, personalized work setups, and accommodation for health treatments. Concerns arise about isolation and inadequate infrastructure support, highlighting the need for proactive organizational strategies. Conclusions. Hybrid work emerges as a promising strategy for fostering diversity and inclusion in software engineering, addressing past limitations of the traditional office environment.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.07934 [pdf, other]

Goal Recognition via Linear Programming

Authors: Felipe Meneguzzi, Luísa R. de A. Santos, Ramon Fraga Pereira, André G. Pereira

Abstract: Goal Recognition is the task by which an observer aims to discern the goals that correspond to plans that comply with the perceived behavior of subject agents given as a sequence of observations. Research on Goal Recognition as Planning encompasses reasoning about the model of a planning task, the observations, and the goals using planning techniques, resulting in very efficient recognition approaches. In this article, we design novel recognition approaches that rely on the Operator-Counting framework, proposing new constraints, and analyze their constraints' properties both theoretically and empirically. The Operator-Counting framework is a technique that efficiently computes heuristic estimates of cost-to-goal using Integer/Linear Programming (IP/LP). In the realm of theory, we prove that the new constraints provide lower bounds on the cost of plans that comply with observations. We also provide an extensive empirical evaluation to assess how the new constraints improve the quality of the solution, and we found that they are especially informed in deciding which goals are unlikely to be part of the solution. Our novel recognition approaches have two pivotal advantages: first, they employ new IP/LP constraints for efficiently recognizing goals; second, we show how the new IP/LP constraints can improve the recognition of goals under both partial and noisy observability.

Submitted 11 April, 2024; originally announced April 2024.

Comments: Submitted to JAIR April 2024

arXiv:2403.13220 [pdf]

Elevating Software Quality in Agile Environments: The Role of Testing Professionals in Unit Testing

Authors: Lucas Neves, Oscar Campos, Robson Santos, Italo Santos, Cleyton Magalhaes, Ronnie de Souza Santos

Abstract: Testing is an essential quality activity in the software development process. Usually, a software system is tested on several levels, starting with unit testing that checks the smallest parts of the code until acceptance testing, which is focused on the validations with the end-user. Historically, unit testing has been the domain of developers, who are responsible for ensuring the accuracy of their code. However, in agile environments, testing professionals play an integral role in various quality improvement initiatives throughout each development cycle. This paper explores the participation of test engineers in unit testing within an industrial context, employing a survey-based research methodology. Our findings demonstrate that testing professionals have the potential to strengthen unit testing by collaborating with developers to craft thorough test cases and fostering a culture of mutual learning and cooperation, ultimately contributing to increasing the overall quality of software projects.

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2401.09608 [pdf]

Hidden Populations in Software Engineering: Challenges, Lessons Learned, and Opportunities

Authors: Ronnie de Souza Santos, Kiev Gama

Abstract: The growing emphasis on studying equity, diversity, and inclusion within software engineering has amplified the need to explore hidden populations within this field. Exploring hidden populations becomes important to obtain invaluable insights into the experiences, challenges, and perspectives of underrepresented groups in software engineering and, therefore, devise strategies to make the software industry more diverse. However, studying these hidden populations presents multifaceted challenges, including the complexities associated with identifying and engaging participants due to their marginalized status. In this paper, we discuss our experiences and lessons learned while conducting multiple studies involving hidden populations in software engineering. We emphasize the importance of recognizing and addressing these challenges within the software engineering research community to foster a more inclusive and comprehensive understanding of diverse populations of software professionals.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.09605 [pdf]

Charting a Path to Efficient Onboarding: The Role of Software Visualization

Authors: Fernando Padoan, Ronnie de Souza Santos, Rodrigo Pessoa Medeiros

Abstract: Background. Within the software industry, it is commonly estimated that software professionals invest a substantial portion of their work hours in the process of understanding existing systems. In this context, an ineffective technical onboarding process, which introduces newcomers to software under development, can result in a prolonged period for them to absorb the necessary knowledge required to become productive in their roles. Goal. The present study aims to explore the familiarity of managers, leaders, and developers with software visualization tools and how these tools are employed to facilitate the technical onboarding of new team members. Method. To address the research problem, we built upon the insights gained through the literature and embraced a sequential exploratory approach. This approach incorporated quantitative and qualitative analyses of data collected from practitioners using questionnaires and semi-structured interviews. Findings. Our findings demonstrate a gap between the concept of software visualization and the practical use of onboarding tools and techniques. Overall, practitioners do not systematically incorporate software visualization tools into their technical onboarding processes due to a lack of conceptual understanding and awareness of their potential benefits. Conclusion. The software industry could benefit from standardized and evolving onboarding models, improved by incorporating software visualization techniques and tools to support program comprehension of newcomers in the software projects.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.08922 [pdf]

Post-Pandemic Hybrid Work in Software Companies: Findings from an Industrial Case Study

Authors: Ronnie de Souza Santos, Willian Grillo, Djafran Cabral, Catarina de Castro, Nicole Albuquerque, Cesar França

Abstract: Context. Software professionals learned from their experience during the pandemic that most of their work can be done remotely, and now software companies are expected to adopt hybrid work models to avoid the resignation of talented professionals who require more flexibility and work-life balance. However, hybrid work is a spectrum of flexible work arrangements, and currently, there are no well-established hybrid work configurations to be followed in the post-pandemic period. Goal. We investigated how software engineers are experiencing the post-pandemic hybrid work landscape, aiming to understand the factors that influence their choices between remote and in-office work. Method. We explored a large South American company by collecting quantitative and qualitative data from 545 software professionals who are currently navigating diverse hybrid work arrangements tailored to their individual and team requirements. Findings. Our study revealed an array of factors that significantly impact hybrid work within the software industry, including individual preferences, work-life balance, commute time, social interactions, productivity, and more. Team dynamics, project demands, client expectations, and organizational strategies also play an important role in shaping the complex landscape of hybrid work configurations in software engineering. Conclusions. In summary, the success of hybrid work models depends on balancing individual preferences, team dynamics, and organizational strategies. Our study demonstrated that, at present, there is no one-size-fits-all individual approach to hybrid work in the software industry.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2401.05912 [pdf, other]

Prompt-based mental health screening from social media text

Authors: Wesley Ramos dos Santos, Ivandre Paraboni

Abstract: This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text. Our method uses GPT 3.5. prompting to distinguish publications that may be more relevant to the task, and then uses a straightforward bag-of-words text classifier to predict actual user labels. Results are found to be on pair with a BERT mixture of experts classifier, and incurring only a fraction of its training costs.

Submitted 11 May, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

Comments: To appear in BrasNam-2024

arXiv:2312.04860 [pdf]

Are We Testing or Being Tested? Exploring the Practical Applications of Large Language Models in Software Testing

Authors: Robson Santos, Italo Santos, Cleyton Magalhaes, Ronnie de Souza Santos

Abstract: A Large Language Model (LLM) represents a cutting-edge artificial intelligence model that generates coherent content, including grammatically precise sentences, human-like paragraphs, and syntactically accurate code snippets. LLMs can play a pivotal role in software development, including software testing. LLMs go beyond traditional roles such as requirement analysis and documentation and can support test case generation, making them valuable tools that significantly enhance testing practices within the field. Hence, we explore the practical application of LLMs in software testing within an industrial setting, focusing on their current use by professional testers. In this context, rather than relying on existing data, we conducted a cross-sectional survey and collected data within real working contexts, specifically, engaging with practitioners in industrial settings. We applied quantitative and qualitative techniques to analyze and synthesize our collected data. Our findings demonstrate that LLMs effectively enhance testing documents and significantly assist testing professionals in programming tasks like debugging and test case automation. LLMs can support individuals engaged in manual testing who need to code. However, it is crucial to emphasize that, at this early stage, software testing professionals should use LLMs with caution while well-defined methods and guidelines are being built for the secure adoption of these tools.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.04832 [pdf]

Exposing Algorithmic Discrimination and Its Consequences in Modern Society: Insights from a Scoping Study

Authors: Ramandeep Singh Dehal, Mehak Sharma, Ronnie de Souza Santos

Abstract: Algorithmic discrimination is a condition that arises when data-driven software unfairly treats users based on attributes like ethnicity, race, gender, sexual orientation, religion, age, disability, or other personal characteristics. Nowadays, as machine learning gains popularity, cases of algorithmic discrimination are increasingly being reported in several contexts. This study delves into various studies published over the years reporting algorithmic discrimination. We aim to support software engineering researchers and practitioners in addressing this issue by discussing key characteristics of the problem

Submitted 16 January, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.04809 [pdf]

Navigating the Path of Women in Software Engineering: From Academia to Industry

Authors: Tatalina Oliveira, Ann Barcomb, Ronnie de Souza Santos, Helda Barros, Maria Teresa Baldassarre, César França

Abstract: Context. Women remain significantly underrepresented in software engineering, leading to a lasting gender gap in the software industry. This disparity starts in education and extends into the industry, causing challenges such as hostile work environments and unequal opportunities. Addressing these issues is crucial for fostering an inclusive and diverse software engineering workforce. Aim. This study aims to enhance the literature on women in software engineering, exploring their journey from academia to industry and discussing perspectives, challenges, and support. We focus on Brazilian women to extend existing research, which has largely focused on North American and European contexts. Method. In this study, we conducted a cross-sectional survey, collecting both quantitative and qualitative data, focusing on women's experiences in software engineering to explore their journey from university to the software industry. Findings. Our findings highlight persistent challenges faced by women in software engineering, including gender bias, harassment, work-life imbalance, undervaluation, low sense of belonging, and impostor syndrome. These difficulties commonly emerge from university experiences and continue to affect women throughout their entire careers. Conclusion. In summary, our study identifies systemic challenges in women's software engineering journey, emphasizing the need for organizational commitment to address these issues. We provide actionable recommendations for practitioners.

Submitted 7 December, 2023; originally announced December 2023.

Comments: 12 pages

arXiv:2311.06201 [pdf]

doi 10.1109/MS.2023.3267296

Myths and Facts about a Career in Software Testing: A Comparison between Students' Beliefs and Professionals' Experience

Authors: Ronnie de Souza Santos, Luiz Fernando Capretz, Cleyton Magalhaes, Rodrigo Souza

Abstract: Testing is an indispensable part of software development. However, a career in software testing is reported to be unpopular among students in computer science and related areas. This can potentially create a shortage of testers in the software industry in the future. The question is, whether the perception that undergraduate students have about software testing is accurate and whether it differs from the experience reported by those who work in testing activities in the software development industry. This investigation demonstrates that a career in software testing is more exciting and rewarding, as reported by professionals working in the field, than students may believe. Therefore, in order to guarantee a workforce focused on software quality, the academy and the software industry need to work together to better inform students about software testing and its essential role in software development.

Submitted 10 November, 2023; originally announced November 2023.

Comments: IEEE Software, Volume 40, Issue 5, pp. 76-84, September/October 2023

Journal ref: IEEE Software, Volume 40, Issue 5, pp. 76-84. September/October 2023

arXiv:2310.07671 [pdf, other]

Discovery of Novel Reticular Materials for Carbon Dioxide Capture using GFlowNets

Authors: Flaviu Cipcigan, Jonathan Booth, Rodrigo Neumann Barros Ferreira, Carine Ribeiro dos Santos, Mathias Steiner

Abstract: Artificial intelligence holds promise to improve materials discovery. GFlowNets are an emerging deep learning algorithm with many applications in AI-assisted discovery. By using GFlowNets, we generate porous reticular materials, such as metal organic frameworks and covalent organic frameworks, for applications in carbon dioxide capture. We introduce a new Python package (matgfn) to train and sample GFlowNets. We use matgfn to generate the matgfn-rm dataset of novel and diverse reticular materials with gravimetric surface area above 5000 m$^2$/g. We calculate single- and two-component gas adsorption isotherms for the top-100 candidates in matgfn-rm. These candidates are novel compared to the state-of-art ARC-MOF dataset and rank in the 90th percentile in terms of working capacity compared to the CoRE2019 dataset. We discover 15 materials outperforming all materials in CoRE2019.

Submitted 16 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.01719 [pdf]

Software Testing and Code Refactoring: A Survey with Practitioners

Authors: Danilo Leandro Lima, Ronnie de Souza Santos, Guilherme Pires Garcia, Sildemir S. da Silva, Cesar Franca, Luiz Fernando Capretz

Abstract: Nowadays, software testing professionals are commonly required to develop coding skills to work on test automation. One essential skill required from those who code is the ability to implement code refactoring, a valued quality aspect of software development; however, software developers usually encounter obstacles in successfully applying this practice. In this scenario, the present study aims to explore how software testing professionals (e.g., software testers, test engineers, test analysts, and software QAs) deal with code refactoring to understand the benefits and limitations of this practice in the context of software testing. We followed the guidelines to conduct surveys in software engineering and applied three sampling techniques, namely convenience sampling, purposive sampling, and snowballing sampling, to collect data from testing professionals. We received answers from 80 individuals reporting their experience refactoring the code of automated tests. We concluded that in the context of software testing, refactoring offers several benefits, such as supporting the maintenance of automated tests and improving the performance of the testing team. However, practitioners might encounter barriers in effectively implementing this practice, in particular, the lack of interest from managers and leaders. Our study raises discussions on the importance of having testing professionals implement refactoring in the code of automated tests, allowing them to improve their coding abilities.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2309.15186 [pdf, other]

doi 10.1109/TCE.2023.3255411

AsQM: Audio streaming Quality Metric based on Network Impairments and User Preferences

Authors: Marcelo Rodrigo dos Santos, Andreza Patrícia Batista, Renata Lopes Rosa, Muhammad Saadi, Dick Carrillo Melgarejo, Demóstenes Zegarra Rodríguez

Abstract: There are many users of audio streaming services because of the proliferation of cloud-based audio streaming services for different content. The complex networks that support these services do not always guarantee an acceptable quality on the end-user side. In this paper, the impact of temporal interruptions on the reproduction of audio streaming and the users preference in relation to audio contents are studied. In order to determine the key parameters in the audio streaming service, subjective tests were conducted, and their results show that users Quality-of-Experience (QoE) is highly correlated with the following application parameters, the number of temporal interruptions or stalls, its frequency and length, and the temporal location in which they occur. However, most important, experimental results demonstrated that users preference for audio content plays an important role in users QoE. Thus, a Preference Factor (PF) function is defined and considered in the formulation of the proposed metric named Audio streaming Quality Metric (AsQM). Considering that multimedia service providers are based on web servers, a framework to obtain user information is proposed. Furthermore, results show that the AsQM implemented in the audio player of an end users device presents a low impact on energy, processing and memory consumption.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 11 pages

Journal ref: IEEE Transactions on Consumer Electronics, vol. 69, no. 3, pp. 408-420, Aug. 2023

arXiv:2307.10312 [pdf, other]

Beyond the ML Model: Applying Safety Engineering Frameworks to Text-to-Image Development

Authors: Shalaleh Rismani, Renee Shelby, Andrew Smart, Renelito Delos Santos, AJung Moon, Negar Rostamzadeh

Abstract: Identifying potential social and ethical risks in emerging machine learning (ML) models and their applications remains challenging. In this work, we applied two well-established safety engineering frameworks (FMEA, STPA) to a case study involving text-to-image models at three stages of the ML product development pipeline: data processing, integration of a T2I model with other models, and use. Results of our analysis demonstrate that the safety frameworks, neither of which is designed explicitly to examine social and ethical risks, can uncover failures and hazards that pose social and ethical risks. We discovered a broad range of failures and hazards (i.e., functional, social, and ethical) by analyzing interactions (i.e., between different ML models in the product, between the ML product and user, and between development teams) and processes (i.e., preparation of training data or workflows for using an ML service/product). Our findings underscore the value and importance of looking beyond an ML model when examining social and ethical risks, especially when we have minimal information about an ML model.

Submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.00355 [pdf]

Comparing Mobile Testing Tools Using Documentary Analysis

Authors: Gustavo da Silva, Ronnie de Souza Santos

Abstract: Due to the high demand for mobile applications, given the exponential growth of users of this type of technology, testing professionals are frequently required to invest time in studying testing tools, in particular because several different tools are available nowadays. This variety makes it difficult for testing professionals to choose the tool that best fits their goals and supports them in their work. In this sense, we conducted a comparative analysis among five open-source tools for mobile testing: Appium, Robotium, Espresso, Frank, and EarlGrey. We used the documentary analysis method to explore the official documentation of each above-cited tool and developed various comparisons based on technical criteria reported in the literature about characteristics that mobile testing tools should have. Our findings are expected to help practitioners understand several aspects of mobile testing tools.

Submitted 1 July, 2023; originally announced July 2023.

arXiv:2306.15133 [pdf, other]

The Perspective of Software Professionals on Algorithmic Racism

Authors: Ronnie de Souza Santos, Luiz Fernando de Lima, Cleyton Magalhaes

Abstract: Context. Algorithmic racism is the term used to describe the behavior of technological solutions that constrains users based on their ethnicity. Lately, various data-driven software systems have been reported to discriminate against Black people, either for the use of biased data sets or due to the prejudice propagated by software professionals in their code. As a result, Black people are experiencing disadvantages in accessing technology-based services, such as housing, banking, and law enforcement. Goal. This study aims to explore algorithmic racism from the perspective of software professionals. Method. A survey questionnaire was applied to explore the understanding of software practitioners on algorithmic racism, and data analysis was conducted using descriptive statistics and coding techniques. Results. We obtained answers from a sample of 73 software professionals discussing their understanding and perspectives on algorithmic racism in software development. Our results demonstrate that the effects of algorithmic racism are well-known among practitioners. However, there is no consensus on how the problem can be effectively addressed in software engineering. In this paper, some solutions to the problem are proposed based on the professionals' narratives. Conclusion. Combining technical and social strategies, including training on structural racism for software professionals, is the most promising way to address the algorithmic racism problem and its effects on the software solutions delivered to our society.

Submitted 26 June, 2023; originally announced June 2023.

arXiv:2305.07430 [pdf, ps, other]

Expertise-based Weighting for Regression Models with Noisy Labels

Authors: Milene Regina dos Santos, Rafael Izbicki

Abstract: Regression methods assume that accurate labels are available for training. However, in certain scenarios, obtaining accurate labels may not be feasible, and relying on multiple specialists with differing opinions becomes necessary. Existing approaches addressing noisy labels often impose restrictive assumptions on the regression function. In contrast, this paper presents a novel, more flexible approach. Our method consists of two steps: estimating each labeler's expertise and combining their opinions using learned weights. We then regress the weighted average against the input features to build the prediction model. The proposed method is formally justified and empirically demonstrated to outperform existing techniques on simulated and real data. Furthermore, its flexibility enables the utilization of any machine learning technique in both steps. In summary, this method offers a simple, fast, and effective solution for training regression models with noisy labels derived from diverse expert opinions.

Submitted 12 May, 2023; originally announced May 2023.

arXiv:2303.12913 [pdf]

What do Transgender Software Professionals say about a Career in the Software Industry?

Authors: Ronnie de Souza Santos, Brody Stuart-Verner, Cleyton Magalhaes

Abstract: Diversity is an essential aspect of software development because technology influences almost every aspect of modern society, and if the software industry lacks diversity, software products might unintentionally constrain groups of individuals instead of promoting an equalitarian experience to all. In this study, we investigate the perspectives of transgender software professionals about a career in software engineering as one of the aspects of diversity in the software industry. Our findings demonstrate that, on the one hand, trans people choose careers in software engineering for two primary reasons: a) even though software development environments are not exempt from discrimination, the software industry is safer than other industries for transgender people; b) trans people occasionally have to deal with gender dysphoria, anxiety, and fear of judgment, and the work flexibility offered by software companies allows them to cope with these issues more efficiently.

Submitted 22 March, 2023; originally announced March 2023.

arXiv:2303.06215 [pdf, ps, other]

Post-pandemic Resilience of Hybrid Software Teams

Authors: Ronnie de Souza Santos, Gianisa Adisaputri, Paul Ralph

Abstract: Background. The COVID-19 pandemic triggered a widespread transition to hybrid work models (combinations of co-located and remote work) as software professionals demanded more flexibility and improved work-life balance. However, hybrid work models reduce the spontaneous, informal face-to-face interactions that promote group maturation, cohesion, and resilience. Little is known about how software companies can successfully transition to a hybrid workforce or the factors that influence the resilience of hybrid software development teams. Goal. The purpose of this study is to explore the relationship between hybrid work and team resilience in the context of software development. Method. Constructivist Grounded Theory was used, based on interviews of 26 software professionals. This sample included professionals of different genders, ethnicities, sexual orientations, and levels of experience. Interviewees came from eight different companies, 22 different projects, and four different countries. Consistent with grounded theory methodology, data collection and analysis were conducted iteratively, in waves, using theoretical sampling, constant comparison, and initial, focused, and theoretical coding. Results. Software Team Resilience is the ability of a group of software professionals to continue working together effectively under adverse conditions. Resilience depends on the group's maturity. The configuration of a hybrid team (who works where and when) can promote or hinder group maturity depending on the level of intra-group interaction it supports. Conclusion. This paper presents the first study on the resilience of hybrid software teams. Software teams need resilience to maintain their performance in the face of disruptions and crises. Software professionals strongly value hybrid work; therefore, team resilience is a key factor to be considered in the software industry.

Submitted 10 March, 2023; originally announced March 2023.

arXiv:2303.05953 [pdf]

LGBTQIA+ (In)Visibility in Computer Science and Software Engineering Education

Authors: Ronnie de Souza Santos, Brody Stuart-Verner, Cleyton de Magalhaes

Abstract: Modern society is diverse, multicultural, and multifaceted. Because of these characteristics, we are currently observing an increase in the debates about equity, diversity, and inclusion in different areas, especially because several groups of individuals are underrepresented in many environments. In computer science and software engineering, it seems counter-intuitive that these areas, which are responsible for creating technological solutions and systems for billions of users around the world, do not reflect the diversity of the society they serve. In trying to solve this diversity crisis in the software industry, researchers started to investigate strategies that can be applied to increase diversity and improve inclusion in academia and the software industry. However, the lack of diversity in computer science and related courses, including software engineering, is still a problem, in particular when some specific groups are considered. LGBTQIA+ students, for instance, face several challenges to fit into technology courses, even though most students in universities right now belong to Generation Z, which is described as open-minded to aspects of gender and sexuality. In this study, we aimed to discuss the state of the art of publications about the inclusion of LGBTQIA+ students in computer science education. Using a mapping study, we identified eight studies published in the past six years that focused on this public. We present strategies developed to adapt curricula and lectures to be more inclusive to LGBTQIA+ students and discuss challenges and opportunities for future research.

Submitted 10 March, 2023; originally announced March 2023.

arXiv:2303.05950 [pdf]

Diversity in Software Engineering: A Survey about Scientists from Underrepresented Groups

Authors: Ronnie de Souza Santos, Brody Stuart-Verner, Cleyton de Magalhaes

Abstract: Technology plays a crucial role in people's lives. However, software engineering discriminates against individuals from underrepresented groups in several ways, either through algorithms that produce biased outcomes or for the lack of diversity and inclusion in software development environments and academic courses focused on technology. This reality contradicts the history of software engineering, which is filled with outstanding scientists from underrepresented groups who changed the world with their contributions to the field. Ada Lovelace, Alan Turing, and Clarence Ellis are only some individuals who made significant breakthroughs in the area and belonged to the population that is so underrepresented in undergraduate courses and the software industry. Previous research discusses that women, LGBTQIA+ people, and non-white individuals are examples of students who often feel unwelcome and ostracized in software engineering. However, do they know about the remarkable scientists that came before them and that share background similarities with them? Can we use these scientists as role models to motivate these students to continue pursuing a career in software engineering? In this study, we present the preliminary results of a survey with 128 undergraduate students about this topic. Our findings demonstrate that students' knowledge of computer scientists from underrepresented groups is limited. This creates opportunities for investigations on fostering diversity in software engineering courses using strategies exploring computer science's history.

Submitted 6 May, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2301.05379 [pdf, other]

Benefits and Limitations of Remote Work to LGBTQIA+ Software Professionals

Authors: Ronnie de Souza Santos, Cleyton Magalhaes, Paul Ralph

Abstract: Background. The mass transition to remote work amid the COVID-19 pandemic profoundly affected software professionals, who abruptly shifted into ostensibly temporary home offices. The effects of this transition on these professionals are complex, depending on the particularities of the context and individuals. Recent studies advocate for remote structures to create opportunities for many equity-deserving groups; however, remote work can also be challenging for some individuals, such as women and individuals with disabilities. Objective. This study aims to investigate the effects of remote work on LGBTQIA+ software professionals. Method. Grounded theory methodology was applied based on information collected from two main sources: a survey questionnaire with a sample of 57 LGBTQIA+ software professionals and nine follow-up interviews with individuals from this sample. This sample included professionals of different genders, ethnicities, sexual orientations, and levels of experience. Findings. Our findings demonstrate that (1) remote work benefits LGBTQIA+ people by increasing security and visibility; (2) remote work harms LGBTQIA+ software professionals through isolation and invisibility; (3) the benefits outweigh the drawbacks; (4) the drawbacks can be mitigated by supportive measures developed by software companies. Conclusion. This paper investigated how remote work can affect LGBTQIA+ software professionals and presented a set of recommendations on how software companies can address the benefits and limitations associated with this work model. In summary, we concluded that remote work is crucial in increasing diversity and inclusion in the software industry.

Submitted 4 June, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

Comments: 10 pages

arXiv:2201.08239 [pdf, other]

LaMDA: Language Models for Dialog Applications

Authors: Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, YaGuang Li, Hongrae Lee, Huaixiu Steven Zheng, Amin Ghafouri, Marcelo Menegali, Yanping Huang, Maxim Krikun, Dmitry Lepikhin, James Qin, Dehao Chen, Yuanzhong Xu, Zhifeng Chen, Adam Roberts, Maarten Bosma, Vincent Zhao, et al. (35 additional authors not shown)

Abstract: We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.

Submitted 10 February, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

arXiv:2112.06735 [pdf, other]

doi 10.1140/epjb/s10051-022-00453-3

Unsupervised machine learning approaches to the $q$-state Potts model

Authors: Andrea Tirelli, Danyella O. Carvalho, Lucas A. Oliveira, J. P. Lima, Natanael C. Costa, Raimundo R. dos Santos

Abstract: In this paper we study phase transitions of the $q$-state Potts model through a number of unsupervised machine learning techniques, namely Principal Component Analysis (PCA), $k$-means clustering, Uniform Manifold Approximation and Projection (UMAP), and Topological Data Analysis (TDA). Even though in all cases we are able to retrieve the correct critical temperatures $T_c(q)$, for $q = 3, 4$ and $5$, results show that non-linear methods such as UMAP and TDA are less dependent on finite size effects, while still being able to distinguish between first and second order phase transitions. This study may be considered as a benchmark for the use of different unsupervised machine learning algorithms in the investigation of phase transitions.

Submitted 18 March, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

Comments: Added computation of critical exponents; exposition improved

arXiv:2107.13537 [pdf]

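Abordagem probabilística para análise de confiabilidade de dados gerados em sequenciamentos multiplex na plataforma ABI SOLiD [Probabilistic approach to the reliability analysis of data generated in multiplex sequencing on the ABI SOLiD platform]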

Authors: Fabio M. F. Lobato, Carlos D. N. Damasceno, Péricles L. Machado, Nandamudi L. Vijaykumar, André R. dos Santos, Sylvain H. Darnet, André N. A. Gonçalves, Dayse O. de Alencar, Ádamo L. de Santana

Abstract: The next-generation sequencers such as Illumina and SOLiD platforms generate a large amount of data, commonly above 10 Gigabytes of text files. Particularly, the SOLiD platform allows the sequencing of multiple samples in a single run, called multiplex run, through a tagging system called Barcode. This feature requires a computational process for separation of the data sample because the sequencer provides a mixture of all samples in a single output. This process must be secure to avoid any harm that may scramble further analysis. In this context, we realized the need to develop a probabilistic model capable of assigning a degree of confidence in the marking system used in multiplex sequencing. The results confirmed the adequacy of the model obtained, which makes it possible, among other things, to guide a process of filtering the data and evaluating the sequencing protocol used.

Submitted 11 August, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

Comments: 8 pages, 4 figures, 2 tables, Published in Portuguese in the Anais of the XLIII Simpósio Brasileiro de Pesquisa Operacional (SBPO 2011), 2011. URL: http://www.din.uem.br/sbpo/sbpo2011/pdf/87903.pdf

arXiv:2106.07428 [pdf, other]

Audio Attacks and Defenses against AED Systems -- A Practical Study

Authors: Rodrigo dos Santos, Shirin Nilizadeh

Abstract: In this paper, we evaluate deep learning-enabled AED systems against evasion attacks based on adversarial examples. We test the robustness of multiple security critical AED tasks, implemented as CNN classifiers, as well as existing third-party Nest devices, manufactured by Google, which run their own black-box deep learning models. Our adversarial examples use audio perturbations made of white and background noises. Such disturbances are easy to create, to perform and to reproduce, and can be accessible to a large number of potential attackers, even non-technically savvy ones. We show that an adversary can focus on audio adversarial inputs to cause AED systems to misclassify, achieving high success rates, even when we use small levels of a given type of noisy disturbance. For instance, in the case of the gunshot sound class, we achieve nearly 100% success rate when employing as little as 0.05 white noise level. Similarly to what has been previously done by works focusing on adversarial examples from the image domain as well as on the speech recognition domain. We then seek to improve classifiers' robustness through countermeasures. We employ adversarial training and audio denoising. We show that these countermeasures, when applied to audio input, can be successful, either in isolation or in combination, generating relevant increases of nearly fifty percent in the performance of the classifiers when these are under attack.

Submitted 10 November, 2021; v1 submitted 14 June, 2021; originally announced June 2021.

arXiv:2106.03954 [pdf, other]

Evaluating Meta-Feature Selection for the Algorithm Recommendation Problem

Authors: Geand Trindade Pereira, Moises Rocha dos Santos, Andre Carlos Ponce de Leon Ferreira de Carvalho

Abstract: With the popularity of Machine Learning (ML) solutions, algorithms and data have been released faster than the capacity of processing them. In this context, the problem of Algorithm Recommendation (AR) is receiving a significant deal of attention recently. This problem has been addressed in the literature as a learning task, often as a Meta-Learning problem where the aim is to recommend the best alternative for a specific dataset. For such, datasets encoded by meta-features are explored by ML algorithms that try to learn the mapping between meta-representations and the best technique to be used. One of the challenges for the successful use of ML is to define which features are the most valuable for a specific dataset since several meta-features can be used, which increases the meta-feature dimension. This paper presents an empirical analysis of Feature Selection and Feature Extraction in the meta-level for the AR problem. The present study was focused on three criteria: predictive performance, dimensionality reduction, and pipeline runtime. As we verified, applying Dimensionality Reduction (DR) methods did not improve predictive performances in general. However, DR solutions reduced about 80% of the meta-features, obtaining pretty much the same performance as the original setup but with lower runtimes. The only exception was PCA, which presented about the same runtime as the original meta-features. Experimental results also showed that various datasets have many non-informative meta-features and that it is possible to obtain high predictive performance using around 20% of the original meta-features. Therefore, due to their natural trend for high dimensionality, DR methods should be used for Meta-Feature Selection and Meta-Feature Extraction.

Submitted 11 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

arXiv:2104.12295 [pdf, other]

Vulnerabilities and Open Issues of Smart Contracts: A Systematic Mapping

Authors: Gabriel de Sousa Matsumura, Luciana Brasil Rebelo dos Santos, Arlindo Flavio da Conceição, Nandamudi Lankalapalli Vijaykumar

Abstract: Smart Contracts (SCs) are programs stored in a Blockchain to ensure agreements between two or more parties. Due to the unchangeable essence of Blockchain, failures or errors in SCs become perpetual once published. The reliability of SCs is essential to avoid financial losses. So, SCs must be checked to ensure the absence of errors. Hence, many studies addressed new methods and tools for zero-bug software in SCs. This paper conducted a systematic literature mapping identifying initiatives and tools to analyze SCs and how to deal with the identified vulnerabilities. Besides, this work identifies gaps that may lead to research topics for future work.

Submitted 26 May, 2021; v1 submitted 25 April, 2021; originally announced April 2021.

arXiv:2002.04312 [pdf, other]

Improved prediction of soil properties with Multi-target Stacked Generalisation on EDXRF spectra

Authors: Everton Jose Santana, Felipe Rodrigues dos Santos, Saulo Martiello Mastelini, Fabio Luiz Melquiades, Sylvio Barbon Jr

Abstract: Machine Learning (ML) algorithms have been used for assessing soil quality parameters along with non-destructive methodologies. Among spectroscopic analytical methodologies, energy dispersive X-ray fluorescence (EDXRF) is one of the more quick, environmentally friendly and less expensive when compared to conventional methods. However, some challenges in EDXRF spectral data analysis still demand more efficient methods capable of providing accurate outcomes. Using Multi-target Regression (MTR) methods, multiple parameters can be predicted, and by taking advantage of inter-correlated parameters the overall predictive performance can also be improved. In this study, we proposed the Multi-target Stacked Generalisation (MTSG), a novel MTR method relying on learning from different regressors arranged in a stacking structure for a boosted outcome. We compared MTSG and 5 MTR methods for predicting 10 parameters of soil fertility. Random Forest and Support Vector Machine (with linear and radial kernels) were used as learning algorithms embedded into each MTR method. Results showed the superiority of MTR methods over the Single-target Regression (the traditional ML method), reducing the predictive error for 5 parameters. Particularly, MTSG obtained the lowest error for phosphorus, total organic carbon and cation exchange capacity. When observing the relative performance of Support Vector Machine with a radial kernel, the prediction of base saturation percentage was improved by 19%. Finally, the proposed method was able to reduce the average error from 0.67 (single-target) to 0.64 analysing all targets, representing a global improvement of 4.48%.

Submitted 11 February, 2020; originally announced February 2020.

Comments: 20 pages, 5 figures

arXiv:1912.02480 [pdf, other]

Leveraging Operational Technology and the Internet of Things to Attack Smart Buildings

Authors: Daniel Ricardo dos Santos, Mario Dagrada, Elisa Costante

Abstract: In recent years, the buildings where we spend most of our lives have been rapidly evolving. They are becoming fully automated environments where energy consumption, access control, heating and many other subsystems are all integrated within a single system commonly referred to as a smart building (SB). To support the growing complexity of building operations, building automation systems (BAS) powering SBs are integrating consumer-range Internet of Things (IoT) devices such as IP cameras alongside operational technology (OT) controllers and actuators. However, these changes pose important cybersecurity concerns since the attack surface is larger, attack vectors are increasing and attacks can potentially harm building occupants. In this paper, we analyze the threat landscape of BASs by focusing on subsystems which are strongly affected by the advent of IoT devices such as video surveillance systems and smart lighting. We demonstrate how BAS operation can be disrupted by simple attacks on widely used network protocols. Furthermore, using both known and 0-day vulnerabilities reported in the paper and previously disclosed, we present the first (to our knowledge) BAS-specific malware which is able to persist within the BAS network by leveraging both OT and IoT devices connected to the BAS. Our research highlights how BAS networks can be considered as critical as industrial control systems and that security concerns in BASs deserve more attention from both industrial and scientific communities. Even within a simulated environment, our proof-of-concept attacks were carried out with relative ease and a limited amount of budget and resources. Therefore, we believe that well-funded attack groups will increasingly shift their focus towards BASs with the potential of impacting the lives of thousands of people.

Submitted 5 December, 2019; originally announced December 2019.

arXiv:1905.04210 [pdf, other]

An LP-Based Approach for Goal Recognition as Planning

Authors: Luísa R. de A. Santos, Felipe Meneguzzi, Ramon Fraga Pereira, André Grahl Pereira

Abstract: Goal recognition aims to recognize the set of candidate goals that are compatible with the observed behavior of an agent. In this paper, we develop a method based on the operator-counting framework that efficiently computes solutions that satisfy the observations and uses the information generated to solve goal recognition tasks. Our method reasons explicitly about both partial and noisy observations: estimating uncertainty for the former, and satisfying observations given the unreliability of the sensor for the latter. We evaluate our approach empirically over a large data set, analyzing its components on how each can impact the quality of the solutions. In general, our approach is superior to previous methods in terms of agreement ratio, accuracy, and spread. Finally, our approach paves the way for new research on combinatorial optimization to solve goal recognition tasks.

Submitted 15 June, 2021; v1 submitted 10 May, 2019; originally announced May 2019.

Comments: 8 pages, 4 tables, 3 figures. Published in AAAI 2021. Updated final authorship and text

Journal ref: AAAI 2021: 11939-11946

arXiv:1810.02980 [pdf, ps, other]

Personality facets recognition from text

Authors: Wesley Ramos dos Santos, Ivandre Paraboni

Abstract: Fundamental Big Five personality traits (e.g., Extraversion) and their facets (e.g., Activity) are known to correlate with a broad range of linguistic features and, accordingly, the recognition of personality traits from text is a well-known Natural Language Processing task. Labelling text data with facets information, however, may require the use of lengthy personality inventories, and perhaps for that reason existing computational models of this kind are usually limited to the recognition of the fundamental traits. Based on these observations, this paper investigates the issue of personality facets recognition from text labelled only with information available from a shorter personality inventory. In doing so, we provide a low-cost model for the recognition of certain personality facets, and present reference results for further studies in this field.

Submitted 27 March, 2019; v1 submitted 6 October, 2018; originally announced October 2018.

arXiv:1709.03084 [pdf, other]

doi 10.1007/s10009-017-0474-1

TestREx: a Framework for Repeatable Exploits

Authors: Stanislav Dashevskyi, Daniel Ricardo dos Santos, Fabio Massacci, Antonino Sabetta

Abstract: Web applications are the target of many well known exploits and also a fertile ground for the discovery of security vulnerabilities. Yet, the success of an exploit depends both on the vulnerability in the application source code and the environment in which the application is deployed and run. As execution environments are complex (application servers, databases and other supporting applications), we need to have a reliable framework to test whether known exploits can be reproduced in different settings, better understand their effects, and facilitate the discovery of new vulnerabilities. In this paper, we present TestREx - a framework that allows for highly automated, easily repeatable exploit testing in a variety of contexts, so that a security tester may quickly and efficiently perform large-scale experiments with vulnerability exploits. It supports packing and running applications with their environments, injecting exploits, monitoring their success, and generating security reports. We also provide a corpus of example applications, taken from related works or implemented by us.

Submitted 10 September, 2017; originally announced September 2017.

Journal ref: Int. J. Software Tools for Technology Transfer, 2017

arXiv:1706.07205 [pdf, other]

A Survey on Workflow Satisfiability, Resiliency, and Related Problems

Authors: Daniel Ricardo dos Santos, Silvio Ranise

Abstract: Workflows specify collections of tasks that must be executed under the responsibility or supervision of human users. Workflow management systems and workflow-driven applications need to enforce security policies in the form of access control, specifying which users can execute which tasks, and authorization constraints, such as Separation of Duty, further restricting the execution of tasks at run-time. Enforcing these policies is crucial to avoid frauds and malicious use, but it may lead to situations where a workflow instance cannot be completed without the violation of the policy. The Workflow Satisfiability Problem (WSP) asks whether there exists an assignment of users to tasks in a workflow such that every task is executed and the policy is not violated. The WSP is inherently hard, but solutions to this problem have a practical application in reconciling business compliance and business continuity. Solutions to related problems, such as workflow resiliency (i.e., whether a workflow instance is still satisfiable even in the absence of users), are important to help in policy design. Several variations of the WSP and similar problems have been defined in the literature and there are many solution methods available. In this paper, we survey the work done on these problems in the past 20 years.

Submitted 22 June, 2017; originally announced June 2017.

arXiv:1507.07479 [pdf, other]

Modularity for Security-Sensitive Workflows

Authors: Daniel Ricardo dos Santos, Silvio Ranise, Serena Elisa Ponta

Abstract: An established trend in software engineering insists on using components (sometimes also called services or packages) to encapsulate a set of related functionalities or data. By defining interfaces specifying what functionalities they provide or use, components can be combined with others to form more complex components. In this way, IT systems can be designed by mostly re-using existing components and developing new ones to provide new functionalities. In this paper, we introduce a notion of component and a combination mechanism for an important class of software artifacts, called security-sensitive workflows. These are business processes in which execution constraints on the tasks are complemented with authorization constraints (e.g., Separation of Duty) and authorization policies (constraining which users can execute which tasks). We show how well-known workflow execution patterns can be simulated by our combination mechanism and how authorization constraints can also be imposed across components. Then, we demonstrate the usefulness of our notion of component by showing (i) the scalability of a technique for the synthesis of run-time monitors for security-sensitive workflows and (ii) the design of a plug-in for the re-use of workflows and related run-time monitors inside an editor for security-sensitive workflows.

Submitted 27 July, 2015; originally announced July 2015.

arXiv:1404.0855 [pdf, other]

doi 10.4204/EPTCS.147.10

Transformation of UML Behavioral Diagrams to Support Software Model Checking

Authors: Luciana Brasil Rebelo dos Santos, Valdivino Alexandre de Santiago Júnior, Nandamudi Lankalapalli Vijaykumar

Abstract: Unified Modeling Language (UML) is currently accepted as the standard for modeling (object-oriented) software, and its use is increasing in the aerospace industry. Verification and Validation of complex software developed according to UML is not trivial due to complexity of the software itself, and the several different UML models/diagrams that can be used to model behavior and structure of the software. This paper presents an approach to transform up to three different UML behavioral diagrams (sequence, behavioral state machines, and activity) into a single Transition System to support Model Checking of software developed in accordance with UML. In our approach, properties are formalized based on use case descriptions. The transformation is done for the NuSMV model checker, but we see the possibility in using other model checkers, such as SPIN. The main contribution of our work is the transformation of a non-formal language (UML) to a formal language (language of the NuSMV model checker) towards a greater adoption in practice of formal methods in software development.

Submitted 3 April, 2014; originally announced April 2014.

Comments: In Proceedings FESCA 2014, arXiv:1404.0436

Journal ref: EPTCS 147, 2014, pp. 133-142

Showing 1–40 of 40 results for author: Santos, R D