-
AI-Assisted Assessment of Coding Practices in Modern Code Review
Authors:
Manushree Vijayvergiya,
Małgorzata Salawa,
Ivan Budiselić,
Dan Zheng,
Pascal Lamblin,
Marko Ivanković,
Juanjo Carin,
Mateusz Lewko,
Jovan Andonov,
Goran Petrović,
Daniel Tarlow,
Petros Maniatis,
René Just
Abstract:
Modern code review is a process in which an incremental code contribution made by a code author is reviewed by one or more peers before it is committed to the version control system. An important element of modern code review is verifying that code contributions adhere to best practices. While some of these best practices can be automatically verified, verifying others is commonly left to human re…
▽ More
Modern code review is a process in which an incremental code contribution made by a code author is reviewed by one or more peers before it is committed to the version control system. An important element of modern code review is verifying that code contributions adhere to best practices. While some of these best practices can be automatically verified, verifying others is commonly left to human reviewers. This paper reports on the development, deployment, and evaluation of AutoCommenter, a system backed by a large language model that automatically learns and enforces coding best practices. We implemented AutoCommenter for four programming languages (C++, Java, Python, and Go) and evaluated its performance and adoption in a large industrial setting. Our evaluation shows that an end-to-end system for learning and enforcing coding best practices is feasible and has a positive impact on the developer workflow. Additionally, this paper reports on the challenges associated with deploying such a system to tens of thousands of developers and the corresponding lessons learned.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
MuRS: Mutant Ranking and Suppression using Identifier Templates
Authors:
Zimin Chen,
Malgorzata Salawa,
Manushree Vijayvergiya,
Goran Petrovic,
Marko Ivankovic,
Rene Just
Abstract:
Diff-based mutation testing is a mutation testing approach that only mutates lines affected by a code change under review. Google's mutation testing service integrates diff-based mutation testing into the code review process and continuously gathers developer feedback on mutants surfaced during code review. To enhance the developer experience, the mutation testing service implements a number of su…
▽ More
Diff-based mutation testing is a mutation testing approach that only mutates lines affected by a code change under review. Google's mutation testing service integrates diff-based mutation testing into the code review process and continuously gathers developer feedback on mutants surfaced during code review. To enhance the developer experience, the mutation testing service implements a number of suppression rules, which target not-useful mutants-that is, mutants that have consistently received negative developer feedback. However, while effective, manually implementing suppression rules require significant engineering time. An automatic system to rank and suppress mutants would facilitate the maintenance of the mutation testing service. This paper proposes and evaluates MuRS, an automated approach that groups mutants by patterns in the source code under test and uses these patterns to rank and suppress future mutants based on historical developer feedback on mutants in the same group. To evaluate MuRS, we conducted an A/B testing study, comparing MuRS to the existing mutation testing service. Despite the strong baseline, which uses manually developed suppression rules, the results show a statistically significantly lower negative feedback ratio of 11.45% for MuRS versus 12.41% for the baseline. The results also show that MuRS is able to recover existing suppression rules implemented in the baseline. Finally, the results show that statement-deletion mutant groups received both the most positive and negative developer feedback, suggesting a need for additional context that can distinguish between useful and not-useful mutants in these groups. Overall, MuRS has the potential to substantially reduce the development and maintenance cost for an effective mutation testing service by automatically learning suppression rules.
△ Less
Submitted 15 June, 2023;
originally announced June 2023.
-
Comparative Probing of Lexical Semantics Theories for Cognitive Plausibility and Technological Usefulness
Authors:
António Branco,
João Rodrigues,
Małgorzata Salawa,
Ruben Branco,
Chakaveh Saedi
Abstract:
Lexical semantics theories differ in advocating that the meaning of words is represented as an inference graph, a feature map** or a vector space, thus raising the question: is it the case that one of these approaches is superior to the others in representing lexical semantics appropriately? Or in its non antagonistic counterpart: could there be a unified account of lexical semantics where these…
▽ More
Lexical semantics theories differ in advocating that the meaning of words is represented as an inference graph, a feature map** or a vector space, thus raising the question: is it the case that one of these approaches is superior to the others in representing lexical semantics appropriately? Or in its non antagonistic counterpart: could there be a unified account of lexical semantics where these approaches seamlessly emerge as (partial) renderings of (different) aspects of a core semantic knowledge base?
In this paper, we contribute to these research questions with a number of experiments that systematically probe different lexical semantics theories for their levels of cognitive plausibility and of technological usefulness.
The empirical findings obtained from these experiments advance our insight on lexical semantics as the feature-based approach emerges as superior to the other ones, and arguably also move us closer to finding answers to the research questions above.
△ Less
Submitted 16 November, 2020;
originally announced November 2020.