Search | arXiv e-print repository

How do annotations affect Java code readability?

Authors: Eduardo Guerra, Everaldo Gomes, Jeferson Ferreira, Igor Wiese, Phyllipe Lima, Marco Gerosa, Paulo Meirelles

Abstract: Context: Code annotations have gained widespread popularity in programming languages, offering developers the ability to attach metadata to code elements to define custom behaviors. Many modern frameworks and APIs use annotations to keep integration less verbose and located nearer to the corresponding code element. Despite these advantages, practitioners' anecdotal evidence suggests that annotatio… ▽ More Context: Code annotations have gained widespread popularity in programming languages, offering developers the ability to attach metadata to code elements to define custom behaviors. Many modern frameworks and APIs use annotations to keep integration less verbose and located nearer to the corresponding code element. Despite these advantages, practitioners' anecdotal evidence suggests that annotations might negatively affect code readability. Objective: To better understand this effect, this paper systematically investigates the relationship between code annotations and code readability. Method: In a survey with software developers (n=332), we present 15 pairs of Java code snippets with and without code annotations. These pairs were designed considering five categories of annotation used in real-world Java frameworks and APIs. Survey participants selected the code snippet they considered more readable for each pair and answered an open question about how annotations affect the code's readability. Results: Preferences were scattered for all categories of annotation usage, revealing no consensus among participants. The answers were spread even when segregated by participants' programming or annotation-related experience. Nevertheless, some participants showed a consistent preference in favor or against annotations across all categories, which may indicate a personal preference. Our qualitative analysis of the open-ended questions revealed that participants often praise annotation impacts on design, maintainability, and productivity but expressed contrasting views on understandability and code clarity. Conclusions: Software developers and API designers can consider our results when deciding whether to use annotations, equipped with the insight that developers express contrasting views of the annotations' impact on code readability. △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: Accepted to Empirical Software Engineering (EMSE) Journal

arXiv:2302.09852 [pdf, other]

Unsupervised Layer-wise Score Aggregation for Textual OOD Detection

Authors: Maxime Darrin, Guillaume Staerman, Eduardo Dadalto Câmara Gomes, Jackie CK Cheung, Pablo Piantanida, Pierre Colombo

Abstract: Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by an increased number of AI-based systems. Existing OOD textual detectors often rely on an anomaly score (e.g., Mahalanobis distance) computed on the embedding output of the last layer of the encoder. In this work, we observe that OOD detection performance varies greatly depending… ▽ More Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by an increased number of AI-based systems. Existing OOD textual detectors often rely on an anomaly score (e.g., Mahalanobis distance) computed on the embedding output of the last layer of the encoder. In this work, we observe that OOD detection performance varies greatly depending on the task and layer output. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection and that far better results could be achieved if the best layer were picked. To leverage this observation, we propose a data-driven, unsupervised method to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a greater number of classes (up to 77), which reflects more realistic settings. On this augmented benchmark, we show that the proposed post-aggregation methods achieve robust and consistent results while removing manual feature selection altogether. Their performance achieves near oracle's best layer performance. △ Less

Submitted 21 February, 2024; v1 submitted 20 February, 2023; originally announced February 2023.

arXiv:2211.13527 [pdf, other]

Beyond Mahalanobis-Based Scores for Textual OOD Detection

Authors: Pierre Colombo, Eduardo D. C. Gomes, Guillaume Staerman, Nathan Noiry, Pablo Piantanida

Abstract: Deep learning methods have boosted the adoption of NLP systems in real-life applications. However, they turn out to be vulnerable to distribution shifts over time which may cause severe dysfunctions in production systems, urging practitioners to develop tools to detect out-of-distribution (OOD) samples through the lens of the neural network. In this paper, we introduce TRUSTED, a new OOD detector… ▽ More Deep learning methods have boosted the adoption of NLP systems in real-life applications. However, they turn out to be vulnerable to distribution shifts over time which may cause severe dysfunctions in production systems, urging practitioners to develop tools to detect out-of-distribution (OOD) samples through the lens of the neural network. In this paper, we introduce TRUSTED, a new OOD detector for classifiers based on Transformer architectures that meets operational requirements: it is unsupervised and fast to compute. The efficiency of TRUSTED relies on the fruitful idea that all hidden layers carry relevant information to detect OOD examples. Based on this, for a given input, TRUSTED consists in (i) aggregating this information and (ii) computing a similarity score by exploiting the training distribution, leveraging the powerful concept of data depth. Our extensive numerical experiments involve 51k model configurations, including various checkpoints, seeds, and datasets, and demonstrate that TRUSTED achieves state-of-the-art performances. In particular, it improves previous AUROC over 3 points. △ Less

Submitted 24 November, 2022; originally announced November 2022.

Journal ref: NeurIPS 2022

arXiv:2203.07798 [pdf, other]

Igeood: An Information Geometry Approach to Out-of-Distribution Detection

Authors: Eduardo Dadalto Camara Gomes, Florence Alberge, Pierre Duhamel, Pablo Piantanida

Abstract: Reliable out-of-distribution (OOD) detection is fundamental to implementing safer modern machine learning (ML) systems. In this paper, we introduce Igeood, an effective method for detecting OOD samples. Igeood applies to any pre-trained neural network, works under various degrees of access to the ML model, does not require OOD samples or assumptions on the OOD data but can also benefit (if availab… ▽ More Reliable out-of-distribution (OOD) detection is fundamental to implementing safer modern machine learning (ML) systems. In this paper, we introduce Igeood, an effective method for detecting OOD samples. Igeood applies to any pre-trained neural network, works under various degrees of access to the ML model, does not require OOD samples or assumptions on the OOD data but can also benefit (if available) from OOD samples. By building on the geodesic (Fisher-Rao) distance between the underlying data distributions, our discriminator can combine confidence scores from the logits outputs and the learned features of a deep neural network. Empirically, we show that Igeood outperforms competing state-of-the-art methods on a variety of network architectures and datasets. △ Less

Submitted 15 March, 2022; originally announced March 2022.

Comments: Accepted in ICLR 2022

arXiv:2112.10938 [pdf, other]

doi 10.1016/j.infsof.2022.107089

CADV: A software visualization approach for code annotations distribution

Authors: Phyllipe Lima, Jorge Melegati, Everaldo Gomes, Nathalya Stefhany Pereira, Eduardo Guerra, Paulo Meirelles

Abstract: Code annotations is a widely used feature in Java systems to configure custom metadata on programming elements. Their increasing presence creates the need for approaches to assess and comprehend their usage and distribution. In this context, software visualization has been studied and researched to improve program comprehension in different aspects. This study aimed at designing a software visuali… ▽ More Code annotations is a widely used feature in Java systems to configure custom metadata on programming elements. Their increasing presence creates the need for approaches to assess and comprehend their usage and distribution. In this context, software visualization has been studied and researched to improve program comprehension in different aspects. This study aimed at designing a software visualization approach that graphically displays how code annotations are distributed and organized in a software system and develo** a tool, as a reference implementation of the approach, to generate views and interact with users. We conducted an empirical evaluation through questionnaires and interviews to evaluate our visualization approach considering four aspects: effectiveness for program comprehension, perceived usefulness, perceived ease of use, and suitability for the intended audience. The resulting data was used to perform a qualitative and quantitative analysis. The tool identifies package responsibilities providing visual information about their annotations at different levels. Using the developed tool, the participants achieved a high correctness rate in the program comprehension tasks and performed very well in questions about the overview of the system under analysis. Finally, participants perceived that the tool outperforms existing approaches for code inspection when searching for information related to code annotations. The results show that the visualization approach using the developed tool is effective in program comprehension tasks related to code annotations, which can also be used to identify responsibilities in the application packages. Moreover, it was evaluated as suitable for newcomers to overview the usage of annotations in the system and for architects to perform a deep analysis that can potentially detect misplaced annotations and abnormal growths on their usage. △ Less

Submitted 27 June, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

Comments: 53 pages

arXiv:0907.2039 [pdf, ps, other]

Refining interfaces: the case of the B method

Authors: David Deharbe, Bruno E. G. Gomes, Anamaria M. Moreira

Abstract: Model-driven design of software for safety-critical applications often relies on mathematically grounded techniques such as the B method. Such techniques consist in the successive applications of refinements to derive a concrete implementation from an abstract specification. Refinement theory defines verification conditions to guarantee that such operations preserve the intended behaviour of the… ▽ More Model-driven design of software for safety-critical applications often relies on mathematically grounded techniques such as the B method. Such techniques consist in the successive applications of refinements to derive a concrete implementation from an abstract specification. Refinement theory defines verification conditions to guarantee that such operations preserve the intended behaviour of the abstract specifications. One of these conditions requires however that concrete operations have exactly the same signatures as their abstract counterpart, which is not always a practical requirement. This paper shows how changes of signatures can be achieved while still staying within the bounds of refinement theory. This makes it possible to take advantage of the mathematical guarantees and tool support provided for the current refinement-based techniques, such as the B method. △ Less

Submitted 12 July, 2009; originally announced July 2009.

Comments: 18 pages, submitted to ICFEM 2009

ACM Class: D.2.4

Showing 1–6 of 6 results for author: Gomes, E