-
Towards Safe Automated Refactoring of Imperative Deep Learning Programs to Graph Execution
Authors:
Raffi Khatchadourian,
Tatiana Castro Vélez,
Mehdi Bagherzadeh,
Nan Jia,
Anita Raja
Abstract:
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code -- supporting symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged at the expense of run-time performance. Though hybrid approaches aim for the "best of both worlds," using them effectively requires subtle considerations to make code amenable to safe, accurate, and efficient graph execution -- avoiding performance bottlenecks and semantically inequivalent results. We present our ongoing work on an automated refactoring approach that assists developers in specifying whether and how their otherwise eagerly-executed imperative DL code could be reliably and efficiently executed as graphs at run-time in a semantics-preserving fashion. The approach, based on a novel tensor analysis specifically for imperative DL code, consists of refactoring preconditions for automatically determining when it is safe and potentially advantageous to migrate imperative DL code to graph execution and modifying decorator parameters or eagerly executing code already running as graphs. The approach is being implemented as a PyDev Eclipse IDE plug-in and uses the WALA Ariadne analysis framework. We discuss our ongoing work towards optimizing imperative DL code to its full potential.
Submitted 10 October, 2023; v1 submitted 22 August, 2023;
originally announced August 2023.
-
QuerTCI: A Tool Integrating GitHub Issue Querying with Comment Classification
Authors:
Ye Paing,
Tatiana Castro Vélez,
Raffi Khatchadourian
Abstract:
Empirical Software Engineering (ESE) researchers study (open-source) project issues and the comments and threads within to discover -- among others -- challenges developers face when incorporating new technologies, platforms, and programming language constructs. However, such threads accumulate, becoming unwieldy and hindering any insight researchers may gain. While existing approaches alleviate this burden by classifying issue thread comments, there is a gap between searching popular open-source software repositories (e.g., those on GitHub) for issues containing particular keywords and feeding the results into a classification model. This paper demonstrates a research infrastructure tool called QuerTCI that bridges this gap by integrating the GitHub issue comment search API with the classification models found in existing approaches. Using queries, ESE researchers can retrieve GitHub issues containing particular keywords, e.g., those related to a specific programming language construct, and, subsequently, classify the discussions occurring in those issues. We hope ESE researchers can use our tool to uncover challenges related to particular technologies using specific keywords through popular open-source repositories more seamlessly than previously possible. A tool demonstration video may be found at: https://youtu.be/fADKSxn0QUk.
Submitted 19 July, 2022; v1 submitted 17 February, 2022;
originally announced February 2022.
-
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study
Authors:
Tatiana Castro Vélez,
Raffi Khatchadourian,
Mehdi Bagherzadeh,
Anita Raja
Abstract:
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce DL code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged but at the expense of run-time performance. While hybrid approaches aim for the "best of both worlds," the challenges in applying them in the real world are largely unknown. We conduct a data-driven analysis of challenges -- and resultant bugs -- involved in writing reliable yet performant imperative DL code by studying 250 open-source projects, consisting of 19.7 MLOC, along with 470 and 446 manually examined code patches and bug reports, respectively. The results indicate that hybridization: (i) is prone to API misuse, (ii) can result in performance degradation -- the opposite of its intention, and (iii) has limited application due to execution mode incompatibility. We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code, potentially benefiting DL practitioners, API designers, tool developers, and educators.
Submitted 5 April, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
A Tool for Rejuvenating Feature Logging Levels via Git Histories and Degree of Interest
Authors:
Yiming Tang,
Allan Spektor,
Raffi Khatchadourian,
Mehdi Bagherzadeh
Abstract:
Logging is a significant programming practice. Due to the highly transactional nature of modern software applications, massive amounts of logs are generated every day, which may overwhelm developers. Logging information overload can be dangerous to software applications. Using log levels, developers can print the useful information while hiding the verbose logs during software runtime. As software evolves, the log levels of logging statements associated with the surrounding software feature implementation may also need to be altered. Maintaining log levels necessitates a significant amount of manual effort. In this paper, we demonstrate an automated approach that can rejuvenate feature log levels by matching the interest level of developers in the surrounding features. The approach is implemented as an open-source Eclipse plugin, using two external plug-ins (JGit and Mylyn). It was tested on 18 open-source Java projects consisting of ~3 million lines of code and ~4K log statements. Our tool successfully analyzes 99.22% of logging statements, increases log level distributions by ~20%, and increases the focus of logs in bug fix contexts ~83% of the time. For further details, interested readers can watch our demonstration video (https://www.youtube.com/watch?v=qIULoAXoDv4).
Submitted 16 February, 2022; v1 submitted 5 December, 2021;
originally announced December 2021.
-
Interests, Difficulties, Sentiments, and Tool Usages of Concurrency Developers: A Large-Scale Study on Stack Overflow
Authors:
Mehdi Bagherzadeh,
Syed Ahmed,
Srilakshmi Sripathi,
Raffi Khatchadourian
Abstract:
Context: Software developers are increasingly facing the challenges of writing code that is not only concurrent but also correct.
Objective: To help these developers, it is necessary to understand concurrency topics they are interested in, their difficulty in finding answers for questions in these topics, their sentiment for these topics, and how they use concurrency tools and techniques to guarantee correctness.
Method: We conduct a large-scale study on the entirety of Stack Overflow to understand interests, difficulties, sentiment, and tool usages of concurrency developers. We discuss the implications of our findings for the practice, research, and education of concurrent software development, and investigate the relation of our findings with the findings of the previous work.
Results: A few findings of our study are: (1) questions that concurrency developers ask can be grouped into a hierarchy with 27 concurrency topics under 8 major categories, (2) thread safety is among the most popular concurrency topics and client-server concurrency is among the least popular, (3) irreproducible behavior is among the most difficult topics and memory consistency is among the least difficult, (4) data scraping is among the most positive concurrency topics and irreproducible behavior is among the most negative, (5) root cause identification has the largest number of questions for usage of data race tools and alternative use has the fewest.
Conclusion: The results of our study can not only help concurrency developers but also concurrency educators and researchers to better decide where to focus their efforts, by trading off one concurrency topic against another.
Submitted 9 September, 2021; v1 submitted 7 September, 2021;
originally announced September 2021.
-
Automated Evolution of Feature Logging Statement Levels Using Git Histories and Degree of Interest
Authors:
Yiming Tang,
Allan Spektor,
Raffi Khatchadourian,
Mehdi Bagherzadeh
Abstract:
Logging -- used for system events and security breaches to describe more informational yet essential aspects of software features -- is pervasive. Given the high transactionality of today's software, logging effectiveness can be reduced by information overload. Log levels help alleviate this problem by correlating a priority to logs that can be later filtered. As software evolves, however, levels of logs documenting surrounding feature implementations may also require modification as features once deemed important may have decreased in urgency and vice-versa. We present an automated approach that assists developers in evolving levels of such (feature) logs. The approach, based on mining Git histories and manipulating a degree of interest (DOI) model, transforms source code to revitalize feature log levels based on the "interestingness" of the surrounding code. Built upon JGit and Mylyn, the approach is implemented as an Eclipse IDE plug-in and evaluated on 18 Java projects with ~3 million lines of code and ~4K log statements. Our tool successfully analyzes 99.22% of logging statements, increases log level distributions by ~20%, and increases the focus of logs in bug fix contexts ~83% of the time. Moreover, pull (patch) requests were integrated into large and popular open-source projects. The results indicate that the approach is promising in assisting developers in evolving feature log levels.
Submitted 12 September, 2021; v1 submitted 15 April, 2021;
originally announced April 2021.
-
Proactive Empirical Assessment of New Language Feature Adoption via Automated Refactoring: The Case of Java 8 Default Methods
Authors:
Raffi Khatchadourian,
Hidehiko Masuhara
Abstract:
Programming languages and platforms improve over time, sometimes resulting in new language features that offer many benefits. However, despite these benefits, developers may not always be willing to adopt them in their projects for various reasons. In this paper, we describe an empirical study where we assess the adoption of a particular new language feature. Studying how developers use (or do not use) new language features is important in programming language research and engineering because it gives designers insight into the usability of the language to create meaningful programs in that language. This knowledge, in turn, can drive future innovations in the area. Here, we explore Java 8 default methods, which allow interfaces to contain (instance) method implementations.
Default methods can ease interface evolution, make certain ubiquitous design patterns redundant, and improve both modularity and maintainability. A focus of this work is to discover, through a scientific approach and a novel technique, situations where developers found these constructs useful and where they did not, and the reasons for each. Although several studies center around assessing new language features, to the best of our knowledge, this kind of construct has not been previously considered.
Despite their benefits, we found that developers did not adopt default methods in all situations. Our study consisted of submitting pull requests introducing the language feature to 19 real-world, open source Java projects without altering original program semantics. This novel assessment technique is proactive in that the adoption was driven by an automatic refactoring approach rather than waiting for developers to discover and integrate the feature themselves. In this way, we set forth best practices and patterns of using the language feature effectively earlier rather than later and are able to possibly guide (near) future language evolution. We foresee this technique to be useful in assessing other new language features, design patterns, and other programming idioms.
Submitted 27 March, 2018;
originally announced March 2018.