-
Deoptless: Speculation with Dispatched On-Stack Replacement and Specialized Continuations
Authors:
Olivier Flückiger,
Jan Ječmen,
Sebastián Krynski,
Jan Vitek
Abstract:
Just-in-time compilation provides significant performance improvements for programs written in dynamic languages. These benefits come from the ability of the compiler to speculate about likely cases and generate optimized code for these. Unavoidably, speculations sometimes fail and the optimizations must be reverted. In some pathological cases, this can leave the program stuck with suboptimal code. In this paper we propose deoptless, a technique that replaces deoptimization points with dispatched specialized continuations. The goal of deoptless is to take a step towards providing users with a more transparent performance model in which mysterious slowdowns are less frequent and grave.
Submitted 5 April, 2022; v1 submitted 4 March, 2022;
originally announced March 2022.
-
Type Stability in Julia: Avoiding Performance Pathologies in JIT Compilation (Extended Version)
Authors:
Artem Pelenitsyn,
Julia Belyakova,
Benjamin Chung,
Ross Tate,
Jan Vitek
Abstract:
As a scientific programming language, Julia strives for performance but also provides high-level productivity features. To avoid performance pathologies, Julia users are expected to adhere to a coding discipline that enables so-called type stability. Informally, a function is type stable if the type of the output depends only on the types of the inputs, not their values. This paper provides a formal definition of type stability as well as a stronger property of type groundedness, shows that groundedness enables compiler optimizations, and proves the compiler correct. We also perform a corpus analysis to uncover how these type-related properties manifest in practice.
Submitted 17 November, 2021; v1 submitted 4 September, 2021;
originally announced September 2021.
-
World Age in Julia: Optimizing Method Dispatch in the Presence of Eval (Extended Version)
Authors:
Julia Belyakova,
Benjamin Chung,
Jack Gelinas,
Jameson Nash,
Ross Tate,
Jan Vitek
Abstract:
Dynamic programming languages face semantic and performance challenges in the presence of features, such as eval, that can inject new code into a running program. The Julia programming language introduces the novel concept of world age to insulate optimized code from one of the most disruptive side-effects of eval: changes to the definition of an existing function. This paper provides the first formal semantics of world age in a core calculus named Juliette, and shows how world age enables compiler optimizations, such as inlining, in the presence of eval. While Julia also provides programmers with the means to bypass world age, we found that this mechanism is not used extensively: a static analysis of over 4,000 registered Julia packages shows that only 4-9% of packages bypass world age. This suggests that Julia's semantics aligns with programmer expectations.
Submitted 15 October, 2020; v1 submitted 15 October, 2020;
originally announced October 2020.
-
Sampling Optimized Code for Type Feedback
Authors:
Olivier Flückiger,
Andreas Wälchli,
Sebastián Krynski,
Jan Vitek
Abstract:
To efficiently execute dynamically typed languages, many language implementations have adopted a two-tier architecture. The first tier aims for low-latency startup times and collects dynamic profiles, such as the dynamic types of variables. The second tier provides high throughput using an optimizing compiler that specializes code to the recorded type information. If the program behavior changes to the point that previously unseen types occur in specialized code, that specialized code becomes invalid, it is deoptimized, and control is transferred back to the first-tier execution engine, which will start specializing anew. However, if the program behavior becomes more specific, for instance, if a polymorphic variable becomes monomorphic, nothing changes. Once the program is running optimized code, there are no means to notice that an opportunity for optimization has been missed.
We propose to employ a sampling-based profiler to monitor native code without any instrumentation. The absence of instrumentation means that when the profiler is not active, no overhead is incurred. We present an implementation in the context of the Ř just-in-time, optimizing compiler for the R language. Based on the sampled profiles, we are able to detect when the native code produced by Ř is specialized for stale type feedback and recompile it to more type-specific code. We show that sampling adds an overhead of less than 3% in most cases and up to 9% in a few cases and that it reliably detects stale type feedback within milliseconds.
Submitted 5 October, 2020;
originally announced October 2020.
-
FSE/CACM Rebuttal$^2$: Correcting A Large-Scale Study of Programming Languages and Code Quality in GitHub
Authors:
Emery D. Berger,
Petr Maj,
Olga Vitek,
Jan Vitek
Abstract:
Ray, Devanbu and Filkov issued a rebuttal of our TOPLAS paper "On the Impact of Programming Languages on Code Quality: A Reproduction Study". Our paper reproduced "A Large-Scale Study of Programming Languages and Code Quality in GitHub", which appeared at FSE 2014 and was subsequently republished as a CACM research highlight in 2017. This article is a rebuttal to that rebuttal.
Submitted 26 November, 2019;
originally announced November 2019.
-
Precise Dataflow Analysis of Event-Driven Applications
Authors:
Ming-Ho Yee,
Ayaz Badouraly,
Ondřej Lhoták,
Frank Tip,
Jan Vitek
Abstract:
Event-driven programming is widely used for implementing user interfaces, web applications, and non-blocking I/O. An event-driven program is organized as a collection of event handlers whose execution is triggered by events. Traditional static analysis techniques are unable to reason precisely about event-driven code because they conservatively assume that event handlers may execute in any order. This paper proposes an automatic transformation from Interprocedural Finite Distributive Subset (IFDS) problems to Interprocedural Distributive Environment (IDE) problems as a general solution to obtain precise static analysis of event-driven applications; problems in both forms can be solved by existing implementations. Our contribution is to show how to improve analysis precision by automatically enriching the former with information about the state of event handlers to filter out infeasible paths. We prove the correctness of our transformation and report on experiments with a proof-of-concept implementation for a subset of JavaScript.
Submitted 28 October, 2019;
originally announced October 2019.
-
On the Design, Implementation, and Use of Laziness in R
Authors:
Aviral Goel,
Jan Vitek
Abstract:
The R programming language has been lazy for over twenty-five years. This paper presents a review of the design and implementation of call-by-need in R, and a data-driven study of how generations of programmers have put laziness to use in their code. We analyze 16,707 packages and observe the creation of 270.9 B promises. Our data suggests that there is little supporting evidence to assert that programmers use laziness to avoid unnecessary computation or to operate over infinite data structures. For the most part R code appears to have been written without reliance on, and in many cases even knowledge of, delayed argument evaluation. The only significant exception is a small number of packages which leverage call-by-need for meta-programming.
Submitted 19 September, 2019;
originally announced September 2019.
-
Scala Implicits are Everywhere: A large-scale study of the use of Implicits in the wild
Authors:
Filip Křikava,
Heather Miller,
Jan Vitek
Abstract:
The Scala programming language offers two distinctive language features, implicit parameters and implicit conversions, often referred to together as implicits. Announced without fanfare in 2004, implicits have quickly grown to become a widely and pervasively used feature of the language. They provide a way to reduce the boilerplate code in Scala programs. They are also used to implement certain language features without having to modify the compiler. We report on a large-scale study of the use of implicits in the wild. For this, we analyzed 7,280 Scala projects hosted on GitHub, spanning over 8.1M call sites involving implicits and 370.7K implicit declarations across 18.7M lines of Scala code.
Submitted 12 September, 2019; v1 submitted 21 August, 2019;
originally announced August 2019.
-
R Melts Brains -- An IR for First-Class Environments and Lazy Effectful Arguments
Authors:
Olivier Flückiger,
Guido Chari,
Jan Ječmen,
Ming-Ho Yee,
Jakob Hain,
Jan Vitek
Abstract:
The R programming language combines a number of features considered hard to analyze and implement efficiently: dynamic typing, reflection, lazy evaluation, vectorized primitive types, first-class closures, and extensive use of native code. Additionally, variable scopes are reified at runtime as first-class environments. The combination of these features renders most static program analysis techniques impractical, and thus, compiler optimizations based on them ineffective. We present our work on PIR, an intermediate representation with explicit support for first-class environments and effectful lazy evaluation. We describe two dataflow analyses on PIR: the first enables reasoning about variables and their environments, and the second infers where arguments are evaluated. Leveraging their results, we show how to elide environment creation and inline functions.
Submitted 5 September, 2019; v1 submitted 11 July, 2019;
originally announced July 2019.
-
On the Impact of Programming Languages on Code Quality
Authors:
Emery D. Berger,
Celeste Hollenbeck,
Petr Maj,
Olga Vitek,
Jan Vitek
Abstract:
This paper is a reproduction of work by Ray et al. which claimed to have uncovered a statistically significant association between eleven programming languages and software defects in projects hosted on GitHub. First we conduct an experimental repetition; the repetition is only partially successful, but it does validate one of the key claims of the original work about the association of ten programming languages with defects. Next, we conduct a complete, independent reanalysis of the data and statistical modeling steps of the original study. We uncover a number of flaws that undermine the conclusions of the original study, as only four languages are found to have a statistically significant association with defects, and even for those the effect size is exceedingly small. We conclude with some additional sources of bias that should be investigated in follow-up work and a few best practice recommendations for similar efforts.
Submitted 24 April, 2019; v1 submitted 29 January, 2019;
originally announced January 2019.
-
Feature-Specific Profiling
Authors:
Leif Andersen,
Vincent St-Amour,
Jan Vitek,
Matthias Felleisen
Abstract:
While high-level languages come with significant readability and maintainability benefits, their performance remains difficult to predict. For example, programmers may unknowingly use language features inappropriately, which causes their programs to run slower than expected. To address this issue, we introduce feature-specific profiling, a technique that reports performance costs in terms of linguistic constructs. Feature-specific profilers help programmers find expensive uses of specific features of their language. We describe the architecture of a profiler that implements our approach, explain prototypes of the profiler for two languages with different characteristics and implementation strategies, and provide empirical evidence for the approach's general usefulness as a performance debugging tool.
Submitted 11 September, 2018;
originally announced September 2018.
-
Correctness of Speculative Optimizations with Dynamic Deoptimization
Authors:
Olivier Flückiger,
Gabriel Scherer,
Ming-Ho Yee,
Aviral Goel,
Amal Ahmed,
Jan Vitek
Abstract:
High-performance dynamic language implementations make heavy use of speculative optimizations to achieve speeds close to statically compiled languages. These optimizations are typically performed by a just-in-time compiler that generates code under a set of assumptions about the state of the program and its environment. In certain cases, a program may execute code compiled under assumptions that are no longer valid. The implementation must then deoptimize the program on-the-fly; this entails finding semantically equivalent code that does not rely on invalid assumptions, translating program state to that expected by the target code, and transferring control. This paper looks at the interaction between optimization and deoptimization, and shows that reasoning about speculation is surprisingly easy when assumptions are made explicit in the program representation. This insight is demonstrated on a compiler intermediate representation, named sourir, modeled after the high-level representation for a dynamic language. Traditional compiler optimizations such as constant folding, dead code elimination, and function inlining are shown to be correct in the presence of assumptions. Furthermore, the paper establishes the correctness of compiler transformations specific to deoptimization: namely unrestricted deoptimization, predicate hoisting, and assume composition.
Submitted 15 November, 2017; v1 submitted 8 November, 2017;
originally announced November 2017.
-
Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems - Report on the Workshop ICOOOLPS'2007 at ECOOP'07
Authors:
Olivier Zendra,
Eric Jul,
Roland Ducournau,
Etienne Gagnon,
Richard E. Jones,
Chandra Krintz,
Philippe Mulet,
Jan Vitek
Abstract:
ICOOOLPS'2007 was the second edition of the ECOOP-ICOOOLPS workshop. ICOOOLPS intends to bring researchers and practitioners both from academia and industry together, with a spirit of openness, to try and identify and begin to address the numerous and very varied issues of optimization. After a first successful edition, this second one put a stronger emphasis on exchanges and discussions amongst the participants, progressing on the bases set last year in Nantes. The workshop attendance was a success, since the 30-people limit we had set was reached about 2 weeks before the workshop itself. Some of the discussions (e.g. annotations) were so successful that they would have required even more time than we were able to dedicate to them. That's one area we plan to further improve for the next edition.
Submitted 7 December, 2007;
originally announced December 2007.
-
Implementation, Compilation, Optimization of Object-Oriented Languages, Programs and Systems - Report on the Workshop ICOOOLPS'2006 at ECOOP'06
Authors:
Roland Ducournau,
Etienne Gagnon,
Chandra Krintz,
Philippe Mulet,
Jan Vitek,
Olivier Zendra
Abstract:
ICOOOLPS'2006 was the first edition of the ECOOP-ICOOOLPS workshop. It intended to bring researchers and practitioners both from academia and industry together, with a spirit of openness, to try and identify and begin to address the numerous and very varied issues of optimization. This succeeded, as can be seen from the papers, the attendance and the liveliness of the discussions that took place during and after the workshop, not to mention a few new cooperations or postdoctoral contracts. The 22 talented people from different groups who participated were unanimous in appreciating this first edition and recommending that ICOOOLPS be continued next year. A community is thus beginning to form, and should be reinforced by a second edition next year, with all the improvements that emerged from this first edition.
Submitted 15 October, 2007;
originally announced October 2007.