-
Dependently Typing R Vectors, Arrays, and Matrices
Authors:
John Wrenn,
Anjali Pal,
Alexa VanHattum,
Shriram Krishnamurthi
Abstract:
The R programming language is widely used in large-scale data analyses. It contains especially rich built-in support for dealing with vectors, arrays, and matrices. These operations feature prominently in the applications that form R's raison d'être, making their behavior worth understanding. Furthermore, ostensibly for programmer convenience, their behavior in R is a notable extension over the corresponding operations in mathematics, thereby offering some challenges for specification and static verification.
We report on progress towards statically typing this aspect of the R language. The interesting aspects of typing, in this case, warn programmers about violating bounds, so the types must necessarily be dependent. We explain the ways in which R extends standard mathematical behavior. We then show how R's behavior can be specified in LiquidHaskell, a dependently-typed extension to Haskell. In the general case, actually verifying library and client code is currently beyond LiquidHaskell's reach; therefore, this work provides challenges and opportunities both for typing R and for progress in dependently-typed programming languages.
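As one concrete illustration of R extending standard mathematical behavior, consider R's recycling rule for elementwise arithmetic on vectors of unequal length. The sketch below models that rule in Python. The function name `recycled_add` is hypothetical, and the abstract does not say which extensions the paper covers, so treating recycling as representative is an assumption.

```python
import warnings


def recycled_add(xs, ys):
    """Elementwise addition with R-style recycling of the shorter operand.

    Hypothetical helper for illustration; it models R's documented rule,
    not any code from the paper.
    """
    if not xs or not ys:
        # In R, arithmetic with a zero-length vector yields a zero-length result.
        return []
    n = max(len(xs), len(ys))
    if n % min(len(xs), len(ys)) != 0:
        # R still computes a result here, but emits a warning.
        warnings.warn(
            "longer object length is not a multiple of shorter object length"
        )
    # Indices wrap around, so the shorter operand is reused cyclically.
    return [xs[i % len(xs)] + ys[i % len(ys)] for i in range(n)]


print(recycled_add([1, 2, 3, 4], [10, 20]))
```

In mathematics, adding a length-4 vector to a length-2 vector is simply undefined. A type system that wants to warn about the non-multiple case has to relate the two lengths to each other, which is the kind of constraint that calls for dependent types.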
Submitted 9 April, 2023;
originally announced April 2023.
-
Automated, Targeted Testing of Property-Based Testing Predicates
Authors:
Tim Nelson,
Elijah Rivera,
Sam Soucie,
Thomas Del Vecchio,
John Wrenn,
Shriram Krishnamurthi
Abstract:
Context: This work is based on property-based testing (PBT). PBT is an increasingly important form of software testing. Furthermore, it serves as a concrete gateway into the abstract area of formal methods. Specifically, we focus on students learning PBT methods.
Inquiry: How well do students do at PBT? Our goal is to assess the quality of the predicates they write as part of PBT. Prior work introduced the idea of decomposing the predicate's property into a conjunction of independent subproperties. Testing the predicate against each subproperty gives a "semantic" understanding of their performance.
Approach: The notion of independence of subproperties both seems intuitive and was an important condition in prior work. First, we show that this condition is overly restrictive and might hide valuable information: it both undercounts errors and makes it hard to capture misconceptions. Second, we introduce two forms of automation, one based on PBT tools and the other on SAT-solving, to enable testing of student predicates. Third, we compare the output of these automated tools against manually-constructed tests. Fourth, we also measure the performance of those tools. Finally, we re-assess student performance reported in prior work.
Knowledge: We show the difficulty caused by the independent subproperty requirement. We provide insight into how to use automation effectively to assess PBT predicates. In particular, we discuss the steps we had to take to beat human performance. We also provide insight into how to make the automation work efficiently. Finally, we present a much richer account than prior work of how students did.
Grounding: Our methods are grounded in mathematical logic. We also make use of well-understood principles of test generation from more formal specifications. This combination ensures the soundness of our work. We use standard methods to measure performance.
Importance: As both educators and programmers, we believe PBT is a valuable tool for students to learn, and its importance will only grow as more developers appreciate its value. Effective teaching requires a clear understanding of student knowledge and progress. Our methods enable a rich and automated analysis of student performance on PBT that yields insight into their understanding and can capture misconceptions. We therefore expect these results to be valuable to educators.
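To make the idea of testing a predicate against subproperties concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the choice of sorting as the problem, the two subproperties, the flawed `student_predicate`, and the hand-written cases are all hypothetical and are not taken from the paper or its tooling.

```python
from collections import Counter


# Subproperties whose conjunction defines a correct sort (illustrative choice).
def is_ordered(inp, out):
    return all(a <= b for a, b in zip(out, out[1:]))


def same_elements(inp, out):
    return Counter(inp) == Counter(out)


SUBPROPERTIES = {"ordered": is_ordered, "same_elements": same_elements}


# Hypothetical flawed student predicate: it compares sets, so it ignores
# how many times each element occurs.
def student_predicate(inp, out):
    return is_ordered(inp, out) and set(inp) == set(out)


def missed_subproperties(predicate, cases):
    """Names of subproperties violated by some case the predicate accepts."""
    missed = set()
    for inp, out in cases:
        if predicate(inp, out):
            missed |= {
                name
                for name, prop in SUBPROPERTIES.items()
                if not prop(inp, out)
            }
    return missed


CASES = [
    ([3, 1, 2], [1, 2, 3]),  # a correct sort
    ([3, 1, 2], [3, 1, 2]),  # violates "ordered"
    ([2, 2, 1], [1, 2]),     # violates "same_elements" (drops a duplicate)
]

print(missed_subproperties(student_predicate, CASES))
```

Reporting which subproperty a wrongly accepted case violates gives a per-subproperty view of the student's predicate, rather than a single pass/fail verdict.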
Submitted 19 November, 2021;
originally announced November 2021.
-
Using Relational Problems to Teach Property-Based Testing
Authors:
John Wrenn,
Tim Nelson,
Shriram Krishnamurthi
Abstract:
Context: The success of QuickCheck has led to the development of property-based testing (PBT) libraries for many languages, and the process is getting increasing attention. However, unlike regular testing, PBT is not widespread in collegiate curricula. Furthermore, the value of PBT is not limited to software testing. The growing use of formal methods and the growth of software synthesis both create demand for techniques to train students and developers in the art of specification writing. We posit that PBT forms a strong bridge between testing and the act of specification: it's a form of testing where the tester is actually writing abstract specifications.
Inquiry: Even well-informed technologists mention the difficulty of finding good motivating examples for its use. We take steps to fill this lacuna.
Approach & Knowledge: We find that the use of "relational" problems -- those for which an input may admit multiple valid outputs -- easily motivates the use of PBT. We also notice that such problems are readily available in the computer science pantheon of problems (e.g., many graph and sorting algorithms). We have been using these for some years now to teach PBT in collegiate courses.
Grounding: In this paper, we describe the problems we use and report on students' completion of them. We believe the problems overcome some of the motivation issues described above. We also show that students can do quite well at PBT for these problems, suggesting that the topic is well within their reach. In the process, we introduce a simple method to evaluate the accuracy of their specifications, and use it to characterize their common mistakes.
Importance: Based on our findings, we believe that relational problems are an underutilized motivating example for PBT. We hope this paper initiates a catalog of such problems for educators (and developers) to use, and also provides a concrete (though by no means exclusive) method to analyze the quality of PBT.
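A relational problem can be sketched as follows, using topological sorting as the example: one DAG usually has several valid orderings, so comparing against a single expected output does not work, and a validity predicate is needed instead. The code is a minimal Python sketch using only the standard library. The function names, the random generator, and the choice of topological sort are assumptions for illustration, not the paper's course materials.

```python
import random


def topo_sort(n, edges):
    """Implementation under test: Kahn's algorithm over nodes 0..n-1."""
    indegree = [0] * n
    successors = [[] for _ in range(n)]
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1
    ready = [v for v in range(n) if indegree[v] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order


def is_valid_topo_order(n, edges, order):
    """The property: every node appears once, and every edge points forward."""
    if sorted(order) != list(range(n)):
        return False
    position = {v: i for i, v in enumerate(order)}
    return all(position[u] < position[v] for u, v in edges)


def random_dag(rng, max_nodes=8):
    """Random DAG: edges only run from lower to higher labels, so no cycles."""
    n = rng.randint(1, max_nodes)
    edges = [
        (u, v)
        for u in range(n)
        for v in range(u + 1, n)
        if rng.random() < 0.3
    ]
    return n, edges


def check_property(trials=200, seed=0):
    """Run the implementation on random inputs and check the property."""
    rng = random.Random(seed)
    for _ in range(trials):
        n, edges = random_dag(rng)
        if not is_valid_topo_order(n, edges, topo_sort(n, edges)):
            return False
    return True


print(check_property())
```

For the graph with three nodes and the single edge `(0, 2)`, both `[0, 1, 2]` and `[1, 0, 2]` satisfy the predicate, which is why a test written as an equality check against one expected answer would reject correct implementations.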
Submitted 30 October, 2020;
originally announced October 2020.