-
Quasi-Monte Carlo Software
Authors:
Sou-Cheng T. Choi,
Fred J. Hickernell,
R. Jagadeeswaran,
Michael J. McCourt,
Aleksei G. Sorokin
Abstract:
Practitioners wishing to experience the efficiency gains from using low discrepancy sequences need correct, robust, well-written software. This article, based on our MCQMC 2020 tutorial, describes some of the better quasi-Monte Carlo (QMC) software available. We highlight the key software components required by QMC to approximate multivariate integrals or expectations of functions of vector random…
▽ More
Practitioners wishing to experience the efficiency gains from using low discrepancy sequences need correct, robust, well-written software. This article, based on our MCQMC 2020 tutorial, describes some of the better quasi-Monte Carlo (QMC) software available. We highlight the key software components required by QMC to approximate multivariate integrals or expectations of functions of vector random variables. We have combined these components in QMCPy, a Python open-source library, which we hope will draw the support of the QMC community. Here we introduce QMCPy.
△ Less
Submitted 14 October, 2021; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Real-time Prediction of Intermediate-Horizon Automotive Collision Risk
Authors:
Blake Wulfe,
Sunil Chintakindi,
Sou-Cheng T. Choi,
Rory Hartong-Redden,
Anuradha Kodali,
Mykel J. Kochenderfer
Abstract:
Advanced collision avoidance and driver hand-off systems can benefit from the ability to accurately predict, in real time, the probability a vehicle will be involved in a collision within an intermediate horizon of 10 to 20 seconds. The rarity of collisions in real-world data poses a significant challenge to develo** this capability because, as we demonstrate empirically, intermediate-horizon ri…
▽ More
Advanced collision avoidance and driver hand-off systems can benefit from the ability to accurately predict, in real time, the probability a vehicle will be involved in a collision within an intermediate horizon of 10 to 20 seconds. The rarity of collisions in real-world data poses a significant challenge to develo** this capability because, as we demonstrate empirically, intermediate-horizon risk prediction depends heavily on high-dimensional driver behavioral features. As a result, a large amount of data is required to fit an effective predictive model. In this paper, we assess whether simulated data can help alleviate this issue. Focusing on highway driving, we present a three-step approach for generating data and fitting a predictive model capable of real-time prediction. First, high-risk automotive scenes are generated using importance sampling on a learned Bayesian network scene model. Second, collision risk is estimated through Monte Carlo simulation. Third, a neural network domain adaptation model is trained on real and simulated data to address discrepancies between the two domains. Experiments indicate that simulated data can mitigate issues resulting from collision rarity, thereby improving risk prediction in real-world data.
△ Less
Submitted 5 February, 2018;
originally announced February 2018.
-
Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)
Authors:
Daniel S. Katz,
Sou-Cheng T. Choi,
Kyle E. Niemeyer,
James Hetherington,
Frank Löffler,
Dan Gunter,
Ray Idaszak,
Steven R. Brandt,
Mark A. Miller,
Sandra Gesing,
Nick D. Jones,
Nic Weber,
Suresh Marru,
Gabrielle Allen,
Birgit Penzenstadler,
Colin C. Venters,
Ethan Davis,
Lorraine Hwang,
Ilian Todorov,
Abani Patra,
Miguel de Val-Borro
Abstract:
This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustain…
▽ More
This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including develo** pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group's future activities. The main challenge left by the workshop is to see if the groups will execute these activities that they have scheduled, and how the WSSSPE community can encourage this to happen.
△ Less
Submitted 6 February, 2016;
originally announced February 2016.
-
Machine Learning for Machine Data from a CATI Network
Authors:
Sou-Cheng T. Choi
Abstract:
This is a machine learning application paper involving big data. We present high-accuracy prediction methods of rare events in semi-structured machine log files, which are produced at high velocity and high volume by NORC's computer-assisted telephone interviewing (CATI) network for conducting surveys. We judiciously apply natural language processing (NLP) techniques and data-mining strategies to…
▽ More
This is a machine learning application paper involving big data. We present high-accuracy prediction methods of rare events in semi-structured machine log files, which are produced at high velocity and high volume by NORC's computer-assisted telephone interviewing (CATI) network for conducting surveys. We judiciously apply natural language processing (NLP) techniques and data-mining strategies to train effective learning and prediction models for classifying uncommon error messages in the log---without access to source code, updated documentation or dictionaries. In particular, our simple but effective approach of features preallocation for learning from imbalanced data coupled with naive Bayes classifiers can be conceivably generalized to supervised or semi-supervised learning and prediction methods for other critical events such as cyberattack detection.
△ Less
Submitted 2 October, 2015;
originally announced October 2015.
-
Report on the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2)
Authors:
Daniel S. Katz,
Sou-Cheng T. Choi,
Nancy Wilkins-Diehr,
Neil Chue Hong,
Colin C. Venters,
James Howison,
Frank Seinstra,
Matthew Jones,
Karen Cranston,
Thomas L. Clune,
Miguel de Val-Borro,
Richard Littauer
Abstract:
This technical report records and discusses the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2). The report includes a description of the alternative, experimental submission and review process, two workshop keynote presentations, a series of lightning talks, a discussion on sustainability, and five discussions from the topic areas of exploring sustainabilit…
▽ More
This technical report records and discusses the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2). The report includes a description of the alternative, experimental submission and review process, two workshop keynote presentations, a series of lightning talks, a discussion on sustainability, and five discussions from the topic areas of exploring sustainability; software development experiences; credit & incentives; reproducibility & reuse & sharing; and code testing & code review. For each topic, the report includes a list of tangible actions that were proposed and that would lead to potential change. The workshop recognized that reliance on scientific software is pervasive in all areas of world-leading research today. The workshop participants then proceeded to explore different perspectives on the concept of sustainability. Key enablers and barriers of sustainable scientific software were identified from their experiences. In addition, recommendations with new requirements such as software credit files and software prize frameworks were outlined for improving practices in sustainable software engineering. There was also broad consensus that formal training in software development or engineering was rare among the practitioners. Significant strides need to be made in building a sense of community via training in software and technical practices, on increasing their size and scope, and on better integrating them directly into graduate education programs. Finally, journals can define and publish policies to improve reproducibility, whereas reviewers can insist that authors provide sufficient information and access to data and software to allow them reproduce the results in the paper. Hence a list of criteria is compiled for journals to provide to reviewers so as to make it easier to review software submitted for publication as a "Software Paper."
△ Less
Submitted 8 July, 2015; v1 submitted 7 July, 2015;
originally announced July 2015.
-
GAIL---Guaranteed Automatic Integration Library in MATLAB: Documentation for Version 2.1
Authors:
Sou-Cheng T. Choi,
Yuhan Ding,
Fred J. Hickernell,
Lan Jiang,
Lluís Antoni Jiménez Rugama,
Xin Tong,
Yizhi Zhang,
Xuan Zhou
Abstract:
Automatic and adaptive approximation, optimization, or integration of functions in a cone with guarantee of accuracy is a relatively new paradigm. Our purpose is to create an open-source MATLAB package, Guaranteed Automatic Integration Library (GAIL), following the philosophy of reproducible research and sustainable practices of robust scientific software development. For our conviction that true…
▽ More
Automatic and adaptive approximation, optimization, or integration of functions in a cone with guarantee of accuracy is a relatively new paradigm. Our purpose is to create an open-source MATLAB package, Guaranteed Automatic Integration Library (GAIL), following the philosophy of reproducible research and sustainable practices of robust scientific software development. For our conviction that true scholarship in computational sciences are characterized by reliable reproducibility, we employ the best practices in mathematical research and software engineering known to us and available in MATLAB. This document describes the key features of functions in GAIL, which includes one-dimensional function approximation and minimization using linear splines, one-dimensional numerical integration using trapezoidal rule, and last but not least, mean estimation and multidimensional integration by Monte Carlo methods or Quasi Monte Carlo methods.
△ Less
Submitted 23 March, 2015; v1 submitted 23 March, 2015;
originally announced March 2015.
-
Summary of the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1)
Authors:
Daniel S. Katz,
Sou-Cheng T. Choi,
Hilmar Lapp,
Ketan Maheshwari,
Frank Löffler,
Matthew Turk,
Marcus D. Hanwell,
Nancy Wilkins-Diehr,
James Hetherington,
James Howison,
Shel Swenson,
Gabrielle D. Allen,
Anne C. Elster,
Bruce Berriman,
Colin Venters
Abstract:
Challenges related to development, deployment, and maintenance of reusable software for science are becoming a growing concern. Many scientists' research increasingly depends on the quality and availability of software upon which their works are built. To highlight some of these issues and share experiences, the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1)…
▽ More
Challenges related to development, deployment, and maintenance of reusable software for science are becoming a growing concern. Many scientists' research increasingly depends on the quality and availability of software upon which their works are built. To highlight some of these issues and share experiences, the First Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE1) was held in November 2013 in conjunction with the SC13 Conference. The workshop featured keynote presentations and a large number (54) of solicited extended abstracts that were grouped into three themes and presented via panels. A set of collaborative notes of the presentations and discussion was taken during the workshop.
Unique perspectives were captured about issues such as comprehensive documentation, development and deployment practices, software licenses and career paths for developers. Attribution systems that account for evidence of software contribution and impact were also discussed. These include mechanisms such as Digital Object Identifiers, publication of "software papers", and the use of online systems, for example source code repositories like GitHub.
This paper summarizes the issues and shared experiences that were discussed, including cross-cutting issues and use cases. It joins a nascent literature seeking to understand what drives software work in science, and how it is impacted by the reward systems of science. These incentives can determine the extent to which developers are motivated to build software for the long-term, for the use of others, and whether to work collaboratively or separately. It also explores community building, leadership, and dynamics in relation to successful scientific software.
△ Less
Submitted 12 June, 2014; v1 submitted 29 April, 2014;
originally announced April 2014.
-
ALGORITHM 937: MINRES-QLP for Singular Symmetric and Hermitian Linear Equations and Least-Squares Problems
Authors:
Sou-Cheng T. Choi,
Michael A. Saunders
Abstract:
We describe algorithm MINRES-QLP and its FORTRAN 90 implementation for solving symmetric or Hermitian linear systems or least-squares problems. If the system is singular, MINRES-QLP computes the unique minimum-length solution (also known as the pseudoinverse solution), which generally eludes MINRES. In all cases, it overcomes a potential instability in the original MINRES algorithm. A positive-def…
▽ More
We describe algorithm MINRES-QLP and its FORTRAN 90 implementation for solving symmetric or Hermitian linear systems or least-squares problems. If the system is singular, MINRES-QLP computes the unique minimum-length solution (also known as the pseudoinverse solution), which generally eludes MINRES. In all cases, it overcomes a potential instability in the original MINRES algorithm. A positive-definite preconditioner may be supplied. Our FORTRAN 90 implementation illustrates a design pattern that allows users to make problem data known to the solver but hidden and secure from other program units. In particular, we circumvent the need for reverse communication. While we focus here on a FORTRAN 90 implementation, we also provide and maintain MATLAB versions of MINRES and MINRES-QLP.
△ Less
Submitted 26 March, 2015; v1 submitted 12 January, 2013;
originally announced January 2013.
-
MINRES-QLP: a Krylov subspace method for indefinite or singular symmetric systems
Authors:
Sou-Cheng T. Choi,
Christopher C. Paige,
Michael A. Saunders
Abstract:
CG, SYMMLQ, and MINRES are Krylov subspace methods for solving symmetric systems of linear equations. When these methods are applied to an incompatible system (that is, a singular symmetric least-squares problem), CG could break down and SYMMLQ's solution could explode, while MINRES would give a least-squares solution but not necessarily the minimum-length (pseudoinverse) solution. This understand…
▽ More
CG, SYMMLQ, and MINRES are Krylov subspace methods for solving symmetric systems of linear equations. When these methods are applied to an incompatible system (that is, a singular symmetric least-squares problem), CG could break down and SYMMLQ's solution could explode, while MINRES would give a least-squares solution but not necessarily the minimum-length (pseudoinverse) solution. This understanding motivates us to design a MINRES-like algorithm to compute minimum-length solutions to singular symmetric systems.
MINRES uses QR factors of the tridiagonal matrix from the Lanczos process (where R is upper-tridiagonal). MINRES-QLP uses a QLP decomposition (where rotations on the right reduce R to lower-tridiagonal form). On ill-conditioned systems (singular or not), MINRES-QLP can give more accurate solutions than MINRES. We derive preconditioned MINRES-QLP, new stop** rules, and better estimates of the solution and residual norms, the matrix norm, and the condition number.
△ Less
Submitted 26 March, 2015; v1 submitted 21 March, 2010;
originally announced March 2010.