-
Defining a Reference Architecture for Edge Systems in Highly-Uncertain Environments
Authors:
Kevin Pitstick,
Marc Novakouski,
Grace A. Lewis,
Ipek Ozkaya
Abstract:
Increasing rate of progress in hardware and artificial intelligence (AI) solutions is enabling a range of software systems to be deployed closer to their users, increasing application of edge software system paradigms. Edge systems support scenarios in which computation is placed closer to where data is generated and needed, and provide benefits such as reduced latency, bandwidth optimization, and higher resiliency and availability. Users who operate in highly-uncertain and resource-constrained environments, such as first responders, law enforcement, and soldiers, can greatly benefit from edge systems to support timelier decision making. Unfortunately, understanding how different architecture approaches for edge systems impact priority quality concerns is largely neglected by industry and research, yet crucial for national and local safety, optimal resource utilization, and timely decision making. Much of industry is focused on the hardware and networking aspects of edge systems, with very little attention to the software that enables edge capabilities. This paper presents our work to fill this gap, defining a reference architecture for edge systems in highly-uncertain environments, and showing examples of how it has been implemented in practice.
Submitted 12 June, 2024;
originally announced June 2024.
-
Using Quality Attribute Scenarios for ML Model Test Case Generation
Authors:
Rachel Brower-Sinning,
Grace A. Lewis,
Sebastián Echeverría,
Ipek Ozkaya
Abstract:
Testing of machine learning (ML) models is a known challenge identified by researchers and practitioners alike. Unfortunately, current practice for ML model testing prioritizes testing for model performance, while often neglecting the requirements and constraints of the ML-enabled system that integrates the model. This limited view of testing leads to failures during integration, deployment, and operations, contributing to the difficulties of moving models from development to production. This paper presents an approach based on quality attribute (QA) scenarios to elicit and define system- and model-relevant test cases for ML models. The QA-based approach described in this paper has been integrated into MLTE, a process and tool to support ML model test and evaluation. Feedback from users of MLTE highlights its effectiveness in testing beyond model performance and identifying failures early in the development process.
Submitted 12 June, 2024;
originally announced June 2024.
-
A Synthesis of Green Architectural Tactics for ML-Enabled Systems
Authors:
Heli Järvenpää,
Patricia Lago,
Justus Bogner,
Grace Lewis,
Henry Muccini,
Ipek Ozkaya
Abstract:
The rapid adoption of artificial intelligence (AI) and machine learning (ML) has generated growing interest in understanding their environmental impact and the challenges associated with designing environmentally friendly ML-enabled systems. While Green AI research, i.e., research that tries to minimize the energy footprint of AI, is receiving increasing attention, very few concrete guidelines are available on how ML-enabled systems can be designed to be more environmentally sustainable. In this paper, we provide a catalog of 30 green architectural tactics for ML-enabled systems to fill this gap. An architectural tactic is a high-level design technique to improve software quality, in our case environmental sustainability. We derived the tactics from the analysis of 51 peer-reviewed publications that primarily explore Green AI, and validated them using a focus group approach with three experts. The 30 tactics we identified are intended to serve as an initial reference guide for further exploration into Green AI from a software engineering perspective, and assist in designing sustainable ML-enabled systems. To enhance transparency and facilitate their widespread use and extension, we make the tactics available online in easily consumable formats. Widespread adoption of these tactics has the potential to substantially reduce the societal impact of ML-enabled systems regarding their energy and carbon footprint.
Submitted 15 December, 2023;
originally announced December 2023.
-
Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
Authors:
Chenyang Yang,
Rishabh Rustogi,
Rachel Brower-Sinning,
Grace A. Lewis,
Christian Kästner,
Tongshuang Wu
Abstract:
Current model testing work has mostly focused on creating test cases. Identifying what to test is a step that is largely ignored and poorly supported. We propose Weaver, an interactive tool that supports requirements elicitation for guiding model testing. Weaver uses large language models to generate knowledge bases and recommends concepts from them interactively, allowing testers to elicit requirements for further testing. Weaver provides rich external knowledge to testers and encourages testers to systematically explore diverse concepts beyond their own biases. In a user study, we show that both NLP experts and non-experts identified more, as well as more diverse concepts worth testing when using Weaver. Collectively, they found more than 200 failing test cases for stance detection with zero-shot ChatGPT. Our case studies further show that Weaver can help practitioners test models in real-world settings, where developers define more nuanced application scenarios (e.g., code understanding and transcript summarization) using LLMs.
Submitted 14 October, 2023;
originally announced October 2023.
-
A Dataset and Analysis of Open-Source Machine Learning Products
Authors:
Nadia Nahar,
Haoran Zhang,
Grace Lewis,
Shurui Zhou,
Christian Kästner
Abstract:
Machine learning (ML) components are increasingly incorporated into software products, yet developers face challenges in transitioning from ML prototypes to products. Academic researchers struggle to propose solutions to these challenges and evaluate interventions because they often do not have access to closed-source ML products from industry. In this study, we define and identify open-source ML products, curating a dataset of 262 repositories from GitHub, to facilitate further research and education. As a start, we explore six broad research questions related to different development activities and report 21 findings from a sample of 30 ML products from the dataset. Our findings reveal a variety of development practices and architectural decisions surrounding different types and uses of ML models that offer ample opportunities for future research innovations. We also find very little evidence of industry best practices such as model testing and pipeline automation within the open-source ML products, which leaves room for further investigation to understand its potential impact on the development and eventual end-user experience for the products.
Submitted 8 August, 2023;
originally announced August 2023.
-
A Meta-Summary of Challenges in Building Products with ML Components -- Collecting Experiences from 4758+ Practitioners
Authors:
Nadia Nahar,
Haoran Zhang,
Grace Lewis,
Shurui Zhou,
Christian Kästner
Abstract:
Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing challenges. Many researchers have invested significant effort in understanding the challenges of industry practitioners working on building products with ML components, through interviews and surveys with practitioners. With the intention to aggregate and present their collective findings, we conduct a meta-summary study: We collect 50 relevant papers that together interacted with over 4758 practitioners using guidelines for systematic literature reviews. We then collected, grouped, and organized the over 500 mentions of challenges within those papers. We highlight the most commonly reported challenges and hope this meta-summary will be a useful resource for the research community to prioritize research and education in this field.
Submitted 31 March, 2023;
originally announced April 2023.
-
MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities
Authors:
Katherine R. Maffey,
Kyle Dotterrer,
Jennifer Niemann,
Iain Cruickshank,
Grace A. Lewis,
Christian Kästner
Abstract:
Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.
Submitted 3 March, 2023;
originally announced March 2023.
-
Capabilities for Better ML Engineering
Authors:
Chenyang Yang,
Rachel Brower-Sinning,
Grace A. Lewis,
Christian Kästner,
Tongshuang Wu
Abstract:
In spite of machine learning's rapid growth, its engineering support is scattered in many forms, and tends to favor certain engineering stages, stakeholders, and evaluation preferences. We envision a capability-based framework, which uses fine-grained specifications for ML model behaviors to unite existing efforts towards better ML engineering. We use concrete scenarios (model design, debugging, and maintenance) to articulate capabilities' broad applications across various different dimensions, and their impact on building safer, more generalizable and more trustworthy models that reflect human needs. Through preliminary experiments, we show capabilities' potential for reflecting model generalizability, which can provide guidance for ML engineering process. We discuss challenges and opportunities for capabilities' integration into ML engineering.
Submitted 10 February, 2023; v1 submitted 11 November, 2022;
originally announced November 2022.
-
Self-Adaptation in Industry: A Survey
Authors:
Danny Weyns,
Ilias Gerostathopoulos,
Nadeem Abbas,
Jesper Andersson,
Stefan Biffl,
Premek Brada,
Tomas Bures,
Amleto Di Salle,
Matthias Galster,
Patricia Lago,
Grace Lewis,
Marin Litoiu,
Angelika Musil,
Juergen Musil,
Panos Patros,
Patrizio Pelliccione
Abstract:
Computing systems form the backbone of many areas in our society, from manufacturing to traffic control, healthcare, and financial systems. When software plays a vital role in the design, construction, and operation, these systems are referred to as software-intensive systems. Self-adaptation equips a software-intensive system with a feedback loop that either automates tasks that otherwise need to be performed by human operators or deals with uncertain conditions. Such feedback loops have found their way to a variety of practical applications; typical examples are an elastic cloud to adapt computing resources and automated server management to respond quickly to business needs. To gain insight into the motivations for applying self-adaptation in practice, the problems solved using self-adaptation and how these problems are solved, and the difficulties and risks that industry faces in adopting self-adaptation, we performed a large-scale survey. We received 184 valid responses from practitioners spread over 21 countries. Based on the analysis of the survey data, we provide an empirically grounded overview of state-of-the-practice in the application of self-adaptation. From that, we derive insights for researchers to check their current research with industrial needs, and for practitioners to compare their current practice in applying self-adaptation. These insights also provide opportunities for the application of self-adaptation in practice and pave the way for future industry-research collaborations.
Submitted 6 November, 2022;
originally announced November 2022.
-
Predicting Swarm Equatorial Plasma Bubbles via Machine Learning and Shapley Values
Authors:
S. A. Reddy,
C. Forsyth,
A. Aruliah,
A. Smith,
J. Bortnik,
E. Aa,
D. O. Kataria,
G. Lewis
Abstract:
In this study we present AI Prediction of Equatorial Plasma Bubbles (APE), a machine learning model that can accurately predict the Ionospheric Bubble Index (IBI) on the Swarm spacecraft. IBI is a correlation ($R^2$) between perturbations in plasma density and the magnetic field, whose source can be Equatorial Plasma Bubbles (EPBs). EPBs have been studied for a number of years, but their day-to-day variability has made predicting them a considerable challenge. We build an ensemble machine learning model to predict IBI. We use data from 2014-22 at a resolution of 1sec, and transform it from a time-series into a 6-dimensional space with a corresponding EPB $R^2$ (0-1) acting as the label. APE performs well across all metrics, exhibiting a skill, association and root mean squared error score of 0.96, 0.98 and 0.08 respectively. The model performs best post-sunset, in the American/Atlantic sector, around the equinoxes, and when solar activity is high. This is promising because EPBs are most likely to occur during these periods. Shapley values reveal that F10.7 is the most important feature in driving the predictions, whereas latitude is the least. The analysis also examines the relationship between the features, which reveals new insights into EPB climatology. Finally, the selection of the features means that APE could be expanded to forecasting EPBs following additional investigations into their onset.
Submitted 30 September, 2023; v1 submitted 27 September, 2022;
originally announced September 2022.
-
Multi-level Adversarial Spatio-temporal Learning for Footstep Pressure based FoG Detection
Authors:
Kun Hu,
Shaohui Mei,
Wei Wang,
Kaylena A. Ehgoetz Martens,
Liang Wang,
Simon J. G. Lewis,
David D. Feng,
Zhiyong Wang
Abstract:
Freezing of gait (FoG) is one of the most common symptoms of Parkinson's disease, which is a neurodegenerative disorder of the central nervous system impacting millions of people around the world. To address the pressing need to improve the quality of treatment for FoG, devising a computer-aided detection and quantification tool for FoG has been increasingly important. As a non-invasive technique for collecting motion patterns, the footstep pressure sequences obtained from pressure sensitive gait mats provide a great opportunity for evaluating FoG in the clinic and potentially in the home environment. In this study, FoG detection is formulated as a sequential modelling task and a novel deep learning architecture, namely Adversarial Spatio-temporal Network (ASTN), is proposed to learn FoG patterns across multiple levels. A novel adversarial training scheme is introduced with a multi-level subject discriminator to obtain subject-independent FoG representations, which helps to reduce the over-fitting risk due to the high inter-subject variance. As a result, robust FoG detection can be achieved for unseen subjects. The proposed scheme also sheds light on improving subject-level clinical studies from other scenarios as it can be integrated with many existing deep architectures. To the best of our knowledge, this is one of the first studies of footstep pressure-based FoG detection and the approach of utilizing ASTN is the first deep neural network architecture in pursuit of subject-independent representations. Experimental results on 393 trials collected from 21 subjects demonstrate encouraging performance of the proposed ASTN for FoG detection with an AUC of 0.85.
Submitted 22 September, 2022;
originally announced September 2022.
-
Data Leakage in Notebooks: Static Detection and Better Processes
Authors:
Chenyang Yang,
Rachel A. Brower-Sinning,
Grace A. Lewis,
Christian Kästner
Abstract:
Data science pipelines to train and evaluate models with machine learning may contain bugs just like any other code. Leakage between training and test data can lead to overestimating the model's accuracy during offline evaluations, possibly leading to deployment of low-quality models in production. Such leakage can happen easily by mistake or by following poor practices, but may be tedious and challenging to detect manually. We develop a static analysis approach to detect common forms of data leakage in data science code. Our evaluation shows that our analysis accurately detects data leakage and that such leakage is pervasive among over 100,000 analyzed public notebooks. We discuss how our static analysis approach can help both practitioners and educators, and how leakage prevention can be designed into the development process.
Submitted 7 September, 2022;
originally announced September 2022.
-
Large Scale Distributed Linear Algebra With Tensor Processing Units
Authors:
Adam G. M. Lewis,
Jackson Beall,
Martin Ganahl,
Markus Hauru,
Shrestha Basu Mallick,
Guifre Vidal
Abstract:
We have repurposed Google Tensor Processing Units (TPUs), application-specific chips developed for machine learning, into large-scale dense linear algebra supercomputers. The TPUs' fast inter-core interconnects (ICIs), physically two-dimensional network topology, and high-bandwidth memory (HBM) permit distributed matrix multiplication algorithms to rapidly become computationally bound. In this regime, the matrix-multiply units (MXUs) dominate the runtime, yielding impressive scaling, performance, and raw size: operating in float32 precision, a full 2048-core pod of third generation TPUs can multiply two matrices with linear size $N = 2^{20} = 1\,048\,576$ in about 2 minutes. Via curated algorithms emphasizing large, single-core matrix multiplications, other tasks in dense linear algebra can similarly scale. As examples, we present (i) QR decomposition; (ii) resolution of linear systems; and (iii) the computation of matrix functions by polynomial iteration, demonstrated by the matrix polar factorization.
Submitted 16 December, 2021;
originally announced December 2021.
-
Collaboration Challenges in Building ML-Enabled Systems: Communication, Documentation, Engineering, and Process
Authors:
Nadia Nahar,
Shurui Zhou,
Grace Lewis,
Christian Kästner
Abstract:
The introduction of machine learning (ML) components in software projects has created the need for software engineers to collaborate with data scientists and other specialists. While collaboration can always be challenging, ML introduces additional challenges with its exploratory model development process, additional skills and knowledge needed, difficulties testing ML systems, need for continuous evolution and monitoring, and non-traditional quality requirements such as fairness and explainability. Through interviews with 45 practitioners from 28 organizations, we identified key collaboration challenges that teams face when building and deploying ML systems into production. We report on common collaboration points in the development of production ML systems for requirements, data, and integration, as well as corresponding team patterns and challenges. We find that most of these challenges center around communication, documentation, engineering, and process and collect recommendations to address these challenges.
Submitted 10 February, 2022; v1 submitted 19 October, 2021;
originally announced October 2021.
-
Dim but not entirely dark: Extracting the Galactic Center Excess' source-count distribution with neural nets
Authors:
Florian List,
Nicholas L. Rodd,
Geraint F. Lewis
Abstract:
The two leading hypotheses for the Galactic Center Excess (GCE) in the $\textit{Fermi}$ data are an unresolved population of faint millisecond pulsars (MSPs) and dark-matter (DM) annihilation. The dichotomy between these explanations is typically reflected by modeling them as two separate emission components. However, point-sources (PSs) such as MSPs become statistically degenerate with smooth Poisson emission in the ultra-faint limit (formally where each source is expected to contribute much less than one photon on average), leading to an ambiguity that can render questions such as whether the emission is PS-like or Poissonian in nature ill-defined. We present a conceptually new approach that describes the PS and Poisson emission in a unified manner and only afterwards derives constraints on the Poissonian component from the so obtained results. For the implementation of this approach, we leverage deep learning techniques, centered around a neural network-based method for histogram regression that expresses uncertainties in terms of quantiles. We demonstrate that our method is robust against a number of systematics that have plagued previous approaches, in particular DM / PS misattribution. In the $\textit{Fermi}$ data, we find a faint GCE described by a median source-count distribution (SCD) peaked at a flux of $\sim4 \times 10^{-11} \ \text{counts} \ \text{cm}^{-2} \ \text{s}^{-1}$ (corresponding to $\sim3 - 4$ expected counts per PS), which would require $N \sim \mathcal{O}(10^4)$ sources to explain the entire excess (median value $N = \text{29,300}$ across the sky). Although faint, this SCD allows us to derive the constraint $\eta_P \leq 66\%$ for the Poissonian fraction of the GCE flux $\eta_P$ at 95% confidence, suggesting that a substantial amount of the GCE flux is due to PSs.
Submitted 15 December, 2021; v1 submitted 19 July, 2021;
originally announced July 2021.
-
Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems
Authors:
Grace A. Lewis,
Stephany Bellomo,
Ipek Ozkaya
Abstract:
Increasing availability of machine learning (ML) frameworks and tools, as well as their promise to improve solutions to data-driven decision problems, has resulted in popularity of using ML techniques in software systems. However, end-to-end development of ML-enabled systems, as well as their seamless deployment and operations, remain a challenge. One reason is that development and deployment of ML-enabled systems involves three distinct workflows, perspectives, and roles, which include data science, software engineering, and operations. These three distinct perspectives, when misaligned due to incorrect assumptions, cause ML mismatches which can result in failed systems. We conducted an interview and survey study where we collected and validated common types of mismatches that occur in end-to-end development of ML-enabled systems. Our analysis shows that how each role prioritizes the importance of relevant mismatches varies, potentially contributing to these mismatched assumptions. In addition, the mismatch categories we identified can be specified as machine readable descriptors contributing to improved ML-enabled system development. In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.
Submitted 25 March, 2021;
originally announced March 2021.
-
Constrained optimisation of preliminary spacecraft configurations under the design-for-demise paradigm
Authors:
Mirko Trisolini,
Hugh G. Lewis,
Camilla Colombo
Abstract:
In the past few years, interest in the implementation of design-for-demise measures has increased steadily. Most mid-sized satellites currently launched and already in orbit fail to comply with the casualty risk threshold of 0.0001. Therefore, satellite manufacturers and mission operators need to perform a disposal through a controlled re-entry, which has a higher cost and increased complexity. Through the design-for-demise paradigm, this additional cost and complexity can be removed as the spacecraft is directly compliant with the casualty risk regulations. However, building a spacecraft such that most of its parts will demise may lead to designs that are more vulnerable to space debris impacts, thus compromising the reliability of the mission. In fact, the requirements connected to the demisability and the survivability are in general competing. Given this competing nature, trade-off solutions can be found, which favour the implementation of design-for-demise measures while still keeping the spacecraft resilient to space debris impacts. A multi-objective optimisation framework has been developed by the authors in previous works. The framework's objective is to find preliminary design solutions considering the competing nature of the demisability and the survivability of a spacecraft since the early stages of the mission design. In this way, a more integrated design can be achieved. The present work focuses on the improvement of the multi-objective optimisation framework by including constraints. The paper shows the application of the constrained optimisation to two relevant examples: the optimisation of a tank assembly and the optimisation of a typical satellite configuration.
Submitted 21 January, 2021; v1 submitted 27 December, 2020;
originally announced January 2021.
-
Robust Policies via Mid-Level Visual Representations: An Experimental Study in Manipulation and Navigation
Authors:
Bryan Chen,
Alexander Sax,
Gene Lewis,
Iro Armeni,
Silvio Savarese,
Amir Zamir,
Jitendra Malik,
Lerrel Pinto
Abstract:
Vision-based robotics often separates the control loop into one module for perception and a separate module for control. It is possible to train the whole system end-to-end (e.g. with deep RL), but doing it "from scratch" comes with a high sample complexity cost and the final result is often brittle, failing unexpectedly if the test environment differs from that of training.
We study the effects of using mid-level visual representations (features learned asynchronously for traditional computer vision objectives), as a generic and easy-to-decode perceptual state in an end-to-end RL framework. Mid-level representations encode invariances about the world, and we show that they aid generalization, improve sample complexity, and lead to a higher final performance. Compared to other approaches for incorporating invariances, such as domain randomization, asynchronously trained mid-level representations scale better: both to harder problems and to larger domain shifts. In practice, this means that mid-level representations could be used to successfully train policies for tasks where domain randomization and learning-from-scratch failed. We report results on both manipulation and navigation tasks, and for navigation include zero-shot sim-to-real experiments on real robots.
Submitted 12 November, 2020;
originally announced November 2020.
-
Mostly Harmless Machine Learning: Learning Optimal Instruments in Linear IV Models
Authors:
Jiafeng Chen,
Daniel L. Chen,
Greg Lewis
Abstract:
We offer straightforward theoretical results that justify incorporating machine learning in the standard linear instrumental variable setting. The key idea is to use machine learning, combined with sample-splitting, to predict the treatment variable from the instrument and any exogenous covariates, and then use this predicted treatment and the covariates as technical instruments to recover the coefficients in the second stage. This allows the researcher to extract non-linear co-variation between the treatment and instrument that may dramatically improve estimation precision and robustness by boosting instrument strength. Importantly, we constrain the machine-learned predictions to be linear in the exogenous covariates, thus avoiding spurious identification arising from non-linear relationships between the treatment and the covariates. We show that this approach delivers consistent and asymptotically normal estimates under weak conditions and that it may be adapted to be semiparametrically efficient (Chamberlain, 1992). Our method preserves standard intuitions and interpretations of linear instrumental variable methods, including under weak identification, and provides a simple, user-friendly upgrade to the applied economics toolbox. We illustrate our method with an example in law and criminal justice, examining the causal effect of appellate court reversals on district court sentencing decisions.
Submitted 18 June, 2021; v1 submitted 11 November, 2020;
originally announced November 2020.
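The sample-splitting idea in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the data are simulated, there are no exogenous covariates, and a piecewise-constant binned-mean learner (the hypothetical helper `fit_binned_means`) stands in for the machine learning first stage. The treatment depends on the instrument only through its square, so a linear first stage would be weak, while the learned prediction serves as a strong technical instrument.

```python
import random

def fit_binned_means(z, d, n_bins=20, lo=-3.0, hi=3.0):
    """Toy first-stage learner (an assumption): E[d | z] as a step function of z."""
    width = (hi - lo) / n_bins
    sums, counts = [0.0] * n_bins, [0] * n_bins

    def bin_of(zi):
        # Clamp out-of-range values into the first or last bin.
        return min(n_bins - 1, max(0, int((zi - lo) / width)))

    for zi, di in zip(z, d):
        sums[bin_of(zi)] += di
        counts[bin_of(zi)] += 1
    overall = sum(d) / len(d)
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda zi: means[bin_of(zi)]

def iv_slope(instr, d, y):
    """Just-identified IV slope: cov(instr, y) / cov(instr, d)."""
    n = len(instr)
    mi, md, my = sum(instr) / n, sum(d) / n, sum(y) / n
    num = sum((a - mi) * (c - my) for a, c in zip(instr, y))
    den = sum((a - mi) * (b - md) for a, b in zip(instr, d))
    return num / den

def split_sample_iv(z, d, y):
    """Predict the treatment on each half using a learner fit on the other half."""
    n, half = len(z), len(z) // 2
    folds = [(range(0, half), range(half, n)), (range(half, n), range(0, half))]
    instr = [0.0] * n
    for train, test in folds:
        predict = fit_binned_means([z[i] for i in train], [d[i] for i in train])
        for i in test:
            instr[i] = predict(z[i])
    return iv_slope(instr, d, y)

# Simulated data: u is an unobserved confounder, the true effect of d on y is 2.
rng = random.Random(0)
n = 4000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
d = [zi * zi + ui + 0.5 * rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [2.0 * di + 3.0 * ui + rng.gauss(0, 1) for di, ui in zip(d, u)]

beta_ols = iv_slope(d, d, y)         # plain regression of y on d, biased by u
beta_iv = split_sample_iv(z, d, y)   # uses the out-of-fold predicted treatment
print(f"OLS slope: {beta_ols:.2f}, split-sample IV slope: {beta_iv:.2f}")
```

In this simulation the plain regression slope is pulled away from 2 by the confounder, while the split-sample estimate should land close to it. The covariates and the linearity constraint described in the abstract are not modelled here.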
-
The GCE in a New Light: Disentangling the $γ$-ray Sky with Bayesian Graph Convolutional Neural Networks
Authors:
Florian List,
Nicholas L. Rodd,
Geraint F. Lewis,
Ishaan Bhat
Abstract:
A fundamental question regarding the Galactic Center Excess (GCE) is whether the underlying structure is point-like or smooth. This debate, often framed in terms of a millisecond pulsar or annihilating dark matter (DM) origin for the emission, awaits a conclusive resolution. In this work we weigh in on the problem using Bayesian graph convolutional neural networks. In simulated data, our neural network (NN) is able to reconstruct the flux of inner Galaxy emission components to on average $\sim$0.5%, comparable to the non-Poissonian template fit (NPTF). When applied to the actual $\textit{Fermi}$-LAT data, we find that the NN estimates for the flux fractions from the background templates are consistent with the NPTF; however, the GCE is almost entirely attributed to smooth emission. While suggestive, we do not claim a definitive resolution for the GCE, as the NN tends to underestimate the flux of point-sources peaked near the 1$σ$ detection threshold. Yet the technique displays robustness to a number of systematics, including reconstructing injected DM, diffuse mismodeling, and unmodeled north-south asymmetries. So while the NN is hinting at a smooth origin for the GCE at present, with further refinements we argue that Bayesian Deep Learning is well placed to resolve this DM mystery.
Submitted 28 October, 2020; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Minimax Estimation of Conditional Moment Models
Authors:
Nishanth Dikkala,
Greg Lewis,
Lester Mackey,
Vasilis Syrgkanis
Abstract:
We develop an approach for estimating models described via conditional moment restrictions, with a prototypical application being non-parametric instrumental variable regression. We introduce a min-max criterion function, under which the estimation problem can be thought of as solving a zero-sum game between a modeler who is optimizing over the hypothesis space of the target model and an adversary who identifies violating moments over a test function space. We analyze the statistical estimation rate of the resulting estimator for arbitrary hypothesis spaces, with respect to an appropriate analogue of the mean squared error metric, for ill-posed inverse problems. We show that when the minimax criterion is regularized with a second moment penalty on the test function and the test function space is sufficiently rich, then the estimation rate scales with the critical radius of the hypothesis and test function spaces, a quantity which typically gives tight fast rates. Our main result follows from a novel localized Rademacher analysis of statistical learning problems defined via minimax objectives. We provide applications of our main results for several hypothesis spaces used in practice, such as reproducing kernel Hilbert spaces, high dimensional sparse linear functions, spaces defined via shape constraints, ensemble estimators such as random forests, and neural networks. For each of these applications we provide computationally efficient optimization methods for solving the corresponding minimax problem (e.g. stochastic first-order heuristics for neural networks). In several applications, we show how our modified mean squared error rate, combined with conditions that bound the ill-posedness of the inverse problem, leads to mean squared error rates. We conclude with an extensive experimental analysis of the proposed methods.
Submitted 12 June, 2020;
originally announced June 2020.
-
Predicting the vulnerability of spacecraft components: modelling debris impact effects through vulnerable-zones
Authors:
Mirko Trisolini,
Hugh G. Lewis,
Camilla Colombo
Abstract:
The space environment around the Earth is populated by more than 130 million objects of 1 mm in size and larger, and future predictions show that this number is destined to increase, even if mitigation measures are implemented at a far better rate than today. These objects can hit and damage a spacecraft or its components. It is thus necessary to assess the risk level for a satellite during its mission lifetime. Few software packages perform this analysis, and most of them employ time-consuming ray-tracing methodology, where particles are randomly sampled from relevant distributions. In addition, they tend not to consider the risk associated with the secondary debris clouds. The paper presents the development of a vulnerability assessment model, which relies on a fully statistical procedure: the debris fluxes are used directly, in combination with the concept of the vulnerable zone, avoiding random sampling of the debris fluxes. A novel methodology is presented to predict damage to internal components. It models the interaction between the components and the secondary debris cloud through basic geometric operations, considering mutual shielding and shadowing between internal components. The methodologies are tested against state-of-the-art software for relevant test cases, comparing results on external structures and internal components.
Submitted 10 March, 2020;
originally announced March 2020.
-
A unified framework for 21cm tomography sample generation and parameter inference with Progressively Growing GANs
Authors:
Florian List,
Geraint F. Lewis
Abstract:
Creating a database of 21cm brightness temperature signals from the Epoch of Reionisation (EoR) for an array of reionisation histories is a complex and computationally expensive task, given the range of astrophysical processes involved and the possibly high-dimensional parameter space that is to be probed. We utilise a specific type of neural network, a Progressively Growing Generative Adversarial Network (PGGAN), to produce realistic tomography images of the 21cm brightness temperature during the EoR, covering a continuous three-dimensional parameter space that models varying X-ray emissivity, Lyman band emissivity, and ratio between hard and soft X-rays. The GPU-trained network generates new samples at a resolution of $\sim 3'$ in a second (on a laptop CPU), and the resulting global 21cm signal, power spectrum, and pixel distribution function agree well with those of the training data, taken from the 21SSD catalogue (Semelin et al. 2017). Finally, we showcase how a trained PGGAN can be leveraged for the converse task of inferring parameters from 21cm tomography samples via Approximate Bayesian Computation.
Submitted 18 February, 2020;
originally announced February 2020.
-
Double/Debiased Machine Learning for Dynamic Treatment Effects via g-Estimation
Authors:
Greg Lewis,
Vasilis Syrgkanis
Abstract:
We consider the estimation of treatment effects in settings when multiple treatments are assigned over time and treatments can have a causal effect on future outcomes or the state of the treated unit. We propose an extension of the double/debiased machine learning framework to estimate the dynamic effects of treatments, which can be viewed as a Neyman orthogonal (locally robust) cross-fitted version of $g$-estimation in the dynamic treatment regime. Our method applies to a general class of non-linear dynamic treatment models known as Structural Nested Mean Models and allows the use of machine learning methods to control for potentially high dimensional state variables, subject to a mean square error guarantee, while still allowing parametric estimation and construction of confidence intervals for the structural parameters of interest. These structural parameters can be used for off-policy evaluation of any target dynamic policy at parametric rates, subject to semi-parametric restrictions on the data generating process. Our work is based on a recursive peeling process, typical in $g$-estimation, and formulates a strongly convex objective at each stage, which allows us to extend the $g$-estimation framework in multiple directions: i) to provide finite sample guarantees, ii) to estimate non-linear effect heterogeneity with respect to fixed unit characteristics, within arbitrary function spaces, enabling a dynamic analogue of the RLearner algorithm for heterogeneous effects, iii) to allow for high-dimensional sparse parameterizations of the target structural functions, enabling automated model selection via a recursive lasso algorithm. We also provide guarantees for data stemming from a single treated unit over a long horizon and under stationarity conditions.
Submitted 16 June, 2021; v1 submitted 17 February, 2020;
originally announced February 2020.
-
Quantum Computation with Machine-Learning-Controlled Quantum Stuff
Authors:
Lucien Hardy,
Adam G. M. Lewis
Abstract:
We describe how one may go about performing quantum computation with arbitrary "quantum stuff", as long as it has some basic physical properties. Imagine a long strip of stuff, equipped with regularly spaced wires to provide input settings and to read off outcomes. After showing how the corresponding map from settings to outcomes can be construed as a quantum circuit, we provide a machine learning algorithm to tomographically "learn" which settings implement the members of a universal gate set. At optimum, arbitrary quantum gates, and thus arbitrary quantum programs, can be implemented using the stuff.
Submitted 29 November, 2019;
originally announced November 2019.
-
Component Mismatches Are a Critical Bottleneck to Fielding AI-Enabled Systems in the Public Sector
Authors:
Grace A. Lewis,
Stephany Bellomo,
April Galyardt
Abstract:
The use of machine learning or artificial intelligence (ML/AI) holds substantial potential toward improving many functions and needs of the public sector. In practice, however, integrating ML/AI components into public sector applications is severely limited not only by the fragility of these components and their algorithms, but also by mismatches between components of ML-enabled systems. For example, if an ML model is trained on data that is different from data in the operational environment, field performance of the ML component will be dramatically reduced. Separate from software engineering considerations, the expertise needed to field an ML/AI component within a system frequently comes from outside software engineering. As a result, assumptions and even descriptive language used by practitioners from these different disciplines can exacerbate other challenges to integrating ML/AI components into larger systems. We are investigating classes of mismatches in ML/AI systems integration, to identify the implicit assumptions made by practitioners in different fields (data scientists, software engineers, operations staff) and find ways to communicate the appropriate information explicitly. We will discuss a few categories of mismatch, and provide examples from each class. To enable ML/AI components to be fielded in a meaningful way, we will need to understand the mismatches that exist and develop practices to mitigate the impacts of these mismatches.
Submitted 14 October, 2019;
originally announced October 2019.
-
Spacecraft design optimisation for demise and survivability
Authors:
Mirko Trisolini,
Hugh G. Lewis,
Camilla Colombo
Abstract:
One of the mitigation measures introduced to cope with the space debris issue is the de-orbiting of decommissioned satellites. Guidelines for re-entering objects call for a ground casualty risk no higher than 0.0001. To comply with this requirement, satellites can be designed through a design-for-demise philosophy. Still, a spacecraft designed to demise has to survive the debris-populated space environment for many years. The demisability and the survivability of a satellite can both be influenced by a set of common design choices such as the material selection, the geometry definition, and the position of the components. Within this context, two models have been developed to analyse the demise and the survivability of satellites. Given the competing nature of the demisability and the survivability, a multi-objective optimisation framework was developed, with the aim of identifying trade-off solutions for the preliminary design of satellites. As the problem is nonlinear and involves the combination of continuous and discrete variables, classical derivative-based approaches are unsuited and a genetic algorithm was selected instead. The genetic algorithm uses the developed demisability and survivability criteria as the fitness functions of the multi-objective algorithm. The paper presents a test case, which considers the preliminary optimisation of tanks in terms of material, geometry, location, and number of tanks for a representative Earth observation mission. The configuration of the external structure of the spacecraft is fixed. Tanks were selected because they are sensitive to both design requirements: they represent critical components in the demise process and impact damage can cause the loss of the mission because of leaking and ruptures. The results present the possible trade-off solutions, constituting the Pareto front obtained from the multi-objective optimisation.
Submitted 11 October, 2019;
originally announced October 2019.
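The notion of a Pareto front of trade-off solutions mentioned in the abstract above can be made concrete with a small sketch. It shows only the non-dominated filtering step, not the genetic algorithm the paper describes. The function names and the design scores are invented for illustration, and both objectives are assumed to be expressed so that lower is better (for example, a fraction of mass surviving re-entry and a probability of debris penetration).

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep the designs that no other design dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (objective_1, objective_2) scores for five candidate tank designs.
designs = [(0.2, 0.9), (0.4, 0.5), (0.5, 0.6), (0.8, 0.1), (0.9, 0.9)]
front = pareto_front(designs)
print(front)  # [(0.2, 0.9), (0.4, 0.5), (0.8, 0.1)]
```

The design (0.5, 0.6) is dropped because (0.4, 0.5) is better in both objectives, and (0.9, 0.9) is dropped because (0.2, 0.9) matches it in one objective and beats it in the other. The three remaining designs are the trade-off solutions: improving one objective among them always worsens the other.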
-
Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments
Authors:
Vasilis Syrgkanis,
Victor Lei,
Miruna Oprescu,
Maggie Hei,
Keith Battocchi,
Greg Lewis
Abstract:
We consider the estimation of heterogeneous treatment effects with arbitrary machine learning methods in the presence of unobserved confounders with the aid of a valid instrument. Such settings arise in A/B tests with an intent-to-treat structure, where the experimenter randomizes over which user will receive a recommendation to take an action, and we are interested in the effect of the downstream action. We develop a statistical learning approach to the estimation of heterogeneous effects, reducing the problem to the minimization of an appropriate loss function that depends on a set of auxiliary models (each corresponding to a separate prediction task). The reduction enables the use of all recent algorithmic advances (e.g. neural nets, forests). We show that the estimated effect model is robust to estimation errors in the auxiliary models, by showing that the loss satisfies a Neyman orthogonality criterion. Our approach can be used to estimate projections of the true effect model on simpler hypothesis spaces. When these spaces are parametric, then the parameter estimates are asymptotically normal, which enables construction of confidence sets. We applied our method to estimate the effect of membership on downstream webpage engagement on TripAdvisor, using as an instrument an intent-to-treat A/B test among 4 million TripAdvisor users, where some users received an easier membership sign-up process. We also validate our method on synthetic data and on public datasets for the effects of schooling on income.
Submitted 5 June, 2019; v1 submitted 24 May, 2019;
originally announced May 2019.
-
Semi-Parametric Efficient Policy Learning with Continuous Actions
Authors:
Mert Demirer,
Vasilis Syrgkanis,
Greg Lewis,
Victor Chernozhukov
Abstract:
We consider off-policy evaluation and optimization with continuous action spaces. We focus on observational data where the data collection policy is unknown and needs to be estimated. We take a semi-parametric approach where the value function takes a known parametric form in the treatment, but we are agnostic on how it depends on the observed contexts. We propose a doubly robust off-policy estimate for this setting and show that off-policy optimization based on this estimate is robust to estimation errors of the policy function or the regression model. Our results also apply if the model does not satisfy our semi-parametric form, but rather we measure regret in terms of the best projection of the true value function to this functional space. Our work extends prior approaches of policy optimization from observational data that only considered discrete actions. We provide an experimental evaluation of our method in a synthetic data example motivated by optimal personalized pricing and costly resource allocation.
Submitted 20 July, 2019; v1 submitted 24 May, 2019;
originally announced May 2019.
-
Non-Parametric Inference Adaptive to Intrinsic Dimension
Authors:
Khashayar Khosravi,
Greg Lewis,
Vasilis Syrgkanis
Abstract:
We consider non-parametric estimation and inference of conditional moment models in high dimensions. We show that even when the dimension $D$ of the conditioning variable is larger than the sample size $n$, estimation and inference is feasible as long as the distribution of the conditioning variable has small intrinsic dimension $d$, as measured by locally low doubling measures. Our estimation is based on a sub-sampled ensemble of the $k$-nearest neighbors ($k$-NN) $Z$-estimator. We show that if the intrinsic dimension of the covariate distribution is equal to $d$, then the finite sample estimation error of our estimator is of order $n^{-1/(d+2)}$ and our estimate is $n^{1/(d+2)}$-asymptotically normal, irrespective of $D$. The sub-sampling size required for achieving these results depends on the unknown intrinsic dimension $d$. We propose an adaptive data-driven approach for choosing this parameter and prove that it achieves the desired rates. We discuss extensions and applications to heterogeneous treatment effect estimation.
Submitted 17 June, 2019; v1 submitted 11 January, 2019;
originally announced January 2019.
-
Automatic generation of CUDA code performing tensor manipulations using C++ expression templates
Authors:
Adam G. M. Lewis,
Harald P. Pfeiffer
Abstract:
We present a C++ library, TLoops, which uses a hierarchy of expression templates to represent operations upon tensorial quantities in single lines of C++ code that resemble analytic equations. These expressions may be run as-is, but may also be used to emit equivalent low-level C or CUDA code, which either performs the operations more quickly on the CPU, or allows them to be rapidly ported to run on NVIDIA GPUs. We detail the expression template and C++-class hierarchy that represents the expressions and which makes automatic code-generation possible. We then present benchmarks of the expression-template code, the automatically generated C code, and the automatically generated CUDA code running on several generations of NVIDIA GPU.
Submitted 24 April, 2018;
originally announced April 2018.
-
Adversarial Generalized Method of Moments
Authors:
Greg Lewis,
Vasilis Syrgkanis
Abstract:
We provide an approach for learning deep neural net representations of models described via conditional moment restrictions. Conditional moment restrictions are widely used, as they are the language by which social scientists describe the assumptions they make to enable causal inference. We formulate the problem of estimating the underlying model as a zero-sum game between a modeler and an adversary and apply adversarial training. Our approach is similar in nature to Generative Adversarial Networks (GAN), though here the modeler is learning a representation of a function that satisfies a continuum of moment conditions and the adversary is identifying violating moments. We outline ways of constructing effective adversaries in practice, including kernels centered by k-means clustering, and random forests. We examine the practical performance of our approach in the setting of non-parametric instrumental variable regression.
Submitted 24 April, 2018; v1 submitted 19 March, 2018;
originally announced March 2018.
-
Counterfactual Prediction with Deep Instrumental Variables Networks
Authors:
Jason Hartford,
Greg Lewis,
Kevin Leyton-Brown,
Matt Taddy
Abstract:
We are in the middle of a remarkable rise in the use and capability of artificial intelligence. Much of this growth has been fueled by the success of deep learning architectures: models that map from observables to outputs via multiple layers of latent representations. These deep learning algorithms are effective tools for unstructured prediction, and they can be combined in AI systems to solve complex automated reasoning problems. This paper provides a recipe for combining ML algorithms to solve for causal effects in the presence of instrumental variables -- sources of treatment randomization that are conditionally independent from the response. We show that a flexible IV specification resolves into two prediction tasks that can be solved with deep neural nets: a first-stage network for treatment prediction and a second-stage network whose loss function involves integration over the conditional treatment distribution. This Deep IV framework imposes some specific structure on the stochastic gradient descent routine used for training, but it is general enough that we can take advantage of off-the-shelf ML capabilities and avoid extensive algorithm customization. We outline how to obtain out-of-sample causal validation in order to avoid over-fit. We also introduce schemes for both Bayesian and frequentist inference: the former via a novel adaptation of dropout training, and the latter via a data splitting routine.
Submitted 30 December, 2016;
originally announced December 2016.