Search | arXiv e-print repository

Management Decisions in Manufacturing using Causal Machine Learning -- To Rework, or not to Rework?

Authors: Philipp Schwarz, Oliver Schacht, Sven Klaassen, Daniel Grünbaum, Sebastian Imhof, Martin Spindler

Abstract: In this paper, we present a data-driven model for estimating optimal rework policies in manufacturing systems. We consider a single production stage within a multistage, lot-based system that allows for optional rework steps. While the rework decision depends on an intermediate state of the lot and system, the final product inspection, and thus the assessment of the actual yield, is delayed until… ▽ More In this paper, we present a data-driven model for estimating optimal rework policies in manufacturing systems. We consider a single production stage within a multistage, lot-based system that allows for optional rework steps. While the rework decision depends on an intermediate state of the lot and system, the final product inspection, and thus the assessment of the actual yield, is delayed until production is complete. Repair steps are applied uniformly to the lot, potentially improving some of the individual items while degrading others. The challenge is thus to balance potential yield improvement with the rework costs incurred. Given the inherently causal nature of this decision problem, we propose a causal model to estimate yield improvement. We apply methods from causal machine learning, in particular double/debiased machine learning (DML) techniques, to estimate conditional treatment effects from data and derive policies for rework decisions. We validate our decision model using real-world data from opto-electronic semiconductor manufacturing, achieving a yield improvement of 2 - 3% during the color-conversion process of white light-emitting diodes (LEDs). △ Less

Submitted 17 June, 2024; originally announced June 2024.

Comments: 30 pages, 10 figures

arXiv:2402.04674 [pdf, other]

Hyperparameter Tuning for Causal Inference with Double Machine Learning: A Simulation Study

Authors: Philipp Bach, Oliver Schacht, Victor Chernozhukov, Sven Klaassen, Martin Spindler

Abstract: Proper hyperparameter tuning is essential for achieving optimal performance of modern machine learning (ML) methods in predictive tasks. While there is an extensive literature on tuning ML learners for prediction, there is only little guidance available on tuning ML learners for causal machine learning and how to select among different ML learners. In this paper, we empirically assess the relation… ▽ More Proper hyperparameter tuning is essential for achieving optimal performance of modern machine learning (ML) methods in predictive tasks. While there is an extensive literature on tuning ML learners for prediction, there is only little guidance available on tuning ML learners for causal machine learning and how to select among different ML learners. In this paper, we empirically assess the relationship between the predictive performance of ML methods and the resulting causal estimation based on the Double Machine Learning (DML) approach by Chernozhukov et al. (2018). DML relies on estimating so-called nuisance parameters by treating them as supervised learning problems and using them as plug-in estimates to solve for the (causal) parameter. We conduct an extensive simulation study using data from the 2019 Atlantic Causal Inference Conference Data Challenge. We provide empirical insights on the role of hyperparameter tuning and other practical decisions for causal estimation with DML. First, we assess the importance of data splitting schemes for tuning ML learners within Double Machine Learning. Second, we investigate how the choice of ML methods and hyperparameters, including recent AutoML frameworks, impacts the estimation performance for a causal parameter of interest. Third, we assess to what extent the choice of a particular causal model, as characterized by incorporated parametric assumptions, can be based on predictive performance metrics. △ Less

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2402.01785 [pdf, other]

DoubleMLDeep: Estimation of Causal Effects with Multimodal Data

Authors: Sven Klaassen, Jan Teichert-Kluge, Philipp Bach, Victor Chernozhukov, Martin Spindler, Suhas Vijaykumar

Abstract: This paper explores the use of unstructured, multimodal data, namely text and images, in causal inference and treatment effect estimation. We propose a neural network architecture that is adapted to the double machine learning (DML) framework, specifically the partially linear model. An additional contribution of our paper is a new method to generate a semi-synthetic dataset which can be used to e… ▽ More This paper explores the use of unstructured, multimodal data, namely text and images, in causal inference and treatment effect estimation. We propose a neural network architecture that is adapted to the double machine learning (DML) framework, specifically the partially linear model. An additional contribution of our paper is a new method to generate a semi-synthetic dataset which can be used to evaluate the performance of causal effect estimation in the presence of text and images as confounders. The proposed methods and architectures are evaluated on the semi-synthetic dataset and compared to standard approaches, highlighting the potential benefit of using text and images directly in causal studies. Our findings have implications for researchers and practitioners in economics, marketing, finance, medicine and data science in general who are interested in estimating causal quantities using non-traditional data. △ Less

Submitted 1 February, 2024; originally announced February 2024.

MSC Class: 62; 91 ACM Class: I.2.0

arXiv:2103.09603 [pdf, other]

doi 10.18637/jss.v108.i03

DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R

Authors: Philipp Bach, Victor Chernozhukov, Malte S. Kurz, Martin Spindler, Sven Klaassen

Abstract: The R package DoubleML implements the double/debiased machine learning framework of Chernozhukov et al. (2018). It provides functionalities to estimate parameters in causal models based on machine learning methods. The double machine learning framework consist of three key ingredients: Neyman orthogonality, high-quality machine learning estimation and sample splitting. Estimation of nuisance compo… ▽ More The R package DoubleML implements the double/debiased machine learning framework of Chernozhukov et al. (2018). It provides functionalities to estimate parameters in causal models based on machine learning methods. The double machine learning framework consist of three key ingredients: Neyman orthogonality, high-quality machine learning estimation and sample splitting. Estimation of nuisance components can be performed by various state-of-the-art machine learning methods that are available in the mlr3 ecosystem. DoubleML makes it possible to perform inference in a variety of causal models, including partially linear and interactive regression models and their extensions to instrumental variable estimation. The object-oriented implementation of DoubleML enables a high flexibility for the model specification and makes it easily extendable. This paper serves as an introduction to the double machine learning framework and the R package DoubleML. In reproducible code examples with simulated and real data sets, we demonstrate how DoubleML users can perform valid inference based on machine learning methods. △ Less

Submitted 5 June, 2024; v1 submitted 17 March, 2021; originally announced March 2021.

Comments: 56 pages, 8 Figures, 1 Table; Updated version for DoubleML 1.0.0; Updated version due to changes in R package paradox (for parameter tuning with mlr3)

MSC Class: 62-04

Journal ref: Journal of Statistical Software 2024

arXiv:2004.01623 [pdf, other]

Estimation and Uniform Inference in Sparse High-Dimensional Additive Models

Authors: Philipp Bach, Sven Klaassen, Jannis Kueck, Martin Spindler

Abstract: We develop a novel method to construct uniformly valid confidence bands for a nonparametric component $f_1$ in the sparse additive model $Y=f_1(X_1)+\ldots + f_p(X_p) + \varepsilon$ in a high-dimensional setting. Our method integrates sieve estimation into a high-dimensional Z-estimation framework, facilitating the construction of uniformly valid confidence bands for the target component $f_1$. To… ▽ More We develop a novel method to construct uniformly valid confidence bands for a nonparametric component $f_1$ in the sparse additive model $Y=f_1(X_1)+\ldots + f_p(X_p) + \varepsilon$ in a high-dimensional setting. Our method integrates sieve estimation into a high-dimensional Z-estimation framework, facilitating the construction of uniformly valid confidence bands for the target component $f_1$. To form these confidence bands, we employ a multiplier bootstrap procedure. Additionally, we provide rates for the uniform lasso estimation in high dimensions, which may be of independent interest. Through simulation studies, we demonstrate that our proposed method delivers reliable results in terms of estimation and coverage, even in small samples. △ Less

Submitted 23 April, 2024; v1 submitted 3 April, 2020; originally announced April 2020.

MSC Class: 62G08; 62-07

arXiv:1808.10532 [pdf, other]

Uniform Inference in High-Dimensional Gaussian Graphical Models

Authors: Sven Klaassen, Jannis Kück, Martin Spindler, Victor Chernozhukov

Abstract: Graphical models have become a very popular tool for representing dependencies within a large set of variables and are key for representing causal structures. We provide results for uniform inference on high-dimensional graphical models with the number of target parameters $d$ being possible much larger than sample size. This is in particular important when certain features or structures of a caus… ▽ More Graphical models have become a very popular tool for representing dependencies within a large set of variables and are key for representing causal structures. We provide results for uniform inference on high-dimensional graphical models with the number of target parameters $d$ being possible much larger than sample size. This is in particular important when certain features or structures of a causal model should be recovered. Our results highlight how in high-dimensional settings graphical models can be estimated and recovered with modern machine learning methods in complex data sets. To construct simultaneous confidence regions on many target parameters, sufficiently fast estimation rates of the nuisance functions are crucial. In this context, we establish uniform estimation rates and sparsity guarantees of the square-root estimator in a random design under approximate sparsity conditions that might be of independent interest for related problems in high-dimensions. We also demonstrate in a comprehensive simulation study that our procedure has good small sample properties. △ Less

Submitted 3 December, 2018; v1 submitted 30 August, 2018; originally announced August 2018.

Comments: 59 pages, 2 figures, 6 tables

MSC Class: 62H15; 62J07;

arXiv:1712.07364 [pdf, other]

Transformation Models in High-Dimensions

Authors: Sven Klaassen, Jannis Kueck, Martin Spindler

Abstract: Transformation models are a very important tool for applied statisticians and econometricians. In many applications, the dependent variable is transformed so that homogeneity or normal distribution of the error holds. In this paper, we analyze transformation models in a high-dimensional setting, where the set of potential covariates is large. We propose an estimator for the transformation paramete… ▽ More Transformation models are a very important tool for applied statisticians and econometricians. In many applications, the dependent variable is transformed so that homogeneity or normal distribution of the error holds. In this paper, we analyze transformation models in a high-dimensional setting, where the set of potential covariates is large. We propose an estimator for the transformation parameter and we show that it is asymptotically normally distributed using an orthogonalized moment condition where the nuisance functions depend on the target parameter. In a simulation study, we show that the proposed estimator works well in small samples. A common practice in labor economics is to transform wage with the log-function. In this study, we test if this transformation holds in CPS data from the United States. △ Less

Submitted 20 December, 2017; originally announced December 2017.

Comments: 63 pages, 4 figures

MSC Class: 62H; 62F

Showing 1–7 of 7 results for author: Klaassen, S