Search | arXiv e-print repository

Asset Trading in Continuous Time: A Cautionary Tale

Abstract: The continuous time model of dynamic asset trading is the central model of modern finance. Because trading cannot in fact take place at every moment of time, it would seem desirable to show that the continuous time model can be viewed as the limit of models in which trading can occur only at (many) discrete moments of time. This paper demonstrates that, if we take terminal wealth constraints and s… ▽ More The continuous time model of dynamic asset trading is the central model of modern finance. Because trading cannot in fact take place at every moment of time, it would seem desirable to show that the continuous time model can be viewed as the limit of models in which trading can occur only at (many) discrete moments of time. This paper demonstrates that, if we take terminal wealth constraints and self-financing constraints as seriously in the discrete model as in the continuous model, then the continuous trading model need not be the limit of discrete trading models. This raises serious foundational questions about the continuous time model. △ Less

Submitted 7 July, 2022; originally announced July 2022.

arXiv:2202.10153 [pdf, other]

Inferring Lexicographically-Ordered Rewards from Preferences

Authors: Alihan Hüyük, William R. Zame, Mihaela van der Schaar

Abstract: Modeling the preferences of agents over a set of alternatives is a principal concern in many areas. The dominant approach has been to find a single reward/utility function with the property that alternatives yielding higher rewards are preferred over alternatives yielding lower rewards. However, in many settings, preferences are based on multiple, often competing, objectives; a single reward funct… ▽ More Modeling the preferences of agents over a set of alternatives is a principal concern in many areas. The dominant approach has been to find a single reward/utility function with the property that alternatives yielding higher rewards are preferred over alternatives yielding lower rewards. However, in many settings, preferences are based on multiple, often competing, objectives; a single reward function is not adequate to represent such preferences. This paper proposes a method for inferring multi-objective reward-based representations of an agent's observed preferences. We model the agent's priorities over different objectives as entering lexicographically, so that objectives with lower priorities matter only when the agent is indifferent with respect to objectives with higher priorities. We offer two example applications in healthcare, one inspired by cancer treatment, the other inspired by organ transplantation, to illustrate how the lexicographically-ordered rewards we learn can provide a better understanding of a decision-maker's preferences and help improve policies when used in reinforcement learning. △ Less

Submitted 7 June, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

Comments: In Proceedings of the 36th AAAI Conference on Artificial Intelligence

arXiv:2106.02875 [pdf, other]

Integrating Expert ODEs into Neural ODEs: Pharmacology and Disease Progression

Authors: Zhaozhi Qian, William R. Zame, Lucas M. Fleuren, Paul Elbers, Mihaela van der Schaar

Abstract: Modeling a system's temporal behaviour in reaction to external stimuli is a fundamental problem in many areas. Pure Machine Learning (ML) approaches often fail in the small sample regime and cannot provide actionable insights beyond predictions. A promising modification has been to incorporate expert domain knowledge into ML models. The application we consider is predicting the progression of dise… ▽ More Modeling a system's temporal behaviour in reaction to external stimuli is a fundamental problem in many areas. Pure Machine Learning (ML) approaches often fail in the small sample regime and cannot provide actionable insights beyond predictions. A promising modification has been to incorporate expert domain knowledge into ML models. The application we consider is predicting the progression of disease under medications, where a plethora of domain knowledge is available from pharmacology. Pharmacological models describe the dynamics of carefully-chosen medically meaningful variables in terms of systems of Ordinary Differential Equations (ODEs). However, these models only describe a limited collection of variables, and these variables are often not observable in clinical environments. To close this gap, we propose the latent hybridisation model (LHM) that integrates a system of expert-designed ODEs with machine-learned Neural ODEs to fully describe the dynamics of the system and to link the expert and latent variables to observable quantities. We evaluated LHM on synthetic data as well as real-world intensive care data of COVID-19 patients. LHM consistently outperforms previous works, especially when few training samples are available such as at the beginning of the pandemic. △ Less

Submitted 17 June, 2021; v1 submitted 5 June, 2021; originally announced June 2021.

arXiv:1810.02876 [pdf, ps, other]

Adaptive Clinical Trials: Exploiting Sequential Patient Recruitment and Allocation

Authors: Onur Atan, William R. Zame, Mihaela van der Schaar

Abstract: Randomized Controlled Trials (RCTs) are the gold standard for comparing the effectiveness of a new treatment to the current one (the control). Most RCTs allocate the patients to the treatment group and the control group by uniform randomization. We show that this procedure can be highly sub-optimal (in terms of learning) if -- as is often the case -- patients can be recruited in cohorts (rather th… ▽ More Randomized Controlled Trials (RCTs) are the gold standard for comparing the effectiveness of a new treatment to the current one (the control). Most RCTs allocate the patients to the treatment group and the control group by uniform randomization. We show that this procedure can be highly sub-optimal (in terms of learning) if -- as is often the case -- patients can be recruited in cohorts (rather than all at once), the effects on each cohort can be observed before recruiting the next cohort, and the effects are heterogeneous across identifiable subgroups of patients. We formulate the patient allocation problem as a finite stage Markov Decision Process in which the objective is to minimize a given weighted combination of type-I and type-II errors. Because finding the exact solution to this Markov Decision Process is computationally intractable, we propose an algorithm -- \textit{Knowledge Gradient for Randomized Controlled Trials} (RCT-KG) -- that yields an approximate solution. We illustrate our algorithm on a synthetic dataset with Bernoulli outcomes and compare it with uniform randomization. For a given size of trial our method achieves significant reduction in error, and to achieve a prescribed level of confidence (in identifying whether the treatment is superior to the control), our method requires many fewer patients. Our approach uses what has been learned from the effects on previous cohorts to recruit patients to subgroups and allocate patients (to treatment/control) within subgroups in a way that promotes more efficient learning. △ Less

Submitted 18 October, 2018; v1 submitted 5 October, 2018; originally announced October 2018.

arXiv:1802.08679 [pdf, other]

Learning Optimal Policies from Observational Data

Authors: Onur Atan, William R. Zame, M van der Schaar

Abstract: Choosing optimal (or at least better) policies is an important problem in domains from medicine to education to finance and many others. One approach to this problem is through controlled experiments/trials - but controlled experiments are expensive. Hence it is important to choose the best policies on the basis of observational data. This presents two difficult challenges: (i) missing counterfact… ▽ More Choosing optimal (or at least better) policies is an important problem in domains from medicine to education to finance and many others. One approach to this problem is through controlled experiments/trials - but controlled experiments are expensive. Hence it is important to choose the best policies on the basis of observational data. This presents two difficult challenges: (i) missing counterfactuals, and (ii) selection bias. This paper presents theoretical bounds on estimation errors of counterfactuals from observational data by making connections to domain adaptation theory. It also presents a principled way of choosing optimal policies using domain adversarial neural networks. We illustrate the effectiveness of domain adversarial training together with various features of our algorithm on a semi-synthetic breast cancer dataset and a supervised UCI dataset (Statlog). △ Less

Submitted 23 February, 2018; originally announced February 2018.

arXiv:1711.08742 [pdf, other]

Estimating Missing Data in Temporal Data Streams Using Multi-directional Recurrent Neural Networks

Authors: **sung Yoon, William R. Zame, Mihaela van der Schaar

Abstract: Missing data is a ubiquitous problem. It is especially challenging in medical settings because many streams of measurements are collected at different - and often irregular - times. Accurate estimation of those missing measurements is critical for many reasons, including diagnosis, prognosis and treatment. Existing methods address this estimation problem by interpolating within data streams or imp… ▽ More Missing data is a ubiquitous problem. It is especially challenging in medical settings because many streams of measurements are collected at different - and often irregular - times. Accurate estimation of those missing measurements is critical for many reasons, including diagnosis, prognosis and treatment. Existing methods address this estimation problem by interpolating within data streams or imputing across data streams (both of which ignore important information) or ignoring the temporal aspect of the data and imposing strong assumptions about the nature of the data-generating process and/or the pattern of missing data (both of which are especially problematic for medical data). We propose a new approach, based on a novel deep learning architecture that we call a Multi-directional Recurrent Neural Network (M-RNN) that interpolates within data streams and imputes across data streams. We demonstrate the power of our approach by applying it to five real-world medical datasets. We show that it provides dramatically improved estimation of missing measurements in comparison to 11 state-of-the-art benchmarks (including Spline and Cubic Interpolations, MICE, MissForest, matrix completion and several RNN methods); typical improvements in Root Mean Square Error are between 35% - 50%. Additional experiments based on the same five datasets demonstrate that the improvements provided by our method are extremely robust. △ Less

Submitted 23 November, 2017; originally announced November 2017.

Comments: 19 pages (including 3 page Appendix and 2 page reference)

arXiv:1706.01396 [pdf, other]

doi 10.1109/TSP.2018.2807402

ToPs: Ensemble Learning with Trees of Predictors

Authors: **sung Yoon, William R. Zame, Mihaela van der Schaar

Abstract: We present a new approach to ensemble learning. Our approach constructs a tree of subsets of the feature space and associates a predictor (predictive model) - determined by training one of a given family of base learners on an endogenously determined training set - to each node of the tree; we call the resulting object a tree of predictors. The (locally) optimal tree of predictors is derived recur… ▽ More We present a new approach to ensemble learning. Our approach constructs a tree of subsets of the feature space and associates a predictor (predictive model) - determined by training one of a given family of base learners on an endogenously determined training set - to each node of the tree; we call the resulting object a tree of predictors. The (locally) optimal tree of predictors is derived recursively; each step involves jointly optimizing the split of the terminal nodes of the previous tree and the choice of learner and training set (hence predictor) for each set in the split. The feature vector of a new instance determines a unique path through the optimal tree of predictors; the final prediction aggregates the predictions of the predictors along this path. We derive loss bounds for the final predictor in terms of the Rademacher complexity of the base learners. We report the results of a number of experiments on a variety of datasets, showing that our approach provides statistically significant improvements over state-of-the-art machine learning algorithms, including various ensemble learning methods. Our approach works because it allows us to endogenously create more complex learners - when needed - and endogenously match both the learner and the training set to the characteristics of the dataset while still avoiding over-fitting. △ Less

Submitted 13 February, 2018; v1 submitted 5 June, 2017; originally announced June 2017.

Comments: 12 pages, 2 figures

arXiv:1704.03458 [pdf]

Personalized Survival Predictions for Cardiac Transplantation via Trees of Predictors

Authors: J. Yoon, W. R. Zame, A. Banerjee, M. Cadeiras, A. M. Alaa, M. van der Schaar

Abstract: Given the limited pool of donor organs, accurate predictions of survival on the wait list and post transplantation are crucial for cardiac transplantation decisions and policy. However, current clinical risk scores do not yield accurate predictions. We develop a new methodology (ToPs, Trees of Predictors) built on the principle that specific predictors should be used for specific clusters within t… ▽ More Given the limited pool of donor organs, accurate predictions of survival on the wait list and post transplantation are crucial for cardiac transplantation decisions and policy. However, current clinical risk scores do not yield accurate predictions. We develop a new methodology (ToPs, Trees of Predictors) built on the principle that specific predictors should be used for specific clusters within the target population. ToPs discovers these specific clusters of patients and the specific predictor that perform best for each cluster. In comparison with current clinical risk scoring systems, our method provides significant improvements in the prediction of survival time on the wait list and post transplantation. For example, in terms of 3 month survival for patients who were on the US patient wait list in the period 1985 to 2015, our method achieves AUC of 0.847, the best commonly used clinical risk score (MAGGIC) achieves 0.630. In terms of 3 month survival/mortality predictions (in comparison to MAGGIC), holding specificity at 80.0 percents, our algorithm correctly predicts survival for 1,228 (26.0 percents more patients out of 4,723 who actually survived, holding sensitivity at 80.0 percents, our algorithm correctly predicts mortality for 839 (33.0 percents) more patients out of 2,542 who did not survive. Our method achieves similar improvements for other time horizons and for predictions post transplantation. Therefore, we offer a more accurate, personalized approach to survival analysis that can benefit patients, clinicians and policymakers in making clinical decisions and setting clinical policy. Because risk prediction is widely used in diagnostic and prognostic clinical decision making across diseases and clinical specialties, the implications of our methods are far reaching. △ Less

Submitted 11 April, 2017; originally announced April 2017.

Comments: Main manuscript: 20 pages, Supplementary materials: 13 pages, 5 figures, 3 tables. Submitted to Science Translational Medicine

arXiv:1612.08082 [pdf, other]

Constructing Effective Personalized Policies Using Counterfactual Inference from Biased Data Sets with Many Features

Authors: Onur Atan, William R. Zame, Qiaojun Feng, Mihaela van der Schaar

Abstract: This paper proposes a novel approach for constructing effective personalized policies when the observed data lacks counter-factual information, is biased and possesses many features. The approach is applicable in a wide variety of settings from healthcare to advertising to education to finance. These settings have in common that the decision maker can observe, for each previous instance, an array… ▽ More This paper proposes a novel approach for constructing effective personalized policies when the observed data lacks counter-factual information, is biased and possesses many features. The approach is applicable in a wide variety of settings from healthcare to advertising to education to finance. These settings have in common that the decision maker can observe, for each previous instance, an array of features of the instance, the action taken in that instance, and the reward realized -- but not the rewards of actions that were not taken: the counterfactual information. Learning in such settings is made even more difficult because the observed data is typically biased by the existing policy (that generated the data) and because the array of features that might affect the reward in a particular instance -- and hence should be taken into account in deciding on an action in each particular instance -- is often vast. The approach presented here estimates propensity scores for the observed data, infers counterfactuals, identifies a (relatively small) number of features that are (most) relevant for each possible action and instance, and prescribes a policy to be followed. Comparison of the proposed algorithm against the state-of-art algorithm on actual datasets demonstrates that the proposed algorithm achieves a significant improvement in performance. △ Less

Submitted 10 July, 2018; v1 submitted 23 December, 2016; originally announced December 2016.

arXiv:1512.01230 [pdf, other]

A Theory of Individualism, Collectivism and Economic Outcomes

Authors: Kartik Ahuja, Mihaela van der Schaar, William R. Zame

Abstract: This paper presents a dynamic model to study the impact on the economic outcomes in different societies during the Malthusian Era of individualism (time spent working alone) and collectivism (complementary time spent working with others). The model is driven by opposing forces: a greater degree of collectivism provides a higher safety net for low quality workers but a greater degree of individuali… ▽ More This paper presents a dynamic model to study the impact on the economic outcomes in different societies during the Malthusian Era of individualism (time spent working alone) and collectivism (complementary time spent working with others). The model is driven by opposing forces: a greater degree of collectivism provides a higher safety net for low quality workers but a greater degree of individualism allows high quality workers to leave larger bequests. The model suggests that more individualistic societies display smaller populations, greater per capita income and greater income inequality. Some (limited) historical evidence is consistent with these predictions. △ Less

Submitted 1 July, 2016; v1 submitted 3 December, 2015; originally announced December 2015.

arXiv:1011.3580 [pdf, other]

Technology Choices and Pricing Policies in Public and Private Wireless Networks

Authors: Yuanzhang Xiao, William R. Zame, Mihaela van der Schaar

Abstract: This paper studies the provision of a wireless network by a monopolistic provider who may be either benevolent (seeking to maximize social welfare) or selfish (seeking to maximize provider profit). The paper addresses questions that do not seem to have been studied before in the engineering literature on wireless networks: Under what circumstances is it feasible for a provider, either benevolent o… ▽ More This paper studies the provision of a wireless network by a monopolistic provider who may be either benevolent (seeking to maximize social welfare) or selfish (seeking to maximize provider profit). The paper addresses questions that do not seem to have been studied before in the engineering literature on wireless networks: Under what circumstances is it feasible for a provider, either benevolent or selfish, to operate a network in such a way as to cover costs? How is the optimal behavior of a benevolent provider different from the optimal behavior of a selfish provider, and how does this difference affect social welfare? And, most importantly, how does the medium access control (MAC) technology influence the answers to these questions? To address these questions, we build a general model, and provide analysis and simulations for simplified but typical scenarios; the focus in these scenarios is on the contrast between the outcomes obtained under carrier-sensing multiple access (CSMA) and outcomes obtained under time-division multiple access (TDMA). Simulation results demonstrate that differences in MAC technology can have a significant effect on social welfare, on provider profit, and even on the (financial) feasibility of a wireless network. △ Less

Submitted 14 September, 2012; v1 submitted 15 November, 2010; originally announced November 2010.

Comments: 14 pages, 6 figures

Showing 1–11 of 11 results for author: Zame, W R