-
Quasi-conformal Geometry based Local Deformation Analysis of Lateral Cephalogram for Childhood OSA Classification
Authors:
Hei-Long Chan,
Hoi-Man Yuen,
Chun-Ting Au,
Kate Ching-Ching Chan,
Albert Martin Li,
Lok-Ming Lui
Abstract:
Craniofacial profile is one of the anatomical causes of obstructive sleep apnea (OSA). According to medical research, cephalometry provides information on patients' skeletal structures and soft tissues. In this work, a novel approach to cephalometric analysis using quasi-conformal geometry based local deformation information was proposed for OSA classification. Our study was a retrospective analysis based on 60 case-control pairs with accessible lateral cephalometry and polysomnography (PSG) data. By using quasi-conformal geometry to study the local deformation around 15 landmark points, and combining the results with three linear distances between landmark points, a total of 1218 information features were obtained per subject. An L2 norm based classification model was built. In experiments, our proposed model achieved 92.5% testing accuracy.
Submitted 31 May, 2020;
originally announced June 2020.
-
Robust Causal Inference for Incremental Return on Ad Spend with Randomized Paired Geo Experiments
Authors:
Aiyou Chen,
Timothy C. Au
Abstract:
Evaluating the incremental return on ad spend (iROAS) of a prospective online marketing strategy (i.e., the ratio of the strategy's causal effect on some response metric of interest relative to its causal effect on the ad spend) has become increasingly important. Although randomized "geo experiments" are frequently employed for this evaluation, obtaining reliable estimates of iROAS can be challenging, as oftentimes only a small number of highly heterogeneous units are used. Moreover, advertisers frequently impose budget constraints on their ad spends, which further complicates causal inference by introducing interference between the experimental units. In this paper, we formulate a novel statistical framework for inferring the iROAS of online advertising from randomized paired geo experiments, which further motivates and provides new insights into Rosenbaum's arguments on instrumental variables. We propose and develop a robust, distribution-free and interpretable estimator, "Trimmed Match", as well as a data-driven choice of the tuning parameter which may be of independent interest. We investigate the sensitivity of Trimmed Match to some violations of its assumptions and show, based on simulated data, that it can be more efficient than some alternative estimators. We then demonstrate its practical utility with real case studies.
Submitted 6 June, 2021; v1 submitted 8 August, 2019;
originally announced August 2019.
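The abstract above defines iROAS as a ratio of two causal effects: the effect on the response metric divided by the effect on ad spend. As a rough illustration only, the sketch below computes a plain, untrimmed ratio from paired differences. It is not the Trimmed Match estimator from the paper, and the function name `empirical_iroas` and all the numbers are hypothetical.

```python
def empirical_iroas(response_diffs, spend_diffs):
    """Ratio of summed paired differences (treatment minus control).

    Illustrative only: a naive, untrimmed ratio, not the paper's
    Trimmed Match estimator.
    """
    total_spend_diff = sum(spend_diffs)
    if total_spend_diff == 0:
        raise ValueError("total spend difference is zero; ratio is undefined")
    return sum(response_diffs) / total_spend_diff


# Hypothetical paired geo data: one entry per geo pair.
response_diffs = [120.0, 80.0, 200.0]  # response: treatment geo minus control geo
spend_diffs = [40.0, 30.0, 30.0]       # ad spend: treatment geo minus control geo

estimate = empirical_iroas(response_diffs, spend_diffs)
print(estimate)
```

Because a ratio of sums like this can be dominated by a single pair with an unusually large difference, a small number of heterogeneous geos can make it unstable, which is the difficulty the abstract describes.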
-
Random Forests, Decision Trees, and Categorical Predictors: The "Absent Levels" Problem
Authors:
Timothy C. Au
Abstract:
One advantage of decision tree based methods like random forests is their ability to natively handle categorical predictors without having to first transform them (e.g., by using feature engineering techniques). However, in this paper, we show how this capability can lead to an inherent "absent levels" problem for decision tree based methods that has never been thoroughly discussed, and whose consequences have never been carefully explored. This problem occurs whenever there is an indeterminacy over how to handle an observation that has reached a categorical split which was determined when the observation in question's level was absent during training. Although these incidents may appear to be innocuous, by using Leo Breiman and Adele Cutler's random forests FORTRAN code and the randomForest R package (Liaw and Wiener, 2002) as motivating case studies, we examine how overlooking the absent levels problem can systematically bias a model. Furthermore, by using three real data examples, we illustrate how absent levels can dramatically alter a model's performance in practice, and we empirically demonstrate how some simple heuristics can be used to help mitigate the effects of the absent levels problem until a more robust theoretical solution is found.
Submitted 28 October, 2018; v1 submitted 12 June, 2017;
originally announced June 2017.
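The indeterminacy described in the abstract above can be seen in a toy categorical split. The sketch below is a hypothetical illustration, not code from the paper or from any random forest implementation: the function `route` and the `absent_policy` parameter are assumptions introduced here to show that, when a level was absent during training at a given split, the branch an observation takes depends entirely on an arbitrary convention.

```python
def route(level, left_levels, right_levels, absent_policy):
    """Send an observation down a categorical split.

    left_levels / right_levels are the levels seen at this node during
    training. absent_policy ("left" or "right") is a hypothetical
    convention for levels that were absent when the split was chosen.
    """
    if level in left_levels:
        return "left"
    if level in right_levels:
        return "right"
    if absent_policy in ("left", "right"):
        return absent_policy
    raise ValueError(f"no rule for absent level {level!r}")


# Split learned when only levels "a", "b", "c" were present at the node.
left_levels = {"a", "b"}
right_levels = {"c"}

# Level "d" was absent at training time, so the split says nothing about it.
print(route("d", left_levels, right_levels, absent_policy="left"))
print(route("d", left_levels, right_levels, absent_policy="right"))
```

The two calls route the same observation to different children, so any systematic default of this kind can shift predictions for every observation carrying an absent level.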