Showing 1–50 of 56 results for author: Chan, C

  1. arXiv:2402.08873  [pdf, ps, other]

    stat.ME

    Balancing Method for Non-monotone Missing Data

    Authors: Jianing Dong, Raymond K. W. Wong, Kwun Chuen Gary Chan

    Abstract: Covariate balancing methods have been widely applied to single or monotone missing patterns and have certain advantages over likelihood-based methods and inverse probability weighting approaches based on standard logistic regression. In this paper, we consider non-monotone missing data under the complete-case missing variable condition (CCMV), which is a case of missing not at random (MNAR). Using…

    Submitted 13 February, 2024; originally announced February 2024.

  2. arXiv:2312.10072  [pdf, other]

    cs.HC cs.AI cs.LG stat.AP

    Assessing the Usability of GutGPT: A Simulation Study of an AI Clinical Decision Support System for Gastrointestinal Bleeding Risk

    Authors: Colleen Chan, Kisung You, Sunny Chung, Mauro Giuffrè, Theo Saarinen, Niroop Rajashekar, Yuan Pu, Yeo Eun Shin, Loren Laine, Ambrose Wong, René Kizilcec, Jasjeet Sekhon, Dennis Shung

    Abstract: Applications of large language models (LLMs) like ChatGPT have potential to enhance clinical decision support through conversational interfaces. However, challenges of human-algorithmic interaction and clinician trust are poorly understood. GutGPT, a LLM for gastrointestinal (GI) bleeding risk prediction and management guidance, was deployed in clinical simulation scenarios alongside the electroni…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10, 2023, New Orleans, United States, 11 pages

  3. arXiv:2311.04871  [pdf, other]

    stat.ME

    Integration of Summary Information from External Studies for Semiparametric Models

    Authors: Jianxuan Zang, K. C. G. Chan, Fei Gao

    Abstract: With the development of biomedical science, researchers have increasing access to an abundance of studies focusing on similar research questions. There is a growing interest in the integration of summary information from those studies to enhance the efficiency of estimation in their own internal studies. In this work, we present a comprehensive framework on integration of summary information from…

    Submitted 9 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

  4. arXiv:2311.00210  [pdf, other]

    stat.ME stat.AP stat.OT

    Broken Adaptive Ridge Method for Variable Selection in Generalized Partly Linear Models with Application to the Coronary Artery Disease Data

    Authors: Christian Chan, Xiaotian Dai, Thierry Chekouo, Quan Long, Xuewen Lu

    Abstract: Motivated by the CATHGEN data, we develop a new statistical learning method for simultaneous variable selection and parameter estimation under the context of generalized partly linear models for data with high-dimensional covariates. The method is referred to as the broken adaptive ridge (BAR) estimator, which is an approximation of the $L_0$-penalized regression by iteratively performing reweight…

    Submitted 31 October, 2023; originally announced November 2023.

  5. arXiv:2310.14146  [pdf, other]

    stat.AP

    Cocaine Use Prediction with Tensor-based Machine Learning on Multimodal MRI Connectome Data

    Authors: Anru R. Zhang, Ryan P. Bell, Chen An, Runshi Tang, Shana A. Hall, Cliburn Chan, Kareem Al-Khalil, Christina S. Meade

    Abstract: This paper considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study utilized functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 individuals, which was then parcellated into 246 regions of interest (ROIs) using the Brainnetome atlas. After data preprocessing, the datasets were transformed in…

    Submitted 21 October, 2023; originally announced October 2023.

  6. arXiv:2309.08039  [pdf, other]

    stat.ME math.ST

    Flexible Functional Treatment Effect Estimation

    Authors: Jiayi Wang, Raymond K. W. Wong, Xiaoke Zhang, Kwun Chuen Gary Chan

    Abstract: We study treatment effect estimation with functional treatments where the average potential outcome functional is a function of functions, in contrast to continuous treatment effect estimation where the target is a function of real numbers. By considering a flexible scalar-on-function marginal structural model, a weight-modified kernel ridge regression (WMKRR) is adopted for estimation. The weight…

    Submitted 14 September, 2023; originally announced September 2023.

  7. arXiv:2309.00756  [pdf, other]

    stat.AP math.OC

    Learning Risk Preferences in Markov Decision Processes: an Application to the Fourth Down Decision in Football

    Authors: Nathan Sandholtz, Lucas Wu, Martin Puterman, Timothy C. Y. Chan

    Abstract: For decades, National Football League (NFL) coaches' observed fourth down decisions have been largely inconsistent with prescriptions based on statistical models. In this paper, we develop a framework to explain this discrepancy using a novel inverse optimization approach. We model the fourth down decision and the subsequent sequence of plays in a game as a Markov decision process (MDP), the dynam…

    Submitted 1 September, 2023; originally announced September 2023.

    Comments: 33 pages, 9 figures

  8. Miss It Like Messi: Extracting Value from Off-Target Shots in Soccer

    Authors: Ethan Baron, Nathan Sandholtz, Devin Pleuler, Timothy C. Y. Chan

    Abstract: Measuring soccer shooting skill is a challenging analytics problem due to the scarcity and highly contextual nature of scoring events. The introduction of more advanced data surrounding soccer shots has given rise to model-based metrics which better cope with these challenges. Specifically, metrics such as expected goals added, goals above expectation, and post-shot expected goals all use advanced…

    Submitted 24 December, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

    Journal ref: J. Quant. Anal. Sports (2024)

  9. arXiv:2303.11388  [pdf, other]

    stat.ME

    An Effective Multivariate Normality Test via Hessians of Empirical Cumulant Generating Functions

    Authors: Kwun Chuen Gary Chan, Hok Kan Ling, Chuan-Fa Tang, Sheung Chi Phillip Yam

    Abstract: In this article, we propose a new class of consistent tests for $p$-variate normality. These tests are based on the characterization of the standard multivariate normal distribution, that the Hessian of the corresponding cumulant generating function is identical to the $p\times p$ identity matrix and the idea of decomposing the information from the joint distribution into the dependence copula and…

    Submitted 20 March, 2023; originally announced March 2023.

  10. RANG: Reconstructing reproducible R computational environments

    Authors: Chung-hong Chan, David Schoch

    Abstract: A complete declarative description of the computational environment is often missing when researchers share their materials. Without such description, software obsolescence and missing system components can jeopardize computational reproducibility in the future, even when data and computer code are available. The R package rang is a complete solution for generating the declarative description for…

    Submitted 8 March, 2023; originally announced March 2023.

  11. arXiv:2302.03172  [pdf, ps, other]

    econ.EM stat.CO stat.ME

    High-Dimensional Conditionally Gaussian State Space Models with Missing Data

    Authors: Joshua C. C. Chan, Aubrey Poon, Dan Zhu

    Abstract: We develop an efficient sampling approach for handling complex missing data patterns and a large number of missing observations in conditionally Gaussian state space models. Two important examples are dynamic factor models with unbalanced datasets and large Bayesian VARs with variables in multiple frequencies. A key insight underlying the proposed approach is that the joint distribution of the mis…

    Submitted 6 February, 2023; originally announced February 2023.

  12. arXiv:2208.13255  [pdf, ps, other]

    econ.EM stat.ME

    Comparing Stochastic Volatility Specifications for Large Bayesian VARs

    Authors: Joshua C. C. Chan

    Abstract: Large Bayesian vector autoregressions with various forms of stochastic volatility have become increasingly popular in empirical macroeconomics. One main difficulty for practitioners is to choose the most suitable stochastic volatility specification for their particular application. We develop Bayesian model comparison methods -- based on marginal likelihood estimators that combine conditional Mont…

    Submitted 28 August, 2022; originally announced August 2022.

  13. arXiv:2208.02806  [pdf, other]

    stat.ME

    A tree perspective on stick-breaking models in covariate-dependent mixtures

    Authors: Akira Horiguchi, Cliburn Chan, Li Ma

    Abstract: Stick-breaking (SB) processes are often adopted in Bayesian mixture models for generating mixing weights. When covariates influence the sizes of clusters, SB mixtures are particularly convenient as they can leverage their connection to binary regression to ease both the specification of covariate effects and posterior computation. Existing SB models are typically constructed based on continually b…

    Submitted 20 June, 2023; v1 submitted 4 August, 2022; originally announced August 2022.

    Comments: 44 pages, 10 figures

  14. arXiv:2207.03597  [pdf, other]

    stat.ME

    Nonparametric Estimation of the Potential Impact Fraction and Population Attributable Fraction with Individual-Level and Aggregated Data

    Authors: Colleen E. Chan, Rodrigo Zepeda-Tello, Dalia Camacho-García-Formentí, Frederick Cudhea, Rafael Meza, Eliane Rodrigues, Donna Spiegelman, Tonatiuh Barrientos-Gutierrez, Xin Zhou

    Abstract: The estimation of the potential impact fraction (including the population attributable fraction) with continuous exposure data frequently relies on strong distributional assumptions. However, these assumptions are often violated if the underlying exposure distribution is unknown or if the same distribution is assumed across time or space. Nonparametric methods to estimate the potential impact frac…

    Submitted 24 January, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

  15. arXiv:2204.00896  [pdf]

    stat.AP

    Equity, diversity, and inclusion in sports analytics

    Authors: Craig Fernandes, Jason D. Vescovi, Richard Norman, Cheri L. Bradish, Nathan Taback, Timothy C. Y. Chan

    Abstract: This paper presents a landmark study of equity, diversity and inclusion (EDI) in the field of sports analytics. We developed a survey that examined personal and job-related demographics, as well as individual perceptions and experiences about EDI in the workplace. We sent the survey to individuals in the five major North American professional leagues, representatives from the Olympic and Paralympi…

    Submitted 14 June, 2022; v1 submitted 2 April, 2022; originally announced April 2022.

  16. arXiv:2201.07303  [pdf, ps, other]

    econ.EM stat.ME

    Large Hybrid Time-Varying Parameter VARs

    Authors: Joshua C. C. Chan

    Abstract: Time-varying parameter VARs with stochastic volatility are routinely used for structural analysis and forecasting in settings involving a few endogenous variables. Applying these models to high-dimensional datasets has proved to be challenging due to intensive computations and over-parameterization concerns. We develop an efficient Bayesian sparsification method for a class of models we call hybri…

    Submitted 16 June, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

  17. arXiv:2112.11315  [pdf, ps, other]

    econ.EM stat.CO

    Efficient Estimation of State-Space Mixed-Frequency VARs: A Precision-Based Approach

    Authors: Joshua C. C. Chan, Aubrey Poon, Dan Zhu

    Abstract: State-space mixed-frequency vector autoregressions are now widely used for nowcasting. Despite their popularity, estimating such models can be computationally intensive, especially for large systems with stochastic volatility. To tackle the computational challenges, we propose two novel precision-based samplers to draw the missing observations of the low-frequency variables in these models, buildi…

    Submitted 21 December, 2021; originally announced December 2021.

  18. arXiv:2112.03753  [pdf, other]

    cs.LG cs.AI stat.ML

    Tell me why! Explanations support learning relational and causal structure

    Authors: Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y. Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C. Rabinowitz, Jane X. Wang, Felix Hill

    Abstract: Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational a…

    Submitted 25 May, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICML 2022; 23 pages

    ACM Class: I.2.6

  19. arXiv:2111.07225  [pdf, ps, other]

    econ.EM stat.ME

    Large Order-Invariant Bayesian VARs with Stochastic Volatility

    Authors: Joshua C. C. Chan, Gary Koop, Xuewen Yu

    Abstract: Many popular specifications for Vector Autoregressions (VARs) with multivariate stochastic volatility are not invariant to the way the variables are ordered due to the use of a Cholesky decomposition for the error covariance matrix. We show that the order invariance problem in existing approaches is likely to become more serious in large VARs. We propose the use of a specification which avoids the…

    Submitted 13 November, 2021; originally announced November 2021.

  20. arXiv:2111.07170  [pdf, ps, other]

    econ.EM math.ST stat.CO

    Asymmetric Conjugate Priors for Large Bayesian VARs

    Authors: Joshua C. C. Chan

    Abstract: Large Bayesian VARs are now widely used in empirical macroeconomics. One popular shrinkage prior in this setting is the natural conjugate prior as it facilitates posterior simulation and leads to a range of useful analytical results. This is, however, at the expense of modeling flexibility, as it rules out cross-variable shrinkage -- i.e., shrinking coefficients on lags of other variables more agg…

    Submitted 13 November, 2021; originally announced November 2021.

  21. arXiv:2110.06077  [pdf, other]

    stat.ME stat.AP

    Data Harmonization Via Regularized Nonparametric Mixing Distribution Estimation

    Authors: Steven Wilkins-Reeves, Yen-Chi Chen, Kwun Chuen Gary Chan

    Abstract: Data harmonization is the process by which an equivalence is developed between two variables measuring a common trait. Our problem is motivated by dementia research in which multiple tests are used in practice to measure the same underlying cognitive ability such as language or memory. We connect this statistical problem to mixing distribution estimation. We introduce and study a non-parametric la…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: 46 pages, 15 figures

    MSC Class: 62G05

  22. arXiv:2110.01527  [pdf, other]

    math.OC stat.AP

    A Markov process approach to untangling intention versus execution in tennis

    Authors: Timothy C. Y. Chan, Douglas S. Fearing, Craig Fernandes, Stephanie Kovalchik

    Abstract: Value functions are used in sports applications to determine the optimal action players should employ. However, most literature implicitly assumes that the player can perform the prescribed action with known and fixed probability of success. The effect of varying this probability or, equivalently, "execution error" in implementing an action (e.g., hitting a tennis ball to a specific location on th…

    Submitted 4 October, 2021; originally announced October 2021.

  23. arXiv:2106.05850  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Matrix Completion with Model-free Weighting

    Authors: Jiayi Wang, Raymond K. W. Wong, Xiaojun Mao, Kwun Chuen Gary Chan

    Abstract: In this paper, we propose a novel method for matrix completion under general non-uniform missing structures. By controlling an upper bound of a novel balancing error, we construct weights that can actively adjust for the non-uniformity in the empirical risk without explicitly modeling the observation probabilities, and can be computed efficiently via convex optimization. The recovered matrix based…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  24. arXiv:2103.03437  [pdf, other]

    stat.ME

    Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing

    Authors: Jiayi Wang, Raymond K. W. Wong, Shu Yang, Kwun Chuen Gary Chan

    Abstract: We study nonparametric estimation for the partially conditional average treatment effect, defined as the treatment effect function over an interested subset of confounders. We propose a hybrid kernel weighting estimator where the weights aim to control the balancing error of any function of the confounders from a reproducing kernel Hilbert space after kernel smoothing over the subset of interested…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 19 pages, 2 figures

  25. arXiv:2010.12797  [pdf, other]

    cs.LG cs.GT cs.MA stat.ML

    Collaborative Machine Learning with Incentive-Aware Model Rewards

    Authors: Rachael Hwee Ling Sim, Yehong Zhang, Mun Choon Chan, Bryan Kian Hsiang Low

    Abstract: Collaborative machine learning (ML) is an appealing paradigm to build high-quality ML models by training on the aggregated data from many parties. However, these parties are only willing to share their data when given enough incentives, such as a guaranteed fair reward based on their contributions. This motivates the need for measuring a party's contribution and designing an incentive-aware reward…

    Submitted 24 October, 2020; originally announced October 2020.

    Comments: 37th International Conference on Machine Learning (ICML 2020), Extended version with proofs and additional experimental results, 17 pages

  26. arXiv:2010.00061  [pdf, other]

    stat.ME

    Defining and Estimating Subgroup Mediation Effects with Semi-Competing Risks Data

    Authors: Fei Gao, Fan Xia, Kwun Chuen Gary Chan

    Abstract: In many medical studies, an ultimate failure event such as death is likely to be affected by the occurrence and timing of other intermediate clinical events. Both event times are subject to censoring by loss-to-follow-up but the nonterminal event may further be censored by the occurrence of the primary outcome, but not vice versa. To study the effect of an intervention on both events, the intermed…

    Submitted 15 January, 2021; v1 submitted 30 September, 2020; originally announced October 2020.

  27. arXiv:2006.11601  [pdf, other]

    cs.LG cs.CR cs.DC stat.ML

    Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks

    Authors: Lixin Fan, Kam Woh Ng, Ce Ju, Tianyu Zhang, Chang Liu, Chee Seng Chan, Qiang Yang

    Abstract: This paper investigates capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and privacy losses incurred by reconstruction, tracing and membership attacks. Second, we formulate reconstruction attacks as solving a noisy system of linear equations, and prove that a…

    Submitted 23 June, 2020; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: under review, 36 pages (updated Eq. 3 and Fig. 8)

  28. arXiv:2006.11408  [pdf, other]

    cs.CV cs.LG stat.ML

    Quasi-conformal Geometry based Local Deformation Analysis of Lateral Cephalogram for Childhood OSA Classification

    Authors: Hei-Long Chan, Hoi-Man Yuen, Chun-Ting Au, Kate Ching-Ching Chan, Albert Martin Li, Lok-Ming Lui

    Abstract: Craniofacial profile is one of the anatomical causes of obstructive sleep apnea (OSA). By medical research, cephalometry provides information on patients' skeletal structures and soft tissues. In this work, a novel approach to cephalometric analysis using quasi-conformal geometry based local deformation information was proposed for OSA classification. Our study was a retrospective analysis based on…

    Submitted 31 May, 2020; originally announced June 2020.

  29. arXiv:2002.06247  [pdf, other]

    cs.LG eess.SY stat.ML

    Robust Policies For Proactive ICU Transfers

    Authors: Julien Grand-Clement, Carri W. Chan, Vineet Goyal, Gabriel Escobar

    Abstract: Patients whose transfer to the Intensive Care Unit (ICU) is unplanned are prone to higher mortality rates than those who were admitted directly to the ICU. Recent advances in machine learning to predict patient deterioration have introduced the possibility of \emph{proactive transfer} from the ward to the ICU. In this work, we study the problem of finding \emph{robust} patient transfer policies wh…

    Submitted 22 January, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

  30. arXiv:2001.06451  [pdf, other]

    stat.AP

    Coarsened mixtures of hierarchical skew normal kernels for flow cytometry analyses

    Authors: Shai Gorsky, Cliburn Chan, Li Ma

    Abstract: Flow cytometry (FCM) is the standard multi-parameter assay for measuring single cell phenotype and functionality. It is commonly used for quantifying the relative frequencies of cell subsets in blood and disaggregated tissues. A typical analysis of FCM data involves cell classification---that is, the identification of cell subgroups in the sample---and comparisons of the cell subgroups across samp…

    Submitted 31 August, 2020; v1 submitted 17 January, 2020; originally announced January 2020.

  31. arXiv:1912.05663  [pdf, other]

    stat.ML cs.AI cs.LG

    Measuring the Reliability of Reinforcement Learning Algorithms

    Authors: Stephanie C. Y. Chan, Samuel Fishman, John Canny, Anoop Korattikara, Sergio Guadarrama

    Abstract: Lack of reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production users with the evaluation and improvement of reliability, we propose a set of metrics that quantitatively measure different aspects of reliability. In this work, w…

    Submitted 12 February, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Accepted for publication at ICLR 2020 (spotlight)

  32. arXiv:1912.00074  [pdf, other]

    cs.LG cs.AI stat.ML

    Quadratic Q-network for Learning Continuous Control for Autonomous Vehicles

    Authors: Pin Wang, Hanhan Li, Ching-Yao Chan

    Abstract: Reinforcement Learning algorithms have recently been proposed to learn time-sequential control policies in the field of autonomous driving. Direct applications of Reinforcement Learning algorithms with discrete action space will yield unsatisfactory results at the operational level of driving where continuous control actions are actually required. In addition, the design of neural networks often f…

    Submitted 29 November, 2019; originally announced December 2019.

    Comments: Machine Learning for Autonomous Driving Workshop on NeurIPS, 2019

  33. arXiv:1906.07757  [pdf]

    stat.ME

    TEAM: A Multiple Testing Algorithm on the Aggregation Tree for Flow Cytometry Analysis

    Authors: John Pura, Xuechan Li, Cliburn Chan, Jichun Xie

    Abstract: In immunology studies, flow cytometry is a commonly used multivariate single-cell assay. One key goal in flow cytometry analysis is to pinpoint the immune cells responsive to certain stimuli. Statistically, this problem can be translated into comparing two protein expression probability density functions (PDFs) before and after the stimulus; the goal is to pinpoint the regions where these two pdfs…

    Submitted 26 March, 2021; v1 submitted 18 June, 2019; originally announced June 2019.

  34. arXiv:1906.02815  [pdf]

    cs.LG cs.RO stat.ML

    Intention-aware Long Horizon Trajectory Prediction of Surrounding Vehicles using Dual LSTM Networks

    Authors: Long Xin, Pin Wang, Ching-Yao Chan, Jianyu Chen, Shengbo Eben Li, Bo Cheng

    Abstract: As autonomous vehicles (AVs) need to interact with other road users, it is of importance to comprehensively understand the dynamic traffic environment, especially the future possible trajectories of surrounding vehicles. This paper presents an algorithm for long-horizon trajectory prediction of surrounding vehicles using a dual long short term memory (LSTM) network, which is capable of effectively…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: Published at the 21st International Conference on Intelligent Transportation Systems (ITSC), 2018

  35. arXiv:1906.02636  [pdf, other]

    stat.AP math.OC

    An Inverse Optimization Approach to Measuring Clinical Pathway Concordance

    Authors: Timothy C. Y. Chan, Maria Eberg, Katharina Forster, Claire Holloway, Luciano Ieraci, Yusuf Shalaby, Nasrin Yousefi

    Abstract: Clinical pathways outline standardized processes in the delivery of care for a specific disease. Patient journeys through the healthcare system, though, can deviate substantially from these pathways. Given the positive benefits of clinical pathways, it is important to measure the concordance of patient pathways so that variations in health system performance or bottlenecks in the delivery of care…

    Submitted 15 January, 2021; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: 61 pages

  36. arXiv:1906.02275  [pdf]

    cs.RO cs.LG stat.ML

    Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm

    Authors: Pin Wang, Hanhan Li, Ching-Yao Chan

    Abstract: Lane change is a challenging task which requires delicate actions to ensure safety and comfort. Some recent studies have attempted to solve the lane-change control problem with Reinforcement Learning (RL), yet the action is confined to discrete action space. To overcome this limitation, we formulate the lane change behavior with continuous action in a model-free dynamic driving environment based o…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Published at the 30th IEEE Intelligent Vehicles Symposium (IV), 2019

  37. arXiv:1904.10171  [pdf]

    cs.RO cs.LG stat.ML

    Driving Decision and Control for Autonomous Lane Change based on Deep Reinforcement Learning

    Authors: Tianyu Shi, Pin Wang, Xuxin Cheng, Ching-Yao Chan, Ding Huang

    Abstract: We apply Deep Q-network (DQN) with the consideration of safety during the task for deciding whether to conduct the maneuver. Furthermore, we design two similar Deep Q learning frameworks with quadratic approximator for deciding how to select a comfortable gap and just follow the preceding vehicle. Finally, a polynomial lane change trajectory is generated and Pure Pursuit Control is implemented for…

    Submitted 30 July, 2019; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: This Paper has been submitted to ITSC 2019

  38. arXiv:1903.06661  [pdf, other]

    cs.LG stat.ML

    GEE: A Gradient-based Explainable Variational Autoencoder for Network Anomaly Detection

    Authors: Quoc Phong Nguyen, Kar Wai Lim, Dinil Mon Divakaran, Kian Hsiang Low, Mun Choon Chan

    Abstract: This paper looks into the problem of detecting network anomalies by analyzing NetFlow records. While many previous works have used statistical models and machine learning techniques in a supervised way, such solutions have the limitations that they require large amount of labeled data for training and are unlikely to detect zero-day attacks. Existing anomaly detection solutions also do not provide…

    Submitted 15 March, 2019; originally announced March 2019.

    Comments: to appear in 2019 IEEE Conference on Communications and Network Security (CNS)

  39. arXiv:1901.11212  [pdf]

    cs.RO cs.LG stat.ML

    A Data Driven Method of Optimizing Feedforward Compensator for Autonomous Vehicle

    Authors: Tianyu Shi, Pin Wang, Ching-Yao Chan, Chonghao Zou

    Abstract: A reliable controller is critical and essential for the execution of safe and smooth maneuvers of an autonomous vehicle. The controller must be robust to external disturbances, such as road surface, weather, and wind conditions, and so on. It also needs to deal with the internal parametric variations of vehicle sub-systems, including power-train efficiency, measurement errors, time delay, so on. Moreo…

    Submitted 30 April, 2019; v1 submitted 31 January, 2019; originally announced January 2019.

    Comments: This paper have been submitted to the 2019 IEEE Intelligent Vehicle Symposium

  40. arXiv:1901.08551  [pdf]

    cs.LG stat.ML

    A Universal Logic Operator for Interpretable Deep Convolution Networks

    Authors: KamWoh Ng, Lixin Fan, Chee Seng Chan

    Abstract: Explaining neural network computation in terms of probabilistic/fuzzy logical operations has attracted much attention due to its simplicity and high interpretability. Different choices of logical operators such as AND, OR and XOR give rise to another dimension for network optimization, and in this paper, we study the open problem of learning a universal logical operator without prescribing to any…

    Submitted 20 January, 2019; originally announced January 2019.

    Comments: In AAAI-19 Workshop on Network Interpretability for Deep Learning

  41. arXiv:1807.06489  [pdf, other]

    cs.LG physics.med-ph stat.ML

    Automated Treatment Planning in Radiation Therapy using Generative Adversarial Networks

    Authors: Rafid Mahmood, Aaron Babier, Andrea McNiven, Adam Diamant, Timothy C. Y. Chan

    Abstract: Knowledge-based planning (KBP) is an automated approach to radiation therapy treatment planning that involves predicting desirable treatment plans before they are then corrected to deliverable ones. We propose a generative adversarial network (GAN) approach for predicting desirable 3D dose distributions that eschews the previous paradigms of site-specific feature engineering and predicting low-dim…

    Submitted 17 July, 2018; originally announced July 2018.

    Comments: 15 pages. Accepted for publication in PMLR. Presented at Machine Learning for Health Care

  42. arXiv:1807.00931  [pdf, other]

    stat.ME

    Controlling the False Discovery Rate for Binary Feature Selection via Knockoff

    Authors: Yuxiang Xie, Kwun Chuen Gary Chan

    Abstract: Variable selection has been widely used in data analysis for the past decades, and it becomes increasingly important in the Big Data era as there are usually hundreds of variables available in a dataset. To enhance interpretability of a model, identifying potentially relevant features is often a step before fitting all the features into a regression model. A good variable selection method should e…

    Submitted 13 August, 2020; v1 submitted 2 July, 2018; originally announced July 2018.

    MSC Class: 62

  43. arXiv:1805.09293  [pdf, other]

    cs.LG math.OC stat.ML

    Learning to Optimize Contextually Constrained Problems for Real-Time Decision-Generation

    Authors: Aaron Babier, Timothy C. Y. Chan, Adam Diamant, Rafid Mahmood

    Abstract: The topic of learning to solve optimization problems has received interest from both the operations research and machine learning communities. In this work, we combine techniques from both fields to address the problem of learning to generate decisions to instances of continuous optimization problems where the feasible set varies with contextual features. We propose a novel framework for training…

    Submitted 21 April, 2022; v1 submitted 23 May, 2018; originally announced May 2018.

    Comments: 72 pages

  44. arXiv:1709.10041  [pdf, other]

    stat.ML

    Bayesian Multi Plate High Throughput Screening of Compounds

    Authors: Ivo D. Shterev, David B. Dunson, Cliburn Chan, Gregory D. Sempowski

    Abstract: High throughput screening of compounds (chemicals) is an essential part of drug discovery [7], involving thousands to millions of compounds, with the purpose of identifying candidate hits. Most statistical tools, including the industry standard B-score method, work on individual compound plates and do not exploit cross-plate correlation or statistical strength among plates. We present a new statis…

    Submitted 28 September, 2017; originally announced September 2017.

  45. The Message or the Messenger? Inferring Virality and Diffusion Structure from Online Petition Signature Data

    Authors: Chi Ling Chan, Justin Lai, Bryan Hooi, Todd Davies

    Abstract: Goel et al. (2016) examined diffusion data from Twitter to conclude that online petitions are shared more virally than other types of content. Their definition of structural virality, which measures the extent to which diffusion follows a broadcast model or is spread person to person (virally), depends on knowing the topology of the diffusion cascade. But often the diffusion structure cannot be ob…

    Submitted 11 August, 2017; originally announced August 2017.

    Comments: 19 pages, 6 figures, 4 tables, to appear in Giovanni Luca Ciampaglia, Afra J. Mashhadi, and Taha Yasseri (Editors), Social Informatics: Proceedings of the 9th International Conference, SocInfo 2017 (Oxford, UK, September 13-15), Springer LNCS, 2017

    MSC Class: 62P25 ACM Class: H.1.2; H.2.8; J.4; K.4.0

    Journal ref: Lecture Notes in Computer Science 10539:499-517, 2017

  46. arXiv:1606.08759  [pdf, other]

    stat.ME

    Bayesian analysis of immune response dynamics with sparse time series data

    Authors: Fernando V. Bonassi, Cliburn Chan, Mike West

    Abstract: In vaccine development, the temporal profiles of relative abundance of subtypes of immune cells (T-cells) is key to understanding vaccine efficacy. Complex and expensive experimental studies generate very sparse time series data on this immune response. Fitting multi-parameter dynamic models of the immune response dynamics-- central to evaluating mechanisms underlying vaccine efficacy-- is challen…

    Submitted 28 June, 2016; originally announced June 2016.

    Comments: Main paper: 14 pages, 10 figures. Supplementary material: 16 pages, 20 figures

    MSC Class: 62P10; 62M99

  47. arXiv:1601.03501  [pdf, ps, other]

    stat.ME

    Efficient nonparametric estimation of causal mediation effects

    Authors: K. C. G. Chan, K. Imai, S. C. P. Yam, Z. Zhang

    Abstract: An essential goal of program evaluation and scientific research is the investigation of causal mechanisms. Over the past several decades, causal mediation analysis has been used in medical and social sciences to decompose the treatment effect into the natural direct and indirect effects. However, all of the existing mediation analysis methods rely on parametric modeling assumptions in one way or a…

    Submitted 14 January, 2016; originally announced January 2016.

    Comments: Nonparametric Estimation, Natural direct effects, Natural indirect effects, Treatment effects, Semiparametric efficiency

    MSC Class: 62G05

  48. Oracle, Multiple Robust and Multipurpose Calibration in a Missing Response Problem

    Authors: Kwun Chuen Gary Chan, Sheung Chi Phillip Yam

    Abstract: In the presence of a missing response, reweighting the complete case subsample by the inverse of nonmissing probability is both intuitive and easy to implement. When the population totals of some auxiliary variables are known and when the inclusion probabilities are known by design, survey statisticians have developed calibration methods for improving efficiencies of the inverse probability weight…

    Submitted 15 October, 2014; originally announced October 2014.

    Comments: Published at http://dx.doi.org/10.1214/13-STS461 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-STS-STS461

    Journal ref: Statistical Science 2014, Vol. 29, No. 3, 380-396

  49. arXiv:1410.3756  [pdf, other]

    cs.CV stat.ML

    Crowd Saliency Detection via Global Similarity Structure

    Authors: Mei Kuan Lim, Ven Jyn Kok, Chen Change Loy, Chee Seng Chan

    Abstract: It is common for CCTV operators to overlook interesting events taking place within the crowd due to large number of people in the crowded scene (i.e. marathon, rally). Thus, there is a dire need to automate the detection of salient crowd regions acquiring immediate attention for a more effective and proactive surveillance. This paper proposes a novel framework to identify and localize salient re…

    Submitted 14 October, 2014; originally announced October 2014.

    Comments: Accepted in ICPR 2014 (Oral). Mei Kuan Lim and Ven Jyn Kok share equal contributions

  50. arXiv:1410.3752  [pdf, ps, other]

    cs.CV stat.ML

    Enhanced Random Forest with Image/Patch-Level Learning for Image Understanding

    Authors: Wai Lam Hoo, Tae-Kyun Kim, Yuru Pei, Chee Seng Chan

    Abstract: Image understanding is an important research domain in the computer vision due to its wide real-world applications. For an image understanding framework that uses the Bag-of-Words model representation, the visual codebook is an essential part. Random forest (RF) as a tree-structure discriminative codebook has been a popular choice. However, the performance of the RF can be degraded if the local pa…

    Submitted 14 October, 2014; originally announced October 2014.

    Comments: Accepted in ICPR 2014 (Oral)