Showing 1–50 of 105 results for author: Wong, W

Searching in archive stat.
  1. arXiv:2406.15170  [pdf, other]

    stat.ME

    Inference for Delay Differential Equations Using Manifold-Constrained Gaussian Processes

    Authors: Yuxuan Zhao, Samuel W. K. Wong

    Abstract: Dynamic systems described by differential equations often involve feedback among system components. When there are time delays for components to sense and respond to feedback, delay differential equation (DDE) models are commonly used. This paper considers the problem of inferring unknown system parameters, including the time delays, from noisy and sparse experimental data observed from the system…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 42 pages, 8 figures

  2. arXiv:2405.12386  [pdf, other]

    stat.ML cs.LG stat.AP stat.CO

    Particle swarm optimization with Applications to Maximum Likelihood Estimation and Penalized Negative Binomial Regression

    Authors: Sisi Shao, Junhyung Park, Weng Kee Wong

    Abstract: General purpose optimization routines such as nlminb, optim (R) or nlmixed (SAS) are frequently used to estimate model parameters in nonstandard distributions. This paper presents Particle Swarm Optimization (PSO), as an alternative to many of the current algorithms used in statistics. We find that PSO can not only reproduce the same results as the above routines, it can also produce results that…

    Submitted 20 May, 2024; originally announced May 2024.

  3. arXiv:2402.10456  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Generative Modeling for Tabular Data via Penalized Optimal Transport Network

    Authors: Wenhui Sophia Lu, Chenyang Zhong, Wing Hung Wong

    Abstract: The task of precisely learning the probability distribution of rows within tabular data and producing authentic synthetic samples is both crucial and non-trivial. Wasserstein generative adversarial network (WGAN) marks a notable improvement in generative modeling, addressing the challenges faced by its predecessor, generative adversarial network. However, due to the mixed data types and multimodal…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 37 pages, 23 figures

  4. arXiv:2402.08873  [pdf, ps, other]

    stat.ME

    Balancing Method for Non-monotone Missing Data

    Authors: Jianing Dong, Raymond K. W. Wong, Kwun Chuen Gary Chan

    Abstract: Covariate balancing methods have been widely applied to single or monotone missing patterns and have certain advantages over likelihood-based methods and inverse probability weighting approaches based on standard logistic regression. In this paper, we consider non-monotone missing data under the complete-case missing variable condition (CCMV), which is a case of missing not at random (MNAR). Using…

    Submitted 13 February, 2024; originally announced February 2024.

  5. arXiv:2402.06058  [pdf, other]

    stat.ME

    Mathematical programming tools for randomization purposes in small two-arm clinical trials: A case study with real data

    Authors: Alan R. Vazquez, Weng Kee Wong

    Abstract: Modern randomization methods in clinical trials are invariably adaptive, meaning that the assignment of the next subject to a treatment group uses the accumulated information in the trial. Some of the recent adaptive randomization methods use mathematical programming to construct attractive clinical trials that balance the group features, such as their sizes and covariate distributions of their su…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 36 pages, 12 figures

  6. arXiv:2402.01900  [pdf, other]

    stat.ML cs.LG

    Distributional Off-policy Evaluation with Bellman Residual Minimization

    Authors: Sungee Hong, Zhengling Qi, Raymond K. W. Wong

    Abstract: We consider the problem of distributional off-policy evaluation which serves as the foundation of many distributional reinforcement learning (DRL) algorithms. In contrast to most existing works (that rely on supremum-extended statistical distances such as supremum-Wasserstein distance), we study the expectation-extended statistical distance for quantifying the distributional Bellman residuals and…

    Submitted 2 February, 2024; originally announced February 2024.

  7. arXiv:2401.04723  [pdf, other]

    stat.ME

    Spatio-temporal data fusion for the analysis of in situ and remote sensing data using the INLA-SPDE approach

    Authors: Shiyu He, Samuel W. K. Wong

    Abstract: We propose a Bayesian hierarchical model to address the challenge of spatial misalignment in spatio-temporal data obtained from in situ and satellite sources. The model is fit using the INLA-SPDE approach, which provides efficient computation. Our methodology combines the different data sources in a "fusion" model via the construction of projection matrices in both spatial and temporal domains. T…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: 23 pages, 7 figures

  8. arXiv:2312.13044  [pdf, other]

    stat.ME stat.CO

    Particle Gibbs for Likelihood-Free Inference of State Space Models with Application to Stochastic Volatility

    Authors: Zhaoran Hou, Samuel W. K. Wong

    Abstract: State space models (SSMs) are widely used to describe dynamic systems. However, when the likelihood of the observations is intractable, parameter inference for SSMs cannot be easily carried out using standard Markov chain Monte Carlo or sequential Monte Carlo methods. In this paper, we propose a particle Gibbs sampler as a general strategy to handle SSMs with intractable likelihoods in the approxi…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 23 pages

  9. arXiv:2311.03497  [pdf, other]

    stat.AP

    Understanding the Impact of Seasonal Climate Change on Canada's Economy by Region and Sector

    Authors: Shiyu He, Trang Bui, Yuying Huang, Wenling Zhang, Jie Jian, Samuel W. K. Wong, Tony S. Wirjanto

    Abstract: To assess the impact of climate change on the Canadian economy, we investigate and model the relationship between seasonal climate variables and economic growth across provinces and economic sectors. We further provide projections of climate change impacts up to the year 2050, taking into account the diverse climate change patterns and economic conditions across Canada. Our results indicate that r…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 25 pages, 7 figures

  10. arXiv:2310.20537  [pdf, other]

    stat.ME stat.ML

    Directed Cyclic Graph for Causal Discovery from Multivariate Functional Data

    Authors: Saptarshi Roy, Raymond K. W. Wong, Yang Ni

    Abstract: Discovering causal relationship using multivariate functional data has received a significant amount of attention very recently. In this article, we introduce a functional linear structural equation model for causal structure learning when the underlying graph involving the multivariate functions may have cycles. To enhance interpretability, our model involves a low-dimensional causal embedded spa…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: 36 pages, 2 figures, 7 tables

  11. arXiv:2310.07801  [pdf, other]

    cs.CV cs.AI stat.ME

    Trajectory-aware Principal Manifold Framework for Data Augmentation and Image Generation

    Authors: Elvis Han Cui, Bingbin Li, Yanan Li, Weng Kee Wong, Donghui Wang

    Abstract: Data augmentation for deep learning benefits model training, image transformation, medical imaging analysis and many other fields. Many existing methods generate new samples from a parametric distribution, like the Gaussian, with little attention to generate samples along the data manifold in either the input or feature space. In this paper, we verify that there are theoretical and practical advan…

    Submitted 30 July, 2023; originally announced October 2023.

    Comments: 20 figures

  12. arXiv:2310.01402  [pdf, other]

    stat.ME

    Evaluating the Decency and Consistency of Data Validation Tests Generated by LLMs

    Authors: Rohan Alexander, Lindsay Katz, Callandra Moore, Michael Wing-Cheung Wong, Zane Schwartz

    Abstract: We investigated whether large language models (LLMs) can develop data validation tests. We considered 96 conditions each for both GPT-3.5 and GPT-4, examining different prompt scenarios, learning modes, temperature settings, and roles. The prompt scenarios were: 1) Asking for expectations, 2) Asking for expectations with a given context, 3) Asking for expectations after requesting a data simulatio…

    Submitted 1 April, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 36 pages, 18 figures

  13. arXiv:2309.08039  [pdf, other]

    stat.ME math.ST

    Flexible Functional Treatment Effect Estimation

    Authors: Jiayi Wang, Raymond K. W. Wong, Xiaoke Zhang, Kwun Chuen Gary Chan

    Abstract: We study treatment effect estimation with functional treatments where the average potential outcome functional is a function of functions, in contrast to continuous treatment effect estimation where the target is a function of real numbers. By considering a flexible scalar-on-function marginal structural model, a weight-modified kernel ridge regression (WMKRR) is adopted for estimation. The weight…

    Submitted 14 September, 2023; originally announced September 2023.

  14. arXiv:2304.02127  [pdf, other]

    stat.ME

    A Bayesian Collocation Integral Method for Parameter Estimation in Ordinary Differential Equations

    Authors: Mingwei Xu, Samuel W. K. Wong, Peijun Sang

    Abstract: Inferring the parameters of ordinary differential equations (ODEs) from noisy observations is an important problem in many scientific fields. Currently, most parameter estimation methods that bypass numerical integration tend to rely on basis functions or Gaussian processes to approximate the ODE solution and its derivatives. Due to the sensitivity of the ODE solution to its derivatives, these met…

    Submitted 23 October, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

  15. Bayesian Nonlinear Tensor Regression with Functional Fused Elastic Net Prior

    Authors: Shuoli Chen, Kejun He, Shiyuan He, Yang Ni, Raymond K. W. Wong

    Abstract: Tensor regression methods have been widely used to predict a scalar response from covariates in the form of a multiway array. In many applications, the regions of tensor covariates used for prediction are often spatially connected with unknown shapes and discontinuous jumps on the boundaries. Moreover, the relationship between the response and the tensor covariates can be nonlinear. In this articl…

    Submitted 16 February, 2023; originally announced February 2023.

    Journal ref: Technometrics, 65:4, 524-536 (2023)

  16. arXiv:2301.12540  [pdf, other]

    stat.ML cs.LG

    Implicit Regularization for Group Sparsity

    Authors: Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

    Abstract: We study the implicit regularization of gradient descent towards structured sparsity via a novel neural reparameterization, which we call a diagonally grouped linear neural network. We show the following intriguing property of our reparameterization: gradient descent over the squared regression loss, without any explicit regularization, biases towards solutions with a group sparsity structure. In…

    Submitted 29 January, 2023; originally announced January 2023.

    Comments: accepted by ICLR 2023

  17. arXiv:2301.12302  [pdf, other]

    stat.AP

    A Kriging Metamodel with Adaptive Sampling for Seismic Evaluation of Podium Buildings

    Authors: Yuying Huang, Zhiyong Chen, Samuel W. K. Wong

    Abstract: In this paper, nonlinear time-history dynamic analyses of selected earthquake ground motions are conducted on designated wood-frame podium buildings and the resulting inter-story drifts are analyzed. We aim to construct a reliable region where performance-based seismic design criteria are met, such that a two-step analysis procedure can be used with high confidence. We develop a kriging metamodel…

    Submitted 28 January, 2023; originally announced January 2023.

    Comments: 14 pages, 2 figures

  18. arXiv:2212.05925  [pdf, other]

    stat.ML cs.LG

    CausalEGM: a general causal inference framework by encoding generative modeling

    Authors: Qiao Liu, Zhongren Chen, Wing Hung Wong

    Abstract: Although understanding and characterizing causal effects have become essential in observational studies, it is challenging when the confounders are high-dimensional. In this article, we develop a general framework $\textit{CausalEGM}$ for estimating causal effects by encoding generative modeling, which can be applied in both binary and continuous treatment settings. Under the potential outcome fra…

    Submitted 16 March, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

  19. arXiv:2210.14216  [pdf, other]

    stat.ME

    Estimating Boltzmann Averages for Protein Structural Quantities Using Sequential Monte Carlo

    Authors: Zhaoran Hou, Samuel W. K. Wong

    Abstract: Sequential Monte Carlo (SMC) methods are widely used to draw samples from intractable target distributions. Particle degeneracy can hinder the use of SMC when the target distribution is highly constrained or multimodal. As a motivating application, we consider the problem of sampling protein structures from the Boltzmann distribution. This paper proposes a general SMC method that propagates multip…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: 20 pages

  20. arXiv:2210.13323  [pdf, other]

    q-bio.PE stat.AP

    A Comparative Study of Compartmental Models for COVID-19 Transmission in Ontario, Canada

    Authors: Yuxuan Zhao, Samuel W. K. Wong

    Abstract: The number of confirmed COVID-19 cases reached over 1.3 million in Ontario, Canada by June 4, 2022. The continued spread of the virus underlying COVID-19 has been spurred by the emergence of variants since the initial outbreak in December, 2019. Much attention has thus been devoted to tracking and modelling the transmission of COVID-19. Compartmental models are commonly used to mimic epidemic tran…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 26 pages, 8 figures

  21. arXiv:2206.12891  [pdf, other]

    stat.ME

    Hierarchical nuclear norm penalization for multi-view data

    Authors: Sangyoon Yi, Raymond K. W. Wong, Irina Gaynanova

    Abstract: The prevalence of data collected on the same set of samples from multiple sources (i.e., multi-view data) has prompted significant development of data integration methods based on low-rank matrix factorizations. These methods decompose signal matrices from each view into the sum of shared and individual structures, which are further used for dimension reduction, exploratory analyses, and quantifyi…

    Submitted 26 June, 2022; originally announced June 2022.

    Comments: 39 pages, 10 figures, 3 tables

  22. arXiv:2203.06066  [pdf, other]

    stat.CO

    MAGI: A Package for Inference of Dynamic Systems from Noisy and Sparse Data via Manifold-constrained Gaussian Processes

    Authors: Samuel W. K. Wong, Shihao Yang, S. C. Kou

    Abstract: This article presents the MAGI software package for the inference of dynamic systems. The focus of MAGI is on dynamics modeled by nonlinear ordinary differential equations with unknown parameters. While such models are widely used in science and engineering, the available experimental data for parameter estimation may be noisy and sparse. Furthermore, some system components may be entirely unobser…

    Submitted 16 October, 2023; v1 submitted 11 March, 2022; originally announced March 2022.

    Comments: 47 pages, 10 figures

  23. arXiv:2201.07775  [pdf, other]

    stat.AP q-bio.BM

    Monte Carlo sampling of flexible protein structures: an application to the SARS-CoV-2 omicron variant

    Authors: Samuel W. K. Wong

    Abstract: Proteins can exhibit dynamic structural flexibility as they carry out their functions, especially in binding regions that interact with other molecules. For the key SARS-CoV-2 spike protein that facilitates COVID-19 infection, studies have previously identified several such highly flexible regions with therapeutic importance. However, protein structures available from the Protein Data Bank are pre…

    Submitted 4 February, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: 20 pages, 4 figures

  24. arXiv:2201.03464  [pdf, other]

    stat.AP

    Knots and their effect on the tensile strength of lumber: a case study

    Authors: Shuxian Fan, Samuel W. K. Wong, James V. Zidek

    Abstract: When assessing the strength of sawn lumber for use in engineering applications, the sizes and locations of knots are an important consideration. Knots are the most common visual characteristics of lumber, that result from the growth of tree branches. Large individual knots, as well as clusters of distinct knots, are known to have strength-reducing effects. However, industry grading rules that gove…

    Submitted 14 February, 2023; v1 submitted 10 January, 2022; originally announced January 2022.

    Comments: 20 pages, 4 figures

  25. arXiv:2111.14623  [pdf, other]

    cs.LG cs.CY stat.AP

    An Overview of Healthcare Data Analytics With Applications to the COVID-19 Pandemic

    Authors: Zhe Fei, Yevgen Ryeznik, Oleksandr Sverdlov, Chee Wei Tan, Weng Kee Wong

    Abstract: In the era of big data, standard analysis tools may be inadequate for making inference and there is a growing need for more efficient and innovative ways to collect, process, analyze and interpret the massive and complex data. We provide an overview of challenges in big data problems and describe how innovative analytical methods, machine learning tools and metaheuristics can tackle general health…

    Submitted 25 November, 2021; originally announced November 2021.

    Journal ref: IEEE TRANSACTIONS ON BIG DATA, 12 August 2021

  26. arXiv:2110.11896  [pdf, other]

    stat.AP

    Multimodel Bayesian Analysis of Load Duration Effects in Lumber Reliability

    Authors: Yunfeng Yang, Martin Lysy, Samuel W. K. Wong

    Abstract: This paper evaluates the reliability of lumber, accounting for the duration-of-load (DOL) effect under different load profiles based on a multimodel Bayesian approach. Three individual DOL models previously used for reliability assessment are considered: the US model, the Canadian model, and the Gamma process model. Procedures for stochastic generation of residential, snow, and wind loads are also…

    Submitted 22 October, 2021; originally announced October 2021.

    Comments: 15 pages, 2 figures

  27. arXiv:2109.04640  [pdf, other]

    cs.LG stat.ME

    Projected State-action Balancing Weights for Offline Reinforcement Learning

    Authors: Jiayi Wang, Zhengling Qi, Raymond K. W. Wong

    Abstract: Offline policy evaluation (OPE) is considered a fundamental and challenging problem in reinforcement learning (RL). This paper focuses on the value estimation of a target policy based on pre-collected data generated from a possibly different policy, under the framework of infinite-horizon Markov decision processes. Motivated by the recently developed marginal importance sampling method in RL and t…

    Submitted 9 June, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

  28. arXiv:2108.05574  [pdf, other]

    stat.ML cs.LG

    Implicit Sparse Regularization: The Impact of Depth and Early Stopping

    Authors: Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

    Abstract: In this paper, we study the implicit bias of gradient descent for sparse regression. We extend results on regression with quadratic parametrization, which amounts to depth-2 diagonal linear networks, to more general depth-N networks, under more realistic settings of noise and correlated designs. We show that early stopping is crucial for gradient descent to converge to a sparse model, a phenomenon…

    Submitted 26 October, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

    Comments: 32 pages, accepted by NeurIPS 2021. arXiv admin note: text overlap with arXiv:1909.05122 by other authors

  29. arXiv:2106.05850  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Matrix Completion with Model-free Weighting

    Authors: Jiayi Wang, Raymond K. W. Wong, Xiaojun Mao, Kwun Chuen Gary Chan

    Abstract: In this paper, we propose a novel method for matrix completion under general non-uniform missing structures. By controlling an upper bound of a novel balancing error, we construct weights that can actively adjust for the non-uniformity in the empirical risk without explicitly modeling the observation probabilities, and can be computed efficiently via convex optimization. The recovered matrix based…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  30. arXiv:2105.14647  [pdf, ps, other]

    stat.ME

    Orthogonal Subsampling for Big Data Linear Regression

    Authors: Lin Wang, Jake Elmstedt, Weng Kee Wong, Hongquan Xu

    Abstract: The dramatic growth of big datasets presents a new challenge to data storage and analysis. Data reduction, or subsampling, that extracts useful information from datasets is a crucial step in big data analysis. We propose an orthogonal subsampling (OSS) approach for big data with a focus on linear regression models. The approach is inspired by the fact that an orthogonal array of two levels provide…

    Submitted 30 May, 2021; originally announced May 2021.

  31. arXiv:2105.08835  [pdf, ps, other]

    q-bio.BM stat.AP

    Conformational variability of loops in the SARS-CoV-2 spike protein

    Authors: Samuel W. K. Wong, Zongjun Liu

    Abstract: The SARS-CoV-2 spike (S) protein facilitates viral infection, and has been the focus of many structure determination efforts. Its flexible loop regions are known to be involved in protein binding and may adopt multiple conformations. This paper identifies the S protein loops and studies their conformational variability based on the available Protein Data Bank (PDB) structures. While most loops had…

    Submitted 13 October, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: 24 pages

  32. arXiv:2104.10878  [pdf, other]

    stat.AP q-bio.PE

    Comparing regional and provincial-wide COVID-19 models with physical distancing in British Columbia

    Authors: Geoffrey McGregor, Jennifer Tippett, Andy T. S. Wan, Mengxiao Wang, Samuel W. K. Wong

    Abstract: We study the effects of physical distancing measures for the spread of COVID-19 in regional areas within British Columbia, using the reported cases of the five provincial Health Authorities. Building on the Bayesian epidemiological model of Anderson et al. (2020), we propose a hierarchical regional Bayesian model with time-varying regional parameters between March to December of 2020. In the absen…

    Submitted 13 November, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: 35 pages, 16 figures

    Journal ref: AIMS Mathematics, 2022, 7(4): 6743-6778

  33. arXiv:2104.10633  [pdf]

    math.ST stat.ME

    A calculus for causal inference with instrumental variables

    Authors: Wing Hung Wong

    Abstract: Under a general structural equation framework for causal inference, we provide a definition of the causal effect of a variable X on another variable Y, and propose an approach to estimate this causal effect via the use of instrumental variables.

    Submitted 23 April, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: 10 pages

  34. arXiv:2104.10041  [pdf, other]

    cs.NE cs.AI stat.AP stat.CO

    Particle swarm optimization in constrained maximum likelihood estimation a case study

    Authors: Elvis Cui, Dongyuan Song, Weng Kee Wong

    Abstract: The aim of this paper is to apply two types of particle swarm optimization, global best and local best PSO, to a constrained maximum likelihood estimation problem in pseudotime analysis, a sub-field in bioinformatics. The results have shown that particle swarm optimization is extremely useful and efficient when the optimization problem is non-differentiable and non-convex so that analytical solution can…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: 11 pages, 7 figures

  35. arXiv:2103.03437  [pdf, other]

    stat.ME

    Estimation of Partially Conditional Average Treatment Effect by Hybrid Kernel-covariate Balancing

    Authors: Jiayi Wang, Raymond K. W. Wong, Shu Yang, Kwun Chuen Gary Chan

    Abstract: We study nonparametric estimation for the partially conditional average treatment effect, defined as the treatment effect function over an interested subset of confounders. We propose a hybrid kernel weighting estimator where the weights aim to control the balancing error of any function of the confounders from a reproducing kernel Hilbert space after kernel smoothing over the subset of interested…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 19 pages, 2 figures

  36. arXiv:2101.02304  [pdf, other]

    stat.AP q-bio.BM

    Statistical challenges in the analysis of sequence and structure data for the COVID-19 spike protein

    Authors: Shiyu He, Samuel W. K. Wong

    Abstract: As the major target of many vaccines and neutralizing antibodies against SARS-CoV-2, the spike (S) protein is observed to mutate over time. In this paper, we present statistical approaches to tackle some challenges associated with the analysis of S-protein data. We build a Bayesian hierarchical model to study the temporal and spatial evolution of S-protein sequences, after grouping the sequences i…

    Submitted 30 January, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

    Comments: 21 pages, 5 figures

  37. arXiv:2010.13568  [pdf, other]

    stat.ML cs.LG stat.ME

    CP Degeneracy in Tensor Regression

    Authors: Ya Zhou, Raymond K. W. Wong, Kejun He

    Abstract: Tensor linear regression is an important and useful tool for analyzing tensor data. To deal with high dimensionality, CANDECOMP/PARAFAC (CP) low-rank constraints are often imposed on the coefficient tensor parameter in the (penalized) $M$-estimation. However, we show that the corresponding optimization may not be attainable, and when this happens, the estimator is not well-defined. This is closely…

    Submitted 22 October, 2020; originally announced October 2020.

    Journal ref: IEEE Access, 9:1, 7775-7788 (2021)

  38. arXiv:2009.11452  [pdf, ps, other]

    stat.ME stat.AP

    A Wavelet-Based Independence Test for Functional Data with an Application to MEG Functional Connectivity

    Authors: Rui Miao, Xiaoke Zhang, Raymond K. W. Wong

    Abstract: Measuring and testing the dependency between multiple random functions is often an important task in functional data analysis. In the literature, a model-based method relies on a model which is subject to the risk of model misspecification, while a model-free method only provides a correlation measure which is inadequate to test independence. In this paper, we adopt the Hilbert-Schmidt Independenc…

    Submitted 23 September, 2020; originally announced September 2020.

  39. Inference of dynamic systems from noisy and sparse data via manifold-constrained Gaussian processes

    Authors: Shihao Yang, Samuel W. K. Wong, S. C. Kou

    Abstract: Parameter estimation for nonlinear dynamic system models, represented by ordinary differential equations (ODEs), using noisy and sparse data is a vital task in many fields. We propose a fast and accurate method, MAGI (MAnifold-constrained Gaussian process Inference), for this task. MAGI uses a Gaussian process model over time-series data, explicitly conditioned on the manifold constraint that deri…

    Submitted 21 February, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

  40. Broadcasted Nonparametric Tensor Regression

    Authors: Ya Zhou, Raymond K. W. Wong, Kejun He

    Abstract: We propose a novel use of a broadcasting operation, which distributes univariate functions to all entries of the tensor covariate, to model the nonlinearity in tensor regression nonparametrically. A penalized estimation and the corresponding algorithm are proposed. Our theoretical investigation, which allows the dimensions of the tensor covariate to diverge, indicates that the proposed estimation…

    Submitted 23 March, 2024; v1 submitted 29 August, 2020; originally announced August 2020.

  41. Low-Rank Covariance Function Estimation for Multidimensional Functional Data

    Authors: Jiayi Wang, Raymond K. W. Wong, Xiaoke Zhang

    Abstract: Multidimensional function data arise from many fields nowadays. The covariance function plays an important role in the analysis of such increasingly common data. In this paper, we propose a novel nonparametric covariance function estimation approach under the framework of reproducing kernel Hilbert spaces (RKHS) that can handle both sparse and dense functional data. We extend multilinear rank stru…

    Submitted 29 August, 2020; originally announced August 2020.

    Comments: 25 pages, 4 figures

  42. arXiv:2006.10400  [pdf, other]

    stat.ML cs.LG

    Median Matrix Completion: from Embarrassment to Optimality

    Authors: Weidong Liu, Xiaojun Mao, Raymond K. W. Wong

    Abstract: In this paper, we consider matrix completion with absolute deviation loss and obtain an estimator of the median matrix. Despite several appealing properties of median, the non-smooth absolute deviation loss leads to computational challenge for large-scale data sets which are increasingly common among matrix completion problems. A simple solution to large-scale problems is parallel computing. Howev…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 26 pages, 1 figure, 5 tables

  43. arXiv:2004.09017  [pdf, other]

    cs.LG stat.ME stat.ML

    Roundtrip: A Deep Generative Neural Density Estimator

    Authors: Qiao Liu, Jiaze Xu, Rui Jiang, Wing Hung Wong

    Abstract: Density estimation is a fundamental problem in both statistics and machine learning. In this study, we proposed Roundtrip as a general-purpose neural density estimator based on deep generative models. Roundtrip retains the generative power of generative adversarial networks (GANs) but also provides estimates of density values. Unlike previous neural density estimators that put stringent conditions…

    Submitted 4 September, 2020; v1 submitted 19 April, 2020; originally announced April 2020.

    Journal ref: Proceedings of the National Academy of Sciences, 2021, 118(15)

  44. arXiv:2004.07395  [pdf, ps, other]

    cs.IT cs.LG eess.SP stat.ML

    Joint User Pairing and Association for Multicell NOMA: A Pointer Network-based Approach

    Authors: Manyou Ma, Vincent W. S. Wong

    Abstract: In this paper, we investigate the joint user pairing and association problem for multicell non-orthogonal multiple access (NOMA) systems. We consider a scenario where the user equipments (UEs) are located in a multicell network equipped with multiple base stations. Each base station has multiple orthogonal physical resource blocks (PRBs). Each PRB can be allocated to a pair of UEs using NOMA. Each…

    Submitted 15 April, 2020; originally announced April 2020.

    Comments: accepted for publication in Proc. of 6th International Workshop on NOMA for 5G and Beyond, co-located with IEEE International Conference on Communications (ICC), Dublin, Ireland, Jun. 2020

  45. arXiv:2002.03537  [pdf, other]

    stat.AP

    Calibrating wood products for load duration and rate: A statistical look at three damage models

    Authors: Samuel W. K. Wong

    Abstract: Lumber and wood-based products are versatile construction materials that are susceptible to weakening as a result of applied stresses. To assess the effects of load duration and rate, experiments have been carried out by applying preset load profiles to sample specimens. This paper studies these effects via a damage modeling approach, by considering three models in the literature: the Gerhards and…

    Submitted 9 February, 2020; originally announced February 2020.

    Comments: 17 pages, 5 figures

  46. arXiv:1911.11983  [pdf, ps, other]

    cs.LG stat.ML

    Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis

    Authors: Thanh V. Nguyen, Raymond K. W. Wong, Chinmay Hegde

    Abstract: A remarkable recent discovery in machine learning has been that deep neural networks can achieve impressive performance (in terms of both lower training error and higher generalization capacity) in the regime where they are massively over-parameterized. Consequently, over the past year, the community has devoted growing interest in analyzing optimization and generalization properties of over-param…

    Submitted 2 March, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: Added Sections 3.2 and 3.4 on inductive biases. Fixed an error in deriving the neural tangent kernel in Section 3.3

  47. arXiv:1910.02114  [pdf, other]

    stat.ML cs.LG stat.AP

    A Comparison Study on Nonlinear Dimension Reduction Methods with Kernel Variations: Visualization, Optimization and Classification

    Authors: Katherine C. Kempfert, Yishi Wang, Cuixian Chen, Samuel W. K. Wong

    Abstract: Because of high dimensionality, correlation among covariates, and noise contained in data, dimension reduction (DR) techniques are often employed to the application of machine learning algorithms. Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and their kernel variants (KPCA, KLDA) are among the most popular DR methods. Recently, Supervised Kernel Principal Component Analy…

    Submitted 4 October, 2019; originally announced October 2019.

  48. arXiv:1909.12969  [pdf, other]

    cs.LG cs.AI cs.HC stat.ML

    Counterfactual States for Atari Agents via Generative Deep Learning

    Authors: Matthew L. Olson, Lawrence Neal, Fuxin Li, Weng-Keen Wong

    Abstract: Although deep reinforcement learning agents have produced impressive results in many domains, their decision making is difficult to explain to humans. To address this problem, past work has mainly focused on explaining why an action was chosen in a given state. A different type of explanation that is useful is a counterfactual, which deals with "what if?" scenarios. In this work, we introduce the…

    Submitted 27 September, 2019; originally announced September 2019.

    Comments: IJCAI XAI Workshop 2019

  49. arXiv:1909.08182  [pdf]

    cs.LG eess.SP stat.ML

    Predicting Electricity Consumption using Deep Recurrent Neural Networks

    Authors: Anupiya Nugaliyadde, Upeka Somaratne, Kok Wai Wong

    Abstract: Electricity consumption has increased exponentially during the past few decades. This increase is heavily burdening the electricity distributors. Therefore, predicting the future demand for electricity consumption will provide an upper hand to the electricity distributor. Predicting electricity consumption requires many parameters. The paper presents two approaches with one using a Recurrent Neura…

    Submitted 17 September, 2019; originally announced September 2019.

  50. arXiv:1908.02910  [pdf, other]

    stat.ML cs.LG

    Mini-batch Metropolis-Hastings MCMC with Reversible SGLD Proposal

    Authors: Tung-Yu Wu, Y. X. Rachel Wang, Wing H. Wong

    Abstract: Traditional MCMC algorithms are computationally intensive and do not scale well to large data. In particular, the Metropolis-Hastings (MH) algorithm requires passing over the entire dataset to evaluate the likelihood ratio in each iteration. We propose a general framework for performing MH-MCMC using mini-batches of the whole dataset and show that this gives rise to approximately a tempered statio…

    Submitted 28 August, 2019; v1 submitted 7 August, 2019; originally announced August 2019.