Showing 1–50 of 92 results for author: Wu, W

Searching in archive stat.
  1. arXiv:2407.01079

    stat.ML cs.AI cs.LG

    On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

    Authors: Jerry Yao-Chieh Hu, Weimin Wu, Zhuoru Li, Zhao Song, Han Liu

    Abstract: We investigate the statistical and computational limits of latent Diffusion Transformers (DiTs) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we deri…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.14753

    cs.LG stat.ME

    A General Control-Theoretic Approach for Reinforcement Learning: Theory and Algorithms

    Authors: Weiqin Chen, Mark S. Squillante, Chai Wah Wu, Santiago Paternain

    Abstract: We devise a control-theoretic reinforcement learning approach to support direct learning of the optimal policy. We establish theoretical properties of our approach and derive an algorithm based on a specific instance of this approach. Our empirical results demonstrate the significant benefits of our approach.

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.09194

    stat.ML cs.IT cs.LG math.NA math.ST

    Benign overfitting in Fixed Dimension via Physics-Informed Learning with Smooth Inductive Bias

    Authors: Honam Wong, Wendao Wu, Fanghui Liu, Yiping Lu

    Abstract: Recent advances in machine learning have inspired a surge of research into reconstructing specific quantities of interest from measurements that comply with certain physical laws. These efforts focus on inverse problems that are governed by partial differential equations (PDEs). In this work, we develop an asymptotic Sobolev norm learning curve for kernel ridge(less) regression when addressing (el…

    Submitted 16 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2403.17221

    stat.AP stat.ME

    Are Made and Missed Different? An analysis of Field Goal Attempts of Professional Basketball Players via Depth Based Testing Procedure

    Authors: Kai Qi, Guanyu Hu, Wei Wu

    Abstract: In this paper, we develop a novel depth-based testing procedure on spatial point processes to examine the difference in made and missed field goal attempts for NBA players. Specifically, our testing procedure can statistically detect the differences between made and missed field goal attempts for NBA players. We first obtain the depths of two processes under the polar coordinate system. A two-dime…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 26 pages, 6 figures

  5. arXiv:2403.00600

    stat.ME

    Random Interval Distillation for Detecting Multiple Changes in General Dependent Data

    Authors: Xinyuan Fan, Weichi Wu

    Abstract: We propose a new and generic approach for detecting multiple change-points in general dependent data, termed random interval distillation (RID). By collecting random intervals with sufficient strength of signals and reassembling them into a sequence of informative short intervals, our new approach captures the shifts in signal characteristics across diverse dependent data forms including locally s…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 59 pages, 5 figures

  6. arXiv:2401.09346

    stat.ML cs.LG

    High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

    Authors: Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu

    Abstract: Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with efficient computation and fast convergence to the nominal level. Specifically, we propose to use a small number of independent multi-runs to acquire distribution info…

    Submitted 17 January, 2024; originally announced January 2024.

  7. arXiv:2311.13676

    stat.AP

    Depth-Based Statistical Inferences in the Spike Train Space

    Authors: Xinyu Zhou, Wei Wu

    Abstract: Metric-based summary statistics such as mean and covariance have been introduced in neural spike train space. They can properly describe template and variability in spike train data, but are often sensitive to outliers and expensive to compute. Recent studies also examine outlier detection and classification methods on point processes. These tools provide reasonable and efficient results, whereas t…

    Submitted 22 November, 2023; originally announced November 2023.

  8. arXiv:2311.00577

    stat.ML cs.LG econ.EM stat.ME

    Personalized Assignment to One of Many Treatment Arms via Regularized and Clustered Joint Assignment Forests

    Authors: Rahul Ladhania, Jann Spiess, Lyle Ungar, Wenbo Wu

    Abstract: We consider learning personalized assignments to one of many treatment arms from a randomized controlled trial. Standard methods that estimate heterogeneous treatment effects separately for each arm may perform poorly in this case due to excess variance. We instead propose methods that pool information across treatment arms: First, we consider a regularized forest-based assignment algorithm based…

    Submitted 1 November, 2023; originally announced November 2023.

  9. arXiv:2309.15316

    stat.ME

    Leveraging Neural Networks to Profile Health Care Providers with Application to Medicare Claims

    Authors: Wenbo Wu, Fan Li, Richard Liu, Yiting Li, Mara McAdams-DeMarco, Krzysztof J. Geras, Douglas E. Schaubel, Iván Díaz

    Abstract: Encompassing numerous nationwide, statewide, and institutional initiatives in the United States, provider profiling has evolved into a major health care undertaking with ubiquitous applications, profound implications, and high-stakes consequences. In line with such a significant profile, the literature has accumulated a number of developments dedicated to enhancing the statistical paradigm of prov…

    Submitted 20 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: 8 figures, 6 tables

  10. arXiv:2309.08488

    stat.ME

    A Random Graph-based Autoregressive Model for Networked Time Series

    Authors: Weichi Wu, Chenlei Leng

    Abstract: Contemporary time series data often feature objects connected by a social network that naturally induces temporal dependence involving connected neighbours. The network vector autoregressive model is useful for describing the influence of linked neighbours, while recent generalizations aim to separate influence and homophily. Existing approaches, however, require either correct specification of a…

    Submitted 15 September, 2023; originally announced September 2023.

  11. arXiv:2309.00760

    stat.ME

    Spatial Regression With Multiplicative Errors, and Its Application With Lidar Measurements

    Authors: Hojun You, Wei-Ying Wu, Chae Young Lim, Kyubaek Yoon, Jongeun Choi

    Abstract: Multiplicative errors in addition to spatially referenced observations often arise in geodetic applications, particularly in surface estimation with light detection and ranging (LiDAR) measurements. However, spatial regression involving multiplicative errors remains relatively unexplored in such applications. In this regard, we present a penalized modified least squares estimator to handle the com…

    Submitted 1 September, 2023; originally announced September 2023.

  12. arXiv:2307.06915

    stat.ML cs.LG

    Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality

    Authors: Ziyang Wei, Wanrong Zhu, Wei Biao Wu

    Abstract: Stochastic Gradient Descent (SGD) is one of the simplest and most popular algorithms in modern statistical and machine learning due to its computational and memory efficiency. Various averaging schemes have been proposed to accelerate the convergence of SGD in different settings. In this paper, we explore a general averaging scheme for SGD. Specifically, we establish the asymptotic normality of a…

    Submitted 18 July, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

  13. arXiv:2305.19001

    stat.ML cs.IT cs.LG math.OC math.ST

    High-probability sample complexities for policy evaluation with linear function approximation

    Authors: Gen Li, Weichen Wu, Yuejie Chi, Cong Ma, Alessandro Rinaldo, Yuting Wei

    Abstract: This paper is concerned with the problem of policy evaluation with linear function approximation in discounted infinite horizon Markov decision processes. We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms: the temporal difference (TD) learning algorithm and the two-timescale li…

    Submitted 2 May, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: The first two authors contributed equally; paper accepted to IEEE Transactions on Information Theory

  14. arXiv:2303.16599

    stat.ME math.ST

    Difference-based covariance matrix estimate in time series nonparametric regression with applications to specification tests

    Authors: Lujia Bai, Weichi Wu

    Abstract: Long-run covariance matrix estimation is the building block of time series inference. The corresponding difference-based estimator, which avoids detrending, has attracted considerable interest due to its robustness to both smooth and abrupt structural breaks and its competitive finite sample performance. However, existing methods mainly focus on estimators for the univariate process while their di…

    Submitted 28 February, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: text overlap with arXiv:2110.08089

  15. arXiv:2303.10117

    stat.ME econ.EM

    Estimation of Grouped Time-Varying Network Vector Autoregression Models

    Authors: Degui Li, Bin Peng, Songqiao Tang, Weibiao Wu

    Abstract: This paper introduces a flexible time-varying network vector autoregressive model framework for large-scale time series. A latent group structure is imposed on the heterogeneous and node-specific time-varying momentum and network spillover effects so that the number of unknown time-varying coefficients to be estimated can be reduced considerably. A classic agglomerative clustering algorithm with n…

    Submitted 10 March, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

  16. arXiv:2302.05158

    stat.ME math.ST

    Time-varying correlation network analysis of non-stationary multivariate time series with complex trends

    Authors: Lujia Bai, Weichi Wu

    Abstract: This paper proposes a flexible framework for inferring large-scale time-varying and time-lagged correlation networks from multivariate or high-dimensional non-stationary time series with piecewise smooth trends. Built on a novel and unified multiple-testing procedure of time-lagged cross-correlation functions with a fixed or diverging number of lags, our method can accurately disclose flexible tim…

    Submitted 10 February, 2023; originally announced February 2023.

  17. arXiv:2301.04209

    stat.ME

    High Dimensional Analysis of Variance in Multivariate Linear Regression

    Authors: Zhipeng Lou, Xianyang Zhang, Wei Biao Wu

    Abstract: In this paper, we develop a systematic theory for high dimensional analysis of variance in multivariate linear regression, where the dimension and the number of coefficients can both grow with the sample size. We propose a new U type test statistic to test linear hypotheses and establish a high dimensional Gaussian approximation result under fairly mild moment assumptions. Our general frame…

    Submitted 10 January, 2023; originally announced January 2023.

  18. Regularized Nonlinear Regression with Dependent Errors and its Application to a Biomechanical Model

    Authors: Hojun You, Kyubaek Yoon, Wei-Ying Wu, Jongeun Choi, Chae Young Lim

    Abstract: A biomechanical model often requires parameter estimation and selection in a known but complicated nonlinear function. Motivated by observing that data from a head-neck position tracking system, one of biomechanical models, show multiplicative time dependent errors, we develop a modified penalized weighted least squares estimator. The proposed method can also be applied to a model with non-zero me…

    Submitted 11 October, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: The article revised in overall

    Journal ref: Annals of the Institute of Statistical Mathematics, 2024

  19. arXiv:2209.00181

    stat.ME stat.AP

    Understanding the dynamic impact of COVID-19 through competing risk modeling with bivariate varying coefficients

    Authors: Wenbo Wu, John D. Kalbfleisch, Jeremy M. G. Taylor, Jian Kang, Kevin He

    Abstract: The coronavirus disease 2019 (COVID-19) pandemic has exerted a profound impact on patients with end-stage renal disease relying on kidney dialysis to sustain their lives. Motivated by a request by the U.S. Centers for Medicare & Medicaid Services, our analysis of their postdischarge hospital readmissions and deaths in 2020 revealed that the COVID-19 effect has varied significantly with postdischar…

    Submitted 31 August, 2022; originally announced September 2022.

    Comments: 40 pages, 8 figures, 1 table

  20. arXiv:2208.13074

    math.ST stat.ME

    $\ell^2$ Inference for Change Points in High-Dimensional Time Series via a Two-Way MOSUM

    Authors: Jiaqi Li, Likai Chen, Weining Wang, Wei Biao Wu

    Abstract: We propose an inference method for detecting multiple change points in high-dimensional time series, targeting dense or spatially clustered signals. Our method aggregates moving sum (MOSUM) statistics cross-sectionally by an $\ell^2$-norm and maximizes them over time. We further introduce a novel Two-Way MOSUM, which utilizes spatial-temporal moving regions to search for breaks, with the added adv…

    Submitted 3 July, 2023; v1 submitted 27 August, 2022; originally announced August 2022.

    Comments: 111 pages, 10 figures

  21. arXiv:2207.05195

    cs.CV stat.ML

    Collaborative Uncertainty Benefits Multi-Agent Multi-Modal Trajectory Forecasting

    Authors: Bohan Tang, Yiqi Zhong, Chenxin Xu, Wei-Tao Wu, Ulrich Neumann, Yanfeng Wang, Ya Zhang, Siheng Chen

    Abstract: In multi-modal multi-agent trajectory forecasting, two major challenges have not been fully tackled: 1) how to measure the uncertainty brought by the interaction module that causes correlations among the predicted trajectories of multiple agents; 2) how to rank the multiple predictions and select the optimal predicted trajectory. In order to handle these challenges, this work first proposes a nove…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: arXiv admin note: text overlap with arXiv:2110.13947

  22. arXiv:2205.04341

    math.ST stat.ME

    Asymptotic comparison of identifying constraints for Bradley-Terry models

    Authors: Weichen Wu, Brian W. Junker, Nynke M. D. Niezink

    Abstract: The Bradley-Terry model is widely used for pairwise comparison data analysis. In this paper, we analyze the asymptotic behavior of the maximum likelihood estimator of the Bradley-Terry model in its logistic parameterization, under a general class of linear identifiability constraints. We show that the constraint requiring the Bradley-Terry scores for all compared objects to sum to zero minimizes t…

    Submitted 9 May, 2022; originally announced May 2022.

  23. arXiv:2203.14810

    stat.ME cs.CV stat.AP

    Data-Driven, Soft Alignment of Functional Data Using Shapes and Landmarks

    Authors: Xiaoyang Guo, Wei Wu, Anuj Srivastava

    Abstract: Alignment or registration of functions is a fundamental problem in statistical analysis of functions and shapes. While there are several approaches available, a more recent approach based on Fisher-Rao metric and square-root velocity functions (SRVFs) has been shown to have good performance. However, this SRVF method has two limitations: (1) it is susceptible to over alignment, i.e., alignment of…

    Submitted 9 April, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

  24. arXiv:2203.04454

    stat.ME

    Statistical Depth for Point Process via the Isometric Log-Ratio Transformation

    Authors: Xinyu Zhou, Yijia Ma, Wei Wu

    Abstract: Statistical depth, a useful tool to measure the center-outward rank of multivariate and functional data, is still under-explored in temporal point processes. Recent studies on point process depth proposed a weighted product of two terms - one indicates the depth of the cardinality of the process, and the other characterizes the conditional depth of the temporal events given the cardinality. The se…

    Submitted 8 March, 2022; originally announced March 2022.

  25. arXiv:2201.09970

    stat.ME

    A Stochastic Process Model for Time Warping Functions

    Authors: Yijia Ma, Xinyu Zhou, Wei Wu

    Abstract: Time warping function provides a mathematical representation to measure phase variability in functional data. Recent studies have developed various approaches to estimate optimal warping between functions and provide non-Euclidean models. However, a principled, linear, generative model on time warping functions is still under-explored. This is a highly challenging problem because the space of warp…

    Submitted 13 April, 2022; v1 submitted 24 January, 2022; originally announced January 2022.

  26. arXiv:2110.14177

    stat.ML cs.IT cs.LG

    Federated Linear Contextual Bandits

    Authors: Ruiquan Huang, Weiqiang Wu, Jing Yang, Cong Shen

    Abstract: This paper presents a novel federated linear contextual bandits model, where individual clients face different $K$-armed stochastic bandits coupled through common global parameters. By leveraging the geometric structure of the linear rewards, a collaborative algorithm called Fed-PE is proposed to cope with the heterogeneity across clients without exchanging local feature vectors or raw data. Fed-P…

    Submitted 27 October, 2021; originally announced October 2021.

  27. arXiv:2107.02043

    stat.AP cs.CY

    An extended watershed-based zonal statistical AHP model for flood risk estimation: Constraining runoff converging related indicators by sub-watersheds

    Authors: Hongping Zhang, Zhenfeng Shao, Jinqi Zhao, Xiao Huang, Jie Yang, Bin Hu, Wenfu Wu

    Abstract: Floods are highly uncertain events, occurring in different regions, with varying prerequisites and intensities. A highly reliable flood disaster risk map can help reduce the impact of floods for flood management, disaster reduction, and urbanization resilience. In flood risk estimation, the widely used analytic hierarchy process (AHP) usually adopts the pixel as a basic unit; it cannot capture the si…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: This paper is a research paper, it contains 40 pages and 8 figures. This paper is a modest contribution to the ongoing discussions the accuracy of flood risk estimation via AHP model improved by adopting pixels replaced with sub-watersheds as basic unit

    MSC Class: 86A05 ACM Class: H.1

  28. arXiv:2105.08893

    stat.ME

    A unified framework on defining depth for point process using function smoothing

    Authors: Zishen Xu, Chenran Wang, Wei Wu

    Abstract: The notion of statistical depth has been extensively studied in multivariate and functional data over the past few decades. In contrast, the depth on temporal point process is still under-explored. The problem is challenging because a point process has two types of randomness: 1) the number of events in a process, and 2) the distribution of these events. Recent studies proposed depths in a weighte…

    Submitted 20 May, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

  29. arXiv:2104.14525

    stat.ME

    Testing and estimation of clustered signals

    Authors: Hongyuan Cao, Wei Biao Wu

    Abstract: We propose a change-point detection method for large scale multiple testing problems with data having clustered signals. Unlike the classic change-point setup, the signals can vary in size within a cluster. The clustering structure on the signals enables us to effectively delineate the boundaries between signal and non-signal segments. New test statistics are proposed for observations from one and…

    Submitted 29 April, 2021; originally announced April 2021.

    Report number: BEJ2007-047

    Journal ref: Bernoulli, 2021

  30. Regularized Nonlinear Regression for Simultaneously Selecting and Estimating Key Model Parameters

    Authors: Kyubaek Yoon, Hojun You, Wei-Ying Wu, Chae Young Lim, Jongeun Choi, Connor Boss, Ahmed Ramadan, John M. Popovich Jr., Jacek Cholewicki, N. Peter Reeves, Clark J. Radcliffe

    Abstract: In system identification, estimating parameters of a model using limited observations results in poor identifiability. To cope with this issue, we propose a new method to simultaneously select and estimate sensitive parameters as key model parameters and fix the remaining parameters to a set of typical values. Our method is formulated as a nonlinear least squares estimator with L1-regularization o…

    Submitted 2 June, 2022; v1 submitted 23 April, 2021; originally announced April 2021.

    Comments: 13 pages, 4 figures, 2 Tables

  31. arXiv:2104.01114

    stat.OT

    The general conformable fractional grey system model and its applications

    Authors: Wanli Xie, Mingyong Pang, Wen-Ze Wu, Chong Liu, Caixia Liu

    Abstract: Grey system theory is an important mathematical tool for describing uncertain information in the real world. It has been used to solve the uncertainty problems specially caused by lack of information. As a novel theory, the theory can deal with various fields and plays an important role in modeling the small sample problems. But many modeling mechanisms of grey system need to be answered, such as…

    Submitted 14 July, 2021; v1 submitted 28 March, 2021; originally announced April 2021.

  32. arXiv:2103.07626

    stat.ML cs.LG

    Helmholtzian Eigenmap: Topological feature discovery & edge flow learning from point cloud data

    Authors: Yu-Chia Chen, Weicheng Wu, Marina Meilă, Ioannis G. Kevrekidis

    Abstract: The manifold Helmholtzian (1-Laplacian) operator $\Delta_1$ elegantly generalizes the Laplace-Beltrami operator to vector fields on a manifold $\mathcal M$. In this work, we propose the estimation of the manifold Helmholtzian from point cloud data by a weighted 1-Laplacian $\mathcal L_1$. While higher order Laplacians have been introduced and studied, this work is the first to present a graph Helmholtz…

    Submitted 31 October, 2023; v1 submitted 13 March, 2021; originally announced March 2021.

  33. arXiv:2012.14708

    stat.ME

    Adaptive Estimation for Non-stationary Factor Models And A Test for Static Factor Loadings

    Authors: Weichi Wu, Zhou Zhou

    Abstract: This paper considers the estimation and testing of a class of locally stationary time series factor models with evolutionary temporal dynamics. In particular, the entries and the dimension of the factor loading matrix are allowed to vary with time while the factors and the idiosyncratic noise components are locally stationary. We propose an adaptive sieve estimator for the span of the varying load…

    Submitted 3 February, 2024; v1 submitted 29 December, 2020; originally announced December 2020.

  34. arXiv:2012.08223

    stat.ME econ.EM math.ST

    Long-term prediction intervals with many covariates

    Authors: Sayar Karmakar, Marek Chudy, Wei Biao Wu

    Abstract: Accurate forecasting is one of the fundamental focuses in the literature of econometric time-series. Often practitioners and policy makers want to predict outcomes of an entire time horizon in the future instead of just a single $k$-step ahead prediction. These series, apart from their own possible non-linear dependence, are often also influenced by many external predictors. In this paper, we constr…

    Submitted 30 September, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

  35. arXiv:2011.06333

    stat.ME

    Shift identification in time varying regression quantiles

    Authors: Subhra Sankar Dhar, Weichi Wu

    Abstract: This article investigates whether time-varying quantile regression curves are the same up to the horizontal shift or not. The errors and the covariates involved in the regression model are allowed to be locally stationary. We formalize this issue in a corresponding non-parametric hypothesis testing problem, and develop an integrated-squared-norm based test (SIT) as well as a simultaneous confidenc…

    Submitted 24 December, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

  36. arXiv:2010.03130

    stat.ML cs.LG

    Computational analysis of pathological image enables interpretable prediction for microsatellite instability

    Authors: Jin Zhu, Wangwei Wu, Yuting Zhang, Shiyun Lin, Yukang Jiang, Ruixian Liu, Xueqin Wang

    Abstract: Microsatellite instability (MSI) is associated with several tumor types and its status has become increasingly vital in guiding patient treatment decisions. However, in clinical practice, distinguishing MSI from its counterpart is challenging since the diagnosis of MSI requires additional genetic or immunohistochemical tests. In this study, interpretable pathological image analysis strategies are…

    Submitted 6 October, 2020; originally announced October 2020.

  37. arXiv:2008.09667

    q-fin.ST cs.CE cs.LG cs.SI stat.ML

    A Blockchain Transaction Graph based Machine Learning Method for Bitcoin Price Prediction

    Authors: Xiao Li, Weili Wu

    Abstract: Bitcoin, as one of the most popular cryptocurrencies, has recently attracted much attention from investors. Bitcoin price prediction is consequently a rising academic topic for providing valuable insights and suggestions. Existing Bitcoin prediction works are mostly based on trivial feature engineering that manually designs features or factors from multiple areas, including Bitcoin Blockchain informa…

    Submitted 21 August, 2020; originally announced August 2020.

  38. arXiv:2007.14365

    math.ST stat.ML

    Tractably Modelling Dependence in Networks Beyond Exchangeability

    Authors: Weichi Wu, Sofia Olhede, Patrick Wolfe

    Abstract: We propose a general framework for modelling network data that is designed to describe aspects of non-exchangeable networks. Conditional on latent (unobserved) variables, the edges of the network are generated by their finite growth history (with latent orders) while the marginal probabilities of the adjacency matrix are modeled by a generalization of a graph limit function (or a graphon). In part…

    Submitted 28 July, 2020; originally announced July 2020.

    MSC Class: 62G05; 62R07; 62E20; 62G20; secondary 53C20

  39. arXiv:2006.08828

    cs.CY cs.AI cs.LG stat.ML

    Explainable AI for a No-Teardown Vehicle Component Cost Estimation: A Top-Down Approach

    Authors: Ayman Moawad, Ehsan Islam, Namdoo Kim, Ram Vijayagopal, Aymeric Rousseau, Wei Biao Wu

    Abstract: The broader ambition of this article is to popularize an approach for the fair distribution of the quantity of a system's output to its subsystems, while allowing for underlying complex subsystem level interactions. Particularly, we present a data-driven approach to vehicle price modeling and its component price estimation by leveraging a combination of concepts from machine learning and game theo…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: 17 pages, 18 figures

    Journal ref: IEEE Transactions on Artificial Intelligence (Volume: 2, Issue: 2, April 2021, Page(s): 185 - 199)

  40. arXiv:2005.05117

    cs.LG cs.DB stat.ML

    Nearest Neighbor Classifiers over Incomplete Information: From Certain Answers to Certain Predictions

    Authors: Bojan Karlaš, Peng Li, Renzhi Wu, Nezihe Merve Gürel, Xu Chu, Wentao Wu, Ce Zhang

    Abstract: Machine learning (ML) applications have been thriving recently, largely attributed to the increasing availability of data. However, inconsistency and incomplete information are ubiquitous in real-world datasets, and their impact on ML applications remains elusive. In this paper, we present a formal study of this impact by extending the notion of Certain Answers for Codd tables, which has been expl…

    Submitted 12 May, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

  41. Multi-View Self-Attention for Interpretable Drug-Target Interaction Prediction

    Authors: Brighter Agyemang, Wei-Ping Wu, Michael Yelpengne Kpiebaareh, Zhihua Lei, Ebenezer Nanor, Lei Chen

    Abstract: The drug discovery stage is a vital aspect of the drug development process and forms part of the initial stages of the development pipeline. In recent times, machine learning-based methods are actively being used to model drug-target interactions for rational drug discovery due to the successful application of these methods in other domains. In machine learning approaches, the numerical representa…

    Submitted 23 August, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

  42. arXiv:2003.12628

    cs.LG cs.CV stat.ML

    MCFlow: Monte Carlo Flow Models for Data Imputation

    Authors: Trevor W. Richardson, Wencheng Wu, Lei Lin, Beilei Xu, Edgar A. Bernal

    Abstract: We consider the topic of data imputation, a foundational task in machine learning that addresses issues with missing data. To that end, we propose MCFlow, a deep framework for imputation that leverages normalizing flow generative models and Monte Carlo sampling. We address the causality dilemma that arises when training models with incomplete data by introducing an iterative learning scheme which…

    Submitted 27 March, 2020; originally announced March 2020.

    Journal ref: 2020 Computer Vision and Pattern Recognition (CVPR)

  43. arXiv:2003.09902

    cs.LG cs.SI stat.ML

    K-Core based Temporal Graph Convolutional Network for Dynamic Graphs

    Authors: Jingxin Liu, Chang Xu, Chang Yin, Weiqiang Wu, You Song

    Abstract: Graph representation learning is a fundamental task in various applications that strives to learn low-dimensional embeddings for nodes that can preserve graph topology information. However, many existing methods focus on static graphs while ignoring evolving graph patterns. Inspired by the success of graph convolutional networks (GCNs) in static graph embedding, we propose a novel k-core based temp…

    Submitted 6 November, 2020; v1 submitted 22 March, 2020; originally announced March 2020.

  44. arXiv:2003.02681  [pdf, other]

    cs.LG cs.IT stat.ML

    Stochastic Linear Contextual Bandits with Diverse Contexts

    Authors: Weiqiang Wu, Jing Yang, Cong Shen

    Abstract: In this paper, we investigate the impact of context diversity on stochastic linear contextual bandits. As opposed to the previous view that contexts lead to more difficult bandit learning, we show that when the contexts are sufficiently diverse, the learner is able to utilize the information obtained during exploitation to shorten the exploration process, thus achieving reduced regret. We design t…

    Submitted 5 March, 2020; originally announced March 2020.

    Comments: Accepted to AISTATS 2020

  45. arXiv:2002.03979  [pdf, other]

    stat.ML cs.LG

    Online Covariance Matrix Estimation in Stochastic Gradient Descent

    Authors: Wanrong Zhu, Xi Chen, Wei Biao Wu

    Abstract: The stochastic gradient descent (SGD) algorithm is widely used for parameter estimation, especially for huge data sets and online learning. While this recursive algorithm is popular for computation and memory efficiency, quantifying variability and randomness of the solutions has been rarely studied. This paper aims at conducting statistical inference of SGD-based estimates in an online setting. I…

    Submitted 22 June, 2021; v1 submitted 10 February, 2020; originally announced February 2020.

  46. arXiv:2001.00419  [pdf, other]

    stat.ME econ.EM math.ST

    Prediction in locally stationary time series

    Authors: Holger Dette, Weichi Wu

    Abstract: We develop an estimator for the high-dimensional covariance matrix of a locally stationary process with a smoothly varying trend and use this statistic to derive consistent predictors in non-stationary time series. In contrast to the currently available methods for this problem the predictor developed here does not rely on fitting an autoregressive model and does not require a vanishing trend. The…

    Submitted 3 January, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

  47. arXiv:1912.09536  [pdf, other]

    cs.LG cs.DC stat.ML

    Data Science through the looking glass and what we found there

    Authors: Fotis Psallidas, Yiwen Zhu, Bojan Karlas, Matteo Interlandi, Avrilia Floratou, Konstantinos Karanasos, Wentao Wu, Ce Zhang, Subru Krishnan, Carlo Curino, Markus Weimer

    Abstract: The recent success of machine learning (ML) has led to an explosive growth both in terms of new systems and algorithms built in industry and academia, and new applications built by an ever-growing community of data science (DS) practitioners. This quickly shifting panorama of technologies and applications is challenging for builders and practitioners alike to follow. In this paper, we set out to c…

    Submitted 19 December, 2019; originally announced December 2019.

  48. Drug-Target Indication Prediction by Integrating End-to-End Learning and Fingerprints

    Authors: Brighter Agyemang, Wei-Ping Wu, Michael Y. Kpiebaareh, Ebenezer Nanor

    Abstract: Computer-Aided Drug Discovery research has proven to be a promising direction in drug discovery. In recent years, Deep Learning approaches have been applied to problems in the domain such as Drug-Target Interaction Prediction and have shown improvements over traditional screening methods. An existing challenge is how to represent compound-target pairs in deep learning models. While several represe…

    Submitted 5 December, 2019; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: Accepted at IEEE ICCWAMTIP 2019

  49. Application of a new information priority accumulated grey model with time power to predict short-term wind turbine capacity

    Authors: Jie Xia, Xin Ma, Wenqing Wu, Baolian Huang, Wanpeng Li

    Abstract: Wind energy makes a significant contribution to global power generation. Predicting wind turbine capacity is becoming increasingly crucial for cleaner production. For this purpose, a new information priority accumulated grey model with time power is proposed to predict short-term wind turbine capacity. Firstly, the computational formulas for the time response sequence and the prediction values are…

    Submitted 19 October, 2019; originally announced October 2019.

    Journal ref: Journal of Cleaner Production, Volume 244, 2020, 118573

  50. arXiv:1910.00727  [pdf, other]

    cs.LG cs.CV stat.ML

    Analyzing and Improving Neural Networks by Generating Semantic Counterexamples through Differentiable Rendering

    Authors: Lakshya Jain, Varun Chandrasekaran, Uyeong Jang, Wilson Wu, Andrew Lee, Andy Yan, Steven Chen, Somesh Jha, Sanjit A. Seshia

    Abstract: Even as deep neural networks (DNNs) have achieved remarkable success on vision-related tasks, their performance is brittle to transformations in the input. Of particular interest are semantic transformations that model changes that have a basis in the physical world, such as rotations, translations, changes in lighting or camera pose. In this paper, we show how differentiable rendering can be util…

    Submitted 17 July, 2020; v1 submitted 1 October, 2019; originally announced October 2019.