Showing 1–46 of 46 results for author: Wu, K

  1. arXiv:2406.01870  [pdf, other]

    cs.LG stat.ML

    Understanding Stochastic Natural Gradient Variational Inference

    Authors: Kaiwen Wu, Jacob R. Gardner

    Abstract: Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about the non-asymptotic convergence rate in the \emph{stochastic} setting. We aim to lessen this gap and provide a better understanding. For conjugate likelihoods, we prove the first $\mathcal{O}(\frac{1}{T})$ n…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  2. arXiv:2405.08276  [pdf, other]

    stat.ML cs.LG stat.CO

    Scalable Subsampling Inference for Deep Neural Networks

    Authors: Kejin Wu, Dimitris N. Politis

    Abstract: Deep neural networks (DNN) has received increasing attention in machine learning applications in the last several years. Recently, a non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator with ReLU activation functions for estimating regression models. The paper at hand gives a small improvement on the current error bound based on the latest r…

    Submitted 13 May, 2024; originally announced May 2024.

  3. arXiv:2402.08683  [pdf]

    stat.AP math.OC

    Order picking efficiency: A scattered storage and clustered allocation strategy in automated drug dispensing systems

    Authors: Mengge Yuan, Ning Zhao, Kan Wu, Lulu Cheng

    Abstract: In the smart hospital, optimizing prescription order fulfilment processes in outpatient pharmacies is crucial. A promising device, automated drug dispensing systems (ADDSs), has emerged to streamline these processes. These systems involve human order pickers who are assisted by ADDSs. The ADDS's robotic arm transports bins from storage locations to the input/output (I/O) points, while the pharmaci…

    Submitted 18 December, 2023; originally announced February 2024.

  4. arXiv:2311.00294  [pdf, ps, other]

    stat.ME

    Multi-step ahead prediction intervals for non-parametric autoregressions via bootstrap: consistency, debiasing and pertinence

    Authors: Dimitris N. Politis, Kejin Wu

    Abstract: To address the difficult problem of multi-step ahead prediction of non-parametric autoregressions, we consider a forward bootstrap approach. Employing a local constant estimator, we can analyze a general type of non-parametric time series model, and show that the proposed point predictions are consistent with the true optimal predictor. We construct a quantile prediction interval that is asymptoti…

    Submitted 1 November, 2023; originally announced November 2023.

  5. arXiv:2310.17137  [pdf, other]

    cs.LG stat.ML

    Large-Scale Gaussian Processes via Alternating Projection

    Authors: Kaiwen Wu, Jonathan Wenger, Haydn Jones, Geoff Pleiss, Jacob R. Gardner

    Abstract: Training and inference in Gaussian processes (GPs) require solving linear systems with $n\times n$ kernel matrices. To address the prohibitive $\mathcal{O}(n^3)$ time complexity, recent work has employed fast iterative methods, like conjugate gradients (CG). However, as datasets increase in magnitude, the kernel matrices become increasingly ill-conditioned and still require $\mathcal{O}(n^2)$ spac…

    Submitted 8 March, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: AISTATS 2024

  6. arXiv:2310.00902  [pdf, other]

    cs.LG stat.ML

    DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models

    Authors: Yongchan Kwon, Eric Wu, Kevin Wu, James Zou

    Abstract: Quantifying the impact of training data points is crucial for understanding the outputs of machine learning models and for improving the transparency of the AI pipeline. The influence function is a principled and popular data attribution method, but its computational cost often makes it challenging to use. This issue becomes more pronounced in the setting of large language models and text-to-image…

    Submitted 13 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  7. arXiv:2309.10301  [pdf, other]

    stat.ML cs.LG

    Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms

    Authors: Keru Wu, Yuansi Chen, Wooseok Ha, Bin Yu

    Abstract: Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that of the target data used to evaluate the model. While many DA algorithms have demonstrated considerable empirical success, blindly applying these algorithms can often lead to worse performance on new datasets. To address this, it is crucial to clarify…

    Submitted 19 September, 2023; originally announced September 2023.

  8. arXiv:2306.15209  [pdf, other]

    stat.CO stat.AP stat.ME

    Dynamic Reconfiguration of Brain Functional Network in Stroke

    Authors: Kaichao Wu, Beth Jelfs, Katrina Neville, Wenzhen He, Qiang Fang

    Abstract: The brain continually reorganizes its functional network to adapt to post-stroke functional impairments. Previous studies using static modularity analysis have presented global-level behavior patterns of this network reorganization. However, it is far from understood how the brain reconfigures its functional network dynamically following a stroke. This study collected resting-state functional MRI…

    Submitted 22 March, 2024; v1 submitted 27 June, 2023; originally announced June 2023.

    Comments: Has been accepted by IEEE Journal of Biomedical and Health Informatics, doi:10.1109/JBHI.2024.3371097

    Journal ref: IEEE Journal of Biomedical and Health Informatics 2024

  9. arXiv:2306.05398  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci math.NA stat.CO

    Bayesian model calibration for diblock copolymer thin film self-assembly using power spectrum of microscopy data and machine learning surrogate

    Authors: Lianghao Cao, Keyi Wu, J. Tinsley Oden, Peng Chen, Omar Ghattas

    Abstract: Identifying parameters of computational models from experimental data, or model calibration, is fundamental for assessing and improving the predictability and reliability of computer simulations. In this work, we propose a method for Bayesian calibration of models that predict morphological patterns of diblock copolymer (Di-BCP) thin film self-assembly while accounting for various sources of uncer…

    Submitted 3 August, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Minor changes from the original submission, including a change in the title

  10. arXiv:2306.04126  [pdf, ps, other]

    stat.ME math.ST

    Bootstrap Prediction Inference of Non-linear Autoregressive Models

    Authors: Kejin Wu, Dimitris N. Politis

    Abstract: The non-linear autoregressive (NLAR) model plays an important role in modeling and predicting time series. One-step ahead prediction is straightforward using the NLAR model, but the multi-step ahead prediction is cumbersome. For instance, iterating the one-step ahead predictor is a convenient strategy for linear autoregressive (LAR) models, but it is suboptimal under NLAR. In this paper, we first…

    Submitted 6 June, 2023; originally announced June 2023.

  11. arXiv:2305.15572  [pdf, other]

    cs.LG stat.ML

    The Behavior and Convergence of Local Bayesian Optimization

    Authors: Kaiwen Wu, Kyurae Kim, Roman Garnett, Jacob R. Gardner

    Abstract: A recent development in Bayesian optimization is the use of local optimization strategies, which can deliver strong empirical performance on high-dimensional problems compared to traditional global strategies. The "folk wisdom" in the literature is that the focus on local optimization sidesteps the curse of dimensionality; however, little is known concretely about the expected behavior or converge…

    Submitted 8 March, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 27 pages; NeurIPS 2023

  12. arXiv:2305.15349  [pdf, other]

    cs.LG eess.SP math.OC stat.CO stat.ML

    On the Convergence of Black-Box Variational Inference

    Authors: Kyurae Kim, Jisu Oh, Kaiwen Wu, Yi-An Ma, Jacob R. Gardner

    Abstract: We provide the first convergence guarantee for full black-box variational inference (BBVI), also known as Monte Carlo variational inference. While preliminary investigations worked on simplified versions of BBVI (e.g., bounded domain, bounded support, only optimizing for the scale, and such), our setup does not need any such algorithmic modifications. Our results hold for log-smooth posterior dens…

    Submitted 10 January, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS'23; previous title: "Black-Box Variational Inference Converges"

  13. arXiv:2303.10472  [pdf, ps, other]

    cs.LG math.OC stat.CO stat.ML

    Practical and Matching Gradient Variance Bounds for Black-Box Variational Bayesian Inference

    Authors: Kyurae Kim, Kaiwen Wu, Jisu Oh, Jacob R. Gardner

    Abstract: Understanding the gradient variance of black-box variational inference (BBVI) is a crucial step for establishing its convergence and developing algorithmic improvements. However, existing studies have yet to show that the gradient variance of BBVI satisfies the conditions used to study the convergence of stochastic gradient descent (SGD), the workhorse of BBVI. In this work, we show that BBVI sati…

    Submitted 3 June, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

    Comments: Accepted to ICML'23 for live oral presentation

  14. arXiv:2302.03358  [pdf, other]

    cs.LG math.DS math.NA physics.comp-ph stat.ML

    Deep-OSG: Deep Learning of Operators in Semigroup

    Authors: Junfeng Chen, Kailiang Wu

    Abstract: This paper proposes a novel deep learning approach for learning operators in semigroup, with applications to modeling unknown autonomous dynamical systems using time series data collected at varied time lags. It is a sequel to the previous flow map learning (FML) works [T. Qin, K. Wu, and D. Xiu, J. Comput. Phys., 395:620--635, 2019], [K. Wu and D. Xiu, J. Comput. Phys., 408:109307, 2020], and [Z.…

    Submitted 12 September, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  15. arXiv:2205.09703  [pdf, other]

    cs.LG cs.DC cs.PF eess.SY stat.AP

    Extract Dynamic Information To Improve Time Series Modeling: a Case Study with Scientific Workflow

    Authors: Jeeyung Kim, Mengtian Jin, Youkow Homma, Alex Sim, Wilko Kroeger, Kesheng Wu

    Abstract: In modeling time series data, we often need to augment the existing data records to increase the modeling accuracy. In this work, we describe a number of techniques to extract dynamic information about the current state of a large scientific workflow, which could be generalized to other types of applications. The specific task to be modeled is the time needed for transferring a file from an experi…

    Submitted 19 May, 2022; originally announced May 2022.

  16. arXiv:2205.06622  [pdf]

    stat.AP

    What Makes You Hold on to That Old Car? Joint Insights from Machine Learning and Multinomial Logit on Vehicle-level Transaction Decisions

    Authors: Ling Jin, Alina Lazar, Caitlin Brown, Bingrong Sun, Venu Garikapati, Srinath Ravulaparthy, Qianmiao Chen, Alexander Sim, Kesheng Wu, Tin Ho, Thomas Wenzel, C. Anna Spurlock

    Abstract: What makes you hold on that old car? While the vast majority of the household vehicles are still powered by conventional internal combustion engines, the progress of adopting emerging vehicle technologies will critically depend on how soon the existing vehicles are transacted out of the household fleet. Leveraging a nationally representative longitudinal data set, the Panel Study of Income Dynamic…

    Submitted 13 May, 2022; originally announced May 2022.

  17. arXiv:2203.09611  [pdf, other]

    cs.LG cs.AI cs.DB cs.SI stat.ML

    STICC: A multivariate spatial clustering method for repeated geographic pattern discovery with consideration of spatial contiguity

    Authors: Yuhao Kang, Kunlin Wu, Song Gao, Ignavier Ng, Jinmeng Rao, Shan Ye, Fan Zhang, Teng Fei

    Abstract: Spatial clustering has been widely used for spatial data mining and knowledge discovery. An ideal multivariate spatial clustering should consider both spatial contiguity and aspatial attributes. Existing spatial clustering approaches may face challenges for discovering repeated geographic patterns with spatial contiguity maintained. In this paper, we propose a Spatial Toeplitz Inverse Covariance-B…

    Submitted 30 March, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Journal ref: International Journal of Geographical Information Science, Year 2022

  18. arXiv:2202.12636  [pdf, other]

    stat.ML cs.LG

    Learning Multi-Task Gaussian Process Over Heterogeneous Input Domains

    Authors: Haitao Liu, Kai Wu, Yew-Soon Ong, Chao Bian, Xiaomo Jiang, Xiaofang Wang

    Abstract: Multi-task Gaussian process (MTGP) is a well-known non-parametric Bayesian model for learning correlated tasks effectively by transferring knowledge across tasks. But current MTGPs are usually limited to the multi-task scenario defined in the same input domain, leaving no space for tackling the heterogeneous case, i.e., the features of input domains vary over tasks. To this end, this paper present…

    Submitted 18 June, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  19. arXiv:2112.08601  [pdf, ps, other]

    stat.AP

    A New Model-free Prediction Method: GA-NoVaS

    Authors: Kejin Wu, Sayar Karmakar

    Abstract: Volatility forecasting plays an important role in the financial econometrics. Previous works in this regime are mainly based on applying various GARCH-type models. However, it is hard for people to choose a specific GARCH model which works for general cases and such traditional methods are unstable for dealing with high-volatile period or using small sample size. The newly proposed normalizing and…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2101.02273

  20. arXiv:2109.13055  [pdf, other]

    stat.ML cs.LG stat.CO

    Minimax Mixing Time of the Metropolis-Adjusted Langevin Algorithm for Log-Concave Sampling

    Authors: Keru Wu, Scott Schmidler, Yuansi Chen

    Abstract: We study the mixing time of the Metropolis-adjusted Langevin algorithm (MALA) for sampling from a log-smooth and strongly log-concave distribution. We establish its optimal minimax mixing time under a warm start. Our main contribution is two-fold. First, for a $d$-dimensional log-concave density with condition number $κ$, we show that MALA with a warm start mixes in $\tilde O(κ\sqrt{d})$ iteration…

    Submitted 2 October, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: 63 pages, 2 figures

    Journal ref: Journal of Machine Learning Research, Vol. 23, No. 270, pp. 1-63 (2022)

  21. arXiv:2102.11027  [pdf, other]

    stat.AP cs.CY cs.LG

    Investigating Underlying Drivers of Variability in Residential Energy Usage Patterns with Daily Load Shape Clustering of Smart Meter Data

    Authors: Ling Jin, C. Anna Spurlock, Sam Borgeson, Alina Lazar, Daniel Fredman, Annika Todd, Alexander Sim, Kesheng Wu

    Abstract: Residential customers have traditionally not been treated as individual entities due to the high volatility in residential consumption patterns as well as a historic focus on aggregated loads from the utility and system feeder perspective. Large-scale deployment of smart meters has motivated increasing studies to explore disaggregated daily load patterns, which can reveal important heterogeneity a…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: 11 pages, 11 figures

  22. arXiv:2101.02273  [pdf, ps, other]

    stat.ME

    Model-free time-aggregated predictions for econometric datasets

    Authors: Kejin Wu, Sayar Karmakar

    Abstract: This article explores the existing normalizing and variance-stabilizing (NoVaS) method on predicting squared log-returns of financial data. First, we explore the robustness of the existing NoVaS method for long-term time-aggregated predictions. Then we develop a more parsimonious variant of the existing method. With systematic justification and extensive data analysis, our new method shows better…

    Submitted 4 November, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

  23. arXiv:2011.05625  [pdf, other]

    cs.IR stat.ML

    CAN: Feature Co-Action for Click-Through Rate Prediction

    Authors: Weijie Bian, Kailun Wu, Lejian Ren, Qi Pi, Yujing Zhang, Can Xiao, Xiang-Rong Sheng, Yong-Nan Zhu, Zhangming Chan, Na Mou, Xinchen Luo, Shiming Xiang, Guorui Zhou, Xiaoqiang Zhu, Hongbo Deng

    Abstract: Feature interaction has been recognized as an important problem in machine learning, which is also very essential for click-through rate (CTR) prediction tasks. In recent years, Deep Neural Networks (DNNs) can automatically learn implicit nonlinear interactions from original sparse features, and therefore have been widely used in industrial CTR prediction tasks. However, the implicit feature inter…

    Submitted 7 December, 2021; v1 submitted 11 November, 2020; originally announced November 2020.

    Comments: WSDM 2022

    MSC Class: Machine Learning (stat.ML); Information Retrieval (cs.IR); Machine Learning (cs.LG)

    ACM Class: I.2.6

  24. Bayesian inference of heterogeneous epidemic models: Application to COVID-19 spread accounting for long-term care facilities

    Authors: Peng Chen, Keyi Wu, Omar Ghattas

    Abstract: We propose a high dimensional Bayesian inference framework for learning heterogeneous dynamics of a COVID-19 model, with a specific application to the dynamics and severity of COVID-19 inside and outside long-term care (LTC) facilities. We develop a heterogeneous compartmental model that accounts for the heterogeneity of the time-varying spread and severity of COVID-19 inside and outside LTC facil…

    Submitted 2 November, 2020; originally announced November 2020.

  25. arXiv:2008.02883  [pdf, other]

    cs.LG stat.ML

    Stronger and Faster Wasserstein Adversarial Attacks

    Authors: Kaiwen Wu, Allen Houze Wang, Yaoliang Yu

    Abstract: Deep models, while being extremely flexible and accurate, are surprisingly vulnerable to "small, imperceptible" perturbations known as adversarial attacks. While the majority of existing attacks focus on measuring perturbations under the $\ell_p$ metric, Wasserstein distance, which takes geometry in pixel space into account, has long been known to be a suitable metric for measuring image quality a…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: 30 pages, accepted to ICML 2020

  26. arXiv:2007.09762  [pdf, other]

    cs.LG stat.ML

    A Theory of Multiple-Source Adaptation with Limited Target Labeled Data

    Authors: Yishay Mansour, Mehryar Mohri, Jae Ro, Ananda Theertha Suresh, Ke Wu

    Abstract: We present a theoretical and algorithmic study of the multiple-source domain adaptation problem in the common scenario where the learner has access only to a limited amount of labeled target data, but where the learner has at disposal a large amount of labeled data from multiple source domains. We show that a new family of algorithms based on model selection ideas benefits from very favorable guar…

    Submitted 29 October, 2020; v1 submitted 19 July, 2020; originally announced July 2020.

    Comments: 20 pages

  27. arXiv:2006.14592  [pdf, other]

    cs.LG math.OC stat.ML

    Newton-type Methods for Minimax Optimization

    Authors: Guojun Zhang, Kaiwen Wu, Pascal Poupart, Yaoliang Yu

    Abstract: Differential games, in particular two-player sequential zero-sum games (a.k.a. minimax optimization), have been an important modeling tool in applied science and received renewed interest in machine learning due to many recent applications, such as adversarial training, generative models and reinforcement learning. However, existing theory mostly focuses on convex-concave functions with few except…

    Submitted 18 February, 2023; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: code update

  28. arXiv:2003.05681  [pdf]

    q-bio.PE physics.bio-ph stat.AP

    Generalized logistic growth modeling of the COVID-19 outbreak: comparing the dynamics in the 29 provinces in China and in the rest of the world

    Authors: Ke Wu, Didier Darcet, Qian Wang, Didier Sornette

    Abstract: Started in Wuhan, China, the COVID-19 has been spreading all over the world. We calibrate the logistic growth model, the generalized logistic growth model, the generalized Richards model and the generalized growth model to the reported number of infected cases for the whole of China, 29 provinces in China, and 33 countries and regions that have been or are undergoing major outbreaks. We dissect th…

    Submitted 22 September, 2020; v1 submitted 12 March, 2020; originally announced March 2020.

    Journal ref: Nonlinear Dynamics, 2020

  29. arXiv:2003.03616  [pdf, other]

    stat.ML cs.CV cs.LG math.PR

    Diffusion State Distances: Multitemporal Analysis, Fast Algorithms, and Applications to Biological Networks

    Authors: Lenore Cowen, Kapil Devkota, Xiaozhe Hu, James M. Murphy, Kaiyi Wu

    Abstract: Data-dependent metrics are powerful tools for learning the underlying structure of high-dimensional data. This article develops and analyzes a data-dependent metric known as diffusion state distance (DSD), which compares points using a data-driven diffusion process. Unlike related diffusion methods, DSDs incorporate information across time scales, which allows for the intrinsic data structure to b…

    Submitted 7 March, 2020; originally announced March 2020.

    Comments: 28 pages

  30. arXiv:2003.02387  [pdf, other]

    math.NA math.AP stat.ML

    Methods to Recover Unknown Processes in Partial Differential Equations Using Data

    Authors: Zhen Chen, Kailiang Wu, Dongbin Xiu

    Abstract: We study the problem of identifying unknown processes embedded in time-dependent partial differential equation (PDE) using observational data, with an application to advection-diffusion type PDE. We first conduct theoretical analysis and derive conditions to ensure the solvability of the problem. We then present a set of numerical approaches, including Galerkin type algorithm and collocation type…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: 21 pages, 11 figures

    Journal ref: Journal of Scientific Computing 85, 23 (2020)

  31. arXiv:2003.01436  [pdf, other]

    cs.LG cs.SI stat.ML

    Learning to Generate Time Series Conditioned Graphs with Generative Adversarial Nets

    Authors: Shanchao Yang, Jing Liu, Kai Wu, Mingming Li

    Abstract: Deep learning based approaches have been utilized to model and generate graphs subjected to different distributions recently. However, they are typically unsupervised learning based and unconditioned generative models or simply conditioned on the graph-level contexts, which are not associated with rich semantic node-level contexts. Differently, in this paper, we are interested in a novel problem n…

    Submitted 26 August, 2023; v1 submitted 3 March, 2020; originally announced March 2020.

  32. arXiv:2002.04658  [pdf, other]

    cs.LG cs.CV stat.ML

    A Non-Intrusive Correction Algorithm for Classification Problems with Corrupted Data

    Authors: Jun Hou, Tong Qin, Kailiang Wu, Dongbin Xiu

    Abstract: A novel correction algorithm is proposed for multi-class classification problems with corrupted training data. The algorithm is non-intrusive, in the sense that it post-processes a trained classification model by adding a correction procedure to the model prediction. The correction procedure can be coupled with any approximators, such as logistic regression, neural networks of various architecture…

    Submitted 11 February, 2020; originally announced February 2020.

  33. arXiv:1910.06948  [pdf, ps, other]

    math.NA cs.LG cs.NE stat.ML

    Data-Driven Deep Learning of Partial Differential Equations in Modal Space

    Authors: Kailiang Wu, Dongbin Xiu

    Abstract: We present a framework for recovering/approximating unknown time-dependent partial differential equation (PDE) using its solution data. Instead of identifying the terms in the underlying PDE, we seek to approximate the evolution operator of the underlying PDE numerically. The evolution operator of the PDE, defined in infinite-dimensional space, maps the solution from a current time to a future tim…

    Submitted 18 October, 2019; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: Minor notational changes

    Journal ref: Journal of Computational Physics, 408: 109307, 2020

  34. arXiv:1907.11780  [pdf, other]

    cs.LG stat.ML

    Understanding Adversarial Robustness: The Trade-off between Minimum and Average Margin

    Authors: Kaiwen Wu, Yaoliang Yu

    Abstract: Deep models, while being extremely versatile and accurate, are vulnerable to adversarial attacks: slight perturbations that are imperceptible to humans can completely flip the prediction of deep models. Many attack and defense mechanisms have been proposed, although a satisfying solution still largely remains elusive. In this work, we give strong evidence that during training, deep models maximize…

    Submitted 26 July, 2019; originally announced July 2019.

  35. arXiv:1906.10304  [pdf, other]

    stat.ML cs.LG

    Res-embedding for Deep Learning Based Click-Through Rate Prediction Modeling

    Authors: Guorui Zhou, Kailun Wu, Weijie Bian, Zhao Yang, Xiaoqiang Zhu, Kun Gai

    Abstract: Recently, click-through rate (CTR) prediction models have evolved from shallow methods to deep neural networks. Most deep CTR models follow an Embedding\&MLP paradigm, that is, first mapping discrete id features, e.g. user visited items, into low dimensional vectors with an embedding module, then learn a multi-layer perception (MLP) to fit the target. In this way, embedding module performs as the…

    Submitted 24 June, 2019; originally announced June 2019.

  36. arXiv:1905.10396  [pdf, other]

    math.NA cs.LG math.DS physics.comp-ph stat.ML

    Structure-preserving Method for Reconstructing Unknown Hamiltonian Systems from Trajectory Data

    Authors: Kailiang Wu, Tong Qin, Dongbin Xiu

    Abstract: We present a numerical approach for approximating unknown Hamiltonian systems using observation data. A distinct feature of the proposed method is that it is structure-preserving, in the sense that it enforces conservation of the reconstructed Hamiltonian. This is achieved by directly approximating the underlying unknown Hamiltonian, rather than the right-hand-side of the governing equations. We p…

    Submitted 19 August, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: 27 pages, 19 figures

    Journal ref: SIAM Journal on Scientific Computing 42 (6), A3704--A3729, 2020

  37. arXiv:1905.06125  [pdf, other]

    cs.LG cs.AI stat.ML

    Distributional Reinforcement Learning for Efficient Exploration

    Authors: Borislav Mavrin, Shangtong Zhang, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yaoliang Yu

    Abstract: In distributional reinforcement learning (RL), the estimated distribution of value function models both the parametric and intrinsic uncertainties. We propose a novel and efficient exploration method for deep RL that has two components. The first is a decaying schedule to suppress the intrinsic uncertainty. The second is an exploration bonus calculated from the upper quantiles of the learned distr…

    Submitted 13 May, 2019; originally announced May 2019.

    Journal ref: ICML, 2019

  38. arXiv:1902.02627  [pdf, other]

    eess.SP cs.LG stat.ML

    Fast Transient Simulation of High-Speed Channels Using Recurrent Neural Network

    Authors: Thong Nguyen, Tianjian Lu, Ken Wu, Jose Schutt-Aine

    Abstract: Generating eye diagrams by using a circuit simulator can be very computationally intensive, especially in the presence of nonlinearities. It often involves multiple Newton-like iterations at every time step when a SPICE-like circuit simulator handles a nonlinear system in the transient regime. In this paper, we leverage machine learning methods, to be specific, the recurrent neural network (RNN),…

    Submitted 7 February, 2019; v1 submitted 25 January, 2019; originally announced February 2019.

  39. arXiv:1812.09645  [pdf, other]

    cs.LG stat.ML

    Mixed Membership Recurrent Neural Networks

    Authors: Ghazal Fazelnia, Mark Ibrahim, Ceena Modarres, Kevin Wu, John Paisley

    Abstract: Models for sequential data such as the recurrent neural network (RNN) often implicitly model a sequence as having a fixed time interval between observations and do not account for group-level effects when multiple sequences are observed. We propose a model for grouped sequential data based on the RNN that accounts for varying time intervals between observations in a sequence by learning a group-le…

    Submitted 22 December, 2018; originally announced December 2018.

  40. arXiv:1811.05537  [pdf, other]

    math.NA cs.LG cs.NE math.DS stat.ML

    Data Driven Governing Equations Approximation Using Deep Neural Networks

    Authors: Tong Qin, Kailiang Wu, Dongbin Xiu

    Abstract: We present a numerical framework for approximating unknown governing equations using observation data and deep neural networks (DNN). In particular, we propose to use residual network (ResNet) as the basic building block for equation approximation. We demonstrate that the ResNet block can be considered as a one-step method that is exact in temporal integration. We then present two multi-step metho…

    Submitted 13 November, 2018; originally announced November 2018.

  41. arXiv:1811.00620  [pdf, other]

    cs.LG stat.ML

    Efficient Online Hyperparameter Optimization for Kernel Ridge Regression with Applications to Traffic Time Series Prediction

    Authors: Hongyuan Zhan, Gabriel Gomes, Xiaoye S. Li, Kamesh Madduri, Kesheng Wu

    Abstract: Computational efficiency is an important consideration for deploying machine learning models for time series prediction in an online setting. Machine learning algorithms adjust model parameters automatically based on the data, but often require users to set additional parameters, known as hyperparameters. Hyperparameters can significantly impact prediction accuracy. Traffic measurements, typically…

    Submitted 1 November, 2018; originally announced November 2018.

    Comments: An extended version of "Efficient Online Hyperparameter Learning for Traffic Flow Prediction" published in The 21st IEEE International Conference on Intelligent Transportation Systems (ITSC 2018)

    Journal ref: H. Zhan, G. Gomes, X. S. Li, K. Madduri, and K. Wu. Efficient Online Hyperparameter Learning for Traffic Flow Prediction. In 2018 IEEE 21st International Conference on Intelligent Transportation Systems (ITSC), pages 1-6. IEEE, 2018

  42. arXiv:1810.01483  [pdf, other]

    astro-ph.CO cs.CV stat.ML

    DeepCMB: Lensing Reconstruction of the Cosmic Microwave Background with Deep Neural Networks

    Authors: João Caldeira, W. L. Kimmy Wu, Brian Nord, Camille Avestruz, Shubhendu Trivedi, Kyle T. Story

    Abstract: Next-generation cosmic microwave background (CMB) experiments will have lower noise and therefore increased sensitivity, enabling improved constraints on fundamental physics parameters such as the sum of neutrino masses and the tensor-to-scalar ratio r. Achieving competitive constraints on these parameters requires high signal-to-noise extraction of the projected gravitational potential from the C…

    Submitted 12 June, 2020; v1 submitted 2 October, 2018; originally announced October 2018.

    Comments: 19 pages; LaTeX; 12 figures; changes to match published version

    Report number: FERMILAB-PUB-18-515-A-CD

    Journal ref: Astronomy and Computing 28 100307 (2019)

  43. arXiv:1809.09170  [pdf, other]

    math.NA cs.LG math.DS stat.ML

    Numerical Aspects for Approximating Governing Equations Using Data

    Authors: Kailiang Wu, Dongbin Xiu

    Abstract: We present effective numerical algorithms for locally recovering unknown governing differential equations from measurement data. We employ a set of standard basis functions, e.g., polynomials, to approximate the governing equation with high accuracy. Upon recasting the problem into a function approximation problem, we discuss several important aspects for accurate approximation. Most notably, we d…

    Submitted 24 September, 2018; originally announced September 2018.

    Comments: 26 pages, 17 figures

    Journal ref: Journal of Computational Physics, 384, 200-221, 2019

  44. arXiv:1703.10951  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Comparison of multi-task convolutional neural network (MT-CNN) and a few other methods for toxicity prediction

    Authors: Kedi Wu, Guo-Wei Wei

    Abstract: Toxicity analysis and prediction are of paramount importance to human health and environmental protection. Existing computational methods are built from a wide variety of descriptors and regressors, which makes their performance analysis difficult. For example, deep neural network (DNN), a successful approach in many occasions, acts like a black box and offers little conceptual elegance or physica…

    Submitted 31 March, 2017; originally announced March 2017.

  45. arXiv:1607.07834  [pdf]

    q-bio.QM stat.ME

    A W-test collapsing method for rare variant testing with applications to exome sequencing data of hypertensive disorder

    Authors: Rui Sun, Haoyi Weng, Inchi Hu, Junfeng Guo, William K. K. Wu, Benny Chung-Ying Zee, Maggie Haitian Wang

    Abstract: Advancement in sequencing technology enables the study of association between complex disorders and rare variants with low minor allele frequencies. One of the major challenges in rare variant testing is lack of statistical power of traditional testing methods due to extremely low variances of single nucleotide polymorphisms. In this paper, we introduce a W-test collapsing method that evaluates th…

    Submitted 26 July, 2016; originally announced July 2016.

    Comments: 18 pages, 1 figure, 4 tables. Genetic Epidemiology accepted

  46. A surrogate accelerated multicanonical Monte Carlo method for uncertainty quantification

    Authors: Keyi Wu, Jinglai Li

    Abstract: In this work we consider a class of uncertainty quantification problems where the system performance or reliability is characterized by a scalar parameter $y$. The performance parameter $y$ is random due to the presence of various sources of uncertainty in the system, and our goal is to estimate the probability density function (PDF) of $y$. We propose to use the multicanonical Monte Carlo (MMC) m…

    Submitted 12 April, 2016; v1 submitted 26 August, 2015; originally announced August 2015.

    MSC Class: 65C05; 65C50