Showing 1–32 of 32 results for author: Jiang, W

Searching in archive stat.
  1. arXiv:2403.05803  [pdf, other]

    econ.EM stat.ME

    Semiparametric Inference for Regression-Discontinuity Designs

    Authors: Rong J. B. Zhu, Weiwei Jiang

    Abstract: Treatment effects in regression discontinuity designs (RDDs) are often estimated using local regression methods. However, global approximation methods are generally deemed inefficient. In this paper, we propose a semiparametric framework tailored for estimating treatment effects in RDDs. Our global approach conceptualizes the identification of treatment effects within RDDs as a partially linear mo…

    Submitted 9 March, 2024; originally announced March 2024.

  2. arXiv:2403.03562  [pdf, other]

    cs.LG stat.ML

    Efficient Algorithms for Empirical Group Distributional Robust Optimization and Beyond

    Authors: Dingzhi Yu, Yunuo Cai, Wei Jiang, Lijun Zhang

    Abstract: We investigate the empirical counterpart of group distributionally robust optimization (GDRO), which aims to minimize the maximal empirical risk across $m$ distinct groups. We formulate empirical GDRO as a $\textit{two-level}$ finite-sum convex-concave minimax optimization problem and develop a stochastic variance reduced mirror prox algorithm. Unlike existing methods, we construct the stochastic…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 30 pages, 1 figure

  3. arXiv:2401.17573  [pdf]

    stat.ML cs.LG eess.IV eess.SY

    Tensor-based process control and monitoring for semiconductor manufacturing with unstable disturbances

    Authors: Yanrong Li, Juan Du, Fugee Tsung, Wei Jiang

    Abstract: With the development and popularity of sensors installed in manufacturing systems, complex data are collected during manufacturing processes, which brings challenges for traditional process control methods. This paper proposes a novel process control and monitoring method for the complex structure of high-dimensional image-based overlay errors (modeled in tensor form), which are collected in semic…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 30 pages, 5 figures

  4. arXiv:2312.12477  [pdf, other]

    cs.LG cs.AI stat.ME

    When Graph Neural Network Meets Causality: Opportunities, Methodologies and An Outlook

    Authors: Wenzhao Jiang, Hao Liu, Hui Xiong

    Abstract: Graph Neural Networks (GNNs) have emerged as powerful representation learning tools for capturing complex dependencies within diverse graph-structured data. Despite their success in a wide range of graph mining tasks, GNNs have raised serious concerns regarding their trustworthiness, including susceptibility to distribution shift, biases towards certain populations, and lack of explainability. Rec…

    Submitted 17 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  5. arXiv:2309.09205  [pdf]

    cs.LG eess.SY stat.ML

    MFRL-BI: Design of a Model-free Reinforcement Learning Process Control Scheme by Using Bayesian Inference

    Authors: Yanrong Li, Juan Du, Wei Jiang

    Abstract: Design of process control scheme is critical for quality assurance to reduce variations in manufacturing systems. Taking semiconductor manufacturing as an example, extensive literature focuses on control optimization based on certain process models (usually linear models), which are obtained by experiments before a manufacturing process starts. However, in real applications, pre-defined models may…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: 31 pages, 7 figures, and 3 tables

  6. arXiv:2309.08642  [pdf, other]

    eess.SY cs.AI cs.LG stat.ME

    A Stochastic Online Forecast-and-Optimize Framework for Real-Time Energy Dispatch in Virtual Power Plants under Uncertainty

    Authors: Wei Jiang, Zhongkai Yi, Li Wang, Hanwei Zhang, Jihai Zhang, Fangquan Lin, Cheng Yang

    Abstract: Aggregating distributed energy resources in power systems significantly increases uncertainties, in particular caused by the fluctuation of renewable energy generation. This issue has driven the necessity of widely exploiting advanced predictive control techniques under uncertainty to ensure long-term economics and decarbonization. In this paper, we propose a real-time uncertainty-aware energy dis…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: Preprint. Accepted by CIKM 23

  7. arXiv:2212.03375  [pdf, other]

    cs.LG math.PR math.ST stat.ML

    General multi-fidelity surrogate models: Framework and active learning strategies for efficient rare event simulation

    Authors: Promit Chakroborty, Somayajulu L. N. Dhulipala, Yifeng Che, Wen Jiang, Benjamin W. Spencer, Jason D. Hales, Michael D. Shields

    Abstract: Estimating the probability of failure for complex real-world systems using high-fidelity computational models is often prohibitively expensive, especially when the probability is small. Exploiting low-fidelity models can make this process more feasible, but merging information from multiple low-fidelity and high-fidelity models poses several challenges. This paper presents a robust multi-fidelity…

    Submitted 6 December, 2022; originally announced December 2022.

  8. arXiv:2211.11115  [pdf, other]

    stat.AP physics.comp-ph

    Multifidelity Active Learning for Failure Estimation of TRISO Nuclear Fuel

    Authors: Somayajulu L. N. Dhulipala, Promit Chakroborty, Michael D. Shields, Wen Jiang, Benjamin W. Spencer, Jason D. Hales

    Abstract: The Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel proposed to be used for multiple modern nuclear technologies. Therefore, characterizing its safety is vital for the reliable operation of nuclear technologies. However, the TRISO fuel failure probabilities are small and the computational model is time consuming to evaluate them using traditional Monte Carlo-type appr…

    Submitted 20 November, 2022; originally announced November 2022.

  9. Consistent Covariance estimation for stratum imbalances under minimization method for covariate-adaptive randomization

    Authors: Zixuan Zhao, Yanglei Song, Wenyu Jiang, Dongsheng Tu

    Abstract: Pocock and Simon's minimization method is a popular approach for covariate-adaptive randomization in clinical trials. Valid statistical inference with data collected under the minimization method requires the knowledge of the limiting covariance matrix of within-stratum imbalances, whose existence is only recently established. In this work, we propose a bootstrap-based estimator for this limit and…

    Submitted 26 December, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: 29 pages, peer reviewed version, will appear in Scandinavian Journal of Statistics

  10. arXiv:2201.02172  [pdf, other]

    stat.AP stat.CO stat.ML

    Reliability Estimation of an Advanced Nuclear Fuel using Coupled Active Learning, Multifidelity Modeling, and Subset Simulation

    Authors: Somayajulu L. N. Dhulipala, Michael D. Shields, Promit Chakroborty, Wen Jiang, Benjamin W. Spencer, Jason D. Hales, Vincent M. Laboure, Zachary M. Prince, Chandrakanth Bolisetti, Yifeng Che

    Abstract: Tristructural isotropic (TRISO)-coated particle fuel is a robust nuclear fuel and determining its reliability is critical for the success of advanced nuclear technologies. However, TRISO failure probabilities are small and the associated computational models are expensive. We used coupled active learning, multifidelity modeling, and subset simulation to estimate the failure probabilities of TRISO…

    Submitted 6 January, 2022; originally announced January 2022.

  11. arXiv:2112.04677  [pdf, other]

    stat.ML cs.LG

    A Note on Comparison of F-measures

    Authors: Wei Ju, Wenxin Jiang

    Abstract: We comment on a recent TKDE paper "Linear Approximation of F-measure for the Performance Evaluation of Classification Algorithms on Imbalanced Data Sets", and make two improvements related to comparison of F-measures for two prediction rules.

    Submitted 8 December, 2021; originally announced December 2021.

  12. Bayesian Inverse Uncertainty Quantification of a MOOSE-based Melt Pool Model for Additive Manufacturing Using Experimental Data

    Authors: Ziyu Xie, Wen Jiang, Congjian Wang, Xu Wu

    Abstract: Additive manufacturing (AM) technology is being increasingly adopted in a wide variety of application areas due to its ability to rapidly produce, prototype, and customize designs. AM techniques afford significant opportunities in regard to nuclear materials, including an accelerated fabrication process and reduced cost. High-fidelity modeling and simulation (M&S) of AM processes is being develop…

    Submitted 17 May, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

    Comments: 26 pages, 11 figures

  13. arXiv:2102.06933  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Revisiting Smoothed Online Learning

    Authors: Lijun Zhang, Wei Jiang, Shiyin Lu, Tianbao Yang

    Abstract: In this paper, we revisit the problem of smoothed online learning, in which the online learner suffers both a hitting cost and a switching cost, and target two performance metrics: competitive ratio and dynamic regret with switching cost. To bound the competitive ratio, we assume the hitting cost is known to the learner in each round, and investigate the simple idea of balancing the two costs by…

    Submitted 18 May, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

  14. arXiv:2012.14894  [pdf, other]

    stat.ML cs.LG

    Statistical Formulas for F Measures

    Authors: Wenxin Jiang

    Abstract: We provide analytic formulas for the standard error and confidence intervals for the F measures, based on a property of asymptotic normality in the large sample limit. The formula can be applied for sample size planning in order to achieve accurate enough estimation of these F measures.

    Submitted 29 December, 2020; originally announced December 2020.

    MSC Class: 62F12; 62P99

  15. arXiv:2007.14080  [pdf, ps, other]

    stat.ME stat.CO

    A set of efficient methods to generate high-dimensional binary data with specified correlation structures

    Authors: Wei Jiang, Shuang Song, Lin Hou, Hongyu Zhao

    Abstract: High dimensional correlated binary data arise in many areas, such as observed genetic variations in biomedical research. Data simulation can help researchers evaluate efficiency and explore properties of different computational and statistical methods. Also, some statistical methods, such as Monte-Carlo methods, rely on data simulation. Lunn and Davies (1998) proposed linear time complexity method…

    Submitted 28 July, 2020; originally announced July 2020.

  16. arXiv:2007.09087  [pdf, ps, other]

    cs.LG cs.NE eess.SP stat.ML

    Standing on the Shoulders of Giants: Hardware and Neural Architecture Co-Search with Hot Start

    Authors: Weiwen Jiang, Lei Yang, Sakyasingha Dasgupta, Jingtong Hu, Yiyu Shi

    Abstract: Hardware and neural architecture co-search that automatically generates Artificial Intelligence (AI) solutions from a given dataset is promising to promote AI democratization; however, the amount of time that is required by current co-search frameworks is in the order of hundreds of GPU hours for one target hardware. This inhibits the use of such frameworks on commodity hardware. The root cause of…

    Submitted 17 July, 2020; originally announced July 2020.

    Comments: 13 pages

  17. arXiv:2004.01024  [pdf, other]

    cs.SI cs.LG stat.ML

    Modeling Dynamic Heterogeneous Network for Link Prediction using Hierarchical Attention with Temporal RNN

    Authors: Hansheng Xue, Luwei Yang, Wen Jiang, Yi Wei, Yi Hu, Yu Lin

    Abstract: Network embedding aims to learn low-dimensional representations of nodes while capturing structure information of networks. It has achieved great success on many tasks of network analysis such as link prediction and node classification. Most of existing network embedding algorithms focus on how to learn static homogeneous networks effectively. However, networks in the real world are more complex,…

    Submitted 1 April, 2020; originally announced April 2020.

  18. arXiv:2004.00378  [pdf, other]

    cs.LG eess.SP stat.ML

    Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems

    Authors: Weiheng Jiang, Xiaogang Wu, Bolin Chen, Wenjiang Feng, Yi Jin

    Abstract: Blind modulation classification is an important step to implement cognitive radio networks. The multiple-input multiple-output (MIMO) technique is widely used in military and civil communication systems. Due to the lack of prior information about channel parameters and the overlapping of signals in the MIMO systems, the traditional likelihood-based and feature-based approaches cannot be applied in…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: 12 pages, 11 figures

  19. arXiv:2002.04116  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks

    Authors: Lei Yang, Zheyu Yan, Meng Li, Hyoukjun Kwon, Liangzhen Lai, Tushar Krishna, Vikas Chandra, Weiwen Jiang, Yiyu Shi

    Abstract: Neural Architecture Search (NAS) has demonstrated its power on various AI accelerating platforms such as Field Programmable Gate Arrays (FPGAs) and Graphic Processing Units (GPUs). However, it remains an open problem, how to integrate NAS with Application-Specific Integrated Circuits (ASICs), despite them being the most powerful AI accelerating platforms. The major bottleneck comes from the large…

    Submitted 10 February, 2020; originally announced February 2020.

    Comments: Accepted by DAC'20

  20. arXiv:1909.06631  [pdf, other]

    stat.ME stat.AP stat.CO

    Adaptive Bayesian SLOPE -- High-dimensional Model Selection with Missing Values

    Authors: Wei Jiang, Malgorzata Bogdan, Julie Josse, Blazej Miasojedow, Veronika Rockova, TraumaBase Group

    Abstract: We consider the problem of variable selection in high-dimensional settings with missing observations among the covariates. To address this relatively understudied problem, we propose a new synergistic procedure -- adaptive Bayesian SLOPE -- which effectively combines the SLOPE method (sorted $l_1$ regularization) together with the Spike-and-Slab LASSO method. We position our approach within a Baye…

    Submitted 6 November, 2019; v1 submitted 14 September, 2019; originally announced September 2019.

    Comments: R package https://github.com/wjiang94/ABSLOPE

  21. Modeling of Missing Dynamical Systems: Deriving Parametric Models using a Nonparametric Framework

    Authors: Shixiao W. Jiang, John Harlim

    Abstract: In this paper, we consider modeling missing dynamics with a nonparametric non-Markovian model, constructed using the theory of kernel embedding of conditional distributions on appropriate Reproducing Kernel Hilbert Spaces (RKHS), equipped with orthonormal basis functions. Depending on the choice of the basis functions, the resulting closure model from this nonparametric modeling formulation is in…

    Submitted 22 June, 2020; v1 submitted 17 May, 2019; originally announced May 2019.

  22. A Novel GAN-based Fault Diagnosis Approach for Imbalanced Industrial Time Series

    Authors: Wenqian Jiang, Cheng Cheng, Beitong Zhou, Guijun Ma, Ye Yuan

    Abstract: This paper proposes a novel fault diagnosis approach based on generative adversarial networks (GAN) for imbalanced industrial time series where normal samples are much larger than failure cases. We combine a well-designed feature extractor with GAN to help train the whole network. Aimed at obtaining data distribution and hidden pattern in both original distinguishing features and latent space, the…

    Submitted 1 April, 2019; originally announced April 2019.

  23. STFNets: Learning Sensing Signals from the Time-Frequency Perspective with Short-Time Fourier Neural Networks

    Authors: Shuochao Yao, Ailing Piao, Wenjun Jiang, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Jinyang Li, Tianshi Wang, Shaohan Hu, Lu Su, Jiawei Han, Tarek Abdelzaher

    Abstract: Recent advances in deep learning motivate the use of deep neural networks in Internet-of-Things (IoT) applications. These networks are modelled after signal processing in the human brain, thereby leading to significant advantages at perceptual tasks such as vision and speech recognition. IoT applications, however, often measure physical phenomena, where the underlying physics (such as inertia, wir…

    Submitted 20 February, 2019; originally announced February 2019.

  24. arXiv:1805.04602  [pdf, other]

    stat.ME

    Logistic Regression with Missing Covariates -- Parameter Estimation, Model Selection and Prediction within a Joint-Modeling Framework

    Authors: Wei Jiang, Julie Josse, Marc Lavielle, TraumaBase Group

    Abstract: Logistic regression is a common classification method in supervised learning. Surprisingly, there are very few solutions for performing logistic regression with missing values in the covariates. We suggest a complete approach based on a stochastic approximation version of the EM algorithm to do statistical inference with missing values including the estimation of the parameters and their variance,…

    Submitted 7 August, 2019; v1 submitted 11 May, 2018; originally announced May 2018.

    Comments: R package misaem https://CRAN.R-project.org/package=misaem, R implementations https://github.com/wjiang94/miSAEM_logReg

  25. Ecological Regression with Partial Identification

    Authors: Wenxin Jiang, Gary King, Allen Schmaltz, Martin A. Tanner

    Abstract: Ecological inference (EI) is the process of learning about individual behavior from aggregate data. We study a partially identified linear contextual effects model for EI and describe how to estimate the district level parameter averaging over many precincts in the presence of the non-identified parameter of the contextual effect. This may be regarded as a first attempt in this venerable literatur…

    Submitted 23 April, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

    MSC Class: 62P25; 62J99

    Journal ref: Polit. Anal. 28 (2020) 65-86

  26. arXiv:1607.05573  [pdf, other]

    stat.AP stat.ML

    Combining Random Walks and Nonparametric Bayesian Topic Model for Community Detection

    Authors: Ruimin Zhu, Wenxin Jiang

    Abstract: Community detection has been an active research area for decades. Among all probabilistic models, Stochastic Block Model has been the most popular one. This paper introduces a novel probabilistic model: RW-HDP, based on random walks and Hierarchical Dirichlet Process, for community extraction. In RW-HDP, random walks conducted in a social network are treated as documents; nodes are treated as word…

    Submitted 2 August, 2016; v1 submitted 19 July, 2016; originally announced July 2016.

  27. arXiv:1511.05680  [pdf, other]

    cs.CR cs.DS stat.ML

    Wishart Mechanism for Differentially Private Principal Components Analysis

    Authors: Wuxuan Jiang, Cong Xie, Zhihua Zhang

    Abstract: We propose a new input perturbation mechanism for publishing a covariance matrix to achieve $(ε,0)$-differential privacy. Our mechanism uses a Wishart distribution to generate matrix noise. In particular, we apply this mechanism to principal component analysis. Our mechanism is able to keep the positive semi-definiteness of the published covariance matrix. Thus, our approach gives rise to a genera…

    Submitted 19 November, 2015; v1 submitted 18 November, 2015; originally announced November 2015.

    Comments: A full version with technical proofs. Accepted to AAAI-16

  28. arXiv:1508.06715  [pdf, other]

    q-bio.GN stat.AP

    Estimating Reproducibility in Genome-Wide Association Studies

    Authors: Wei Jiang, Jing-Hao Xue, Weichuan Yu

    Abstract: Genome-wide association studies (GWAS) are widely used to discover genetic variants associated with diseases. To control false positives, all findings from GWAS need to be verified with additional evidences, even for associations discovered from a high power study. Replication study is a common verification method by using independent samples. An association is regarded as true positive with a hig…

    Submitted 26 August, 2015; originally announced August 2015.

  29. arXiv:1301.7390  [pdf]

    cs.LG stat.ML

    Hierarchical Mixtures-of-Experts for Exponential Family Regression Models with Generalized Linear Mean Functions: A Survey of Approximation and Consistency Results

    Authors: Wenxin Jiang, Martin A. Tanner

    Abstract: We investigate a class of hierarchical mixtures-of-experts (HME) models where exponential family regression models with generalized linear mean functions of the form $\psi(\alpha + \mathbf{x}^T \boldsymbol{\beta})$ are mixed. Here $\psi(\cdot)$ is the inverse link function. Suppose the true response $y$ follows an exponential family regression model with mean function belonging to a class of smooth functions of the form $\psi(h(\mathbf{x}))$ wh…

    Submitted 30 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI1998)

    Report number: UAI-P-1998-PG-296-303

  30. arXiv:1110.2058  [pdf, ps, other]

    math.ST stat.ME stat.ML

    Convergence Rates for Mixture-of-Experts

    Authors: Eduardo F. Mendes, Wenxin Jiang

    Abstract: In mixtures-of-experts (ME) model, where a number of submodels (experts) are combined, there have been two longstanding problems: (i) how many experts should be chosen, given the size of the training data? (ii) given the total number of parameters, is it better to use a few very complex experts, or is it better to combine many simple experts? In this paper, we try to provide some insights to these…

    Submitted 1 November, 2011; v1 submitted 10 October, 2011; originally announced October 2011.

  31. arXiv:0810.5655  [pdf, ps, other]

    stat.ME stat.ML

    Gibbs posterior for variable selection in high-dimensional classification and data mining

    Authors: Wenxin Jiang, Martin A. Tanner

    Abstract: In the popular approach of "Bayesian variable selection" (BVS), one uses prior and posterior distributions to select a subset of candidate variables to enter the model. A completely new direction will be considered here to study BVS with a Gibbs posterior originating in statistical mechanics. The Gibbs posterior is constructed from a risk function of practical interest (such as the classificatio…

    Submitted 31 October, 2008; originally announced October 2008.

    Comments: Published at http://dx.doi.org/10.1214/07-AOS547 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS547

    MSC Class: 62F99 (Primary); 82-08 (Secondary)

    Journal ref: Annals of Statistics 2008, Vol. 36, No. 5, 2207-2231

  32. arXiv:0709.3545  [pdf, other]

    stat.ME

    Locally Adaptive Nonparametric Binary Regression

    Authors: Sally Wood, Robert Kohn, Remy Cottet, Wenxin Jiang, Martin Tanner

    Abstract: A nonparametric and locally adaptive Bayesian estimator is proposed for estimating a binary regression. Flexibility is obtained by modeling the binary regression as a mixture of probit regressions with the argument of each probit regression having a thin plate spline prior with its own smoothing parameter and with the mixture weights depending on the covariates. The estimator is compared to a si…

    Submitted 21 September, 2007; originally announced September 2007.

    Comments: 31 pages, 10 figures