Search | arXiv e-print repository

Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method

Authors: Yuling Jiao, Yanming Lai, Yang Wang

Abstract: Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations(PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method(DRM) to solve second-order elliptic equations wi… ▽ More Machine learning is a rapidly advancing field with diverse applications across various domains. One prominent area of research is the utilization of deep learning techniques for solving partial differential equations(PDEs). In this work, we specifically focus on employing a three-layer tanh neural network within the framework of the deep Ritz method(DRM) to solve second-order elliptic equations with three different types of boundary conditions. We perform projected gradient descent(PDG) to train the three-layer network and we establish its global convergence. To the best of our knowledge, we are the first to provide a comprehensive error analysis of using overparameterized networks to solve PDE problems, as our analysis simultaneously includes estimates for approximation error, generalization error, and optimization error. We present error bound in terms of the sample size $n$ and our work provides guidance on how to set the network depth, width, step size, and number of iterations for the projected gradient descent algorithm. Importantly, our assumptions in this work are classical and we do not require any additional assumptions on the solution of the equation. This ensures the broad applicability and generality of our results. △ Less

Submitted 19 May, 2024; originally announced May 2024.

MSC Class: 65N12; 65N15; 68T07; 62G05; 35J25

arXiv:2404.05678 [pdf, other]

Flexible Fairness Learning via Inverse Conditional Permutation

Authors: Yuheng Lai, Leying Guan

Abstract: Equalized odds, as a popular notion of algorithmic fairness, aims to ensure that sensitive variables, such as race and gender, do not unfairly influence the algorithm prediction when conditioning on the true outcome. Despite rapid advancements, most of the current research focuses on the violation of equalized odds caused by one sensitive attribute, leaving the challenge of simultaneously accounti… ▽ More Equalized odds, as a popular notion of algorithmic fairness, aims to ensure that sensitive variables, such as race and gender, do not unfairly influence the algorithm prediction when conditioning on the true outcome. Despite rapid advancements, most of the current research focuses on the violation of equalized odds caused by one sensitive attribute, leaving the challenge of simultaneously accounting for multiple attributes under-addressed. We address this gap by introducing a fairness learning approach that integrates adversarial learning with a novel inverse conditional permutation. This approach effectively and flexibly handles multiple sensitive attributes, potentially of mixed data types. The efficacy and flexibility of our method are demonstrated through both simulation studies and empirical analysis of real-world datasets. △ Less

Submitted 9 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.02538 [pdf, other]

Convergence Analysis of Flow Matching in Latent Space with Transformers

Authors: Yuling Jiao, Yanming Lai, Yang Wang, Bokai Yan

Abstract: We present theoretical convergence guarantees for ODE-based generative models, specifically flow matching. We use a pre-trained autoencoder network to map high-dimensional original inputs to a low-dimensional latent space, where a transformer network is trained to predict the velocity field of the transformation from a standard normal distribution to the target latent distribution. Our error analy… ▽ More We present theoretical convergence guarantees for ODE-based generative models, specifically flow matching. We use a pre-trained autoencoder network to map high-dimensional original inputs to a low-dimensional latent space, where a transformer network is trained to predict the velocity field of the transformation from a standard normal distribution to the target latent distribution. Our error analysis demonstrates the effectiveness of this approach, showing that the distribution of samples generated via estimated ODE flow converges to the target distribution in the Wasserstein-2 distance under mild and practical assumptions. Furthermore, we show that arbitrary smooth functions can be effectively approximated by transformer networks with Lipschitz continuity, which may be of independent interest. △ Less

Submitted 28 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

arXiv:2104.12953 [pdf, other]

Exploring Uncertainty in Deep Learning for Construction of Prediction Intervals

Authors: Yuandu Lai, Yucheng Shi, Yahong Han, Yunfeng Shao, Meiyu Qi, Bingshuai Li

Abstract: Deep learning has achieved impressive performance on many tasks in recent years. However, it has been found that it is still not enough for deep neural networks to provide only point estimates. For high-risk tasks, we need to assess the reliability of the model predictions. This requires us to quantify the uncertainty of model prediction and construct prediction intervals. In this paper, We explor… ▽ More Deep learning has achieved impressive performance on many tasks in recent years. However, it has been found that it is still not enough for deep neural networks to provide only point estimates. For high-risk tasks, we need to assess the reliability of the model predictions. This requires us to quantify the uncertainty of model prediction and construct prediction intervals. In this paper, We explore the uncertainty in deep learning to construct the prediction intervals. In general, We comprehensively consider two categories of uncertainties: aleatory uncertainty and epistemic uncertainty. We design a special loss function, which enables us to learn uncertainty without uncertainty label. We only need to supervise the learning of regression task. We learn the aleatory uncertainty implicitly from the loss function. And that epistemic uncertainty is accounted for in ensembled form. Our method correlates the construction of prediction intervals with the uncertainty estimation. Impressive results on some publicly available datasets show that the performance of our method is competitive with other state-of-the-art methods. △ Less

Submitted 26 April, 2021; originally announced April 2021.

arXiv:2103.13358 [pdf, other]

doi 10.1103/PhysRevResearch.3.023237

Anticipating synchronization with machine learning

Authors: Huawei Fan, Ling-Wei Kong, Ying-Cheng Lai, Xingang Wang

Abstract: In applications of dynamical systems, situations can arise where it is desired to predict the onset of synchronization as it can lead to characteristic and significant changes in the system performance and behaviors, for better or worse. In experimental and real settings, the system equations are often unknown, raising the need to develop a prediction framework that is model free and fully data dr… ▽ More In applications of dynamical systems, situations can arise where it is desired to predict the onset of synchronization as it can lead to characteristic and significant changes in the system performance and behaviors, for better or worse. In experimental and real settings, the system equations are often unknown, raising the need to develop a prediction framework that is model free and fully data driven. We contemplate that this challenging problem can be addressed with machine learning. In particular, exploiting reservoir computing or echo state networks, we devise a "parameter-aware" scheme to train the neural machine using asynchronous time series, i.e., in the parameter regime prior to the onset of synchronization. A properly trained machine will possess the power to predict the synchronization transition in that, with a given amount of parameter drift, whether the system would remain asynchronous or exhibit synchronous dynamics can be accurately anticipated. We demonstrate the machine-learning based framework using representative chaotic models and small network systems that exhibit continuous (second-order) or abrupt (first-order) transitions. A remarkable feature is that, for a network system exhibiting an explosive (first-order) transition and a hysteresis loop in synchronization, the machine learning scheme is capable of accurately predicting these features, including the precise locations of the transition points associated with the forward and backward transition paths. △ Less

Submitted 12 March, 2021; originally announced March 2021.

Comments: 13 pages; 12 figures

Journal ref: Phys. Rev. Research 3, 023237 (2021)

arXiv:2006.08361 [pdf, other]

An Unsupervised Machine Learning Approach to Assess the ZIP Code Level Impact of COVID-19 in NYC

Authors: Fadoua Khmaissia, Pegah Sagheb Haghighi, Aarthe Jayaprakash, Zhenwei Wu, Sokratis Papadopoulos, Yuan Lai, Freddy T. Nguyen

Abstract: New York City has been recognized as the world's epicenter of the novel Coronavirus pandemic. To identify the key inherent factors that are highly correlated to the Increase Rate of COVID-19 new cases in NYC, we propose an unsupervised machine learning framework. Based on the assumption that ZIP code areas with similar demographic, socioeconomic, and mobility patterns are likely to experience simi… ▽ More New York City has been recognized as the world's epicenter of the novel Coronavirus pandemic. To identify the key inherent factors that are highly correlated to the Increase Rate of COVID-19 new cases in NYC, we propose an unsupervised machine learning framework. Based on the assumption that ZIP code areas with similar demographic, socioeconomic, and mobility patterns are likely to experience similar outbreaks, we select the most relevant features to perform a clustering that can best reflect the spread, and map them down to 9 interpretable categories. We believe that our findings can guide policy makers to promptly anticipate and prevent the spread of the virus by taking the right measures. △ Less

Submitted 18 September, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

Comments: Presented at ICML 2020 Workshop on the Healthcare Systems, Population Health, and the Role of Health-Tech

arXiv:2004.01258 [pdf, other]

Long-term prediction of chaotic systems with recurrent neural networks

Authors: Huawei Fan, Junjie Jiang, Chun Zhang, Xingang Wang, Ying-Cheng Lai

Abstract: Reservoir computing systems, a class of recurrent neural networks, have recently been exploited for model-free, data-based prediction of the state evolution of a variety of chaotic dynamical systems. The prediction horizon demonstrated has been about half dozen Lyapunov time. Is it possible to significantly extend the prediction time beyond what has been achieved so far? We articulate a scheme inc… ▽ More Reservoir computing systems, a class of recurrent neural networks, have recently been exploited for model-free, data-based prediction of the state evolution of a variety of chaotic dynamical systems. The prediction horizon demonstrated has been about half dozen Lyapunov time. Is it possible to significantly extend the prediction time beyond what has been achieved so far? We articulate a scheme incorporating time-dependent but sparse data inputs into reservoir computing and demonstrate that such rare "updates" of the actual state practically enable an arbitrarily long prediction horizon for a variety of chaotic systems. A physical understanding based on the theory of temporal synchronization is developed. △ Less

Submitted 6 March, 2020; originally announced April 2020.

Comments: 10 pages, 8 figures

arXiv:1912.05796 [pdf, other]

Automatic Layout Generation with Applications in Machine Learning Engine Evaluation

Authors: Haoyu Yang, Wen Chen, Piyush Pathak, Frank Gennari, Ya-Chieh Lai, Bei Yu

Abstract: Machine learning-based lithography hotspot detection has been deeply studied recently, from varies feature extraction techniques to efficient learning models. It has been observed that such machine learning-based frameworks are providing satisfactory metal layer hotspot prediction results on known public metal layer benchmarks. In this work, we seek to evaluate how these machine learning-based hot… ▽ More Machine learning-based lithography hotspot detection has been deeply studied recently, from varies feature extraction techniques to efficient learning models. It has been observed that such machine learning-based frameworks are providing satisfactory metal layer hotspot prediction results on known public metal layer benchmarks. In this work, we seek to evaluate how these machine learning-based hotspot detectors generalize to complicated patterns. We first introduce a automatic layout generation tool that can synthesize varies layout patterns given a set of design rules. The tool currently supports both metal layer and via layer generation. As a case study, we conduct hotspot detection on the generated via layer layouts with representative machine learning-based hotspot detectors, which shows that continuous study on model robustness and generality is necessary to prototype and integrate the learning engines in DFM flows. The source code of the layout generation tool will be available at https://github. com/phdyang007/layout-generation. △ Less

Submitted 12 December, 2019; originally announced December 2019.

Comments: 6 pages, submitted to 1st ACM/IEEE Workshop on Machine Learning for CAD (MLCAD) for review

arXiv:1910.12960 [pdf, other]

doi 10.1016/j.csda.2019.106849

Ensemble Quantile Classifier

Authors: Yuanhao Lai, Ian McLeod

Abstract: Both the median-based classifier and the quantile-based classifier are useful for discriminating high-dimensional data with heavy-tailed or skewed inputs. But these methods are restricted as they assign equal weight to each variable in an unregularized way. The ensemble quantile classifier is a more flexible regularized classifier that provides better performance with high-dimensional data, asymme… ▽ More Both the median-based classifier and the quantile-based classifier are useful for discriminating high-dimensional data with heavy-tailed or skewed inputs. But these methods are restricted as they assign equal weight to each variable in an unregularized way. The ensemble quantile classifier is a more flexible regularized classifier that provides better performance with high-dimensional data, asymmetric data or when there are many irrelevant extraneous inputs. The improved performance is demonstrated by a simulation study as well as an application to text categorization. It is proven that the estimated parameters of the ensemble quantile classifier consistently estimate the minimal population loss under suitable general model assumptions. It is also shown that the ensemble quantile classifier is Bayes optimal under suitable assumptions with asymmetric Laplace distribution inputs. △ Less

Submitted 28 October, 2019; originally announced October 2019.

Journal ref: Computational Statistics and Data Analysis (2019) 106849

arXiv:1910.04426 [pdf, other]

Model-free prediction of spatiotemporal dynamical systems with recurrent neural networks: Role of network spectral radius

Authors: Junjie Jiang, Ying-Cheng Lai

Abstract: A common difficulty in applications of machine learning is the lack of any general principle for guiding the choices of key parameters of the underlying neural network. Focusing on a class of recurrent neural networks - reservoir computing systems that have recently been exploited for model-free prediction of nonlinear dynamical systems, we uncover a surprising phenomenon: the emergence of an inte… ▽ More A common difficulty in applications of machine learning is the lack of any general principle for guiding the choices of key parameters of the underlying neural network. Focusing on a class of recurrent neural networks - reservoir computing systems that have recently been exploited for model-free prediction of nonlinear dynamical systems, we uncover a surprising phenomenon: the emergence of an interval in the spectral radius of the neural network in which the prediction error is minimized. In a three-dimensional representation of the error versus time and spectral radius, the interval corresponds to the bottom region of a "valley." Such a valley arises for a variety of spatiotemporal dynamical systems described by nonlinear partial differential equations, regardless of the structure and the edge-weight distribution of the underlying reservoir network. We also find that, while the particular location and size of the valley would depend on the details of the target system to be predicted, the interval tends to be larger for undirected than for directed networks. The valley phenomenon can be beneficial to the design of optimal reservoir computing, representing a small step forward in understanding these machine-learning systems. △ Less

Submitted 10 October, 2019; originally announced October 2019.

Comments: 15 pages, 13 figures

arXiv:1810.08765 [pdf, other]

Attribute-aware Collaborative Filtering: Survey and Classification

Authors: Wen-Hao Chen, Chin-Chi Hsu, Yi-An Lai, Vincent Liu, Mi-Yen Yeh, Shou-De Lin

Abstract: Attribute-aware CF models aims at rating prediction given not only the historical rating from users to items, but also the information associated with users (e.g. age), items (e.g. price), or even ratings (e.g. rating time). This paper surveys works in the past decade develo** attribute-aware CF systems, and discovered that mathematically they can be classified into four different categories. We… ▽ More Attribute-aware CF models aims at rating prediction given not only the historical rating from users to items, but also the information associated with users (e.g. age), items (e.g. price), or even ratings (e.g. rating time). This paper surveys works in the past decade develo** attribute-aware CF systems, and discovered that mathematically they can be classified into four different categories. We provide the readers not only the high level mathematical interpretation of the existing works in this area but also the mathematical insight for each category of models. Finally we provide in-depth experiment results comparing the effectiveness of the major works in each category. △ Less

Submitted 20 October, 2018; originally announced October 2018.

arXiv:1807.10693 [pdf, ps, other]

Infinite Mixture of Inverted Dirichlet Distributions

Authors: Zhanyu Ma, Yu** Lai

Abstract: In this work, we develop a novel Bayesian estimation method for the Dirichlet process (DP) mixture of the inverted Dirichlet distributions, which has been shown to be very flexible for modeling vectors with positive elements. The recently proposed extended variational inference (EVI) framework is adopted to derive an analytically tractable solution. The convergency of the proposed algorithm is the… ▽ More In this work, we develop a novel Bayesian estimation method for the Dirichlet process (DP) mixture of the inverted Dirichlet distributions, which has been shown to be very flexible for modeling vectors with positive elements. The recently proposed extended variational inference (EVI) framework is adopted to derive an analytically tractable solution. The convergency of the proposed algorithm is theoretically guaranteed by introducing single lower bound approximation to the original objective function in the VI framework. In principle, the proposed model can be viewed as an infinite inverted Dirichelt mixture model (InIDMM) that allows the automatic determination of the number of mixture components from data. Therefore, the problem of pre-determining the optimal number of mixing components has been overcome. Moreover, the problems of over-fitting and under-fitting are avoided by the Bayesian estimation approach. Comparing with several recently proposed DP-related methods, the good performance and effectiveness of the proposed method have been demonstrated with both synthesized data and real data evaluations. △ Less

Submitted 2 February, 2020; v1 submitted 27 July, 2018; originally announced July 2018.

Comments: Technical Report of ongoing work

arXiv:1806.05424 [pdf, other]

Sequential Bayesian inference for spatio-temporal models of temperature and humidity data

Authors: Yingying Lai, Andrew Golightly, Richard Boys

Abstract: We develop a spatio-temporal model to forecast sensor output at five locations in North East England. The signal is described using coupled dynamic linear models, with spatial effects specified by a Gaussian process. Data streams are analysed using a stochastic algorithm which sequentially approximates the parameter posterior through a series of reweighting and resampling steps. An iterated batch… ▽ More We develop a spatio-temporal model to forecast sensor output at five locations in North East England. The signal is described using coupled dynamic linear models, with spatial effects specified by a Gaussian process. Data streams are analysed using a stochastic algorithm which sequentially approximates the parameter posterior through a series of reweighting and resampling steps. An iterated batch importance sampling scheme is used to circumvent particle degeneracy through a resample-move step. The algorithm is modified to make it more efficient and parallisable. The model is shown to give a good description of the underlying process and provide reasonable forecast accuracy. △ Less

Submitted 14 June, 2018; originally announced June 2018.

Comments: 25 pages

arXiv:1709.00944 [pdf]

Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks

Authors: Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang

Abstract: Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent success of convolutional neural networks (CNNs) in SE, we propose an audio-visual deep CNNs (AVDCNN) SE model, which incorporates audio and visual streams into a un… ▽ More Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent success of convolutional neural networks (CNNs) in SE, we propose an audio-visual deep CNNs (AVDCNN) SE model, which incorporates audio and visual streams into a unified network model. We also propose a multi-task learning framework for reconstructing audio and visual signals at the output layer. Precisely speaking, the proposed AVDCNN model is structured as an audio-visual encoder-decoder network, in which audio and visual data are first processed using individual CNNs, and then fused into a joint network to generate enhanced speech (the primary task) and reconstructed images (the secondary task) at the output layer. The model is trained in an end-to-end manner, and parameters are jointly learned through back-propagation. We evaluate enhanced speech using five instrumental criteria. Results show that the AVDCNN model yields a notably superior performance compared with an audio-only CNN-based SE model and two conventional SE approaches, confirming the effectiveness of integrating visual information into the SE process. In addition, the AVDCNN model also outperforms an existing audio-visual SE model, confirming its capability of effectively combining audio and visual information in SE. △ Less

Submitted 18 April, 2022; v1 submitted 1 September, 2017; originally announced September 2017.

Comments: This paper is the same as arXiv:1703.10893v6. Apologies for the inconvenience. arXiv admin note: text overlap with arXiv:1703.10893

arXiv:1703.10893 [pdf]

Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks

Authors: Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang

Abstract: Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent success of convolutional neural networks (CNNs) in SE, we propose an audio-visual deep CNNs (AVDCNN) SE model, which incorporates audio and visual streams into a un… ▽ More Speech enhancement (SE) aims to reduce noise in speech signals. Most SE techniques focus only on addressing audio information. In this work, inspired by multimodal learning, which utilizes data from different modalities, and the recent success of convolutional neural networks (CNNs) in SE, we propose an audio-visual deep CNNs (AVDCNN) SE model, which incorporates audio and visual streams into a unified network model. We also propose a multi-task learning framework for reconstructing audio and visual signals at the output layer. Precisely speaking, the proposed AVDCNN model is structured as an audio-visual encoder-decoder network, in which audio and visual data are first processed using individual CNNs, and then fused into a joint network to generate enhanced speech (the primary task) and reconstructed images (the secondary task) at the output layer. The model is trained in an end-to-end manner, and parameters are jointly learned through back-propagation. We evaluate enhanced speech using five instrumental criteria. Results show that the AVDCNN model yields a notably superior performance compared with an audio-only CNN-based SE model and two conventional SE approaches, confirming the effectiveness of integrating visual information into the SE process. In addition, the AVDCNN model also outperforms an existing audio-visual SE model, confirming its capability of effectively combining audio and visual information in SE. △ Less

Submitted 24 January, 2018; v1 submitted 30 March, 2017; originally announced March 2017.

Comments: To appear in IEEE Transactions on Emerging Topics in Computational Intelligence. Some audio samples can be reached in this link: https://sites.google.com/view/avse2017

arXiv:1603.07439 [pdf, ps, other]

doi 10.1371/journal.pone.0124472

Spatiotemporal patterns and predictability of cyberattacks

Authors: Yu-Zhong Chen, Zi-Gang Huang, Shouhuai Xu, Ying-Cheng Lai

Abstract: A relatively unexplored issue in cybersecurity science and engineering is whether there exist intrinsic patterns of cyberattacks. Conventional wisdom favors absence of such patterns due to the overwhelming complexity of the modern cyberspace. Surprisingly, through a detailed analysis of an extensive data set that records the time-dependent frequencies of attacks over a relatively wide range of con… ▽ More A relatively unexplored issue in cybersecurity science and engineering is whether there exist intrinsic patterns of cyberattacks. Conventional wisdom favors absence of such patterns due to the overwhelming complexity of the modern cyberspace. Surprisingly, through a detailed analysis of an extensive data set that records the time-dependent frequencies of attacks over a relatively wide range of consecutive IP addresses, we successfully uncover intrinsic spatiotemporal patterns underlying cyberattacks, where the term "spatio" refers to the IP address space. In particular, we focus on analyzing {\em macroscopic} properties of the attack traffic flows and identify two main patterns with distinct spatiotemporal characteristics: deterministic and stochastic. Strikingly, there are very few sets of major attackers committing almost all the attacks, since their attack "fingerprints" and target selection scheme can be unequivocally identified according to the very limited number of unique spatiotemporal characteristics, each of which only exists on a consecutive IP region and differs significantly from the others. We utilize a number of quantitative measures, including the flux-fluctuation law, the Markov state transition probability matrix, and predictability measures, to characterize the attack patterns in a comprehensive manner. A general finding is that the attack patterns possess high degrees of predictability, potentially paving the way to anticipating and, consequently, mitigating or even preventing large-scale cyberattacks using macroscopic approaches. △ Less

Submitted 24 March, 2016; originally announced March 2016.

Journal ref: PLoS One 10(5): e0124472 (2015)

arXiv:1008.0901 [pdf, ps, other]

Convergence to global consensus in opinion dynamics under a nonlinear voter model

Authors: Han-Xin Yang, Wen-Xu Wang, Ying-Cheng Lai, Bing-Hong Wang

Abstract: We propose a nonlinear voter model to study the emergence of global consensus in opinion dynamics. In our model, agent $i$ agrees with one of binary opinions with the probability that is a power function of the number of agents holding this opinion among agent $i$ and its nearest neighbors, where an adjustable parameter $α$ controls the effect of herd behavior on consensus. We find that there exis… ▽ More We propose a nonlinear voter model to study the emergence of global consensus in opinion dynamics. In our model, agent $i$ agrees with one of binary opinions with the probability that is a power function of the number of agents holding this opinion among agent $i$ and its nearest neighbors, where an adjustable parameter $α$ controls the effect of herd behavior on consensus. We find that there exists an optimal value of $α$ leading to the fastest consensus for lattices, random graphs, small-world networks and scale-free networks. Qualitative insights are obtained by examining the spatiotemporal evolution of the opinion clusters. △ Less

Submitted 28 December, 2011; v1 submitted 4 August, 2010; originally announced August 2010.

Journal ref: Physics Letters A 376 (2012) 282-285

arXiv:0804.4361 [pdf, ps, other]

Improving Coverage Accuracy of Block Bootstrap Confidence Intervals

Authors: Stephen M. S. Lee, P. Y. Lai

Abstract: The block bootstrap confidence interval based on dependent data can outperform the computationally more convenient normal approximation only with non-trivial Studentization which, in the case of complicated statistics, calls for highly specialist treatment. We propose two different approaches to improving the accuracy of the block bootstrap confidence interval under very general conditions. The… ▽ More The block bootstrap confidence interval based on dependent data can outperform the computationally more convenient normal approximation only with non-trivial Studentization which, in the case of complicated statistics, calls for highly specialist treatment. We propose two different approaches to improving the accuracy of the block bootstrap confidence interval under very general conditions. The first calibrates the coverage level by iterating the block bootstrap. The second calculates Studentizing factors directly from block bootstrap series and requires no non-trivial analytic treatment. Both approaches involve two nested levels of block bootstrap resampling and yield high-order accuracy with simple tuning of block lengths at the two resampling levels. A simulation study is reported to provide empirical support for our theory. △ Less

Submitted 28 April, 2008; originally announced April 2008.

Report number: Research Report No. 435. Department of Statistics and Actuarial Science, The University of Hong Kong

Showing 1–18 of 18 results for author: Lai, Y