-
An Empirical Bayes Method for Chi-Squared Data
Authors:
Lilun Du,
Inchi Hu
Abstract:
In a thought-provoking paper, Efron (2011) investigated the merits and limitations of an empirical Bayes method for correcting selection bias based on Tweedie's formula, first reported by Robbins (1956). The exceptional virtue of Tweedie's formula for the normal distribution lies in its representation of selection bias as a simple function of the derivative of the log marginal likelihood. Since the marginal likelihood and its derivative can be estimated directly from the data without invoking prior information, bias correction can be carried out conveniently. We propose a Bayesian hierarchical model for chi-squared data such that the resulting Tweedie's formula has the same virtue as that of the normal distribution. Because the family of noncentral chi-squared distributions, the common alternative distributions for chi-squared tests, does not constitute an exponential family, our results cannot be obtained by extending existing results. Furthermore, the corresponding Tweedie's formula manifests new phenomena quite different from those of the normal distribution and suggests new ways of analyzing chi-squared data.
Submitted 26 May, 2021; v1 submitted 2 March, 2019;
originally announced March 2019.
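For readers unfamiliar with the normal-case result that the abstract above builds on, the following is a minimal sketch of Tweedie's formula, E[mu | z] = z + sigma^2 * d/dz log m(z), where m is the marginal density of z. It covers only the normal case, not the chi-squared model proposed in the paper. The function names, the finite-difference step, and the N(0, 2) prior are illustrative assumptions; with that prior the marginal is known in closed form, so the output can be checked against the exact posterior mean.

```python
import numpy as np

def tweedie_posterior_mean(z, log_marginal, sigma2=1.0, h=1e-5):
    """Normal-case Tweedie correction: z + sigma2 * d/dz log m(z).

    The derivative of the log marginal is approximated by a central
    finite difference (an illustrative choice, not the paper's method).
    """
    z = np.asarray(z, dtype=float)
    score = (log_marginal(z + h) - log_marginal(z - h)) / (2.0 * h)
    return z + sigma2 * score

def normal_log_marginal(z, prior_var=2.0, sigma2=1.0):
    """Log marginal density when mu ~ N(0, prior_var) and z | mu ~ N(mu, sigma2)."""
    total_var = prior_var + sigma2
    return -0.5 * np.log(2.0 * np.pi * total_var) - 0.5 * z**2 / total_var

# Hypothetical observed z-values.
z_obs = np.array([-3.0, 0.0, 1.5, 4.0])
corrected = tweedie_posterior_mean(z_obs, normal_log_marginal)

# Exact posterior mean under the assumed prior: prior_var / (prior_var + sigma2) * z.
exact = z_obs * 2.0 / 3.0
```

In practice the marginal is not known and would be estimated from the observed z-values; the closed-form marginal is used here only so that the correction can be verified.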
-
A W-test collapsing method for rare variant testing with applications to exome sequencing data of hypertensive disorder
Authors:
Rui Sun,
Haoyi Weng,
Inchi Hu,
Junfeng Guo,
William K. K. Wu,
Benny Chung-Ying Zee,
Maggie Haitian Wang
Abstract:
Advances in sequencing technology enable the study of associations between complex disorders and rare variants with low minor allele frequencies. One of the major challenges in rare variant testing is the lack of statistical power of traditional testing methods, due to the extremely low variances of single nucleotide polymorphisms. In this paper, we introduce a W-test collapsing method that evaluates the distributional differences between cases and controls using a combined log odds ratio. The proposed method is compared with the Weighted-Sum Statistic and the Sequence Kernel Association Test on simulated data sets and shows better performance and faster computation. In a study of a real next-generation sequencing data set for hypertensive disorder, we identified genes with interesting biological functions that are associated with metabolic disorder and inflammation, including MACROD1, NLRP7, AGK, PAK6 and APBB1. The W-test collapsing method offers a fast and effective alternative for rare variant association analysis.
Submitted 26 July, 2016;
originally announced July 2016.
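The general idea of collapsing rare variants before comparing cases and controls can be sketched as follows. This is not the authors' W statistic, whose exact form is not given in the abstract; it is a generic carrier-indicator collapse followed by a log odds ratio from the resulting 2x2 table. The function name, the 0.5 continuity correction, and the toy genotype matrix are all assumptions made for illustration.

```python
import numpy as np

def collapsed_log_odds_ratio(genotypes, is_case):
    """Collapse the variants in a region into a carrier indicator and
    return the log odds ratio (cases vs. controls) with its standard error.

    genotypes: (n_subjects, n_variants) array of minor-allele counts.
    is_case:   length-n_subjects boolean array.
    A 0.5 continuity correction is added to every cell (an assumption).
    """
    genotypes = np.asarray(genotypes)
    is_case = np.asarray(is_case, dtype=bool)
    carrier = (genotypes > 0).any(axis=1)  # carries at least one rare allele

    a = np.sum(carrier & is_case) + 0.5     # carrier cases
    b = np.sum(~carrier & is_case) + 0.5    # non-carrier cases
    c = np.sum(carrier & ~is_case) + 0.5    # carrier controls
    d = np.sum(~carrier & ~is_case) + 0.5   # non-carrier controls

    log_or = np.log(a * d / (b * c))
    se = np.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return log_or, se

# Hypothetical toy data: 4 cases (3 carriers) and 4 controls (1 carrier).
toy_genotypes = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0],   # cases
    [1, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0],   # controls
])
toy_is_case = np.array([True] * 4 + [False] * 4)
log_or, se = collapsed_log_odds_ratio(toy_genotypes, toy_is_case)
```

A positive log odds ratio here indicates that carriers are more common among cases; a formal test would compare the ratio of `log_or` to `se` against a reference distribution.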
-
Estimation in hidden Markov models via efficient importance sampling
Authors:
Cheng-Der Fuh,
Inchi Hu
Abstract:
Given a sequence of observations from a discrete-time, finite-state hidden Markov model, we would like to estimate the sampling distribution of a statistic. The bootstrap method is employed to approximate the confidence regions of a multi-dimensional parameter. We propose an importance sampling formula for efficient simulation in this context. Our approach consists of constructing a locally asymptotically normal (LAN) family of probability distributions around the default resampling rule and then minimizing the asymptotic variance within the LAN family. The solution of this minimization problem characterizes the asymptotically optimal resampling scheme, which is given by a tilting formula. The implementation of the tilting formula is facilitated by solving a Poisson equation. A few numerical examples are given to demonstrate the efficiency of the proposed importance sampling scheme.
Submitted 30 August, 2007;
originally announced August 2007.
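The tilting idea mentioned in the abstract above can be illustrated in a much simpler setting than hidden Markov models. The sketch below estimates a standard normal tail probability by sampling from an exponentially tilted distribution and reweighting by the likelihood ratio. It does not implement the paper's LAN-based optimal resampling scheme or its Poisson-equation step; the function name, the choice of tilt equal to the threshold, and the sample size are assumptions.

```python
import math
import numpy as np

def tilted_tail_probability(threshold, n_draws=100_000, seed=0):
    """Estimate P(Z > threshold) for Z ~ N(0, 1) by importance sampling.

    Draws come from the exponentially tilted density N(theta, 1) with
    theta = threshold, and each draw is weighted by the likelihood ratio
    phi(x) / phi(x - theta) = exp(-theta * x + theta**2 / 2).
    """
    rng = np.random.default_rng(seed)
    theta = threshold
    x = rng.normal(loc=theta, scale=1.0, size=n_draws)
    weights = np.exp(-theta * x + 0.5 * theta**2)
    return float(np.mean(weights * (x > threshold)))

estimate = tilted_tail_probability(4.0)
# Exact tail probability for comparison, via the complementary error function.
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
```

Plain Monte Carlo with the same number of draws would see only a handful of exceedances at this threshold, which is why shifting the sampling distribution toward the rare region reduces the variance so sharply.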