Directional error in testing and confidence intervals

Confidence intervals (CIs) represent a range estimate for a statistical parameter, conveying the uncertainty in its estimation at a specified confidence level $1-\alpha$. CIs often accompany hypothesis testing, providing additional information about the parameter estimate, but in view of the (unjustified) attacks on hypothesis testing they increasingly appear as the only mode of inference. For CIs, sign exclusion, also known as Weak Sign Determination (WSD), is the property that the interval contains values of only one sign (and possibly 0). This differs from Strong Sign Determination (SSD), which holds when the interval contains values of a single sign and does not contain 0. If one adopts the attitude that the parameter can never be 0, an attitude shared by many (), there is no difference between the two. Yet others, especially in regulatory statistics, see a difference, and we treat the two separately.
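As a minimal sketch of the two properties just defined, the following classifies an interval by the sign determination it offers. The function name and the convention that the interval is closed, $[\text{lower}, \text{upper}]$, are illustrative assumptions and not part of the definitions above.

```python
def sign_determination(lower: float, upper: float) -> str:
    """Classify a closed interval [lower, upper] by how it determines the sign.

    Hypothetical helper for illustration only. Returns:
      "SSD"  if the interval excludes 0 (a single strict sign),
      "WSD"  if it contains values of one sign only, but includes 0,
      "none" if it contains both positive and negative values.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if lower > 0 or upper < 0:
        return "SSD"   # 0 is excluded; this also satisfies the weaker WSD property
    if lower >= 0 or upper <= 0:
        return "WSD"   # one sign only, with 0 as an endpoint
    return "none"      # both signs are present, so the sign is undetermined
```

For example, `sign_determination(0.0, 2.0)` falls under WSD but not SSD, which is exactly the case in which the two properties disagree.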

Tukey’s goals for the confidence interval as the tool of inference

tukey1991philosophy lists three questions, in order of importance, to be answered sequentially about a parameter within the limit of acceptable uncertainty: i) what is the direction of the effect? ii) what is the minimal effect size? iii) what is the maximal effect size? Turning to the use of CIs to answer these questions, if the sign cannot be determined, a short two-sided interval is desirable. If the sign can be identified, say as positive, the second question is answered by the lower end of the CI, which should be separated from 0 as far as possible to better assess the minimal effect. The upper end of the interval answers the third, least important, question. It should preferably be finite, as otherwise it offers no answer regarding the maximal effect.

Tukey himself, Hayter (), Hsu (), Pratt (), and others tried to construct procedures that fulfill these objectives according to their relative importance. Pratt’s suggestion yields the shortest CI when the estimate is 0 or close to it. However, its length can increase without bound when the estimate is away from 0, and it offers no separation from 0. benjamini1998confidence suggested a modification of Pratt’s CI, the Modified Pratt (MP) CI, which has bounded length. They also suggested the Quasi-Conventional (QC) CIs, which provide both bounded length and early sign exclusion, satisfying all three goals at the expense of increased length relative to the standard CI. The lower end of this two-sided CI is close to that of the one-sided, infinite-length CI.

Confidence Interval for a large enough estimator

Asymmetric preference for the two directions

benjamini1998confidence and weinstein2013selection make use of non-equivariant acceptance regions, changing the shape of the CI for different values of the estimator. However, all previously suggested CIs, including these two, share one property: the two directions of error are treated in the same way, so a CI constructed for an estimate $-x$ is the mirror image of the CI for $x$.

In this work we suggest a new direction-preferring confidence interval that does not address Tukey’s three goals symmetrically in both directions, but allows the analyst to emphasize one direction over the other. We first construct such a direction-preferring CI when no selection occurs, by generalizing the MP CI. The new CI also improves on the MP CI by reducing its length for sufficiently extreme values of the estimator (Section 2).

We then address the problem of constructing a CI only if the absolute value of the estimator is large enough, by means of conditional CIs. We first offer an improvement of weinstein2013selection. We then construct a new conditional CI that allows emphasizing a discovery in one direction over the other (Section 3).

In its extreme form, the emphasis on one direction can be expressed through the selection rule: rather than selecting when the estimator is large in either direction, namely $|Y|>c$, we select to construct a CI only if the estimator is larger than some constant, namely $Y>c$. We build such a CI and demonstrate the issues of using one-direction conditional CIs (Section 4).
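To make the two selection rules concrete, consider the illustrative case $Y \sim N(\theta, 1)$; this model is an assumption made for the sketch, as the text above does not fix one. Conditioning on selection truncates the density of $Y$, and the normalizing constant depends on the rule. Here $\phi$ and $\Phi$ denote the standard normal density and distribution function.

```latex
\[
f_\theta\bigl(y \mid |Y|>c\bigr)
  = \frac{\phi(y-\theta)}{1-\Phi(c-\theta)+\Phi(-c-\theta)}\,\mathbf{1}\{|y|>c\},
\qquad
f_\theta\bigl(y \mid Y>c\bigr)
  = \frac{\phi(y-\theta)}{1-\Phi(c-\theta)}\,\mathbf{1}\{y>c\}.
\]
```

Under this assumption, the one-sided rule drops the term $\Phi(-c-\theta)$ from the denominator and removes the negative tail from the support, so the two conditional distributions differ most when $\theta$ is negative or near 0.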

CIs for parameters selected from many

Selecting from the same data set used for inference alters the distribution of the test statistics. Overlooking this selection can inflate Type I errors, leading to results that are less replicable. The American Statistical Association board emphasized the limitations of p-values and of the practice of ‘statistical significance’ thresholding (wasserstein2016asa), and suggested alternatives such as CIs. Yet, as selection affects the distributions of test statistics and estimators, it also affects CIs, as highlighted by the American Statistical Association President’s Task Force (benjamini2021asa). Notably, marginal CI coverage probabilities can be significantly lowered (benjamini2005false).
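The loss of coverage after selection can be illustrated by a small simulation. The sketch below is not taken from the cited works: it assumes a single estimator $Y \sim N(\theta, 1)$, the selection rule $|Y| > c$, and the conventional interval $Y \pm z_{1-\alpha/2}$, and it estimates the coverage among selected draws only. The function name and default values are hypothetical.

```python
import random
from statistics import NormalDist


def conditional_coverage(theta, c, alpha=0.05, n_draws=100_000, seed=0):
    """Monte Carlo estimate of P(theta in Y +/- z | |Y| > c) for Y ~ N(theta, 1).

    Illustrative assumption: unit variance and the unadjusted two-sided CI.
    Returns NaN if no draw passes the selection rule.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # conventional two-sided quantile
    selected = covered = 0
    for _ in range(n_draws):
        y = rng.gauss(theta, 1.0)
        if abs(y) > c:                        # selection step
            selected += 1
            if y - z <= theta <= y + z:       # does the unadjusted CI cover theta?
                covered += 1
    return covered / selected if selected else float("nan")


if __name__ == "__main__":
    for theta in (0.0, 1.0, 2.0, 5.0):
        print(theta, conditional_coverage(theta, c=1.96))
```

With $c = 1.96$ and $\theta = 0$, every selected interval excludes 0 by construction, so the conditional coverage is 0; for $\theta$ far from 0 nearly every draw is selected and the coverage returns to about $1-\alpha$.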

Several types of CIs have been suggested to deal with the issue of selection. Among the most common methods are the simultaneous CIs, which aim to control the probability that at least one CI will fail to cover its parameter. Indeed tukey1991philosophy paper, and so on…

The other common methods are the False Coverage Rate (FCR) CIs (benjamini2005false), which control the expected proportion of non-covering CIs among those constructed, and, finally, conditional CIs. Conditional CIs control the probability of coverage given the selection criterion (weinstein2013selection).
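A minimal sketch of the level adjustment behind FCR CIs, as commonly described for simple selection rules: if $R$ of $m$ parameters are selected, each selected parameter receives a marginal CI at level $1 - Rq/m$. The sketch assumes independent normal estimators with known standard deviation `sigma`; the function name, its arguments, and the normality assumption are illustrative and not taken from the text.

```python
from statistics import NormalDist


def fcr_adjusted_cis(estimates, selected, q=0.05, sigma=1.0):
    """Return {index: (lower, upper)} for the selected parameters.

    Hypothetical helper: each selected estimate gets a two-sided normal CI at
    level 1 - R*q/m, where m = len(estimates) and R = len(selected).
    """
    m = len(estimates)
    r = len(selected)
    if r == 0:
        return {}                              # nothing selected, nothing to report
    adjusted_alpha = r * q / m                 # stricter than q whenever R < m
    z = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    return {
        i: (estimates[i] - z * sigma, estimates[i] + z * sigma)
        for i in selected
    }


if __name__ == "__main__":
    y = [0.1, 3.2, -0.4, 2.9, 0.0, -0.2, 0.5, 1.0, -1.1, 0.3]
    chosen = [i for i, v in enumerate(y) if abs(v) > 2.5]  # illustrative rule
    print(fcr_adjusted_cis(y, chosen))
```

The fewer parameters are selected out of the $m$ considered, the wider each reported interval becomes; when all $m$ are selected, the intervals reduce to the unadjusted level $1-q$.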

The two goals, addressing multiplicity and better sign determination (!that’s the problem with using sign-preferring terminology!), are not conflicting. Most CIs (e.g., MP) can be adjusted to control the FCR, or one can design specific CIs with more power to determine the sign that also control the FCR (weinstein2020selective). weinstein2013selection suggested conditional CIs with better power to determine the sign, which also control the FCR. The selection criterion was specified to be … Furthermore, weinstein2020selective designed specific non-equivariant CIs to minimize the length and maximize the power of sign determination. They restrict the selection rule to parameters whose CI does not cover 0.

However, when correcting for multiplicity using FCR CIs or conditional CIs, such an adjustment requires a stronger correction, as now the CIs are further conditioned on the estimator sign … and prove they control the marginal FCR under any dependence. Finally, we demonstrate the utility of our approach on a real data example.

