-
Population-average mediation analysis for zero-inflated count outcomes
Authors:
Andrew Sims,
D. Leann Long,
Hemant K. Tiwari,
**hong Cui,
Dustin M. Long,
Todd M. Brown,
Melissa J. Smith,
Emily B. Levitan
Abstract:
Mediation analysis is an increasingly popular statistical method for explaining causal pathways to inform intervention. While methods have increased, there is still a dearth of robust mediation methods for count outcomes with excess zeroes. Current mediation methods addressing this issue are computationally intensive, biased, or challenging to interpret. To overcome these limitations, we propose a new mediation methodology for zero-inflated count outcomes using the marginalized zero-inflated Poisson (MZIP) model and the counterfactual approach to mediation. This novel work gives population-average mediation effects whose variance can be estimated rapidly via the delta method. This methodology is extended to cases with exposure-mediator interactions. We apply this novel methodology to explore whether diabetes diagnosis can explain BMI differences in healthcare utilization and test model performance via simulations comparing the proposed MZIP method to existing zero-inflated and Poisson methods. We find that our proposed method minimizes bias and computation time compared to alternative approaches while allowing for straightforward interpretations.
Submitted 13 August, 2023;
originally announced August 2023.
-
Estimating the correlation between operational risk loss categories over different time horizons
Authors:
Maurice L. Brown,
Cheng Ly
Abstract:
Operational risk is challenging to quantify because of the broad range of categories (fraud, technological issues, natural disasters) and the heavy-tailed nature of realized losses. Operational risk modeling requires quantifying how these broad loss categories are related. We focus on the issue of loss frequencies having different time scales (e.g., daily, yearly, monthly basis), specifically on estimating the statistics of losses on arbitrary time horizons. We present a frequency model where mathematical techniques can be feasibly applied to analytically calculate the mean, variance, and covariances that are accurate compared to more time-consuming Monte Carlo simulations. We show that the analytic calculations of cumulative loss statistics in an arbitrary time window are feasible here and would otherwise be intractable due to temporal correlations. Our work has potential value because these statistics are crucial for approximating correlations of losses via copulas. We systematically vary all model parameters to demonstrate the accuracy of our methods for calculating all first and second order statistics of aggregate loss distributions. Finally, using combined data from a consortium of institutions, we show that different time horizons can lead to a large range of loss statistics that can significantly affect calculations of capital requirements.
Submitted 28 June, 2023;
originally announced June 2023.
-
Euclid: Covariance of weak lensing pseudo-$C_\ell$ estimates. Calculation, comparison to simulations, and dependence on survey geometry
Authors:
R. E. Upham,
M. L. Brown,
L. Whittaker,
A. Amara,
N. Auricchio,
D. Bonino,
E. Branchini,
M. Brescia,
J. Brinchmann,
V. Capobianco,
C. Carbone,
J. Carretero,
M. Castellano,
S. Cavuoti,
A. Cimatti,
R. Cledassou,
G. Congedo,
L. Conversi,
Y. Copin,
L. Corcione,
M. Cropper,
A. Da Silva,
H. Degaudenzi,
M. Douspis,
F. Dubath,
et al. (80 additional authors not shown)
Abstract:
An accurate covariance matrix is essential for obtaining reliable cosmological results when using a Gaussian likelihood. In this paper we study the covariance of pseudo-$C_\ell$ estimates of tomographic cosmic shear power spectra. Using two existing publicly available codes in combination, we calculate the full covariance matrix, including mode-coupling contributions arising from both partial sky coverage and non-linear structure growth. For three different sky masks, we compare the theoretical covariance matrix to that estimated from publicly available N-body weak lensing simulations, finding good agreement. We find that as a more extreme sky cut is applied, a corresponding increase in both Gaussian off-diagonal covariance and non-Gaussian super-sample covariance is observed in both theory and simulations, in accordance with expectations. Studying the different contributions to the covariance in detail, we find that the Gaussian covariance dominates along the main diagonal and the closest off-diagonals, but further away from the main diagonal the super-sample covariance is dominant. Forming mock constraints in parameters describing matter clustering and dark energy, we find that neglecting non-Gaussian contributions to the covariance can lead to underestimating the true size of confidence regions by up to 70 per cent. The dominant non-Gaussian covariance component is the super-sample covariance, but neglecting the smaller connected non-Gaussian covariance can still lead to the underestimation of uncertainties by 10--20 per cent. A real cosmological analysis will require marginalisation over many nuisance parameters, which will decrease the relative importance of all cosmological contributions to the covariance, so these values should be taken as upper limits on the importance of each component.
Submitted 17 February, 2022; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Sufficiency of a Gaussian power spectrum likelihood for accurate cosmology from upcoming weak lensing surveys
Authors:
Robin E. Upham,
Michael L. Brown,
Lee Whittaker
Abstract:
We investigate whether a Gaussian likelihood is sufficient to obtain accurate parameter constraints from a Euclid-like combined tomographic power spectrum analysis of weak lensing, galaxy clustering and their cross-correlation. Testing its performance on the full sky against the Wishart distribution, which is the exact likelihood under the assumption of Gaussian fields, we find that the Gaussian likelihood returns accurate parameter constraints. This accuracy is robust to the choices made in the likelihood analysis, including the choice of fiducial cosmology, the range of scales included, and the random noise level. We extend our results to the cut sky by evaluating the additional non-Gaussianity of the joint cut-sky likelihood in both its marginal distributions and dependence structure. We find that the cut-sky likelihood is more non-Gaussian than the full-sky likelihood, but at a level insufficient to introduce significant inaccuracy into parameter constraints obtained using the Gaussian likelihood. Our results should not be affected by the assumption of Gaussian fields, as this approximation only becomes inaccurate on small scales, which in turn corresponds to the limit in which any non-Gaussianity of the likelihood becomes negligible. We nevertheless compare against N-body weak lensing simulations and find no evidence of significant additional non-Gaussianity in the likelihood. Our results indicate that a Gaussian likelihood will be sufficient for robust parameter constraints with power spectra from Stage IV weak lensing surveys.
Submitted 19 February, 2021; v1 submitted 11 December, 2020;
originally announced December 2020.
-
When Ensembling Smaller Models is More Efficient than Single Large Models
Authors:
Dan Kondratyuk,
Mingxing Tan,
Matthew Brown,
Boqing Gong
Abstract:
Ensembling is a simple and popular technique for boosting evaluation performance by training multiple models (e.g., with different initializations) and aggregating their predictions. This approach is commonly reserved for the largest models, as it is commonly held that increasing the model size provides a more substantial reduction in error than ensembling smaller models. However, experiments on CIFAR-10 and ImageNet show that ensembles can outperform single models, achieving higher accuracy while requiring fewer total FLOPs to compute, even when those individual models' weights and hyperparameters are highly optimized. Furthermore, this gap in improvement widens as models become large. This presents an interesting observation that output diversity in ensembling can often be more efficient than training larger models, especially when the models approach the size of what their dataset can foster. Instead of using the common practice of tuning a single large model, one can use ensembles as a more flexible trade-off between a model's inference speed and accuracy. This also potentially eases hardware design, e.g., an easier way to parallelize the model across multiple workers for real-time or distributed inference.
Submitted 1 May, 2020;
originally announced May 2020.
-
Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition from a Domain Adaptation Perspective
Authors:
Muhammad Abdullah Jamal,
Matthew Brown,
Ming-Hsuan Yang,
Liqiang Wang,
Boqing Gong
Abstract:
Object frequency in the real world often follows a power law, leading to a mismatch between datasets with long-tailed class distributions seen by a machine learning model and our expectation of the model to perform well on all classes. We analyze this mismatch from a domain adaptation point of view. First of all, we connect existing class-balanced methods for long-tailed classification to target shift, a well-studied scenario in domain adaptation. The connection reveals that these methods implicitly assume that the training data and test data share the same class-conditioned distribution, which does not hold in general and especially for the tail classes. While a head class could contain abundant and diverse training examples that well represent the expected data at inference time, the tail classes are often short of representative training data. To this end, we propose to augment the classic class-balanced learning by explicitly estimating the differences between the class-conditioned distributions with a meta-learning approach. We validate our approach with six benchmark datasets and three loss functions.
Submitted 24 March, 2020;
originally announced March 2020.
-
Federated Visual Classification with Real-World Data Distribution
Authors:
Tzu-Ming Harry Hsu,
Hang Qi,
Matthew Brown
Abstract:
Federated Learning enables visual models to be trained on-device, bringing advantages for user privacy (data need never leave the device), but challenges in terms of data diversity and quality. Whilst typical models in the datacenter are trained using data that are independent and identically distributed (IID), data at source are typically far from IID. Furthermore, differing quantities of data are typically available at each device (imbalance). In this work, we characterize the effect these real-world data distributions have on distributed learning, using as a benchmark the standard Federated Averaging (FedAvg) algorithm. To do so, we introduce two new large-scale datasets for species and landmark classification, with realistic per-user data splits that simulate real-world edge learning scenarios. We also develop two new algorithms (FedVC, FedIR) that intelligently resample and reweight over the client pool, bringing large improvements in accuracy and stability in training. The datasets are made available online.
Submitted 17 July, 2020; v1 submitted 18 March, 2020;
originally announced March 2020.
-
Measuring the Effects of Non-Identical Data Distribution for Federated Visual Classification
Authors:
Tzu-Ming Harry Hsu,
Hang Qi,
Matthew Brown
Abstract:
Federated Learning enables visual models to be trained in a privacy-preserving way using real-world data from mobile devices. Given their distributed nature, the statistics of the data across these devices are likely to differ significantly. In this work, we look at the effect such non-identical data distributions have on visual classification via Federated Learning. We propose a way to synthesize datasets with a continuous range of identicalness and provide performance measures for the Federated Averaging algorithm. We show that performance degrades as distributions differ more, and propose a mitigation strategy via server momentum. Experiments on CIFAR-10 demonstrate improved classification performance over a range of non-identicalness, with classification accuracy improved from 30.1% to 76.9% in the most skewed settings.
Submitted 13 September, 2019;
originally announced September 2019.
-
Computer Assisted Composition in Continuous Time
Authors:
Chamin Hewa Koneputugodage,
Rhys Healy,
Sean Lamont,
Ian Mallett,
Matt Brown,
Matt Walters,
Ushini Attanayake,
Libo Zhang,
Roger T. Dean,
Alexander Hunter,
Charles Gretton,
Christian Walder
Abstract:
We address the problem of combining sequence models of symbolic music with user defined constraints. For typical models this is non-trivial as only the conditional distribution of each symbol given the earlier symbols is available, while the constraints correspond to arbitrary times. Previously this has been addressed by assuming a discrete time model of fixed rhythm. We generalise to continuous time and arbitrary rhythm by introducing a simple, novel, and efficient particle filter scheme, applicable to general continuous time point processes. Extensive experimental evaluations demonstrate that in comparison with a more traditional beam search baseline, the particle filter exhibits superior statistical properties and yields more agreeable results in an extensive human listening test experiment.
Submitted 10 September, 2019;
originally announced September 2019.
-
Exact joint likelihood of pseudo-$C_\ell$ estimates from correlated Gaussian cosmological fields
Authors:
Robin E. Upham,
Lee Whittaker,
Michael L. Brown
Abstract:
We present the exact joint likelihood of pseudo-$C_\ell$ power spectrum estimates measured from an arbitrary number of Gaussian cosmological fields. Our method is applicable to both spin-0 fields and spin-2 fields, including a mixture of the two, and is relevant to Cosmic Microwave Background, weak lensing and galaxy clustering analyses. We show that Gaussian cosmological fields are mixed by a mask in such a way that retains their Gaussianity, without making any assumptions about the mask geometry. We then show that each auto- or cross-pseudo-$C_\ell$ estimator can be written as a quadratic form, and apply the known joint distribution of quadratic forms to obtain the exact joint likelihood of a set of pseudo-$C_\ell$ estimates in the presence of an arbitrary mask. Considering the polarisation of the Cosmic Microwave Background as an example, we show using simulations that our likelihood recovers the full, exact multivariate distribution of $EE$, $BB$ and $EB$ pseudo-$C_\ell$ power spectra. Our method provides a route to robust cosmological constraints from future Cosmic Microwave Background and large-scale structure surveys in an era of ever-increasing statistical precision.
Submitted 6 December, 2019; v1 submitted 2 August, 2019;
originally announced August 2019.
-
Learning Robust Representations for Automatic Target Recognition
Authors:
Justin A. Goodwin,
Olivia M. Brown,
Taylor W. Killian,
Sung-Hyun Son
Abstract:
Radio frequency (RF) sensors are used alongside other sensing modalities to provide rich representations of the world. Given the high variability of complex-valued target responses, RF systems are susceptible to attacks masking true target characteristics from accurate identification. In this work, we evaluate different techniques for building robust classification architectures exploiting learned physical structure in received synthetic aperture radar signals of simulated 3D targets.
Submitted 26 November, 2018;
originally announced November 2018.
-
Frame-Recurrent Video Super-Resolution
Authors:
Mehdi S. M. Sajjadi,
Raviteja Vemulapalli,
Matthew Brown
Abstract:
Recent advances in video super-resolution have shown that convolutional neural networks combined with motion compensation are able to merge information from multiple low-resolution (LR) frames to generate high-quality images. Current state-of-the-art methods process a batch of LR frames to generate a single high-resolution (HR) frame and run this scheme in a sliding window fashion over the entire video, effectively treating the problem as a large number of separate multi-frame super-resolution tasks. This approach has two main weaknesses: 1) Each input frame is processed and warped multiple times, increasing the computational cost, and 2) each output frame is estimated independently conditioned on the input frames, limiting the system's ability to produce temporally consistent results.
In this work, we propose an end-to-end trainable frame-recurrent video super-resolution framework that uses the previously inferred HR estimate to super-resolve the subsequent frame. This naturally encourages temporally consistent results and reduces the computational cost by warping only one image in each step. Furthermore, due to its recurrent nature, the proposed method has the ability to assimilate a large number of previous frames without increased computational demands. Extensive evaluations and comparisons with previous methods validate the strengths of our approach and demonstrate that the proposed framework is able to significantly outperform the current state of the art.
Submitted 25 March, 2018; v1 submitted 14 January, 2018;
originally announced January 2018.
-
Detecting gene innovations for phenotypic diversity across multiple genomes
Authors:
Inti Pedroso,
Mark J. F. Brown,
Seirian Sumner
Abstract:
Gene innovation is a key mechanism in the evolution and phenotypic diversity of life forms. There is a need for tools able to study gene innovation across an increasingly large number of genomic sequences to maximally capitalise on our understanding of biological systems. Here we present Comparative-Phylostratigraphy, an open-source software suite that makes it possible to time the emergence of new genes across evolutionary time and to correlate patterns of gene emergence with species traits simultaneously across whole genomes from multiple species. Such a comparative strategy is a powerful new tool for starting to dissect the relationship between gene innovation and phenotypic diversity. We describe and showcase our method by analysing recently published ant genomes. This new methodology identified significant bouts of new gene evolution in ant clades that are associated with shifts in life-history traits. Our method allows easy integration of new genomic data as it becomes available, and thus will be a valuable analytical tool for evolutionary biologists interested in explaining the evolution of diversity of life at the level of the genes.
Submitted 16 December, 2012;
originally announced December 2012.
-
From Boundary Crossing of Non-Random Functions to Boundary Crossing of Stochastic Processes
Authors:
Mark Brown,
Victor de la Pena,
Tony Sit
Abstract:
One problem of wide interest involves estimating expected crossing times. Several tools have been developed to solve this problem, beginning with the works of Wald and the theory of sequential analysis. An extension of his approach is provided by the optional sampling theorem in conjunction with martingale inequalities. Deriving an explicit closed-form solution for the expected crossing times may be difficult. In this paper, we provide a framework that can be used to estimate expected crossing times of arbitrary stochastic processes. Our key assumption is the knowledge of the average behavior of the supremum of the process. Our results include a universal sharp lower bound on the expected crossing times.
Submitted 11 December, 2012; v1 submitted 4 December, 2012;
originally announced December 2012.