-
Variational Inference for Acceleration of SN Ia Photometric Distance Estimation with BayeSN
Authors:
Ana Sofía M. Uzsoy,
Stephen Thorp,
Matthew Grayling,
Kaisey S. Mandel
Abstract:
Type Ia supernovae (SNe Ia) are standardizable candles whose observed light curves can be used to infer their distances, which can in turn be used in cosmological analyses. As the quantity of observed SNe Ia grows with current and upcoming surveys, increasingly scalable analyses are necessary to take full advantage of these new datasets for precise estimation of cosmological parameters. Bayesian inference methods enable fitting SN Ia light curves with robust uncertainty quantification, but traditional posterior sampling using Markov Chain Monte Carlo (MCMC) is computationally expensive. We present an implementation of variational inference (VI) to accelerate the fitting of SN Ia light curves using the BayeSN hierarchical Bayesian model for time-varying SN Ia spectral energy distributions (SEDs). We demonstrate and evaluate its performance on both simulated light curves and data from the Foundation Supernova Survey with two different forms of surrogate posterior -- a multivariate normal and a custom multivariate zero-lower-truncated normal distribution -- and compare them with the Laplace Approximation and full MCMC analysis. To validate our variational approximation, we calculate the Pareto-smoothed importance sampling (PSIS) diagnostic, and perform variational simulation-based calibration (VSBC). The VI approximation achieves similar results to MCMC but with an order-of-magnitude speedup for the inference of the photometric distance moduli. Overall, we show that VI is a promising method for scalable parameter inference that enables analysis of larger datasets for precision cosmology.
Submitted 9 May, 2024;
originally announced May 2024.
-
Anomaly Detection and Approximate Similarity Searches of Transients in Real-time Data Streams
Authors:
P. D. Aleo,
A. W. Engel,
G. Narayan,
C. R. Angus,
K. Malanchev,
K. Auchettl,
V. F. Baldassare,
A. Berres,
T. J. L. de Boer,
B. M. Boyd,
K. C. Chambers,
K. W. Davis,
N. Esquivel,
D. Farias,
R. J. Foley,
A. Gagliano,
C. Gall,
H. Gao,
S. Gomez,
M. Grayling,
C. -C. Lin,
E. A. Magnier,
K. S. Mandel,
T. Matheson,
S. I. Raimundo
, et al. (5 additional authors not shown)
Abstract:
We present LAISS (Lightcurve Anomaly Identification and Similarity Search), an automated pipeline to detect anomalous astrophysical transients in real-time data streams. We deploy our anomaly detection model on the nightly ZTF Alert Stream via the ANTARES broker, identifying a manageable $\sim$1-5 candidates per night for expert vetting and coordinating follow-up observations. Our method leverages statistical light-curve and contextual host-galaxy features within a random forest classifier, tagging transients of rare classes (spectroscopic anomalies), of uncommon host-galaxy environments (contextual anomalies), and of peculiar or interaction-powered phenomena (behavioral anomalies). Moreover, we demonstrate the power of a low-latency ($\sim$ms) approximate similarity search method to find transient analogs with similar light-curve evolution and host-galaxy environments. We use analogs for data-driven discovery, characterization, (re-)classification, and imputation in retrospective and real-time searches. To date we have identified $\sim$50 previously known and previously missed rare transients from real-time and retrospective searches, including but not limited to: SLSNe, TDEs, SNe IIn, SNe IIb, SNe Ia-CSM, SNe Ia-91bg-like, SNe Ib, SNe Ic, SNe Ic-BL, and M31 novae. Lastly, we report the discovery of 325 total transients, all observed between 2018-2021 and absent from public catalogs ($\sim$1% of all ZTF Astronomical Transient reports to the Transient Name Server through 2021). These methods enable a systematic approach to finding the "needle in the haystack" in large-volume data streams. Because of its integration with the ANTARES broker, LAISS is built to detect exciting transients in Rubin data.
Submitted 1 April, 2024;
originally announced April 2024.
-
JWST Photometric Time-Delay and Magnification Measurements for the Triply-Imaged Type Ia "Supernova H0pe" at z = 1.78
Authors:
J. D. R. Pierel,
B. L. Frye,
M. Pascale,
G. B. Caminha,
W. Chen,
S. Dhawan,
D. Gilman,
M. Grayling,
S. Huber,
P. Kelly,
S. Thorp,
N. Arendse,
S. Birrer,
M. Bronikowski,
R. Canameras,
D. Coe,
S. H. Cohen,
C. J. Conselice,
S. P. Driver,
J. C. J. Dsilva,
M. Engesser,
N. Foo,
C. Gall,
N. Garuda,
C. Grillo
, et al. (38 additional authors not shown)
Abstract:
Supernova (SN) H0pe is a gravitationally lensed, triply-imaged, Type Ia SN (SN Ia) discovered in James Webb Space Telescope imaging of the PLCK G165.7+67.0 cluster of galaxies. Well-observed multiply-imaged SNe provide a rare opportunity to constrain the Hubble constant ($H_0$), by measuring the relative time delay between the images and modeling the foreground mass distribution. SN H0pe is located at $z=1.783$, and is the first SN Ia with sufficient light curve sampling and long enough time delays for an $H_0$ inference. Here we present photometric time-delay measurements and SN properties of SN H0pe. Using JWST/NIRCam photometry we measure time delays of $Δt_{ab}=-116.6^{+10.8}_{-9.3}$ and $Δt_{cb}=-48.6^{+3.6}_{-4.0}$ observer-frame days relative to the last image to arrive (image 2b; all uncertainties are $1σ$), which corresponds to a $\sim5.6\%$ uncertainty contribution for $H_0$ assuming $70 \rm{km s^{-1} Mpc^{-1}}$. We also constrain the absolute magnification of each image to $μ_{a}=4.3^{+1.6}_{-1.8}$, $μ_{b}=7.6^{+3.6}_{-2.6}$, $μ_{c}=6.4^{+1.6}_{-1.5}$ by comparing the observed peak near-IR magnitude of SN H0pe to the non-lensed population of SNe Ia.
Submitted 27 March, 2024;
originally announced March 2024.
-
SIDE-real: Supernova Ia Dust Extinction with truncated marginal neural ratio estimation applied to real data
Authors:
Konstantin Karchev,
Matthew Grayling,
Benjamin M. Boyd,
Roberto Trotta,
Kaisey S. Mandel,
Christoph Weniger
Abstract:
We present the first fully simulation-based hierarchical analysis of the light curves of a population of low-redshift type Ia supernovae (SNae Ia). Our hardware-accelerated forward model, released in the Python package slicsim, includes stochastic variations of each SN's spectral flux distribution (based on the pre-trained BayeSN model), extinction from dust in the host and in the Milky Way, redshift, and realistic instrumental noise. By utilising truncated marginal neural ratio estimation (TMNRE), a neural network-enabled simulation-based inference technique, we implicitly marginalise over 4000 latent variables (for a set of $\approx 100$ SNae Ia) to efficiently infer SN Ia absolute magnitudes and host-galaxy dust properties at the population level while also constraining the parameters of individual objects. Amortisation of the inference procedure allows us to obtain coverage guarantees for our results through Bayesian validation and frequentist calibration. Furthermore, we show a detailed comparison to full likelihood-based inference, implemented through Hamiltonian Monte Carlo, on simulated data and then apply TMNRE to the light curves of 86 SNae Ia from the Carnegie Supernova Project, deriving marginal posteriors in excellent agreement with previous work. Given its ability to accommodate arbitrarily complex extensions to the forward model -- e.g. different populations based on host properties, redshift evolution, complicated photometric redshift estimates, selection effects, and non-Ia contamination -- without significant modifications to the inference procedure, TMNRE has the potential to become the tool of choice for cosmological parameter inference from future, large SN Ia samples.
Submitted 14 May, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Optimal Bayesian stepped-wedge cluster randomised trial designs for binary outcome data
Authors:
Laura Etfer,
James M. S. Wason,
Michael J. Grayling
Abstract:
Under a generalised estimating equation analysis approach, approximate design theory is used to determine Bayesian D-optimal designs. For two examples, considering simple exchangeable and exponential decay correlation structures, we compare the efficiency of identified optimal designs to balanced stepped-wedge designs and corresponding stepped-wedge designs determined by optimising using a normal approximation approach. The dependence of the Bayesian D-optimal designs on the assumed correlation structure is explored; for the considered settings, smaller decay in the correlation between outcomes across time periods, along with larger values of the intra-cluster correlation, leads to designs closer to a balanced design being optimal. Unlike for normal data, it is shown that the optimal design need not be centro-symmetric in the binary outcome case. The efficiency of the Bayesian D-optimal design relative to a balanced design can be large, but situations are demonstrated in which the advantages are small. Similarly, the optimal design from a normal approximation approach is often not much less efficient than the Bayesian D-optimal design. Bayesian D-optimal designs can be readily identified for stepped-wedge cluster randomised trials with binary outcome data. In certain circumstances, principally ones with strong time period effects, they will indicate that a design unlikely to have been identified by previous methods may be substantially more efficient. However, they require a larger number of assumptions than existing optimal designs, and in many situations existing theory under a normal approximation will provide an easier means of identifying an efficient design for binary outcome data.
Submitted 15 February, 2024;
originally announced February 2024.
-
Scalable hierarchical BayeSN inference: Investigating dependence of SN Ia host galaxy dust properties on stellar mass and redshift
Authors:
Matthew Grayling,
Stephen Thorp,
Kaisey S. Mandel,
Suhail Dhawan,
Ana Sofia M. Uzsoy,
Benjamin M. Boyd,
Erin E. Hayes,
Sam M. Ward
Abstract:
We apply the hierarchical probabilistic SED model BayeSN to analyse a sample of 475 SNe Ia (0.015 < z < 0.4) from Foundation, DES3YR and PS1MD to investigate the properties of dust in their host galaxies. We jointly infer the dust law $R_V$ population distributions at the SED level in high- and low-mass galaxies simultaneously with dust-independent, intrinsic differences. We find an intrinsic mass step of $-0.049\pm0.016$ mag, at a significance of 3.1$σ$, when allowing for a constant intrinsic, achromatic magnitude offset. We additionally apply a model allowing for time- and wavelength-dependent intrinsic differences between SNe Ia in different mass bins, finding $\sim$2$σ$ differences in magnitude and colour around peak and 4.5$σ$ differences at later times. These intrinsic differences are inferred simultaneously with a difference in population mean $R_V$ of $\sim$2$σ$ significance, demonstrating that both intrinsic and extrinsic differences may play a role in causing the host galaxy mass step. We also consider a model which allows the mean of the $R_V$ distribution to linearly evolve with redshift but find no evidence for any evolution - we infer the gradient of this relation $η_R = -0.38\pm0.70$. In addition, we discuss in brief a new, GPU-accelerated Python implementation of BayeSN suitable for application to large surveys which is publicly available and can be used for future cosmological analyses; this code can be found here: https://github.com/bayesn/bayesn.
Submitted 29 April, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
SN2023ixf in Messier 101: the twilight years of the progenitor as seen by Pan-STARRS
Authors:
Conor L. Ransome,
V. Ashley Villar,
Anna Tartaglia,
Sebastian Javier Gonzalez,
Wynn V. Jacobson-Galán,
Charles D. Kilpatrick,
Raffaella Margutti,
Ryan J. Foley,
Matthew Grayling,
Yuan Qi Ni,
Ricardo Yarza,
Christine Ye,
Katie Auchettl,
Thomas de Boer,
Kenneth C. Chambers,
David A. Coulter,
Maria R. Drout,
Diego Farias,
Christa Gall,
Hua Gao,
Mark E. Huber,
Adaeze L. Ibik,
David O. Jones,
Nandita Khetan,
Chien-Cheng Lin
, et al. (6 additional authors not shown)
Abstract:
The nearby type II supernova SN2023ixf in M101 exhibits signatures of early-time interaction with circumstellar material in the first week post-explosion. This material may be the consequence of prior mass loss suffered by the progenitor which possibly manifested in the form of a detectable pre-supernova outburst. We present an analysis of the long-baseline pre-explosion photometric data in $g$, $w$, $r$, $i$, $z$ and $y$ filters from Pan-STARRS as part of the Young Supernova Experiment, spanning $\sim$5,000 days. We find no significant detections in the Pan-STARRS pre-explosion light curve. We train a multilayer perceptron neural network to classify pre-supernova outbursts. We find no evidence of eruptive pre-supernova activity to a limiting absolute magnitude of $-7$. The limiting magnitudes from the full set of $gwrizy$ (average absolute magnitude $\approx$-8) data are consistent with previous pre-explosion studies. We use deep photometry from the literature to constrain the progenitor of SN2023ixf, finding that these data are consistent with a dusty red supergiant (RSG) progenitor with luminosity $\log\left(L/L_\odot\right)$$\approx$5.12 and temperature $\approx$3950K, corresponding to a mass of 14-20 M$_\odot$.
Submitted 7 December, 2023;
originally announced December 2023.
-
GausSN: Bayesian Time-Delay Estimation for Strongly Lensed Supernovae
Authors:
Erin E. Hayes,
Stephen Thorp,
Kaisey S. Mandel,
Nikki Arendse,
Matthew Grayling,
Suhail Dhawan
Abstract:
We present GausSN, a Bayesian semi-parametric Gaussian Process (GP) model for time-delay estimation with resolved systems of gravitationally lensed supernovae (glSNe). GausSN models the underlying light curve non-parametrically using a GP. Without assuming a template light curve for each SN type, GausSN fits for the time delays of all images using data in any number of wavelength filters simultaneously. We also introduce a novel time-varying magnification model to capture the effects of microlensing alongside time-delay estimation. In this analysis, we model the time-varying relative magnification as a sigmoid function, as well as a constant for comparison to existing time-delay estimation approaches. We demonstrate that GausSN provides robust time-delay estimates for simulations of glSNe from the Nancy Grace Roman Space Telescope and the Vera C. Rubin Observatory's Legacy Survey of Space and Time (Rubin-LSST). We find that up to 43.6% of time-delay estimates from Roman and 52.9% from Rubin-LSST have fractional errors of less than 5%. We then apply GausSN to SN Refsdal and find the time delay for the fifth image is consistent with the original analysis, regardless of microlensing treatment. Therefore, GausSN maintains the level of precision and accuracy achieved by existing time-delay extraction methods with fewer assumptions about the underlying shape of the light curve than template-based approaches, while incorporating microlensing into the statistical error budget rather than requiring post-processing to account for its systematic uncertainty. GausSN is scalable for time-delay cosmography analyses given current projections of glSNe discovery rates from Rubin-LSST and Roman.
Submitted 29 November, 2023;
originally announced November 2023.
-
Bird-Snack: Bayesian Inference of dust law $R_V$ Distributions using SN Ia Apparent Colours at peaK
Authors:
Sam M. Ward,
Suhail Dhawan,
Kaisey S. Mandel,
Matthew Grayling,
Stephen Thorp
Abstract:
To reduce systematic uncertainties in Type Ia supernova (SN Ia) cosmology, the host galaxy dust law shape parameter, $R_V$, must be accurately constrained. We thus develop a computationally-inexpensive pipeline, Bird-Snack, to rapidly infer dust population distributions from optical-near infrared SN colours at peak brightness, and determine which analysis choices significantly impact the population mean $R_V$ inference, $μ_{R_V}$. Our pipeline uses a 2D Gaussian process to measure peak $BVriJH$ apparent magnitudes from SN light curves, and a hierarchical Bayesian model to simultaneously constrain population distributions of intrinsic and dust components. Fitting a low-to-moderate-reddening sample of 65 low-redshift SNe yields $μ_{R_V}=2.61^{+0.38}_{-0.35}$, with $68\%(95\%)$ posterior upper bounds on the population dispersion, $σ_{R_V}<0.92(1.96)$. This result is robust to various analysis choices, including: the model for intrinsic colour variations, fitting the shape hyperparameter of a gamma dust extinction distribution, and cutting the sample based on the availability of data near peak. However, these choices may be important if statistical uncertainties are reduced. With larger near-future optical and near-infrared SN samples, Bird-Snack can be used to better constrain dust distributions, and investigate potential correlations with host galaxy properties. Bird-Snack is publicly available; the modular infrastructure facilitates rapid exploration of custom analysis choices, and quick fits to simulated datasets, for better interpretation of real-data inferences.
Submitted 11 October, 2023;
originally announced October 2023.
-
Keck Infrared Transient Survey I: Survey Description and Data Release 1
Authors:
S. Tinyanont,
R. J. Foley,
K. Taggart,
K. W. Davis,
N. LeBaron,
J. E. Andrews,
M. J. Bustamante-Rosell,
Y. Camacho-Neves,
R. Chornock,
D. A. Coulter,
L. Galbany,
S. W. Jha,
C. D. Kilpatrick,
L. A. Kwok,
C. Larison,
J. R. Pierel,
M. R. Siebert,
G. Aldering,
K. Auchettl,
J. S. Bloom,
S. Dhawan,
A. V. Filippenko,
K. D. French,
A. Gagliano,
M. Grayling
, et al. (13 additional authors not shown)
Abstract:
We present the Keck Infrared Transient Survey (KITS), a NASA Key Strategic Mission Support program to obtain near-infrared (NIR) spectra of astrophysical transients of all types, and its first data release, consisting of 105 NIR spectra of 50 transients. Such a data set is essential as we enter a new era of IR astronomy with the James Webb Space Telescope (JWST) and the upcoming Nancy Grace Roman Space Telescope (Roman). NIR spectral templates will be essential to search JWST images for stellar explosions of the first stars and to plan an effective Roman SN Ia cosmology survey, both key science objectives for mission success. Between 2022 February and 2023 July, we systematically obtained 274 NIR spectra of 146 astronomical transients, representing a significant increase in the number of available NIR spectra in the literature. The first data release includes data from the 2022A semester. We systematically observed three samples: a flux-limited sample that includes all transients $<$17 mag in a red optical band (usually ZTF r or ATLAS o bands); a volume-limited sample including all transients within redshift $z < 0.01$ ($D \approx 50$ Mpc); and an SN Ia sample targeting objects at phases and light-curve parameters that had scant existing NIR data in the literature. The flux-limited sample is 39% complete (60% excluding SNe Ia), while the volume-limited sample is 54% complete and is 79% complete to $z = 0.005$. All completeness numbers will rise with the inclusion of data from other telescopes in future data releases. Transient classes observed include common Type Ia and core-collapse supernovae, tidal disruption events (TDEs), luminous red novae, and the newly categorized hydrogen-free/helium-poor interacting Type Icn supernovae. We describe our observing procedures and data reduction using Pypeit, which requires minimal human interaction to ensure reproducibility.
Submitted 13 September, 2023;
originally announced September 2023.
-
Evaluating the impact of outcome delay on the efficiency of two-arm group-sequential trials
Authors:
Aritra Mukherjee,
Michael J. Grayling,
James M. S. Wason
Abstract:
Adaptive designs (ADs) are a broad class of trial designs that allow preplanned modifications based on patient data, providing improved efficiency and flexibility. However, a delay in observing the primary outcome variable can harm this added efficiency. In this paper, we aim to ascertain the size of such outcome delay that results in the realised efficiency gains of ADs becoming negligible compared to classical fixed-sample randomised controlled trials (RCTs).
We measure the impact of delay by developing formulae for the number of overruns in two-arm group-sequential designs (GSDs) with normal data, assuming different recruitment models. The efficiency of a GSD is usually measured in terms of the expected sample size (ESS), with GSDs generally reducing the ESS compared to a standard RCT. Our formulae measure the efficiency gain from a GSD, in terms of ESS reduction, that is lost due to delay. We assess whether careful choice of design (e.g., altering the spacing of the interim analyses (IAs)) can help recover the benefits of GSDs in the presence of delay. We also analyse the efficiency of GSDs with respect to time to complete the trial.
Comparing the expected efficiency gains, with and without consideration of delay, it is evident that GSDs suffer considerable losses due to delay. Even a small delay can have a significant impact on the trial's efficiency. In contrast, even in the presence of substantial delay, a GSD will have a smaller expected time to trial completion than a simple RCT. Although the number of stages has little influence on the efficiency losses, the timing of IAs can impact the efficiency of a GSD with delay. In particular, for unequally spaced IAs, pushing IAs towards the latter end of the trial can be harmful for a design with delay.
Submitted 7 June, 2023;
originally announced June 2023.
-
Utilising high-dimensional data in randomised clinical trials: a review of methods and practice
Authors:
Svetlana Cherlin,
Theophile Bigirumurame,
Michael J Grayling,
Jérémie Nsengimana,
Luke Ouma,
Aida Santaolalla,
Fang Wan,
S Faye Williamson,
James M S Wason
Abstract:
Introduction: Even in effectively conducted randomised trials, the probability of a successful study remains relatively low. With recent advances in the next-generation sequencing technologies, there is a rapidly growing number of high-dimensional data, including genetic, molecular and phenotypic information, that have improved our understanding of driver genes, drug targets, and drug mechanisms of action. The leveraging of high-dimensional data holds promise for increased success of clinical trials. Methods: We provide an overview of methods for utilising high-dimensional data in clinical trials. We also investigate the use of these methods in practice through a review of recently published randomised clinical trials that utilise high-dimensional genetic data. The review includes articles that were published between 2019 and 2021, identified through the PubMed database. Results: Out of 174 screened articles, 100 (57.5%) were randomised clinical trials that collected high-dimensional data. The most common clinical area was oncology (30%), followed by chronic diseases (28%), nutrition and ageing (18%) and cardiovascular diseases (7%). The most common types of data analysed were gene expression data (70%), followed by DNA data (21%). The most common method of analysis (36.3%) was univariable analysis. Articles that described multivariable analyses used standard statistical methods. Most of the clinical trials had two arms. Discussion: New methodological approaches are required for more efficient analysis of the increasing amount of high-dimensional data collected in randomised clinical trials. We highlight the limitations and barriers to the current use of high-dimensional data in trials, and suggest potential avenues for improvement and future work.
Submitted 5 February, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Photometric study of the late-time near-infrared plateau in Type Ia supernovae
Authors:
M. Deckers,
O. Graur,
K. Maguire,
L. Shingles,
S. J. Brennan,
J. P. Anderson,
J. Burke,
T. -W. Chen,
L. Galbany,
M. J. P. Grayling,
C. P. Gutiérrez,
L. Harvey,
D. Hiramatsu,
D. A. Howell,
C. Inserra,
T. Killestein,
C. McCully,
T. E. Müller-Bravo,
M. Nicholl,
M. Newsome,
E. Padilla Gonzalez,
C. Pellegrino,
G. Terreran,
J. H. Terwel,
M. Toy
, et al. (1 additional authors not shown)
Abstract:
We present an in-depth study of the late-time near-infrared plateau in Type Ia supernovae (SNe Ia), which occurs between 70-500 d. We double the existing sample of SNe Ia observed during the late-time near-infrared plateau with new observations taken with the Hubble Space Telescope, Gemini, New Technology Telescope, the 3.5m Calar Alto Telescope, and the Nordic Optical Telescope. Our sample consists of 24 nearby SNe Ia at redshift < 0.025. We are able to confirm that no plateau exists in the Ks band for most normal SNe Ia. SNe Ia with broader optical light curves at peak tend to have a higher average brightness on the plateau in J and H, most likely due to a shallower decline in the preceding 100 d. SNe Ia that are more luminous at peak also show a steeper decline during the plateau phase in H. We compare our data to state-of-the-art radiative transfer models of nebular SNe Ia in the near-infrared. We find good agreement with the sub-Mch model that has reduced non-thermal ionisation rates, but no physical justification for reducing these rates has yet been proposed. An analysis of the spectral evolution during the plateau demonstrates that the ratio of [Fe II] to [Fe III] contribution in a near-infrared filter determines the light curve evolution in said filter. We find that overluminous SNe decline more slowly during the plateau than expected from the trend seen for normal SNe Ia.
Submitted 16 March, 2023;
originally announced March 2023.
-
The Young Supernova Experiment Data Release 1 (YSE DR1): Light Curves and Photometric Classification of 1975 Supernovae
Authors:
P. D. Aleo,
K. Malanchev,
S. Sharief,
D. O. Jones,
G. Narayan,
R. J. Foley,
V. A. Villar,
C. R. Angus,
V. F. Baldassare,
M. J. Bustamante-Rosell,
D. Chatterjee,
C. Cold,
D. A. Coulter,
K. W. Davis,
S. Dhawan,
M. R. Drout,
A. Engel,
K. D. French,
A. Gagliano,
C. Gall,
J. Hjorth,
M. E. Huber,
W. V. Jacobson-Galán,
C. D. Kilpatrick,
D. Langeroodi
, et al. (58 additional authors not shown)
Abstract:
We present the Young Supernova Experiment Data Release 1 (YSE DR1), comprised of processed multi-color Pan-STARRS1 (PS1) griz and Zwicky Transient Facility (ZTF) gr photometry of 1975 transients with host-galaxy associations, redshifts, spectroscopic/photometric classifications, and additional data products from 2019 November 24 to 2021 December 20. YSE DR1 spans discoveries and observations from young and fast-rising supernovae (SNe) to transients that persist for over a year, with a redshift distribution reaching z~0.5. We present relative SN rates from YSE's magnitude- and volume-limited surveys, which are consistent with previously published values within estimated uncertainties for untargeted surveys. We combine YSE and ZTF data, and create multi-survey SN simulations to train the ParSNIP and SuperRAENN photometric classification algorithms; when validating our ParSNIP classifier on 472 spectroscopically classified YSE DR1 SNe, we achieve 82% accuracy across three SN classes (SNe Ia, II, Ib/Ic) and 90% accuracy across two SN classes (SNe Ia, core-collapse SNe). Our classifier performs particularly well on SNe Ia, with high (>90%) individual completeness and purity, which will help build an anchor photometric SNe Ia sample for cosmology. We then use our photometric classifier to characterize our photometric sample of 1483 SNe, labeling 1048 (~71%) SNe Ia, 339 (~23%) SNe II, and 96 (~6%) SNe Ib/Ic. YSE DR1 provides a training ground for building discovery, anomaly detection, and classification algorithms, performing cosmological analyses, understanding the nature of red and rare transients, exploring tidal disruption events and nuclear variability, and preparing for the forthcoming Vera C. Rubin Observatory Legacy Survey of Space and Time.
Submitted 21 February, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
-
Core-collapse Supernovae in the Dark Energy Survey: Luminosity Functions and Host Galaxy Demographics
Authors:
M. Grayling,
C. P. Gutiérrez,
M. Sullivan,
P. Wiseman,
M. Vincenzi,
L. Galbany,
A. Möller,
D. Brout,
T. M. Davis,
C. Frohmaier,
O. Graur,
L. Kelsey,
C. Lidman,
B. Popovic,
M. Smith,
M. Toy,
B. E. Tucker,
Z. Zontou,
T. M. C. Abbott,
M. Aguena,
S. Allam,
F. Andrade-Oliveira,
J. Annis,
J. Asorey,
D. Bacon
, et al. (51 additional authors not shown)
Abstract:
We present the luminosity functions and host galaxy properties of the Dark Energy Survey (DES) core-collapse supernova (CCSN) sample, consisting of 69 Type II and 50 Type Ibc spectroscopically and photometrically-confirmed supernovae over a redshift range $0.045<z<0.25$. We fit the observed DES $griz$ CCSN light-curves and K-correct to produce rest-frame $R$-band light curves. We compare the sample with lower-redshift CCSN samples from Zwicky Transient Facility (ZTF) and Lick Observatory Supernova Search (LOSS). Comparing luminosity functions, the DES and ZTF samples of SNe II are brighter than that of LOSS with significances of 3.0$σ$ and 2.5$σ$ respectively. While this difference could be caused by redshift evolution in the luminosity function, simpler explanations such as differing levels of host extinction remain a possibility. We find that the host galaxies of SNe II in DES are on average bluer than in ZTF, despite having consistent stellar mass distributions. We consider a number of possibilities to explain this -- including galaxy evolution with redshift, selection biases in either the DES or ZTF samples, and systematic differences due to the different photometric bands available -- but find that none can easily reconcile the differences in host colour between the two samples and thus its cause remains uncertain.
Submitted 22 March, 2023; v1 submitted 18 July, 2022;
originally announced July 2022.
-
Bayesian sample size determination in basket trials borrowing information between subsets
Authors:
Haiyan Zheng,
Michael J. Grayling,
Pavel Mozgunov,
Thomas Jaki,
James M. S. Wason
Abstract:
Basket trials are increasingly used for the simultaneous evaluation of a new treatment in various patient subgroups under one overarching protocol. We propose a Bayesian approach to sample size determination in basket trials that permit borrowing of information between commensurate subsets. Specifically, we consider a randomised basket trial design where patients are randomly assigned to the new treatment or a control within each trial subset (`subtrial' for short). Closed-form sample size formulae are derived to ensure each subtrial has a specified chance of correctly deciding whether the new treatment is superior to or not better than the control by some clinically relevant difference. Given pre-specified levels of pairwise (in)commensurability, the subtrial sample sizes are solved simultaneously. The proposed Bayesian approach resembles the frequentist formulation of the problem in yielding comparable sample sizes for circumstances of no borrowing. When borrowing is enabled between commensurate subtrials, a considerably smaller trial sample size is required compared to the widely implemented approach of no borrowing. We illustrate the use of our sample size formulae with two examples based on real basket trials. A comprehensive simulation study further shows that the proposed methodology can maintain the true positive and false positive rates at desired levels.
Submitted 24 May, 2022;
originally announced May 2022.
-
Understanding the extreme luminosity of DES14X2fna
Authors:
M. Grayling,
C. P. Gutiérrez,
M. Sullivan,
P. Wiseman,
M. Vincenzi,
S. González-Gaitán,
B. E. Tucker,
L. Galbany,
L. Kelsey,
C. Lidman,
E. Swann,
D. Carollo,
K. Glazebrook,
G. F. Lewis,
A. Möller,
S. R. Hinton,
M. Smith,
S. A. Uddin,
T. M. C. Abbott,
M. Aguena,
S. Avila,
E. Bertin,
S. Bhargava,
D. Brooks,
A. Carnero Rosell
, et al. (44 additional authors not shown)
Abstract:
We present DES14X2fna, a high-luminosity, fast-declining type IIb supernova (SN IIb) at redshift $z=0.0453$, detected by the Dark Energy Survey (DES). DES14X2fna is an unusual member of its class, with a light curve showing a broad, luminous peak reaching $M_r\simeq-19.3$ mag 20 days after explosion. This object does not show a linear decline tail in the light curve until $\simeq$60 days after explosion, after which it declines very rapidly (4.38$\pm$0.10 mag 100 d$^{-1}$ in $r$-band). By fitting semi-analytic models to the photometry of DES14X2fna, we find that its light curve cannot be explained by a standard $^{56}$Ni decay model as this is unable to fit the peak and fast tail decline observed. Inclusion of either interaction with surrounding circumstellar material or a rapidly-rotating neutron star (magnetar) significantly increases the quality of the model fit. We also investigate the possibility for an object similar to DES14X2fna to act as a contaminant in photometric samples of SNe Ia for cosmology, finding that a similar simulated object is misclassified by a recurrent neural network (RNN)-based photometric classifier as a SN Ia in $\sim$1.1-2.4 per cent of cases in DES, depending on the probability threshold used for a positive classification.
Submitted 26 March, 2021;
originally announced March 2021.
-
Multi-outcome trials with a generalised number of efficacious outcomes
Authors:
Martin Law,
Michael J. Grayling,
Adrian P. Mander
Abstract:
Existing multi-outcome designs focus almost entirely on evaluating whether all outcomes show evidence of efficacy or whether at least one outcome shows evidence of efficacy. While a small number of authors have provided multi-outcome designs that evaluate when a general number of outcomes show promise, these designs have been single-stage in nature only. We therefore propose two designs, of group-sequential and drop the loser form, that provide this design characteristic in a multi-stage setting. Previous such multi-outcome multi-stage designs have allowed only for a maximum of two outcomes; our designs thus also extend previous related proposals by permitting any number of outcomes.
Submitted 18 December, 2020;
originally announced December 2020.
-
Conditional Power and Friends: The Why and How of (Un)planned, Unblinded Sample Size Recalculations in Confirmatory Trials
Authors:
Kevin Kunzmann,
Michael J. Grayling,
Kim M. Lee,
David S. Robertson,
Kaspar Rufibach,
James M. S. Wason
Abstract:
Adapting the final sample size of a trial to the evidence accruing during the trial is a natural way to address planning uncertainty. Designs with adaptive sample size need to account for their optional stopping to guarantee strict type-I error-rate control. A variety of different methods to maintain type-I error-rate control after unplanned changes of the initial sample size have been proposed in the literature. This makes interim analyses for the purpose of sample size recalculation feasible in a regulatory context. Since the sample size is usually determined via an argument based on the power of the trial, an interim analysis raises the question of how the final sample size should be determined conditional on the accrued information. Conditional power is a concept often put forward in this context. Since it depends on the unknown effect size, we take a strict estimation perspective and compare assumed conditional power, observed conditional power, and predictive power with respect to their properties as estimators of the unknown conditional power. We then demonstrate that pre-planning an interim analysis using methodology for unplanned interim analyses is ineffective and naturally leads to the concept of optimal two-stage designs. We conclude that unplanned design adaptations should only be conducted as reaction to trial-external new evidence, operational needs to violate the originally chosen design, or post hoc changes in the objective criterion. Finally, we show that commonly discussed sample size recalculation rules can lead to paradoxical outcomes and propose two alternative ways of reacting to newly emerging trial-external evidence.
Submitted 13 October, 2020;
originally announced October 2020.
-
The Effect of Environment on Type Ia Supernovae in the Dark Energy Survey Three-Year Cosmological Sample
Authors:
L. Kelsey,
M. Sullivan,
M. Smith,
P. Wiseman,
D. Brout,
T. M. Davis,
C. Frohmaier,
L. Galbany,
M. Grayling,
C. P. Gutiérrez,
S. R. Hinton,
R. Kessler,
C. Lidman,
A. Möller,
M. Sako,
D. Scolnic,
S. A. Uddin,
M. Vincenzi,
T. M. C. Abbott,
M. Aguena,
S. Allam,
J. Annis,
S. Avila,
D. Bacon,
E. Bertin
, et al. (51 additional authors not shown)
Abstract:
Analyses of type Ia supernovae (SNe Ia) have found puzzling correlations between their standardised luminosities and host galaxy properties: SNe Ia in high-mass, passive hosts appear brighter than those in lower-mass, star-forming hosts. We examine the host galaxies of SNe Ia in the Dark Energy Survey three-year spectroscopically-confirmed cosmological sample, obtaining photometry in a series of "local" apertures centred on the SN, and for the global host galaxy. We study the differences in these host galaxy properties, such as stellar mass and rest-frame $U-R$ colours, and their correlations with SN Ia parameters including Hubble residuals. We find all Hubble residual steps to be $>3σ$ in significance, both for splitting at the traditional environmental property sample median and for the step of maximum significance. For stellar mass, we find a maximal local step of $0.098\pm0.018$ mag; $\sim 0.03$ mag greater than the largest global stellar mass step in our sample ($0.070 \pm 0.017$ mag). When splitting at the sample median, differences between local and global $U-R$ steps are small, both $\sim 0.08$ mag, but are more significant than the global stellar mass step ($0.057\pm0.017$ mag). We split the data into sub-samples based on SN Ia light curve parameters: stretch ($x_1$) and colour ($c$), finding that redder objects ($c > 0$) have larger Hubble residual steps, for both stellar mass and $U-R$, for both local and global measurements, of $\sim0.14$ mag. Additionally, the bluer (star-forming) local environments host a more homogeneous SN Ia sample, with local $U-R$ r.m.s. scatter as low as $0.084 \pm 0.017$ mag for blue ($c < 0$) SNe Ia in locally blue $U-R$ environments.
Submitted 6 January, 2021; v1 submitted 27 August, 2020;
originally announced August 2020.
-
Two-stage single-arm trials are rarely reported adequately
Authors:
Michael J. Grayling,
Adrian P. Mander
Abstract:
Purpose: Two-stage single-arm trial designs are commonly used in phase II oncology to infer treatment effects for a binary primary outcome (e.g., tumour response). It is imperative that such studies be designed, analysed, and reported effectively. However, there is little available evidence on whether this is the case, particularly for key statistical considerations. We therefore comprehensively review such trials, examining in particular quality of reporting. Methods: Published oncology trials that utilised "Simon's two-stage design" over a 5 year period were identified and reviewed. Articles were evaluated on whether they reported sufficient design details, such as the required sample size, and analysis details, such as a confidence interval (CI). The articles that did not adjust their inference for the incorporation of an interim analysis were re-analysed to evaluate the impact on their reported point estimate and CI. Results: Four hundred and twenty five articles that reported the results of a single treatment arm were included. Of these, 47.5% provided the five components that ensure design reproducibility. Only 1.2% and 2.1% reported an adjusted point estimate or CI, respectively. Just 55.3% of trials provided the final stage rejection bound, indicating many trials did not test the hypothesis the design is constructed to assess. Re-analysis of the trials suggests that reported point estimates underestimated treatment effects and that reported CIs were too narrow. Conclusion: Key design details of two-stage single-arm trials are often unreported. Whilst inference is regularly performed, it is rarely done so in a way that removes the bias introduced by the interim analysis. In order to maximise their value, future studies must improve the way that they are analysed and reported.
Submitted 8 July, 2020;
originally announced July 2020.
-
A review of Bayesian perspectives on sample size derivation for confirmatory trials
Authors:
Kevin Kunzmann,
Michael J. Grayling,
Kim May Lee,
David S. Robertson,
Kaspar Rufibach,
James M. S. Wason
Abstract:
Sample size derivation is a crucial element of the planning phase of any confirmatory trial. A sample size is typically derived based on constraints on the maximal acceptable type I error rate and a minimal desired power. Here, power depends on the unknown true effect size. In practice, power is typically calculated either for the smallest relevant effect size or a likely point alternative. The former might be problematic if the minimal relevant effect is close to the null, thus requiring an excessively large sample size. The latter is dubious since it does not account for the a priori uncertainty about the likely alternative effect size. A Bayesian perspective on the sample size derivation for a frequentist trial naturally emerges as a way of reconciling arguments about the relative a priori plausibility of alternative effect sizes with ideas based on the relevance of effect sizes. Many suggestions as to how such `hybrid' approaches could be implemented in practice have been put forward in the literature. However, key quantities such as assurance, probability of success, or expected power are often defined in subtly different ways in the literature. Starting from the traditional and entirely frequentist approach to sample size derivation, we derive consistent definitions for the most commonly used `hybrid' quantities and highlight connections, before discussing and demonstrating their use in the context of sample size derivation for clinical trials.
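As a rough illustration of the traditional frequentist starting point mentioned in this abstract, the sketch below computes a per-arm sample size for a two-arm trial with a normally distributed outcome of known variance, using the textbook normal-approximation formula. The function name, the default one-sided type I error rate and power, and the effect sizes are assumptions chosen for illustration; none of them come from the paper.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(delta, sigma, alpha=0.025, power=0.9):
    """Per-arm n for a one-sided two-sample z-test (textbook formula).

    delta : assumed true difference in means (a point alternative)
    sigma : known common standard deviation
    alpha : one-sided type I error rate
    power : desired power at the assumed delta
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    z_beta = std_normal.inv_cdf(power)
    # n = 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2, rounded up
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


# Hypothetical effect sizes: a moderate one and one close to the null.
n_moderate = per_arm_sample_size(delta=0.5, sigma=1.0)
n_near_null = per_arm_sample_size(delta=0.1, sigma=1.0)
print(n_moderate, n_near_null)
```

Because the required size scales with 1/delta^2, powering for a minimal relevant effect close to the null quickly becomes very demanding, which is the concern the abstract raises about that choice.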
Submitted 28 June, 2020;
originally announced June 2020.
-
The Mystery of Photometric Twins DES17X1boj and DES16E2bjy
Authors:
M. Pursiainen,
C. Gutierrez,
P. Wiseman,
M. Childress,
M. Smith,
C. Frohmaier,
C. Angus,
N. Castro Segura,
L. Kelsey,
M. Sullivan,
L. Galbany,
P. Nugent,
B. A. Bassett,
D. Brout,
D. Carollo,
C. B. D'Andrea,
T. M. Davis,
R. J. Foley,
M. Grayling,
S. R. Hinton,
C. Inserra,
R. Kessler,
C. Lidman,
E. Macaulay,
M. March
, et al. (58 additional authors not shown)
Abstract:
We present an analysis of DES17X1boj and DES16E2bjy, two peculiar transients discovered by the Dark Energy Survey (DES). They exhibit nearly identical double-peaked light curves which reach very different maximum luminosities (M$_\mathrm{r}$ = -15.4 and M$_\mathrm{r}$ = -17.9, respectively). The light curve evolution of these events is highly atypical and has not been reported before. The transients are found in different host environments: DES17X1boj was found near the nucleus of a spiral galaxy, while DES16E2bjy is located in the outskirts of a passive red galaxy. Early photometric data is well fitted with a blackbody and the resulting moderate photospheric expansion velocities (1800 km/s for DES17X1boj and 4800 km/s for DES16E2bjy) suggest an explosive or eruptive origin. Additionally, a feature identified as high-velocity CaII absorption (v $\approx$ 9400km/s) in the near-peak spectrum of DES17X1boj may imply that it is a supernova. While similar light curve evolution suggests a similar physical origin for these two transients, we are not able to identify or characterise the progenitors.
Submitted 7 April, 2020; v1 submitted 27 November, 2019;
originally announced November 2019.
-
Optimal curtailed designs for single arm phase II clinical trials
Authors:
Martin Law,
Michael J. Grayling,
Adrian P. Mander
Abstract:
In single-arm phase II oncology trials, the most popular choice of design is Simon's two-stage design, which allows early stopping at one interim analysis. However, the expected trial sample size can be reduced further by allowing curtailment. Curtailment is stopping when the final go or no-go decision is certain, so-called non-stochastic curtailment, or very likely, known as stochastic curtailment.
In the context of single-arm phase II oncology trials, stochastic curtailment has previously been restricted to stopping in the second stage and/or stopping for a no-go decision only. We introduce two designs that incorporate stochastic curtailment and allow stopping after every observation, for either a go or no-go decision. We obtain optimal stopping boundaries by searching over a range of potential conditional powers, beyond which the trial will stop for a go or no-go decision. This search is novel: firstly, the search is undertaken over a range of values unique to each possible design realisation. Secondly, these values are evaluated taking into account the possibility of early stopping. Finally, each design realisation's operating characteristics are obtained exactly.
The proposed designs are compared to existing designs in a real data example. They are also compared under three scenarios, both with respect to four single optimality criteria and using a loss function.
The proposed designs are superior in almost all cases. Optimising for the expected sample size under either the null or alternative hypothesis, the saving compared to the popular Simon's design ranges from 22% to 55%.
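For orientation, the baseline that these designs are compared against can be sketched as follows: a minimal exact calculation of the operating characteristics of a Simon-type two-stage design without curtailment. This is not the authors' curtailed design or their search procedure. The function names and the design values (r1, n1, r, n) and response rates are illustrative assumptions, not taken from the paper.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def simon_two_stage(r1, n1, r, n, p):
    """Exact operating characteristics at true response rate p.

    Stage 1: enrol n1 patients; stop for no-go if responses <= r1.
    Stage 2: enrol up to n in total; declare go if total responses > r.
    Returns (probability of early termination, expected sample size,
    probability of a go decision).
    """
    pet = binom_cdf(r1, n1, p)
    expected_n = n1 + (1 - pet) * (n - n1)
    n2 = n - n1
    p_go = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        needed = r - x1  # stage-2 responses must exceed this value
        tail = 1.0 if needed < 0 else 1 - binom_cdf(min(needed, n2), n2, p)
        p_go += binom_pmf(x1, n1, p) * tail
    return pet, expected_n, p_go


# Illustrative design: r1/n1 = 1/10, r/n = 5/29, evaluated at two rates.
pet0, en0, go0 = simon_two_stage(1, 10, 5, 29, p=0.10)
pet1, en1, go1 = simon_two_stage(1, 10, 5, 29, p=0.30)
print(round(pet0, 4), round(en0, 2), round(go0, 4), round(go1, 4))
```

Curtailment as described in the abstract would add further stopping opportunities on top of this two-stage rule, which is how the expected sample size can fall below the value computed here.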
Submitted 6 September, 2019;
originally announced September 2019.
-
A web application for the design of multi-arm clinical trials
Authors:
Michael J Grayling,
James MS Wason
Abstract:
Multi-arm designs provide an effective means of evaluating several treatments within the same clinical trial. Given the large number of treatments now available for testing in many disease areas, it has been argued that their utilisation should increase. However, for any given clinical trial there are numerous possible multi-arm designs that could be used, and choosing between them can be a difficult task. This task is complicated further by a lack of available easy-to-use software for designing multi-arm trials. To aid the wider implementation of multi-arm clinical trial designs, we have developed a web application for sample size calculation when using a variety of popular multiple comparison corrections. Furthermore, the application supports sample size calculation to control several varieties of power, as well as the determination of optimised arm-wise allocation ratios. It is built using the Shiny package in the R programming language, is free to access on any device with an internet browser, and requires no programming knowledge to use. The application provides the core information required by statisticians and clinicians to review the operating characteristics of a chosen multi-arm clinical trial design. We hope that it will assist with the future utilisation of such designs in practice.
Submitted 21 June, 2019;
originally announced June 2019.
-
A review of available software for adaptive clinical trial design
Authors:
Michael J Grayling,
Graham M Wheeler
Abstract:
Background/Aims: The increasing expense of the drug development process has seen interest in the use of adaptive designs (ADs) grow substantially in recent years. Accordingly, much research has been conducted to identify potential barriers to increasing the use of ADs in practice, and several articles have argued that the availability of user-friendly software will be an important step in making ADs easier to implement. Therefore, in this paper we present a review of the current state of software availability for AD. Methods: We first review articles from 31 journals published in 2013-17 that relate to methodology for adaptive trials, in order to assess how often code and software for implementing novel ADs is made available at the time of publication. We contrast our findings against these journals' current policies on code distribution. Secondly, we conduct additional searches of popular code repositories, such as CRAN and GitHub, to identify further existing user-contributed software for ADs. From this, we are able to direct interested parties towards solutions for their problem of interest by classifying available code by type of adaptation. Results: Only 29% of included articles made their code available in some form. In many instances, articles published in journals that had mandatory requirements on code provision still did not make code available. There are several areas in which available software is currently limited or saturated. In particular, many packages are available to address group sequential design, but comparatively little code is present in the public domain to determine biomarker-guided ADs. Conclusions: There is much room for improvement in the provision of software alongside AD publications. Additionally, whilst progress has been made, well-established software for various types of trial adaptation remains sparsely available.
Submitted 13 June, 2019;
originally announced June 2019.
-
Blinded and unblinded sample size re-estimation in crossover trials balanced for period
Authors:
Michael Grayling,
Adrian Mander,
James Wason
Abstract:
The determination of the sample size required by a crossover trial typically depends on the specification of one or more variance components. Uncertainty about the value of these parameters at the design stage means that there is often a risk a trial may be under- or over-powered. For many study designs, this problem has been addressed by considering adaptive design methodology that allows for the re-estimation of the required sample size during a trial. Here, we propose and compare several approaches for this in multi-treatment crossover trials. Specifically, regulators favour re-estimation procedures to maintain the blinding of the treatment allocations. We therefore develop blinded estimators for the within and between person variances, following simple or block randomisation. We demonstrate that, provided an equal number of patients are allocated to sequences that are balanced for period, the proposed estimators following block randomisation are unbiased. We further provide a formula for the bias of the estimators following simple randomisation. The performance of these procedures, along with that of an unblinded approach, is then examined utilising three motivating examples, including one based on a recently completed four-treatment four-period crossover trial. Simulation results show that the performance of the proposed blinded procedures is in many cases similar to that of the unblinded approach, and thus they are an attractive alternative.
Submitted 27 March, 2018;
originally announced March 2018.
-
Design optimisation and post-trial analysis in group sequential stepped-wedge cluster randomised trials
Authors:
Michael Grayling,
David Robertson,
James Wason,
Adrian Mander
Abstract:
Recently, methodology was presented to facilitate the incorporation of interim analyses in stepped-wedge (SW) cluster randomised trials (CRTs). Here, we extend this previous discussion. We detail how the stopping boundaries, allocation sequences, and per-cluster per-period sample size of a group sequential SW-CRT can be optimised. We then describe methods by which point estimates, p-values, and confidence intervals, which account for the sequential nature of the design, can be calculated. We demonstrate that optimal sequential designs can reduce the expected required number of measurements under the null hypothesis, compared to the classical design, by up to 30%, with no cost to the maximal possible required number of measurements. Furthermore, the adjusted analysis procedure almost universally reduces the average bias in the point estimate, and consistently provides a confidence interval with coverage close to the nominal level. In contrast, the coverage of a naive 95% confidence interval is observed to range between 92 and 98%. Methodology is now readily available for the efficient design and analysis of group sequential SW-CRTs. In scenarios in which there are substantial ethical or financial reasons to terminate a SW-CRT as soon as possible, trialists should strongly consider a group sequential approach.
Submitted 26 March, 2018;
originally announced March 2018.
-
Efficient determination of optimised multi-arm multi-stage experimental designs with control of generalised error-rates
Authors:
Michael Grayling,
James Wason,
Adrian Mander
Abstract:
Primarily motivated by the drug development process, several publications have now presented methodology for the design of multi-arm multi-stage experiments with normally distributed outcome variables of known variance. Here, we extend these past considerations to allow the design of what we refer to as an abcd multi-arm multi-stage experiment. We provide a proof of how strong control of the a-generalised type-I familywise error-rate can be ensured. We then describe how to attain the power to reject at least b out of c false hypotheses, which is related to controlling the b-generalised type-II familywise error-rate. Following this, we detail how a design can be optimised for a scenario in which rejection of any d null hypotheses brings about termination of the experiment. We achieve this by proposing a highly computationally efficient approach for evaluating the performance of a candidate design. Finally, using a real clinical trial as a motivating example, we explore the effect of the design's control parameters on the statistical operating characteristics.
Submitted 1 December, 2017;
originally announced December 2017.
-
A two-stage Fisher exact test for multi-arm studies with binary outcome variables
Authors:
Michael Grayling,
Adrian Mander,
James Wason
Abstract:
In small sample studies with binary outcome data, use of a normal approximation for hypothesis testing can lead to substantial inflation of the type-I error-rate. Consequently, exact statistical methods are necessitated, and accordingly, much research has been conducted to facilitate this. Recently, this has included methodology for the design of two-stage multi-arm studies utilising exact binomial tests. These designs were demonstrated to carry substantial efficiency advantages over a fixed sample design, but generally suffered from strong conservatism. An alternative classical means of small sample inference with dichotomous data is Fisher's exact test. However, this method is limited to single-stage designs when there are multiple arms. Therefore, here, we propose a two-stage version of Fisher's exact test, with the potential to stop early to accept or reject null hypotheses, which is applicable to multi-arm studies. In particular, we provide precise formulae describing the requirements for achieving weak or strong control of the familywise error-rate with this design. Following this, we describe how the design parameters may be optimised to confer desirable operating characteristics. For a motivating example based on a phase II clinical trial, we demonstrate that on average our approach is less conservative than corresponding optimal designs based on exact binomial tests.
Submitted 28 November, 2017;
originally announced November 2017.
-
Calculations involving the multivariate normal and multivariate t distributions with and without truncation
Authors:
Michael Grayling,
Adrian Mander
Abstract:
This paper presents a set of Stata commands and Mata functions to evaluate different distributional quantities of the multivariate normal distribution, and a particular type of non-central multivariate t distribution. Specifically, their densities, distribution functions, equicoordinate quantiles, and pseudo-random vectors can be computed efficiently, either in the absence or presence of variable truncation.
Submitted 28 November, 2017;
originally announced November 2017.
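As a rough illustration of what drawing pseudo-random vectors "in the presence of variable truncation" means, the sketch below samples a standard bivariate normal restricted to a rectangular box by simple rejection sampling. It is not the algorithm used by the Stata commands or Mata functions described in the abstract; the function name `sample_truncated_bvn`, the correlation, and the bounds are hypothetical choices made only for this example.

```python
import math
import random


def sample_truncated_bvn(n, rho, lower, upper, seed=0):
    """Draw n points from a standard bivariate normal with correlation rho,
    truncated to the box lower[i] <= x[i] <= upper[i], by rejection sampling."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    samples = []
    while len(samples) < n:
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Cholesky-style transform: (x1, x2) has unit variances and correlation rho.
        x1 = z1
        x2 = rho * z1 + scale * z2
        # Keep the draw only if it falls inside the truncation box.
        if lower[0] <= x1 <= upper[0] and lower[1] <= x2 <= upper[1]:
            samples.append((x1, x2))
    return samples


# Illustrative (assumed) settings: correlation 0.5, box [0, 2] x [-1, 1].
draws = sample_truncated_bvn(1000, rho=0.5, lower=(0.0, -1.0), upper=(2.0, 1.0))
```

Rejection sampling is easy to verify but becomes slow when the truncation region has small probability, which is one reason dedicated routines are generally preferred in practice.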
-
Admissible multi-arm stepped-wedge cluster randomized trial designs
Authors:
Michael Grayling,
Adrian Mander,
James Wason
Abstract:
Numerous publications have now addressed the principles of designing, analyzing, and reporting the results of stepped-wedge cluster randomized trials. In contrast, there is little research available pertaining to the design and analysis of multi-arm stepped-wedge cluster randomized trials, utilized to evaluate the effectiveness of multiple experimental interventions. In this paper, we address this by explaining how the required sample size in these multi-arm trials can be ascertained when data are to be analyzed using a linear mixed model. We then go on to describe how the design of such trials can be optimized to balance between minimizing the cost of the trial, and minimizing some function of the covariance matrix of the treatment effect estimates. Using a recently commenced trial that will evaluate the effectiveness of sensor monitoring in an occupational therapy rehabilitation program for older persons after hip fracture as an example, we demonstrate that our designs could reduce the number of observations required for a fixed power level by up to 58%. Consequently, when logistical constraints permit the utilization of any one of a range of possible multi-arm stepped-wedge cluster randomized trial designs, researchers should consider employing our approach to optimize their trial's efficiency.
Submitted 28 June, 2018; v1 submitted 10 October, 2017;
originally announced October 2017.
-
Group Sequential Crossover Trial Designs with Strong Control of the Familywise Error Rate
Authors:
Michael Grayling,
James Wason,
Adrian Mander
Abstract:
Crossover designs are an extremely useful tool to investigators, whilst group sequential methods have proven highly proficient at improving the efficiency of parallel group trials. Yet, group sequential methods and crossover designs have rarely been paired together. One possible explanation for this could be the absence of a formal proof of how to strongly control the familywise error rate in the case when multiple comparisons will be made. Here, we provide this proof, valid for any number of initial experimental treatments and any number of stages, when results are analysed using a linear mixed model. We then establish formulae for the expected sample size and expected number of observations of such a trial, given any choice of stopping boundaries. Finally, utilising the four-treatment, four-period TOMADO trial as an example, we demonstrate group sequential methods in this setting could have reduced the trial's expected number of observations under the global null hypothesis by over 33%.
Submitted 10 October, 2017;
originally announced October 2017.
-
An optimised multi-arm multi-stage clinical trial design for unknown variance
Authors:
Michael Grayling,
James Wason,
Adrian Mander
Abstract:
Multi-arm multi-stage trial designs can bring notable gains in efficiency to the drug development process. However, for normally distributed endpoints, the determination of a design typically depends on the assumption that the patient variance in response is known. In practice, this will not usually be the case. To allow for unknown variance, previous research explored the performance of t-test statistics, coupled with a quantile substitution procedure for modifying the stopping boundaries, at controlling the familywise error-rate to the nominal level. Here, we discuss an alternative method based on Monte Carlo simulation that allows the group size and stopping boundaries of a multi-arm multi-stage t-test to be optimised according to some nominated optimality criteria. We consider several examples, provide R code for general implementation, and show that our designs confer a familywise error-rate and power close to the desired level. Consequently, this methodology will provide utility in future multi-arm multi-stage trials.
Submitted 10 October, 2017;
originally announced October 2017.
-
Group Sequential Clinical Trial Designs for Normally Distributed Outcome Variables
Authors:
Michael Grayling,
James Wason,
Adrian Mander
Abstract:
In a group sequential clinical trial, accumulated data are analysed at numerous time-points in order to allow early decisions about a hypothesis of interest. These designs have historically been recommended for their ethical, administrative and economic benefits. In this work, we discuss a collection of new Stata commands for computing the stopping boundaries and required group size of various classical group sequential designs, assuming a normally distributed outcome variable. Following this, we demonstrate how the performance of several designs can be compared graphically.
Submitted 28 November, 2017; v1 submitted 9 October, 2017;
originally announced October 2017.
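To make the role of stopping boundaries concrete, the following minimal sketch estimates by Monte Carlo the two-sided type-I error of a two-stage design with equally sized groups and a known-variance normal outcome. It compares a naive boundary of 1.96 at both analyses with the standard Pocock constant for two analyses at the 5% level (approximately 2.178). This is an illustration in Python, not the Stata commands discussed in the abstract, and the function name `two_stage_type_one_error` is hypothetical.

```python
import math
import random


def two_stage_type_one_error(critical_value, n_sims=200_000, seed=1):
    """Monte Carlo estimate of the probability, under the null hypothesis,
    that |Z| exceeds critical_value at either of two equally spaced analyses."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Stage-1 test statistic under the null.
        z1 = rng.gauss(0.0, 1.0)
        # Cumulative stage-2 statistic: equal group sizes give correlation 1/sqrt(2).
        z2 = (z1 + rng.gauss(0.0, 1.0)) / math.sqrt(2.0)
        if abs(z1) > critical_value or abs(z2) > critical_value:
            rejections += 1
    return rejections / n_sims


naive = two_stage_type_one_error(1.96)     # repeated fixed-sample boundary
pocock = two_stage_type_one_error(2.178)   # Pocock constant, 2 analyses, alpha = 0.05
print(f"naive boundary: {naive:.3f}, Pocock boundary: {pocock:.3f}")
```

Reusing the fixed-sample critical value at each analysis inflates the error rate well above 5%, whereas the wider Pocock boundary keeps it close to the nominal level; this is the trade-off that boundary calculations for group sequential designs are meant to manage.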
-
Blinded and unblinded sample size re-estimation procedures for stepped-wedge cluster randomized trials
Authors:
Michael Grayling,
Adrian Mander,
James Wason
Abstract:
The ability to accurately estimate the sample size required by a stepped-wedge (SW) cluster randomized trial (CRT) routinely depends upon the specification of several nuisance parameters. If these parameters are mis-specified, the trial could be over-powered, leading to increased cost, or under-powered, enhancing the likelihood of a false negative. We address this issue here for cross-sectional SW-CRTs, analyzed with a particular linear mixed model, by proposing methods for blinded and unblinded sample size re-estimation (SSRE). Blinded estimators for the variance parameters of a SW-CRT analyzed using the Hussey and Hughes model are derived. Then, procedures for blinded and unblinded SSRE after any time period in a SW-CRT are detailed. The performance of these procedures is then examined and contrasted using two example trial design scenarios. We find that if the two key variance parameters were under-specified by 50%, the SSRE procedures were able to increase power over the conventional SW-CRT design by up to 29%, resulting in an empirical power above the desired level. Moreover, the performance of the re-estimation procedures was relatively insensitive to the timing of the interim assessment. Thus, the considered SSRE procedures can bring substantial gains in power when the underlying variance parameters are mis-specified. Though there are practical issues to consider, the procedures' performance means researchers should consider incorporating SSRE into future SW-CRTs.
Submitted 7 October, 2017;
originally announced October 2017.