-
Borrowing from historical control data in a Bayesian time-to-event model with flexible baseline hazard function
Authors:
Darren A. V. Scott,
Alex Lewin
Abstract:
There is currently a focus on statistical methods which can use historical trial information to help accelerate the discovery, development and delivery of medicine. Bayesian methods can be constructed so that the borrowing is "dynamic" in the sense that the similarity of the data helps to determine how much information is used. In the time-to-event setting with one historical data set, a popular model for a range of baseline hazards is the piecewise exponential model where the time points are fixed and a borrowing structure is imposed on the model. Although convenient for implementation, this approach affects the borrowing capability of the model. We propose a Bayesian model which allows the time points to vary and a dependency to be placed between the baseline hazards. This serves to smooth the posterior baseline hazard, improving both model estimation and borrowing characteristics. We explore a variety of prior structures for the borrowing within our proposed model and assess their performance against established approaches. We demonstrate that this leads to improved type I error in the presence of prior data conflict and increased power. We have developed accompanying software which is freely available and enables easy implementation of the approach.
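As a rough illustration of the piecewise exponential model mentioned in this abstract, the sketch below evaluates the survival function for a baseline hazard that is constant between fixed cut points. It covers only the fixed-time-point case, not the authors' proposed model or their software; the function name `piecewise_exp_survival` and its arguments are hypothetical.

```python
import numpy as np

def piecewise_exp_survival(t, cuts, hazards):
    """Survival probability S(t) under a piecewise-constant hazard.

    cuts    : increasing interior cut points s_1 < ... < s_{J-1} (assumed fixed)
    hazards : J hazard rates, one per interval [0, s_1), [s_1, s_2), ..., [s_{J-1}, inf)
    """
    cuts = np.asarray(cuts, dtype=float)
    hazards = np.asarray(hazards, dtype=float)
    # Interval edges, with the last interval open-ended.
    edges = np.concatenate(([0.0], cuts, [np.inf]))
    widths = edges[1:] - edges[:-1]
    # Time spent at risk in each interval up to t.
    exposure = np.clip(t - edges[:-1], 0.0, widths)
    # S(t) = exp(-cumulative hazard), cumulative hazard = sum_j lambda_j * exposure_j.
    return float(np.exp(-np.sum(hazards * exposure)))
```

With one cut point at 1.0 and hazards 0.5 and 2.0, the cumulative hazard at time 2.0 is 0.5 * 1.0 + 2.0 * 1.0, so the hazard in each interval only contributes for the time actually spent there.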
Submitted 23 February, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
An efficient deep neural network to find small objects in large 3D images
Authors:
Jungkyu Park,
Jakub Chłędowski,
Stanisław Jastrzębski,
Jan Witowski,
Yanqi Xu,
Linda Du,
Sushma Gaddam,
Eric Kim,
Alana Lewin,
Ujas Parikh,
Anastasia Plaunova,
Sardius Chen,
Alexandra Millet,
James Park,
Kristine Pysarenko,
Shalin Patel,
Julia Goldberg,
Melanie Wegener,
Linda Moy,
Laura Heacock,
Beatriu Reig,
Krzysztof J. Geras
Abstract:
3D imaging enables accurate diagnosis by providing spatial information about organ anatomy. However, using 3D images to train AI models is computationally challenging because they consist of 10x or 100x more pixels than their 2D counterparts. To be trained with high-resolution 3D images, convolutional neural networks resort to downsampling them or projecting them to 2D. We propose an effective alternative, a neural network that enables efficient classification of full-resolution 3D medical images. Compared to off-the-shelf convolutional neural networks, our network, 3D Globally-Aware Multiple Instance Classifier (3D-GMIC), uses 77.98%-90.05% less GPU memory and 91.23%-96.02% less computation. While it is trained only with image-level labels, without segmentation labels, it explains its predictions by providing pixel-level saliency maps. On a dataset collected at NYU Langone Health, including 85,526 patients with full-field 2D mammography (FFDM), synthetic 2D mammography, and 3D mammography, 3D-GMIC achieves an AUC of 0.831 (95% CI: 0.769-0.887) in classifying breasts with malignant findings using 3D mammography. This is comparable to the performance of GMIC on FFDM (0.816, 95% CI: 0.737-0.878) and synthetic 2D (0.826, 95% CI: 0.754-0.884), which demonstrates that 3D-GMIC successfully classified large 3D images despite focusing computation on a smaller percentage of its input compared to GMIC. Therefore, 3D-GMIC identifies and utilizes extremely small regions of interest from 3D images consisting of hundreds of millions of pixels, dramatically reducing associated computational challenges. 3D-GMIC generalizes well to BCS-DBT, an external dataset from Duke University Hospital, achieving an AUC of 0.848 (95% CI: 0.798-0.896).
Submitted 26 February, 2023; v1 submitted 16 October, 2022;
originally announced October 2022.
-
BayesSUR: An R package for high-dimensional multivariate Bayesian variable and covariance selection in linear regression
Authors:
Zhi Zhao,
Marco Banterle,
Leonardo Bottolo,
Sylvia Richardson,
Alex Lewin,
Manuela Zucknick
Abstract:
In molecular biology, advances in high-throughput technologies have made it possible to study complex multivariate phenotypes and their simultaneous associations with high-dimensional genomic and other omics data, a problem that can be studied with high-dimensional multi-response regression, where the response variables are potentially highly correlated. To this purpose, we recently introduced several multivariate Bayesian variable and covariance selection models, e.g., Bayesian estimation methods for sparse seemingly unrelated regression for variable and covariance selection. Several variable selection priors have been implemented in this context, in particular the hotspot detection prior for latent variable inclusion indicators, which results in sparse variable selection for associations between predictors and multiple phenotypes. We also propose an alternative, which uses a Markov random field (MRF) prior for incorporating prior knowledge about the dependence structure of the inclusion indicators. Inference of Bayesian seemingly unrelated regression (SUR) by Markov chain Monte Carlo methods is made computationally feasible by factorisation of the covariance matrix amongst the response variables. In this paper we present BayesSUR, an R package, which allows the user to easily specify and run a range of different Bayesian SUR models, which have been implemented in C++ for computational efficiency. The R package allows the specification of the models in a modular way, where the user chooses the priors for variable selection and for covariance selection separately. We demonstrate the performance of sparse SUR models with the hotspot prior and spike-and-slab MRF prior on synthetic and real data sets representing eQTL or mQTL studies and in vitro anti-cancer drug screening studies as examples for typical applications.
Submitted 28 April, 2021;
originally announced April 2021.
-
Multivariate Bayesian structured variable selection for pharmacogenomic studies
Authors:
Zhi Zhao,
Marco Banterle,
Alex Lewin,
Manuela Zucknick
Abstract:
Precision cancer medicine aims to determine the optimal treatment for each patient. In-vitro cancer drug sensitivity screens combined with multi-omics characterization of the cancer cells have become an important tool to achieve this aim. Analyzing such pharmacogenomic studies requires flexible and efficient joint statistical models for associating drug sensitivity with high-dimensional multi-omics data. We propose a multivariate Bayesian structured variable selection model for sparse identification of omics features associated with multiple correlated drug responses. Since many anti-cancer drugs are designed for specific molecular targets, our approach makes use of known structure between responses and predictors, e.g. molecular pathways and related omics features targeted by specific drugs, via a Markov random field (MRF) prior for the latent indicator variables of the coefficients in sparse seemingly unrelated regression. The structure information included in the MRF prior can improve the model performance, i.e. variable selection and response prediction, compared to other common priors. In addition, we employ random effects to capture heterogeneity between cancer types in a pan-cancer setting. The proposed approach is validated by simulation studies and applied to the Genomics of Drug Sensitivity in Cancer data, which includes pharmacological profiling and multi-omics characterization of a large set of heterogeneous cell lines.
Submitted 13 February, 2023; v1 submitted 14 January, 2021;
originally announced January 2021.
-
Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening
Authors:
Nan Wu,
Jason Phang,
Jungkyu Park,
Yiqiu Shen,
Zhe Huang,
Masha Zorin,
Stanisław Jastrzębski,
Thibault Févry,
Joe Katsnelson,
Eric Kim,
Stacey Wolfson,
Ujas Parikh,
Sushma Gaddam,
Leng Leng Young Lin,
Kara Ho,
Joshua D. Weinstein,
Beatriu Reig,
Yiming Gao,
Hildegard Toth,
Kristine Pysarenko,
Alana Lewin,
Jiyon Lee,
Krystal Airola,
Eralda Mema,
Stephanie Chung,
et al. (7 additional authors not shown)
Abstract:
We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting whether there is a cancer in the breast, when tested on the screening population. We attribute the high accuracy of our model to a two-stage training procedure, which allows us to use a very high-capacity patch-level network to learn from pixel-level labels alongside a network learning from macroscopic breast-level labels. To validate our model, we conducted a reader study with 14 readers, each reading 720 screening mammogram exams, and find our model to be as accurate as experienced radiologists when presented with the same data. Finally, we show that a hybrid model, averaging probability of malignancy predicted by a radiologist with a prediction of our neural network, is more accurate than either of the two separately. To better understand our results, we conduct a thorough analysis of our network's performance on different subpopulations of the screening population, model design, training procedure, errors, and properties of its internal representations.
Submitted 19 March, 2019;
originally announced March 2019.
-
Optimal whitening and decorrelation
Authors:
Agnan Kessy,
Alex Lewin,
Korbinian Strimmer
Abstract:
Whitening, or sphering, is a common preprocessing step in statistical analysis to transform random variables to orthogonality. However, due to rotational freedom there are infinitely many possible whitening procedures. Consequently, there is a diverse range of sphering methods in use, for example based on principal component analysis (PCA), Cholesky matrix decomposition and zero-phase component analysis (ZCA), among others.
Here we provide an overview of the underlying theory and discuss five natural whitening procedures. Subsequently, we demonstrate that investigating the cross-covariance and the cross-correlation matrix between sphered and original variables allows us to break the rotational invariance and to identify optimal whitening transformations. As a result, we recommend two particular approaches: ZCA-cor whitening to produce sphered variables that are maximally similar to the original variables, and PCA-cor whitening to obtain sphered variables that maximally compress the original variables.
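To make the idea of whitening concrete, here is a minimal NumPy sketch of two of the standard procedures named in this abstract, plain ZCA and plain PCA whitening, both built from the eigendecomposition of the sample covariance. It does not implement the recommended ZCA-cor or PCA-cor variants; the function names and the small `eps` stabiliser are assumptions for illustration.

```python
import numpy as np

def _cov_eig(X):
    """Centre the data and eigendecompose its sample covariance."""
    Xc = X - X.mean(axis=0)
    sigma = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(sigma)  # sigma = U diag(evals) U^T
    return Xc, evals, evecs

def zca_whiten(X, eps=1e-12):
    """ZCA whitening: W = U diag(evals)^{-1/2} U^T (a symmetric matrix)."""
    Xc, evals, evecs = _cov_eig(X)
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return Xc @ W.T, W

def pca_whiten(X, eps=1e-12):
    """PCA whitening: W = diag(evals)^{-1/2} U^T (rotate, then rescale)."""
    Xc, evals, evecs = _cov_eig(X)
    W = np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return Xc @ W.T, W
```

Both matrices satisfy W Σ Wᵀ = I up to the `eps` term, and they differ only by a rotation, which is the rotational freedom the abstract refers to.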
Submitted 17 December, 2016; v1 submitted 2 December, 2015;
originally announced December 2015.
-
Tuning the critical temperature of cuprate superconductor films using self-assembled organic layers
Authors:
I. Carmeli,
A. Lewin,
E. Flekser,
I. Diamant,
Q. Zhang,
J. Shen,
M. Gozin,
S. Richter,
Y. Dagan
Abstract:
Many of the electronic properties of high-temperature cuprate superconductors (HTSC) are strongly dependent on the number of charge carriers put into the CuO$_2$ planes (doping). Superconductivity appears over a dome-shaped region of the doping-temperature phase diagram. The highest critical temperature (Tc) is obtained for the so-called "optimum doping". The doping mechanism is usually chemical; it can be done by cationic substitution. This is the case, for example, in La$_{2-x}$Sr$_x$CuO$_4$ where La$^{3+}$ is replaced by Sr$^{2+}$, thus adding a hole to the CuO$_2$ planes. A similar effect is achieved by adding oxygen as in the case of YBa$_2$Cu$_3$O$_{6+δ}$ where $δ$ represents the excess oxygen in the sample. In this paper we report on a different mechanism, one that enables the addition or removal of carriers from the surface of the HTSC. This method utilizes a self-assembled monolayer (SAM) of polar molecules adsorbed on the cuprate surface. In the case of optically active molecules, the polarity of the SAM can be modulated by shining light on the coated surface. This results in a light-induced modulation of the superconducting phase transition of the sample. The ability to control the superconducting transition temperature with the use of SAMs makes these surfaces practical for various devices such as switches and detectors based on high-Tc superconductors.
Submitted 15 January, 2014;
originally announced January 2014.
-
Can inflationary models of cosmic perturbations evade the secondary oscillation test?
Authors:
Alex Lewin,
Andreas Albrecht
Abstract:
We consider the consequences of an observed Cosmic Microwave Background (CMB) temperature anisotropy spectrum containing no secondary oscillations. While such a spectrum is generally considered to be a robust signature of active structure formation, we show that such a spectrum can be produced by (very unusual) inflationary models or other passive evolution models. However, we show that for all these passive models the characteristic oscillations would show up in other observable spectra. Our work shows that when CMB polarization and matter power spectra are taken into account, secondary oscillations are indeed a signature of even these very exotic passive models. We construct a measure of the observability of secondary oscillations in a given experiment, and show that even with foregrounds both the MAP and Planck satellites should be able to distinguish between models with and without oscillations. Thus we conclude that inflationary and other passive models cannot evade the secondary oscillation test.
Submitted 25 April, 2001; v1 submitted 6 August, 1999;
originally announced August 1999.
-
A new statistic for picking out Non-Gaussianity in the CMB
Authors:
Alex Lewin,
Andreas Albrecht,
Joao Magueijo
Abstract:
In this paper we propose a new statistic capable of detecting non-Gaussianity in the CMB. The statistic is defined in Fourier space, and therefore naturally separates angular scales. It consists of taking another Fourier transform, in angle, over the Fourier modes within a given ring of scales. Like other Fourier space statistics, our statistic outdoes more conventional methods when faced with combinations of Gaussian processes (be they noise or signal) and a non-Gaussian signal which dominates only on some scales. However, unlike previous efforts along these lines, our statistic is successful in recognizing multiple non-Gaussian patterns in a single field. We discuss various applications, in which the Gaussian component may be noise or primordial signal, and the non-Gaussian component may be a cosmic string map, or some geometrical construction mimicking, say, small scale dust maps.
Submitted 9 April, 1999; v1 submitted 27 April, 1998;
originally announced April 1998.
-
Non-Gaussian spectra and the search for cosmic strings
Authors:
Joao Magueijo,
Alex Lewin
Abstract:
We present a new tool for relating theory and experiment suited for non-Gaussian theories: non-Gaussian spectra. It does for non-Gaussian theories what the angular power spectrum $C_\ell$ does for Gaussian theories. We then show how previous studies of cosmic strings have overrated their non-Gaussian signature. More realistic maps are not visually stringy. However, non-Gaussian spectra will reveal their stringiness. We finally summarise the steps of an ongoing experimental project aiming at searching for cosmic strings by means of this technique.
Submitted 14 February, 1997;
originally announced February 1997.