-
Ranking by Lifts: A Cost-Benefit Approach to Large-Scale A/B Tests
Authors:
Pallavi Basu,
Ron Berman
Abstract:
A/B testers conducting large-scale tests prioritize lifts and want to be able to control false rejections of the null. This work develops a decision-theoretic framework for maximizing profits subject to false discovery rate (FDR) control. We build an empirical Bayes solution for the problem via the greedy knapsack approach. We derive an oracle rule based on ranking the ratio of expected lifts and the cost of wrong rejections using the local false discovery rate (lfdr) statistic. Our oracle decision rule is valid and optimal for large-scale tests. Further, we establish asymptotic validity for the data-driven procedure and demonstrate finite-sample validity in experimental studies. We also demonstrate the merit of the proposed method over other FDR control methods. Finally, we discuss an application to actual Optimizely experiments.
Submitted 1 July, 2024;
originally announced July 2024.
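To make the ranking idea in the abstract above concrete, here is a minimal sketch of a greedy selection under an lfdr-based FDR constraint. It is not the authors' procedure: the score used for ranking (expected lift divided by lfdr), the stopping rule (stop at the first test that would push the running mean lfdr above `alpha`), and the names `greedy_fdr_select`, `lifts`, `lfdr`, and `alpha` are all assumptions made for illustration.

```python
def greedy_fdr_select(lifts, lfdr, alpha, eps=1e-12):
    """Greedy sketch: rank tests by (expected lift) / (lfdr) and reject in
    that order while the running mean lfdr of rejected tests stays <= alpha.

    lifts : estimated lift of each test (hypothetical input)
    lfdr  : local false discovery rate of each test, in [0, 1]
    alpha : target FDR level
    Returns the indices of rejected tests, in ranked order.
    """
    n = len(lifts)

    # Assumed score: expected benefit (1 - lfdr) * lift per unit of
    # expected false-rejection "cost" (lfdr). eps avoids division by zero.
    def score(i):
        return (1.0 - lfdr[i]) * lifts[i] / max(lfdr[i], eps)

    order = sorted(range(n), key=score, reverse=True)

    selected = []
    lfdr_sum = 0.0
    for i in order:
        # The mean lfdr of the rejected set estimates its FDR.
        if (lfdr_sum + lfdr[i]) / (len(selected) + 1) <= alpha:
            selected.append(i)
            lfdr_sum += lfdr[i]
        else:
            break  # assumed stopping rule: halt at the first violation
    return selected


if __name__ == "__main__":
    picks = greedy_fdr_select(
        lifts=[1.0, 2.0, 0.5], lfdr=[0.01, 0.2, 0.04], alpha=0.05
    )
    print(picks)
```

In this toy input the test with the largest lift is not rejected, because its lfdr is too high for the budget; that trade-off between lift and false-rejection risk is the point of ranking by a ratio rather than by lift alone.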
-
Ensemble equivalence for mean field models and plurisubharmonicity
Authors:
Robert J. Berman
Abstract:
We show that entropy is globally concave with respect to energy for a rich class of mean field interactions, including regularizations of the point-vortex model in the plane, plasmas and self-gravitating matter in 2D, as well as the higher dimensional logarithmic interactions appearing in conformal geometry and power laws. The proofs are based on a corresponding "microscopic" concavity result at finite N, shown by leveraging an unexpected link to Kähler geometry and plurisubharmonic functions. Under more restrictive homogeneity assumptions, strict concavity is obtained using a uniqueness result for free energy minimizers, established in a companion paper. The results imply that thermodynamic equivalence of ensembles holds for this class of mean field models. As an application, it is shown that the critical inverse negative temperatures, in the macroscopic as well as the microscopic setting, coincide with the asymptotic slope of the corresponding microcanonical entropies. Along the way we also extend previous results on the thermodynamic equivalence of ensembles for continuous weakly positive definite interactions, concerning positive temperature states, to the general non-continuous case. In particular, singular situations are exhibited where, somewhat surprisingly, thermodynamic equivalence of ensembles fails at energy levels sufficiently close to the minimum energy level.
Submitted 19 April, 2021;
originally announced April 2021.
-
On the Number of Factorizations of Polynomials over Finite Fields
Authors:
Rachel N. Berman,
Ron M. Roth
Abstract:
Motivated by coding applications, two enumeration problems are considered: the number of distinct divisors of a degree-m polynomial over F = GF(q), and the number of ways a polynomial can be written as a product of two polynomials of degree at most n over F. For the two problems, bounds are obtained on the maximum number of factorizations, and a characterization is presented for polynomials attaining that maximum. Finally, expressions are presented for the average and the variance of the number of factorizations, for any given m (respectively, n).
Submitted 8 April, 2021; v1 submitted 6 April, 2020;
originally announced April 2020.
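The first enumeration problem in the abstract above, counting the distinct divisors of a polynomial over GF(q), can be illustrated by brute force in the smallest case. The sketch below assumes q = 2 and encodes a polynomial as a Python integer whose bit i is the coefficient of x^i; the function names are hypothetical and the approach is exhaustive search, not the bounds or characterizations derived in the paper.

```python
def degree(p):
    """Degree of a GF(2) polynomial stored as an int (degree(0) is -1)."""
    return p.bit_length() - 1


def poly_mod(a, b):
    """Remainder of a divided by b over GF(2); b must be nonzero."""
    db = degree(b)
    while a and degree(a) >= db:
        # Cancel the leading term of a with a shifted copy of b.
        a ^= b << (degree(a) - db)
    return a


def count_divisors(f):
    """Count distinct divisors of a nonzero polynomial f over GF(2).

    Every nonzero polynomial over GF(2) is monic, so each candidate of
    degree <= deg(f) is tried exactly once.
    """
    return sum(
        1 for d in range(1, 1 << (degree(f) + 1)) if poly_mod(f, d) == 0
    )


if __name__ == "__main__":
    print(count_divisors(0b101))   # x^2 + 1 = (x + 1)^2
    print(count_divisors(0b1001))  # x^3 + 1 = (x + 1)(x^2 + x + 1)
```

For a polynomial with factorization into distinct irreducibles p_i with multiplicities e_i, the count equals the product of (e_i + 1), which the two examples reproduce: 3 for (x + 1)^2 and 4 for (x + 1)(x^2 + x + 1).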
-
Towards Automated Melanoma Detection with Deep Learning: Data Purification and Augmentation
Authors:
Devansh Bisla,
Anna Choromanska,
Jennifer A. Stein,
David Polsky,
Russell Berman
Abstract:
Melanoma is one of the ten most common cancers in the US. Early detection is crucial for survival, but often the cancer is diagnosed in the fatal stage. Deep learning has the potential to improve cancer detection rates, but its applicability to melanoma detection is compromised by the limitations of the available skin lesion databases, which are small, heavily imbalanced, and contain images with occlusions. We build deep-learning-based tools for data purification and augmentation to counteract these limitations. The developed tools can be utilized in a deep learning system for lesion classification, and we show how to build such a system. The system relies heavily on the processing unit for removing image occlusions and on the data generation unit, based on generative adversarial networks, for populating scarce lesion classes, or equivalently creating virtual patients with pre-defined types of lesions. We empirically verify our approach and show that incorporating these two units into a melanoma detection system results in superior performance over common baselines.
Submitted 14 May, 2019; v1 submitted 16 February, 2019;
originally announced February 2019.