Showing 1–22 of 22 results for author: Miller, D

Searching in archive stat.

  1. arXiv:2212.09903  [pdf, ps, other]

    stat.ME stat.AP

    Prognostic Covariate Adjustment for Binary Outcomes Using Stratification

    Authors: Alyssa M. Vanderbeek, Jessica L. Ross, David P. Miller, Alejandro Schuler

    Abstract: Covariate adjustment and methods of incorporating historical data in randomized clinical trials (RCTs) each provide opportunities to increase trial power. We unite these approaches for the analysis of RCTs with binary outcomes based on the Cochran-Mantel-Haenszel (CMH) test for marginal risk ratio (RR). In PROCOVA-CMH, subjects are stratified on a single prognostic covariate reflective of their pr…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 18 pages, 11 tables, appendix

  2. arXiv:2208.04495  [pdf, other]

    stat.ME stat.AP

    Restricted mean survival time estimate using covariate adjusted pseudovalue regression to improve precision

    Authors: Yunfan Li, Jessica L. Ross, Aaron M. Smith, David P. Miller

    Abstract: Covariate adjustment is desired by both practitioners and regulators of randomized clinical trials because it improves precision for estimating treatment effects. However, covariate adjustment presents a particular challenge in time-to-event analysis. We propose to apply covariate adjusted pseudovalue regression to estimate between-treatment difference in restricted mean survival times (RMST). Our…

    Submitted 18 July, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

  3. arXiv:2206.02157  [pdf, other]

    cs.LG stat.ML

    Never mind the metrics -- what about the uncertainty? Visualising confusion matrix metric distributions

    Authors: David Lovell, Dimity Miller, Jaiden Capra, Andrew Bradley

    Abstract: There are strong incentives to build models that demonstrate outstanding predictive performance on various datasets and benchmarks. We believe these incentives risk a narrow focus on models and on the performance metrics used to evaluate and compare them -- resulting in a growing body of literature to evaluate and compare metrics. This paper strives for a more balanced perspective on classifier pe…

    Submitted 5 June, 2022; originally announced June 2022.

    Comments: 60 pages, 45 figures

  4. arXiv:2006.04780  [pdf, other]

    hep-ph cs.LG hep-ex physics.comp-ph stat.ML

    Lorentz Group Equivariant Neural Network for Particle Physics

    Authors: Alexander Bogatskiy, Brandon Anderson, Jan T. Offermann, Marwah Roussi, David W. Miller, Risi Kondor

    Abstract: We present a neural network architecture that is fully equivariant with respect to transformations under the Lorentz group, a fundamental symmetry of space and time in physics. The architecture is based on the theory of the finite-dimensional representations of the Lorentz group and the equivariant nonlinearity involves the tensor product. For classification tasks in particle physics, we demonstra…

    Submitted 8 June, 2020; originally announced June 2020.

  5. Understanding the stochastic partial differential equation approach to smoothing

    Authors: David L Miller, Richard Glennie, Andrew E Seaton

    Abstract: Correlation and smoothness are terms used to describe a wide variety of random quantities. In time, space, and many other domains, they both imply the same idea: quantities that occur closer together are more similar than those further apart. Two popular statistical models that represent this idea are basis-penalty smoothers (Wood, 2017) and stochastic partial differential equations (SPDE) (Lindgr…

    Submitted 9 June, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: 23 pages, 4 figures. JABES (2019)

  6. arXiv:1912.00913  [pdf]

    stat.AP cs.SE

    Automated metrics calculation in a dynamic heterogeneous environment

    Authors: Craig Boucher, Ulf Knoblich, Daniel Miller, Sasha Patotski, Amin Saied, Venky Venkateshaiah

    Abstract: A consistent theme in software experimentation at Microsoft has been solving problems of experimentation at scale for a diverse set of products. Running experiments at scale (i.e., many experiments on many users) has become state of the art across the industry. However, providing a single platform that allows software experimentation in a highly heterogeneous and constantly evolving ecosystem remai…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: 5 pages, MIT Code

  7. arXiv:1911.07970  [pdf, other]

    cs.LG stat.ML

    Revealing Perceptible Backdoors, without the Training Set, via the Maximum Achievable Misclassification Fraction Statistic

    Authors: Zhen Xiang, David J. Miller, Hang Wang, George Kesidis

    Abstract: Recently, a backdoor data poisoning attack was proposed, which adds mislabeled examples to the training set, with an embedded backdoor pattern, aiming to have the classifier learn to classify to a target class whenever the backdoor pattern is present in a test sample. Here, we address post-training detection of innocuous perceptible backdoors in DNN image classifiers, wherein the defender does not…

    Submitted 6 April, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

  8. arXiv:1910.08032  [pdf, other]

    stat.ML cs.LG

    Notes on Margin Training and Margin p-Values for Deep Neural Network Classifiers

    Authors: George Kesidis, David J. Miller, Zhen Xiang

    Abstract: We provide a new local class-purity theorem for Lipschitz continuous DNN classifiers. In addition, we discuss how to achieve classification margin for training samples. Finally, we describe how to compute margin p-values for test samples.

    Submitted 5 December, 2019; v1 submitted 14 October, 2019; originally announced October 2019.

  9. arXiv:1908.10498  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Detection of Backdoors in Trained Classifiers Without Access to the Training Set

    Authors: Zhen Xiang, David J. Miller, George Kesidis

    Abstract: Recently, a special type of data poisoning (DP) attack targeting Deep Neural Network (DNN) classifiers, known as a backdoor, was proposed. These attacks do not seek to degrade classification accuracy, but rather to have the classifier learn to classify to a target class whenever the backdoor pattern is present in a test example. Launching backdoor attacks does not require knowledge of the classifi…

    Submitted 19 August, 2020; v1 submitted 27 August, 2019; originally announced August 2019.

  10. arXiv:1906.04165  [pdf]

    cs.CL cs.LG cs.SD eess.AS stat.ML

    Leveraging BERT for Extractive Text Summarization on Lectures

    Authors: Derek Miller

    Abstract: In the last two decades, automatic extractive text summarization on lectures has demonstrated to be a useful tool for collecting key phrases and sentences that best represent the content. However, many current approaches utilize dated approaches, producing sub-par outputs or requiring several hours of manual tuning to produce meaningful results. Recently, new machine learning architectures have pr…

    Submitted 7 June, 2019; originally announced June 2019.

    Comments: 7 Pages, First Version

  11. arXiv:1904.06292  [pdf, other]

    cs.LG cs.CR stat.ML

    Adversarial Learning in Statistical Classification: A Comprehensive Review of Defenses Against Attacks

    Authors: David J. Miller, Zhen Xiang, George Kesidis

    Abstract: There is great potential for damage from adversarial learning (AL) attacks on machine-learning based systems. In this paper, we provide a contemporary survey of AL, focused particularly on defenses against attacks on statistical classifiers. After introducing relevant terminology and the goals and range of possible knowledge of both attackers and defenders, we survey recent work on test-time evasi…

    Submitted 2 December, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

    Journal ref: Proceedings of the IEEE, March 2020

  12. arXiv:1902.01330  [pdf, ps, other]

    stat.ME

    Bayesian views of generalized additive modelling

    Authors: David L. Miller

    Abstract: Generalized additive models (GAMs) are a commonly used, flexible framework applied to many problems in statistical ecology. GAMs are often considered to be a purely frequentist framework (`generalized linear models with wiggly bits'), however links between frequentist and Bayesian approaches to these models were highlighted early on in the literature. Bayesian thinking underlies many parts of the…

    Submitted 6 October, 2021; v1 submitted 4 February, 2019; originally announced February 2019.

  13. arXiv:1811.02658  [pdf, other]

    cs.CV cs.LG stat.ML

    When Not to Classify: Detection of Reverse Engineering Attacks on DNN Image Classifiers

    Authors: Yujia Wang, David J. Miller, George Kesidis

    Abstract: This paper addresses detection of a reverse engineering (RE) attack targeting a deep neural network (DNN) image classifier; by querying, RE's aim is to discover the classifier's decision rule. RE can enable test-time evasion attacks, which require knowledge of the classifier. Recently, we proposed a quite effective approach (ADA) to detect test-time evasion attacks. In this paper, we extend ADA to…

    Submitted 31 October, 2018; originally announced November 2018.

  14. arXiv:1811.00121  [pdf, other]

    cs.CR cs.LG stat.ML

    A Mixture Model Based Defense for Data Poisoning Attacks Against Naive Bayes Spam Filters

    Authors: David J. Miller, Xinyi Hu, Zhen Xiang, George Kesidis

    Abstract: Naive Bayes spam filters are highly susceptible to data poisoning attacks. Here, known spam sources/blacklisted IPs exploit the fact that their received emails will be treated as (ground truth) labeled spam examples, and used for classifier training (or re-training). The attacking source thus generates emails that will skew the spam model, potentially resulting in great degradation in classifier a…

    Submitted 31 October, 2018; originally announced November 2018.

  15. arXiv:1808.10307  [pdf, other]

    cs.CR cs.LG stat.ML

    Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

    Authors: Cong Liao, Haoti Zhong, Anna Squicciarini, Sencun Zhu, David Miller

    Abstract: Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications including those where security is of great concern. Such popularity, however, may attract attackers to exploit the vulnerabilities of the deployed deep learning model…

    Submitted 30 August, 2018; originally announced August 2018.

  16. Variance propagation for density surface models

    Authors: Mark V Bravington, David L Miller, Sharon L Hedley

    Abstract: Spatially-explicit estimates of population density, together with appropriate estimates of uncertainty, are required in many management contexts. Density Surface Models (DSMs) are a two-stage approach for estimating spatially-varying density from distance-sampling data. First, detection probabilities -- perhaps depending on covariates -- are estimated based on details of individual encounters; nex…

    Submitted 26 December, 2020; v1 submitted 20 July, 2018; originally announced July 2018.

    Comments: 38 pages (incl. supp. mat.), 5 figures

  17. arXiv:1702.05732  [pdf, other]

    physics.comp-ph math.OC physics.data-an stat.AP

    Low-dose cryo electron ptychography via non-convex Bayesian optimization

    Authors: Philipp Michael Pelz, Wen Xuan Qiu, Robert Bücker, Günther Kassier, R. J. Dwayne Miller

    Abstract: Electron ptychography has seen a recent surge of interest for phase sensitive imaging at atomic or near-atomic resolution. However, applications are so far mainly limited to radiation-hard samples because the required doses are too high for imaging biological samples at high resolution. We propose the use of non-convex, Bayesian optimization to overcome this problem and reduce the dose required fo…

    Submitted 19 February, 2017; originally announced February 2017.

  18. ATD: Anomalous Topic Discovery in High Dimensional Discrete Data

    Authors: Hossein Soleimani, David J. Miller

    Abstract: We propose an algorithm for detecting patterns exhibited by anomalous clusters in high dimensional discrete data. Unlike most anomaly detection (AD) methods, which detect individual anomalies, our proposed method detects groups (clusters) of anomalies; i.e. sets of points which collectively exhibit abnormal patterns. In many applications this can lead to better understanding of the nature of the a…

    Submitted 20 May, 2016; v1 submitted 20 December, 2015; originally announced December 2015.

  19. arXiv:1502.02984  [pdf, other]

    stat.ME

    Asymmetric Independence Model for Detecting Interactions between Variables

    Authors: Guoqiang Yu, David J. Miller, Carl D. Langefeld, David M. Herrington, Yue Wang

    Abstract: Detecting complex interactions among risk factors in case-control studies is a fundamental task in clinical and population research. However, though hypothesis testing using logistic regression (LR) is a convenient solution, the LR framework is poorly powered and ill-suited under several common circumstances in practice including missing or unmeasured risk factors, imperfectly correlated "surrogat…

    Submitted 10 February, 2015; originally announced February 2015.

  20. arXiv:1406.7349  [pdf]

    stat.ML q-bio.QM

    Convex Analysis of Mixtures for Separating Non-negative Well-grounded Sources

    Authors: Yitan Zhu, Niya Wang, David J. Miller, Yue Wang

    Abstract: Blind Source Separation (BSS) has proven to be a powerful tool for the analysis of composite patterns in engineering and science. We introduce Convex Analysis of Mixtures (CAM) for separating non-negative well-grounded sources, which learns the mixing matrix by identifying the lateral edges of the convex data scatter plot. We prove a sufficient and necessary condition for identifying the mixing ma…

    Submitted 10 December, 2015; v1 submitted 27 June, 2014; originally announced June 2014.

    Comments: 15 pages, 9 figures, 2 tables

  21. arXiv:1402.0136  [pdf, other]

    stat.AP q-bio.QM stat.ME

    IsoDOT Detects Differential RNA-isoform Expression/Usage with respect to a Categorical or Continuous Covariate with High Sensitivity and Specificity

    Authors: Wei Sun, Yufeng Liu, James J. Crowley, Ting-Huei Chen, Hua Zhou, Haitao Chu, Shunping Huang, Pei-Fen Kuan, Yuan Li, Darla Miller, Ginger Shaw, Yichao Wu, Vasyl Zhabotynsky, Leonard McMillan, Fei Zou, Patrick F. Sullivan, Fernando Pardo-Manuel de Villena

    Abstract: We have developed a statistical method named IsoDOT to assess differential isoform expression (DIE) and differential isoform usage (DIU) using RNA-seq data. Here isoform usage refers to relative isoform expression given the total expression of the corresponding gene. IsoDOT performs two tasks that cannot be accomplished by existing methods: to test DIE/DIU with respect to a continuous covariate, a…

    Submitted 29 October, 2014; v1 submitted 1 February, 2014; originally announced February 2014.

  22. arXiv:1401.6169  [pdf, ps, other]

    cs.LG cs.CL cs.IR stat.ML

    Parsimonious Topic Models with Salient Word Discovery

    Authors: Hossein Soleimani, David J. Miller

    Abstract: We propose a parsimonious topic model for text corpora. In related models such as Latent Dirichlet Allocation (LDA), all words are modeled topic-specifically, even though many words occur with similar frequencies across different topics. Our modeling determines salient words for each topic, which have topic-specific probabilities, with the rest explained by a universal shared model. Further, in LD…

    Submitted 11 September, 2014; v1 submitted 22 January, 2014; originally announced January 2014.

    ACM Class: I.7.0; I.5.3; G.3; I.5.2

    Journal ref: IEEE Transactions on Knowledge and Data Engineering, 27 (2015) 824-837