Showing 1–11 of 11 results for author: Noble, W S

  1. arXiv:2309.15319  [pdf, other]

    cs.LG q-bio.QM

    DeepROCK: Error-controlled interaction detection in deep neural networks

    Authors: Winston Chen, William Stafford Noble, Yang Young Lu

    Abstract: The complexity of deep neural networks (DNNs) makes them powerful but also makes them challenging to interpret, hindering their applicability in error-intolerant domains. Existing methods attempt to reason about the internal mechanism of DNNs by identifying feature interactions that influence prediction outcomes. However, such methods typically lack a systematic strategy to prioritize interactions…

    Submitted 26 September, 2023; originally announced September 2023.

  2. arXiv:2302.11837  [pdf, other]

    stat.ME

    Bounding the FDP in competition-based control of the FDR

    Authors: Arya Ebadi, Dong Luo, Jack Freestone, William Stafford Noble, Uri Keich

    Abstract: Competition-based approach to controlling the false discovery rate (FDR) recently rose to prominence when, generalizing it to sequential hypothesis testing, Barber and Candès used it as part of their knockoff-filter. Control of the FDR implies that the, arguably more important, false discovery proportion is only controlled in an average sense. We present TDC-SB and TDC-UB that provide upper predic…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: The original version of this paper appeared as arxiv:2011.11939v1. That version was split into two: one branch continuing as v2 & v3 of that original submission, and the other branch is now added here as a new submission

  3. arXiv:2011.11939  [pdf, other]

    stat.ME stat.AP

    Competition-based control of the false discovery proportion

    Authors: Dong Luo, Arya Ebadi, Yilun He, Kristen Emery, William Stafford Noble, Uri Keich

    Abstract: Recently, Barber and Candès laid the theoretical foundation for a general framework for false discovery rate (FDR) control based on the notion of "knockoffs." A closely related FDR control methodology has long been employed in the analysis of mass spectrometry data, referred to there as "target-decoy competition" (TDC). However, any approach that aims to control the FDR, which is defined as the ex…

    Submitted 14 March, 2022; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: This revision focuses only on FDP-SD described in the original submission. A later submission will further develop the procedures for simultaneous bounds on the FDP

  4. arXiv:2002.00526  [pdf, other]

    cs.LG stat.ML

    DANCE: Enhancing saliency maps using decoys

    Authors: Yang Lu, Wenbo Guo, Xinyu Xing, William Stafford Noble

    Abstract: Saliency methods can make deep neural network predictions more interpretable by identifying a set of critical features in an input sample, such as pixels that contribute most strongly to a prediction made by an image classifier. Unfortunately, recent evidence suggests that many saliency methods perform poorly, especially in situations where gradients are saturated, inputs contain adversarial pertu…

    Submitted 14 June, 2021; v1 submitted 2 February, 2020; originally announced February 2020.

  5. arXiv:1907.01458  [pdf, other]

    stat.ME stat.AP

    Multiple competition-based FDR control for peptide detection

    Authors: Kristen Emery, Syamand Hasam, William Stafford Noble, Uri Keich

    Abstract: Competition-based FDR control has been commonly used for over a decade in the computational mass spectrometry community (Elias and Gygi, 2007). Recently, the approach has gained significant popularity in other fields after Barber and Candes (2015) laid its theoretical foundation in a more general setting that included the feature selection problem. In both cases, the competition is based on a head…

    Submitted 13 November, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

    Comments: Numerous changes from the initial submission including an expanded section on peptide detection (context/motivation and results), refocused and streamlined methods development section, revised and more selective figures reflecting the most recent analysis

  6. arXiv:1906.03543  [pdf, other]

    cs.LG stat.ML

    apricot: Submodular selection for data summarization in Python

    Authors: Jacob Schreiber, Jeffrey Bilmes, William Stafford Noble

    Abstract: We present apricot, an open source Python package for selecting representative subsets from large data sets using submodular optimization. The package implements an efficient greedy selection algorithm that offers strong theoretical guarantees on the quality of the selected set. Two submodular set functions are implemented in apricot: facility location, which is broadly applicable but requires mem…

    Submitted 8 June, 2019; originally announced June 2019.

  7. arXiv:1809.01185  [pdf, other]

    cs.LG stat.ML

    DeepPINK: reproducible feature selection in deep neural networks

    Authors: Yang Young Lu, Yingying Fan, Jinchi Lv, William Stafford Noble

    Abstract: Deep learning has become increasingly popular in both supervised and unsupervised machine learning thanks to its outstanding empirical performance. However, because of their intrinsic complexity, most deep learning methods are largely treated as black box tools with little interpretability. Even though recent attempts have been made to facilitate the interpretability of deep neural networks (DNNs)…

    Submitted 6 September, 2018; v1 submitted 4 September, 2018; originally announced September 2018.

  8. arXiv:1410.7875  [pdf, other]

    q-bio.MN stat.ML

    Faster graphical model identification of tandem mass spectra using peptide word lattices

    Authors: Shengjie Wang, John T. Halloran, Jeff A. Bilmes, William S. Noble

    Abstract: Liquid chromatography coupled with tandem mass spectrometry, also known as shotgun proteomics, is a widely-used high-throughput technology for identifying proteins in complex biological samples. Analysis of the tens of thousands of fragmentation spectra produced by a typical shotgun proteomics experiment begins by assigning to each observed spectrum the peptide hypothesized to be responsible for g…

    Submitted 29 October, 2014; originally announced October 2014.

  9. arXiv:1210.4904  [pdf]

    cs.CE q-bio.QM

    Spectrum Identification using a Dynamic Bayesian Network Model of Tandem Mass Spectra

    Authors: Ajit P. Singh, John Halloran, Jeff A. Bilmes, Katrin Kirchoff, William S. Noble

    Abstract: Shotgun proteomics is a high-throughput technology used to identify unknown proteins in a complex mixture. At the heart of this process is a prediction task, the spectrum identification problem, in which each fragmentation spectrum produced by a shotgun proteomics experiment must be mapped to the peptide (protein subsequence) which generated the spectrum. We propose a new algorithm for spectrum id…

    Submitted 16 October, 2012; originally announced October 2012.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-775-785

  10. arXiv:1207.5848  [pdf, other]

    q-bio.QM q-bio.BM

    On the feasibility and utility of exploiting real time database search to improve adaptive peak selection

    Authors: Benjamin J. Diament, Michael J. MacCoss, William Stafford Noble

    Abstract: Rationale: In a shotgun proteomics experiment with data-dependent acquisition, real-time analysis of a precursor scan results in selection of a handful of peaks for subsequent isolation, fragmentation and secondary scanning. This peak selection protocol typically focuses on the most abundant peaks in the precursor scan, while attempting to avoid re-sampling the same m/z values in rapid succession.…

    Submitted 24 July, 2012; originally announced July 2012.

  11. arXiv:q-bio/0610040  [pdf, ps, other]

    q-bio.QM cs.LG

    Metric learning pairwise kernel for graph inference

    Authors: Jean-Philippe Vert, Jian Qiu, William Stafford Noble

    Abstract: Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression,…

    Submitted 21 October, 2006; originally announced October 2006.