Showing 1–28 of 28 results for author: Schwab, J

  1. arXiv:2312.05461  [pdf, other

    cs.LG cs.AI

    STREAMLINE: An Automated Machine Learning Pipeline for Biomedicine Applied to Examine the Utility of Photography-Based Phenotypes for OSA Prediction Across International Sleep Centers

    Authors: Ryan J. Urbanowicz, Harsh Bandhey, Brendan T. Keenan, Greg Maislin, Sy Hwang, Danielle L. Mowery, Shannon M. Lynch, Diego R. Mazzotti, Fang Han, Qing Yun Li, Thomas Penzel, Sergio Tufik, Lia Bittencourt, Thorarinn Gislason, Philip de Chazal, Bhajan Singh, Nigel McArdle, Ning-Hung Chen, Allan Pack, Richard J. Schwab, Peter A. Cistulli, Ulysses J. Magalang

    Abstract: While machine learning (ML) includes a valuable array of tools for analyzing biomedical data, significant time and expertise is required to assemble effective, rigorous, and unbiased pipelines. Automated ML (AutoML) tools seek to facilitate ML application by automating a subset of analysis pipeline elements. In this study we develop and validate a Simple, Transparent, End-to-end Automated Machine… ▽ More

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 23 pages, 7 figures, 1 table, 1 supplemental information document (77 pages), and 7 ancillary files

  2. arXiv:2309.14047  [pdf, other

    cond-mat.dis-nn cond-mat.stat-mech cs.CR cs.IT

    Random-Energy Secret Sharing via Extreme Synergy

    Authors: Vudtiwat Ngampruetikorn, David J. Schwab

    Abstract: The random-energy model (REM), a solvable spin-glass model, has impacted an incredibly diverse set of problems, from protein folding to combinatorial optimization to many-body localization. Here, we explore a new connection to secret sharing. We formulate a secret-sharing scheme, based on the REM, and analyze its information-theoretic properties. Our analyses reveal that the correlations between s… ▽ More

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 6 pages, 5 figures

  3. arXiv:2309.10511  [pdf, ps, other

    cs.CV cs.LG math.OC

    Self2Seg: Single-Image Self-Supervised Joint Segmentation and Denoising

    Authors: Nadja Gruber, Johannes Schwab, Noémie Debroux, Nicolas Papadakis, Markus Haltmeier

    Abstract: We develop Self2Seg, a self-supervised method for the joint segmentation and denoising of a single image. To this end, we combine the advantages of variational segmentation with self-supervised deep learning. One major benefit of our method lies in the fact, that in contrast to data-driven methods, where huge amounts of labeled samples are necessary, Self2Seg segments an image into meaningful regi… ▽ More

    Submitted 29 April, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    MSC Class: 65K10; 68U05; 68U15; 68T07; 68U10

  4. arXiv:2303.17762  [pdf, other

    cs.IT cond-mat.stat-mech cs.LG physics.data-an q-bio.QM

    Generalized Information Bottleneck for Gaussian Variables

    Authors: Vudtiwat Ngampruetikorn, David J. Schwab

    Abstract: The information bottleneck (IB) method offers an attractive framework for understanding representation learning, however its applications are often limited by its computational intractability. Analytical characterization of the IB method is not only of practical interest, but it can also lead to new insights into learning phenomena. Here we consider a generalized IB problem, in which the mutual in… ▽ More

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: 7 pages, 3 figures

  5. arXiv:2302.02214  [pdf, other

    cs.CV math.FA math.NA

    Variational multichannel multiclass segmentation using unsupervised lifting with CNNs

    Authors: Nadja Gruber, Johannes Schwab, Sebastien Court, Elke Gizewski, Markus Haltmeier

    Abstract: We propose an unsupervised image segmentation approach, that combines a variational energy functional and deep convolutional neural networks. The variational part is based on a recent multichannel multiphase Chan-Vese model, which is capable to extract useful information from multiple input images simultaneously. We implement a flexible multiclass segmentation method that divides a given image int… ▽ More

    Submitted 16 June, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

    Comments: 20th INTERNATIONAL CONFERENCE OF NUMERICAL ANALYSIS AND APPLIED MATHEMATICS

    MSC Class: 65K10; 68U10; 68T10; ACM Class: I.4.6; I.2.10; I.5.4; G.1.6; G.1.10; G.4

  6. arXiv:2208.03848  [pdf, other

    cs.IT cond-mat.stat-mech cs.LG physics.data-an stat.ML

    Information bottleneck theory of high-dimensional regression: relevancy, efficiency and optimality

    Authors: Vudtiwat Ngampruetikorn, David J. Schwab

    Abstract: Avoiding overfitting is a central challenge in machine learning, yet many large neural networks readily achieve zero training loss. This puzzling contradiction necessitates new approaches to the study of overfitting. Here we quantify overfitting via residual information, defined as the bits in fitted models that encode noise in training data. Information efficient learning algorithms minimize resi… ▽ More

    Submitted 11 October, 2022; v1 submitted 7 August, 2022; originally announced August 2022.

    Comments: NeurIPS 2022

    ACM Class: H.1.1; I.2.6

  7. arXiv:2202.04680  [pdf, other

    cs.CV math.FA

    Lifting-based variational multiclass segmentation algorithm: design, convergence analysis, and implementation with applications in medical imaging

    Authors: Nadja Gruber, Johannes Schwab, Sebastien Court, Elke Gizewski, Markus Haltmeier

    Abstract: We propose, analyze and realize a variational multiclass segmentation scheme that partitions a given image into multiple regions exhibiting specific properties. Our method determines multiple functions that encode the segmentation regions by minimizing an energy functional combining information from different channels. Multichannel image data can be obtained by lifting the image into a higher dime… ▽ More

    Submitted 18 September, 2023; v1 submitted 9 February, 2022; originally announced February 2022.

  8. arXiv:2105.13977  [pdf, other

    cs.LG cond-mat.dis-nn cond-mat.stat-mech cs.IT physics.data-an

    Perturbation Theory for the Information Bottleneck

    Authors: Vudtiwat Ngampruetikorn, David J. Schwab

    Abstract: Extracting relevant information from data is crucial for all forms of learning. The information bottleneck (IB) method formalizes this, offering a mathematically precise and conceptually appealing framework for understanding learning phenomena. However the nonlinearity of the IB problem makes it computationally expensive and analytically intractable in general. Here we derive a perturbation theory… ▽ More

    Submitted 25 October, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: NeurIPS 2021

  9. arXiv:2103.12719  [pdf, other

    cs.CV cs.AI

    Characterizing and Improving the Robustness of Self-Supervised Learning through Background Augmentations

    Authors: Chaitanya K. Ryali, David J. Schwab, Ari S. Morcos

    Abstract: Recent progress in self-supervised learning has demonstrated promising results in multiple visual tasks. An important ingredient in high-performing self-supervised methods is the use of data augmentation by training models to place different augmented views of the same image nearby in embedding space. However, commonly used augmentation pipelines treat images holistically, ignoring the semantic re… ▽ More

    Submitted 12 November, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Technical Report; Additional Results

  10. arXiv:2010.06682  [pdf, other

    cs.CV cs.LG eess.IV

    Are all negatives created equal in contrastive instance discrimination?

    Authors: Tiffany Tianhui Cai, Jonathan Frankle, David J. Schwab, Ari S. Morcos

    Abstract: Self-supervised learning has recently begun to rival supervised learning on computer vision tasks. Many of the recent approaches have been based on contrastive instance discrimination (CID), in which the network is trained to recognize two augmented versions of the same instance (a query and positive) while discriminating against a pool of other instances (negatives). The learned representation is… ▽ More

    Submitted 25 October, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Fixed author name error

  11. arXiv:2009.12789  [pdf, other

    cs.LG cs.IT stat.ML

    Learning Optimal Representations with the Decodable Information Bottleneck

    Authors: Yann Dubois, Douwe Kiela, David J. Schwab, Ramakrishna Vedantam

    Abstract: We address the question of characterizing and finding optimal representations for supervised learning. Traditionally, this question has been tackled using the Information Bottleneck, which compresses the inputs while retaining information about the targets, in a decoder-agnostic fashion. In machine learning, however, our goal is not compression but rather generalization, which is intimately linked… ▽ More

    Submitted 16 July, 2021; v1 submitted 27 September, 2020; originally announced September 2020.

    Comments: Accepted at NeurIPS 2020

  12. arXiv:2007.14823  [pdf, other

    cond-mat.dis-nn cond-mat.stat-mech cs.LG nlin.CD q-bio.NC

    Theory of gating in recurrent neural networks

    Authors: Kamesh Krishnamurthy, Tankut Can, David J. Schwab

    Abstract: Recurrent neural networks (RNNs) are powerful dynamical models, widely used in machine learning (ML) and neuroscience. Prior theoretical work has focused on RNNs with additive interactions. However, gating - i.e. multiplicative - interactions are ubiquitous in real neurons and also the central feature of the best-performing RNNs in ML. Here, we show that gating offers flexible control of two salie… ▽ More

    Submitted 1 December, 2021; v1 submitted 29 July, 2020; originally announced July 2020.

    Comments: 13 figures

  13. arXiv:2004.09565  [pdf, other

    math.NA cs.LG eess.IV

    Sparse aNETT for Solving Inverse Problems with Deep Learning

    Authors: Daniel Obmann, Linh Nguyen, Johannes Schwab, Markus Haltmeier

    Abstract: We propose a sparse reconstruction framework (aNETT) for solving inverse problems. Opposed to existing sparse reconstruction techniques that are based on linear sparsifying transforms, we train an autoencoder network $D \circ E$ with $E$ acting as a nonlinear sparsifying transform and minimize a Tikhonov functional with learned regularizer formed by the $\ell^q$-norm of the encoder coefficients an… ▽ More

    Submitted 20 April, 2020; originally announced April 2020.

    Comments: The original proceeding is part of the ISBI 2020 and only contains 4 pages due to page restrictions

  14. arXiv:2003.00152  [pdf, other

    cs.LG cs.AI cs.NE stat.ML

    Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs

    Authors: Jonathan Frankle, David J. Schwab, Ari S. Morcos

    Abstract: A wide variety of deep learning techniques from style transfer to multitask learning rely on training affine transformations of features. Most prominent among these is the popular feature normalization technique BatchNorm, which normalizes activations and then subsequently applies a learned affine transform. In this paper, we aim to understand the role and expressive power of affine parameters use… ▽ More

    Submitted 21 March, 2021; v1 submitted 28 February, 2020; originally announced March 2020.

    Comments: Published in ICLR 2021

  15. arXiv:2002.10365  [pdf, other

    cs.LG cs.NE stat.ML

    The Early Phase of Neural Network Training

    Authors: Jonathan Frankle, David J. Schwab, Ari S. Morcos

    Abstract: Recent studies have shown that many important aspects of neural network learning take place within the very earliest iterations or epochs of training. For example, sparse, trainable sub-networks emerge (Frankle et al., 2019), gradient descent moves into a small subspace (Gur-Ari et al., 2018), and the network undergoes a critical period (Achille et al., 2019). Here, we examine the changes that dee… ▽ More

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: ICLR 2020 Camera Ready. Available on OpenReview at https://openreview.net/forum?id=Hkl1iRNFwS

  16. arXiv:2002.00155  [pdf, other

    math.NA cs.LG

    Deep synthesis regularization of inverse problems

    Authors: Daniel Obmann, Johannes Schwab, Markus Haltmeier

    Abstract: Recently, a large number of efficient deep learning methods for solving inverse problems have been developed and show outstanding numerical performance. For these deep learning methods, however, a solid theoretical foundation in the form of reconstruction guarantees is missing. In contrast, for classical reconstruction methods, such as convex variational and frame-based regularization, theoretical… ▽ More

    Submitted 1 February, 2020; originally announced February 2020.

    Comments: Submitted to IEEE Trans. Image Processing

  17. arXiv:2002.00025  [pdf, other

    cs.LG cond-mat.dis-nn cond-mat.stat-mech stat.ML

    Gating creates slow modes and controls phase-space complexity in GRUs and LSTMs

    Authors: Tankut Can, Kamesh Krishnamurthy, David J. Schwab

    Abstract: Recurrent neural networks (RNNs) are powerful dynamical models for data with complex temporal structure. However, training RNNs has traditionally proved challenging due to exploding or vanishing of gradients. RNN models such as LSTMs and GRUs (and their variants) significantly mitigate these issues associated with training by introducing various types of gating units into the architecture. While t… ▽ More

    Submitted 15 June, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

    Comments: 18+18 pages, 4 figures, to appear in Proceedings of Machine Learning Research Vol. 107, 2020, 1st Annual Conference on Mathematical and Scientific Machine Learning

  18. arXiv:1910.00195  [pdf, other

    cs.LG stat.ML

    How noise affects the Hessian spectrum in overparameterized neural networks

    Authors: Mingwei Wei, David J Schwab

    Abstract: Stochastic gradient descent (SGD) forms the core optimization method for deep neural networks. While some theoretical progress has been made, it still remains unclear why SGD leads the learning dynamics in overparameterized networks to solutions that generalize well. Here we show that for overparameterized networks with a degenerate valley in their loss landscape, SGD on average decreases the trac… ▽ More

    Submitted 29 October, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

  19. arXiv:1908.03006  [pdf, other

    math.NA cs.LG math.OC

    Augmented NETT Regularization of Inverse Problems

    Authors: Daniel Obmann, Linh Nguyen, Johannes Schwab, Markus Haltmeier

    Abstract: We propose aNETT (augmented NETwork Tikhonov) regularization as a novel data-driven reconstruction framework for solving inverse problems. An encoder-decoder type network defines a regularizer consisting of a penalty term that enforces regularity in the encoder domain, augmented by a penalty that penalizes the distance to the data manifold. We present a rigorous convergence analysis including stab… ▽ More

    Submitted 6 February, 2021; v1 submitted 8 August, 2019; originally announced August 2019.

  20. arXiv:1903.02606  [pdf, other

    cs.LG cond-mat.dis-nn stat.ML

    Mean-field Analysis of Batch Normalization

    Authors: Mingwei Wei, James Stokes, David J Schwab

    Abstract: Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence. In this paper, we use mean-field theory to analytically quantify the impact of BatchNorm on the geometry of the loss landscape for multi-layer networks consisting of fully-connected and convolutional layers. We… ▽ More

    Submitted 6 March, 2019; originally announced March 2019.

  21. arXiv:1803.08823  [pdf, other

    physics.comp-ph cond-mat.stat-mech cs.LG stat.ML

    A high-bias, low-variance introduction to Machine Learning for physicists

    Authors: Pankaj Mehta, Marin Bukov, Ching-Hao Wang, Alexandre G. R. Day, Clint Richardson, Charles K. Fisher, David J. Schwab

    Abstract: Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, r… ▽ More

    Submitted 27 May, 2019; v1 submitted 23 March, 2018; originally announced March 2018.

    Comments: Notebooks have been updated. 122 pages, 78 figures, 20 Python notebooks

    Journal ref: Physics Reports 810 (2019) 1-124

  22. arXiv:1803.00092  [pdf, other

    math.NA cs.LG

    NETT: Solving Inverse Problems with Deep Neural Networks

    Authors: Housen Li, Johannes Schwab, Stephan Antholzer, Markus Haltmeier

    Abstract: Recovering a function or high-dimensional parameter vector from indirect measurements is a central task in various scientific areas. Several methods for solving such inverse problems are well developed and well understood. Recently, novel algorithms using deep learning and neural networks for inverse problems appeared. While still in their infancy, these techniques show astonishing performance for… ▽ More

    Submitted 8 December, 2019; v1 submitted 28 February, 2018; originally announced March 2018.

  23. arXiv:1712.09657  [pdf, other

    stat.ML cs.AI cs.IT cs.LG

    The information bottleneck and geometric clustering

    Authors: DJ Strouse, David J Schwab

    Abstract: The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X,Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999). This objective results in an algorithm that clusters data points based upon the similarity of their conditional distributions $P\!\left(Y\mid X\right)$. This is in contrast to classic "g… ▽ More

    Submitted 31 May, 2020; v1 submitted 27 December, 2017; originally announced December 2017.

    Comments: Updated to final published version with more detailed relationship to GMMs/k-means

    Journal ref: Neural Computation 31 (2019) 596-612

  24. arXiv:1704.04587  [pdf, other

    cs.CV cs.LG

    Deep Learning for Photoacoustic Tomography from Sparse Data

    Authors: Stephan Antholzer, Markus Haltmeier, Johannes Schwab

    Abstract: The development of fast and accurate image reconstruction algorithms is a central aspect of computed tomography. In this paper, we investigate this issue for the sparse data problem in photoacoustic tomography (PAT). We develop a direct and highly efficient reconstruction algorithm based on deep learning. In our approach image reconstruction is performed with a deep convolutional neural network (C… ▽ More

    Submitted 30 August, 2018; v1 submitted 15 April, 2017; originally announced April 2017.

  25. arXiv:1609.03541  [pdf, ps, other

    cond-mat.dis-nn cs.LG stat.ML

    Comment on "Why does deep and cheap learning work so well?" [arXiv:1608.08225]

    Authors: David J. Schwab, Pankaj Mehta

    Abstract: In a recent paper, "Why does deep and cheap learning work so well?", Lin and Tegmark claim to show that the mapping between deep belief networks and the variational renormalization group derived in [ar… mapping does not hold. In this comment, we show that these claims are incorrect and stem from a misunderstandin… ▽ More

    Submitted 12 September, 2016; originally announced September 2016.

    Comments: Comment on arXiv:1608.08225

  26. arXiv:1605.05775  [pdf, other

    stat.ML cond-mat.str-el cs.LG

    Supervised Learning with Quantum-Inspired Tensor Networks

    Authors: E. Miles Stoudenmire, David J. Schwab

    Abstract: Tensor networks are efficient representations of high-dimensional tensors which have been very successful for physics and mathematics applications. We demonstrate how algorithms for optimizing such networks can be adapted to supervised learning tasks by using matrix product states (tensor trains) to parameterize models for classifying images. For the MNIST data set we obtain less than 1% test set… ▽ More

    Submitted 18 May, 2017; v1 submitted 18 May, 2016; originally announced May 2016.

    Comments: 11 pages, 15 figures; updated version includes corrections, links to sample codes, expanded discussion, and additional references

    Journal ref: Advances in Neural Information Processing Systems 29, 4799 (2016)

  27. arXiv:1604.00268  [pdf, other

    q-bio.NC cond-mat.stat-mech cs.IT q-bio.QM stat.ML

    The deterministic information bottleneck

    Authors: DJ Strouse, David J Schwab

    Abstract: Lossy compression and clustering fundamentally involve a decision about what features are relevant and which are not. The information bottleneck method (IB) by Tishby, Pereira, and Bialek formalized this notion as an information-theoretic optimization problem and proposed an optimal tradeoff between throwing away as many bits as possible, and selectively keeping those that are most important. In t… ▽ More

    Submitted 19 December, 2016; v1 submitted 1 April, 2016; originally announced April 2016.

    Comments: 15 pages, 4 figures

  28. arXiv:1410.3831  [pdf, ps, other

    stat.ML cond-mat.stat-mech cs.LG cs.NE

    An exact mapping between the Variational Renormalization Group and Deep Learning

    Authors: Pankaj Mehta, David J. Schwab

    Abstract: Deep learning is a broad set of techniques that uses multiple layers of representation to automatically learn relevant features directly from structured data. Recently, such techniques have yielded record-breaking results on a diverse set of difficult machine learning tasks in computer vision, speech recognition, and natural language processing. Despite the enormous success of deep learning, relat… ▽ More

    Submitted 14 October, 2014; originally announced October 2014.

    Comments: 8 pages, 3 figures