Showing 1–5 of 5 results for author: Latham, P E

Searching in archive cs.
  1. arXiv:2406.12615  [pdf, other]

    cs.LG

    When Are Bias-Free ReLU Networks Like Linear Networks?

    Authors: Yedi Zhang, Andrew Saxe, Peter E. Latham

    Abstract: We investigate the expressivity and learning dynamics of bias-free ReLU networks. We first show that two-layer bias-free ReLU networks have limited expressivity: the only odd function two-layer bias-free ReLU networks can express is a linear one. We then show that, under symmetry conditions on the data, these networks have the same learning dynamics as linear networks. This allows us to give clo…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: HiLD Workshop at ICML 2024

  2. arXiv:2312.00935  [pdf, other]

    cs.LG

    Understanding Unimodal Bias in Multimodal Deep Linear Networks

    Authors: Yedi Zhang, Peter E. Latham, Andrew Saxe

    Abstract: Using multiple input streams simultaneously to train multimodal neural networks is intuitively advantageous but practically challenging. A key challenge is unimodal bias, where a network overly relies on one modality and ignores others during joint training. We develop a theory of unimodal bias with multimodal deep linear networks to understand how architecture and data statistics influence this b…

    Submitted 1 June, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: ICML 2024 camera ready

  3. arXiv:2110.00296  [pdf, other]

    stat.ML cs.AI cs.LG

    Powerpropagation: A sparsity inducing weight reparameterisation

    Authors: Jonathan Schwarz, Siddhant M. Jayakumar, Razvan Pascanu, Peter E. Latham, Yee Whye Teh

    Abstract: The training of sparse neural networks is becoming an increasingly important tool for reducing the computational footprint of models at training and evaluation, as well as enabling the effective scaling up of models. Whereas much work over the years has been dedicated to specialised pruning techniques, little attention has been paid to the inherent effect of gradient-based training on model sparsity.…

    Submitted 6 October, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021

  4. arXiv:2106.13031  [pdf, other]

    cs.LG cs.NE q-bio.NC

    Towards Biologically Plausible Convolutional Networks

    Authors: Roman Pogodin, Yash Mehta, Timothy P. Lillicrap, Peter E. Latham

    Abstract: Convolutional networks are ubiquitous in deep learning. They are particularly useful for images, as they reduce the number of parameters, reduce training time, and increase accuracy. However, as a model of the brain they are seriously problematic, since they require weight sharing - something real neurons simply cannot do. Consequently, while neurons in the brain can be locally connected (one of t…

    Submitted 15 January, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

  5. arXiv:2006.07123  [pdf, other]

    cs.LG q-bio.NC stat.ML

    Kernelized information bottleneck leads to biologically plausible 3-factor Hebbian learning in deep networks

    Authors: Roman Pogodin, Peter E. Latham

    Abstract: The state-of-the-art machine learning approach to training deep neural networks, backpropagation, is implausible for real neural networks: neurons need to know their outgoing weights; training alternates between a bottom-up forward pass (computation) and a top-down backward pass (learning); and the algorithm often needs precise labels of many data points. Biologically plausible approximations to b…

    Submitted 23 October, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: Accepted to NeurIPS 2020