
Showing 1–50 of 82 results for author: Storkey, A

  1. arXiv:2406.13376  [pdf, other]

    cs.LG

    Efficient Offline Reinforcement Learning: The Critic is Critical

    Authors: Adam Jelley, Trevor McInroe, Sam Devlin, Amos Storkey

    Abstract: Recent work has demonstrated both benefits and limitations from using supervised approaches (without temporal-difference learning) for offline reinforcement learning. While off-policy reinforcement learning provides a promising approach for improving performance beyond supervised approaches, we observe that training is often inefficient and unstable due to temporal difference bootstrapping. In thi…

    Submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2405.20838  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    einspace: Searching for Neural Architectures from Fundamental Operations

    Authors: Linus Ericsson, Miguel Espinosa, Chenhongyi Yang, Antreas Antoniou, Amos Storkey, Shay B. Cohen, Steven McDonagh, Elliot J. Crowley

    Abstract: Neural architecture search (NAS) finds high performing networks for a given task. Yet the results of NAS are fairly prosaic; they did not e.g. create a shift from convolutional structures to transformers. This is not least because the search spaces in NAS often aren't diverse enough to include such transformations a priori. Instead, for NAS to provide greater potential for fundamental design shift…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Project page at https://linusericsson.github.io/einspace/

  3. arXiv:2405.14453  [pdf, other]

    eess.IV cs.CV cs.LG

    Domain-specific augmentations with resolution agnostic self-attention mechanism improves choroid segmentation in optical coherence tomography images

    Authors: Jamie Burke, Justin Engelmann, Charlene Hamid, Diana Moukaddem, Dan Pugh, Neeraj Dhaun, Amos Storkey, Niall Strang, Stuart King, Tom MacGillivray, Miguel O. Bernabeu, Ian J. C. MacCormick

    Abstract: The choroid is a key vascular layer of the eye, supplying oxygen to the retinal photoreceptors. Non-invasive enhanced depth imaging optical coherence tomography (EDI-OCT) has recently improved access and visualisation of the choroid, making it an exciting frontier for discovering novel vascular biomarkers in ophthalmology and wider systemic health. However, current methods to measure the choroid o…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 13 pages, 2 figures, 8 tables (including supplementary material)

  4. arXiv:2405.12399  [pdf, other]

    cs.LG cs.AI cs.CV

    Diffusion for World Modeling: Visual Details Matter in Atari

    Authors: Eloi Alonso, Adam Jelley, Vincent Micheli, Anssi Kanervisto, Amos Storkey, Tim Pearce, François Fleuret

    Abstract: World models constitute a promising approach for training reinforcement learning agents in a safe and sample-efficient manner. Recent world models predominantly operate on sequences of discrete latent variables to model environment dynamics. However, this compression into a compact discrete representation may ignore visual details that are important for reinforcement learning. Concurrently, diffus…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 25 pages, 11 figures, 10 tables

  5. arXiv:2404.14285  [pdf, other]

    cs.RO cs.AI

    LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots

    Authors: Dongge Han, Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Peter Bell, Amos Storkey

    Abstract: Large language models (LLMs) have shown significant potential for robotics applications, particularly task planning, by harnessing their language comprehension and text generation capabilities. However, in applications such as household robotics, a critical gap remains in the personalization of these models to individual user preferences. We introduce LLM-Personalize, a novel framework with an opt…

    Submitted 22 April, 2024; originally announced April 2024.

  6. arXiv:2404.06466  [pdf, other]

    cs.LG stat.ML

    Hyperparameter Selection in Continual Learning

    Authors: Thomas L. Lee, Sigrid Passano Hellan, Linus Ericsson, Elliot J. Crowley, Amos Storkey

    Abstract: In continual learning (CL) -- where a learner trains on a stream of data -- standard hyperparameter optimisation (HPO) cannot be applied, as a learner does not have access to all of the data at the same time. This has prompted the development of CL-specific HPO frameworks. The most popular way to tune hyperparameters in CL is to repeatedly train over the whole data stream with different hyperparam…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Preprint, 9 pages

  7. arXiv:2312.02956  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.QM

    Choroidalyzer: An open-source, end-to-end pipeline for choroidal analysis in optical coherence tomography

    Authors: Justin Engelmann, Jamie Burke, Charlene Hamid, Megan Reid-Schachter, Dan Pugh, Neeraj Dhaun, Diana Moukaddem, Lyle Gray, Niall Strang, Paul McGraw, Amos Storkey, Paul J. Steptoe, Stuart King, Tom MacGillivray, Miguel O. Bernabeu, Ian J. C. MacCormick

    Abstract: Purpose: To develop Choroidalyzer, an open-source, end-to-end pipeline for segmenting the choroid region, vessels, and fovea, and deriving choroidal thickness, area, and vascular index. Methods: We used 5,600 OCT B-scans (233 subjects, 6 systemic disease cohorts, 3 device types, 2 manufacturers). To generate region and vessel ground-truths, we used state-of-the-art automatic methods following ma…

    Submitted 5 December, 2023; originally announced December 2023.

  8. arXiv:2311.08909  [pdf, other]

    cs.LG cs.CV cs.PF

    DLAS: An Exploration and Assessment of the Deep Learning Acceleration Stack

    Authors: Perry Gibson, José Cano, Elliot J. Crowley, Amos Storkey, Michael O'Boyle

    Abstract: Deep Neural Networks (DNNs) are extremely computationally demanding, which presents a large barrier to their deployment on resource-constrained devices. Since such devices are where many emerging deep learning applications lie (e.g., drones, vision-based medical technology), significant bodies of work from both the machine learning and systems communities have attempted to provide optimizations to…

    Submitted 15 November, 2023; originally announced November 2023.

  9. arXiv:2310.05723  [pdf, other]

    cs.LG

    Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning

    Authors: Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Amos Storkey

    Abstract: Offline pretraining with a static dataset followed by online fine-tuning (offline-to-online, or OtO) is a paradigm well matched to a real-world RL deployment process. In this scenario, we aim to find the best-performing policy within a limited budget of online interactions. Previous work in the OtO setting has focused on correcting for bias introduced by the policy-constraint mechanisms of offline…

    Submitted 21 June, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 10 pages, 17 figures, published at RLC 2024

  10. arXiv:2310.02206  [pdf, other]

    cs.LG stat.ML

    Chunking: Forgetting Matters in Continual Learning even without Changing Tasks

    Authors: Thomas L. Lee, Amos Storkey

    Abstract: Work on continual learning (CL) has largely focused on the problems arising from the dynamically-changing data distribution. However, CL can be decomposed into two sub-problems: (a) shifts in the data distribution, and (b) dealing with the fact that the data is split into chunks and so only a part of the data is available to be trained on at any point in time. In this work, we look at the latter s…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: 9 pages, 11 figures, preprint

  11. arXiv:2309.17320  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Development of a Deep Learning Method to Identify Acute Ischemic Stroke Lesions on Brain CT

    Authors: Alessandro Fontanella, Wenwen Li, Grant Mair, Antreas Antoniou, Eleanor Platt, Paul Armitage, Emanuele Trucco, Joanna Wardlaw, Amos Storkey

    Abstract: Computed Tomography (CT) is commonly used to image acute ischemic stroke (AIS) patients, but its interpretation by radiologists is time-consuming and subject to inter-observer variability. Deep learning (DL) techniques can provide automated CT brain scan assessment, but usually require annotated images. Aiming to develop a DL method for AIS using labelled but not annotated CT brain scans from pati…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: 12 pages, 5 figures

  12. arXiv:2309.15081  [pdf, other]

    eess.IV

    Challenges of building medical image datasets for development of deep learning software in stroke

    Authors: Alessandro Fontanella, Wenwen Li, Grant Mair, Antreas Antoniou, Eleanor Platt, Chloe Martin, Paul Armitage, Emanuele Trucco, Joanna Wardlaw, Amos Storkey

    Abstract: Despite the large amount of brain CT data generated in clinical practice, the availability of CT datasets for deep learning (DL) research is currently limited. Furthermore, the data can be insufficiently or improperly prepared for machine learning and thus lead to spurious and irreproducible analyses. This lack of access to comprehensive and diverse datasets poses a significant challenge for the d…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 9 pages, 5 figures

  13. arXiv:2308.10077  [pdf, other]

    cs.LG cs.AI

    Contrastive Learning for Non-Local Graphs with Multi-Resolution Structural Views

    Authors: Asif Khan, Amos Storkey

    Abstract: Learning node-level representations of heterophilic graphs is crucial for various applications, including fraudster detection and protein function prediction. In such graphs, nodes share structural similarity identified by the equivalence of their connectivity which is implicitly encoded in the form of higher-order hierarchical information in the graphs. The contrastive methods are popular choices…

    Submitted 19 August, 2023; originally announced August 2023.

  14. arXiv:2308.02062  [pdf, other]

    eess.IV cs.CV

    Diffusion Models for Counterfactual Generation and Anomaly Detection in Brain Images

    Authors: Alessandro Fontanella, Grant Mair, Joanna Wardlaw, Emanuele Trucco, Amos Storkey

    Abstract: Segmentation masks of pathological areas are useful in many medical applications, such as brain tumour and stroke management. Moreover, healthy counterfactuals of diseased images can be used to enhance radiologists' training files and to improve the interpretability of segmentation models. In this work, we present a weakly supervised method to generate a healthy version of a diseased image and the…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: 13 pages, 7 figures

  15. arXiv:2307.13646  [pdf, other]

    cs.CV cs.AI q-bio.QM

    QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models

    Authors: Justin Engelmann, Amos Storkey, Miguel O. Bernabeu

    Abstract: Image quality remains a key problem for both traditional and deep learning (DL)-based approaches to retinal image analysis, but identifying poor quality images can be time consuming and subjective. Thus, automated methods for retinal image quality scoring (RIQS) are needed. The current state-of-the-art is MCFNet, composed of three Densenet121 backbones each operating in a different colour space. M…

    Submitted 25 July, 2023; originally announced July 2023.

  16. arXiv:2307.13100  [pdf, other]

    cs.LG

    Label Noise: Correcting a Correction

    Authors: William Toner, Amos Storkey

    Abstract: Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels. To address this issue, researchers have explored alternative loss functions that aim to be more robust. However, many of these alternatives are heuristic in nature and still vulnerable to overfitting or underfitting. In this work, we propose a more direct approach to tackling over…

    Submitted 24 July, 2023; originally announced July 2023.

  17. arXiv:2307.00904  [pdf, other]

    eess.IV cs.AI q-bio.QM

    An open-source deep learning algorithm for efficient and fully-automatic analysis of the choroid in optical coherence tomography

    Authors: Jamie Burke, Justin Engelmann, Charlene Hamid, Megan Reid-Schachter, Tom Pearson, Dan Pugh, Neeraj Dhaun, Stuart King, Tom MacGillivray, Miguel O. Bernabeu, Amos Storkey, Ian J. C. MacCormick

    Abstract: Purpose: To develop an open-source, fully-automatic deep learning algorithm, DeepGPET, for choroid region segmentation in optical coherence tomography (OCT) data. Methods: We used a dataset of 715 OCT B-scans (82 subjects, 115 eyes) from 3 clinical studies related to systemic disease. Ground truth segmentations were generated using a clinically validated, semi-automatic choroid segmentation method…

    Submitted 29 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 9 pages, 5 figures, 3 tables. Accepted for publication in ARVO TVST (Association for Research in Vision and Ophthalmology, Translational Vision Science & Technology). The code and model weights for DeepGPET are available here: https://github.com/jaburke166/deepgpet

  18. arXiv:2305.19076  [pdf, other]

    cs.LG stat.ML

    Approximate Bayesian Class-Conditional Models under Continuous Representation Shift

    Authors: Thomas L. Lee, Amos Storkey

    Abstract: For models consisting of a classifier in some representation space, learning online from a non-stationary data stream often necessitates changes in the representation. So, the question arises of what is the best way to adapt the classifier to shifts in representation. Current methods only slowly change the classifier to representation shift, introducing noise into learning as the classifier is mis…

    Submitted 7 May, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Published at AISTATS 2024, 9 pages

  19. arXiv:2303.15421  [pdf, other]

    eess.IV cs.CV cs.LG

    ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging

    Authors: Alessandro Fontanella, Antreas Antoniou, Wenwen Li, Joanna Wardlaw, Grant Mair, Emanuele Trucco, Amos Storkey

    Abstract: In some medical imaging tasks and other settings where only small parts of the image are informative for the classification task, traditional CNNs can sometimes struggle to generalise. Manually annotated Regions of Interest (ROI) are sometimes used to isolate the most informative parts of the image. However, these are expensive to collect and may vary significantly across annotators. To overcome t…

    Submitted 11 August, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: International Conference on Machine Learning 2023. 17 pages, 7 figures

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:10153-10169, 2023

  20. arXiv:2301.13136  [pdf, other]

    cs.LG

    Contrastive Meta-Learning for Partially Observable Few-Shot Learning

    Authors: Adam Jelley, Amos Storkey, Antreas Antoniou, Sam Devlin

    Abstract: Many contrastive and meta-learning approaches learn representations by identifying common features in multiple views. However, the formalism for these approaches generally assumes features to be shared across views to be captured coherently. We consider the problem of learning a unified representation from partial observations, where useful features may be present in only some of the views. We app…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Accepted for publication at ICLR 2023. Code is available at https://github.com/AdamJelley/POEM

  21. arXiv:2208.03923  [pdf, other]

    cs.LG cs.AI cs.CV

    Adversarial robustness of VAEs through the lens of local geometry

    Authors: Asif Khan, Amos Storkey

    Abstract: In an unsupervised attack on variational autoencoders (VAEs), an adversary finds a small perturbation in an input sample that significantly changes its latent space encoding, thereby compromising the reconstruction for a fixed decoder. A known reason for such vulnerability is the distortions in the latent space resulting from a mismatch between approximated latent posterior and a prior distributio…

    Submitted 5 April, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

    Comments: International Conference on Artificial Intelligence and Statistics (AISTATS) 2023

  22. arXiv:2207.05757  [pdf, other]

    q-bio.QM cs.AI cs.CV eess.IV

    Robust and efficient computation of retinal fractal dimension through deep approximation

    Authors: Justin Engelmann, Ana Villaplana-Velasco, Amos Storkey, Miguel O. Bernabeu

    Abstract: A retinal trait, or phenotype, summarises a specific aspect of a retinal image in a single number. This can then be used for further analyses, e.g. with statistical methods. However, reducing an aspect of a complex image to a single, meaningful number is challenging. Thus, methods for calculating retinal traits tend to be complex, multi-step pipelines that can only be applied to high quality image…

    Submitted 12 July, 2022; originally announced July 2022.

  23. arXiv:2207.02249  [pdf, other]

    cs.MA cs.AI cs.LG

    Learning Task Embeddings for Teamwork Adaptation in Multi-Agent Reinforcement Learning

    Authors: Lukas Schäfer, Filippos Christianos, Amos Storkey, Stefano V. Albrecht

    Abstract: Successful deployment of multi-agent reinforcement learning often requires agents to adapt their behaviour. In this work, we discuss the problem of teamwork adaptation in which a team of agents needs to adapt their policies to solve novel tasks with limited fine-tuning. Motivated by the intuition that agents need to be able to identify and distinguish tasks in order to adapt their behaviour to the…

    Submitted 20 November, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: To be presented at the Seventh Workshop on Generalization in Planning at the NeurIPS 2023 conference

  24. arXiv:2203.06113  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG q-bio.QM

    Detection of multiple retinal diseases in ultra-widefield fundus images using deep learning: data-driven identification of relevant regions

    Authors: Justin Engelmann, Alice D. McTrusty, Ian J. C. MacCormick, Emma Pead, Amos Storkey, Miguel O. Bernabeu

    Abstract: Ultra-widefield (UWF) imaging is a promising modality that captures a larger retinal field of view compared to traditional fundus photography. Previous studies showed that deep learning (DL) models are effective for detecting retinal disease in UWF images, but primarily considered individual diseases under less-than-realistic conditions (excluding images with other diseases, artefacts, comorbiditi…

    Submitted 11 March, 2022; originally announced March 2022.

  25. arXiv:2203.05469  [pdf, other]

    cs.CV cs.LG

    Prediction-Guided Distillation for Dense Object Detection

    Authors: Chenhongyi Yang, Mateusz Ochal, Amos Storkey, Elliot J. Crowley

    Abstract: Real-world object detection models should be cheap and accurate. Knowledge distillation (KD) can boost the accuracy of a small, cheap detection model by leveraging useful information from a larger teacher model. However, a key challenge is identifying the most informative features produced by the teacher for distillation. In this work, we show that only a very small fraction of features within a g…

    Submitted 18 July, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: ECCV 2022

  26. arXiv:2201.12570  [pdf, other]

    q-bio.BM cs.AI cs.LG cs.NE stat.ML

    AntBO: Towards Real-World Automated Antibody Design with Combinatorial Bayesian Optimisation

    Authors: Asif Khan, Alexander I. Cowen-Rivers, Antoine Grosnit, Derrick-Goh-Xin Deik, Philippe A. Robert, Victor Greiff, Eva Smorodina, Puneet Rawat, Kamil Dreczkowski, Rahmad Akbar, Rasul Tutunov, Dany Bou-Ammar, Jun Wang, Amos Storkey, Haitham Bou-Ammar

    Abstract: Antibodies are canonically Y-shaped multimeric proteins capable of highly specific molecular recognition. The CDRH3 region located at the tip of variable chains of an antibody dominates antigen-binding specificity. Therefore, it is a priority to design optimal antigen-specific CDRH3 regions to develop therapeutic antibodies. However, the combinatorial nature of CDRH3 sequence space makes it imposs…

    Submitted 14 October, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

  27. arXiv:2112.09591  [pdf, other]

    cs.CV cs.AI

    Global explainability in aligned image modalities

    Authors: Justin Engelmann, Amos Storkey, Miguel O. Bernabeu

    Abstract: Deep learning (DL) models are very effective on many computer vision problems and increasingly used in critical applications. They are also inherently black box. A number of methods exist to generate image-wise explanations that allow practitioners to understand and verify model predictions for a given image. Beyond that, it would be desirable to validate that a DL model \textit{generally} works i…

    Submitted 17 December, 2021; originally announced December 2021.

  28. arXiv:2112.01641  [pdf, other]

    cs.CV cs.AI cs.LG

    Hamiltonian latent operators for content and motion disentanglement in image sequences

    Authors: Asif Khan, Amos Storkey

    Abstract: We introduce \textit{HALO} -- a deep generative model utilising HAmiltonian Latent Operators to reliably disentangle content and motion information in image sequences. The \textit{content} represents summary statistics of a sequence, and \textit{motion} is a dynamic process that determines how information is expressed in any part of the sequence. By modelling the dynamics as a Hamiltonian motion,…

    Submitted 12 October, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Conference paper at NeurIPS 2022

  29. arXiv:2106.10704  [pdf, other]

    cs.LG stat.ML

    Better Training using Weight-Constrained Stochastic Dynamics

    Authors: Benedict Leimkuhler, Tiffany Vlaar, Timothée Pouchon, Amos Storkey

    Abstract: We employ constraints to control the parameter space of deep neural networks throughout training. The use of customized, appropriately designed constraints can reduce the vanishing/exploding gradients problem, improve smoothness of classification boundaries, control weight magnitudes and stabilize deep neural networks, and thus enhance the robustness of training algorithms and the generalization c…

    Submitted 20 June, 2021; originally announced June 2021.

    Comments: ICML 2021 camera-ready. arXiv admin note: substantial text overlap with arXiv:2006.10114

    Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  30. arXiv:2104.05344  [pdf, other]

    cs.LG cs.CV

    How Sensitive are Meta-Learners to Dataset Imbalance?

    Authors: Mateusz Ochal, Massimiliano Patacchiola, Amos Storkey, Jose Vazquez, Sen Wang

    Abstract: Meta-Learning (ML) has proven to be a useful tool for training Few-Shot Learning (FSL) algorithms by exposure to batches of tasks sampled from a meta-dataset. However, the standard training procedure overlooks the dynamic nature of the real-world where object classes are likely to occur at different frequencies. While it is generally understood that imbalanced tasks harm the performance of supervi…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: Published as a workshop paper at the Learning to Learn workshop at ICLR 2021. arXiv admin note: text overlap with arXiv:2101.02523

  31. arXiv:2101.02523  [pdf, other]

    cs.LG cs.CV

    Few-Shot Learning with Class Imbalance

    Authors: Mateusz Ochal, Massimiliano Patacchiola, Amos Storkey, Jose Vazquez, Sen Wang

    Abstract: Few-Shot Learning (FSL) algorithms are commonly trained through Meta-Learning (ML), which exposes models to batches of tasks sampled from a meta-dataset to mimic tasks seen during evaluation. However, the standard training procedures overlook the real-world dynamics where classes commonly occur at different frequencies. While it is generally understood that class imbalance harms the performance of…

    Submitted 14 June, 2021; v1 submitted 7 January, 2021; originally announced January 2021.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  32. arXiv:2011.11486  [pdf, other]

    cs.LG

    Latent Adversarial Debiasing: Mitigating Collider Bias in Deep Neural Networks

    Authors: Luke Darlow, Stanisław Jastrzębski, Amos Storkey

    Abstract: Collider bias is a harmful form of sample selection bias that neural networks are ill-equipped to handle. This bias manifests itself when the underlying causal signal is strongly correlated with other confounding signals due to the training data collection procedure. In the situation where the confounding signal is easy-to-learn, deep neural networks will latch onto this and the resulting model wi…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: 10 pages, 4 figures, submitted to AISTATS 2021

  33. arXiv:2007.07869  [pdf, other]

    cs.LG cs.CV stat.ML

    Gradient-based Hyperparameter Optimization Over Long Horizons

    Authors: Paul Micaelli, Amos Storkey

    Abstract: Gradient-based hyperparameter optimization has earned a widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps), due to memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online, but this introduces greediness which comes with a significant performance drop. We pr…

    Submitted 30 September, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

  34. arXiv:2006.10114  [pdf, other]

    cs.LG stat.ML

    Constraint-Based Regularization of Neural Networks

    Authors: Benedict Leimkuhler, Timothée Pouchon, Tiffany Vlaar, Amos Storkey

    Abstract: We propose a method for efficiently incorporating constraints into a stochastic gradient Langevin framework for the training of deep neural networks. Constraints allow direct control of the parameter space of the model. Appropriately designed, they reduce the vanishing/exploding gradient problem, control weight magnitudes and stabilize deep neural networks and thus improve the robustness of traini…

    Submitted 20 June, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: T. Vlaar won best student paper award at OPT2020

    Journal ref: OPT2020: 12th Annual Workshop on Optimization for Machine Learning, NeurIPS 2020

  35. arXiv:2006.09791  [pdf, other]

    cs.LG cs.CV cs.DC stat.ML

    Optimizing Grouped Convolutions on Edge Devices

    Authors: Perry Gibson, José Cano, Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey

    Abstract: When deploying a deep neural network on constrained hardware, it is possible to replace the network's standard convolutions with grouped convolutions. This allows for substantial memory savings with minimal loss of accuracy. However, current implementations of grouped convolutions in modern deep learning frameworks are far from performing optimally in terms of speed. In this paper we propose Group…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: Camera ready version to be published at ASAP 2020 - The 31st IEEE International Conference on Application-specific Systems, Architectures and Processors. 8 pages, 6 figures

    ACM Class: I.2.6; D.3.4; C.1.4

  36. arXiv:2006.05849  [pdf, other]

    cs.LG stat.ML

    Self-Supervised Relational Reasoning for Representation Learning

    Authors: Massimiliano Patacchiola, Amos Storkey

    Abstract: In self-supervised learning, a system is tasked with achieving a surrogate objective by defining alternative targets on a set of unlabeled data. The aim is to build useful representations that can be used in downstream tasks, without costly manual annotation. In this work, we propose a novel self-supervised formulation of relational reasoning that allows a learner to bootstrap a signal from inform…

    Submitted 10 November, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2020, Spotlight)

  37. arXiv:2006.04647  [pdf, other]

    cs.LG cs.CV stat.ML

    Neural Architecture Search without Training

    Authors: Joseph Mellor, Jack Turner, Amos Storkey, Elliot J. Crowley

    Abstract: The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained…

    Submitted 11 June, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: Accepted at ICML 2021 for a long presentation

  38. arXiv:2004.11967  [pdf, other]

    cs.CV cs.LG stat.ML

    Defining Benchmarks for Continual Few-Shot Learning

    Authors: Antreas Antoniou, Massimiliano Patacchiola, Mateusz Ochal, Amos Storkey

    Abstract: Both few-shot and continual learning have seen substantial progress in the last years due to the introduction of proper benchmarks. That being said, the field has still to frame a suite of benchmarks for the highly desirable setting of continual few-shot learning, where the learner is presented a number of few-shot tasks, one after the other, and then asked to perform well on a validation set stem…

    Submitted 15 April, 2020; originally announced April 2020.

  39. arXiv:2004.05439  [pdf, other]

    cs.LG stat.ML

    Meta-Learning in Neural Networks: A Survey

    Authors: Timothy Hospedales, Antreas Antoniou, Paul Micaelli, Amos Storkey

    Abstract: The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional chall…

    Submitted 7 November, 2020; v1 submitted 11 April, 2020; originally announced April 2020.

  40. arXiv:2003.08821  [pdf, other]

    cs.CV cs.LG stat.ML

    DHOG: Deep Hierarchical Object Grouping

    Authors: Luke Nicholas Darlow, Amos Storkey

    Abstract: Recently, a number of competitive methods have tackled unsupervised representation learning by maximising the mutual information between the representations produced from augmentations. The resulting representations are then invariant to stochastic augmentation strategies, and can be used for downstream tasks such as clustering or classification. Yet data augmentations preserve many properties of…

    Submitted 13 March, 2020; originally announced March 2020.

    Comments: 15 pages, submitted to ECCV 2020

  41. arXiv:2003.06254  [pdf, other]

    cs.LG cs.CV stat.ML

    What Information Does a ResNet Compress?

    Authors: Luke Nicholas Darlow, Amos Storkey

    Abstract: The information bottleneck principle (Shwartz-Ziv & Tishby, 2017) suggests that SGD-based training of deep neural networks results in optimally compressed hidden layers, from an information theoretic perspective. However, this claim was established on toy data. The goal of the work we present here is to test whether the information bottleneck principle is applicable to a realistic setting using a…

    Submitted 13 March, 2020; originally announced March 2020.

    Comments: 10 pages + appendices; submitted to ICLR 2019

  42. arXiv:2002.08981  [pdf, other]

    cs.LG cs.CV stat.ML

    Comparing recurrent and convolutional neural networks for predicting wave propagation

    Authors: Stathi Fotiadis, Eduardo Pignatelli, Mario Lino Valencia, Chris Cantwell, Amos Storkey, Anil A. Bharath

    Abstract: Dynamical systems can be modelled by partial differential equations and numerical computations are used everywhere in science and engineering. In this work, we investigate the performance of recurrent and convolutional deep neural network architectures to predict the surface waves. The system is governed by the Saint-Venant equations. We improve on the long-term prediction over previous methods wh…

    Submitted 20 April, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

  43. arXiv:2002.08697  [pdf, other]

    cs.LG stat.ML

    Performance Aware Convolutional Neural Network Channel Pruning for Embedded GPUs

    Authors: Valentin Radu, Kuba Kaszyk, Yuan Wen, Jack Turner, Jose Cano, Elliot J. Crowley, Bjorn Franke, Amos Storkey, Michael O'Boyle

    Abstract: Convolutional Neural Networks (CNN) are becoming a common presence in many applications and services, due to their superior recognition accuracy. They are increasingly being used on mobile devices, many times just by porting large models designed for server space, although several model compression techniques have been considered. One model compression technique intended to reduce computations is…

    Submitted 20 February, 2020; originally announced February 2020.

    Comments: A copy of this was published in IISWC'19

  44. arXiv:1910.05199  [pdf, other]

    cs.LG stat.ML

    Bayesian Meta-Learning for the Few-Shot Setting via Deep Kernels

    Authors: Massimiliano Patacchiola, Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey

    Abstract: Recently, different machine learning methods have been introduced to tackle the challenging few-shot learning scenario that is, learning from a small labeled dataset related to a specific task. Common approaches have taken the form of meta-learning: learning to learn on the new problem given the old. Following the recognition that meta-learning is implementing learning in a multi-level model, we p…

    Submitted 13 October, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2020, Spotlight)

  45. arXiv:1906.04113  [pdf, other]

    cs.LG stat.ML

    BlockSwap: Fisher-guided Block Substitution for Network Compression on a Budget

    Authors: Jack Turner, Elliot J. Crowley, Michael O'Boyle, Amos Storkey, Gavin Gray

    Abstract: The desire to map neural networks to varying-capacity devices has led to the development of a wealth of compression techniques, many of which involve replacing standard convolutional blocks in a large network with cheap alternative blocks. However, not all blocks are created equally; for a required compute budget there may exist a potent combination of many different cheap blocks, though exhaustiv…

    Submitted 23 January, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: ICLR 2020

  46. arXiv:1906.00859  [pdf, other]

    stat.ML cs.LG

    Separable Layers Enable Structured Efficient Linear Substitutions

    Authors: Gavin Gray, Elliot J. Crowley, Amos Storkey

    Abstract: In response to the development of recent efficient dense layers, this paper shows that something as simple as replacing linear components in pointwise convolutions with structured linear decompositions also produces substantial gains in the efficiency/accuracy tradeoff. Pointwise convolutions are fully connected layers and are thus prepared for replacement by structured transforms. Networks using…

    Submitted 3 June, 2019; originally announced June 2019.

  47. arXiv:1905.10295  [pdf, other]

    cs.LG stat.ML

    Learning to learn via Self-Critique

    Authors: Antreas Antoniou, Amos Storkey

    Abstract: In few-shot learning, a machine learning system learns from a small set of labelled examples relating to a specific task, such that it can generalize to new examples of the same task. Given the limited availability of labelled examples in such tasks, we wish to make use of all the information we can. Usually a model learns task-specific information from a small training-set (support-set) to predic…

    Submitted 30 January, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: Accepted at NeurIPS 2019

  48. arXiv:1905.09768  [pdf, other]

    cs.LG stat.ML

    Zero-shot Knowledge Transfer via Adversarial Belief Matching

    Authors: Paul Micaelli, Amos Storkey

    Abstract: Performing knowledge transfer from a large teacher network to a smaller student is a popular task in modern deep learning applications. However, due to growing dataset sizes and stricter privacy regulations, it is increasingly common not to have access to the data that was used to train the teacher. We propose a novel method which trains a student to match the predictions of its teacher without us…

    Submitted 25 November, 2019; v1 submitted 23 May, 2019; originally announced May 2019.

  49. arXiv:1902.09884  [pdf, other]

    stat.ML cs.LG

    Assume, Augment and Learn: Unsupervised Few-Shot Meta-Learning via Random Labels and Data Augmentation

    Authors: Antreas Antoniou, Amos Storkey

    Abstract: The field of few-shot learning has been laboriously explored in the supervised setting, where per-class labels are available. On the other hand, the unsupervised few-shot learning setting, where no labels of any kind are required, has seen little investigation. We propose a method, named Assume, Augment and Learn or AAL, for generating few-shot tasks using unlabeled data. We randomly label a rando…

    Submitted 5 March, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

    Comments: Work in Progress - Under Review in ICML 2019

  50. arXiv:1811.00410  [pdf, other]

    stat.ML cs.LG

    Dilated DenseNets for Relational Reasoning

    Authors: Antreas Antoniou, Agnieszka Słowik, Elliot J. Crowley, Amos Storkey

    Abstract: Despite their impressive performance in many tasks, deep neural networks often struggle at relational reasoning. This has recently been remedied with the introduction of a plug-in relational module that considers relations between pairs of objects. Unfortunately, this is combinatorially expensive. In this extended abstract, we show that a DenseNet incorporating dilated convolutions excels at relat…

    Submitted 1 November, 2018; originally announced November 2018.

    Comments: Extended Abstract