
Capsule Network Projectors are Equivariant and Invariant Learners

Authors: Miles Everett, Aiden Durrant, Mingjun Zhong, Georgios Leontidis

Abstract: Learning invariant representations has been the longstanding approach to self-supervised learning. However, progress has recently been made in preserving equivariant properties in representations, yet existing approaches do so with highly prescribed architectures. In this work, we propose an invariant-equivariant self-supervised architecture that employs Capsule Networks (CapsNets) which have been shown to capture equivariance with respect to novel viewpoints. We demonstrate that the use of CapsNets in equivariant self-supervised architectures achieves improved downstream performance on equivariant tasks with higher efficiency and fewer network parameters. To accommodate the architectural changes of CapsNets, we introduce a new objective function based on entropy minimisation. This approach, which we name CapsIE (Capsule Invariant Equivariant Network), achieves state-of-the-art performance across all invariant and equivariant downstream tasks on the 3DIEBench dataset, while outperforming supervised baselines. Our results demonstrate the ability of CapsNets to learn complex and generalised representations for large-scale, multi-task datasets compared to previous CapsNet benchmarks. Code is available at https://github.com/AberdeenML/CapsIE.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 15 pages, 7 figures, 9 Tables; code to be released at: https://github.com/AberdeenML/CapsIE

arXiv:2403.06813 [pdf, other]

LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations

Authors: Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong

Abstract: Contrastive instance discrimination outperforms supervised learning in downstream tasks like image classification and object detection. However, this approach heavily relies on data augmentation during representation learning, which may result in inferior results if not properly implemented. Random cropping followed by resizing is a common form of data augmentation used in contrastive learning, but it can lead to degraded representation learning if the two random crops contain distinct semantic content. To address this issue, this paper introduces LeOCLR (Leveraging Original Images for Contrastive Learning of Visual Representations), a framework that employs a new instance discrimination approach and an adapted loss function that ensures the shared region between positive pairs is semantically correct. The experimental results show that our approach consistently improves representation learning across different datasets compared to baseline models. For example, our approach outperforms MoCo-v2 by 5.1% on ImageNet-1K in linear evaluation and several other methods on transfer learning tasks.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 16 pages, 5 figures, 6 tables

arXiv:2403.04724 [pdf, other]

Masked Capsule Autoencoders

Authors: Miles Everett, Mingjun Zhong, Georgios Leontidis

Abstract: We propose Masked Capsule Autoencoders (MCAE), the first Capsule Network that utilises pretraining in a self-supervised manner. Capsule Networks have emerged as a powerful alternative to Convolutional Neural Networks (CNNs), and have shown favourable properties when compared to Vision Transformers (ViT), but have struggled to effectively learn when presented with more complex data, leading to Capsule Network models that do not scale to modern tasks. Our proposed MCAE model alleviates this issue by reformulating the Capsule Network to use masked image modelling as a pretraining stage before finetuning in a supervised manner. Across several experiments and ablation studies we demonstrate that similarly to CNNs and ViTs, Capsule Networks can also benefit from self-supervised pretraining, paving the way for further advancements in this neural network domain. For instance, pretraining on the Imagenette dataset, a dataset of 10 classes of Imagenet-sized images, we achieve not only state-of-the-art results for Capsule Networks but also a 9% improvement compared to purely supervised training. Thus we propose that Capsule Networks benefit from and should be trained within a masked image modelling framework, with a novel capsule decoder, to improve a Capsule Network's performance on realistic-sized images.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 14 pages, 6 figures, 4 tables

arXiv:2307.09944 [pdf, other]

ProtoCaps: A Fast and Non-Iterative Capsule Network Routing Method

Authors: Miles Everett, Mingjun Zhong, Georgios Leontidis

Abstract: Capsule Networks have emerged as a powerful class of deep learning architectures, known for robust performance with relatively few parameters compared to Convolutional Neural Networks (CNNs). However, their inherent efficiency is often overshadowed by their slow, iterative routing mechanisms which establish connections between Capsule layers, posing computational challenges resulting in an inability to scale. In this paper, we introduce a novel, non-iterative routing mechanism, inspired by trainable prototype clustering. This innovative approach aims to mitigate computational complexity, while retaining, if not enhancing, performance efficacy. Furthermore, we harness a shared Capsule subspace, negating the need to project each lower-level Capsule to each higher-level Capsule, thereby significantly reducing memory requisites during training. Our approach demonstrates superior results compared to the current best non-iterative Capsule Network and tests on the Imagewoof dataset, which is too computationally demanding to handle efficiently by iterative approaches. Our findings underscore the potential of our proposed methodology in enhancing the operational efficiency and performance of Capsule Networks, paving the way for their application in increasingly complex computational scenarios. Code is available at https://github.com/mileseverett/ProtoCaps.

Submitted 8 March, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

Comments: 13 pages, 5 figures, 5 tables

Journal ref: TMLR December 2023 (https://openreview.net/pdf?id=Id10mlBjcx)

arXiv:2307.08386 [pdf, other]

doi 10.3390/make5030055

Tabular Machine Learning Methods for Predicting Gas Turbine Emissions

Authors: Rebecca Potts, Rick Hackney, Georgios Leontidis

Abstract: Predicting emissions for gas turbines is critical for monitoring harmful pollutants being released into the atmosphere. In this study, we evaluate the performance of machine learning models for predicting emissions for gas turbines. We compare an existing predictive emissions model, a first principles-based Chemical Kinetics model, against two machine learning models we developed based on SAINT and XGBoost, to demonstrate improved predictive performance of nitrogen oxides (NOx) and carbon monoxide (CO) using machine learning techniques. Our analysis utilises a Siemens Energy gas turbine test bed tabular dataset to train and validate the machine learning models. Additionally, we explore the trade-off between incorporating more features to enhance the model complexity, and the resulting presence of increased missing values in the dataset.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 23 pages, 9 figures, 1 appendix

Journal ref: Machine Learning and Knowledge Extraction 2023

arXiv:2306.16122 [pdf, other]

Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination methods

Authors: Mohammad Alkhalefi, Georgios Leontidis, Mingjun Zhong

Abstract: Self-supervised learning algorithms (SSL) based on instance discrimination have shown promising results, performing competitively or even outperforming supervised learning counterparts in some downstream tasks. Such approaches employ data augmentation to create two views of the same instance (i.e., positive pairs) and encourage the model to learn good representations by attracting these views closer in the embedding space without collapsing to the trivial solution. However, data augmentation is limited in representing positive pairs, and the repulsion process between the instances during contrastive learning may discard important features for instances that have similar categories. To address this issue, we propose an approach to identify those images with similar semantic content and treat them as positive instances, thereby reducing the chance of discarding important features during representation learning and increasing the richness of the latent representation. Our approach is generic and could work with any self-supervised instance discrimination frameworks such as MoCo and SimSiam. To evaluate our method, we run experiments on three benchmark datasets: ImageNet, STL-10 and CIFAR-10 with different instance discrimination SSL approaches. The experimental results show that our approach consistently outperforms the baseline methods across all three datasets; for instance, we improve upon the vanilla MoCo-v2 by 4.1% on ImageNet under a linear evaluation protocol over 800 epochs. We also report results on semi-supervised learning, transfer learning on downstream tasks, and object detection.

Submitted 25 April, 2024; v1 submitted 28 June, 2023; originally announced June 2023.

Comments: 17 pages, 6 figures, 12 tables

Journal ref: TMLR 2024 (https://openreview.net/pdf?id=z5AXLMBWdU)

arXiv:2305.11701 [pdf, other]

S-JEA: Stacked Joint Embedding Architectures for Self-Supervised Visual Representation Learning

Authors: Alžběta Manová, Aiden Durrant, Georgios Leontidis

Abstract: The recent emergence of Self-Supervised Learning (SSL) as a fundamental paradigm for learning image representations has, and continues to, demonstrate high empirical success in a variety of tasks. However, most SSL approaches fail to learn embeddings that capture hierarchical semantic concepts that are separable and interpretable. In this work, we aim to learn highly separable semantic hierarchical representations by stacking Joint Embedding Architectures (JEA) where higher-level JEAs are input with representations of lower-level JEA. This results in a representation space that exhibits distinct sub-categories of semantic concepts (e.g., model and colour of vehicles) in higher-level JEAs. We empirically show that representations from stacked JEA perform on a similar level as traditional JEA with comparative parameter counts and visualise the representation spaces to validate the semantic hierarchies.

Submitted 19 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures, 3 tables

arXiv:2305.11178 [pdf, other]

Vanishing Activations: A Symptom of Deep Capsule Networks

Authors: Miles Everett, Mingjun Zhong, Georgios Leontidis

Abstract: Capsule Networks, an extension to Neural Networks utilizing vector or matrix representations instead of scalars, were initially developed to create a dynamic parse tree where visual concepts evolve from parts to complete objects. Early implementations of Capsule Networks achieved and maintain state-of-the-art results on various datasets. However, recent studies have revealed shortcomings in the original Capsule Network architecture, notably its failure to construct a parse tree and its susceptibility to vanishing gradients when deployed in deeper networks. This paper extends the investigation to a range of leading Capsule Network architectures, demonstrating that these issues are not confined to the original design. We argue that the majority of Capsule Network research has produced architectures that, while modestly divergent from the original Capsule Network, still retain a fundamentally similar structure. We posit that this inherent design similarity might be impeding the scalability of Capsule Networks. Our study contributes to the broader discussion on improving the robustness and scalability of Capsule Networks.

Submitted 13 May, 2023; originally announced May 2023.

Comments: 9 pages, 7 figures

arXiv:2305.10926 [pdf, other]

HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes

Authors: Aiden Durrant, Georgios Leontidis

Abstract: Hyperbolic manifolds for visual representation learning allow for effective learning of semantic class hierarchies by naturally embedding tree-like structures with low distortion within a low-dimensional representation space. The highly separable semantic class hierarchies produced by hyperbolic learning have been shown to be powerful in low-shot tasks; however, their application in self-supervised learning is yet to be explored fully. In this work, we explore the use of hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches. First, we extend the Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space; second, we place prototypes on the ideal boundary of the Poincaré ball. Unlike previous methods, we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic. Empirically we demonstrate the ability of these methods to perform comparatively to Euclidean methods in lower dimensions for linear evaluation tasks, whilst showing improvements in extreme few-shot learning tasks.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2304.09876 [pdf, other]

doi 10.1016/j.eswa.2023.122847

Model Pruning Enables Localized and Efficient Federated Learning for Yield Forecasting and Data Sharing

Authors: Andy Li, Milan Markovic, Peter Edwards, Georgios Leontidis

Abstract: Federated Learning (FL) presents a decentralized approach to model training in the agri-food sector and offers the potential for improved machine learning performance, while ensuring the safety and privacy of individual farms or data silos. However, the conventional FL approach has two major limitations. First, the heterogeneous data on individual silos can cause the global model to perform well for some clients but not all, as the update direction on some clients may hinder others after they are aggregated. Second, it is lacking with respect to the efficiency perspective concerning communication costs during FL and large model sizes. This paper proposes a new technical solution that utilizes network pruning on client models and aggregates the pruned models. This method enables local models to be tailored to their respective data distribution and mitigate the data heterogeneity present in agri-food data. Moreover, it allows for more compact models that consume less data during transmission. We experiment with a soybean yield forecasting dataset and find that this approach can improve inference performance by 15.5% to 20% compared to FedAvg, while reducing local model sizes by up to 84% and the data volume communicated between the clients and the server by 57.1% to 64.7%.

Submitted 19 April, 2023; originally announced April 2023.

Comments: 31 pages, 4 figures, 4 tables

Journal ref: Expert Systems with Applications 2023

arXiv:2304.07764 [pdf, other]

Deep learning universal crater detection using Segment Anything Model (SAM)

Authors: Iraklis Giannakis, Anshuman Bhardwaj, Lydia Sam, Georgios Leontidis

Abstract: Craters are amongst the most important morphological features in planetary exploration. To that extent, detecting, mapping and counting craters is a mainstream process in planetary science, done primarily manually, which is a very laborious and time-consuming process. Recently, machine learning (ML) and computer vision have been successfully applied for both detecting craters and estimating their size. Existing ML approaches for automated crater detection have been trained on specific types of data, e.g. digital elevation model (DEM), images and associated metadata for orbiters such as the Lunar Reconnaissance Orbiter Camera (LROC) etc. Due to that, each of the resulting ML schemes is applicable and reliable only to the type of data used during the training process. Data from different sources, angles and setups can compromise the reliability of these ML schemes. In this paper we present a universal crater detection scheme that is based on the recently proposed Segment Anything Model (SAM) from Meta AI. SAM is a prompt-able segmentation system with zero-shot generalization to unfamiliar objects and images without the need for additional training. Using SAM we can successfully identify crater-looking objects in any type of data (e.g. raw satellite images Level-1 and 2 products, DEMs etc.) for different setups (e.g. Lunar, Mars) and different capturing angles. Moreover, using shape indexes, we only keep the segmentation masks of crater-like features. These masks are subsequently fitted with an ellipse, recovering both the location and the size/geometry of the detected craters.

Submitted 16 April, 2023; originally announced April 2023.

Comments: 11 pages, 7 Figures, preprint of a submitted paper in Icarus (under review)

MSC Class: 86 ACM Class: I.2

arXiv:2211.09027 [pdf, other]

LLEDA -- Lifelong Self-Supervised Domain Adaptation

Authors: Mamatha Thota, Dewei Yi, Georgios Leontidis

Abstract: Humans and animals have the ability to continuously learn new information over their lifetime without losing previously acquired knowledge. However, artificial neural networks struggle with this due to new information conflicting with old knowledge, resulting in catastrophic forgetting. The complementary learning systems (CLS) theory suggests that the interplay between hippocampus and neocortex systems enables long-term and efficient learning in the mammalian brain, with memory replay facilitating the interaction between these two systems to reduce forgetting. The proposed Lifelong Self-Supervised Domain Adaptation (LLEDA) framework draws inspiration from the CLS theory and mimics the interaction between two networks: a DA network inspired by the hippocampus that quickly adjusts to changes in data distribution and an SSL network inspired by the neocortex that gradually learns domain-agnostic general representations. LLEDA's latent replay technique facilitates communication between these two networks by reactivating and replaying the past memory latent representations to stabilise long-term generalisation and retention without interfering with the previously learned information. Extensive experiments demonstrate that the proposed method outperforms several other methods resulting in a long-term adaptation while being less prone to catastrophic forgetting when transferred to new domains.

Submitted 7 August, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

Comments: 19 pages, 6 figures, 6 tables; V2 added more experiments on more domains and fixed typos

arXiv:2211.08177 [pdf, other]

Premonition Net, A Multi-Timeline Transformer Network Architecture Towards Strawberry Tabletop Yield Forecasting

Authors: George Onoufriou, Marc Hanheide, Georgios Leontidis

Abstract: Yield forecasting is a critical first step necessary for yield optimisation, with important consequences for the broader food supply chain, procurement, price-negotiation, logistics, and supply. However, yield forecasting is notoriously difficult, and oft-inaccurate. Premonition Net is a multi-timeline, time sequence ingesting approach towards processing the past, the present, and premonitions of the future. We show how this structure combined with transformers attains critical yield forecasting proficiency towards improving food security, lowering prices, and reducing waste. We find data availability to be a continued difficulty; however, using our premonition network and our own collected data we attain yield forecasts 3 weeks ahead with a testing set RMSE loss of ~0.08 across our latest season.

Submitted 15 November, 2022; originally announced November 2022.

Comments: 9 pages, 6 figures, IEEE two column format style

arXiv:2206.02664 [pdf, other]

Learning with Capsules: A Survey

Authors: Fabio De Sousa Ribeiro, Kevin Duarte, Miles Everett, Georgios Leontidis, Mubarak Shah

Abstract: Capsule networks were proposed as an alternative approach to Convolutional Neural Networks (CNNs) for learning object-centric representations, which can be leveraged for improved generalization and sample complexity. Unlike CNNs, capsule networks are designed to explicitly model part-whole hierarchical relationships by using groups of neurons to encode visual entities, and learn the relationships between those entities. Promising early results achieved by capsule networks have motivated the deep learning community to continue trying to improve their performance and scalability across several application areas. However, a major hurdle for capsule network research has been the lack of a reliable point of reference for understanding their foundational ideas and motivations. The aim of this survey is to provide a comprehensive overview of the capsule network research landscape, which will serve as a valuable resource for the community going forward. To that end, we start with an introduction to the fundamental concepts and motivations behind capsule networks, such as equivariant inference in computer vision. We then cover the technical advances in the capsule routing mechanisms and the various formulations of capsule networks, e.g. generative and geometric. Additionally, we provide a detailed explanation of how capsule networks relate to the popular attention mechanism in Transformers, and highlight non-trivial conceptual similarities between them in the context of representation learning. Afterwards, we explore the extensive applications of capsule networks in computer vision, video and motion, graph representation learning, natural language processing, medical imaging and many others. To conclude, we provide an in-depth discussion regarding the main hurdles in capsule network research, and highlight promising research directions for future work.

Submitted 6 June, 2022; originally announced June 2022.

Comments: 29 pages, 43 figures

arXiv:2110.13638 [pdf, other]

EDLaaS: Fully Homomorphic Encryption Over Neural Network Graphs for Vision and Private Strawberry Yield Forecasting

Authors: George Onoufriou, Marc Hanheide, Georgios Leontidis

Abstract: We present automatically parameterised Fully Homomorphic Encryption (FHE) for encrypted neural network inference and exemplify our inference over FHE compatible neural networks with our own open-source framework and reproducible examples. We use the 4th generation Cheon, Kim, Kim and Song (CKKS) FHE scheme over fixed points provided by the Microsoft Simple Encrypted Arithmetic Library (MS-SEAL). We significantly enhance the usability and applicability of FHE in deep learning contexts, with a focus on the constituent graphs, traversal, and optimisation. We find that FHE is not a panacea for all privacy preserving machine learning (PPML) problems, and that certain limitations still remain, such as model training. However, we also find that in certain contexts FHE is well suited for computing completely private predictions with neural networks. The ability to privately compute sensitive problems more easily, while lowering the barriers to entry, can allow otherwise too-sensitive fields to begin advantaging themselves of performant third-party neural networks. Lastly we show how encrypted deep learning can be applied to a sensitive real world problem in agri-food, i.e. strawberry yield forecasting, demonstrating competitive performance. We argue that the adoption of encrypted deep learning methods at scale could allow for a greater adoption of deep learning methodologies where privacy concerns exist, hence having a large positive potential impact within the agri-food sector and its journey to net zero.

Submitted 18 October, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

Comments: 13 pages, 6 figures, journal

ACM Class: I.2.6; E.3

arXiv:2107.12997 [pdf, other]

Fully Homomorphically Encrypted Deep Learning as a Service

Authors: George Onoufriou, Paul Mayfield, Georgios Leontidis

Abstract: Fully Homomorphic Encryption (FHE) is a relatively recent advancement in the field of privacy-preserving technologies. FHE allows for the arbitrary depth computation of both addition and multiplication, and thus the application of abelian/polynomial equations, like those found in deep learning algorithms. This project investigates, derives, and proves how FHE with deep learning can be used at scale, with relatively low time complexity, the problems that such a system incurs, and mitigations/solutions for such problems. In addition, we discuss how this could have an impact on the future of data privacy and how it can enable data sharing across various actors in the agri-food supply chain, hence allowing the development of machine learning-based systems. Finally, we find that although FHE incurs a high spatial complexity cost, the time complexity is within expected reasonable bounds, while allowing for absolutely private predictions to be made, in our case for milk yield prediction.

Submitted 26 July, 2021; originally announced July 2021.

Comments: 8 pages

arXiv:2105.00925 [pdf, other]

Hyperspherically Regularized Networks for Self-Supervision

Authors: Aiden Durrant, Georgios Leontidis

Abstract: Bootstrap Your Own Latent (BYOL) introduced an approach to self-supervised learning avoiding the contrastive paradigm and subsequently removing the computational burden of negative sampling associated with such methods. However, we empirically find that the image representations produced under BYOL's self-distillation paradigm are poorly distributed in representation space compared to contrastive methods. This work empirically demonstrates that feature diversity enforced by contrastive losses is beneficial to image representation uniformity when employed in BYOL, and as such, provides greater inter-class representation separability. Additionally, we explore and advocate the use of regularization methods, specifically the layer-wise minimization of hyperspherical energy (i.e. maximization of entropy) of network weights to encourage representation uniformity. We show that directly optimizing a measure of uniformity alongside the standard loss, or regularizing the networks of the BYOL architecture to minimize the hyperspherical energy of neurons, can produce more uniformly distributed and therefore better performing representations for downstream tasks.

Submitted 27 March, 2022; v1 submitted 29 April, 2021; originally announced May 2021.

Comments: 11 pages, 8 figures

arXiv:2104.07468 [pdf, other]

doi 10.1016/j.compag.2021.106648

The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector

Authors: Aiden Durrant, Milan Markovic, David Matthews, David May, Jessica Enright, Georgios Leontidis

Abstract: Data sharing remains a major hindering factor when it comes to adopting emerging AI technologies in general, but particularly in the agri-food sector. Protectiveness of data is natural in this setting; data is a precious commodity for data owners, which if used properly can provide them with useful insights on operations and processes leading to a competitive advantage. Unfortunately, novel AI technologies often require large amounts of training data in order to perform well, something that in many scenarios is unrealistic. However, recent machine learning advances, e.g. federated learning and privacy-preserving technologies, can offer a solution to this issue by providing the infrastructure and underpinning technologies needed to use data from various sources to train models without ever sharing the raw data themselves. In this paper, we propose a technical solution based on federated learning that uses decentralized data (i.e. data that are not exchanged or shared but remain with the owners) to develop a cross-silo machine learning model that facilitates data sharing across supply chains. We focus our data sharing proposition on improving production optimization through soybean yield prediction, and provide potential use-cases that such methods can assist in other problem settings.
Our results demonstrate that our approach not only performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, whilst also helping to adopt emerging machine learning technologies to boost productivity.

Submitted 4 May, 2023; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: 23 pages, 5 figures, 7 tables || Version 2 fixed typos etc

Journal ref: Computers and Electronics in Agriculture, 2021

arXiv:2103.15566 [pdf, other]

Contrastive Domain Adaptation

Authors: Mamatha Thota, Georgios Leontidis

Abstract: Recently, contrastive self-supervised learning has become a key component for learning visual representations across many computer vision tasks and benchmarks. However, contrastive learning in the context of domain adaptation remains largely underexplored. In this paper, we propose to extend contrastive learning to a new domain adaptation setting, a particular situation occurring where the similarity is learned and deployed on samples following different probability distributions without access to labels. Contrastive learning learns by comparing and contrasting positive and negative pairs of samples in an unsupervised setting without access to source and target labels. We have developed a variation of a recently proposed contrastive learning framework that helps tackle the domain adaptation problem, further identifying and removing possible negatives similar to the anchor to mitigate the effects of false negatives. Extensive experiments demonstrate that the proposed method adapts well, and improves the performance on the downstream domain adaptation task.

Submitted 26 March, 2021; originally announced March 2021.

Comments: 10 pages, 6 figures, 5 tables

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021, pp. 2209-2218

arXiv:2012.04041 [pdf, other]

An autoencoder wavelet based deep neural network with attention mechanism for multistep prediction of plant growth

Authors: Bashar Alhnaity, Stefanos Kollias, Georgios Leontidis, Shouyong Jiang, Bert Schamp, Simon Pearson

Abstract: Multi-step prediction is considered of major significance for time series analysis in many real life problems. Existing methods mainly focus on one-step-ahead forecasting, since multiple step forecasting generally fails due to accumulation of prediction errors. This paper presents a novel approach for predicting plant growth in agriculture, focusing on prediction of plant Stem Diameter Variations (SDV). The proposed approach consists of three main steps. First, wavelet decomposition is applied to the original data to facilitate model fitting and reduce noise. Then an encoder-decoder framework is developed using Long Short Term Memory (LSTM) and used for appropriate feature extraction from the data. Finally, a recurrent neural network including LSTM and an attention mechanism is proposed for modelling long-term dependencies in the time series data. Experimental results are presented which illustrate the good performance of the proposed approach and show that it significantly outperforms the existing models in terms of error criteria such as RMSE, MAE and MAPE.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2006.09213 [pdf, other]

A Hybrid Natural Language Generation System Integrating Rules and Deep Learning Algorithms

Authors: Wei Wei, Bei Zhou, Georgios Leontidis

Abstract: This paper proposes an enhanced natural language generation system combining the merits of both rule-based approaches and modern deep learning algorithms, boosting its performance to the extent where the generated textual content is capable of exhibiting agile human-writing styles and the content logic of which is highly controllable. We also come up with a novel approach called HMCU to measure the performance of the natural language processing comprehensively and precisely.

Submitted 17 June, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

Comments: 6 pages

ACM Class: I.2.7; I.2.6

arXiv:2004.11123 [pdf]

doi 10.1016/j.jhydrol.2020.125126

Imputation of missing sub-hourly precipitation data in a large sensor network: a machine learning approach

Authors: Benedict Delahaye Chivers, John Wallbank, Steven J. Cole, Ondrej Sebek, Simon Stanley, Matthew Fry, Georgios Leontidis

Abstract: Precipitation data collected at sub-hourly resolution presents specific challenges for missing data recovery by being largely stochastic in nature and highly unbalanced in the duration of rain vs non-rain. Here we present a two-step analysis utilising current machine learning techniques for imputing precipitation data sampled at 30-minute intervals by devolving the task into (a) the classification of rain or non-rain samples, and (b) regressing the absolute values of predicted rain samples. Investigating 37 weather stations in the UK, this machine learning process produces more accurate predictions for recovering precipitation data than an established surface fitting technique utilising neighbouring rain gauges. Increasing available features for the training of machine learning algorithms increases performance, with the integration of weather data at the target site with externally sourced rain gauges providing the highest performance. This method informs machine learning models by utilising information in concurrently collected environmental data to make accurate predictions of missing rain data. Capturing complex non-linear relationships from weakly correlated variables is critical for data recovery at sub-hourly resolutions. Such pipelines for data recovery can be developed and deployed for highly automated and near instantaneous imputation of missing values in ongoing datasets at high temporal resolutions.

Submitted 2 May, 2020; v1 submitted 30 March, 2020; originally announced April 2020.

Comments: 24 pages, 7 figures, 5 tables

Journal ref: Journal of Hydrology 2020

arXiv:2001.10335 [pdf, other]

Multi-Source Deep Domain Adaptation for Quality Control in Retail Food Packaging

Authors: Mamatha Thota, Stefanos Kollias, Mark Swainson, Georgios Leontidis

Abstract: Retail food packaging contains information which informs choice and can be vital to consumer health, including product name, ingredients list, nutritional information, allergens, preparation guidelines, pack weight, storage and shelf life information (use-by / best before dates). The presence and accuracy of such information is critical to ensure a detailed understanding of the product and to reduce the potential for health risks. Consequently, erroneous or illegible labeling has the potential to be highly detrimental to consumers and many other stakeholders in the supply chain. In this paper, a multi-source deep learning-based domain adaptation system is proposed and tested to identify and verify the presence and legibility of use-by date information from food packaging photos taken as part of the validation process as the products pass along the food production line. This was achieved by improving the generalization of the techniques via making use of multi-source datasets in order to extract domain-invariant representations for all domains and aligning distribution of all pairs of source and target domains in a common feature space, along with the class boundaries. The proposed system performed very well in the conducted experiments, for automating the verification process and reducing labeling errors that could otherwise threaten public health and contravene legal requirements for food packaging information and accuracy.
Comprehensive experiments on our food packaging datasets demonstrate that the proposed multi-source deep domain adaptation method significantly improves the classification accuracy and therefore has great potential for application and beneficial impact in food manufacturing control systems.

Submitted 28 January, 2020; originally announced January 2020.

Comments: 8 pages, 3 figures, 7 tables

arXiv:1907.00624 [pdf]

Using Deep Learning to Predict Plant Growth and Yield in Greenhouse Environments

Authors: Bashar Alhnaity, Simon Pearson, Georgios Leontidis, Stefanos Kollias

Abstract: Effective plant growth and yield prediction is an essential task for greenhouse growers and for agriculture in general. Developing models which can effectively model growth and yield can help growers improve the environmental control for better production, match supply and market demand and lower costs. Recent developments in Machine Learning (ML) and, in particular, Deep Learning (DL) can provide powerful new analytical tools. The proposed study utilises ML and DL techniques to predict yield and plant growth variation across two different scenarios, tomato yield forecasting and Ficus benjamina stem growth, in controlled greenhouse environments. We deploy a new deep recurrent neural network (RNN), using the Long Short-Term Memory (LSTM) neuron model, in the prediction formulations. Both the former yield, growth and stem diameter values, as well as the microclimate conditions, are used by the RNN architecture to model the targeted growth parameters. A comparative study is presented, using ML methods, such as support vector regression and random forest regression, utilising the mean square error criterion, in order to evaluate the performance achieved by the different methods. Very promising results, based on data that have been obtained from two greenhouses, in Belgium and the UK, in the framework of the EU Interreg SMARTGREEN project (2017-2021), are presented.

Submitted 1 July, 2019; originally announced July 2019.

Comments: 8 pages, 2 figures, 1 table. arXiv admin note: text overlap with arXiv:1807.11809, arXiv:1707.00666 by other authors

arXiv:1906.01600 [pdf, other]

doi 10.1016/j.compind.2019.103133

Nemesyst: A Hybrid Parallelism Deep Learning-Based Framework Applied for Internet of Things Enabled Food Retailing Refrigeration Systems

Authors: George Onoufriou, Ronald Bickerton, Simon Pearson, Georgios Leontidis

Abstract: Deep Learning has attracted considerable attention across multiple application domains, including computer vision, signal processing and natural language processing. Although quite a few single node deep learning frameworks exist, such as TensorFlow, PyTorch and Keras, we still lack a complete processing structure that can accommodate large scale data processing, version control, and deployment, all while staying agnostic of any specific single node framework. To bridge this gap, this paper proposes a new, higher level framework, i.e. Nemesyst, which uses databases along with model sequentialisation to allow processes to be fed unique and transformed data at the point of need. This facilitates near real-time application and makes models available for further training or use at any node that has access to the database simultaneously. Nemesyst is well suited as an application framework for internet of things aggregated control systems, deploying deep learning techniques to optimise individual machines in massive networks. To demonstrate this framework, we adopted a case study in a novel domain: deploying deep learning to optimise the high speed control of electrical power consumed by a massive internet of things network of retail refrigeration systems in proportion to load available on the UK National Grid (a demand side response).
The case study demonstrated for the first time in such a setting how deep learning models, such as Recurrent Neural Networks (vanilla and Long-Short-Term Memory) and Generative Adversarial Networks paired with Nemesyst, achieve compelling performance, whilst still being malleable to future adjustments as both the data and requirements inevitably change over time.

Submitted 3 September, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

Comments: 25 pages, 13 figures, 4 tables, 2 appendices

Journal ref: Computers in Industry, 2019

arXiv:1905.11455 [pdf, other]

Capsule Routing via Variational Bayes

Authors: Fabio De Sousa Ribeiro, Georgios Leontidis, Stefanos Kollias

Abstract: Capsule networks are a recently proposed type of neural network shown to outperform alternatives in challenging shape recognition tasks. In capsule networks, scalar neurons are replaced with capsule vectors or matrices, whose entries represent different properties of objects. The relationships between objects and their parts are learned via trainable viewpoint-invariant transformation matrices, and the presence of a given object is decided by the level of agreement among votes from its parts. This interaction occurs between capsule layers and is a process called routing-by-agreement. In this paper, we propose a new capsule routing algorithm derived from Variational Bayes for fitting a mixture of transforming Gaussians, and show it is possible to transform our capsule network into a Capsule-VAE. Our Bayesian approach addresses some of the inherent weaknesses of MLE based models, such as variance collapse, by modelling uncertainty over capsule pose parameters. We outperform the state-of-the-art on smallNORB using 50% fewer capsules than previously reported, achieve competitive performances on CIFAR-10, Fashion-MNIST, SVHN, and demonstrate significant improvement in MNIST to affNIST generalisation over previous works.

Submitted 3 December, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

Comments: AAAI 2020 Accepted Paper

arXiv:1812.01681 [pdf, ps, other]

Deep Bayesian Self-Training

Authors: Fabio De Sousa Ribeiro, Francesco Caliva, Mark Swainson, Kjartan Gudmundsson, Georgios Leontidis, Stefanos Kollias

Abstract: Supervised Deep Learning has been highly successful in recent years, achieving state-of-the-art results in most tasks. However, with the ongoing uptake of such methods in industrial applications, the requirement for large amounts of annotated data is often a challenge. In most real world problems, manual annotation is practically intractable due to time/labour constraints, thus the development of automated and adaptive data annotation systems is highly sought after. In this paper, we propose both (i) a Deep Bayesian Self-Training methodology for automatic data annotation, by leveraging predictive uncertainty estimates using variational inference and modern Neural Network architectures, and (ii) a practical adaptation procedure for handling high label variability between different dataset distributions through clustering of Neural Network latent variable representations. An experimental study on both public and private datasets is presented illustrating the superior performance of the proposed approach over standard Self-Training baselines, highlighting the importance of predictive uncertainty estimates in safety-critical domains.

Submitted 17 July, 2019; v1 submitted 26 November, 2018; originally announced December 2018.

Comments: 16 pages, 10 figures, 6 tables

arXiv:1807.10096 [pdf, other]

doi 10.1109/SSCI.2018.8628637

Towards a Deep Unified Framework for Nuclear Reactor Perturbation Analysis

Authors: Fabio De Sousa Ribeiro, Francesco Caliva, Dionysios Chionis, Abdelhamid Dokhane, Antonios Mylonakis, Christophe Demaziere, Georgios Leontidis, Stefanos Kollias

Abstract: In this paper, we take the first steps towards a novel unified framework for the analysis of perturbations in both the Time and Frequency domains. The identification of type and source of such perturbations is fundamental for monitoring reactor cores and guaranteeing safety while running at nominal conditions. A 3D Convolutional Neural Network (3D-CNN) was employed to analyse perturbations happening in the frequency domain, such as an absorber of variable strength or propagating perturbation. Recurrent neural networks (RNN), specifically Long Short-Term Memory (LSTM) networks, were used to study signal sequences related to perturbations induced in the time domain, including the vibrations of fuel assemblies and the fluctuations of thermal-hydraulic parameters at the inlet of the reactor coolant loops. 512-dimensional representations were extracted from the 3D-CNN and LSTM architectures, and used as input to a fused multi-sigmoid classification layer to recognise the perturbation type. If the perturbation is in the frequency domain, a separate fully-connected layer utilises said representations to regress the coordinates of its source. The results showed that the perturbation type can be recognised with high accuracy in all cases, and frequency domain scenario sources can be localised with high precision.

Submitted 8 September, 2018; v1 submitted 26 July, 2018; originally announced July 2018.

Comments: 8 pages, 8 figures, 5 tables; typos corrected, added references, minor alterations, results unchanged

Showing 1–28 of 28 results for author: Leontidis, G