-
Capsule Network Projectors are Equivariant and Invariant Learners
Authors:
Miles Everett,
Aiden Durrant,
Mingjun Zhong,
Georgios Leontidis
Abstract:
Learning invariant representations has been the longstanding approach to self-supervised learning. Recently, however, progress has been made in preserving equivariant properties in representations, although existing methods do so with highly prescribed architectures. In this work, we propose an invariant-equivariant self-supervised architecture that employs Capsule Networks (CapsNets), which have been shown to capture equivariance with respect to novel viewpoints. We demonstrate that the use of CapsNets in equivariant self-supervised architectures achieves improved downstream performance on equivariant tasks with higher efficiency and fewer network parameters. To accommodate the architectural changes of CapsNets, we introduce a new objective function based on entropy minimisation. This approach, which we name CapsIE (Capsule Invariant Equivariant Network), achieves state-of-the-art performance across all invariant and equivariant downstream tasks on the 3DIEBench dataset, while outperforming supervised baselines. Our results demonstrate the ability of CapsNets to learn complex and generalised representations for large-scale, multi-task datasets compared to previous CapsNet benchmarks. Code is available at https://github.com/AberdeenML/CapsIE.
Submitted 23 May, 2024;
originally announced May 2024.
-
S-JEA: Stacked Joint Embedding Architectures for Self-Supervised Visual Representation Learning
Authors:
Alžběta Manová,
Aiden Durrant,
Georgios Leontidis
Abstract:
The recent emergence of Self-Supervised Learning (SSL) as a fundamental paradigm for learning image representations has demonstrated, and continues to demonstrate, high empirical success in a variety of tasks. However, most SSL approaches fail to learn embeddings that capture hierarchical semantic concepts that are separable and interpretable. In this work, we aim to learn highly separable semantic hierarchical representations by stacking Joint Embedding Architectures (JEAs), where higher-level JEAs take as input the representations of lower-level JEAs. This results in a representation space that exhibits distinct sub-categories of semantic concepts (e.g., model and colour of vehicles) in higher-level JEAs. We empirically show that representations from stacked JEAs perform on a similar level to traditional JEAs with comparable parameter counts, and we visualise the representation spaces to validate the semantic hierarchies.
Submitted 19 May, 2023;
originally announced May 2023.
-
HMSN: Hyperbolic Self-Supervised Learning by Clustering with Ideal Prototypes
Authors:
Aiden Durrant,
Georgios Leontidis
Abstract:
Hyperbolic manifolds for visual representation learning allow for effective learning of semantic class hierarchies by naturally embedding tree-like structures with low distortion within a low-dimensional representation space. The highly separable semantic class hierarchies produced by hyperbolic learning have been shown to be powerful in low-shot tasks; however, their application in self-supervised learning is yet to be explored fully. In this work, we explore the use of hyperbolic representation space for self-supervised representation learning for prototype-based clustering approaches. First, we extend Masked Siamese Networks to operate on the Poincaré ball model of hyperbolic space. Second, we place prototypes on the ideal boundary of the Poincaré ball. Unlike previous methods, we project to the hyperbolic space at the output of the encoder network and utilise a hyperbolic projection head to ensure that the representations used for downstream tasks remain hyperbolic. Empirically, we demonstrate the ability of these methods to perform comparably to Euclidean methods in lower dimensions for linear evaluation tasks, whilst showing improvements in extreme few-shot learning tasks.
Submitted 18 May, 2023;
originally announced May 2023.
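To make the HMSN abstract's projection step concrete, the following is a minimal sketch of mapping a Euclidean encoder output onto the Poincaré ball and measuring distances there. It assumes unit curvature (c = 1) and uses the standard exponential map at the origin and the standard Poincaré distance; the function names `exp_map_origin` and `poincare_distance` are illustrative and not taken from the paper, whose implementation may differ.

```python
import math


def exp_map_origin(v):
    """Map a Euclidean vector onto the open unit Poincare ball (assumed curvature c = 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0 for _ in v]
    # tanh squashes the norm into [0, 1), so the result stays strictly inside the ball.
    scale = math.tanh(norm) / norm
    return [scale * x for x in v]


def poincare_distance(x, y):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_x) * (1.0 - sq_y)))


embedding = exp_map_origin([0.3, 0.4])
distance_from_origin = poincare_distance([0.0, 0.0], embedding)
```

Under these assumptions, a vector of Euclidean norm r lands at hyperbolic distance 2r from the origin, so points near the boundary of the ball are very far apart in hyperbolic terms.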
-
A systematic review of physical-digital play technology and developmentally relevant child behaviour
Authors:
Pablo E. Torres,
Philip I. N. Ulrich,
Veronica Cucuiat,
Mutlu Cukurova,
Maria Fercovic De la Presa,
Rose Luckin,
Amanda Carr,
Thomas Dylan,
Abigail Durrant,
John Vines,
Shaun Lawson
Abstract:
New interactive physical-digital play technologies are shaping the way children play. These technologies refer to digital play technologies that engage children in analogue forms of behaviour, either alone or with others. Current interactive physical-digital play technologies include robots, digital agents, mixed or augmented reality devices, and smart-eye based gaming. Little is known, however, about the ways in which these technologies could promote or damage child development. This systematic review was aimed at understanding if and how these physical-digital play technologies promoted developmentally relevant behaviour in typically developing 0 to 12 year-olds. Psychology, Education, and Computer Science databases were searched, producing 635 papers. A total of 31 papers met the inclusion criteria, of which 17 were of high enough quality to be included for synthesis. Results indicate that these new interactive play technologies could have a positive effect on children's developmentally relevant behaviour. The review indicated specific ways in which different behaviours were promoted. Providing information about own performance promoted self-monitoring. Slowing interactivity, play interdependency, and joint object accessibility promoted collaboration. Offering delimited choices promoted decision making. Problem solving and physical activity were promoted by requiring children to engage in them to keep playing. Four principles underpinned the ways in which physical-digital play technologies afforded child behaviour. These included social expectations framing play situations, the directiveness of action regulations (inviting, guiding or forcing behaviours), the technical features of play technologies (digital play mechanics and physical characteristics), and the alignment between play goals, play technology and the play behaviours promoted.
Submitted 10 February, 2022; v1 submitted 22 May, 2021;
originally announced May 2021.
-
Hyperspherically Regularized Networks for Self-Supervision
Authors:
Aiden Durrant,
Georgios Leontidis
Abstract:
Bootstrap Your Own Latent (BYOL) introduced an approach to self-supervised learning avoiding the contrastive paradigm and subsequently removing the computational burden of negative sampling associated with such methods. However, we empirically find that the image representations produced under BYOL's self-distillation paradigm are poorly distributed in representation space compared to contrastive methods. This work empirically demonstrates that feature diversity enforced by contrastive losses is beneficial to image representation uniformity when employed in BYOL, and as such, provides greater inter-class representation separability. Additionally, we explore and advocate the use of regularization methods, specifically the layer-wise minimization of hyperspherical energy (i.e. maximization of entropy) of network weights, to encourage representation uniformity. We show that directly optimizing a measure of uniformity alongside the standard loss, or regularizing the networks of the BYOL architecture to minimize the hyperspherical energy of neurons, can produce more uniformly distributed and therefore better-performing representations for downstream tasks.
Submitted 27 March, 2022; v1 submitted 29 April, 2021;
originally announced May 2021.
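The hyperspherical energy mentioned in the abstract above can be illustrated with a small sketch. It assumes the common Riesz s-energy form, summing inverse pairwise distances between unit-normalised weight vectors of one layer; the exact formulation and hyperparameters used in the paper may differ, and `hyperspherical_energy`, `s` and `eps` are illustrative names.

```python
import math


def normalize(v):
    """Project a non-zero weight vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def hyperspherical_energy(weights, s=1.0, eps=1e-8):
    """Sum of inverse pairwise distances (Riesz s-energy) over unit-normalised rows.

    Lower energy means the neurons are spread more uniformly on the sphere.
    """
    unit = [normalize(w) for w in weights]
    energy = 0.0
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            dist = math.dist(unit[i], unit[j])
            # eps avoids division by zero for (near-)identical directions.
            energy += 1.0 / (dist ** s + eps)
    return energy


spread = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
clustered = [[1.0, 0.0], [1.0, 0.1], [1.0, -0.1], [1.0, 0.05]]
spread_energy = hyperspherical_energy(spread)
clustered_energy = hyperspherical_energy(clustered)
```

In this toy example the clustered directions give a much higher energy than the evenly spread ones, which is the quantity a regulariser of this kind would penalise.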
-
The Role of Cross-Silo Federated Learning in Facilitating Data Sharing in the Agri-Food Sector
Authors:
Aiden Durrant,
Milan Markovic,
David Matthews,
David May,
Jessica Enright,
Georgios Leontidis
Abstract:
Data sharing remains a major hindering factor when it comes to adopting emerging AI technologies in general, but particularly in the agri-food sector. Protectiveness of data is natural in this setting; data is a precious commodity for data owners, which if used properly can provide them with useful insights on operations and processes leading to a competitive advantage. Unfortunately, novel AI technologies often require large amounts of training data in order to perform well, something that in many scenarios is unrealistic. However, recent machine learning advances, e.g. federated learning and privacy-preserving technologies, can offer a solution to this issue by providing the infrastructure and underpinning technologies needed to use data from various sources to train models without ever sharing the raw data themselves. In this paper, we propose a technical solution based on federated learning that uses decentralized data (i.e. data that are not exchanged or shared but remain with the owners) to develop a cross-silo machine learning model that facilitates data sharing across supply chains. We focus our data sharing proposition on improving production optimization through soybean yield prediction, and provide potential use-cases in which such methods can assist in other problem settings. Our results demonstrate that our approach not only performs better than each of the models trained on an individual data source, but also that data sharing in the agri-food sector can be enabled via alternatives to data exchange, whilst also helping to adopt emerging machine learning technologies to boost productivity.
Submitted 4 May, 2023; v1 submitted 14 April, 2021;
originally announced April 2021.
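The idea of training across silos without exchanging raw data can be sketched with a simple aggregation step. The abstract does not state which aggregation rule is used, so this example assumes a FedAvg-style weighted average of locally trained parameters; `fed_avg` and the toy numbers are hypothetical.

```python
def fed_avg(client_params, client_sizes):
    """Average per-client parameter vectors, weighted by local dataset size.

    Only model parameters are combined; the raw data never leaves each silo.
    Assumes a FedAvg-style rule, which the abstract does not confirm.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    averaged = [0.0] * n_params
    for params, size in zip(client_params, client_sizes):
        weight = size / total
        for k in range(n_params):
            averaged[k] += weight * params[k]
    return averaged


# Two hypothetical silos with locally trained parameters and differing data volumes.
global_params = fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by dataset size means a silo holding more records has proportionally more influence on the shared model, which is one common design choice among several.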