-
Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication
Authors:
Olaf Lipinski,
Adam J. Sobey,
Federico Cerutti,
Timothy J. Norman
Abstract:
Effective communication requires the ability to refer to specific parts of an observation in relation to others. While emergent communication literature shows success in developing various language properties, no research has shown the emergence of such positional references. This paper demonstrates how agents can communicate about spatial relationships within their observations. The results indicate that agents can develop a language capable of expressing the relationships between parts of their observation, achieving over 90% accuracy when trained in a referential game which requires such communication. Using a collocation measure, we demonstrate how the agents create such references. This analysis suggests that agents use a mixture of non-compositional and compositional messages to convey spatial relationships. We also show that the emergent language is interpretable by humans. The translation accuracy is tested by communicating with the receiver agent, where the receiver achieves over 78% accuracy using parts of this lexicon, confirming that the interpretation of the emergent language was successful.
Submitted 11 June, 2024;
originally announced June 2024.
-
Combinatorial Client-Master Multiagent Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing
Authors:
Tesfay Zemuy Gebrekidan,
Sebastian Stein,
Timothy J. Norman
Abstract:
Recently, there has been an explosion of mobile applications that perform computationally intensive tasks such as video streaming, data mining, virtual reality, augmented reality, image processing, video processing, face recognition, and online gaming. However, user devices (UDs), such as tablets and smartphones, have a limited ability to meet the computational needs of these tasks. Mobile edge computing (MEC) has emerged as a promising technology to meet the increasing computing demands of UDs. Task offloading in MEC is a strategy that meets the demands of UDs by distributing tasks between UDs and MEC servers. Deep reinforcement learning (DRL) is gaining attention in task-offloading problems because it can adapt to dynamic changes and minimize online computational complexity. However, the various types of continuous and discrete resource constraints on UDs and MEC servers pose challenges to the design of an efficient DRL-based task-offloading strategy. Existing DRL-based task-offloading algorithms focus on the constraints of the UDs, assuming the availability of enough storage resources on the server. Moreover, existing multiagent DRL (MADRL)-based task-offloading algorithms use homogeneous agents and consider homogeneous constraints as a penalty in their reward function. We propose a novel combinatorial client-master MADRL (CCM_MADRL) algorithm for task offloading in MEC (CCM_MADRL_MEC) that enables UDs to decide their resource requirements and the server to make a combinatorial decision based on the requirements of the UDs. CCM_MADRL_MEC is the first MADRL algorithm in task offloading to consider server storage capacity in addition to the constraints in the UDs. By taking advantage of the combinatorial action selection, CCM_MADRL_MEC has shown superior convergence over existing MADDPG and heuristic algorithms.
Submitted 18 February, 2024;
originally announced February 2024.
-
The Strain of Success: A Predictive Model for Injury Risk Mitigation and Team Success in Soccer
Authors:
Gregory Everett,
Ryan Beal,
Tim Matthews,
Timothy J. Norman,
Sarvapali D. Ramchurn
Abstract:
In this paper, we present a novel sequential team selection model in soccer. Specifically, we model the stochastic process of player injury and unavailability using player-specific information learned from real-world soccer data. Monte-Carlo Tree Search is used to select teams for games that optimise long-term team performance across a soccer season by reasoning over player injury probability. We validate our approach against benchmark solutions for the 2018/19 English Premier League season. Our model achieves similar season expected points to the benchmark whilst reducing first-team injuries by ~13% and the money inefficiently spent on injured players by ~11%, demonstrating the potential to reduce costs and improve player welfare in real-world soccer teams.
Submitted 7 February, 2024;
originally announced February 2024.
-
PartIR: Composing SPMD Partitioning Strategies for Machine Learning
Authors:
Sami Alabed,
Daniel Belov,
Bart Chrzaszcz,
Juliana Franco,
Dominik Grewe,
Dougal Maclaurin,
James Molloy,
Tom Natan,
Tamara Norman,
Xiaoyue Pan,
Adam Paszke,
Norman A. Rink,
Michael Schaarschmidt,
Timur Sitdikov,
Agnieszka Swietlik,
Dimitrios Vytiniotis,
Joel Wee
Abstract:
Training of modern large neural networks (NN) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding. When strategies increase in complexity, it becomes necessary for partitioning tools to be 1) expressive, allowing the composition of simpler strategies, and 2) predictable to estimate performance analytically. We present PartIR, our design for a NN partitioning system. PartIR is focused on an incremental approach to rewriting and is hardware-and-runtime agnostic. We present a simple but powerful API for composing sharding strategies and a simulator to validate them. The process is driven by high-level programmer-issued partitioning tactics, which can be both manual and automatic. Importantly, the tactics are specified separately from the model code, making them easy to change. We evaluate PartIR on several different models to demonstrate its predictability, expressibility, and ability to reach peak performance.
Submitted 3 March, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient
Authors:
Xingzhou Lou,
Junge Zhang,
Timothy J. Norman,
Kaiqi Huang,
Yali Du
Abstract:
Multi-Agent Policy Gradient (MAPG) has made significant progress in recent years. However, centralized critics in state-of-the-art MAPG methods still face the centralized-decentralized mismatch (CDM) issue, which means sub-optimal actions by some agents will affect other agents' policy learning. While using individual critics for policy updates can avoid this issue, they severely limit cooperation among agents. To address this issue, we propose an agent topology framework, which decides whether other agents should be considered in the policy gradient and achieves a compromise between facilitating cooperation and alleviating the CDM issue. The agent topology allows agents to use coalition utility as the learning objective instead of global utility by centralized critics or local utility by individual critics. To constitute the agent topology, various models are studied. We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods. We prove the policy improvement theorem for stochastic TAPE and give a theoretical explanation for the improved cooperation among agents. Experimental results on several benchmarks show that the agent topology is able to facilitate agent cooperation and alleviate the CDM issue, improving the performance of TAPE. Finally, multiple ablation studies and a heuristic graph search algorithm are devised to show the efficacy of the agent topology.
Submitted 15 January, 2024; v1 submitted 25 December, 2023;
originally announced December 2023.
-
It's About Time: Temporal References in Emergent Communication
Authors:
Olaf Lipinski,
Adam J. Sobey,
Federico Cerutti,
Timothy J. Norman
Abstract:
Emergent communication studies the development of language between autonomous agents, aiming to improve understanding of natural language evolution and increase communication efficiency. While temporal aspects of language have been considered in computational linguistics, there has been no research on temporal references in emergent communication. This paper addresses this gap by exploring how agents communicate about temporal relationships. We analyse three potential influences for the emergence of temporal references: environmental, external, and architectural changes. Our experiments demonstrate that altering the loss function is insufficient for temporal references to emerge; rather, architectural changes are necessary. However, a minimal change in agent architecture, using a different batching method, allows the emergence of temporal references. This modified design is compared with the standard architecture in a temporal referential games environment, which emphasises temporal relationships. The analysis indicates that over 95% of the agents with the modified batching method develop temporal references, without changes to their loss function. We consider temporal referencing necessary for future improvements to the agents' communication efficiency, yielding a closer to optimal coding as compared to purely compositional languages. Our readily transferable architectural insights provide the basis for their incorporation into other emergent communication settings.
Submitted 3 May, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
MADDM: Multi-Advisor Dynamic Binary Decision-Making by Maximizing the Utility
Authors:
Zhaori Guo,
Timothy J. Norman,
Enrico H. Gerding
Abstract:
Being able to infer ground truth from the responses of multiple imperfect advisors is a problem of crucial importance in many decision-making applications, such as lending, trading, investment, and crowd-sourcing. In practice, however, gathering answers from a set of advisors has a cost. Therefore, finding an advisor selection strategy that retrieves a reliable answer and maximizes the overall utility is a challenging problem. To address this problem, we propose a novel strategy for optimally selecting a set of advisors in a sequential binary decision-making setting, where multiple decisions need to be made over time. Crucially, we assume no access to ground truth and no prior knowledge about the reliability of advisors. Specifically, our approach considers how to simultaneously (1) select advisors by balancing the advisors' costs and the value of making correct decisions, (2) learn the trustworthiness of advisors dynamically without prior information by asking multiple advisors, and (3) make optimal decisions without access to the ground truth, improving this over time. We evaluate our algorithm through several numerical experiments. The results show that our approach outperforms two other methods that combine state-of-the-art models.
Submitted 15 May, 2023;
originally announced May 2023.
-
PRIME: A Price-Reverting Impact Model of a cryptocurrency Exchange
Authors:
Christopher J. Cho,
Timothy J. Norman,
Manuel Nunes
Abstract:
In a financial exchange, market impact is a measure of the price change of an asset following a transaction. This is an important element of market microstructure, which determines the behaviour of the market following a trade. In this paper, we first provide a discussion on the market impact observed in the BTC/USD Futures market, then we present a novel multi-agent market simulation that can follow an underlying price series, whilst maintaining the ability to reproduce the market impact observed in the market in an explainable manner. This simulation of the financial exchange allows the model to interact realistically with market participants, helping its users better estimate market slippage as well as the knock-on consequences of their market actions. In turn, it allows various stakeholders such as industrial practitioners, governments and regulators to test their market hypotheses, without deploying capital or destabilising the system.
Submitted 12 May, 2023;
originally announced May 2023.
-
Inferring Player Location in Sports Matches: Multi-Agent Spatial Imputation from Limited Observations
Authors:
Gregory Everett,
Ryan J. Beal,
Tim Matthews,
Joseph Early,
Timothy J. Norman,
Sarvapali D. Ramchurn
Abstract:
Understanding agent behaviour in Multi-Agent Systems (MAS) is an important problem in domains such as autonomous driving, disaster response, and sports analytics. Existing MAS problems typically use uniform timesteps with observations for all agents. In this work, we analyse the problem of agent location imputation, specifically posed in environments with non-uniform timesteps and limited agent observability (~95% missing values). Our approach uses Long Short-Term Memory and Graph Neural Network components to learn temporal and inter-agent patterns to predict the location of all agents at every timestep. We apply this to the domain of football (soccer) by imputing the location of all players in a game from sparse event data (e.g., shots and passes). Our model estimates player locations to within ~6.9 m, a ~62% reduction in error from the best-performing baseline. This approach facilitates downstream analysis tasks such as player physical metrics, player coverage, and team pitch control. Existing solutions to these tasks often require optical tracking data, which is expensive to obtain and only available to elite clubs. By imputing player locations from easy-to-obtain event data, we increase the accessibility of downstream tasks.
Submitted 13 February, 2023;
originally announced February 2023.
-
Multi-trainer Interactive Reinforcement Learning System
Authors:
Zhaori Guo,
Timothy J. Norman,
Enrico H. Gerding
Abstract:
Interactive reinforcement learning can effectively facilitate agent training via human feedback. However, such methods often require the human teacher to know the correct action that the agent should take. In other words, if the human teacher is not always reliable, then they will not be consistently able to guide the agent through its training. In this paper, we propose a more effective interactive reinforcement learning system by introducing multiple trainers, namely Multi-Trainer Interactive Reinforcement Learning (MTIRL), which can aggregate the binary feedback from multiple non-perfect trainers into a more reliable reward for agent training in a reward-sparse environment. In particular, our trainer feedback aggregation experiments show that our aggregation method has the best accuracy when compared with majority voting, weighted voting, and the Bayesian method. Finally, we conduct a grid-world experiment to show that the policy trained by MTIRL with the review model is closer to the optimal policy than that without a review model.
Submitted 14 October, 2022;
originally announced October 2022.
-
Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR
Authors:
Sami Alabed,
Dominik Grewe,
Juliana Franco,
Bart Chrzaszcz,
Tom Natan,
Tamara Norman,
Norman A. Rink,
Dimitrios Vytiniotis,
Michael Schaarschmidt
Abstract:
Large neural network models are commonly trained through a combination of advanced parallelism strategies in a single program, multiple data (SPMD) paradigm. For example, training large transformer models requires combining data, model, and pipeline partitioning with optimizer sharding techniques. However, identifying efficient combinations for many model architectures and accelerator systems requires significant manual analysis. In this work, we present an automatic partitioner that identifies these combinations through a goal-oriented search. Our key finding is that a Monte Carlo Tree Search-based partitioner, which integrates partition-specific compiler analysis directly into the search and uses guided goals, matches expert-level strategies for various models.
Submitted 7 October, 2022;
originally announced October 2022.
-
From Intelligent Agents to Trustworthy Human-Centred Multiagent Systems
Authors:
Mohammad Divband Soorati,
Enrico H. Gerding,
Enrico Marchioni,
Pavel Naumov,
Timothy J. Norman,
Sarvapali D. Ramchurn,
Bahar Rastegari,
Adam Sobey,
Sebastian Stein,
Danesh Tarpore,
Vahid Yazdanpanah,
Jie Zhang
Abstract:
The Agents, Interaction and Complexity research group at the University of Southampton has a long track record of research in multiagent systems (MAS). We have made substantial scientific contributions across learning in MAS, game-theoretic techniques for coordinating agent systems, and formal methods for representation and reasoning. We highlight key results achieved by the group and elaborate on recent work and open research challenges in developing trustworthy autonomous systems and deploying human-centred AI systems that aim to support societal good.
Submitted 5 October, 2022;
originally announced October 2022.
-
Learning from the Pros: Extracting Professional Goalkeeper Technique from Broadcast Footage
Authors:
Matthew Wear,
Ryan Beal,
Tim Matthews,
Tim Norman,
Sarvapali Ramchurn
Abstract:
As an amateur goalkeeper playing grassroots soccer, who better to learn from than top professional goalkeepers? In this paper, we harness computer vision and machine learning models to appraise the save technique of professionals in a way those at lower levels can learn from. We train an unsupervised machine learning model using 3D body pose data extracted from broadcast footage to learn professional goalkeeper technique. Then, an "expected saves" model is developed, from which we can identify the optimal goalkeeper technique in different match contexts.
Submitted 22 February, 2022;
originally announced February 2022.
-
Robust Linear Regression for General Feature Distribution
Authors:
Tom Norman,
Nir Weinberger,
Kfir Y. Levy
Abstract:
We investigate robust linear regression where data may be contaminated by an oblivious adversary, i.e., an adversary that may know the data distribution but is otherwise oblivious to the realizations of the data samples. This model has been previously analyzed under strong assumptions. Concretely, $\textbf{(i)}$ all previous works assume that the covariance matrix of the features is positive definite; and $\textbf{(ii)}$ most of them assume that the features are centered (i.e. zero mean). Additionally, all previous works make additional restrictive assumptions, e.g., assuming that the features are Gaussian or that the corruptions are symmetrically distributed.
In this work we go beyond these assumptions and investigate robust regression under a more general set of assumptions: $\textbf{(i)}$ we allow the covariance matrix to be either positive definite or positive semidefinite, $\textbf{(ii)}$ we do not necessarily assume that the features are centered, $\textbf{(iii)}$ we make no further assumption beyond boundedness (sub-Gaussianity) of features and measurement noise.
Under these assumptions we analyze a natural SGD variant for this problem and show that it enjoys a fast convergence rate when the covariance matrix is positive definite. In the positive semidefinite case we show that there are two regimes: if the features are centered we can obtain a standard convergence rate; otherwise the adversary can cause any learner to fail arbitrarily.
Submitted 4 February, 2022;
originally announced February 2022.
-
Automap: Towards Ergonomic Automated Parallelism for ML Models
Authors:
Michael Schaarschmidt,
Dominik Grewe,
Dimitrios Vytiniotis,
Adam Paszke,
Georg Stefan Schmid,
Tamara Norman,
James Molloy,
Jonathan Godwin,
Norman Alexander Rink,
Vinod Nair,
Dan Belov
Abstract:
The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype of an automated partitioner that seamlessly integrates into existing compilers and existing user workflows. Our partitioner enables SPMD-style parallelism that encompasses data parallelism and parameter/activation sharding. Through a combination of inductive tactics and search in a platform-independent partitioning IR, automap can recover expert partitioning strategies such as Megatron sharding for transformer layers.
Submitted 6 December, 2021;
originally announced December 2021.
-
Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning
Authors:
Ningning Xie,
Tamara Norman,
Dominik Grewe,
Dimitrios Vytiniotis
Abstract:
We present a novel characterization of the mapping of multiple parallelism forms (e.g. data and model parallelism) onto hierarchical accelerator systems that is hierarchy-aware and greatly reduces the space of software-to-hardware mapping. We experimentally verify the substantial effect of these mappings on all-reduce performance (up to 448x). We offer a novel syntax-guided program synthesis framework that is able to decompose reductions over one or more parallelism axes to sequences of collectives in a hierarchy- and mapping-aware way. For 69% of parallelism placements and user requested reductions, our framework synthesizes programs that outperform the default all-reduce implementation when evaluated on different GPU hierarchies (max 2.04x, average 1.27x). We complement our synthesis tool with a simulator exceeding 90% top-10 accuracy, which therefore reduces the need for massive evaluations of synthesis results to determine a small set of optimal programs and mappings.
Submitted 16 November, 2021; v1 submitted 20 October, 2021;
originally announced October 2021.
-
Optimising Long-Term Outcomes using Real-World Fluent Objectives: An Application to Football
Authors:
Ryan Beal,
Georgios Chalkiadakis,
Timothy J. Norman,
Sarvapali D. Ramchurn
Abstract:
In this paper, we present a novel approach for optimising long-term tactical and strategic decision-making in football (soccer) by encapsulating events in a league environment across a given time frame. We model the teams' objectives for a season and track how these evolve as games unfold to give a fluent objective that can aid in decision-making games. We develop Markov chain Monte Carlo and deep learning-based algorithms that make use of the fluent objectives in order to learn from prior games and other games in the environment and increase the teams' long-term performance. Simulations of our approach using real-world datasets from 760 matches show that by using optimised tactics with our fluent objective and prior games, we can on average increase teams' mean expected finishing distribution in the league by up to 35.6%.
Submitted 18 February, 2021;
originally announced February 2021.
-
Combining Machine Learning and Human Experts to Predict Match Outcomes in Football: A Baseline Model
Authors:
Ryan Beal,
Stuart E. Middleton,
Timothy J. Norman,
Sarvapali D. Ramchurn
Abstract:
In this paper, we present a new application-focused benchmark dataset and results from a set of baseline Natural Language Processing and Machine Learning models for prediction of match outcomes for games of football (soccer). By doing so we give a baseline for the prediction accuracy that can be achieved exploiting both statistical match data and contextual articles from human sports journalists. Our dataset focuses on a representative time-period over 6 seasons of the English Premier League, and includes newspaper match previews from The Guardian. The models presented in this paper achieve an accuracy of 63.18%, showing a 6.9% boost over the traditional statistical methods.
Submitted 8 December, 2020;
originally announced December 2020.
-
SHACL Satisfiability and Containment (Extended Paper)
Authors:
Paolo Pareti,
George Konstantinidis,
Fabio Mogavero,
Timothy J. Norman
Abstract:
The Shapes Constraint Language (SHACL) is a recent W3C recommendation language for validating RDF data. Specifically, SHACL documents are collections of constraints that enforce particular shapes on an RDF graph. Previous work on the topic has provided theoretical and practical results for the validation problem, but did not consider the standard decision problems of satisfiability and containment, which are crucial for verifying the feasibility of the constraints and important for design and optimization purposes. In this paper, we undertake a thorough study of different features of non-recursive SHACL by providing a translation to a new first-order language, called SCL, that precisely captures the semantics of SHACL w.r.t. satisfiability and containment. We study the interaction of SHACL features in this logic and provide the detailed map of decidability and complexity results of the aforementioned decision problems for different SHACL sublanguages. Notably, we prove that both problems are undecidable for the full language, but we present decidable combinations of interesting features.
Submitted 5 November, 2020; v1 submitted 31 August, 2020;
originally announced September 2020.
-
Acme: A Research Framework for Distributed Reinforcement Learning
Authors:
Matthew W. Hoffman,
Bobak Shahriari,
John Aslanides,
Gabriel Barth-Maron,
Nikola Momchev,
Danila Sinopalnikov,
Piotr Stańczyk,
Sabela Ramos,
Anton Raichuk,
Damien Vincent,
Léonard Hussenot,
Robert Dadashi,
Gabriel Dulac-Arnold,
Manu Orsini,
Alexis Jacq,
Johan Ferret,
Nino Vieillard,
Seyed Kamyar Seyed Ghasemipour,
Sertan Girgin,
Olivier Pietquin,
Feryal Behbahani,
Tamara Norman,
Abbas Abdolmaleki,
Albin Cassirer,
Fan Yang
, et al. (14 additional authors not shown)
Abstract:
Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents that are built using simple, modular components that can be used at various scales of execution. While the primary goal of Acme is to provide a framework for algorithm development, a secondary goal is to provide simple reference implementations of important or state-of-the-art algorithms. These implementations serve both as a validation of our design decisions as well as an important contribution to reproducibility in RL research. In this work we describe the major design decisions made within Acme and give further details as to how its components can be used to implement various algorithms. Our experiments provide baselines for a number of common and state-of-the-art algorithms as well as showing how these algorithms can be scaled up for much larger and more complex environments. This highlights one of the primary advantages of Acme, namely that it can be used to implement large, distributed RL algorithms that can run at massive scales while still maintaining the inherent readability of that implementation.
This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme.
Submitted 20 September, 2022; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Optimising Game Tactics for Football
Authors:
Ryan Beal,
Georgios Chalkiadakis,
Timothy J. Norman,
Sarvapali D. Ramchurn
Abstract:
In this paper we present a novel approach to optimise tactical and strategic decision making in football (soccer). We model the game of football as a multi-stage game which is made up of a Bayesian game to model the pre-match decisions and a stochastic game to model the in-match state transitions and decisions. Using this formulation, we propose a method to predict the probability of game outcomes and the payoffs of team actions. Building upon this, we develop algorithms to optimise team formation and in-game tactics with different objectives. Empirical evaluation of our approach on real-world datasets from 760 matches shows that by using optimised tactics from our Bayesian and stochastic games, we can increase a team's chances of winning by up to 16.1% and 3.4% respectively.
Submitted 23 March, 2020;
originally announced March 2020.
-
A Policy Editor for Semantic Sensor Networks
Authors:
Paolo Pareti,
George Konstantinidis,
Timothy J. Norman
Abstract:
An important use of sensor and actuator networks is to comply with health and safety policies in hazardous environments. In order to deal with increasingly large and dynamic environments, and to quickly react to emergencies, tools are needed to simplify the process of translating high-level policies into executable queries and rules. We present a framework to produce such tools, which uses rules to aggregate low-level sensor data, described using the Semantic Sensor Network Ontology, into more useful and actionable abstractions. Using the schema of the underlying data sources as an input, we automatically generate abstractions which are relevant to the use case at hand. In this demonstration we present a policy editor tool and a simulation on which policies can be tested.
Submitted 15 November, 2019;
originally announced November 2019.
-
SHACL Constraints with Inference Rules
Authors:
Paolo Pareti,
George Konstantinidis,
Timothy J. Norman,
Murat Şensoy
Abstract:
The Shapes Constraint Language (SHACL) has been recently introduced as a W3C recommendation to define constraints that can be validated against RDF graphs. Interactions of SHACL with other Semantic Web technologies, such as ontologies or reasoners, are a matter of ongoing research. In this paper we study the interaction of a subset of SHACL with inference rules expressed in datalog. On the one hand, SHACL constraints can be used to define a "schema" for graph datasets. On the other hand, inference rules can lead to the discovery of new facts that do not match the original schema. Given a set of SHACL constraints and a set of datalog rules, we present a method to detect which constraints could be violated by the application of the inference rules on some graph instance of the schema, and update the original schema, i.e., the set of SHACL constraints, in order to capture the new facts that can be inferred. We provide theoretical and experimental results of the various components of our approach.
Submitted 1 November, 2019;
originally announced November 2019.
-
Rule Applicability on RDF Triplestore Schemas
Authors:
Paolo Pareti,
George Konstantinidis,
Timothy J. Norman,
Murat Şensoy
Abstract:
Rule-based systems play a critical role in health and safety, where policies created by experts are usually formalised as rules. When dealing with increasingly large and dynamic sources of data, as in the case of Internet of Things (IoT) applications, it becomes important not only to efficiently apply rules, but also to reason about their applicability on datasets confined by a certain schema. In this paper we define the notion of a triplestore schema which models a set of RDF graphs. Given a set of rules and such a schema as input we propose a method to determine rule applicability and produce output schemas. Output schemas model the graphs that would be obtained by running the rules on the graph models of the input schema. We present two approaches: one based on computing a canonical (critical) instance of the schema, and a novel approach based on query rewriting. We provide theoretical, complexity and evaluation results that show the superior efficiency of our rewriting approach.
Submitted 2 July, 2019;
originally announced July 2019.
-
On Natural Language Generation of Formal Argumentation
Authors:
Federico Cerutti,
Alice Toniolo,
Timothy J. Norman
Abstract:
In this paper we provide a first analysis of the research questions that arise when dealing with the problem of communicating pieces of formal argumentation through natural language interfaces. It is a generally held opinion that formal models of argumentation naturally capture human argument, and some preliminary studies have focused on justifying this view. Unfortunately, the results are not only inconclusive, but seem to suggest that explaining formal argumentation to humans is a rather articulated task. Graphical models for expressing argumentation-based reasoning are appealing, but often humans require significant training to use these tools effectively. We claim that natural language interfaces to formal argumentation systems offer a real alternative, and may be the way forward for systems that capture human argument.
Submitted 13 June, 2017;
originally announced June 2017.
-
Reducing overfitting in challenge-based competitions
Authors:
Elias Chaibub Neto,
Bruce R Hoff,
Chris Bare,
Brian M Bot,
Thomas Yu,
Lara Magravite,
Andrew D Trister,
Thea Norman,
Pablo Meyer,
Julio Saez-Rodrigues,
James C Costello,
Justin Guinney,
Gustavo Stolovitzky
Abstract:
Over-fitting is a dreaded foe in challenge-based competitions. Because participants rely on public leaderboards to evaluate and refine their models, there is always the danger they might over-fit to the holdout data supporting the leaderboard. The recently published Ladder algorithm aims to address this problem by preventing the participants from exploiting, willingly or inadvertently, minor fluctuations in public leaderboard scores during model refinement. In this paper, we report a vulnerability of the Ladder that induces severe over-fitting of the leaderboard when the sample size is small. To circumvent this attack, we propose a variation of the Ladder that releases a bootstrapped estimate of the public leaderboard score instead of providing participants with a direct measure of performance. We also extend the scope of the Ladder to arbitrary performance metrics by relying on a more broadly applicable testing procedure based on the Bayesian bootstrap. Our method makes it possible to use a leaderboard, with the technical and social advantages that it provides, even in cases where data is scant.
Submitted 30 June, 2016;
originally announced July 2016.
-
Approachability in Population Games
Authors:
Dario Bauso,
Thomas W L Norman
Abstract:
This paper reframes approachability theory within the context of population games. Thus, whilst one player aims at driving her average payoff to a predefined set, her opponent is not malevolent but rather extracted randomly from a population of individuals with given distribution on actions. First, convergence conditions are revisited based on the common prior on the population distribution, and we define the notion of 1st-moment approachability. Second, we develop a model of two coupled partial differential equations (PDEs) in the spirit of mean-field game theory: one describing the best-response of every player given the population distribution (this is a Hamilton-Jacobi-Bellman equation), the other capturing the macroscopic evolution of average payoffs if every player plays its best response (this is an advection equation). Third, we provide a detailed analysis of existence, nonuniqueness, and stability of equilibria (fixed points of the two PDEs). Fourth, we apply the model to regret-based dynamics, and use it to establish convergence to Bayesian equilibrium under incomplete information.
Submitted 15 July, 2014;
originally announced July 2014.
-
Subjective Logic Operators in Trust Assessment: an Empirical Study
Authors:
Federico Cerutti,
Alice Toniolo,
Nir Oren,
Timothy J. Norman
Abstract:
Computational trust mechanisms aim to produce trust ratings from both direct and indirect information about agents' behaviour. Subjective Logic (SL) has been widely adopted as the core of such systems via its fusion and discount operators. In recent research we revisited the semantics of these operators to explore an alternative, geometric interpretation. In this paper we present principled desiderata for discounting and fusion operators in SL. Building upon this we present operators that satisfy these desirable properties, including a family of discount operators. We then show, through a rigorous empirical study, that specific, geometrically interpreted operators significantly outperform standard SL operators in estimating ground truth. These novel operators offer real advantages for computational models of trust and reputation, in which they may be employed without modifying other aspects of an existing system.
Submitted 19 November, 2013;
originally announced December 2013.
-
Context-dependent Trust Decisions with Subjective Logic
Authors:
Federico Cerutti,
Alice Toniolo,
Nir Oren,
Timothy J. Norman
Abstract:
A decision procedure implemented over a computational trust mechanism aims to allow for decisions to be made regarding whether some entity or information should be trusted. As recognised in the literature, trust is contextual, and we describe how such a context often translates into a confidence level which should be used to modify an underlying trust value. Jøsang's Subjective Logic has long been used in the trust domain, and we show that its operators are insufficient to address this problem. We therefore provide a decision-making approach about trust which also considers the notion of confidence (based on context) through the introduction of a new operator. In particular, we introduce general requirements that must be respected when combining trustworthiness and confidence degree, and demonstrate the soundness of our new operator with respect to these properties.
Submitted 19 September, 2013;
originally announced September 2013.