Search | arXiv e-print repository

Speaking Your Language: Spatial Relationships in Interpretable Emergent Communication

Authors: Olaf Lipinski, Adam J. Sobey, Federico Cerutti, Timothy J. Norman

Abstract: Effective communication requires the ability to refer to specific parts of an observation in relation to others. While emergent communication literature shows success in develo** various language properties, no research has shown the emergence of such positional references. This paper demonstrates how agents can communicate about spatial relationships within their observations. The results indic… ▽ More Effective communication requires the ability to refer to specific parts of an observation in relation to others. While emergent communication literature shows success in develo** various language properties, no research has shown the emergence of such positional references. This paper demonstrates how agents can communicate about spatial relationships within their observations. The results indicate that agents can develop a language capable of expressing the relationships between parts of their observation, achieving over 90% accuracy when trained in a referential game which requires such communication. Using a collocation measure, we demonstrate how the agents create such references. This analysis suggests that agents use a mixture of non-compositional and compositional messages to convey spatial relationships. We also show that the emergent language is interpretable by humans. The translation accuracy is tested by communicating with the receiver agent, where the receiver achieves over 78% accuracy using parts of this lexicon, confirming that the interpretation of the emergent language was successful. △ Less

Submitted 11 June, 2024; originally announced June 2024.

Comments: 16 pages, 3 figures

arXiv:2402.11653 [pdf, other]

Combinatorial Client-Master Multiagent Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing

Authors: Tesfay Zemuy Gebrekidan, Sebastian Stein, Timothy J. Norman

Abstract: Recently, there has been an explosion of mobile applications that perform computationally intensive tasks such as video streaming, data mining, virtual reality, augmented reality, image processing, video processing, face recognition, and online gaming. However, user devices (UDs), such as tablets and smartphones, have a limited ability to perform the computation needs of the tasks. Mobile edge com… ▽ More Recently, there has been an explosion of mobile applications that perform computationally intensive tasks such as video streaming, data mining, virtual reality, augmented reality, image processing, video processing, face recognition, and online gaming. However, user devices (UDs), such as tablets and smartphones, have a limited ability to perform the computation needs of the tasks. Mobile edge computing (MEC) has emerged as a promising technology to meet the increasing computing demands of UDs. Task offloading in MEC is a strategy that meets the demands of UDs by distributing tasks between UDs and MEC servers. Deep reinforcement learning (DRL) is gaining attention in task-offloading problems because it can adapt to dynamic changes and minimize online computational complexity. However, the various types of continuous and discrete resource constraints on UDs and MEC servers pose challenges to the design of an efficient DRL-based task-offloading strategy. Existing DRL-based task-offloading algorithms focus on the constraints of the UDs, assuming the availability of enough storage resources on the server. Moreover, existing multiagent DRL (MADRL)--based task-offloading algorithms are homogeneous agents and consider homogeneous constraints as a penalty in their reward function. We proposed a novel combinatorial client-master MADRL (CCM\_MADRL) algorithm for task offloading in MEC (CCM\_MADRL\_MEC) that enables UDs to decide their resource requirements and the server to make a combinatorial decision based on the requirements of the UDs. CCM\_MADRL\_MEC is the first MADRL in task offloading to consider server storage capacity in addition to the constraints in the UDs. By taking advantage of the combinatorial action selection, CCM\_MADRL\_MEC has shown superior convergence over existing MADDPG and heuristic algorithms. △ Less

Submitted 18 February, 2024; originally announced February 2024.

Comments: 11 pages, 5 figures, 2 tables

ACM Class: I.2.11

arXiv:2402.04898 [pdf]

The Strain of Success: A Predictive Model for Injury Risk Mitigation and Team Success in Soccer

Authors: Gregory Everett, Ryan Beal, Tim Matthews, Timothy J. Norman, Sarvapali D. Ramchurn

Abstract: In this paper, we present a novel sequential team selection model in soccer. Specifically, we model the stochastic process of player injury and unavailability using player-specific information learned from real-world soccer data. Monte-Carlo Tree Search is used to select teams for games that optimise long-term team performance across a soccer season by reasoning over player injury probability. We… ▽ More In this paper, we present a novel sequential team selection model in soccer. Specifically, we model the stochastic process of player injury and unavailability using player-specific information learned from real-world soccer data. Monte-Carlo Tree Search is used to select teams for games that optimise long-term team performance across a soccer season by reasoning over player injury probability. We validate our approach compared to benchmark solutions for the 2018/19 English Premier League season. Our model achieves similar season expected points to the benchmark whilst reducing first-team injuries by ~13% and the money inefficiently spent on injured players by ~11% - demonstrating the potential to reduce costs and improve player welfare in real-world soccer teams. △ Less

Submitted 7 February, 2024; originally announced February 2024.

Comments: 19 pages (16 main, 2 references, 1 appendix), 10 figures (9 main, 1 appendix). Accepted at the MIT Sloan Sports Analytics Conference 2024 Research Paper Competition

arXiv:2401.11202 [pdf, other]

PartIR: Composing SPMD Partitioning Strategies for Machine Learning

Authors: Sami Alabed, Daniel Belov, Bart Chrzaszcz, Juliana Franco, Dominik Grewe, Dougal Maclaurin, James Molloy, Tom Natan, Tamara Norman, Xiaoyue Pan, Adam Paszke, Norman A. Rink, Michael Schaarschmidt, Timur Sitdikov, Agnieszka Swietlik, Dimitrios Vytiniotis, Joel Wee

Abstract: Training of modern large neural networks (NN) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding. When strategies increase in complexity, it becomes necessary for partitioning tools to be 1) expressive, allowing the composition of simpler strategies, and 2) predictable to estimate performance analytically. We present PartIR, our design for a NN par… ▽ More Training of modern large neural networks (NN) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding. When strategies increase in complexity, it becomes necessary for partitioning tools to be 1) expressive, allowing the composition of simpler strategies, and 2) predictable to estimate performance analytically. We present PartIR, our design for a NN partitioning system. PartIR is focused on an incremental approach to rewriting and is hardware-and-runtime agnostic. We present a simple but powerful API for composing sharding strategies and a simulator to validate them. The process is driven by high-level programmer-issued partitioning tactics, which can be both manual and automatic. Importantly, the tactics are specified separately from the model code, making them easy to change. We evaluate PartIR on several different models to demonstrate its predictability, expressibility, and ability to reach peak performance.. △ Less

Submitted 3 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

arXiv:2312.15667 [pdf, other]

TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient

Authors: Xingzhou Lou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, Yali Du

Abstract: Multi-Agent Policy Gradient (MAPG) has made significant progress in recent years. However, centralized critics in state-of-the-art MAPG methods still face the centralized-decentralized mismatch (CDM) issue, which means sub-optimal actions by some agents will affect other agent's policy learning. While using individual critics for policy updates can avoid this issue, they severely limit cooperation… ▽ More Multi-Agent Policy Gradient (MAPG) has made significant progress in recent years. However, centralized critics in state-of-the-art MAPG methods still face the centralized-decentralized mismatch (CDM) issue, which means sub-optimal actions by some agents will affect other agent's policy learning. While using individual critics for policy updates can avoid this issue, they severely limit cooperation among agents. To address this issue, we propose an agent topology framework, which decides whether other agents should be considered in policy gradient and achieves compromise between facilitating cooperation and alleviating the CDM issue. The agent topology allows agents to use coalition utility as learning objective instead of global utility by centralized critics or local utility by individual critics. To constitute the agent topology, various models are studied. We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods. We prove the policy improvement theorem for stochastic TAPE and give a theoretical explanation for the improved cooperation among agents. Experiment results on several benchmarks show the agent topology is able to facilitate agent cooperation and alleviate CDM issue respectively to improve performance of TAPE. Finally, multiple ablation studies and a heuristic graph search algorithm are devised to show the efficacy of the agent topology. △ Less

Submitted 15 January, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

arXiv:2310.06555 [pdf, other]

It's About Time: Temporal References in Emergent Communication

Authors: Olaf Lipinski, Adam J. Sobey, Federico Cerutti, Timothy J. Norman

Abstract: Emergent communication studies the development of language between autonomous agents, aiming to improve understanding of natural language evolution and increase communication efficiency. While temporal aspects of language have been considered in computational linguistics, there has been no research on temporal references in emergent communication. This paper addresses this gap, by exploring how ag… ▽ More Emergent communication studies the development of language between autonomous agents, aiming to improve understanding of natural language evolution and increase communication efficiency. While temporal aspects of language have been considered in computational linguistics, there has been no research on temporal references in emergent communication. This paper addresses this gap, by exploring how agents communicate about temporal relationships. We analyse three potential influences for the emergence of temporal references: environmental, external, and architectural changes. Our experiments demonstrate that altering the loss function is insufficient for temporal references to emerge; rather, architectural changes are necessary. However, a minimal change in agent architecture, using a different batching method, allows the emergence of temporal references. This modified design is compared with the standard architecture in a temporal referential games environment, which emphasises temporal relationships. The analysis indicates that over 95\% of the agents with the modified batching method develop temporal references, without changes to their loss function. We consider temporal referencing necessary for future improvements to the agents' communication efficiency, yielding a closer to optimal coding as compared to purely compositional languages. Our readily transferable architectural insights provide the basis for their incorporation into other emergent communication settings. △ Less

Submitted 3 May, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: 26 pages main body and 36 pages supplementary material, 8 figures in main body. Code available at https://github.com/olipinski/TRG

arXiv:2305.08664 [pdf, other]

MADDM: Multi-Advisor Dynamic Binary Decision-Making by Maximizing the Utility

Authors: Zhaori Guo, Timothy J. Norman, Enrico H. Gerding

Abstract: Being able to infer ground truth from the responses of multiple imperfect advisors is a problem of crucial importance in many decision-making applications, such as lending, trading, investment, and crowd-sourcing. In practice, however, gathering answers from a set of advisors has a cost. Therefore, finding an advisor selection strategy that retrieves a reliable answer and maximizes the overall uti… ▽ More Being able to infer ground truth from the responses of multiple imperfect advisors is a problem of crucial importance in many decision-making applications, such as lending, trading, investment, and crowd-sourcing. In practice, however, gathering answers from a set of advisors has a cost. Therefore, finding an advisor selection strategy that retrieves a reliable answer and maximizes the overall utility is a challenging problem. To address this problem, we propose a novel strategy for optimally selecting a set of advisers in a sequential binary decision-making setting, where multiple decisions need to be made over time. Crucially, we assume no access to ground truth and no prior knowledge about the reliability of advisers. Specifically, our approach considers how to simultaneously (1) select advisors by balancing the advisors' costs and the value of making correct decisions, (2) learn the trustworthiness of advisers dynamically without prior information by asking multiple advisers, and (3) make optimal decisions without access to the ground truth, improving this over time. We evaluate our algorithm through several numerical experiments. The results show that our approach outperforms two other methods that combine state-of-the-art models. △ Less

Submitted 15 May, 2023; originally announced May 2023.

arXiv:2305.07559 [pdf, other]

PRIME: A Price-Reverting Impact Model of a cryptocurrency Exchange

Authors: Christopher J. Cho, Timothy J. Norman, Manuel Nunes

Abstract: In a financial exchange, market impact is a measure of the price change of an asset following a transaction. This is an important element of market microstructure, which determines the behaviour of the market following a trade. In this paper, we first provide a discussion on the market impact observed in the BTC/USD Futures market, then we present a novel multi-agent market simulation that can fol… ▽ More In a financial exchange, market impact is a measure of the price change of an asset following a transaction. This is an important element of market microstructure, which determines the behaviour of the market following a trade. In this paper, we first provide a discussion on the market impact observed in the BTC/USD Futures market, then we present a novel multi-agent market simulation that can follow an underlying price series, whilst maintaining the ability to reproduce the market impact observed in the market in an explainable manner. This simulation of the financial exchange allows the model to interact realistically with market participants, hel** its users better estimate market slippage as well as the knock-on consequences of their market actions. In turn, it allows various stakeholders such as industrial practitioners, governments and regulators to test their market hypotheses, without deploying capital or destabilising the system. △ Less

Submitted 12 May, 2023; originally announced May 2023.

Comments: Pre-print for the Cryptocurrency Research Conference 2023

arXiv:2302.06569 [pdf, other]

Inferring Player Location in Sports Matches: Multi-Agent Spatial Imputation from Limited Observations

Authors: Gregory Everett, Ryan J. Beal, Tim Matthews, Joseph Early, Timothy J. Norman, Sarvapali D. Ramchurn

Abstract: Understanding agent behaviour in Multi-Agent Systems (MAS) is an important problem in domains such as autonomous driving, disaster response, and sports analytics. Existing MAS problems typically use uniform timesteps with observations for all agents. In this work, we analyse the problem of agent location imputation, specifically posed in environments with non-uniform timesteps and limited agent ob… ▽ More Understanding agent behaviour in Multi-Agent Systems (MAS) is an important problem in domains such as autonomous driving, disaster response, and sports analytics. Existing MAS problems typically use uniform timesteps with observations for all agents. In this work, we analyse the problem of agent location imputation, specifically posed in environments with non-uniform timesteps and limited agent observability (~95% missing values). Our approach uses Long Short-Term Memory and Graph Neural Network components to learn temporal and inter-agent patterns to predict the location of all agents at every timestep. We apply this to the domain of football (soccer) by imputing the location of all players in a game from sparse event data (e.g., shots and passes). Our model estimates player locations to within ~6.9m; a ~62% reduction in error from the best performing baseline. This approach facilitates downstream analysis tasks such as player physical metrics, player coverage, and team pitch control. Existing solutions to these tasks often require optical tracking data, which is expensive to obtain and only available to elite clubs. By imputing player locations from easy to obtain event data, we increase the accessibility of downstream tasks. △ Less

Submitted 13 February, 2023; originally announced February 2023.

Comments: 11 Pages (8 main, 1 references, 2 appendix), 8 figures (7 main, 1 appendix). Accepted at AAMAS 2023 Main Track

arXiv:2210.08050 [pdf, other]

Multi-trainer Interactive Reinforcement Learning System

Authors: Zhaori Guo, Timothy J. Norman, Enrico H. Gerding

Abstract: Interactive reinforcement learning can effectively facilitate the agent training via human feedback. However, such methods often require the human teacher to know what is the correct action that the agent should take. In other words, if the human teacher is not always reliable, then it will not be consistently able to guide the agent through its training. In this paper, we propose a more effective… ▽ More Interactive reinforcement learning can effectively facilitate the agent training via human feedback. However, such methods often require the human teacher to know what is the correct action that the agent should take. In other words, if the human teacher is not always reliable, then it will not be consistently able to guide the agent through its training. In this paper, we propose a more effective interactive reinforcement learning system by introducing multiple trainers, namely Multi-Trainer Interactive Reinforcement Learning (MTIRL), which could aggregate the binary feedback from multiple non-perfect trainers into a more reliable reward for an agent training in a reward-sparse environment. In particular, our trainer feedback aggregation experiments show that our aggregation method has the best accuracy when compared with the majority voting, the weighted voting, and the Bayesian method. Finally, we conduct a grid-world experiment to show that the policy trained by the MTIRL with the review model is closer to the optimal policy than that without a review model. △ Less

Submitted 14 October, 2022; originally announced October 2022.

arXiv:2210.06352 [pdf, other]

Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR

Authors: Sami Alabed, Dominik Grewe, Juliana Franco, Bart Chrzaszcz, Tom Natan, Tamara Norman, Norman A. Rink, Dimitrios Vytiniotis, Michael Schaarschmidt

Abstract: Large neural network models are commonly trained through a combination of advanced parallelism strategies in a single program, multiple data (SPMD) paradigm. For example, training large transformer models requires combining data, model, and pipeline partitioning; and optimizer sharding techniques. However, identifying efficient combinations for many model architectures and accelerator systems requ… ▽ More Large neural network models are commonly trained through a combination of advanced parallelism strategies in a single program, multiple data (SPMD) paradigm. For example, training large transformer models requires combining data, model, and pipeline partitioning; and optimizer sharding techniques. However, identifying efficient combinations for many model architectures and accelerator systems requires significant manual analysis. In this work, we present an automatic partitioner that identifies these combinations through a goal-oriented search. Our key findings are that a Monte Carlo Tree Search-based partitioner leveraging partition-specific compiler analysis directly into the search and guided goals matches expert-level strategies for various models. △ Less

Submitted 7 October, 2022; originally announced October 2022.

arXiv:2210.02260 [pdf, other]

doi 10.3233/AIC-220127

From Intelligent Agents to Trustworthy Human-Centred Multiagent Systems

Authors: Mohammad Divband Soorati, Enrico H. Gerding, Enrico Marchioni, Pavel Naumov, Timothy J. Norman, Sarvapali D. Ramchurn, Bahar Rastegari, Adam Sobey, Sebastian Stein, Danesh Tarpore, Vahid Yazdanpanah, Jie Zhang

Abstract: The Agents, Interaction and Complexity research group at the University of Southampton has a long track record of research in multiagent systems (MAS). We have made substantial scientific contributions across learning in MAS, game-theoretic techniques for coordinating agent systems, and formal methods for representation and reasoning. We highlight key results achieved by the group and elaborate on… ▽ More The Agents, Interaction and Complexity research group at the University of Southampton has a long track record of research in multiagent systems (MAS). We have made substantial scientific contributions across learning in MAS, game-theoretic techniques for coordinating agent systems, and formal methods for representation and reasoning. We highlight key results achieved by the group and elaborate on recent work and open research challenges in develo** trustworthy autonomous systems and deploying human-centred AI systems that aim to support societal good. △ Less

Submitted 5 October, 2022; originally announced October 2022.

Comments: Appears in the Special Issue on Multi-Agent Systems Research in the United Kingdom

Journal ref: AI Communications, vol. 35, no. 4, pp. 443-457, 2022

arXiv:2202.12259 [pdf]

Learning from the Pros: Extracting Professional Goalkeeper Technique from Broadcast Footage

Authors: Matthew Wear, Ryan Beal, Tim Matthews, Tim Norman, Sarvapali Ramchurn

Abstract: As an amateur goalkeeper playing grassroots soccer, who better to learn from than top professional goalkeepers? In this paper, we harness computer vision and machine learning models to appraise the save technique of professionals in a way those at lower levels can learn from. We train an unsupervised machine learning model using 3D body pose data extracted from broadcast footage to learn professio… ▽ More As an amateur goalkeeper playing grassroots soccer, who better to learn from than top professional goalkeepers? In this paper, we harness computer vision and machine learning models to appraise the save technique of professionals in a way those at lower levels can learn from. We train an unsupervised machine learning model using 3D body pose data extracted from broadcast footage to learn professional goalkeeper technique. Then, an "expected saves" model is developed, from which we can identify the optimal goalkeeper technique in different match contexts. △ Less

Submitted 22 February, 2022; originally announced February 2022.

Comments: 17 pages, 15 figures, MIT Sloan Sports Analytics Conference, March 4-5 2022, Boston, USA

arXiv:2202.02080 [pdf, other]

Robust Linear Regression for General Feature Distribution

Authors: Tom Norman, Nir Weinberger, Kfir Y. Levy

Abstract: We investigate robust linear regression where data may be contaminated by an oblivious adversary, i.e., an adversary than may know the data distribution but is otherwise oblivious to the realizations of the data samples. This model has been previously analyzed under strong assumptions. Concretely, $\textbf{(i)}$ all previous works assume that the covariance matrix of the features is positive defin… ▽ More We investigate robust linear regression where data may be contaminated by an oblivious adversary, i.e., an adversary than may know the data distribution but is otherwise oblivious to the realizations of the data samples. This model has been previously analyzed under strong assumptions. Concretely, $\textbf{(i)}$ all previous works assume that the covariance matrix of the features is positive definite; and $\textbf{(ii)}$ most of them assume that the features are centered (i.e. zero mean). Additionally, all previous works make additional restrictive assumption, e.g., assuming that the features are Gaussian or that the corruptions are symmetrically distributed. In this work we go beyond these assumptions and investigate robust regression under a more general set of assumptions: $\textbf{(i)}$ we allow the covariance matrix to be either positive definite or positive semi definite, $\textbf{(ii)}$ we do not necessarily assume that the features are centered, $\textbf{(iii)}$ we make no further assumption beyond boundedness (sub-Gaussianity) of features and measurement noise. Under these assumption we analyze a natural SGD variant for this problem and show that it enjoys a fast convergence rate when the covariance matrix is positive definite. In the positive semi definite case we show that there are two regimes: if the features are centered we can obtain a standard convergence rate; otherwise the adversary can cause any learner to fail arbitrarily. △ Less

Submitted 4 February, 2022; originally announced February 2022.

arXiv:2112.02958 [pdf, other]

Automap: Towards Ergonomic Automated Parallelism for ML Models

Authors: Michael Schaarschmidt, Dominik Grewe, Dimitrios Vytiniotis, Adam Paszke, Georg Stefan Schmid, Tamara Norman, James Molloy, Jonathan Godwin, Norman Alexander Rink, Vinod Nair, Dan Belov

Abstract: The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype o… ▽ More The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism. Implementing these methods is increasingly supported through program primitives, but identifying efficient partitioning strategies requires expensive experimentation and expertise. We present the prototype of an automated partitioner that seamlessly integrates into existing compilers and existing user workflows. Our partitioner enables SPMD-style parallelism that encompasses data parallelism and parameter/activation sharding. Through a combination of inductive tactics and search in a platform-independent partitioning IR, automap can recover expert partitioning strategies such as Megatron sharding for transformer layers. △ Less

Submitted 6 December, 2021; originally announced December 2021.

Comments: Workshop on ML for Systems at NeurIPS 2021

arXiv:2110.10548 [pdf, other]

Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning

Authors: Ningning Xie, Tamara Norman, Dominik Grewe, Dimitrios Vytiniotis

Abstract: We present a novel characterization of the map** of multiple parallelism forms (e.g. data and model parallelism) onto hierarchical accelerator systems that is hierarchy-aware and greatly reduces the space of software-to-hardware map**. We experimentally verify the substantial effect of these map**s on all-reduce performance (up to 448x). We offer a novel syntax-guided program synthesis frame… ▽ More We present a novel characterization of the map** of multiple parallelism forms (e.g. data and model parallelism) onto hierarchical accelerator systems that is hierarchy-aware and greatly reduces the space of software-to-hardware map**. We experimentally verify the substantial effect of these map**s on all-reduce performance (up to 448x). We offer a novel syntax-guided program synthesis framework that is able to decompose reductions over one or more parallelism axes to sequences of collectives in a hierarchy- and map**-aware way. For 69% of parallelism placements and user requested reductions, our framework synthesizes programs that outperform the default all-reduce implementation when evaluated on different GPU hierarchies (max 2.04x, average 1.27x). We complement our synthesis tool with a simulator exceeding 90% top-10 accuracy, which therefore reduces the need for massive evaluations of synthesis results to determine a small set of optimal programs and map**s. △ Less

Submitted 16 November, 2021; v1 submitted 20 October, 2021; originally announced October 2021.

arXiv:2102.09469 [pdf, other]

Optimising Long-Term Outcomes using Real-World Fluent Objectives: An Application to Football

Authors: Ryan Beal, Georgios Chalkiadakis, Timothy J. Norman, Sarvapali D. Ramchurn

Abstract: In this paper, we present a novel approach for optimising long-term tactical and strategic decision-making in football (soccer) by encapsulating events in a league environment across a given time frame. We model the teams' objectives for a season and track how these evolve as games unfold to give a fluent objective that can aid in decision-making games. We develop Markov chain Monte Carlo and deep… ▽ More In this paper, we present a novel approach for optimising long-term tactical and strategic decision-making in football (soccer) by encapsulating events in a league environment across a given time frame. We model the teams' objectives for a season and track how these evolve as games unfold to give a fluent objective that can aid in decision-making games. We develop Markov chain Monte Carlo and deep learning-based algorithms that make use of the fluent objectives in order to learn from prior games and other games in the environment and increase the teams' long-term performance. Simulations of our approach using real-world datasets from 760 matches shows that by using optimised tactics with our fluent objective and prior games, we can on average increase teams mean expected finishing distribution in the league by up to 35.6%. △ Less

Submitted 18 February, 2021; originally announced February 2021.

Comments: Pre-Print - Accepted for publication at AAMAS-21

arXiv:2012.04380 [pdf, other]

Combining Machine Learning and Human Experts to Predict Match Outcomes in Football: A Baseline Model

Authors: Ryan Beal, Stuart E. Middleton, Timothy J. Norman, Sarvapali D. Ramchurn

Abstract: In this paper, we present a new application-focused benchmark dataset and results from a set of baseline Natural Language Processing and Machine Learning models for prediction of match outcomes for games of football (soccer). By doing so we give a baseline for the prediction accuracy that can be achieved exploiting both statistical match data and contextual articles from human sports journalists.… ▽ More In this paper, we present a new application-focused benchmark dataset and results from a set of baseline Natural Language Processing and Machine Learning models for prediction of match outcomes for games of football (soccer). By doing so we give a baseline for the prediction accuracy that can be achieved exploiting both statistical match data and contextual articles from human sports journalists. Our dataset is focuses on a representative time-period over 6 seasons of the English Premier League, and includes newspaper match previews from The Guardian. The models presented in this paper achieve an accuracy of 63.18% showing a 6.9% boost on the traditional statistical methods. △ Less

Submitted 8 December, 2020; originally announced December 2020.

Comments: Pre-print. Accepted at: The Thirty-Third Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-21). 5 pages

arXiv:2009.09806 [pdf, ps, other]

SHACL Satisfiability and Containment (Extended Paper)

Authors: Paolo Pareti, George Konstantinidis, Fabio Mogavero, Timothy J. Norman

Abstract: The Shapes Constraint Language (SHACL) is a recent W3C recommendation language for validating RDF data. Specifically, SHACL documents are collections of constraints that enforce particular shapes on an RDF graph. Previous work on the topic has provided theoretical and practical results for the validation problem, but did not consider the standard decision problems of satisfiability and containment… ▽ More The Shapes Constraint Language (SHACL) is a recent W3C recommendation language for validating RDF data. Specifically, SHACL documents are collections of constraints that enforce particular shapes on an RDF graph. Previous work on the topic has provided theoretical and practical results for the validation problem, but did not consider the standard decision problems of satisfiability and containment, which are crucial for verifying the feasibility of the constraints and important for design and optimization purposes. In this paper, we undertake a thorough study of different features of non-recursive SHACL by providing a translation to a new first-order language, called SCL, that precisely captures the semantics of SHACL w.r.t. satisfiability and containment. We study the interaction of SHACL features in this logic and provide the detailed map of decidability and complexity results of the aforementioned decision problems for different SHACL sublanguages. Notably, we prove that both problems are undecidable for the full language, but we present decidable combinations of interesting features. △ Less

Submitted 5 November, 2020; v1 submitted 31 August, 2020; originally announced September 2020.

arXiv:2006.00979 [pdf, other]

Acme: A Research Framework for Distributed Reinforcement Learning

Authors: Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang , et al. (14 additional authors not shown)

Abstract: Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce publishe… ▽ More Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents that are built using simple, modular components that can be used at various scales of execution. While the primary goal of Acme is to provide a framework for algorithm development, a secondary goal is to provide simple reference implementations of important or state-of-the-art algorithms. These implementations serve both as a validation of our design decisions as well as an important contribution to reproducibility in RL research. In this work we describe the major design decisions made within Acme and give further details as to how its components can be used to implement various algorithms. Our experiments provide baselines for a number of common and state-of-the-art algorithms as well as showing how these algorithms can be scaled up for much larger and more complex environments. This highlights one of the primary advantages of Acme, namely that it can be used to implement large, distributed RL algorithms that can run at massive scales while still maintaining the inherent readability of that implementation. This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme. △ Less

Submitted 20 September, 2022; v1 submitted 1 June, 2020; originally announced June 2020.

Comments: This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme

arXiv:2003.10294 [pdf, other]

Optimising Game Tactics for Football

Authors: Ryan Beal, Georgios Chalkiadakis, Timothy J. Norman, Sarvapali D. Ramchurn

Abstract: In this paper we present a novel approach to optimise tactical and strategic decision making in football (soccer). We model the game of football as a multi-stage game which is made up from a Bayesian game to model the pre-match decisions and a stochastic game to model the in-match state transitions and decisions. Using this formulation, we propose a method to predict the probability of game outcom… ▽ More In this paper we present a novel approach to optimise tactical and strategic decision making in football (soccer). We model the game of football as a multi-stage game which is made up from a Bayesian game to model the pre-match decisions and a stochastic game to model the in-match state transitions and decisions. Using this formulation, we propose a method to predict the probability of game outcomes and the payoffs of team actions. Building upon this, we develop algorithms to optimise team formation and in-game tactics with different objectives. Empirical evaluation of our approach on real-world datasets from 760 matches shows that by using optimised tactics from our Bayesian and stochastic games, we can increase a team chances of winning by up to 16.1\% and 3.4\% respectively. △ Less

Submitted 23 March, 2020; originally announced March 2020.

Comments: AAMAS 2020 Pre-Print Version

arXiv:1911.06657 [pdf, other]

A Policy Editor for Semantic Sensor Networks

Authors: Paolo Pareti, George Konstantinidis, Timothy J. Norman

Abstract: An important use of sensors and actuator networks is to comply with health and safety policies in hazardous environments. In order to deal with increasingly large and dynamic environments, and to quickly react to emergencies, tools are needed to simplify the process of translating high-level policies into executable queries and rules. We present a framework to produce such tools, which uses rules… ▽ More An important use of sensors and actuator networks is to comply with health and safety policies in hazardous environments. In order to deal with increasingly large and dynamic environments, and to quickly react to emergencies, tools are needed to simplify the process of translating high-level policies into executable queries and rules. We present a framework to produce such tools, which uses rules to aggregate low-level sensor data, described using the Semantic Sensor Network Ontology, into more useful and actionable abstractions. Using the schema of the underlying data sources as an input, we automatically generate abstractions which are relevant to the use case at hand. In this demonstration we present a policy editor tool and a simulation on which policies can be tested. △ Less

Submitted 15 November, 2019; originally announced November 2019.

Comments: Demo paper presented at the 18th International Semantic Web Conference (ISWC 2019)

arXiv:1911.00598 [pdf, other]

doi 10.1007/978-3-030-30793-6_31

SHACL Constraints with Inference Rules

Authors: Paolo Pareti, George Konstantinidis, Timothy J. Norman, Murat Şensoy

Abstract: The Shapes Constraint Language (SHACL) has been recently introduced as a W3C recommendation to define constraints that can be validated against RDF graphs. Interactions of SHACL with other Semantic Web technologies, such as ontologies or reasoners, is a matter of ongoing research. In this paper we study the interaction of a subset of SHACL with inference rules expressed in datalog. On the one hand… ▽ More The Shapes Constraint Language (SHACL) has been recently introduced as a W3C recommendation to define constraints that can be validated against RDF graphs. Interactions of SHACL with other Semantic Web technologies, such as ontologies or reasoners, is a matter of ongoing research. In this paper we study the interaction of a subset of SHACL with inference rules expressed in datalog. On the one hand, SHACL constraints can be used to define a "schema" for graph datasets. On the other hand, inference rules can lead to the discovery of new facts that do not match the original schema. Given a set of SHACL constraints and a set of datalog rules, we present a method to detect which constraints could be violated by the application of the inference rules on some graph instance of the schema, and update the original schema, i.e, the set of SHACL constraints, in order to capture the new facts that can be inferred. We provide theoretical and experimental results of the various components of our approach. △ Less

Submitted 1 November, 2019; originally announced November 2019.

Journal ref: In International Semantic Web Conference, pp. 539-557. Springer, Cham, 2019

arXiv:1907.01627 [pdf, other]

Rule Applicability on RDF Triplestore Schemas

Authors: Paolo Pareti, George Konstantinidis, Timothy J. Norman, Murat Şensoy

Abstract: Rule-based systems play a critical role in health and safety, where policies created by experts are usually formalised as rules. When dealing with increasingly large and dynamic sources of data, as in the case of Internet of Things (IoT) applications, it becomes important not only to efficiently apply rules, but also to reason about their applicability on datasets confined by a certain schema. In… ▽ More Rule-based systems play a critical role in health and safety, where policies created by experts are usually formalised as rules. When dealing with increasingly large and dynamic sources of data, as in the case of Internet of Things (IoT) applications, it becomes important not only to efficiently apply rules, but also to reason about their applicability on datasets confined by a certain schema. In this paper we define the notion of a triplestore schema which models a set of RDF graphs. Given a set of rules and such a schema as input we propose a method to determine rule applicability and produce output schemas. Output schemas model the graphs that would be obtained by running the rules on the graph models of the input schema. We present two approaches: one based on computing a canonical (critical) instance of the schema, and a novel approach based on query rewriting. We provide theoretical, complexity and evaluation results that show the superior efficiency of our rewriting approach. △ Less

Submitted 2 July, 2019; originally announced July 2019.

Comments: AI for Internet of Things Workshop, co-located with the 28th International Joint Conference on Artificial Intelligence (IJCAI-19)

arXiv:1706.04033 [pdf, other]

On Natural Language Generation of Formal Argumentation

Authors: Federico Cerutti, Alice Toniolo, Timothy J. Norman

Abstract: In this paper we provide a first analysis of the research questions that arise when dealing with the problem of communicating pieces of formal argumentation through natural language interfaces. It is a generally held opinion that formal models of argumentation naturally capture human argument, and some preliminary studies have focused on justifying this view. Unfortunately, the results are not onl… ▽ More In this paper we provide a first analysis of the research questions that arise when dealing with the problem of communicating pieces of formal argumentation through natural language interfaces. It is a generally held opinion that formal models of argumentation naturally capture human argument, and some preliminary studies have focused on justifying this view. Unfortunately, the results are not only inconclusive, but seem to suggest that explaining formal argumentation to humans is a rather articulated task. Graphical models for expressing argumentation-based reasoning are appealing, but often humans require significant training to use these tools effectively. We claim that natural language interfaces to formal argumentation systems offer a real alternative, and may be the way forward for systems that capture human argument. △ Less

Submitted 13 June, 2017; originally announced June 2017.

Comments: 17 pages, 4 figures, technical report

arXiv:1607.00091 [pdf, ps, other]

Reducing overfitting in challenge-based competitions

Authors: Elias Chaibub Neto, Bruce R Hoff, Chris Bare, Brian M Bot, Thomas Yu, Lara Magravite, Andrew D Trister, Thea Norman, Pablo Meyer, Julio Saez-Rodrigues, James C Costello, Justin Guinney, Gustavo Stolovitzky

Abstract: Over-fitting is a dreaded foe in challenge-based competitions. Because participants rely on public leaderboards to evaluate and refine their models, there is always the danger they might over-fit to the holdout data supporting the leaderboard. The recently published Ladder algorithm aims to address this problem by preventing the participants from exploiting willingly or inadvertently minor fluctua… ▽ More Over-fitting is a dreaded foe in challenge-based competitions. Because participants rely on public leaderboards to evaluate and refine their models, there is always the danger they might over-fit to the holdout data supporting the leaderboard. The recently published Ladder algorithm aims to address this problem by preventing the participants from exploiting willingly or inadvertently minor fluctuations in public leaderboard scores during model refinement. In this paper, we report a vulnerability of the Ladder that induces severe over-fitting of the leaderboard when the sample size is small. To circumvent this attack, we propose a variation of the Ladder that releases a bootstrapped estimate of the public leaderboard score instead of providing participants with a direct measure of performance. We also extend the scope of the Ladder to arbitrary performance metrics by relying on a more broadly applicable testing procedure based on the Bayesian bootstrap. Our method makes it possible to use a leaderboard, with the technical and social advantages that it provides, even in cases where data is scant. △ Less

Submitted 30 June, 2016; originally announced July 2016.

arXiv:1407.3910 [pdf, ps, other]

Approachability in Population Games

Authors: Dario Bauso, Thomas W L Norman

Abstract: This paper reframes approachability theory within the context of population games. Thus, whilst one player aims at driving her average payoff to a predefined set, her opponent is not malevolent but rather extracted randomly from a population of individuals with given distribution on actions. First, convergence conditions are revisited based on the common prior on the population distribution, and w… ▽ More This paper reframes approachability theory within the context of population games. Thus, whilst one player aims at driving her average payoff to a predefined set, her opponent is not malevolent but rather extracted randomly from a population of individuals with given distribution on actions. First, convergence conditions are revisited based on the common prior on the population distribution, and we define the notion of \emph{1st-moment approachability}. Second, we develop a model of two coupled partial differential equations (PDEs) in the spirit of mean-field game theory: one describing the best-response of every player given the population distribution (this is a \emph{Hamilton-Jacobi-Bellman equation}), the other capturing the macroscopic evolution of average payoffs if every player plays its best response (this is an \emph{advection equation}). Third, we provide a detailed analysis of existence, nonuniqueness, and stability of equilibria (fixed points of the two PDEs). Fourth, we apply the model to regret-based dynamics, and use it to establish convergence to Bayesian equilibrium under incomplete information. △ Less

Submitted 15 July, 2014; originally announced July 2014.

Comments: 24 pages, 5 figures

MSC Class: 91A13

arXiv:1312.4828 [pdf, ps, other]

Subjective Logic Operators in Trust Assessment: an Empirical Study

Authors: Federico Cerutti, Alice Toniolo, Nir Oren, Timothy J. Norman

Abstract: Computational trust mechanisms aim to produce trust ratings from both direct and indirect information about agents' behaviour. Subjective Logic (SL) has been widely adopted as the core of such systems via its fusion and discount operators. In recent research we revisited the semantics of these operators to explore an alternative, geometric interpretation. In this paper we present a principled desi… ▽ More Computational trust mechanisms aim to produce trust ratings from both direct and indirect information about agents' behaviour. Subjective Logic (SL) has been widely adopted as the core of such systems via its fusion and discount operators. In recent research we revisited the semantics of these operators to explore an alternative, geometric interpretation. In this paper we present a principled desiderata for discounting and fusion operators in SL. Building upon this we present operators that satisfy these desirable properties, including a family of discount operators. We then show, through a rigorous empirical study, that specific, geometrically interpreted operators significantly outperform standard SL operators in estimating ground truth. These novel operators offer real advantages for computational models of trust and reputation, in which they may be employed without modifying other aspects of an existing system. △ Less

Submitted 19 November, 2013; originally announced December 2013.

Comments: Submitted to Information Systems Frontiers Journal

arXiv:1309.4994 [pdf, other]

Context-dependent Trust Decisions with Subjective Logic

Authors: Federico Cerutti, Alice Toniolo, Nir Oren, Timothy J. Norman

Abstract: A decision procedure implemented over a computational trust mechanism aims to allow for decisions to be made regarding whether some entity or information should be trusted. As recognised in the literature, trust is contextual, and we describe how such a context often translates into a confidence level which should be used to modify an underlying trust value. Jøsang's Subjective Logic has long been… ▽ More A decision procedure implemented over a computational trust mechanism aims to allow for decisions to be made regarding whether some entity or information should be trusted. As recognised in the literature, trust is contextual, and we describe how such a context often translates into a confidence level which should be used to modify an underlying trust value. Jøsang's Subjective Logic has long been used in the trust domain, and we show that its operators are insufficient to address this problem. We therefore provide a decision-making approach about trust which also considers the notion of confidence (based on context) through the introduction of a new operator. In particular, we introduce general requirements that must be respected when combining trustworthiness and confidence degree, and demonstrate the soundness of our new operator with respect to these properties. △ Less

Submitted 19 September, 2013; originally announced September 2013.

Comments: 19 pages, 4 figures, technical report of the University of Aberdeen (preprint version)

Showing 1–29 of 29 results for author: Norman, T