-
A Systematization of the Wagner Framework: Graph Theory Conjectures and Reinforcement Learning
Authors:
Flora Angileri,
Giulia Lombardi,
Andrea Fois,
Renato Faraone,
Carlo Metta,
Michele Salvi,
Luigi Amedeo Bianchi,
Marco Fantozzi,
Silvia Giulia Galfrè,
Daniele Pavesi,
Maurizio Parton,
Francesco Morandin
Abstract:
In 2021, Adam Zsolt Wagner proposed an approach to disprove conjectures in graph theory using Reinforcement Learning (RL). Wagner's idea can be framed as follows: consider a conjecture, such as a certain quantity f(G) < 0 for every graph G; one can then play a single-player graph-building game, where at each turn the player decides whether to add an edge or not. The game ends when all edges have been considered, resulting in a certain graph G_T, and f(G_T) is the final score of the game; RL is then used to maximize this score. This brilliant idea is as simple as it is innovative, and it lends itself to systematic generalization. Several different single-player graph-building games can be employed, along with various RL algorithms. Moreover, RL maximizes the cumulative reward, allowing for step-by-step rewards instead of a single final score, provided the final cumulative reward represents the quantity of interest f(G_T). In this paper, we discuss these and various other choices that can be significant in Wagner's framework. As a contribution to this systematization, we present four distinct single-player graph-building games. Each game employs both a step-by-step reward system and a single final score. We also propose a principled approach to select the most suitable neural network architecture for any given conjecture, and introduce a new dataset of graphs labeled with their Laplacian spectra. Furthermore, we provide a counterexample for a conjecture regarding the sum of the matching number and the spectral radius, which is simpler than the example provided in Wagner's original paper.
The games have been implemented as environments in the Gymnasium framework, and along with the dataset, are available as open-source supplementary materials.
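The edge-by-edge game described in this abstract can be illustrated with a minimal sketch. This is not the authors' Gymnasium environment: the class `EdgeByEdgeGame`, the helper `play`, and the scoring function `toy_score` are hypothetical names chosen for illustration, and the score (edges minus twice the number of triangles) is an arbitrary stand-in for a conjecture's quantity f(G). The sketch shows why a step-by-step reward defined as the change in score sums to the same value as a single final score, assuming the empty graph scores 0.

```python
from itertools import combinations


def toy_score(n, edges):
    """Hypothetical f(G): number of edges minus twice the number of triangles."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    triangles = sum(
        1
        for a, b, c in combinations(range(n), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    return len(edges) - 2 * triangles


class EdgeByEdgeGame:
    """Single-player game: for each candidate edge, action 1 adds it, 0 skips it."""

    def __init__(self, n, score=toy_score, stepwise=True):
        self.n = n
        self.score = score
        self.stepwise = stepwise  # True: reward is the score change at each turn
        self.candidates = list(combinations(range(n), 2))

    def reset(self):
        self.t = 0
        self.edges = []
        self.prev = self.score(self.n, self.edges)
        return tuple(self.edges)

    def step(self, action):
        if action == 1:
            self.edges.append(self.candidates[self.t])
        self.t += 1
        done = self.t == len(self.candidates)
        current = self.score(self.n, self.edges)
        if self.stepwise:
            reward = current - self.prev  # telescopes to f(G_T) - f(G_0)
        else:
            reward = current if done else 0  # single final score
        self.prev = current
        return tuple(self.edges), reward, done


def play(game, actions):
    """Run one episode with a fixed action sequence; return the cumulative reward."""
    game.reset()
    total = 0
    for action in actions:
        _, reward, _ = game.step(action)
        total += reward
    return total
```

For example, `play(EdgeByEdgeGame(4, stepwise=True), [1, 1, 0, 1, 0, 0])` and the same call with `stepwise=False` return the same total, because the per-turn differences telescope to the final score. An RL agent would replace the fixed action list with a learned policy.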
Submitted 18 June, 2024;
originally announced June 2024.
-
Understanding and Shaping Human-Technology Assemblages in the Age of Generative AI
Authors:
Josh Andres,
Chris Danta,
Andrea Bianchi,
Sungyeon Hong,
Zhuying Li,
Eduardo B. Sandoval,
Charles Martin,
Ned Cooper
Abstract:
Generative AI capabilities are rapidly transforming how we perceive, interact with, and relate to machines. This one-day workshop invites HCI researchers, designers, and practitioners to imaginatively inhabit and explore the possible futures that might emerge from humans combining generative AI capabilities into everyday technologies at massive scale. Workshop participants will craft stories, visualisations, and prototypes through scenario-based design to investigate these possible futures, resulting in the production of an open-annotated scenario library and a journal or interactions article to disseminate the findings. We aim to gather the DIS community knowledge to explore, understand and shape the relations this new interaction paradigm is forging between humans, their technologies and the environment in safe, sustainable, enriching, and responsible ways.
Submitted 4 May, 2024; v1 submitted 28 April, 2024;
originally announced April 2024.
-
Rethinking How to Evaluate Language Model Jailbreak
Authors:
Hongyu Cai,
Arjun Arunasalam,
Leo Y. Lin,
Antonio Bianchi,
Z. Berkay Celik
Abstract:
Large language models (LLMs) have become increasingly integrated with various applications. To ensure that LLMs do not generate unsafe responses, they are aligned with safeguards that specify what content is restricted. However, such alignment can be bypassed to produce prohibited content using a technique commonly referred to as jailbreak. Different systems have been proposed to perform the jailbreak automatically. These systems rely on evaluation methods to determine whether a jailbreak attempt is successful. However, our analysis reveals that current jailbreak evaluation methods have two limitations. (1) Their objectives lack clarity and do not align with the goal of identifying unsafe responses. (2) They oversimplify the jailbreak result as a binary outcome, successful or not. In this paper, we propose three metrics, safeguard violation, informativeness, and relative truthfulness, to evaluate language model jailbreak. Additionally, we demonstrate how these metrics correlate with the goal of different malicious actors. To compute these metrics, we introduce a multifaceted approach that extends the natural language generation evaluation method after preprocessing the response. We evaluate our metrics on a benchmark dataset produced from three malicious intent datasets and three jailbreak systems. The benchmark dataset is labeled by three annotators. We compare our multifaceted approach with three existing jailbreak evaluation methods. Experiments demonstrate that our multifaceted evaluation outperforms existing methods, with F1 scores improving on average by 17% compared to existing baselines. Our findings motivate the need to move away from the binary view of the jailbreak problem and incorporate a more comprehensive evaluation to ensure the safety of the language model.
Submitted 7 May, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Making Sense of Constellations: Methodologies for Understanding Starlink's Scheduling Algorithms
Authors:
Hammas Bin Tanveer,
Mike Puchol,
Rachee Singh,
Antonio Bianchi,
Rishab Nithyanand
Abstract:
Starlink constellations are currently the largest LEO WAN and have seen considerable interest from the research community. In this paper, we use high-frequency and high-fidelity measurements to uncover evidence of hierarchical traffic controllers in Starlink -- a global controller which allocates satellites to terminals and an on-satellite controller that schedules transmission of user flows. We then devise a novel approach for identifying how satellites are allocated to user terminals. Using data gathered with this approach, we measure the characteristics of the global controller and identify the factors that influence the allocation of satellites to terminals. Finally, we use this data to build a model which approximates Starlink's global scheduler. Our model is able to predict the characteristics of the satellite allocated to a terminal at a specific location and time with reasonably high accuracy and at a rate significantly higher than baseline.
Submitted 1 July, 2023;
originally announced July 2023.
-
A Decision Tree to Shepherd Scientists through Data Retrievability
Authors:
Andrea Bianchi,
Giordano d'Aloisio,
Francesca Marzi,
Antinisca Di Marco
Abstract:
Reproducibility is a crucial aspect of scientific research that involves the ability to independently replicate experimental results by analysing the same data or repeating the same experiment. Over the years, many works have been proposed to make the results of the experiments actually reproducible. However, very few address the importance of data reproducibility, defined as the ability of independent researchers to retain the same dataset used as input for experimentation. Properly addressing the problem of data reproducibility is crucial because often just providing a link to the data is not enough to make the results reproducible. In fact, proper metadata (e.g., preprocessing instructions) must also be provided to make a dataset fully reproducible. In this work, our aim is to fill this gap by proposing a decision tree to shepherd researchers through the reproducibility of their datasets. In particular, this decision tree guides researchers through identifying if the dataset is actually reproducible and if additional metadata (i.e., additional resources needed to reproduce the data) must also be provided. This decision tree will be the foundation of a future application that will automate the data reproduction process by automatically providing the necessary metadata based on the particular context (e.g., data availability, data preprocessing, and so on). It is worth noting that, in this paper, we detail the steps to make a dataset retrievable, while we will detail other crucial aspects for reproducibility (e.g., dataset documentation) in future works.
Submitted 12 April, 2023;
originally announced April 2023.
-
Columbus: Android App Testing Through Systematic Callback Exploration
Authors:
Priyanka Bose,
Dipanjan Das,
Saastha Vasan,
Sebastiano Mariani,
Ilya Grishchenko,
Andrea Continella,
Antonio Bianchi,
Christopher Kruegel,
Giovanni Vigna
Abstract:
With the continuous rise in the popularity of Android mobile devices, automated testing of apps has become more important than ever. Android apps are event-driven programs. Unfortunately, generating all possible types of events by interacting with the app's interface is challenging for an automated testing approach. Callback-driven testing eliminates the need for event generation by directly invoking app callbacks. However, existing callback-driven testing techniques assume prior knowledge of Android callbacks, and they rely on a human expert, who is familiar with the Android API, to write stub code that prepares callback arguments before invocation. Since the Android API is huge and keeps evolving, prior techniques could only support a small fraction of callbacks present in the Android framework.
In this work, we introduce Columbus, a callback-driven testing technique that employs two strategies to eliminate the need for human involvement: (i) it automatically identifies callbacks by simultaneously analyzing both the Android framework and the app under test, and (ii) it uses a combination of under-constrained symbolic execution (primitive arguments), and type-guided dynamic heap introspection (object arguments) to generate valid and effective inputs. Lastly, Columbus integrates two novel feedback mechanisms -- data dependency and crash-guidance, during testing to increase the likelihood of triggering crashes, and maximizing coverage. In our evaluation, Columbus outperforms state-of-the-art model-driven, checkpoint-based, and callback-driven testing tools both in terms of crashes and coverage.
Submitted 17 February, 2023;
originally announced February 2023.
-
Intelligent Trading Systems: A Sentiment-Aware Reinforcement Learning Approach
Authors:
Francisco Caio Lima Paiva,
Leonardo Kanashiro Felizardo,
Reinaldo Augusto da Costa Bianchi,
Anna Helena Reali Costa
Abstract:
The feasibility of making profitable trades on a single asset on stock exchanges based on pattern identification has long attracted researchers. Reinforcement Learning (RL) and Natural Language Processing have gained notoriety in these single-asset trading tasks, but only a few works have explored their combination. Moreover, some issues are still not addressed, such as extracting market sentiment momentum through the explicit capture of sentiment features that reflect the market condition over time and assessing the consistency and stability of RL results in different situations. Filling this gap, we propose the Sentiment-Aware RL (SentARL) intelligent trading system that improves profit stability by leveraging market mood through an adaptive amount of past sentiment features drawn from textual news. We evaluated SentARL across twenty assets, two transaction costs, and five different periods and initializations to show its consistent effectiveness against baselines. Subsequently, this thorough assessment allowed us to identify the boundary between news coverage and market sentiment regarding the correlation of price-time series above which SentARL's effectiveness is outstanding.
Submitted 14 November, 2021;
originally announced December 2021.
-
Specializing Inter-Agent Communication in Heterogeneous Multi-Agent Reinforcement Learning using Agent Class Information
Authors:
Douglas De Rizzo Meneghetti,
Reinaldo Augusto da Costa Bianchi
Abstract:
Inspired by recent advances in agent communication with graph neural networks, this work proposes the representation of multi-agent communication capabilities as a directed labeled heterogeneous agent graph, in which node labels denote agent classes and edge labels, the communication type between two classes of agents. We also introduce a neural network architecture that specializes communication in fully cooperative heterogeneous multi-agent tasks by learning individual transformations to the exchanged messages between each pair of agent classes. By also employing encoding and action selection modules with parameter sharing for environments with heterogeneous agents, we demonstrate comparable or superior performance in environments where a larger number of agent classes operates.
Submitted 10 March, 2021; v1 submitted 14 December, 2020;
originally announced December 2020.
-
Detecting soccer balls with reduced neural networks: a comparison of multiple architectures under constrained hardware scenarios
Authors:
Douglas De Rizzo Meneghetti,
Thiago Pedro Donadon Homem,
Jonas Henrique Renolfi de Oliveira,
Isaac Jesus da Silva,
Danilo Hernani Perico,
Reinaldo Augusto da Costa Bianchi
Abstract:
Object detection techniques that achieve state-of-the-art detection accuracy employ convolutional neural networks, implemented to have optimal performance in graphics processing units. Some hardware systems, such as mobile robots, operate under constrained hardware situations, but still benefit from object detection capabilities. Multiple network models have been proposed, achieving comparable accuracy with reduced architectures and leaner operations. Motivated by the need to create an object detection system for a soccer team of mobile robots, this work provides a comparative study of recent proposals of neural networks targeted towards constrained hardware environments, in the specific task of soccer ball detection. We train multiple open implementations of MobileNetV2 and MobileNetV3 models with different underlying architectures, as well as YOLOv3, TinyYOLOv3, YOLOv4 and TinyYOLOv4 in an annotated image data set captured using a mobile robot. We then report their mean average precision on a test data set and their inference times in videos of different resolutions, under constrained and unconstrained hardware configurations. Results show that MobileNetV3 models have a good trade-off between mAP and inference time in constrained scenarios only, while MobileNetV2 with high width multipliers are appropriate for server-side inference. YOLO models in their official implementations are not suitable for inference in CPUs.
Submitted 21 February, 2021; v1 submitted 28 September, 2020;
originally announced September 2020.
-
Towards Heterogeneous Multi-Agent Reinforcement Learning with Graph Neural Networks
Authors:
Douglas De Rizzo Meneghetti,
Reinaldo Augusto da Costa Bianchi
Abstract:
This work proposes a neural network architecture that learns policies for multiple agent classes in a heterogeneous multi-agent reinforcement setting. The proposed network uses directed labeled graph representations for states, encodes feature vectors of different sizes for different entity classes, uses relational graph convolution layers to model different communication channels between entity types and learns distinct policies for different agent classes, sharing parameters wherever possible. Results have shown that specializing the communication channels between entity classes is a promising step to achieve higher performance in environments composed of heterogeneous entities.
Submitted 20 October, 2020; v1 submitted 28 September, 2020;
originally announced September 2020.
-
Improving Image Classification Robustness through Selective CNN-Filters Fine-Tuning
Authors:
Alessandro Bianchi,
Moreno Raimondo Vendra,
Pavlos Protopapas,
Marco Brambilla
Abstract:
Image quality plays a big role in CNN-based image classification performance. Fine-tuning the network with distorted samples may be too costly for large networks. To solve this issue, we propose a transfer learning approach optimized to take into account that in each layer of a CNN some filters are more susceptible to image distortion than others. Our method identifies the most susceptible filters and applies retraining only to the filters that show the highest activation maps distance between clean and distorted images. Filters are ranked using the Borda count election method and then only the most affected filters are fine-tuned. This significantly reduces the number of parameters to retrain. We evaluate this approach on the CIFAR-10 and CIFAR-100 datasets, testing it on two different models and two different types of distortion. Results show that the proposed transfer learning technique recovers most of the lost performance due to input data distortion, at a considerably faster pace with respect to existing methods, thanks to the reduced number of parameters to fine-tune. When few noisy samples are provided for training, our filter-level fine-tuning performs particularly well, also outperforming state-of-the-art layer-level transfer learning approaches.
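The Borda count ranking mentioned in this abstract can be sketched as follows. The abstract does not say what plays the role of a voter, so this sketch assumes that each row of `distance_rows` holds one set of per-filter distances between clean and distorted activation maps (for instance, one row per image pair). The function names `borda_rank` and `select_filters` are hypothetical, and this is an illustration of the generic Borda count, not the authors' implementation.

```python
def borda_rank(distance_rows):
    """Rank filter indices by Borda count, most affected filter first.

    Each row acts as one voter: it orders the filters from largest to
    smallest distance and awards n-1 points to the first, n-2 to the
    second, and so on down to 0 for the last.
    """
    n = len(distance_rows[0])
    points = [0] * n
    for row in distance_rows:
        order = sorted(range(n), key=lambda f: row[f], reverse=True)
        for position, f in enumerate(order):
            points[f] += n - 1 - position
    return sorted(range(n), key=lambda f: points[f], reverse=True)


def select_filters(distance_rows, k):
    """Return the k filters with the highest Borda score (candidates for fine-tuning)."""
    return borda_rank(distance_rows)[:k]
```

With three voters over three filters, `select_filters([[0.1, 0.9, 0.5], [0.2, 0.8, 0.3], [0.7, 0.6, 0.1]], 1)` picks filter 1, which is ranked first by two voters and second by the third. Only the selected filters would then be unfrozen for retraining.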
Submitted 8 April, 2019;
originally announced April 2019.
-
Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles
Authors:
Thiago Freitas dos Santos,
Paulo E. Santos,
Leonardo A. Ferreira,
Reinaldo A. C. Bianchi,
Pedro Cabalar
Abstract:
Spatial puzzles composed of rigid objects, flexible strings and holes offer interesting domains for reasoning about spatial entities that are common in everyday human activities. The goal of this work is to investigate the automated solution of this kind of puzzle by adapting an algorithm that combines Answer Set Programming (ASP) with Markov Decision Process (MDP), algorithm oASP(MDP), to use heuristics that accelerate the learning process. ASP is applied to represent the domain as an MDP, while a Reinforcement Learning algorithm (Q-Learning) is used to find the optimal policies. In this work, the heuristics were obtained from the solution of relaxed versions of the puzzles. Experiments were performed on deterministic, non-deterministic and non-stationary versions of the puzzles. Results show that the proposed approach can accelerate the learning process, presenting an advantage when compared to the non-heuristic versions of oASP(MDP) and Q-Learning.
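One common way to let a heuristic guide Q-Learning is to add a weighted heuristic bonus to the action values at selection time while leaving the standard Q-Learning update unchanged. The abstract does not specify how the heuristic enters the algorithm, so the sketch below is an assumption for illustration only: `choose_action`, `q_update`, the `weight` parameter, and the dictionary-based tables are hypothetical, and the ASP component of oASP(MDP) is not modelled.

```python
def choose_action(q, heuristic, state, actions, weight=1.0):
    """Greedy choice over Q-value plus a weighted heuristic bonus.

    q and heuristic map (state, action) pairs to floats; missing pairs
    count as 0.0. The heuristic biases selection only and is never
    written into q.
    """
    return max(
        actions,
        key=lambda a: q.get((state, a), 0.0) + weight * heuristic.get((state, a), 0.0),
    )


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Standard tabular Q-Learning update; returns the new Q-value."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]
```

In this sketch, a heuristic derived from a relaxed puzzle would populate `heuristic` with positive values for promising moves, so that early exploration favours them before the Q-table has learned anything. In practice the greedy choice would be mixed with random exploration.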
Submitted 15 February, 2019;
originally announced March 2019.
-
RealPen: Providing Realism in Handwriting Tasks on Touch Surfaces using Auditory-Tactile Feedback
Authors:
Youngjun Cho,
Andrea Bianchi,
Nicolai Marquardt,
Nadia Bianchi-Berthouze
Abstract:
We present RealPen, an augmented stylus for capacitive tablet screens that recreates the physical sensation of writing on paper with a pencil, ball-point pen or marker pen. The aim is to create a more engaging experience when writing on touch surfaces, such as screens of tablet computers. This is achieved by re-generating the friction-induced oscillation and sound of a real writing tool in contact…
▽ More
We present RealPen, an augmented stylus for capacitive tablet screens that recreates the physical sensation of writing on paper with a pencil, ball-point pen or marker pen. The aim is to create a more engaging experience when writing on touch surfaces, such as screens of tablet computers. This is achieved by re-generating the friction-induced oscillation and sound of a real writing tool in contact with paper. To generate realistic tactile feedback, our algorithm analyses the frequency spectrum of the friction oscillation generated when writing with traditional tools, extracts principal frequencies, and uses the actuator's frequency response profile for an adjustment weighting function. We enhance the realism by providing the sound feedback aligned with the writing pressure and speed. Furthermore, we investigated the effects of superposition and fluctuation of several frequencies on human tactile perception, evaluated the performance of RealPen, and characterized users' perception and preference of each feedback type.
Submitted 6 March, 2018;
originally announced March 2018.
-
An ASM-based Characterization of Starvation-free Systems
Authors:
Alessandro Bianchi,
Sebastiano Pizzutilo,
Gennaro Vessio
Abstract:
Abstract State Machines (ASMs) have been successfully applied for modeling critical and complex systems in a wide range of application domains. However, unlike other well-known formalisms, e.g. Petri nets, ASMs lack inherent, domain-independent characterisations of computationally important properties. Here, we provide an ASM-based characterisation of the starvation-free property. The classic, informal notion of starvation, usually provided in literature, is analysed and expressed as a necessary condition in terms of ASMs. Thus, we enrich the ASM framework with the notion of vulnerable rule as a practical tool for analysing starvation issues in an operational fashion.
Submitted 20 July, 2017;
originally announced July 2017.
-
A method for the online construction of the set of states of a Markov Decision Process using Answer Set Programming
Authors:
Leonardo A. Ferreira,
Reinaldo A. C. Bianchi,
Paulo E. Santos,
Ramon Lopez de Mantaras
Abstract:
Non-stationary domains, that change in unpredicted ways, are a challenge for agents searching for optimal policies in sequential decision-making problems. This paper presents a combination of Markov Decision Processes (MDP) with Answer Set Programming (ASP), named Online ASP for MDP (oASP(MDP)), which is a method capable of constructing the set of domain states while the agent interacts with a changing environment. oASP(MDP) updates previously obtained policies, learnt by means of Reinforcement Learning (RL), using rules that represent the domain changes observed by the agent. These rules represent a set of domain constraints that are processed as ASP programs reducing the search space. Results show that oASP(MDP) is capable of finding solutions for problems in non-stationary domains without interfering with the action-value function approximation process.
Submitted 5 June, 2017;
originally announced June 2017.
-
Answer Set Programming for Non-Stationary Markov Decision Processes
Authors:
Leonardo A. Ferreira,
Reinaldo A. C. Bianchi,
Paulo E. Santos,
Ramon Lopez de Mantaras
Abstract:
Non-stationary domains, where unforeseen changes happen, present a challenge for agents to find an optimal policy for a sequential decision making problem. This work investigates a solution to this problem that combines Markov Decision Processes (MDP) and Reinforcement Learning (RL) with Answer Set Programming (ASP) in a method we call ASP(RL). In this method, Answer Set Programming is used to find the possible trajectories of an MDP, from where Reinforcement Learning is applied to learn the optimal policy of the problem. Results show that ASP(RL) is capable of efficiently finding the optimal solution of an MDP representing non-stationary domains.
Submitted 3 May, 2017;
originally announced May 2017.