Showing 1–29 of 29 results for author: Sperduti, A

  1. arXiv:2406.06588  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

    Authors: Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

    Abstract: Large Language Models (LLMs) achieve impressive performance in a wide range of tasks, even if they are often trained with the only objective of chatting fluently with users. Among other skills, LLMs show emergent abilities in mathematical reasoning benchmarks, which can be elicited with appropriate prompting methods. In this work, we systematically investigate the capabilities and limitations of p…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at 33rd International Conference on Artificial Neural Networks (ICANN24)

  2. arXiv:2402.17407  [pdf, other]

    cs.NE cs.AI cs.CL

    A Neural Rewriting System to Solve Algorithmic Problems

    Authors: Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

    Abstract: Modern neural network architectures still struggle to learn algorithmic procedures that require to systematically apply compositional rules to solve out-of-distribution problem instances. In this work, we focus on formula simplification problems, a class of synthetic benchmarks used to study the systematic generalization capabilities of neural architectures. We propose a modular architecture desig…

    Submitted 12 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Updated version (v2) accepted at the 27th European Conference on Artificial Intelligence (ECAI 24)

  3. arXiv:2402.17396  [pdf, other]

    cs.CL cs.AI cs.NE

    Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies

    Authors: Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

    Abstract: Large Language Models (LLMs) have revolutionized the field of Natural Language Processing thanks to their ability to reuse knowledge acquired on massive text corpora on a wide variety of downstream tasks, with minimal (if any) tuning steps. At the same time, it has been repeatedly shown that LLMs lack systematic generalization, which allows to extrapolate the learned statistical regularities outsi…

    Submitted 11 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted at LREC-COLING 2024. Added acknowledgements

  4. arXiv:2306.17249  [pdf, other]

    cs.NE cs.AI

    A Hybrid System for Systematic Generalization in Simple Arithmetic Problems

    Authors: Flavio Petruzzellis, Alberto Testolin, Alessandro Sperduti

    Abstract: Solving symbolic reasoning problems that require compositionality and systematicity is considered one of the key ingredients of human intelligence. However, symbolic reasoning is still a great challenge for deep learning models, which often cannot generalize the reasoning pattern to out-of-distribution test cases. In this work, we propose a hybrid system capable of solving arithmetic problems that…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: Accepted at NeSy 2023, 17th International Workshop on Neural-Symbolic Learning and Reasoning

    ACM Class: I.1.1; I.5.1; I.2.6

  5. arXiv:2305.11699  [pdf, ps, other]

    cs.LG cs.AI q-bio.BM

    RGCVAE: Relational Graph Conditioned Variational Autoencoder for Molecule Design

    Authors: Davide Rigoni, Nicolò Navarin, Alessandro Sperduti

    Abstract: Identifying molecules that exhibit some pre-specified properties is a difficult problem to solve. In the last few years, deep generative models have been used for molecule generation. Deep Graph Variational Autoencoders are among the most powerful machine learning tools with which it is possible to address this problem. However, existing methods struggle in capturing the true data distribution and…

    Submitted 8 June, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

  6. arXiv:2305.10913  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    Weakly-Supervised Visual-Textual Grounding with Semantic Prior Refinement

    Authors: Davide Rigoni, Luca Parolari, Luciano Serafini, Alessandro Sperduti, Lamberto Ballan

    Abstract: Using only image-sentence pairs, weakly-supervised visual-textual grounding aims to learn region-phrase correspondences of the respective entity mentions. Compared to the supervised approach, learning is more difficult since bounding boxes and textual phrases correspondences are unavailable. In light of this, we propose the Semantic Prior Refinement Model (SPRM), whose predictions are obtained by…

    Submitted 26 September, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  7. A Better Loss for Visual-Textual Grounding

    Authors: Davide Rigoni, Luciano Serafini, Alessandro Sperduti

    Abstract: Given a textual phrase and an image, the visual grounding problem is the task of locating the content of the image referenced by the sentence. It is a challenging task that has several real-world applications in human-computer interaction, image-text reference resolution, and video-text reference resolution. In the last years, several works have addressed this problem by proposing more and more la…

    Submitted 2 February, 2022; v1 submitted 11 August, 2021; originally announced August 2021.

  8. arXiv:2106.05809  [pdf, other]

    cs.LG

    Simple Graph Convolutional Networks

    Authors: Luca Pasa, Nicolò Navarin, Wolfgang Erb, Alessandro Sperduti

    Abstract: Many neural networks for graphs are based on the graph convolution operator, proposed more than a decade ago. Since then, many alternative definitions have been proposed, that tend to add complexity (and non-linearity) to the model. In this paper, we follow the opposite direction by proposing simple graph convolution operators, that can be implemented in single-layer graph convolutional networks.…

    Submitted 10 June, 2021; originally announced June 2021.

  9. arXiv:2104.09159  [pdf, other]

    cs.CV cs.AI cs.LG

    Conditional Variational Capsule Network for Open Set Recognition

    Authors: Yunrui Guo, Guglielmo Camporese, Wenjing Yang, Alessandro Sperduti, Lamberto Ballan

    Abstract: In open set recognition, a classifier has to detect unknown classes that are not known at training time. In order to recognize new categories, the classifier has to project the input samples of known classes in very compact and separated regions of the features space for discriminating samples of unknown classes. Recently proposed Capsule Networks have shown to outperform alternatives in many fiel…

    Submitted 17 August, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: Accepted to ICCV 2021

  10. arXiv:2011.02886  [pdf, other]

    cs.LG

    Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization

    Authors: Antonio Carta, Alessandro Sperduti, Davide Bacciu

    Abstract: Training RNNs to learn long-term dependencies is difficult due to vanishing gradients. We explore an alternative solution based on explicit memorization using linear autoencoders for sequences, which allows to maximize the short-term memory and that can be solved with a closed-form solution without backpropagation. We introduce an initialization schema that pretrains the weights of a recurrent neu…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: Accepted at NeurIPS 2020 workshop "Beyond Backpropagation: Novel Ideas for Training Neural Architectures"

  11. arXiv:2009.00725  [pdf, other]

    cs.LG cs.AI

    Conditional Constrained Graph Variational Autoencoders for Molecule Design

    Authors: Davide Rigoni, Nicolò Navarin, Alessandro Sperduti

    Abstract: In recent years, deep generative models for graphs have been used to generate new molecules. These models have produced good results, leading to several proposals in the literature. However, these models may have troubles learning some of the complex laws governing the chemical world. In this work, we explore the usage of the histogram of atom valences to drive the generation of molecules in such…

    Submitted 1 September, 2020; originally announced September 2020.

  12. arXiv:2008.12676  [pdf, ps, other]

    physics.plasm-ph

    Validation of neutron emission and neutron energy spectrum calculations on MAST with DRESS

    Authors: Andrea Sperduti, Iwona Klimek, Sean Conroy, Marco Cecconello, Marina Gorelenkova, Antti Snicker

    Abstract: The recently developed Directional RElativistic Spectrum Simulator (DRESS) code has been validated for the first time against numerical calculations and experimental measurements performed on MAST. In this validation, the neutron emissivities and rates computed by DRESS are benchmarked against TRANSP/NUBEAM predictions while the neutron energy spectra provided by DRESS taking as input TRANSP/NUBEA…

    Submitted 28 August, 2020; originally announced August 2020.

  13. arXiv:2008.09168  [pdf, ps, other]

    cs.LG q-bio.QM

    A Systematic Assessment of Deep Learning Models for Molecule Generation

    Authors: Davide Rigoni, Nicolò Navarin, Alessandro Sperduti

    Abstract: In recent years the scientific community has devoted much effort in the development of deep learning models for the generation of new molecules with desirable properties (i.e. drugs). This has produced many proposals in literature. However, a systematic comparison among the different VAE methods is still missing. For this reason, we propose an extensive testbed for the evaluation of generative mod…

    Submitted 20 August, 2020; originally announced August 2020.

  14. arXiv:2006.16800  [pdf, other]

    cs.LG stat.ML

    Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory

    Authors: Antonio Carta, Alessandro Sperduti, Davide Bacciu

    Abstract: The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be introduced into a neural architecture by an appropriate modularization of the dynamic memory. In this paper we propose a novel incrementally trained recurrent architec…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: accepted @ ECML 2020. arXiv admin note: substantial text overlap with arXiv:2001.11771

  15. arXiv:2001.11771  [pdf, other]

    cs.LG cs.NE stat.ML

    Encoding-based Memory Modules for Recurrent Neural Networks

    Authors: Antonio Carta, Alessandro Sperduti, Davide Bacciu

    Abstract: Learning to solve sequential tasks with recurrent models requires the ability to memorize long sequences and to extract task-relevant features from them. In this paper, we study the memorization subtask from the point of view of the design and training of recurrent neural networks. We propose a new model, the Linear Memory Network, which features an encoding-based memorization component built with…

    Submitted 31 January, 2020; originally announced January 2020.

    Comments: preprint submitted at Elsevier Neural Networks

  16. arXiv:1905.06147  [pdf, ps, other]

    cs.LG stat.ML

    Embeddings and Representation Learning for Structured Data

    Authors: Benjamin Paaßen, Claudio Gallicchio, Alessio Micheli, Alessandro Sperduti

    Abstract: Performing machine learning on structured data is complicated by the fact that such data does not have vectorial form. Therefore, multiple approaches have emerged to construct vectorial representations of structured data, from kernel and distance approaches to recurrent, recursive, and convolutional neural networks. Recent years have seen heightened attention in this demanding field of research an…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: Oral presentation at the 27th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2019) in Bruges, Belgium, on April 24th, 2019

    Journal ref: Proc. ESANN (2019), 85-94

  17. arXiv:1811.10435  [pdf, other]

    cs.LG stat.ML

    On Filter Size in Graph Convolutional Networks

    Authors: Dinh Van Tran, Nicolò Navarin, Alessandro Sperduti

    Abstract: Recently, many researchers have been focusing on the definition of neural networks for graphs. The basic component for many of these approaches remains the graph convolution idea proposed almost a decade ago. In this paper, we extend this basic component, following an intuition derived from the well-known convolutional filters over multi-dimensional tensors. In particular, we derive a simple, effi…

    Submitted 23 November, 2018; originally announced November 2018.

    Comments: arXiv admin note: text overlap with arXiv:1811.06930

    Journal ref: IEEE Symposium on Deep Learning, 2018 Symposium Series on Computational Intelligence, 18 - 21 November, 2018, Bengaluru, India

  18. arXiv:1811.06930  [pdf, other]

    cs.LG stat.ML

    Pre-training Graph Neural Networks with Kernels

    Authors: Nicolò Navarin, Dinh V. Tran, Alessandro Sperduti

    Abstract: Many machine learning techniques have been proposed in the last few years to process data represented in graph-structured form. Graphs can be used to model several scenarios, from molecules and materials to RNA secondary structures. Several kernel functions have been defined on graphs that coupled with kernelized learning algorithms, have shown state-of-the-art performances on many tasks. Recently…

    Submitted 16 November, 2018; originally announced November 2018.

  19. arXiv:1811.03356  [pdf, other]

    cs.LG stat.ML

    Linear Memory Networks

    Authors: Davide Bacciu, Antonio Carta, Alessandro Sperduti

    Abstract: Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled. We introduce a novel recurrent architecture based on the conceptual separation between the functional input-output transformation and the memory mechanism, show…

    Submitted 8 November, 2018; originally announced November 2018.

  20. arXiv:1802.08132  [pdf]

    nucl-ex physics.ins-det

    Results of the first user program on the Homogenous Thermal Neutron Source HOTNES (ENEA / INFN)

    Authors: A. Sperduti, M. Angelone, R. Bedogni, G. Claps, E. Diociaiuti, C. Domingo, R. Donghia, S. Giovannella, J. M. Gomez-Ros, L. Irazola-Rosales, S. Loreti, V. Monti, S. Miscetti, F. Murtas, G. Pagano, M. Pillon, R. Pilotti, A. Pola, M. Romero-Expósito, F. Sánchez-Doblado, O. Sans-Planell, A. Scherillo, E. Soldani, M. Treccani, A. Pietropaolo

    Abstract: The HOmogeneous Thermal NEutron Source (HOTNES) is a new type of thermal neutron irradiation assembly developed by the ENEA-INFN collaboration. The facility is fully characterized in terms of neutron field and dosimetric quantities, by either computational and experimental methods. This paper reports the results of the first "HOTNES users program", carried out in 2016, and covering a variety of th…

    Submitted 22 February, 2018; originally announced February 2018.

    Journal ref: Journal of Instrumentation, Volume 12, December 2017

  21. arXiv:1711.03822  [pdf, other]

    cs.LG stat.ML

    LSTM Networks for Data-Aware Remaining Time Prediction of Business Process Instances

    Authors: Nicolò Navarin, Beatrice Vincenzi, Mirko Polato, Alessandro Sperduti

    Abstract: Predicting the completion time of business process instances would be a very helpful aid when managing processes under service level agreement constraints. The ability to know in advance the trend of running process instances would allow business managers to react in time, in order to prevent delays or undesirable situations. However, making such accurate forecasts is not easy: many factors may in…

    Submitted 10 November, 2017; originally announced November 2017.

    Comments: Article accepted for publication in 2017 IEEE Symposium on Deep Learning (IEEE DL'17) @ SSCI

  22. arXiv:1602.07566  [pdf, other]

    cs.AI

    Time and Activity Sequence Prediction of Business Process Instances

    Authors: Mirko Polato, Alessandro Sperduti, Andrea Burattin, Massimiliano de Leoni

    Abstract: The ability to know in advance the trend of running process instances, with respect to different features, such as the expected completion time, would allow business managers to timely counteract to undesired situations, in order to prevent losses. Therefore, the ability to accurately predict future features of running business process instances would be a very helpful aid when managing processes,…

    Submitted 24 February, 2016; originally announced February 2016.

  23. Graph Kernels exploiting Weisfeiler-Lehman Graph Isomorphism Test Extensions

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: In this paper we present a novel graph kernel framework inspired by the Weisfeiler-Lehman (WL) isomorphism tests. Any WL test comprises a relabelling phase of the nodes based on test-specific information extracted from the graph, for example the set of neighbours of a node. We defined a novel relabelling and derived two kernels of the framework from it. The novel kernels are very fast to compu…

    Submitted 22 September, 2015; originally announced September 2015.

    Journal ref: Neural Information Processing, Volume 8835 of the series Lecture Notes in Computer Science pp 93-100, 2014 Springer International Publishing

  24. A tree-based kernel for graphs with continuous attributes

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: The availability of graph data with node attributes that can be either discrete or real-valued is constantly increasing. While existing kernel methods are effective techniques for dealing with graphs having discrete node labels, their adaptation to non-discrete or continuous node attributes has been limited, mainly for computational issues. Recently, a few kernels especially tailored for this doma…

    Submitted 20 December, 2016; v1 submitted 3 September, 2015; originally announced September 2015.

    Comments: This work has been submitted to the IEEE Transactions on Neural Networks and Learning Systems for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  25. Ordered Decompositional DAG Kernels Enhancements

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: In this paper, we show how the Ordered Decomposition DAGs (ODD) kernel framework, a framework that allows the definition of graph kernels from tree kernels, allows to easily define new state-of-the-art graph kernels. Here we consider a fast graph kernel based on the Subtree kernel (ST), and we propose various enhancements to increase its expressiveness. The proposed DAG kernel has the same worst…

    Submitted 28 December, 2015; v1 submitted 13 July, 2015; originally announced July 2015.

    Comments: Paper accepted for publication in Neurocomputing

    Journal ref: Neurocomputing, Volume 192, 5 June 2016, Pages 92--103

  26. Extending local features with contextual information in graph kernels

    Authors: Nicolò Navarin, Alessandro Sperduti, Riccardo Tesselli

    Abstract: Graph kernels are usually defined in terms of simpler kernels over local substructures of the original graphs. Different kernels consider different types of substructures. However, in some cases they have similar predictive performances, probably because the substructures can be interpreted as approximations of the subgraphs they induce. In this paper, we propose to associate to each feature a pie…

    Submitted 3 September, 2015; v1 submitted 8 July, 2015; originally announced July 2015.

    Comments: To appear in ICONIP 2015

    Report number: 9492, pp 271-279

    Journal ref: Lecture Notes in Computer Science, Neural Information Processing, 22nd International Conference, ICONIP 2015, November 9-12, 2015, Proceedings, Part IV

  27. arXiv:1507.02158  [pdf, other]

    cs.LG

    An Empirical Study on Budget-Aware Online Kernel Algorithms for Streams of Graphs

    Authors: Giovanni Da San Martino, Nicolò Navarin, Alessandro Sperduti

    Abstract: Kernel methods are considered an effective technique for on-line learning. Many approaches have been developed for compactly representing the dual solution of a kernel method when the problem imposes memory constraints. However, in literature no work is specifically tailored to streams of graphs. Motivated by the fact that the size of the feature space representation of many state-of-the-art graph…

    Submitted 20 July, 2016; v1 submitted 8 July, 2015; originally announced July 2015.

    Comments: Author's version of the manuscript, to appear in Neurocomputing (ELSEVIER)

  28. Conformance Checking Based on Multi-Perspective Declarative Process Models

    Authors: Andrea Burattin, Fabrizio Maria Maggi, Alessandro Sperduti

    Abstract: Process mining is a family of techniques that aim at analyzing business process execution data recorded in event logs. Conformance checking is a branch of this discipline embracing approaches for verifying whether the behavior of a process, as recorded in a log, is in line with some expected behaviors provided in the form of a process model. The majority of these approaches require the input proce…

    Submitted 17 March, 2015; originally announced March 2015.

  29. Heuristics Miners for Streaming Event Data

    Authors: Andrea Burattin, Alessandro Sperduti, Wil M. P. van der Aalst

    Abstract: More and more business activities are performed using information systems. These systems produce such huge amounts of event data that existing systems are unable to store and process them. Moreover, few processes are in steady-state and due to changing circumstances processes evolve and systems need to adapt continuously. Since conventional process discovery algorithms have been defined for batch…

    Submitted 27 December, 2012; originally announced December 2012.

    ACM Class: H.2.8; F.1.2