Search | arXiv e-print repository

A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems

Authors: Florin Cuconasu, Giovanni Trappolini, Nicola Tonellotto, Fabrizio Silvestri

Abstract: Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence combining a retrieval phase with a generative phase, with the latter typically being powered by large language models (LLMs). The current common practices in RAG involve using "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and are al… ▽ More Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence combining a retrieval phase with a generative phase, with the latter typically being powered by large language models (LLMs). The current common practices in RAG involve using "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and are aligned with human preferences using state-of-the-art techniques. Contrary to popular belief, our study demonstrates that base models outperform their instructed counterparts in RAG tasks by 20% on average under our experimental settings. This finding challenges the prevailing assumptions about the superiority of instructed LLMs in RAG applications. Further investigations reveal a more nuanced situation, questioning fundamental aspects of RAG and suggesting the need for broader discussions on the topic; or, as Fromm would have it, "Seldom is a glance at the statistics enough to understand the meaning of the figures". △ Less

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2406.12535 [pdf, other]

Predicting Award Winning Research Papers at Publication Time

Authors: Riccardo Vella, Andrea Vitaletti, Fabrizio Silvestri

Abstract: In recent years, many studies have been focusing on predicting the scientific impact of research papers. Most of these predictions are based on citations count or rely on features obtainable only from already published papers. In this study, we predict the likelihood for a research paper of winning an award only relying on information available at publication time. For each paper, we build the cit… ▽ More In recent years, many studies have been focusing on predicting the scientific impact of research papers. Most of these predictions are based on citations count or rely on features obtainable only from already published papers. In this study, we predict the likelihood for a research paper of winning an award only relying on information available at publication time. For each paper, we build the citation subgraph induced from its bibliography. We initially consider some features of this subgraph, such as the density and the global clustering coefficient, to make our prediction. Then, we mix this information with textual features, extracted from the abstract and the title, to obtain a more accurate final prediction. We made our experiments considering the ArnetMiner citation graph, while the ground truth on award-winning papers has been obtained from a collection of best paper awards from 32 computer science conferences. In our experiment, we obtained an encouraging F1 score of 0.694. Remarkably, The high recall and the low false negatives rate, show how the model performs very well at identifying papers that will not win an award. This behavior can help researchers in getting a first evaluation of their work at publication time. Lastly, we made some first experiments on interpretability. Our results highlight some interesting patterns both in topological and textual features. △ Less

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.11720 [pdf, other]

Graph Neural Re-Ranking via Corpus Graph

Authors: Andrea Giuseppe Di Francesco, Christian Giannetti, Nicola Tonellotto, Fabrizio Silvestri

Abstract: Re-ranking systems aim to reorder an initial list of documents to satisfy better the information needs associated with a user-provided query. Modern re-rankers predominantly rely on neural network models, which have proven highly effective in representing samples from various modalities. However, these models typically evaluate query-document pairs in isolation, neglecting the underlying document… ▽ More Re-ranking systems aim to reorder an initial list of documents to satisfy better the information needs associated with a user-provided query. Modern re-rankers predominantly rely on neural network models, which have proven highly effective in representing samples from various modalities. However, these models typically evaluate query-document pairs in isolation, neglecting the underlying document distribution that could enhance the quality of the re-ranked list. To address this limitation, we propose Graph Neural Re-Ranking (GNRR), a pipeline based on Graph Neural Networks (GNNs), that enables each query to consider documents distribution during inference. Our approach models document relationships through corpus subgraphs and encodes their representations using GNNs. Through extensive experiments, we demonstrate that GNNs effectively capture cross-document interactions, improving performance on popular ranking metrics. In TREC-DL19, we observe a relative improvement of 5.8% in Average Precision compared to our baseline. These findings suggest that integrating the GNN segment offers significant advantages, especially in scenarios where understanding the broader context of documents is crucial. △ Less

Submitted 17 June, 2024; originally announced June 2024.

Comments: This preprint is the result of work in progress, therefore it should still be considered a draft

arXiv:2405.19749 [pdf, other]

Generating Query Recommendations via LLMs

Authors: Andrea Bacciu, Enrico Palumbo, Andreas Damianou, Nicola Tonellotto, Fabrizio Silvestri

Abstract: Query recommendation systems are ubiquitous in modern search engines, assisting users in producing effective queries to meet their information needs. However, these systems require a large amount of data to produce good recommendations, such as a large collection of documents to index and query logs. In particular, query logs and user data are not available in cold start scenarios. Query logs are… ▽ More Query recommendation systems are ubiquitous in modern search engines, assisting users in producing effective queries to meet their information needs. However, these systems require a large amount of data to produce good recommendations, such as a large collection of documents to index and query logs. In particular, query logs and user data are not available in cold start scenarios. Query logs are expensive to collect and maintain and require complex and time-consuming cascading pipelines for creating, combining, and ranking recommendations. To address these issues, we frame the query recommendation problem as a generative task, proposing a novel approach called Generative Query Recommendation (GQR). GQR uses an LLM as its foundation and does not require to be trained or fine-tuned to tackle the query recommendation problem. We design a prompt that enables the LLM to understand the specific recommendation task, even using a single example. We then improved our system by proposing a version that exploits query logs called Retriever-Augmented GQR (RA-GQR). RA-GQr dynamically composes its prompt by retrieving similar queries from query logs. GQR approaches reuses a pre-existing neural architecture resulting in a simpler and more ready-to-market approach, even in a cold start scenario. Our proposed GQR obtains state-of-the-art performance in terms of NDCG@10 and clarity score against two commercial search engines and the previous state-of-the-art approach on the Robust04 and ClueWeb09B collections, improving on average the NDCG@10 performance up to ~4% on Robust04 and ClueWeb09B w.r.t the previous best competitor. RA-GQR further improve the NDCG@10 obtaining an increase of ~11%, ~6\% on Robust04 and ClueWeb09B w.r.t the best competitor. Furthermore, our system obtained ~59% of user preferences in a blind user study, proving that our method produces the most engaging queries. △ Less

Submitted 4 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

Comments: Generating Query Recommendations via LLMs

MSC Class: H.3.3

arXiv:2404.15760 [pdf, other]

Debiasing Machine Unlearning with Counterfactual Examples

Authors: Ziheng Chen, Jia Wang, Jun Zhuang, Abbavaram Gowtham Reddy, Fabrizio Silvestri, ** Huang, Kaushiki Nag, Kun Kuang, Xin Ning, Gabriele Tolomei

Abstract: The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine-learning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: unlearning processes bias. This bias emerges from two main sources: (1… ▽ More The right to be forgotten (RTBF) seeks to safeguard individuals from the enduring effects of their historical actions by implementing machine-learning techniques. These techniques facilitate the deletion of previously acquired knowledge without requiring extensive model retraining. However, they often overlook a critical issue: unlearning processes bias. This bias emerges from two main sources: (1) data-level bias, characterized by uneven data removal, and (2) algorithm-level bias, which leads to the contamination of the remaining dataset, thereby degrading model accuracy. In this work, we analyze the causal factors behind the unlearning process and mitigate biases at both data and algorithmic levels. Typically, we introduce an intervention-based approach, where knowledge to forget is erased with a debiased dataset. Besides, we guide the forgetting procedure by leveraging counterfactual examples, as they maintain semantic data consistency without hurting performance on the remaining dataset. Experimental results demonstrate that our method outperforms existing machine unlearning baselines on evaluation metrics. △ Less

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2403.14339 [pdf, other]

$\nabla τ$: Gradient-based and Task-Agnostic machine Unlearning

Authors: Daniel Trippa, Cesare Campagnano, Maria Sofia Bucarelli, Gabriele Tolomei, Fabrizio Silvestri

Abstract: Machine Unlearning, the process of selectively eliminating the influence of certain data examples used during a model's training, has gained significant attention as a means for practitioners to comply with recent data protection regulations. However, existing unlearning methods face critical drawbacks, including their prohibitively high cost, often associated with a large number of hyperparameter… ▽ More Machine Unlearning, the process of selectively eliminating the influence of certain data examples used during a model's training, has gained significant attention as a means for practitioners to comply with recent data protection regulations. However, existing unlearning methods face critical drawbacks, including their prohibitively high cost, often associated with a large number of hyperparameters, and the limitation of forgetting only relatively small data portions. This often makes retraining the model from scratch a quicker and more effective solution. In this study, we introduce Gradient-based and Task-Agnostic machine Unlearning ($\nabla τ$), an optimization framework designed to remove the influence of a subset of training data efficiently. It applies adaptive gradient ascent to the data to be forgotten while using standard gradient descent for the remaining data. $\nabla τ$ offers multiple benefits over existing approaches. It enables the unlearning of large sections of the training dataset (up to 30%). It is versatile, supporting various unlearning tasks (such as subset forgetting or class removal) and applicable across different domains (images, text, etc.). Importantly, $\nabla τ$ requires no hyperparameter adjustments, making it a more appealing option than retraining the model from scratch. We evaluate our framework's effectiveness using a set of well-established Membership Inference Attack metrics, demonstrating up to 10% enhancements in performance compared to state-of-the-art methods without compromising the original model's accuracy. △ Less

Submitted 21 March, 2024; originally announced March 2024.

Comments: 14 pages, 2 figures

arXiv:2403.05358 [pdf, other]

Variational Inference of Parameters in Opinion Dynamics Models

Authors: Jacopo Lenti, Fabrizio Silvestri, Gianmarco De Francisci Morales

Abstract: Despite the frequent use of agent-based models (ABMs) for studying social phenomena, parameter estimation remains a challenge, often relying on costly simulation-based heuristics. This work uses variational inference to estimate the parameters of an opinion dynamics ABM, by transforming the estimation problem into an optimization task that can be solved directly. Our proposal relies on probabili… ▽ More Despite the frequent use of agent-based models (ABMs) for studying social phenomena, parameter estimation remains a challenge, often relying on costly simulation-based heuristics. This work uses variational inference to estimate the parameters of an opinion dynamics ABM, by transforming the estimation problem into an optimization task that can be solved directly. Our proposal relies on probabilistic generative ABMs (PGABMs): we start by synthesizing a probabilistic generative model from the ABM rules. Then, we transform the inference process into an optimization problem suitable for automatic differentiation. In particular, we use the Gumbel-Softmax reparameterization for categorical agent attributes and stochastic variational inference for parameter estimation. Furthermore, we explore the trade-offs of using variational distributions with different complexity: normal distributions and normalizing flows. We validate our method on a bounded confidence model with agent roles (leaders and followers). Our approach estimates both macroscopic (bounded confidence intervals and backfire thresholds) and microscopic ($200$ categorical, agent-level roles) more accurately than simulation-based and MCMC methods. Consequently, our technique enables experts to tune and validate their ABMs against real-world observations, thus providing insights into human behavior in social systems via data-driven analysis. △ Less

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05185 [pdf, other]

Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks

Authors: Marco De Nadai, Francesco Fabbri, Paul Gigioli, Alice Wang, Ang Li, Fabrizio Silvestri, Laura Kim, Shawn Lin, Vladan Radosavljevic, Sandeep Ghael, David Nyhan, Hugues Bouchard, Mounia Lalmas-Roelleke, Andreas Damianou

Abstract: In the ever-evolving digital audio landscape, Spotify, well-known for its music and talk content, has recently introduced audiobooks to its vast user base. While promising, this move presents significant challenges for personalized recommendations. Unlike music and podcasts, audiobooks, initially available for a fee, cannot be easily skimmed before purchase, posing higher stakes for the relevance… ▽ More In the ever-evolving digital audio landscape, Spotify, well-known for its music and talk content, has recently introduced audiobooks to its vast user base. While promising, this move presents significant challenges for personalized recommendations. Unlike music and podcasts, audiobooks, initially available for a fee, cannot be easily skimmed before purchase, posing higher stakes for the relevance of recommendations. Furthermore, introducing a new content type into an existing platform confronts extreme data sparsity, as most users are unfamiliar with this new content type. Lastly, recommending content to millions of users requires the model to react fast and be scalable. To address these challenges, we leverage podcast and music user preferences and introduce 2T-HGNN, a scalable recommendation system comprising Heterogeneous Graph Neural Networks (HGNNs) and a Two Tower (2T) model. This novel approach uncovers nuanced item relationships while ensuring low latency and complexity. We decouple users from the HGNN graph and propose an innovative multi-link neighbor sampler. These choices, together with the 2T component, significantly reduce the complexity of the HGNN model. Empirical evaluations involving millions of users show significant improvement in the quality of personalized recommendations, resulting in a +46% increase in new audiobooks start rate and a +23% boost in streaming rates. Intriguingly, our model's impact extends beyond audiobooks, benefiting established products like podcasts. △ Less

Submitted 8 March, 2024; originally announced March 2024.

Comments: To appear in The Web Conference 2024 proceedings

arXiv:2402.14802 [pdf, other]

Link Prediction under Heterophily: A Physics-Inspired Graph Neural Network Approach

Authors: Andrea Giuseppe Di Francesco, Francesco Caso, Maria Sofia Bucarelli, Fabrizio Silvestri

Abstract: In the past years, Graph Neural Networks (GNNs) have become the `de facto' standard in various deep learning domains, thanks to their flexibility in modeling real-world phenomena represented as graphs. However, the message-passing mechanism of GNNs faces challenges in learnability and expressivity, hindering high performance on heterophilic graphs, where adjacent nodes frequently have different la… ▽ More In the past years, Graph Neural Networks (GNNs) have become the `de facto' standard in various deep learning domains, thanks to their flexibility in modeling real-world phenomena represented as graphs. However, the message-passing mechanism of GNNs faces challenges in learnability and expressivity, hindering high performance on heterophilic graphs, where adjacent nodes frequently have different labels. Most existing solutions addressing these challenges are primarily confined to specific benchmarks focused on node classification tasks. This narrow focus restricts the potential impact that link prediction under heterophily could offer in several applications, including recommender systems. For example, in social networks, two users may be connected for some latent reason, making it challenging to predict such connections in advance. Physics-Inspired GNNs such as GRAFF provided a significant contribution to enhance node classification performance under heterophily, thanks to the adoption of physics biases in the message-passing. Drawing inspiration from these findings, we advocate that the methodology employed by GRAFF can improve link prediction performance as well. To further explore this hypothesis, we introduce GRAFF-LP, an extension of GRAFF to link prediction. We evaluate its efficacy within a recent collection of heterophilic graphs, establishing a new benchmark for link prediction under heterophily. Our approach surpasses previous methods, in most of the datasets, showcasing a strong flexibility in different contexts, and achieving relative AUROC improvements of up to 26.7%. △ Less

Submitted 22 February, 2024; originally announced February 2024.

Comments: 7 pages, 1 figure

arXiv:2401.14887 [pdf, other]

doi 10.1145/3626772.3657834

The Power of Noise: Redefining Retrieval for RAG Systems

Authors: Florin Cuconasu, Giovanni Trappolini, Federico Siciliano, Simone Filice, Cesare Campagnano, Yoelle Maarek, Nicola Tonellotto, Fabrizio Silvestri

Abstract: Retrieval-Augmented Generation (RAG) has recently emerged as a method to extend beyond the pre-trained knowledge of Large Language Models by augmenting the original prompt with relevant passages or documents retrieved by an Information Retrieval (IR) system. RAG has become increasingly important for Generative AI solutions, especially in enterprise settings or in any domain in which knowledge is c… ▽ More Retrieval-Augmented Generation (RAG) has recently emerged as a method to extend beyond the pre-trained knowledge of Large Language Models by augmenting the original prompt with relevant passages or documents retrieved by an Information Retrieval (IR) system. RAG has become increasingly important for Generative AI solutions, especially in enterprise settings or in any domain in which knowledge is constantly refreshed and cannot be memorized in the LLM. We argue here that the retrieval component of RAG systems, be it dense or sparse, deserves increased attention from the research community, and accordingly, we conduct the first comprehensive and systematic examination of the retrieval strategy of RAG systems. We focus, in particular, on the type of passages IR systems within a RAG solution should retrieve. Our analysis considers multiple factors, such as the relevance of the passages included in the prompt context, their position, and their number. One counter-intuitive finding of this work is that the retriever's highest-scoring documents that are not directly relevant to the query (e.g., do not contain the answer) negatively impact the effectiveness of the LLM. Even more surprising, we discovered that adding random documents in the prompt improves the LLM accuracy by up to 35%. These results highlight the need to investigate the appropriate strategies when integrating retrieval with LLMs, thereby laying the groundwork for future research in this area. △ Less

Submitted 1 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.03824 [pdf, ps, other]

A topological description of loss surfaces based on Betti Numbers

Authors: Maria Sofia Bucarelli, Giuseppe Alessio D'Inverno, Monica Bianchini, Franco Scarselli, Fabrizio Silvestri

Abstract: In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to thi… ▽ More In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure to evaluate loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their loss function and revealing how that complexity is influenced by the number of hidden units, training models, and the activation function used. Additionally, we found that certain variations in the loss function or model architecture, such as adding an $\ell_2$ regularization term or implementing skip connections in a feedforward network, do not affect loss topology in specific cases. △ Less

Submitted 8 January, 2024; originally announced January 2024.

arXiv:2312.17506 [pdf, other]

A graph neural network-based model with Out-of-Distribution Robustness for enhancing Antiretroviral Therapy Outcome Prediction for HIV-1

Authors: Giulia Di Teodoro, Federico Siciliano, Valerio Guarrasi, Anne-Mieke Vandamme, Valeria Ghisetti, Anders Sönnerborg, Maurizio Zazzi, Fabrizio Silvestri, Laura Palagi

Abstract: Predicting the outcome of antiretroviral therapies for HIV-1 is a pressing clinical challenge, especially when the treatment regimen includes drugs for which limited effectiveness data is available. This scarcity of data can arise either due to the introduction of a new drug to the market or due to limited use in clinical settings. To tackle this issue, we introduce a novel joint fusion model, whi… ▽ More Predicting the outcome of antiretroviral therapies for HIV-1 is a pressing clinical challenge, especially when the treatment regimen includes drugs for which limited effectiveness data is available. This scarcity of data can arise either due to the introduction of a new drug to the market or due to limited use in clinical settings. To tackle this issue, we introduce a novel joint fusion model, which combines features from a Fully Connected (FC) Neural Network and a Graph Neural Network (GNN). The FC network employs tabular data with a feature vector made up of viral mutations identified in the most recent genotypic resistance test, along with the drugs used in therapy. Conversely, the GNN leverages knowledge derived from Stanford drug-resistance mutation tables, which serve as benchmark references for deducing in-vivo treatment efficacy based on the viral genetic sequence, to build informative graphs. We evaluated these models' robustness against Out-of-Distribution drugs in the test set, with a specific focus on the GNN's role in handling such scenarios. Our comprehensive analysis demonstrates that the proposed model consistently outperforms the FC model, especially when considering Out-of-Distribution drugs. These results underscore the advantage of integrating Stanford scores in the model, thereby enhancing its generalizability and robustness, but also extending its utility in real-world applications with limited data availability. This research highlights the potential of our approach to inform antiretroviral therapy outcome prediction and contribute to more informed clinical decisions. △ Less

Submitted 29 December, 2023; originally announced December 2023.

Comments: 32 pages, 2 figures

MSC Class: 68 ACM Class: I.2.6

arXiv:2312.15228 [pdf, other]

Adversarial Data Poisoning for Fake News Detection: How to Make a Model Misclassify a Target News without Modifying It

Authors: Federico Siciliano, Luca Maiano, Lorenzo Papa, Federica Baccini, Irene Amerini, Fabrizio Silvestri

Abstract: Fake news detection models are critical to countering disinformation but can be manipulated through adversarial attacks. In this position paper, we analyze how an attacker can compromise the performance of an online learning detector on specific news content without being able to manipulate the original target news. In some contexts, such as social networks, where the attacker cannot exert complet… ▽ More Fake news detection models are critical to countering disinformation but can be manipulated through adversarial attacks. In this position paper, we analyze how an attacker can compromise the performance of an online learning detector on specific news content without being able to manipulate the original target news. In some contexts, such as social networks, where the attacker cannot exert complete control over all the information, this scenario can indeed be quite plausible. Therefore, we show how an attacker could potentially introduce poisoning data into the training data to manipulate the behavior of an online learning method. Our initial findings reveal varying susceptibility of logistic regression models based on complexity and attack type. △ Less

Submitted 4 January, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

arXiv:2312.02401 [pdf, other]

Harmonizing Global Voices: Culturally-Aware Models for Enhanced Content Moderation

Authors: Alex J. Chan, José Luis Redondo García, Fabrizio Silvestri, Colm O'Donnel, Konstantina Palla

Abstract: Content moderation at scale faces the challenge of considering local cultural distinctions when assessing content. While global policies aim to maintain decision-making consistency and prevent arbitrary rule enforcement, they often overlook regional variations in interpreting natural language as expressed in content. In this study, we are looking into how moderation systems can tackle this issue b… ▽ More Content moderation at scale faces the challenge of considering local cultural distinctions when assessing content. While global policies aim to maintain decision-making consistency and prevent arbitrary rule enforcement, they often overlook regional variations in interpreting natural language as expressed in content. In this study, we are looking into how moderation systems can tackle this issue by adapting to local comprehension nuances. We train large language models on extensive datasets of media news and articles to create culturally attuned models. The latter aim to capture the nuances of communication across geographies with the goal of recognizing cultural and societal variations in what is considered offensive content. We further explore the capability of these models to generate explanations for instances of content violation, aiming to shed light on how policy guidelines are perceived when cultural and societal contexts change. We find that training on extensive media datasets successfully induced cultural awareness and resulted in improvements in handling content violations on a regional basis. Additionally, these advancements include the ability to provide explanations that align with the specific local norms and nuances as evidenced by the annotators' preference in our conducted study. This multifaceted success reinforces the critical role of an adaptable content moderation approach in kee** pace with the ever-evolving nature of the content it oversees. △ Less

Submitted 4 December, 2023; originally announced December 2023.

Comments: 12 pages, 8 Figures. Supplementary material

arXiv:2311.17815 [pdf, other]

A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures

Authors: Fabrizio Ferrandi, Serena Curzel, Leandro Fiorin, Daniele Ielmini, Cristina Silvano, Francesco Conti, Alessio Burrello, Francesco Barchi, Luca Benini, Luciano Lavagno, Teodoro Urso, Enrico Calore, Sebastiano Fabio Schifano, Cristian Zambelli, Maurizio Palesi, Giuseppe Ascia, Enrico Russo, Nicola Petra, Davide De Caro, Gennaro Di Meo, Valeria Cardellini, Salvatore Filippone, Francesco Lo Presti, Francesco Silvestri, Paolo Palazzari , et al. (1 additional authors not shown)

Abstract: In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. Given the increasing complexity of deep neural networks, the need for efficient hardware accelerators has become more and more pressing to design heterogeneous HPC platforms. The design of Deep Learning accelerators requires a multidisciplinary approach, combining expertise from several areas, spanning… ▽ More In recent years, the field of Deep Learning has seen many disruptive and impactful advancements. Given the increasing complexity of deep neural networks, the need for efficient hardware accelerators has become more and more pressing to design heterogeneous HPC platforms. The design of Deep Learning accelerators requires a multidisciplinary approach, combining expertise from several areas, spanning from computer architecture to approximate computing, computational models, and machine learning algorithms. Several methodologies and tools have been proposed to design accelerators for Deep Learning, including hardware-software co-design approaches, high-level synthesis methods, specific customized compilers, and methodologies for design space exploration, modeling, and simulation. These methodologies aim to maximize the exploitable parallelism and minimize data movement to achieve high performance and energy efficiency. This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators, offering the reader a wide perspective in this rapidly evolving field. In particular, this work complements the previous survey proposed by the same authors in [203], which focuses on Deep Learning hardware accelerators for heterogeneous HPC platforms. △ Less

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2310.08909 [pdf, other]

doi 10.1145/3637528.3671896

Evading Community Detection via Counterfactual Neighborhood Search

Authors: Andrea Bernini, Fabrizio Silvestri, Gabriele Tolomei

Abstract: Community detection techniques are useful for social media platforms to discover tightly connected groups of users who share common interests. However, this functionality often comes at the expense of potentially exposing individuals to privacy breaches by inadvertently revealing their tastes or preferences. Therefore, some users may wish to preserve their anonymity and opt out of community detect… ▽ More Community detection techniques are useful for social media platforms to discover tightly connected groups of users who share common interests. However, this functionality often comes at the expense of potentially exposing individuals to privacy breaches by inadvertently revealing their tastes or preferences. Therefore, some users may wish to preserve their anonymity and opt out of community detection for various reasons, such as affiliation with political or religious organizations, without leaving the platform. In this study, we address the challenge of community membership hiding, which involves strategically altering the structural properties of a network graph to prevent one or more nodes from being identified by a given community detection algorithm. We tackle this problem by formulating it as a constrained counterfactual graph objective, and we solve it via deep reinforcement learning. Extensive experiments demonstrate that our method outperforms existing baselines, striking the best balance between accuracy and cost. △ Less

Submitted 7 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.04875 [pdf, other]

Prompt-to-OS (P2OS): Revolutionizing Operating Systems and Human-Computer Interaction with Integrated AI Generative Models

Authors: Gabriele Tolomei, Cesare Campagnano, Fabrizio Silvestri, Giovanni Trappolini

Abstract: In this paper, we present a groundbreaking paradigm for human-computer interaction that revolutionizes the traditional notion of an operating system. Within this innovative framework, user requests issued to the machine are handled by an interconnected ecosystem of generative AI models that seamlessly integrate with or even replace traditional software applications. At the core of this paradigm… ▽ More In this paper, we present a groundbreaking paradigm for human-computer interaction that revolutionizes the traditional notion of an operating system. Within this innovative framework, user requests issued to the machine are handled by an interconnected ecosystem of generative AI models that seamlessly integrate with or even replace traditional software applications. At the core of this paradigm shift are large generative models, such as language and diffusion models, which serve as the central interface between users and computers. This pioneering approach leverages the abilities of advanced language models, empowering users to engage in natural language conversations with their computing devices. Users can articulate their intentions, tasks, and inquiries directly to the system, eliminating the need for explicit commands or complex navigation. The language model comprehends and interprets the user's prompts, generating and displaying contextual and meaningful responses that facilitate seamless and intuitive interactions. This paradigm shift not only streamlines user interactions but also opens up new possibilities for personalized experiences. Generative models can adapt to individual preferences, learning from user input and continuously improving their understanding and response generation. Furthermore, it enables enhanced accessibility, as users can interact with the system using speech or text, accommodating diverse communication preferences. However, this visionary concept raises significant challenges, including privacy, security, trustability, and the ethical use of generative models. Robust safeguards must be in place to protect user data and prevent potential misuse or manipulation of the language model. While the full realization of this paradigm is still far from being achieved, this paper serves as a starting point for envisioning this transformative potential. △ Less

Submitted 7 October, 2023; originally announced October 2023.

Comments: 5 pages, 1 figure. Accepted at IEEE CogMI 2023 (IEEE International Conference on Cognitive Machine Intelligence)

arXiv:2309.17116 [pdf, other]

Sheaf Hypergraph Networks

Authors: Iulia Duta, Giulia Cassarà, Fabrizio Silvestri, Pietro Liò

Abstract: Higher-order relations are widespread in nature, with numerous phenomena involving complex interactions that extend beyond simple pairwise connections. As a result, advancements in higher-order processing can accelerate the growth of various fields requiring structured data. Current approaches typically represent these interactions using hypergraphs. We enhance this representation by introducing c… ▽ More Higher-order relations are widespread in nature, with numerous phenomena involving complex interactions that extend beyond simple pairwise connections. As a result, advancements in higher-order processing can accelerate the growth of various fields requiring structured data. Current approaches typically represent these interactions using hypergraphs. We enhance this representation by introducing cellular sheaves for hypergraphs, a mathematical construction that adds extra structure to the conventional hypergraph while maintaining their local, higherorder connectivity. Drawing inspiration from existing Laplacians in the literature, we develop two unique formulations of sheaf hypergraph Laplacians: linear and non-linear. Our theoretical analysis demonstrates that incorporating sheaves into the hypergraph Laplacian provides a more expressive inductive bias than standard hypergraph diffusion, creating a powerful instrument for effectively modelling complex data structures. We employ these sheaf hypergraph Laplacians to design two categories of models: Sheaf Hypergraph Neural Networks and Sheaf Hypergraph Convolutional Networks. These models generalize classical Hypergraph Networks often found in the literature. Through extensive experimentation, we show that this generalization significantly improves performance, achieving top results on multiple benchmark datasets for hypergraph node classification. △ Less

Submitted 29 September, 2023; originally announced September 2023.

Comments: Accepted at Neural Information Processing Systems (NeurIPS 2023)

arXiv:2307.13165 [pdf, other]

doi 10.1007/978-3-031-56060-6_14

Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations

Authors: Filippo Betello, Federico Siciliano, Pushkar Mishra, Fabrizio Silvestri

Abstract: Sequential Recommender Systems (SRSs) are widely employed to model user behavior over time. However, their robustness in the face of perturbations in training data remains a largely understudied yet critical issue. A fundamental challenge emerges in previous studies aimed at assessing the robustness of SRSs: the Rank-Biased Overlap (RBO) similarity is not particularly suited for this task as it is… ▽ More Sequential Recommender Systems (SRSs) are widely employed to model user behavior over time. However, their robustness in the face of perturbations in training data remains a largely understudied yet critical issue. A fundamental challenge emerges in previous studies aimed at assessing the robustness of SRSs: the Rank-Biased Overlap (RBO) similarity is not particularly suited for this task as it is designed for infinite rankings of items and thus shows limitations in real-world scenarios. For instance, it fails to achieve a perfect score of 1 for two identical finite-length rankings. To address this challenge, we introduce a novel contribution: Finite Rank-Biased Overlap (FRBO), an enhanced similarity tailored explicitly for finite rankings. This innovation facilitates a more intuitive evaluation in practical settings. In pursuit of our goal, we empirically investigate the impact of removing items at different positions within a temporally ordered sequence. We evaluate two distinct SRS models across multiple datasets, measuring their performance using metrics such as Normalized Discounted Cumulative Gain (NDCG) and Rank List Sensitivity. Our results demonstrate that removing items at the end of the sequence has a statistically significant impact on performance, with NDCG decreasing up to 60%. Conversely, removing items from the beginning or middle has no significant effect. These findings underscore the criticality of the position of perturbed items in the training data. As we spotlight the vulnerabilities inherent in current SRSs, we fervently advocate for intensified research efforts to fortify their robustness against adversarial perturbations. △ Less

Submitted 27 December, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Journal ref: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14609

arXiv:2307.12798 [pdf, other]

RRAML: Reinforced Retrieval Augmented Machine Learning

Authors: Andrea Bacciu, Florin Cuconasu, Federico Siciliano, Fabrizio Silvestri, Nicola Tonellotto, Giovanni Trappolini

Abstract: The emergence of large language models (LLMs) has revolutionized machine learning and related fields, showcasing remarkable abilities in comprehending, generating, and manipulating human language. However, their conventional usage through API-based text prompt submissions imposes certain limitations in terms of context constraints and external source availability. To address these challenges, we p… ▽ More The emergence of large language models (LLMs) has revolutionized machine learning and related fields, showcasing remarkable abilities in comprehending, generating, and manipulating human language. However, their conventional usage through API-based text prompt submissions imposes certain limitations in terms of context constraints and external source availability. To address these challenges, we propose a novel framework called Reinforced Retrieval Augmented Machine Learning (RRAML). RRAML integrates the reasoning capabilities of LLMs with supporting information retrieved by a purpose-built retriever from a vast user-provided database. By leveraging recent advancements in reinforcement learning, our method effectively addresses several critical challenges. Firstly, it circumvents the need for accessing LLM gradients. Secondly, our method alleviates the burden of retraining LLMs for specific tasks, as it is often impractical or impossible due to restricted access to the model and the computational intensity involved. Additionally we seamlessly link the retriever's task with the reasoner, mitigating hallucinations and reducing irrelevant, and potentially damaging retrieved documents. We believe that the research agenda outlined in this paper has the potential to profoundly impact the field of AI, democratizing access to and utilization of LLMs for a wide range of entities. △ Less

Submitted 27 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Journal ref: CEUR Workshop Proceedings (2023, Vol. 3537, pp. 29-37)

arXiv:2306.14457 [pdf, other]

Fauno: The Italian Large Language Model that will leave you senza parole!

Authors: Andrea Bacciu, Giovanni Trappolini, Andrea Santilli, Emanuele Rodolà, Fabrizio Silvestri

Abstract: This paper presents Fauno, the first and largest open-source Italian conversational Large Language Model (LLM). Our goal with Fauno is to democratize the study of LLMs in Italian, demonstrating that obtaining a fine-tuned conversational bot with a single GPU is possible. In addition, we release a collection of datasets for conversational AI in Italian. The datasets on which we fine-tuned Fauno inc… ▽ More This paper presents Fauno, the first and largest open-source Italian conversational Large Language Model (LLM). Our goal with Fauno is to democratize the study of LLMs in Italian, demonstrating that obtaining a fine-tuned conversational bot with a single GPU is possible. In addition, we release a collection of datasets for conversational AI in Italian. The datasets on which we fine-tuned Fauno include various topics such as general question answering, computer science, and medical questions. We release our code and datasets on \url{https://github.com/RSTLess-research/Fauno-Italian-LLM} △ Less

Submitted 26 June, 2023; originally announced June 2023.

arXiv:2306.00707 [pdf, other]

Renormalized Graph Neural Networks

Authors: Francesco Caso, Giovanni Trappolini, Andrea Bacciu, Pietro Liò, Fabrizio Silvestri

Abstract: Graph Neural Networks (GNNs) have become essential for studying complex data, particularly when represented as graphs. Their value is underpinned by their ability to reflect the intricacies of numerous areas, ranging from social to biological networks. GNNs can grapple with non-linear behaviors, emerging patterns, and complex connections; these are also typical characteristics of complex systems.… ▽ More Graph Neural Networks (GNNs) have become essential for studying complex data, particularly when represented as graphs. Their value is underpinned by their ability to reflect the intricacies of numerous areas, ranging from social to biological networks. GNNs can grapple with non-linear behaviors, emerging patterns, and complex connections; these are also typical characteristics of complex systems. The renormalization group (RG) theory has emerged as the language for studying complex systems. It is recognized as the preferred lens through which to study complex systems, offering a framework that can untangle their intricate dynamics. Despite the clear benefits of integrating RG theory with GNNs, no existing methods have ventured into this promising territory. This paper proposes a new approach that applies RG theory to devise a novel graph rewiring to improve GNNs' performance on graph-related tasks. We support our proposal with extensive experiments on standard benchmarks and baselines. The results demonstrate the effectiveness of our method and its potential to remedy the current limitations of GNNs. Finally, this paper marks the beginning of a new research direction. This path combines the theoretical foundations of RG, the magnifying glass of complex systems, with the structural capabilities of GNNs. By doing so, we aim to enhance the potential of GNNs in modeling and unraveling the complexities inherent in diverse systems. △ Less

Submitted 1 June, 2023; originally announced June 2023.

arXiv:2305.10824 [pdf, other]

doi 10.1145/3604915.3610643

Integrating Item Relevance in Training Loss for Sequential Recommender Systems

Authors: Andrea Bacciu, Federico Siciliano, Nicola Tonellotto, Fabrizio Silvestri

Abstract: Sequential Recommender Systems (SRSs) are a popular type of recommender system that learns from a user's history to predict the next item they are likely to interact with. However, user interactions can be affected by noise stemming from account sharing, inconsistent preferences, or accidental clicks. To address this issue, we (i) propose a new evaluation protocol that takes multiple future items… ▽ More Sequential Recommender Systems (SRSs) are a popular type of recommender system that learns from a user's history to predict the next item they are likely to interact with. However, user interactions can be affected by noise stemming from account sharing, inconsistent preferences, or accidental clicks. To address this issue, we (i) propose a new evaluation protocol that takes multiple future items into account and (ii) introduce a novel relevance-aware loss function to train a SRS with multiple future items to make it more robust to noise. Our relevance-aware models obtain an improvement of ~1.2% of NDCG@10 and 0.88% in the traditional evaluation protocol, while in the new evaluation protocol, the improvement is ~1.63% of NDCG@10 and ~1.5% of HR w.r.t the best performing models. △ Less

Submitted 10 June, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

Journal ref: In Proceedings of the 17th ACM Conference on Recommender Systems (pp. 1114-1119) 2023

arXiv:2305.01447 [pdf, other]

doi 10.1145/3539618.3591930

Multimodal Neural Databases

Authors: Giovanni Trappolini, Andrea Santilli, Emanuele Rodolà, Alon Halevy, Fabrizio Silvestri

Abstract: The rise in loosely-structured data available through text, images, and other modalities has called for new ways of querying them. Multimedia Information Retrieval has filled this gap and has witnessed exciting progress in recent years. Tasks such as search and retrieval of extensive multimedia archives have undergone massive performance improvements, driven to a large extent by recent development… ▽ More The rise in loosely-structured data available through text, images, and other modalities has called for new ways of querying them. Multimedia Information Retrieval has filled this gap and has witnessed exciting progress in recent years. Tasks such as search and retrieval of extensive multimedia archives have undergone massive performance improvements, driven to a large extent by recent developments in multimodal deep learning. However, methods in this field remain limited in the kinds of queries they support and, in particular, their inability to answer database-like queries. For this reason, inspired by recent work on neural databases, we propose a new framework, which we name Multimodal Neural Databases (MMNDBs). MMNDBs can answer complex database-like queries that involve reasoning over different input modalities, such as text and images, at scale. In this paper, we present the first architecture able to fulfill this set of requirements and test it with several baselines, showing the limitations of currently available models. The results show the potential of these new techniques to process unstructured data coming from different modalities, paving the way for future research in the area. Code to replicate the experiments will be released at https://github.com/GiovanniTRA/MultimodalNeuralDatabases △ Less

Submitted 2 May, 2023; originally announced May 2023.

Journal ref: SIGIR 2023: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

arXiv:2305.00574 [pdf, other]

The Dark Side of Explanations: Poisoning Recommender Systems with Counterfactual Examples

Authors: Ziheng Chen, Fabrizio Silvestri, Jia Wang, Yongfeng Zhang, Gabriele Tolomei

Abstract: Deep learning-based recommender systems have become an integral part of several online platforms. However, their black-box nature emphasizes the need for explainable artificial intelligence (XAI) approaches to provide human-understandable reasons why a specific item gets recommended to a given user. One such method is counterfactual explanation (CF). While CFs can be highly beneficial for users an… ▽ More Deep learning-based recommender systems have become an integral part of several online platforms. However, their black-box nature emphasizes the need for explainable artificial intelligence (XAI) approaches to provide human-understandable reasons why a specific item gets recommended to a given user. One such method is counterfactual explanation (CF). While CFs can be highly beneficial for users and system designers, malicious actors may also exploit these explanations to undermine the system's security. In this work, we propose H-CARS, a novel strategy to poison recommender systems via CFs. Specifically, we first train a logical-reasoning-based surrogate model on training data derived from counterfactual explanations. By reversing the learning process of the recommendation model, we thus develop a proficient greedy algorithm to generate fabricated user profiles and their associated interaction records for the aforementioned surrogate model. Our experiments, which employ a well-known CF generation method and are conducted on two distinct datasets, show that H-CARS yields significant and successful attack performance. △ Less

Submitted 30 April, 2023; originally announced May 2023.

Comments: To be published in SIGIR2023

arXiv:2304.09097 [pdf, other]

Sheaf4Rec: Sheaf Neural Networks for Graph-based Recommender Systems

Authors: Antonio Purificato, Giulia Cassarà, Federico Siciliano, Pietro Liò, Fabrizio Silvestri

Abstract: Recent advancements in Graph Neural Networks (GNN) have facilitated their widespread adoption in various applications, including recommendation systems. GNNs have proven to be effective in addressing the challenges posed by recommendation systems by efficiently modeling graphs in which nodes represent users or items and edges denote preference relationships. However, current GNN techniques represe… ▽ More Recent advancements in Graph Neural Networks (GNN) have facilitated their widespread adoption in various applications, including recommendation systems. GNNs have proven to be effective in addressing the challenges posed by recommendation systems by efficiently modeling graphs in which nodes represent users or items and edges denote preference relationships. However, current GNN techniques represent nodes by means of a single static vector, which may inadequately capture the intricate complexities of users and items. To overcome these limitations, we propose a solution integrating a cutting-edge model inspired by category theory: Sheaf4Rec. Unlike single vector representations, Sheaf Neural Networks and their corresponding Laplacians represent each node (and edge) using a vector space. Our approach takes advantage from this theory and results in a more comprehensive representation that can be effectively exploited during inference, providing a versatile method applicable to a wide range of graph-related tasks and demonstrating unparalleled performance. Our proposed model exhibits a noteworthy relative improvement of up to 8.53% on F1-Score@10 and an impressive increase of up to 11.29% on NDCG@10, outperforming existing state-of-the-art models such as Neural Graph Collaborative Filtering (NGCF), KGTORe and other recently developed GNN-based models. In addition to its superior predictive capabilities, Sheaf4Rec shows remarkable improvements in terms of efficiency: we observe substantial runtime improvements ranging from 2.5% up to 37% when compared to other GNN-based competitor models, indicating a more efficient way of handling information while achieving better performance. Code is available at https://github.com/antoniopurificato/Sheaf4Rec. △ Less

Submitted 16 March, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

Comments: 21 pages, 8 figures

MSC Class: 55 ACM Class: I.2.6; H.3.3

arXiv:2303.09470 [pdf, other]

Learning with Noisy Labels through Learnable Weighting and Centroid Similarity

Authors: Farooq Ahmad Wani, Maria Sofia Bucarelli, Fabrizio Silvestri

Abstract: We introduce a novel method for training machine learning models in the presence of noisy labels, which are prevalent in domains such as medical diagnosis and autonomous driving and have the potential to degrade a model's generalization performance. Inspired by established literature that highlights how deep learning models are prone to overfitting to noisy samples in the later epochs of training,… ▽ More We introduce a novel method for training machine learning models in the presence of noisy labels, which are prevalent in domains such as medical diagnosis and autonomous driving and have the potential to degrade a model's generalization performance. Inspired by established literature that highlights how deep learning models are prone to overfitting to noisy samples in the later epochs of training, we propose a strategic approach. This strategy leverages the distance to class centroids in the latent space and incorporates a discounting mechanism, aiming to diminish the influence of samples that lie distant from all class centroids. By doing so, we effectively counteract the adverse effects of noisy labels. The foundational premise of our approach is the assumption that samples situated further from their respective class centroid in the initial stages of training are more likely to be associated with noise. Our methodology is grounded in robust theoretical principles and has been validated empirically through extensive experiments on several benchmark datasets. Our results show that our method consistently outperforms the existing state-of-the-art techniques, achieving significant improvements in classification accuracy in the presence of noisy labels. The code for our proposed loss function and supplementary materials is available at https://github.com/wanifarooq/NCOD △ Less

Submitted 25 June, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

arXiv:2303.08288 [pdf, other]

Attention-likelihood relationship in transformers

Authors: Valeria Ruscio, Valentino Maiorca, Fabrizio Silvestri

Abstract: We analyze how large language models (LLMs) represent out-of-context words, investigating their reliance on the given context to capture their semantics. Our likelihood-guided text perturbations reveal a correlation between token likelihood and attention values in transformer-based language models. Extensive experiments reveal that unexpected tokens cause the model to attend less to the informatio… ▽ More We analyze how large language models (LLMs) represent out-of-context words, investigating their reliance on the given context to capture their semantics. Our likelihood-guided text perturbations reveal a correlation between token likelihood and attention values in transformer-based language models. Extensive experiments reveal that unexpected tokens cause the model to attend less to the information coming from themselves to compute their representations, particularly at higher layers. These findings have valuable implications for assessing the robustness of LLMs in real-world scenarios. Fully reproducible codebase at https://github.com/Flegyas/AttentionLikelihood. △ Less

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2301.00214 [pdf]

Understanding the Role of Non-Fullerene Acceptors Crystallinity on the Charge Transport Properties and Performance of Organic Solar Cells

Authors: Pierluigi Mondelli, Pascal Kaienburg, Francesco Silvestri, Rebecca Scatena, Claire Welton, Martine Grandjean, Vincent Lemaur, Eduardo Solano, Mathias Nyman, Peter Horton, Simon Coles, Esther Barrena, Moritz Riede, Paolo Radaelli, David Beljonne, Manjunatha Reddy, Graham Morse

Abstract: The active layer crystallinity has long been associated with favourable organic solar cells (OSCs) properties such as high mobility and Fill Factor. In particular, this applies to acceptor materials such as fullerene-derivatives and the most recent Non-Fullerene Acceptors (NFAs), which are now surpassing 19% of Power Conversion Efficiency. Despite these advantages are being commonly attributed to… ▽ More The active layer crystallinity has long been associated with favourable organic solar cells (OSCs) properties such as high mobility and Fill Factor. In particular, this applies to acceptor materials such as fullerene-derivatives and the most recent Non-Fullerene Acceptors (NFAs), which are now surpassing 19% of Power Conversion Efficiency. Despite these advantages are being commonly attributed to their 3-dimensional crystal packing motif in the single crystal, the bridge that links the acceptor crystal packing from single crystals to solar cells has not clearly been shown yet. In this work, we investigate the molecular organisation of seven NFAs (o-IDTBR, IDIC, ITIC, m-ITIC, 4TIC, 4TICO, m-4TICO), following the evolution of their packing motif in single-crystals, powder and thin films made with pure NFAs and donor:NFA blends. In general, we observed a good correlation between the NFA single crystal packing and their molecular arrangement in the bulk heterojunction. However, the NFA packing motif is not directly affecting the device parameters but it provide an impact on the material propensity to form highly crystalline domain in the blend. Although that NFA crystallinity is required to obtain high mobility, the domain purity is more important to limit the bimolecular recombination and to obtain high efficiency organic solar cells. △ Less

Submitted 31 December, 2022; originally announced January 2023.

Comments: 16 pages, 8 figures

arXiv:2212.06605 [pdf, other]

Dimensionality reduction on complex vector spaces for Euclidean distance with dynamic weights

Authors: Paolo Pellizzoni, Simone Moretti, Francesco Silvestri

Abstract: The weighted Euclidean norm $\|x\|_w$ of a vector $x\in \mathbb{R}^d$ with weights $w\in \mathbb{R}^d$ is the Euclidean norm where the contribution of each dimension is scaled by a given weight. Approaches to dimensionality reduction that satisfy the Johnson-Lindenstrauss (JL) lemma can be easily adapted to the weighted Euclidean distance if weights are fixed: it suffices to scale each dimension o… ▽ More The weighted Euclidean norm $\|x\|_w$ of a vector $x\in \mathbb{R}^d$ with weights $w\in \mathbb{R}^d$ is the Euclidean norm where the contribution of each dimension is scaled by a given weight. Approaches to dimensionality reduction that satisfy the Johnson-Lindenstrauss (JL) lemma can be easily adapted to the weighted Euclidean distance if weights are fixed: it suffices to scale each dimension of the input vectors according to the weights, and then apply any standard approach. However, this is not the case when weights are unknown during the dimensionality reduction or might dynamically change. We address this issue by providing an approach that maps vectors into a smaller complex vector space, but still allows to satisfy a JL-like property for the weighted Euclidean distance when weights are revealed. Specifically, let $Δ\geq 1, ε\in (0,1)$ be arbitrary values, and let $S\subset \mathbb{R}^d$ be a set of $n$ vectors. We provide a weight-oblivious linear map $g:\mathbb{R}^d \rightarrow \mathbb{C}^k$, with $k=Θ(ε^{-2}Δ^4 \ln{n})$, to reduce vectors in $S$, and an estimator $ρ: \mathbb{C}^k \times \mathbb{R}^d \rightarrow \mathbb R$ with the following property. For any $x\in S$, the value $ρ(g(x), w)$ is an unbiased estimate of $\|x\|^2_w$, and $ρ$ is computed from the reduced vector $g(x)$ and the weights $w$. Moreover, the error of the estimate $ρ((g(x), w)$ depends on the norm distortion due to weights and parameter $Δ$: for any $x\in S$, the estimate has a multiplicative error $ε$ if $\|x\|_2\|w\|_2/\|x\|_w\leq Δ$, otherwise the estimate has an additive $ε\|x\|^2_2\|w\|^2_2/Δ^2$ error. Finally, we consider the estimation of weighted Euclidean norms in streaming settings: we show how to estimate the weighted norm when the weights are provided either after or concurrently with the input vector. △ Less

Submitted 14 December, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

arXiv:2209.09688 [pdf, other]

Sparse Vicious Attacks on Graph Neural Networks

Authors: Giovanni Trappolini, Valentino Maiorca, Silvio Severino, Emanuele Rodolà, Fabrizio Silvestri, Gabriele Tolomei

Abstract: Graph Neural Networks (GNNs) have proven to be successful in several predictive modeling tasks for graph-structured data. Amongst those tasks, link prediction is one of the fundamental problems for many real-world applications, such as recommender systems. However, GNNs are not immune to adversarial attacks, i.e., carefully crafted malicious examples that are designed to fool the predictive mo… ▽ More Graph Neural Networks (GNNs) have proven to be successful in several predictive modeling tasks for graph-structured data. Amongst those tasks, link prediction is one of the fundamental problems for many real-world applications, such as recommender systems. However, GNNs are not immune to adversarial attacks, i.e., carefully crafted malicious examples that are designed to fool the predictive model. In this work, we focus on a specific, white-box attack to GNN-based link prediction models, where a malicious node aims to appear in the list of recommended nodes for a given target victim. To achieve this goal, the attacker node may also count on the cooperation of other existing peers that it directly controls, namely on the ability to inject a number of ``vicious'' nodes in the network. Specifically, all these malicious nodes can add new edges or remove existing ones, thereby perturbing the original graph. Thus, we propose SAVAGE, a novel framework and a method to mount this type of link prediction attacks. SAVAGE formulates the adversary's goal as an optimization task, striking the balance between the effectiveness of the attack and the sparsity of malicious resources required. Extensive experiments conducted on real-world and synthetic datasets demonstrate that adversarial attacks implemented through SAVAGE indeed achieve high attack success rate yet using a small amount of vicious nodes. Finally, despite those attacks require full knowledge of the target model, we show that they are successfully transferable to other black-box methods for link prediction. △ Less

Submitted 20 September, 2022; originally announced September 2022.

arXiv:2208.04222 [pdf, other]

GREASE: Generate Factual and Counterfactual Explanations for GNN-based Recommendations

Authors: Ziheng Chen, Fabrizio Silvestri, Jia Wang, Yongfeng Zhang, Zhenhua Huang, Hongshik Ahn, Gabriele Tolomei

Abstract: Recently, graph neural networks (GNNs) have been widely used to develop successful recommender systems. Although powerful, it is very difficult for a GNN-based recommender system to attach tangible explanations of why a specific item ends up in the list of suggestions for a given user. Indeed, explaining GNN-based recommendations is unique, and existing GNN explanation methods are inappropriate fo… ▽ More Recently, graph neural networks (GNNs) have been widely used to develop successful recommender systems. Although powerful, it is very difficult for a GNN-based recommender system to attach tangible explanations of why a specific item ends up in the list of suggestions for a given user. Indeed, explaining GNN-based recommendations is unique, and existing GNN explanation methods are inappropriate for two reasons. First, traditional GNN explanation methods are designed for node, edge, or graph classification tasks rather than ranking, as in recommender systems. Second, standard machine learning explanations are usually intended to support skilled decision-makers. Instead, recommendations are designed for any end-user, and thus their explanations should be provided in user-understandable ways. In this work, we propose GREASE, a novel method for explaining the suggestions provided by any black-box GNN-based recommender system. Specifically, GREASE first trains a surrogate model on a target user-item pair and its $l$-hop neighborhood. Then, it generates both factual and counterfactual explanations by finding optimal adjacency matrix perturbations to capture the sufficient and necessary conditions for an item to be recommended, respectively. Experimental results conducted on real-world datasets demonstrate that GREASE can generate concise and effective explanations for popular GNN-based recommender models. △ Less

Submitted 4 August, 2022; originally announced August 2022.

arXiv:2207.13586 [pdf, other]

Encoding Concepts in Graph Neural Networks

Authors: Lucie Charlotte Magister, Pietro Barbiero, Dmitry Kazhdan, Federico Siciliano, Gabriele Ciravegna, Fabrizio Silvestri, Mateja Jamnik, Pietro Lio

Abstract: The opaque reasoning of Graph Neural Networks induces a lack of human trust. Existing graph network explainers attempt to address this issue by providing post-hoc explanations, however, they fail to make the model itself more interpretable. To fill this gap, we introduce the Concept Encoder Module, the first differentiable concept-discovery approach for graph networks. The proposed approach makes… ▽ More The opaque reasoning of Graph Neural Networks induces a lack of human trust. Existing graph network explainers attempt to address this issue by providing post-hoc explanations, however, they fail to make the model itself more interpretable. To fill this gap, we introduce the Concept Encoder Module, the first differentiable concept-discovery approach for graph networks. The proposed approach makes graph networks explainable by design by first discovering graph concepts and then using these to solve the task. Our results demonstrate that this approach allows graph networks to: (i) attain model accuracy comparable with their equivalent vanilla versions, (ii) discover meaningful concepts that achieve high concept completeness and purity scores, (iii) provide high-quality concept-based logic explanations for their prediction, and (iv) support effective interventions at test time: these can increase human trust as well as significantly improve model performance. △ Less

Submitted 7 August, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

arXiv:2205.04274 [pdf, other]

Detecting and Understanding Harmful Memes: A Survey

Authors: Shivam Sharma, Firoj Alam, Md. Shad Akhtar, Dimitar Dimitrov, Giovanni Da San Martino, Hamed Firooz, Alon Halevy, Fabrizio Silvestri, Preslav Nakov, Tanmoy Chakraborty

Abstract: The automatic identification of harmful content online is of major concern for social media platforms, policymakers, and society. Researchers have studied textual, visual, and audio content, but typically in isolation. Yet, harmful content often combines multiple modalities, as in the case of memes, which are of particular interest due to their viral nature. With this in mind, here we offer a comp… ▽ More The automatic identification of harmful content online is of major concern for social media platforms, policymakers, and society. Researchers have studied textual, visual, and audio content, but typically in isolation. Yet, harmful content often combines multiple modalities, as in the case of memes, which are of particular interest due to their viral nature. With this in mind, here we offer a comprehensive survey with a focus on harmful memes. Based on a systematic analysis of recent literature, we first propose a new typology of harmful memes, and then we highlight and summarize the relevant state of the art. One interesting finding is that many types of harmful memes are not really studied, e.g., such featuring self-harm and extremism, partly due to the lack of suitable datasets. We further find that existing datasets mostly capture multi-class scenarios, which are not inclusive of the affective spectrum that memes can represent. Another observation is that memes can propagate globally through repackaging in different languages and that they can also be multilingual, blending different cultures. We conclude by highlighting several challenges related to multimodal semiotics, technological constraints, and non-trivial social engagement, and we present several open-ended aspects such as delineating online harm and empirically examining related frameworks and assistive interventions, which we believe will motivate and drive future research. △ Less

Submitted 29 May, 2022; v1 submitted 9 May, 2022; originally announced May 2022.

Comments: Accepted at IJCAI-ECAI 2022 (Survey Track) - Editorial Feedback Revised, 9 pages (7 main + 2 reference pages)

arXiv:2202.05868 [pdf, other]

Blocking Techniques for Sparse Matrix Multiplication on Tensor Accelerators

Authors: Paolo Sylos Labini, Massimo Bernaschi, Francesco Silvestri, Flavio Vella

Abstract: Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computational-expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since their features are specifically designed for tensor algebra (typically dense matrix-product), it is commonly assumed that they are not suitable for applications wi… ▽ More Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computational-expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since their features are specifically designed for tensor algebra (typically dense matrix-product), it is commonly assumed that they are not suitable for applications with sparse data. To challenge this viewpoint, we discuss methods and present solutions for accelerating sparse matrix multiplication on such architectures. In particular, we present a 1-dimensional blocking algorithm with theoretical guarantees on the density, which builds dense blocks from arbitrary sparse matrices. Experimental results show that, even for unstructured and highly-sparse matrices, our block-based solution which exploits Nvidia Tensor Cores is faster than its sparse counterpart. We observed significant speed-ups of up to two orders of magnitude on real-world sparse matrices. △ Less

Submitted 11 February, 2022; originally announced February 2022.

Comments: 12 pages, 14 images

arXiv:2110.11960 [pdf, other]

doi 10.1145/3511808.3557429

ReLAX: Reinforcement Learning Agent eXplainer for Arbitrary Predictive Models

Authors: Ziheng Chen, Fabrizio Silvestri, Jia Wang, He Zhu, Hongshik Ahn, Gabriele Tolomei

Abstract: Counterfactual examples (CFs) are one of the most popular methods for attaching post-hoc explanations to machine learning (ML) models. However, existing CF generation methods either exploit the internals of specific models or depend on each sample's neighborhood, thus they are hard to generalize for complex models and inefficient for large datasets. This work aims to overcome these limitations and… ▽ More Counterfactual examples (CFs) are one of the most popular methods for attaching post-hoc explanations to machine learning (ML) models. However, existing CF generation methods either exploit the internals of specific models or depend on each sample's neighborhood, thus they are hard to generalize for complex models and inefficient for large datasets. This work aims to overcome these limitations and introduces ReLAX, a model-agnostic algorithm to generate optimal counterfactual explanations. Specifically, we formulate the problem of crafting CFs as a sequential decision-making task and then find the optimal CFs via deep reinforcement learning (DRL) with discrete-continuous hybrid action space. Extensive experiments conducted on several tabular datasets have shown that ReLAX outperforms existing CF generation baselines, as it produces sparser counterfactuals, is more scalable to complex target models to explain, and generalizes to both classification and regression tasks. Finally, to demonstrate the usefulness of our method in a real-world use case, we leverage CFs generated by ReLAX to suggest actions that a country should take to reduce the risk of mortality due to COVID-19. Interestingly enough, the actions recommended by our method correspond to the strategies that many countries have actually implemented to counter the COVID-19 pandemic. △ Less

Submitted 8 August, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

arXiv:2110.02775 [pdf, other]

NEWRON: A New Generalization of the Artificial Neuron to Enhance the Interpretability of Neural Networks

Authors: Federico Siciliano, Maria Sofia Bucarelli, Gabriele Tolomei, Fabrizio Silvestri

Abstract: In this work, we formulate NEWRON: a generalization of the McCulloch-Pitts neuron structure. This new framework aims to explore additional desirable properties of artificial neurons. We show that some specializations of NEWRON allow the network to be interpretable with no change in their expressiveness. By just inspecting the models produced by our NEWRON-based networks, we can understand the rule… ▽ More In this work, we formulate NEWRON: a generalization of the McCulloch-Pitts neuron structure. This new framework aims to explore additional desirable properties of artificial neurons. We show that some specializations of NEWRON allow the network to be interpretable with no change in their expressiveness. By just inspecting the models produced by our NEWRON-based networks, we can understand the rules governing the task. Extensive experiments show that the quality of the generated models is better than traditional interpretable models and in line or better than standard neural networks. △ Less

Submitted 5 October, 2021; originally announced October 2021.

arXiv:2109.08013 [pdf, other]

Detecting Propaganda Techniques in Memes

Authors: Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

Abstract: Propaganda can be defined as a form of communication that aims to influence the opinions or the actions of people towards a specific goal; this is achieved by means of well-defined rhetorical and psychological devices. Propaganda, in the form we know it today, can be dated back to the beginning of the 17th century. However, it is with the advent of the Internet and the social media that it has sta… ▽ More Propaganda can be defined as a form of communication that aims to influence the opinions or the actions of people towards a specific goal; this is achieved by means of well-defined rhetorical and psychological devices. Propaganda, in the form we know it today, can be dated back to the beginning of the 17th century. However, it is with the advent of the Internet and the social media that it has started to spread on a much larger scale than before, thus becoming major societal and political issue. Nowadays, a large fraction of propaganda in social media is multimodal, mixing textual with visual content. With this in mind, here we propose a new multi-label multimodal task: detecting the type of propaganda techniques used in memes. We further create and release a new corpus of 950 memes, carefully annotated with 22 propaganda techniques, which can appear in the text, in the image, or in both. Our analysis of the corpus shows that understanding both modalities together is essential for detecting these techniques. This is further confirmed in our experiments with several state-of-the-art multimodal models. △ Less

Submitted 7 August, 2021; originally announced September 2021.

Comments: propaganda, disinformation, fake news, memes, multimodality. arXiv admin note: text overlap with arXiv:2105.09284

MSC Class: 68T50 ACM Class: I.2.7

Journal ref: ACL-2021

arXiv:2107.00761 [pdf, other]

On the Bike Spreading Problem

Authors: Elia Costa, Francesco Silvestri

Abstract: A free-floating bike-sharing system (FFBSS) is a dockless rental system where an individual can borrow a bike and returns it anywhere, within the service area. To improve the rental service, available bikes should be distributed over the entire service area: a customer leaving from any position is then more likely to find a near bike and then to use the service. Moreover, spreading bikes among the… ▽ More A free-floating bike-sharing system (FFBSS) is a dockless rental system where an individual can borrow a bike and returns it anywhere, within the service area. To improve the rental service, available bikes should be distributed over the entire service area: a customer leaving from any position is then more likely to find a near bike and then to use the service. Moreover, spreading bikes among the entire service area increases urban spatial equity since the benefits of FFBSS are not a prerogative of just a few zones. For guaranteeing such distribution, the FFBSS operator can use vans to manually relocate bikes, but it incurs high economic and environmental costs. We propose a novel approach that exploits the existing bike flows generated by customers to distribute bikes. More specifically, by envisioning the problem as an Influence Maximization problem, we show that it is possible to position batches of bikes on a small number of zones, and then the daily use of FFBSS will efficiently spread these bikes on a large area. We show that detecting these zones is NP-complete, but there exists a simple and efficient $1-1/e$ approximation algorithm; our approach is then evaluated on a dataset of rides from the free-floating bike-sharing system of the city of Padova. △ Less

Submitted 18 August, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

Comments: Proc. 21st Symposium on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems (ATMOS 2021)

arXiv:2106.01074 [pdf, other]

Database Reasoning Over Text

Authors: James Thorne, Majid Yazdani, Marzieh Saeidi, Fabrizio Silvestri, Sebastian Riedel, Alon Halevy

Abstract: Neural models have shown impressive performance gains in answering queries from natural language text. However, existing works are unable to support database queries, such as "List/Count all female athletes who were born in 20th century", which require reasoning over sets of relevant facts with operations such as join, filtering and aggregation. We show that while state-of-the-art transformer mode… ▽ More Neural models have shown impressive performance gains in answering queries from natural language text. However, existing works are unable to support database queries, such as "List/Count all female athletes who were born in 20th century", which require reasoning over sets of relevant facts with operations such as join, filtering and aggregation. We show that while state-of-the-art transformer models perform very well for small databases, they exhibit limitations in processing noisy data, numerical operations, and queries that aggregate facts. We propose a modular architecture to answer these database-style queries over multiple spans from text and aggregating these at scale. We evaluate the architecture using WikiNLDB, a novel dataset for exploring such queries. Our architecture scales to databases containing thousands of facts whereas contemporary models are limited by how many facts can be encoded. In direct comparison on small databases, our approach increases overall answer accuracy from 85% to 90%. On larger databases, our approach retains its accuracy whereas transformer baselines could not encode the context. △ Less

Submitted 2 June, 2021; originally announced June 2021.

Comments: To appear at ACL2021

arXiv:2105.09284 [pdf, other]

SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images

Authors: Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

Abstract: We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems. The task focused on memes and had three subtasks: (i) detecting the techniques in the text, (ii) detecting the text spans where the techniques are used, and (iii) detecting techniques in the entire meme, i.… ▽ More We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems. The task focused on memes and had three subtasks: (i) detecting the techniques in the text, (ii) detecting the text spans where the techniques are used, and (iii) detecting techniques in the entire meme, i.e., both in the text and in the image. It was a popular task, attracting 71 registrations, and 22 teams that eventually made an official submission on the test set. The evaluation results for the third subtask confirmed the importance of both modalities, the text and the image. Moreover, some teams reported benefits when not just combining the two modalities, e.g., by using early or late fusion, but rather modeling the interaction between them in a joint model. △ Less

Submitted 25 April, 2021; originally announced May 2021.

Comments: propaganda, disinformation, misinformation, fake news, memes, multimodality

MSC Class: 68T50 ACM Class: F.2.2; I.2.7

Journal ref: SemEval-2021

arXiv:2104.00353 [pdf, other]

CycleDRUMS: Automatic Drum Arrangement For Bass Lines Using CycleGAN

Authors: Giorgio Barnabò, Giovanni Trappolini, Lorenzo Lastilla, Cesare Campagnano, Angela Fan, Fabio Petroni, Fabrizio Silvestri

Abstract: The two main research threads in computer-based music generation are: the construction of autonomous music-making systems, and the design of computer-based environments to assist musicians. In the symbolic domain, the key problem of automatically arranging a piece music was extensively studied, while relatively fewer systems tackled this challenge in the audio domain. In this contribution, we prop… ▽ More The two main research threads in computer-based music generation are: the construction of autonomous music-making systems, and the design of computer-based environments to assist musicians. In the symbolic domain, the key problem of automatically arranging a piece music was extensively studied, while relatively fewer systems tackled this challenge in the audio domain. In this contribution, we propose CycleDRUMS, a novel method for generating drums given a bass line. After converting the waveform of the bass into a mel-spectrogram, we are able to automatically generate original drums that follow the beat, sound credible and can be directly mixed with the input bass. We formulated this task as an unpaired image-to-image translation problem, and we addressed it with CycleGAN, a well-established unsupervised style transfer framework, originally designed for treating images. The choice to deploy raw audio and mel-spectrograms enabled us to better represent how humans perceive music, and to potentially draw sounds for new arrangements from the vast collection of music recordings accumulated in the last century. In absence of an objective way of evaluating the output of both generative adversarial networks and music generative systems, we further defined a possible metric for the proposed task, partially based on human (and expert) judgement. Finally, as a comparison, we replicated our results with Pix2Pix, a paired image-to-image translation network, and we showed that our approach outperforms it. △ Less

Submitted 9 April, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

Comments: 9 pages, 5 figures, submitted to IEEE Transactions on Multimedia, the authors contributed equally to this work

arXiv:2103.12541 [pdf, other]

A Survey on Multimodal Disinformation Detection

Authors: Firoj Alam, Stefano Cresci, Tanmoy Chakraborty, Fabrizio Silvestri, Dimiter Dimitrov, Giovanni Da San Martino, Shaden Shaar, Hamed Firooz, Preslav Nakov

Abstract: Recent years have witnessed the proliferation of offensive content online such as fake news, propaganda, misinformation, and disinformation. While initially this was mostly about textual content, over time images and videos gained popularity, as they are much easier to consume, attract more attention, and spread further than text. As a result, researchers started leveraging different modalities an… ▽ More Recent years have witnessed the proliferation of offensive content online such as fake news, propaganda, misinformation, and disinformation. While initially this was mostly about textual content, over time images and videos gained popularity, as they are much easier to consume, attract more attention, and spread further than text. As a result, researchers started leveraging different modalities and combinations thereof to tackle online multimodal offensive content. In this study, we offer a survey on the state-of-the-art on multimodal disinformation detection covering various combinations of modalities: text, images, speech, video, social media network structure, and temporal information. Moreover, while some studies focused on factuality, others investigated how harmful the content is. While these two components in the definition of disinformation (i) factuality, and (ii) harmfulness, are equally important, they are typically studied in isolation. Thus, we argue for the need to tackle disinformation detection by taking into account multiple modalities as well as both factuality and harmfulness, in the same framework. Finally, we discuss current challenges and future research directions △ Less

Submitted 28 September, 2022; v1 submitted 13 March, 2021; originally announced March 2021.

Comments: Accepted at COLING-2022, disinformation, misinformation, factuality, harmfulness, fake news, propaganda, multimodality, text, images, videos, network structure, temporality

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2102.03322 [pdf, other]

CF-GNNExplainer: Counterfactual Explanations for Graph Neural Networks

Authors: Ana Lucic, Maartje ter Hoeve, Gabriele Tolomei, Maarten de Rijke, Fabrizio Silvestri

Abstract: Given the increasing promise of graph neural networks (GNNs) in real-world applications, several methods have been developed for explaining their predictions. Existing methods for interpreting predictions from GNNs have primarily focused on generating subgraphs that are especially relevant for a particular prediction. However, such methods are not counterfactual (CF) in nature: given a prediction,… ▽ More Given the increasing promise of graph neural networks (GNNs) in real-world applications, several methods have been developed for explaining their predictions. Existing methods for interpreting predictions from GNNs have primarily focused on generating subgraphs that are especially relevant for a particular prediction. However, such methods are not counterfactual (CF) in nature: given a prediction, we want to understand how the prediction can be changed in order to achieve an alternative outcome. In this work, we propose a method for generating CF explanations for GNNs: the minimal perturbation to the input (graph) data such that the prediction changes. Using only edge deletions, we find that our method, CF-GNNExplainer, can generate CF explanations for the majority of instances across three widely used datasets for GNN explanations, while removing less than 3 edges on average, with at least 94\% accuracy. This indicates that CF-GNNExplainer primarily removes edges that are crucial for the original predictions, resulting in minimal CF explanations. △ Less

Submitted 22 February, 2022; v1 submitted 5 February, 2021; originally announced February 2021.

Comments: Accepted to AISTATS 2022

arXiv:2101.10905 [pdf, other]

Sampling a Near Neighbor in High Dimensions -- Who is the Fairest of Them All?

Authors: Martin Aumüller, Sariel Har-Peled, Sepideh Mahabadi, Rasmus Pagh, Francesco Silvestri

Abstract: Similarity search is a fundamental algorithmic primitive, widely used in many computer science disciplines. Given a set of points $S$ and a radius parameter $r>0$, the $r$-near neighbor ($r$-NN) problem asks for a data structure that, given any query point $q$, returns a point $p$ within distance at most $r$ from $q$. In this paper, we study the $r$-NN problem in the light of individual fairness a… ▽ More Similarity search is a fundamental algorithmic primitive, widely used in many computer science disciplines. Given a set of points $S$ and a radius parameter $r>0$, the $r$-near neighbor ($r$-NN) problem asks for a data structure that, given any query point $q$, returns a point $p$ within distance at most $r$ from $q$. In this paper, we study the $r$-NN problem in the light of individual fairness and providing equal opportunities: all points that are within distance $r$ from the query should have the same probability to be returned. In the low-dimensional case, this problem was first studied by Hu, Qiao, and Tao (PODS 2014). Locality sensitive hashing (LSH), the theoretically strongest approach to similarity search in high dimensions, does not provide such a fairness guarantee. In this work, we show that LSH based algorithms can be made fair, without a significant loss in efficiency. We propose several efficient data structures for the exact and approximate variants of the fair NN problem. Our approach works more generally for sampling uniformly from a sub-collection of sets of a given collection and can be used in a few other applications. We also develop a data structure for fair similarity search under inner product that requires nearly-linear space and exploits locality sensitive filters. The paper concludes with an experimental evaluation that highlights the inherent unfairness of NN data structures and shows the performance of our algorithms on real-world datasets. △ Less

Submitted 26 January, 2021; originally announced January 2021.

Comments: arXiv admin note: text overlap with arXiv:1906.02640

arXiv:2010.06973 [pdf, other]

Neural Databases

Authors: James Thorne, Majid Yazdani, Marzieh Saeidi, Fabrizio Silvestri, Sebastian Riedel, Alon Halevy

Abstract: In recent years, neural networks have shown impressive performance gains on long-standing AI problems, and in particular, answering queries from natural language text. These advances raise the question of whether they can be extended to a point where we can relax the fundamental assumption of database management, namely, that our data is represented as fields of a pre-defined schema. This paper… ▽ More In recent years, neural networks have shown impressive performance gains on long-standing AI problems, and in particular, answering queries from natural language text. These advances raise the question of whether they can be extended to a point where we can relax the fundamental assumption of database management, namely, that our data is represented as fields of a pre-defined schema. This paper presents a first step in answering that question. We describe NeuralDB, a database system with no pre-defined schema, in which updates and queries are given in natural language. We develop query processing techniques that build on the primitives offered by the state of the art Natural Language Processing methods. We begin by demonstrating that at the core, recent NLP transformers, powered by pre-trained language models, can answer select-project-join queries if they are given the exact set of relevant facts. However, they cannot scale to non-trivial databases and cannot perform aggregation queries. Based on these findings, we describe a NeuralDB architecture that runs multiple Neural SPJ operators in parallel, each with a set of database sentences that can produce one of the answers to the query. The result of these operators is fed to an aggregation operator if needed. We describe an algorithm that learns how to create the appropriate sets of facts to be fed into each of the Neural SPJ operators. Importantly, this algorithm can be trained by the Neural SPJ operator itself. We experimentally validate the accuracy of NeuralDB and its components, showing that we can answer queries over thousands of sentences with very high accuracy. △ Less

Submitted 14 October, 2020; originally announced October 2020.

Comments: Submitted to PVLDB vol 14

arXiv:2009.10311 [pdf, other]

Preserving Integrity in Online Social Networks

Authors: Alon Halevy, Cristian Canton Ferrer, Hao Ma, Umut Ozertem, Patrick Pantel, Marzieh Saeidi, Fabrizio Silvestri, Ves Stoyanov

Abstract: Online social networks provide a platform for sharing information and free expression. However, these networks are also used for malicious purposes, such as distributing misinformation and hate speech, selling illegal drugs, and coordinating sex trafficking or child exploitation. This paper surveys the state of the art in kee** online platforms and their users safe from such harm, also known as… ▽ More Online social networks provide a platform for sharing information and free expression. However, these networks are also used for malicious purposes, such as distributing misinformation and hate speech, selling illegal drugs, and coordinating sex trafficking or child exploitation. This paper surveys the state of the art in kee** online platforms and their users safe from such harm, also known as the problem of preserving integrity. This survey comes from the perspective of having to combat a broad spectrum of integrity violations at Facebook. We highlight the techniques that have been proven useful in practice and that deserve additional attention from the academic community. Instead of discussing the many individual violation types, we identify key aspects of the social-media eco-system, each of which is common to a wide variety violation types. Furthermore, each of these components represents an area for research and development, and the innovations that are found can be applied widely. △ Less

Submitted 25 September, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

arXiv:2006.12608 [pdf, ps, other]

Similarity Search with Tensor Core Units

Authors: Thomas D. Ahle, Francesco Silvestri

Abstract: Tensor Core Units (TCUs) are hardware accelerators developed for deep neural networks, which efficiently support the multiplication of two dense $\sqrt{m}\times \sqrt{m}$ matrices, where $m$ is a given hardware parameter. In this paper, we show that TCUs can speed up similarity search problems as well. We propose algorithms for the Johnson-Lindenstrauss dimensionality reduction and for similarity… ▽ More Tensor Core Units (TCUs) are hardware accelerators developed for deep neural networks, which efficiently support the multiplication of two dense $\sqrt{m}\times \sqrt{m}$ matrices, where $m$ is a given hardware parameter. In this paper, we show that TCUs can speed up similarity search problems as well. We propose algorithms for the Johnson-Lindenstrauss dimensionality reduction and for similarity join that, by leveraging TCUs, achieve a $\sqrt{m}$ speedup up with respect to traditional approaches. △ Less

Submitted 22 June, 2020; originally announced June 2020.

arXiv:2006.00937 [pdf, ps, other]

Concept Matching for Low-Resource Classification

Authors: Federico Errica, Ludovic Denoyer, Bora Edizel, Fabio Petroni, Vassilis Plachouras, Fabrizio Silvestri, Sebastian Riedel

Abstract: We propose a model to tackle classification tasks in the presence of very little training data. To this aim, we approximate the notion of exact match with a theoretically sound mechanism that computes a probability of matching in the input space. Importantly, the model learns to focus on elements of the input that are relevant for the task at hand; by leveraging highlighted portions of the trainin… ▽ More We propose a model to tackle classification tasks in the presence of very little training data. To this aim, we approximate the notion of exact match with a theoretically sound mechanism that computes a probability of matching in the input space. Importantly, the model learns to focus on elements of the input that are relevant for the task at hand; by leveraging highlighted portions of the training data, an error boosting technique guides the learning process. In practice, it increases the error associated with relevant parts of the input by a given factor. Remarkable results on text classification tasks confirm the benefits of the proposed approach in both balanced and unbalanced cases, thus being of practical use when labeling new examples is expensive. In addition, by inspecting its weights, it is often possible to gather insights on what the model has learned. △ Less

Submitted 1 June, 2020; originally announced June 2020.

arXiv:2004.06423 [pdf]

doi 10.1117/12.2307155

Color filter arrays based on dielectric metasurface elements

Authors: Jonas Berzins, Fabrizio Silvestri, Giampiero Gerini, Frank Setzpfandt, Thomas Pertsch, Stefan M. B. Bäumer

Abstract: Digital imaging has been steadily improving over the past decades and we are moving towards a wide use of multi- and hyperspectral cameras. A key component of such imaging systems are color filter arrays, which define the spectrum of light detected by each camera pixel. Hence, it is essential to develop a variable, robust and scalable way for controlling the transmission of light. Nanostructured s… ▽ More Digital imaging has been steadily improving over the past decades and we are moving towards a wide use of multi- and hyperspectral cameras. A key component of such imaging systems are color filter arrays, which define the spectrum of light detected by each camera pixel. Hence, it is essential to develop a variable, robust and scalable way for controlling the transmission of light. Nanostructured surfaces, also known as metasurfaces, offer a promising solution as their transmission spectra can be controlled by sha** the wavelength-dependent scattering properties of their constituting elements. Here we present, metasurfaces based on silicon nanodisks, which provide filter functions with amplitudes reaching 70-90% of transmission, and well suitable for RGB and CMY color filter arrays, the initial stage towards the further development of hyperspectral filters. We suggest and discuss possible ways to expand the color gamut and improve the color values of such optical filters. △ Less

Submitted 14 April, 2020; originally announced April 2020.

Journal ref: SPIE Proc. 10671, 106711F (2018)

Showing 1–50 of 87 results for author: Silvestri, F