Showing 1–21 of 21 results for author: Lucchese, C

  1. arXiv:2404.11731

    cs.IR

    A Learning-to-Rank Formulation of Clustering-Based Approximate Nearest Neighbor Search

    Authors: Thomas Vecchiato, Claudio Lucchese, Franco Maria Nardini, Sebastian Bruch

    Abstract: A critical piece of the modern information retrieval puzzle is approximate nearest neighbor search. Its objective is to return a set of $k$ data points that are closest to a query point, with its accuracy measured by the proportion of exact nearest neighbors captured in the returned set. One popular approach to this question is clustering: The indexing algorithm partitions data points into non-ove…

    Submitted 17 April, 2024; originally announced April 2024.

  2. arXiv:2402.14988

    cs.LG cs.CR cs.LO stat.ML

    Verifiable Boosted Tree Ensembles

    Authors: Stefano Calzavara, Lorenzo Cazzaro, Claudio Lucchese, Giulio Ermanno Pibiri

    Abstract: Verifiable learning advocates for training machine learning models amenable to efficient security verification. Prior research demonstrated that specific classes of decision tree ensembles -- called large-spread ensembles -- allow for robustness verification in polynomial time against any norm-based attacker. This study expands prior work on verifiable learning from basic ensemble methods (i.e., h…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 15 pages, 3 figures

  3. Efficient and Effective Tree-based and Neural Learning to Rank

    Authors: Sebastian Bruch, Claudio Lucchese, Franco Maria Nardini

    Abstract: This monograph takes a step towards promoting the study of efficiency in the era of neural information retrieval by offering a comprehensive survey of the literature on efficiency and effectiveness in ranking, and to a limited extent, retrieval. This monograph was inspired by the parallels that exist between the challenges in neural network-based ranking solutions and their predecessors, decision…

    Submitted 15 May, 2023; originally announced May 2023.

  4. arXiv:2305.08579

    cs.LG

    Fast Inference of Tree Ensembles on ARM Devices

    Authors: Simon Koschel, Sebastian Buschjäger, Claudio Lucchese, Katharina Morik

    Abstract: With the ongoing integration of Machine Learning models into everyday life, e.g. in the form of the Internet of Things (IoT), the evaluation of learned models becomes more and more an important issue. Tree ensembles are one of the best black-box classifiers available and routinely outperform more complex classifiers. While the fast application of tree ensembles has already been studied in the lite…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 12 pages, 2 figures, 4 algorithms

  5. arXiv:2212.14447

    cs.AI cs.CV cs.LG

    A Theoretical Framework for AI Models Explainability with Application in Biomedicine

    Authors: Matteo Rizzo, Alberto Veneri, Andrea Albarelli, Claudio Lucchese, Marco Nobile, Cristina Conati

    Abstract: EXplainable Artificial Intelligence (XAI) is a vibrant research topic in the artificial intelligence community, with growing interest across methods and domains. Much has been written about the subject, yet XAI still lacks shared terminology and a framework capable of providing structural soundness to explanations. In our work, we address these issues by proposing a novel definition of explanation…

    Submitted 14 June, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

  6. arXiv:2209.13179

    cs.LG cs.LO

    Explainable Global Fairness Verification of Tree-Based Classifiers

    Authors: Stefano Calzavara, Lorenzo Cazzaro, Claudio Lucchese, Federico Marcuzzi

    Abstract: We present a new approach to the global fairness verification of tree-based classifiers. Given a tree-based classifier and a set of sensitive features potentially leading to discrimination, our analysis synthesizes sufficient conditions for fairness, expressed as a set of traditional propositional logic formulas, which are readily understandable by human experts. The verified fairness guarantees a…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: 15 pages with 7 figures

  7. ILMART: Interpretable Ranking with Constrained LambdaMART

    Authors: Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Alberto Veneri

    Abstract: Interpretable Learning to Rank (LtR) is an emerging field within the research area of explainable AI, aiming at developing intelligible and accurate predictive models. While most of the previous research efforts focus on creating post-hoc explanations, in this paper we investigate how to train effective and intrinsically-interpretable ranking models. Developing these models is particularly challen…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 5 pages, 3 figures, to be published in SIGIR 2022 proceedings

  8. arXiv:2112.14435

    cs.LG cs.AI

    EiFFFeL: Enforcing Fairness in Forests by Flipping Leaves

    Authors: Seyum Assefa Abebe, Claudio Lucchese, Salvatore Orlando

    Abstract: Nowadays Machine Learning (ML) techniques are extensively adopted in many socially sensitive systems, thus requiring to carefully study the fairness of the decisions taken by such systems. Many approaches have been proposed to address and to make sure there is no bias against individuals or specific groups which might originally come from biased training datasets or algorithm design. In this regar…

    Submitted 16 May, 2022; v1 submitted 29 December, 2021; originally announced December 2021.

  9. arXiv:2112.02705

    cs.LG cs.CR

    Beyond Robustness: Resilience Verification of Tree-Based Classifiers

    Authors: Stefano Calzavara, Lorenzo Cazzaro, Claudio Lucchese, Federico Marcuzzi, Salvatore Orlando

    Abstract: In this paper we criticize the robustness measure traditionally employed to assess the performance of machine learning models deployed in adversarial settings. To mitigate the limitations of robustness, we introduce a new measure called resilience and we focus on its verification. In particular, we discuss how resilience can be verified by combining a traditional robustness verification technique…

    Submitted 5 December, 2021; originally announced December 2021.

  10. Learning Early Exit Strategies for Additive Ranking Ensembles

    Authors: Francesco Busolin, Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani

    Abstract: Modern search engine ranking pipelines are commonly based on large machine-learned ensembles of regression trees. We propose LEAR, a novel - learned - technique aimed to reduce the average number of trees traversed by documents to accumulate the scores, thus reducing the overall query response time. LEAR exploits a classifier that predicts whether a document can early exit the ensemble because it…

    Submitted 6 May, 2021; originally announced May 2021.

    Comments: 5 pages, 3 figures, ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 21)

    ACM Class: H.3.3

    Journal ref: 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, 2021, 2217-2221

  11. arXiv:2007.02771

    cs.LG stat.ML

    Certifying Decision Trees Against Evasion Attacks by Program Analysis

    Authors: Stefano Calzavara, Pietro Ferrara, Claudio Lucchese

    Abstract: Machine learning has proved invaluable for a range of different tasks, yet it also proved vulnerable to evasion attacks, i.e., maliciously crafted perturbations of input data designed to force mispredictions. In this paper we propose a novel technique to verify the security of decision tree models against evasion attacks with respect to an expressive threat model, where the attacker can be represe…

    Submitted 6 July, 2020; originally announced July 2020.

  12. arXiv:2004.14641

    cs.IR cs.LG

    Query-level Early Exit for Additive Learning-to-Rank Ensembles

    Authors: Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Salvatore Trani

    Abstract: Search engine ranking pipelines are commonly based on large ensembles of machine-learned decision trees. The tight constraints on query response time recently motivated researchers to investigate algorithms to make faster the traversal of the additive ensemble or to early terminate the evaluation of documents that are unlikely to be ranked among the top-k. In this paper, we investigate the novel p…

    Submitted 30 April, 2020; originally announced April 2020.

    Comments: Accepted at SIGIR 2020 (short paper)

    MSC Class: 68P20

  13. arXiv:2004.03295

    cs.LG stat.ML

    Feature Partitioning for Robust Tree Ensembles and their Certification in Adversarial Scenarios

    Authors: Stefano Calzavara, Claudio Lucchese, Federico Marcuzzi, Salvatore Orlando

    Abstract: Machine learning algorithms, however effective, are known to be vulnerable in adversarial scenarios where a malicious user may inject manipulated instances. In this work we focus on evasion attacks, where a model is trained in a safe environment and exposed to attacks at test time. The attacker aims at finding a minimal perturbation of a test instance that changes the model outcome. We propose a…

    Submitted 7 April, 2020; originally announced April 2020.

  14. arXiv:1907.01197

    cs.LG cs.CR stat.ML

    Treant: Training Evasion-Aware Decision Trees

    Authors: Stefano Calzavara, Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, Salvatore Orlando

    Abstract: Despite its success and popularity, machine learning is now recognized as vulnerable to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors. In this paper we focus on evasion attacks against decision tree ensembles, which are among the most successful predictive models for dealing with non-perceptual problems. Even though they are powerful and…

    Submitted 3 July, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

  15. arXiv:1811.02516

    cs.IR cs.SI

    Computing Entity Semantic Similarity by Features Ranking

    Authors: Livia Ruback, Claudio Lucchese, Alexander Arturo Mera Caraballo, Grettel Monteagudo García, Marco Antonio Casanova, Chiara Renso

    Abstract: This article presents a novel approach to estimate semantic entity similarity using entity features available as Linked Data. The key idea is to exploit ranked lists of features, extracted from Linked Data sources, as a representation of the entities to be compared. The similarity between two entities is then estimated by comparing their ranked lists of features. The article describes experiments…

    Submitted 6 November, 2018; originally announced November 2018.

  16. arXiv:1703.05053

    cs.SI

    A Motif-based Approach for Identifying Controversy

    Authors: Mauro Coletto, Kiran Garimella, Aristides Gionis, Claudio Lucchese

    Abstract: Among the topics discussed in Social Media, some lead to controversy. A number of recent studies have focused on the problem of identifying controversy in social media mostly based on the analysis of textual content or rely on global network structure. Such approaches have strong limitations due to the difficulty of understanding natural language, and of investigating the global network structure.…

    Submitted 15 March, 2017; originally announced March 2017.

    Journal ref: ICWSM 2017

  17. arXiv:1612.08157

    cs.CY cs.SI

    Pornography consumption in Social Media

    Authors: Mauro Coletto, Luca Maria Aiello, Claudio Lucchese, Fabrizio Silvestri

    Abstract: The structure of a social network is fundamentally related to the interests of its members. People assort spontaneously based on the topics that are relevant to them, forming social groups that revolve around different subjects. Online social media are also favorable ecosystems for the formation of topical communities centered on matters that are not commonly taken up by the general public because…

    Submitted 20 January, 2017; v1 submitted 24 December, 2016; originally announced December 2016.

    Comments: arXiv admin note: text overlap with arXiv:1610.08372

  18. arXiv:1610.08686

    cs.SI

    Polarized User and Topic Tracking in Twitter

    Authors: Mauro Coletto, Claudio Lucchese, Salvatore Orlando, Raffaele Perego

    Abstract: Digital traces of conversations in micro-blogging platforms and OSNs provide information about user opinion with a high degree of resolution. These information sources can be exploited to understand and monitor collective behaviors. In this work, we focus on polarization classes, i.e., those topics that require the user to side exclusively with one position. The proposed method provides an itera…

    Submitted 27 October, 2016; originally announced October 2016.

    Comments: SIGIR 16

  19. arXiv:1610.08372

    cs.SI physics.soc-ph

    On the Behaviour of Deviant Communities in Online Social Networks

    Authors: Mauro Coletto, Luca Maria Aiello, Claudio Lucchese, Fabrizio Silvestri

    Abstract: On-line social networks are complex ensembles of inter-linked communities that interact on different topics. Some communities are characterized by what are usually referred to as deviant behaviors, conducts that are commonly considered inappropriate with respect to the society's norms or moral standards. Eating disorders, drug use, and adult content consumption are just a few examples. We refer to…

    Submitted 26 October, 2016; originally announced October 2016.

    Comments: ICWSM 16

  20. arXiv:1605.01895

    cs.SI

    Sentiment-enhanced Multidimensional Analysis of Online Social Networks: Perception of the Mediterranean Refugees Crisis

    Authors: Mauro Coletto, Claudio Lucchese, Cristina Ioana Muntean, Franco Maria Nardini, Andrea Esuli, Chiara Renso, Raffaele Perego

    Abstract: We propose an analytical framework able to investigate discussions about polarized topics in online social networks from many different angles. The framework supports the analysis of social networks along several dimensions: time, space and sentiment. We show that the proposed analytical framework and the methodology can be used to mine knowledge about the perception of complex social phenomena. W…

    Submitted 6 May, 2016; originally announced May 2016.

  21. arXiv:0905.4627

    cs.MM cs.IR

    CoPhIR: a Test Collection for Content-Based Image Retrieval

    Authors: Paolo Bolettieri, Andrea Esuli, Fabrizio Falchi, Claudio Lucchese, Raffaele Perego, Tommaso Piccioli, Fausto Rabitti

    Abstract: The scalability, as well as the effectiveness, of the different Content-based Image Retrieval (CBIR) approaches proposed in literature, is today an important research issue. Given the wealth of images on the Web, CBIR systems must in fact leap towards Web-scale datasets. In this paper, we report on our experience in building a test collection of 100 million images, with the corresponding descrip…

    Submitted 1 June, 2009; v1 submitted 28 May, 2009; originally announced May 2009.

    Comments: 15 pages