Search | arXiv e-print repository

GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation

Authors: Cristiano Saltori, Evgeny Krivosheev, Stéphane Lathuilière, Nicu Sebe, Fabio Galasso, Giuseppe Fiameni, Elisa Ricci, Fabio Poiesi

Abstract: 3D point cloud semantic segmentation is fundamental for autonomous driving. Most approaches in the literature neglect an important aspect, i.e., how to deal with domain shift when handling dynamic scenes. This can significantly hinder the navigation capabilities of self-driving vehicles. This paper advances the state of the art in this research field. Our first contribution consists in analysing a… ▽ More 3D point cloud semantic segmentation is fundamental for autonomous driving. Most approaches in the literature neglect an important aspect, i.e., how to deal with domain shift when handling dynamic scenes. This can significantly hinder the navigation capabilities of self-driving vehicles. This paper advances the state of the art in this research field. Our first contribution consists in analysing a new unexplored scenario in point cloud segmentation, namely Source-Free Online Unsupervised Domain Adaptation (SF-OUDA). We experimentally show that state-of-the-art methods have a rather limited ability to adapt pre-trained deep network models to unseen domains in an online manner. Our second contribution is an approach that relies on adaptive self-training and geometric-feature propagation to adapt a pre-trained source model online without requiring either source data or target labels. Our third contribution is to study SF-OUDA in a challenging setup where source data is synthetic and target data is point clouds captured in the real world. We use the recent SynLiDAR dataset as a synthetic source and introduce two new synthetic (source) datasets, which can stimulate future synthetic-to-real autonomous driving research. Our experiments show the effectiveness of our segmentation approach on thousands of real-world point clouds. Code and synthetic datasets are available at https://github.com/saltoricristiano/gipso-sfouda. △ Less

Submitted 20 July, 2022; originally announced July 2022.

Comments: Accepted at ECCV 2022

arXiv:2105.03701 [pdf, other]

Business Entity Matching with Siamese Graph Convolutional Networks

Authors: Evgeny Krivosheev, Mattia Atzeni, Katsiaryna Mirylenka, Paolo Scotton, Christoph Miksovic, Anton Zorin

Abstract: Data integration has been studied extensively for decades and approached from different angles. However, this domain still remains largely rule-driven and lacks universal automation. Recent developments in machine learning and in particular deep learning have opened the way to more general and efficient solutions to data-integration tasks. In this paper, we demonstrate an approach that allows mode… ▽ More Data integration has been studied extensively for decades and approached from different angles. However, this domain still remains largely rule-driven and lacks universal automation. Recent developments in machine learning and in particular deep learning have opened the way to more general and efficient solutions to data-integration tasks. In this paper, we demonstrate an approach that allows modeling and integrating entities by leveraging their relations and contextual information. This is achieved by combining siamese and graph neural networks to effectively propagate information between connected entities and support high scalability. We evaluated our approach on the task of integrating data about business entities, demonstrating that it outperforms both traditional rule-based systems and other deep learning approaches. △ Less

Submitted 8 May, 2021; originally announced May 2021.

arXiv:2104.00808 [pdf, other]

Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation

Authors: Subhankar Roy, Evgeny Krivosheev, Zhun Zhong, Nicu Sebe, Elisa Ricci

Abstract: In this paper we address multi-target domain adaptation (MTDA), where given one labeled source dataset and multiple unlabeled target datasets that differ in data distributions, the task is to learn a robust predictor for all the target domains. We identify two key aspects that can help to alleviate multiple domain-shifts in the MTDA: feature aggregation and curriculum learning. To this end, we pro… ▽ More In this paper we address multi-target domain adaptation (MTDA), where given one labeled source dataset and multiple unlabeled target datasets that differ in data distributions, the task is to learn a robust predictor for all the target domains. We identify two key aspects that can help to alleviate multiple domain-shifts in the MTDA: feature aggregation and curriculum learning. To this end, we propose Curriculum Graph Co-Teaching (CGCT) that uses a dual classifier head, with one of them being a graph convolutional network (GCN) which aggregates features from similar samples across the domains. To prevent the classifiers from over-fitting on its own noisy pseudo-labels we develop a co-teaching strategy with the dual classifier head that is assisted by curriculum learning to obtain more reliable pseudo-labels. Furthermore, when the domain labels are available, we propose Domain-aware Curriculum Learning (DCL), a sequential adaptation strategy that first adapts on the easier target domains, followed by the harder ones. We experimentally demonstrate the effectiveness of our proposed frameworks on several benchmarks and advance the state-of-the-art in the MTDA by large margins (e.g. +5.6% on the DomainNet). △ Less

Submitted 1 April, 2021; originally announced April 2021.

Comments: CVPR 2021

arXiv:2101.08854 [pdf, other]

Active Hybrid Classification

Authors: Evgeny Krivosheev, Fabio Casati, Alessandro Bozzon

Abstract: Hybrid crowd-machine classifiers can achieve superior performance by combining the cost-effectiveness of automatic classification with the accuracy of human judgment. This paper shows how crowd and machines can support each other in tackling classification problems. Specifically, we propose an architecture that orchestrates active learning and crowd classification and combines them in a virtuous c… ▽ More Hybrid crowd-machine classifiers can achieve superior performance by combining the cost-effectiveness of automatic classification with the accuracy of human judgment. This paper shows how crowd and machines can support each other in tackling classification problems. Specifically, we propose an architecture that orchestrates active learning and crowd classification and combines them in a virtuous cycle. We show that when the pool of items to classify is finite we face learning vs. exploitation trade-off in hybrid classification, as we need to balance crowd tasks optimized for creating a training dataset with tasks optimized for classifying items in the pool. We define the problem, propose a set of heuristics and evaluate the approach on three real-world datasets with different characteristics in terms of machine and crowd classification performance, showing that our active hybrid approach significantly outperforms baselines. △ Less

Submitted 21 January, 2021; originally announced January 2021.

arXiv:2012.02297 [pdf, other]

Active Learning from Crowd in Document Screening

Authors: Evgeny Krivosheev, Burcu Sayin, Alessandro Bozzon, Zoltán Szlávik

Abstract: In this paper, we explore how to efficiently combine crowdsourcing and machine intelligence for the problem of document screening, where we need to screen documents with a set of machine-learning filters. Specifically, we focus on building a set of machine learning classifiers that evaluate documents, and then screen them efficiently. It is a challenging task since the budget is limited and there… ▽ More In this paper, we explore how to efficiently combine crowdsourcing and machine intelligence for the problem of document screening, where we need to screen documents with a set of machine-learning filters. Specifically, we focus on building a set of machine learning classifiers that evaluate documents, and then screen them efficiently. It is a challenging task since the budget is limited and there are countless number of ways to spend the given budget on the problem. We propose a multi-label active learning screening specific sampling technique -- objective-aware sampling -- for querying unlabelled documents for annotating. Our algorithm takes a decision on which machine filter need more training data and how to choose unlabeled items to annotate in order to minimize the risk of overall classification errors rather than minimizing a single filter error. We demonstrate that objective-aware sampling significantly outperforms the state of the art active learning sampling strategies. △ Less

Submitted 11 November, 2020; originally announced December 2020.

Comments: Crowd Science Workshop at NeurIPS 2020

arXiv:2001.06543 [pdf, other]

Siamese Graph Neural Networks for Data Integration

Authors: Evgeny Krivosheev, Mattia Atzeni, Katsiaryna Mirylenka, Paolo Scotton, Fabio Casati

Abstract: Data integration has been studied extensively for decades and approached from different angles. However, this domain still remains largely rule-driven and lacks universal automation. Recent development in machine learning and in particular deep learning has opened the way to more general and more efficient solutions to data integration problems. In this work, we propose a general approach to model… ▽ More Data integration has been studied extensively for decades and approached from different angles. However, this domain still remains largely rule-driven and lacks universal automation. Recent development in machine learning and in particular deep learning has opened the way to more general and more efficient solutions to data integration problems. In this work, we propose a general approach to modeling and integrating entities from structured data, such as relational databases, as well as unstructured sources, such as free text from news articles. Our approach is designed to explicitly model and leverage relations between entities, thereby using all available information and preserving as much context as possible. This is achieved by combining siamese and graph neural networks to propagate information between connected entities and support high scalability. We evaluate our method on the task of integrating data about business entities, and we demonstrate that it outperforms standard rule-based systems, as well as other deep learning approaches that do not use graph-based representations. △ Less

Submitted 17 January, 2020; originally announced January 2020.

arXiv:1904.00714 [pdf, other]

Combining Crowd and Machines for Multi-predicate Item Screening

Authors: Evgeny Krivosheev, Fabio Casati, Marcos Baez, Boualem Benatallah

Abstract: This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classi… ▽ More This paper discusses how crowd and machine classifiers can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. We further show how, given a new classification problem and a set of classifiers of unknown accuracy for the problem at hand, we can identify how to manage the cost-accuracy trade off by progressively determining if we should spend budget to obtain test data (to assess the accuracy of the given classifiers), or to train an ensemble of classifiers, or whether we should leverage the existing machine classifiers with the crowd, and in this case how to efficiently combine them based on their estimated characteristics to obtain the classification. We demonstrate that the techniques we propose obtain significant cost/accuracy improvements with respect to the leading classification algorithms. △ Less

Submitted 1 April, 2019; originally announced April 2019.

Comments: Please cite the CSCW2018 version of this paper:@article{krivosheev2018combining, title={Combining Crowd and Machines for Multi-predicate Item Screening}, author={Krivosheev, Evgeny and Casati, Fabio and Baez, Marcos and Benatallah, Boualem}, journal={Proceedings of the ACM on Human-Computer Interaction}, volume={2}, number={CSCW}, pages={97}, year={2018}, publisher={ACM} }

arXiv:1805.12376 [pdf, other]

CrowdRev: A platform for Crowd-based Screening of Literature Reviews

Authors: Jorge Ramirez, Evgeny Krivosheev, Marcos Baez, Fabio Casati, Boualem Benatallah

Abstract: In this paper and demo we present a crowd and crowd+AI based system, called CrowdRev, supporting the screening phase of literature reviews and achieving the same quality as author classification at a fraction of the cost, and near-instantly. CrowdRev makes it easy for authors to leverage the crowd, and ensures that no money is wasted even in the face of difficult papers or criteria: if the system… ▽ More In this paper and demo we present a crowd and crowd+AI based system, called CrowdRev, supporting the screening phase of literature reviews and achieving the same quality as author classification at a fraction of the cost, and near-instantly. CrowdRev makes it easy for authors to leverage the crowd, and ensures that no money is wasted even in the face of difficult papers or criteria: if the system detects that the task is too hard for the crowd, it just gives up trying (for that paper, or for that criteria, or altogether), without wasting money and never compromising on quality. △ Less

Submitted 31 May, 2018; originally announced May 2018.

arXiv:1803.09814 [pdf, other]

Crowd-based Multi-Predicate Screening of Papers in Literature Reviews

Authors: Evgeny Krivosheev, Fabio Casati, Boualem Benatallah

Abstract: Systematic literature reviews (SLRs) are one of the most common and useful form of scientific research and publication. Tens of thousands of SLRs are published each year, and this rate is growing across all fields of science. Performing an accurate, complete and unbiased SLR is however a difficult and expensive endeavor. This is true in general for all phases of a literature review, and in particu… ▽ More Systematic literature reviews (SLRs) are one of the most common and useful form of scientific research and publication. Tens of thousands of SLRs are published each year, and this rate is growing across all fields of science. Performing an accurate, complete and unbiased SLR is however a difficult and expensive endeavor. This is true in general for all phases of a literature review, and in particular for the paper screening phase, where authors lter a set of potentially in-scope papers based on a number of exclusion criteria. To address the problem, in recent years the research community has began to explore the use of the crowd to allow for a faster, accurate, cheaper and unbiased screening of papers. Initial results show that crowdsourcing can be effective, even for relatively complex reviews. In this paper we derive and analyze a set of strategies for crowd-based screening, and show that an adaptive strategy, that continuously re-assesses the statistical properties of the problem to minimize the number of votes needed to take decisions for each paper, significantly outperforms a number of non-adaptive approaches in terms of cost and accuracy. We validate both applicability and results of the approach through a set of crowdsourcing experiments, and discuss properties of the problem and algorithms that we believe to be generally of interest for classification problems where items are classified via a series of successive tests (as it often happens in medicine). △ Less

Submitted 21 March, 2018; originally announced March 2018.

Comments: Please cite the www2018 version of this paper: @inproceedings{krivosheev2018, title={Crowd-based Multi-Predicate Screening of Papers in Literature Reviews}, author={Evgeny Krivosheev, Fabio Casati and Boualem Benatallah}, year={2018}, organization={International World Wide Web Conferences Steering Committee} }

arXiv:1803.07947 [pdf, other]

Crowd-Machine Collaboration for Item Screening

Authors: Evgeny Krivosheev, Bahareh Harandizadeh, Fabio Casati, Boualem Benatallah

Abstract: In this paper we describe how crowd and machine classifier can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. In this paper we describe how crowd and machine classifier can be efficiently combined to screen items that satisfy a set of predicates. We show that this is a recurring problem in many domains, present machine-human (hybrid) algorithms that screen items efficiently and estimate the gain over human-only or machine-only screening in terms of performance and cost. △ Less

Submitted 21 March, 2018; originally announced March 2018.

arXiv:1709.05168 [pdf, other]

Crowdsourcing Paper Screening in Systematic Literature Reviews

Authors: Evgeny Krivosheev, Fabio Casati, Valentina Caforio, Boualem Benatallah

Abstract: Literature reviews allow scientists to stand on the shoulders of giants, showing promising directions, summarizing progress, and pointing out existing challenges in research. At the same time conducting a systematic literature review is a laborious and consequently expensive process. In the last decade, there have a few studies on crowdsourcing in literature reviews. This paper explores the feasib… ▽ More Literature reviews allow scientists to stand on the shoulders of giants, showing promising directions, summarizing progress, and pointing out existing challenges in research. At the same time conducting a systematic literature review is a laborious and consequently expensive process. In the last decade, there have a few studies on crowdsourcing in literature reviews. This paper explores the feasibility of crowdsourcing for facilitating the literature review process in terms of results, time and effort, as well as to identify which crowdsourcing strategies provide the best results based on the budget available. In particular we focus on the screening phase of the literature review process and we contribute and assess methods for identifying the size of tests, labels required per paper, and classification functions as well as methods to split the crowdsourcing process in phases to improve results. Finally, we present our findings based on experiments run on Crowdflower. △ Less

Submitted 15 September, 2017; originally announced September 2017.

Showing 1–11 of 11 results for author: Krivosheev, E