Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification

Authors: Martin Juan José Bucher, Marco Martini

Abstract: Generative AI offers a simple, prompt-based alternative to fine-tuning smaller BERT-style LLMs for text classification tasks. This promises to eliminate the need for manually labeled training data and task-specific model training. However, it remains an open question whether tools like ChatGPT can deliver on this promise. In this paper, we show that smaller, fine-tuned LLMs (still) consistently and significantly outperform larger, zero-shot prompted models in text classification. We compare three major generative AI models (ChatGPT with GPT-3.5/GPT-4 and Claude Opus) with several fine-tuned LLMs across a diverse set of classification tasks (sentiment, approval/disapproval, emotions, party positions) and text categories (news, tweets, speeches). We find that fine-tuning with application-specific training data achieves superior performance in all cases. To make this approach more accessible to a broader audience, we provide an easy-to-use toolkit alongside this paper. Our toolkit, accompanied by non-technical step-by-step guidance, enables users to select and fine-tune BERT-like LLMs for any classification task with minimal technical and computational effort.

Submitted 12 June, 2024; originally announced June 2024.

ACM Class: I.2.7

arXiv:2311.00548

Continual atlas-based segmentation of prostate MRI

Authors: Amin Ranem, Camila González, Daniel Pinto dos Santos, Andreas M. Bucher, Ahmed E. Othman, Anirban Mukhopadhyay

Abstract: Continual learning (CL) methods designed for natural image classification often fail to reach basic quality standards for medical image segmentation. Atlas-based segmentation, a well-established approach in medical imaging, incorporates domain knowledge on the region of interest, leading to semantically coherent predictions. This is especially promising for CL, as it allows us to leverage structural information and strike an optimal balance between model rigidity and plasticity over time. When combined with privacy-preserving prototypes, this process offers the advantages of rehearsal-based CL without compromising patient privacy. We propose Atlas Replay, an atlas-based segmentation approach that uses prototypes to generate high-quality segmentation masks through image registration that maintain consistency even as the training distribution changes. We explore how our proposed method performs compared to state-of-the-art CL methods in terms of knowledge transferability across seven publicly available prostate segmentation datasets. Prostate segmentation plays a vital role in diagnosing prostate cancer; however, it poses challenges due to substantial anatomical variations, benign structural differences in older age groups, and fluctuating acquisition parameters. Our results show that Atlas Replay is both robust and generalizes well to yet-unseen domains while being able to maintain knowledge, unlike end-to-end segmentation methods. Our code base is available under https://github.com/MECLabTUDA/Atlas-Replay.

Submitted 6 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

arXiv:2309.17285

Efficient Large Scale Medical Image Dataset Preparation for Machine Learning Applications

Authors: Stefan Denner, Jonas Scherer, Klaus Kades, Dimitrios Bounias, Philipp Schader, Lisa Kausch, Markus Bujotzek, Andreas Michael Bucher, Tobias Penzkofer, Klaus Maier-Hein

Abstract: In the rapidly evolving field of medical imaging, machine learning algorithms have become indispensable for enhancing diagnostic accuracy. However, the effectiveness of these algorithms is contingent upon the availability and organization of high-quality medical imaging datasets. Traditional Digital Imaging and Communications in Medicine (DICOM) data management systems are inadequate for handling the scale and complexity of data required to be facilitated in machine learning algorithms. This paper introduces an innovative data curation tool, developed as part of the Kaapana open-source toolkit, aimed at streamlining the organization, management, and processing of large-scale medical imaging datasets. The tool is specifically tailored to meet the needs of radiologists and machine learning researchers. It incorporates advanced search, auto-annotation and efficient tagging functionalities for improved data curation. Additionally, the tool facilitates quality control and review, enabling researchers to validate image and segmentation quality in large datasets. It also plays a critical role in uncovering potential biases in datasets by aggregating and visualizing metadata, which is essential for developing robust machine learning models. Furthermore, Kaapana is integrated within the Radiological Cooperative Network (RACOON), a pioneering initiative aimed at creating a comprehensive national infrastructure for the aggregation, transmission, and consolidation of radiological data across all university clinics throughout Germany. A supplementary video showcasing the tool's functionalities can be accessed at https://bit.ly/MICCAI-DEMI2023.

Submitted 29 September, 2023; originally announced September 2023.

arXiv:2307.05235

Quantitative Comparison of Nearest Neighbor Search Algorithms

Authors: Hanitriniala Malalatiana Rakotondrasoa, Martin Bucher, Ilya Sinayskiy

Abstract: We compare the performance of three nearest neighbor search algorithms: the Orchard, ball tree, and VP-tree algorithms. These algorithms are commonly used for nearest-neighbor searches and are known for their efficiency in large datasets. We analyze the fraction of distances computed in relation to the size of the dataset and its dimension. For each algorithm we derive a fitting function for the efficiency as a function of set size and dimension. The article aims to provide a comprehensive analysis of the performance of these algorithms and help researchers and practitioners choose the best algorithm for their specific application.

Submitted 11 July, 2023; originally announced July 2023.

Comments: 5 pages, LaTeX

arXiv:2212.14177

Current State of Community-Driven Radiological AI Deployment in Medical Imaging

Authors: Vikash Gupta, Barbaros Selnur Erdal, Carolina Ramirez, Ralf Floca, Laurence Jackson, Brad Genereaux, Sidney Bryson, Christopher P Bridge, Jens Kleesiek, Felix Nensa, Rickmer Braren, Khaled Younis, Tobias Penzkofer, Andreas Michael Bucher, Ming Melvin Qin, Gigon Bae, Hyeonhoon Lee, M. Jorge Cardoso, Sebastien Ourselin, Eric Kerfoot, Rahul Choudhury, Richard D. White, Tessa Cook, David Bericat, Matthew Lungren, et al. (2 additional authors not shown)

Abstract: Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.

Submitted 8 May, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

Comments: 21 pages; 5 figures

MSC Class: eess.IV

arXiv:2108.06230

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds

Authors: Björn Michele, Alexandre Boulch, Gilles Puy, Maxime Bucher, Renaud Marlet

Abstract: While there have been a number of studies on Zero-Shot Learning (ZSL) for 2D images, its application to 3D data is still recent and scarce, with just a few methods limited to classification. We present the first generative approach for both ZSL and Generalized ZSL (GZSL) on 3D data that can handle both classification and, for the first time, semantic segmentation. We show that it reaches or outperforms the state of the art on ModelNet40 classification for both inductive ZSL and inductive GZSL. For semantic segmentation, we created three benchmarks for evaluating this new ZSL task, using S3DIS, ScanNet and SemanticKITTI. Our experiments show that our method outperforms strong baselines, which we additionally propose for this task.

Submitted 19 January, 2023; v1 submitted 13 August, 2021; originally announced August 2021.

Comments: For the published code, see https://github.com/valeoai/3DGenZ

Journal ref: Proceedings of the 2021 International Conference on 3D Vision (3DV 2021), pp. 992-1002

arXiv:2004.01130

Handling new target classes in semantic segmentation with domain adaptation

Authors: Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

Abstract: In this work, we define and address a novel domain adaptation (DA) problem in semantic scene segmentation, where the target domain not only exhibits a data distribution shift w.r.t. the source domain, but also includes novel classes that do not exist in the latter. Different to "open-set" and "universal domain adaptation", which both regard all objects from new classes as "unknown", we aim at explicit test-time prediction for these new classes. To reach this goal, we propose a framework that leverages domain adaptation and zero-shot learning techniques to enable "boundless" adaptation in the target domain. It relies on a novel architecture, along with a dedicated learning scheme, to bridge the source-target domain gap while learning how to map new classes' labels to relevant visual representations. The performance is further improved using self-training on target-domain pseudo-labels. For validation, we consider different domain adaptation set-ups, namely synthetic-2-real, country-2-country and dataset-2-dataset. Our framework outperforms the baselines by significant margins, setting competitive standards on all benchmarks for the new task. Code and models are available at https://github.com/valeoai/buda.

Submitted 16 February, 2021; v1 submitted 2 April, 2020; originally announced April 2020.

Comments: Under review at CVIU

arXiv:1906.00817

Zero-Shot Semantic Segmentation

Authors: Maxime Bucher, Tuan-Hung Vu, Matthieu Cord, Patrick Pérez

Abstract: Semantic segmentation models are limited in their ability to scale to large numbers of object classes. In this paper, we introduce the new task of zero-shot semantic segmentation: learning pixel-wise classifiers for never-seen object categories with zero training examples. To this end, we present a novel architecture, ZS3Net, combining a deep visual segmentation model with an approach to generate visual representations from semantic word embeddings. In this way, ZS3Net addresses pixel classification tasks where both seen and unseen categories are faced at test time (so-called "generalized" zero-shot classification). Performance is further improved by a self-training step that relies on automatic pseudo-labeling of pixels from unseen classes. On the two standard segmentation datasets, Pascal-VOC and Pascal-Context, we propose zero-shot benchmarks and set competitive baselines. For complex scenes such as those in the Pascal-Context dataset, we extend our approach by using a graph-context encoding to fully leverage spatial context priors coming from class-wise segmentation maps.

Submitted 18 November, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

Comments: NeurIPS 2019 (accepted)

arXiv:1904.01886

DADA: Depth-aware Domain Adaptation in Semantic Segmentation

Authors: Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez

Abstract: Unsupervised domain adaptation (UDA) is important for applications where large scale annotation of representative data is challenging. For semantic segmentation in particular, it helps deploy on real "target domain" data models that are trained on annotated images from a different "source domain", notably a virtual environment. To this end, most previous works consider semantic segmentation as the only mode of supervision for source domain data, while ignoring other, possibly available, information like depth. In this work, we aim to make the best use of such privileged information while training the UDA model. We propose a unified depth-aware UDA framework that leverages in several complementary ways the knowledge of dense depth in the source domain. As a result, the performance of the trained semantic segmentation model on the target domain is boosted. Our novel approach indeed achieves state-of-the-art performance on different challenging synthetic-2-real benchmarks.

Submitted 19 August, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

Comments: Accepted in ICCV'19

arXiv:1811.12833

ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation

Authors: Tuan-Hung Vu, Himalaya Jain, Maxime Bucher, Matthieu Cord, Patrick Pérez

Abstract: Semantic segmentation is a key problem for many computer vision tasks. While approaches based on convolutional neural networks constantly break new records on different benchmarks, generalizing well to diverse testing environments remains a major challenge. In numerous real world applications, there is indeed a large gap between data distributions in train and test domains, which results in severe performance loss at run-time. In this work, we address the task of unsupervised domain adaptation in semantic segmentation with losses based on the entropy of the pixel-wise predictions. To this end, we propose two novel, complementary methods using (i) entropy loss and (ii) adversarial loss respectively. We demonstrate state-of-the-art performance in semantic segmentation on two challenging "synthetic-2-real" set-ups and show that the approach can also be used for detection.

Submitted 17 April, 2019; v1 submitted 30 November, 2018; originally announced November 2018.

Comments: Accepted in CVPR'19. Code is available at https://github.com/valeoai/ADVENT

arXiv:1811.02234

Semantic bottleneck for computer vision tasks

Authors: Maxime Bucher, Stéphane Herbin, Frédéric Jurie

Abstract: This paper introduces a novel method for the representation of images that is semantic by nature, addressing the question of computation intelligibility in computer vision tasks. More specifically, our proposition is to introduce what we call a semantic bottleneck in the processing pipeline, which is a crossing point in which the representation of the image is entirely expressed with natural language, while retaining the efficiency of numerical representations. We show that our approach is able to generate semantic representations that give state-of-the-art results on semantic content-based image retrieval and also perform very well on image classification tasks. Intelligibility is evaluated through user centered experiments for failure detection.

Submitted 6 November, 2018; originally announced November 2018.

Journal ref: Asian Conference on Computer Vision (ACCV), Dec 2018, Perth, Australia

arXiv:1708.06975

Generating Visual Representations for Zero-Shot Classification

Authors: Maxime Bucher, Stéphane Herbin, Frédéric Jurie

Abstract: This paper addresses the task of learning an image classifier when some categories are defined by semantic descriptions only (e.g. visual attributes) while the others are defined by exemplar images as well. This task is often referred to as the Zero-Shot classification task (ZSC). Most of the previous methods rely on learning a common embedding space that allows visual features of unknown categories to be compared with semantic descriptions. This paper argues that these approaches are limited as i) efficient discriminative classifiers can't be used and ii) classification tasks with seen and unseen categories (Generalized Zero-Shot Classification or GZSC) can't be addressed efficiently. In contrast, this paper proposes to address ZSC and GZSC by i) learning a conditional generator using seen classes and ii) generating artificial training examples for the categories without exemplars. ZSC is then turned into a standard supervised learning problem. Experiments with 4 generative models and 5 datasets validate the approach, giving state-of-the-art results on both ZSC and GZSC.

Submitted 11 December, 2017; v1 submitted 23 August, 2017; originally announced August 2017.

Journal ref: International Conference on Computer Vision (ICCV) Workshops: TASK-CV: Transferring and Adapting Source Knowledge in Computer Vision, Oct 2017, Venice, Italy

arXiv:1608.07441

Hard Negative Mining for Metric Learning Based Zero-Shot Classification

Authors: Maxime Bucher, Stéphane Herbin, Frédéric Jurie

Abstract: Zero-Shot learning has been shown to be an efficient strategy for domain adaptation. In this context, this paper builds on the recent work of Bucher et al. [1], which proposed an approach to solve Zero-Shot classification problems (ZSC) by introducing a novel metric learning based objective function. This objective function makes it possible to learn an optimal embedding of the attributes jointly with a measure of similarity between images and attributes. This paper extends their approach by proposing several schemes to control the generation of the negative pairs, resulting in a significant improvement of the performance and giving above state-of-the-art results on three challenging ZSC datasets.

Submitted 26 August, 2016; originally announced August 2016.

Journal ref: ECCV 16 WS TASK-CV: Transferring and Adapting Source Knowledge in Computer Vision, Oct 2016, Amsterdam, Netherlands

arXiv:1607.08085

Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification

Authors: Maxime Bucher, Stéphane Herbin, Frédéric Jurie

Abstract: This paper addresses the task of zero-shot image classification. The key contribution of the proposed approach is to control the semantic embedding of images -- one of the main ingredients of zero-shot learning -- by formulating it as a metric learning problem. The optimized empirical criterion associates two types of sub-task constraints: metric discriminating capacity and accurate attribute prediction. This results in a novel expression of zero-shot learning not requiring the notion of class in the training phase: only pairs of image/attributes, augmented with a consistency indicator, are given as ground truth. At test time, the learned model can predict the consistency of a test image with a given set of attributes, allowing flexible ways to produce recognition inferences. Despite its simplicity, the proposed approach gives state-of-the-art results on four challenging datasets used for zero-shot recognition evaluation.

Submitted 27 July, 2016; originally announced July 2016.

Comments: In ECCV 2016, Oct 2016, Amsterdam, Netherlands

Showing 1–14 of 14 results for author: Bucher, M