
Showing 1–7 of 7 results for author: Siblini, W

  1. arXiv:2405.20468  [pdf, other]

    cs.CL cs.IR cs.LG

    MTEB-French: Resources for French Sentence Embedding Evaluation and Analysis

    Authors: Mathieu Ciancone, Imene Kerboua, Marion Schaeffer, Wissam Siblini

    Abstract: Recently, numerous embedding models have been made available and widely used for various NLP tasks. The Massive Text Embedding Benchmark (MTEB) has primarily simplified the process of choosing a model that performs well for several tasks in English, but extensions to other languages remain challenging. This is why we expand MTEB to propose the first massive benchmark of sentence embeddings for Fre…

    Submitted 17 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2204.05265  [pdf, other]

    cs.LG

    The Importance of Future Information in Credit Card Fraud Detection

    Authors: Van Bach Nguyen, Kanishka Ghosh Dastidar, Michael Granitzer, Wissam Siblini

    Abstract: Fraud detection systems (FDS) mainly perform two tasks: (i) real-time detection while the payment is being processed and (ii) posterior detection to block the card retrospectively and avoid further frauds. Since human verification is often necessary and the payment processing time is limited, the second task manages the largest volume of transactions. In the literature, fraud detection challenges…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 11 pages, 4 figures, to be published at AISTATS 2022

  3. arXiv:2107.09323  [pdf, other]

    cs.LG cs.CR

    Transfer Learning for Credit Card Fraud Detection: A Journey from Research to Production

    Authors: Wissam Siblini, Guillaume Coter, Rémy Fabry, Liyun He-Guelton, Frédéric Oblé, Bertrand Lebichot, Yann-Aël Le Borgne, Gianluca Bontempi

    Abstract: The dark face of digital commerce generalization is the increase of fraud attempts. To prevent any type of attacks, state-of-the-art fraud detection systems are now embedding Machine Learning (ML) modules. The conception of such modules is only communicated at the level of research and papers mostly focus on results for isolated benchmark datasets and metrics. But research is only a part of the jo…

    Submitted 4 November, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

  4. arXiv:2010.08422  [pdf, other]

    cs.CL

    Delaying Interaction Layers in Transformer-based Encoders for Efficient Open Domain Question Answering

    Authors: Wissam Siblini, Mohamed Challal, Charlotte Pasqual

    Abstract: Open Domain Question Answering (ODQA) on a large-scale corpus of documents (e.g. Wikipedia) is a key challenge in computer science. Although transformer-based language models such as Bert have shown on SQuAD the ability to surpass humans for extracting answers in small passages of text, they suffer from their high complexity when faced to a much larger search space. The most common way to tackle t…

    Submitted 16 October, 2020; originally announced October 2020.

  5. arXiv:1910.04659  [pdf, other]

    cs.CL

    Multilingual Question Answering from Formatted Text applied to Conversational Agents

    Authors: Wissam Siblini, Charlotte Pasqual, Axel Lavielle, Mohamed Challal, Cyril Cauchois

    Abstract: Recent advances with language models (e.g. BERT, XLNet, ...), have allowed surpassing human performance on complex NLP tasks such as Reading Comprehension. However, labeled datasets for training are available mostly in English which makes it difficult to acknowledge progress in other languages. Fortunately, models are now pre-trained on unlabeled data from hundreds of languages and exhibit interes…

    Submitted 1 February, 2021; v1 submitted 10 October, 2019; originally announced October 2019.

  6. Master your Metrics with Calibration

    Authors: Wissam Siblini, Jordan Fréry, Liyun He-Guelton, Frédéric Oblé, Yi-Qing Wang

    Abstract: Machine learning models deployed in real-world applications are often evaluated with precision-based metrics such as F1-score or AUC-PR (Area Under the Curve of Precision Recall). Heavily dependent on the class prior, such metrics make it difficult to interpret the variation of a model's performance over different subpopulations/subperiods in a dataset. In this paper, we propose a way to calibrate…

    Submitted 28 April, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

    Comments: Presented at IDA2020

  7. arXiv:1803.02101  [pdf]

    cs.IR

    VIPE: A new interactive classification framework for large sets of short texts - application to opinion mining

    Authors: Wissam Siblini, Frank Meyer, Pascale Kuntz

    Abstract: This paper presents a new interactive opinion mining tool that helps users to classify large sets of short texts originated from Web opinion polls, technical forums or Twitter. From a manual multi-label pre-classification of a very limited text subset, a learning algorithm predicts the labels of the remaining texts of the corpus and the texts most likely associated to a selected label. Using a fas…

    Submitted 6 March, 2018; originally announced March 2018.

    Comments: 8 pages