-
Task-Specific Embeddings for Ante-Hoc Explainable Text Classification
Authors:
Kishaloy Halder,
Josip Krapac,
Alan Akbik,
Anthony Brew,
Matti Lyra
Abstract:
Current state-of-the-art approaches to text classification typically leverage BERT-style Transformer models with a softmax classifier, jointly fine-tuned to predict class labels of a target task. In this paper, we instead propose an alternative training objective in which we learn task-specific embeddings of text: our proposed objective learns embeddings such that all texts that share the same target class label should be close together in the embedding space, while all others should be far apart. This allows us to replace the softmax classifier with a more interpretable k-nearest-neighbor classification approach. In a series of experiments, we show that this yields a number of interesting benefits: (1) The resulting order induced by distances in the embedding space can be used to directly explain classification decisions. (2) This facilitates qualitative inspection of the training data, helping us to better understand the problem space and identify labelling quality issues. (3) The learned distances to some degree generalize to unseen classes, allowing us to incrementally add new classes without retraining the model. We present extensive experiments which show that the benefits of ante-hoc explainability and incremental learning come at no cost in overall classification accuracy, thus pointing to practical applicability of our proposed approach.
Submitted 30 November, 2022;
originally announced December 2022.
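The k-nearest-neighbor classification over learned embeddings described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D vectors, the example texts, the cosine distance, and the function names `cosine_distance` and `knn_classify` are all assumptions made for illustration, and the embeddings are hand-written rather than produced by a trained model. The point it shows is that the nearest training examples double as the explanation for the predicted label.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; in practice these would come from a trained encoder.
train_texts = ["great fit", "love it", "arrived broken", "poor quality"]
train_labels = ["positive", "positive", "negative", "negative"]
train_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_classify(query, embeddings, labels, texts, k=3):
    """Predict by majority vote over the k nearest training examples.

    Returns the predicted label and the neighbors (text, label) ordered
    from nearest to farthest, which serve as the explanation.
    """
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: cosine_distance(query, embeddings[i]),
    )
    neighbors = ranked[:k]
    votes = Counter(labels[i] for i in neighbors)
    predicted = votes.most_common(1)[0][0]
    explanation = [(texts[i], labels[i]) for i in neighbors]
    return predicted, explanation


# Assumed embedding of an unseen text.
label, explanation = knn_classify(
    [0.8, 0.3], train_embeddings, train_labels, train_texts, k=3
)
```

Because prediction only consults stored examples, supporting a new class in this sketch amounts to appending its labelled embeddings to the three lists, which mirrors the incremental-learning benefit the abstract mentions.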
-
Enhancing Product Safety in E-Commerce with NLP
Authors:
Kishaloy Halder,
Josip Krapac,
Dmitry Goryunov,
Anthony Brew,
Matti Lyra,
Alsida Dizdari,
William Gillett,
Adrien Renahy,
Sinan Tang
Abstract:
Ensuring safety of the products offered to the customers is of paramount importance to any e-commerce platform. Despite stringent quality and safety checking of products listed on these platforms, occasionally customers might receive a product that can pose a safety issue arising out of its use. In this paper, we present an innovative mechanism of how a large-scale multinational e-commerce platform, Zalando, uses Natural Language Processing techniques to assist timely investigation of the potentially unsafe products mined directly from customer-written claims in unstructured plain text. We systematically describe the types of safety issues that concern Zalando customers. We demonstrate how we map this core business problem into a supervised text classification problem with highly imbalanced, noisy, multilingual data in an AI-in-the-loop setup with a focus on Key Performance Indicator (KPI) driven evaluation. Finally, we present detailed ablation studies to show a comprehensive comparison between different classification techniques. We conclude the work with how this NLP model was deployed.
Submitted 25 October, 2022;
originally announced October 2022.
-
Characterization of the Firm-Firm Public Procurement Co-Bidding Network from the State of Ceará (Brazil) Municipalities
Authors:
Marcos Lyra,
António Curado,
Bruno Damásio,
Fernando Bação,
Flávio L. Pinheiro
Abstract:
Fraud in public funding can have deleterious consequences for the economic, social, and political well-being of societies. Fraudulent activity associated with public procurement contracts accounts for losses of billions of euros every year. Thus, it is of utmost relevance to explore analytical frameworks that can help public authorities identify agents that are more susceptible to engage in irregular activities. Here, we use standard network science methods to study the co-bidding relationships between firms that participate in public tenders issued by the 184 municipalities of the State of Ceará (Brazil) between 2015 and 2019. We identify 22 groups/communities of firms with similar patterns of procurement activity, defined by their geographic and activity scopes. The profiling of the communities allows us to highlight groups that are more susceptible to market manipulation and irregular activities. Our work reinforces the potential application of network analysis in policy to unfold the complex nature of relationships between market agents in a scenario of scarce data.
Submitted 17 April, 2021;
originally announced April 2021.
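A firm-firm co-bidding network of the kind named in the abstract above links two firms whenever they bid on the same tender. The sketch below shows one common way to build such a network as a weighted edge list. It is an illustration under assumptions, not the paper's pipeline: the tender and firm identifiers are invented, the function name `co_bidding_edges` is hypothetical, and weighting edges by the number of shared tenders is an assumed choice that the abstract does not specify.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: tender id -> firms that submitted a bid.
tenders = {
    "tender_1": ["firm_a", "firm_b", "firm_c"],
    "tender_2": ["firm_a", "firm_b"],
    "tender_3": ["firm_c", "firm_d"],
}


def co_bidding_edges(tender_to_firms):
    """Count, for every pair of firms, the tenders on which both bid.

    Keys are alphabetically sorted (firm, firm) tuples, so each undirected
    edge appears once; values are the shared-tender counts (assumed weights).
    """
    edges = Counter()
    for firms in tender_to_firms.values():
        # Deduplicate and sort so (u, v) is always in a canonical order.
        for u, v in combinations(sorted(set(firms)), 2):
            edges[(u, v)] += 1
    return edges


edges = co_bidding_edges(tenders)
```

The resulting edge list can be passed to any graph library for community detection; which algorithm the authors used is not stated in the abstract.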