-
Improving Interpretability and Robustness for the Detection of AI-Generated Images
Authors:
Tatiana Gaintseva,
Laida Kushnareva,
German Magai,
Irina Piontkovskaya,
Sergey Nikolenko,
Martin Benning,
Serguei Barannikov,
Gregory Slabaugh
Abstract:
With growing abilities of generative models, artificial content detection becomes an increasingly important and difficult task. However, all popular approaches to this problem suffer from poor generalization across domains and generative models. In this work, we focus on the robustness of AI-generated image (AIGI) detectors. We analyze existing state-of-the-art AIGI detection methods based on froz…
▽ More
With growing abilities of generative models, artificial content detection becomes an increasingly important and difficult task. However, all popular approaches to this problem suffer from poor generalization across domains and generative models. In this work, we focus on the robustness of AI-generated image (AIGI) detectors. We analyze existing state-of-the-art AIGI detection methods based on frozen CLIP embeddings and show how to interpret them, shedding light on how images produced by various AI generators differ from real ones. Next we propose two ways to improve robustness: based on removing harmful components of the embedding vector and based on selecting the best performing attention heads in the image encoder model. Our methods increase the mean out-of-distribution (OOD) classification score by up to 6% for cross-model transfer. We also propose a new dataset for AIGI detection and use it in our evaluation; we believe this dataset will help boost further research. The dataset and code are provided as a supplement.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule
Authors:
Andrey Bout,
Alexander Podolskiy,
Sergey Nikolenko,
Irina Piontkovskaya
Abstract:
Progress in neural grammatical error correction (GEC) is hindered by the lack of annotated training data. Sufficient amounts of high-quality manually annotated data are not available, so recent research has relied on generating synthetic data, pretraining on it, and then fine-tuning on real datasets; performance gains have been achieved either by ensembling or by using huge pretrained models such…
▽ More
Progress in neural grammatical error correction (GEC) is hindered by the lack of annotated training data. Sufficient amounts of high-quality manually annotated data are not available, so recent research has relied on generating synthetic data, pretraining on it, and then fine-tuning on real datasets; performance gains have been achieved either by ensembling or by using huge pretrained models such as XXL-T5 as the backbone. In this work, we explore an orthogonal direction: how to use available data more efficiently. First, we propose auxiliary tasks that exploit the alignment between the original and corrected sentences, such as predicting a sequence of corrections. We formulate each task as a sequence-to-sequence problem and perform multi-task training. Second, we discover that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance, so we set out to find the best training schedule. Together, these two ideas lead to significant improvements, producing results that improve state of the art with much smaller models; in particular, we outperform the best models based on T5-XXL (11B parameters) with a BART-based model (400M parameters).
△ Less
Submitted 20 November, 2023;
originally announced November 2023.
-
AI-generated text boundary detection with RoFT
Authors:
Laida Kushnareva,
Tatiana Gaintseva,
German Magai,
Serguei Barannikov,
Dmitry Abulkhanov,
Kristian Kuznetsov,
Eduard Tulchinskii,
Irina Piontkovskaya,
Sergey Nikolenko
Abstract:
Due to the rapid development of large language models, people increasingly often encounter texts that may start as written by a human but continue as machine-generated. Detecting the boundary between human-written and machine-generated parts of such texts is a challenging problem that has not received much attention in literature. We attempt to bridge this gap and examine several ways to adapt sta…
▽ More
Due to the rapid development of large language models, people increasingly often encounter texts that may start as written by a human but continue as machine-generated. Detecting the boundary between human-written and machine-generated parts of such texts is a challenging problem that has not received much attention in literature. We attempt to bridge this gap and examine several ways to adapt state of the art artificial text detection classifiers to the boundary detection setting. We push all detectors to their limits, using the Real or Fake text benchmark that contains short texts on several topics and includes generations of various language models. We use this diversity to deeply examine the robustness of all detectors in cross-domain and cross-model settings to provide baselines and insights for future research. In particular, we find that perplexity-based approaches to boundary detection tend to be more robust to peculiarities of domain-specific data than supervised fine-tuning of the RoBERTa model; we also find which features of the text confuse boundary detection algorithms and negatively influence their performance in cross-domain settings.
△ Less
Submitted 2 April, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding
Authors:
Konstantin Yakovlev,
Alexander Podolskiy,
Andrey Bout,
Sergey Nikolenko,
Irina Piontkovskaya
Abstract:
Grammatical error correction (GEC) is an important NLP task that is currently usually solved with autoregressive sequence-to-sequence models. However, approaches of this class are inherently slow due to one-by-one token generation, so non-autoregressive alternatives are needed. In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation ne…
▽ More
Grammatical error correction (GEC) is an important NLP task that is currently usually solved with autoregressive sequence-to-sequence models. However, approaches of this class are inherently slow due to one-by-one token generation, so non-autoregressive alternatives are needed. In this work, we propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network that outputs a self-attention weight matrix that can be used in beam search to find the best permutation of input tokens (with auxiliary {ins} tokens) and a decoder network based on a step-unrolled denoising autoencoder that fills in specific tokens. This allows us to find the token permutation after only one forward pass of the permutation network, avoiding autoregressive constructions. We show that the resulting network improves over previously known non-autoregressive methods for GEC and reaches the level of autoregressive methods that do not use language-specific synthetic data generation methods. Our results are supported by a comprehensive experimental validation on the ConLL-2014 and Write&Improve+LOCNESS datasets and an extensive ablation study that supports our architectural and algorithmic choices.
△ Less
Submitted 14 November, 2023;
originally announced November 2023.
-
Sinkhorn Transformations for Single-Query Postprocessing in Text-Video Retrieval
Authors:
Konstantin Yakovlev,
Gregory Polyakov,
Ilseyar Alimova,
Alexander Podolskiy,
Andrey Bout,
Sergey Nikolenko,
Irina Piontkovskaya
Abstract:
A recent trend in multimodal retrieval is related to postprocessing test set results via the dual-softmax loss (DSL). While this approach can bring significant improvements, it usually presumes that an entire matrix of test samples is available as DSL input. This work introduces a new postprocessing approach based on Sinkhorn transformations that outperforms DSL. Further, we propose a new postproc…
▽ More
A recent trend in multimodal retrieval is related to postprocessing test set results via the dual-softmax loss (DSL). While this approach can bring significant improvements, it usually presumes that an entire matrix of test samples is available as DSL input. This work introduces a new postprocessing approach based on Sinkhorn transformations that outperforms DSL. Further, we propose a new postprocessing setting that does not require access to multiple test queries. We show that our approach can significantly improve the results of state of the art models such as CLIP4Clip, BLIP, X-CLIP, and DRL, thus achieving a new state-of-the-art on several standard text-video retrieval datasets both with access to the entire test set and in the single-query setting.
△ Less
Submitted 14 November, 2023;
originally announced November 2023.
-
Intrinsic Dimension Estimation for Robust Detection of AI-Generated Texts
Authors:
Eduard Tulchinskii,
Kristian Kuznetsov,
Laida Kushnareva,
Daniil Cherniavskii,
Serguei Barannikov,
Irina Piontkovskaya,
Sergey Nikolenko,
Evgeny Burnaev
Abstract:
Rapidly increasing quality of AI-generated content makes it difficult to distinguish between human and AI-generated texts, which may lead to undesirable consequences for society. Therefore, it becomes increasingly important to study the properties of human texts that are invariant over different text domains and varying proficiency of human writers, can be easily calculated for any language, and c…
▽ More
Rapidly increasing quality of AI-generated content makes it difficult to distinguish between human and AI-generated texts, which may lead to undesirable consequences for society. Therefore, it becomes increasingly important to study the properties of human texts that are invariant over different text domains and varying proficiency of human writers, can be easily calculated for any language, and can robustly separate natural and AI-generated texts regardless of the generation model and sampling method. In this work, we propose such an invariant for human-written texts, namely the intrinsic dimensionality of the manifold underlying the set of embeddings for a given text sample. We show that the average intrinsic dimensionality of fluent texts in a natural language is hovering around the value $9$ for several alphabet-based languages and around $7$ for Chinese, while the average intrinsic dimensionality of AI-generated texts for each language is $\approx 1.5$ lower, with a clear statistical separation between human-generated and AI-generated distributions. This property allows us to build a score-based artificial text detector. The proposed detector's accuracy is stable over text domains, generator models, and human writer proficiency levels, outperforming SOTA detectors in model-agnostic and cross-domain scenarios by a significant margin.
△ Less
Submitted 31 October, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Can BERT eat RuCoLA? Topological Data Analysis to Explain
Authors:
Irina Proskurina,
Irina Piontkovskaya,
Ekaterina Artemova
Abstract:
This paper investigates how Transformer language models (LMs) fine-tuned for acceptability classification capture linguistic features. Our approach uses the best practices of topological data analysis (TDA) in NLP: we construct directed attention graphs from attention matrices, derive topological features from them, and feed them to linear classifiers. We introduce two novel features, chordality,…
▽ More
This paper investigates how Transformer language models (LMs) fine-tuned for acceptability classification capture linguistic features. Our approach uses the best practices of topological data analysis (TDA) in NLP: we construct directed attention graphs from attention matrices, derive topological features from them, and feed them to linear classifiers. We introduce two novel features, chordality, and the matching number, and show that TDA-based classifiers outperform fine-tuning baselines. We experiment with two datasets, CoLA and RuCoLA in English and Russian, typologically different languages. On top of that, we propose several black-box introspection techniques aimed at detecting changes in the attention mode of the LMs during fine-tuning, defining the LM's prediction confidences, and associating individual heads with fine-grained grammar phenomena. Our results contribute to understanding the behavior of monolingual LMs in the acceptability classification task, provide insights into the functional roles of attention heads, and highlight the advantages of TDA-based approaches for analyzing LMs. We release the code and the experimental results for further uptake.
△ Less
Submitted 4 April, 2023;
originally announced April 2023.
-
PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing
Authors:
Xiaozhe Ren,
**yi Zhou,
Xinfan Meng,
Xin**g Huang,
Yadao Wang,
Weichao Wang,
Pengfei Li,
Xiaoda Zhang,
Alexander Podolskiy,
Grigory Arshinov,
Andrey Bout,
Irina Piontkovskaya,
Jiansheng Wei,
Xin Jiang,
Teng Su,
Qun Liu,
Jun Yao
Abstract:
The scaling of large language models has greatly improved natural language understanding, generation, and reasoning. In this work, we develop a system that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors and MindSpore framework, and present the language model with 1.085T parameters named PanGu-Σ. With parameter inherent from PanGu-α, we extend the dense Transfo…
▽ More
The scaling of large language models has greatly improved natural language understanding, generation, and reasoning. In this work, we develop a system that trained a trillion-parameter language model on a cluster of Ascend 910 AI processors and MindSpore framework, and present the language model with 1.085T parameters named PanGu-Σ. With parameter inherent from PanGu-α, we extend the dense Transformer model to sparse one with Random Routed Experts (RRE), and efficiently train the model over 329B tokens by using Expert Computation and Storage Separation(ECSS). This resulted in a 6.3x increase in training throughput through heterogeneous computing. Our experimental findings show that PanGu-Σ provides state-of-the-art performance in zero-shot learning of various Chinese NLP downstream tasks. Moreover, it demonstrates strong abilities when fine-tuned in application data of open-domain dialogue, question answering, machine translation and code generation.
△ Less
Submitted 19 March, 2023;
originally announced March 2023.
-
Topological Data Analysis for Speech Processing
Authors:
Eduard Tulchinskii,
Kristian Kuznetsov,
Laida Kushnareva,
Daniil Cherniavskii,
Serguei Barannikov,
Irina Piontkovskaya,
Sergey Nikolenko,
Evgeny Burnaev
Abstract:
We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT. To this end, we introduce a number of topological and algebraic features derived from Transformer attention maps and embeddings. We show that a simple linear classifier built on top of such features outperforms a fine-tuned classification head. In particular, we…
▽ More
We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT. To this end, we introduce a number of topological and algebraic features derived from Transformer attention maps and embeddings. We show that a simple linear classifier built on top of such features outperforms a fine-tuned classification head. In particular, we achieve an improvement of about $9\%$ accuracy and $5\%$ ERR on four common datasets; on CREMA-D, the proposed feature set reaches a new state of the art performance with accuracy $80.155$. We also show that topological features are able to reveal functional roles of speech Transformer heads; e.g., we find the heads capable to distinguish between pairs of sample sources (natural/synthetic) or voices without any downstream fine-tuning. Our results demonstrate that TDA is a promising new approach for speech analysis, especially for tasks that require structural prediction. Appendices, an introduction to TDA, and other additional materials are available here - https://topohubert.github.io/speech-topology-webpages/
△ Less
Submitted 6 June, 2023; v1 submitted 30 November, 2022;
originally announced November 2022.
-
Betti numbers of attention graphs is all you really need
Authors:
Laida Kushnareva,
Dmitri Piontkovski,
Irina Piontkovskaya
Abstract:
We apply methods of topological analysis to the attention graphs, calculated on the attention heads of the BERT model ( arXiv:1810.04805v2 ). Our research shows that the classifier built upon basic persistent topological features (namely, Betti numbers) of the trained neural network can achieve classification results on par with the conventional classification method. We show the relevance of such…
▽ More
We apply methods of topological analysis to the attention graphs, calculated on the attention heads of the BERT model ( arXiv:1810.04805v2 ). Our research shows that the classifier built upon basic persistent topological features (namely, Betti numbers) of the trained neural network can achieve classification results on par with the conventional classification method. We show the relevance of such topological text representation on three text classification benchmarks. For the best of our knowledge, it is the first attempt to analyze the topology of an attention-based neural network, widely used for Natural Language Processing.
△ Less
Submitted 5 July, 2022;
originally announced July 2022.
-
Template-based Approach to Zero-shot Intent Recognition
Authors:
Dmitry Lamanov,
Pavel Burnyshev,
Ekaterina Artemova,
Valentin Malykh,
Andrey Bout,
Irina Piontkovskaya
Abstract:
The recent advances in transfer learning techniques and pre-training of large contextualized encoders foster innovation in real-life applications, including dialog assistants. Practical needs of intent recognition require effective data usage and the ability to constantly update supported intents, adopting new ones, and abandoning outdated ones. In particular, the generalized zero-shot paradigm, i…
▽ More
The recent advances in transfer learning techniques and pre-training of large contextualized encoders foster innovation in real-life applications, including dialog assistants. Practical needs of intent recognition require effective data usage and the ability to constantly update supported intents, adopting new ones, and abandoning outdated ones. In particular, the generalized zero-shot paradigm, in which the model is trained on the seen intents and tested on both seen and unseen intents, is taking on new importance. In this paper, we explore the generalized zero-shot setup for intent recognition. Following best practices for zero-shot text classification, we treat the task with a sentence pair modeling approach. We outperform previous state-of-the-art f1-measure by up to 16\% for unseen intents, using intent labels and user utterances and without accessing external sources (such as knowledge bases). Further enhancement includes lexicalization of intent labels, which improves performance by up to 7\%. By using task transferring from other sentence pair tasks, such as Natural Language Inference, we gain additional improvements.
△ Less
Submitted 22 June, 2022;
originally announced June 2022.
-
Acceptability Judgements via Examining the Topology of Attention Maps
Authors:
Daniil Cherniavskii,
Eduard Tulchinskii,
Vladislav Mikhailov,
Irina Proskurina,
Laida Kushnareva,
Ekaterina Artemova,
Serguei Barannikov,
Irina Piontkovskaya,
Dmitri Piontkovski,
Evgeny Burnaev
Abstract:
The role of the attention mechanism in encoding linguistic knowledge has received special interest in NLP. However, the ability of the attention heads to judge the grammatical acceptability of a sentence has been underexplored. This paper approaches the paradigm of acceptability judgments with topological data analysis (TDA), showing that the geometric properties of the attention graph can be effi…
▽ More
The role of the attention mechanism in encoding linguistic knowledge has received special interest in NLP. However, the ability of the attention heads to judge the grammatical acceptability of a sentence has been underexplored. This paper approaches the paradigm of acceptability judgments with topological data analysis (TDA), showing that the geometric properties of the attention graph can be efficiently exploited for two standard practices in linguistics: binary judgments and linguistic minimal pairs. Topological features enhance the BERT-based acceptability classifier scores by $8$%-$24$% on CoLA in three languages (English, Italian, and Swedish). By revealing the topological discrepancy between attention maps of minimal pairs, we achieve the human-level performance on the BLiMP benchmark, outperforming nine statistical and Transformer LM baselines. At the same time, TDA provides the foundation for analyzing the linguistic functions of attention heads and interpreting the correspondence between the graph features and grammatical phenomena.
△ Less
Submitted 23 October, 2022; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Artificial Text Detection via Examining the Topology of Attention Maps
Authors:
Laida Kushnareva,
Daniil Cherniavskii,
Vladislav Mikhailov,
Ekaterina Artemova,
Serguei Barannikov,
Alexander Bernstein,
Irina Piontkovskaya,
Dmitri Piontkovski,
Evgeny Burnaev
Abstract:
The impressive capabilities of recent generative models to create texts that are challenging to distinguish from the human-written ones can be misused for generating fake news, product reviews, and even abusive content. Despite the prominent performance of existing methods for artificial text detection, they still lack interpretability and robustness towards unseen models. To this end, we propose…
▽ More
The impressive capabilities of recent generative models to create texts that are challenging to distinguish from the human-written ones can be misused for generating fake news, product reviews, and even abusive content. Despite the prominent performance of existing methods for artificial text detection, they still lack interpretability and robustness towards unseen models. To this end, we propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA) which is currently understudied in the field of NLP. We empirically show that the features derived from the BERT model outperform count- and neural-based baselines up to 10\% on three common datasets, and tend to be the most robust towards unseen GPT-style generation models as opposed to existing methods. The probing analysis of the features reveals their sensitivity to the surface and syntactic properties. The results demonstrate that TDA is a promising line with respect to NLP tasks, specifically the ones that incorporate surface and structural information.
△ Less
Submitted 28 April, 2022; v1 submitted 10 September, 2021;
originally announced September 2021.
-
A Single Example Can Improve Zero-Shot Data Generation
Authors:
Pavel Burnyshev,
Valentin Malykh,
Andrey Bout,
Ekaterina Artemova,
Irina Piontkovskaya
Abstract:
Sub-tasks of intent classification, such as robustness to distribution shift, adaptation to specific user groups and personalization, out-of-domain detection, require extensive and flexible datasets for experiments and evaluation. As collecting such datasets is time- and labor-consuming, we propose to use text generation methods to gather datasets. The generator should be trained to generate utter…
▽ More
Sub-tasks of intent classification, such as robustness to distribution shift, adaptation to specific user groups and personalization, out-of-domain detection, require extensive and flexible datasets for experiments and evaluation. As collecting such datasets is time- and labor-consuming, we propose to use text generation methods to gather datasets. The generator should be trained to generate utterances that belong to the given intent. We explore two approaches to generating task-oriented utterances. In the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training. In the one-shot approach, the model is presented with a single utterance from a test intent. We perform a thorough automatic, and human evaluation of the dataset generated utilizing two proposed approaches. Our results reveal that the attributes of the generated data are close to original test sets, collected via crowd-sourcing.
△ Less
Submitted 16 August, 2021;
originally announced August 2021.
-
Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection
Authors:
Alexander Podolskiy,
Dmitry Lipin,
Andrey Bout,
Ekaterina Artemova,
Irina Piontkovskaya
Abstract:
Real-life applications, heavily relying on machine learning, such as dialog systems, demand out-of-domain detection methods. Intent classification models should be equipped with a mechanism to distinguish seen intents from unseen ones so that the dialog agent is capable of rejecting the latter and avoiding undesired behavior. However, despite increasing attention paid to the task, the best practic…
▽ More
Real-life applications, heavily relying on machine learning, such as dialog systems, demand out-of-domain detection methods. Intent classification models should be equipped with a mechanism to distinguish seen intents from unseen ones so that the dialog agent is capable of rejecting the latter and avoiding undesired behavior. However, despite increasing attention paid to the task, the best practices for out-of-domain intent detection have not yet been fully established.
This paper conducts a thorough comparison of out-of-domain intent detection methods. We prioritize the methods, not requiring access to out-of-domain data during training, gathering of which is extremely time- and labor-consuming due to lexical and stylistic variation of user utterances. We evaluate multiple contextual encoders and methods, proven to be efficient, on three standard datasets for intent classification, expanded with out-of-domain utterances. Our main findings show that fine-tuning Transformer-based encoders on in-domain data leads to superior results. Mahalanobis distance, together with utterance representations, derived from Transformer-based encoders, outperforms other methods by a wide margin and establishes new state-of-the-art results for all datasets.
The broader analysis shows that the reason for success lies in the fact that the fine-tuned Transformer is capable of constructing homogeneous representations of in-domain utterances, revealing geometrical disparity to out of domain utterances. In turn, the Mahalanobis distance captures this disparity easily.
The code is available in our GitHub repo: https://github.com/huawei-noah/noah-research/tree/master/Maha_OOD .
△ Less
Submitted 23 May, 2022; v1 submitted 11 January, 2021;
originally announced January 2021.
-
Differentially Private Distributed Learning for Language Modeling Tasks
Authors:
Vadim Popov,
Mikhail Kudinov,
Irina Piontkovskaya,
Petr Vytovtov,
Alex Nevidomsky
Abstract:
One of the big challenges in machine learning applications is that training data can be different from the real-world data faced by the algorithm. In language modeling, users' language (e.g. in private messaging) could change in a year and be completely different from what we observe in publicly available data. At the same time, public data can be used for obtaining general knowledge (i.e. general…
▽ More
One of the big challenges in machine learning applications is that training data can be different from the real-world data faced by the algorithm. In language modeling, users' language (e.g. in private messaging) could change in a year and be completely different from what we observe in publicly available data. At the same time, public data can be used for obtaining general knowledge (i.e. general model of English). We study approaches to distributed fine-tuning of a general model on user private data with the additional requirements of maintaining the quality on the general data and minimization of communication costs. We propose a novel technique that significantly improves prediction quality on users' language compared to a general model and outperforms gradient compression methods in terms of communication efficiency. The proposed procedure is fast and leads to an almost 70% perplexity reduction and 8.7 percentage point improvement in keystroke saving rate on informal English texts. We also show that the range of tasks our approach is applicable to is not limited by language modeling only. Finally, we propose an experimental framework for evaluating differential privacy of distributed training of language models and show that our approach has good privacy guarantees.
△ Less
Submitted 6 March, 2018; v1 submitted 20 December, 2017;
originally announced December 2017.