Showing 1–9 of 9 results for author: Viktor, H

  1. arXiv:2308.06795  [pdf, other]

    cs.CL cs.LG

    Robust Infidelity: When Faithfulness Measures on Masked Language Models Are Misleading

    Authors: Evan Crothers, Herna Viktor, Nathalie Japkowicz

    Abstract: A common approach to quantifying neural text classifier interpretability is to calculate faithfulness metrics based on iteratively masking salient input tokens and measuring changes in the model prediction. We propose that this property is better described as "sensitivity to iterative masking", and highlight pitfalls in using this measure for comparing text classifier interpretability. We show tha…

    Submitted 31 May, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

  2. arXiv:2303.09617  [pdf, ps, other]

    cs.SE cs.AI

    Measuring Improvement of F$_1$-Scores in Detection of Self-Admitted Technical Debt

    Authors: William Aiken, Paul K. Mvula, Paula Branco, Guy-Vincent Jourdan, Mehrdad Sabetzadeh, Herna Viktor

    Abstract: Artificial Intelligence and Machine Learning have witnessed rapid, significant improvements in Natural Language Processing (NLP) tasks. Utilizing Deep Learning, researchers have taken advantage of repository comments in Software Engineering to produce accurate methods for detecting Self-Admitted Technical Debt (SATD) from 20 open-source Java projects' code. In this work, we improve SATD detection…

    Submitted 16 March, 2023; originally announced March 2023.

  3. arXiv:2301.05402  [pdf, other]

    cs.CL cs.LG

    In BLOOM: Creativity and Affinity in Artificial Lyrics and Art

    Authors: Evan Crothers, Herna Viktor, Nathalie Japkowicz

    Abstract: We apply a large multilingual language model (BLOOM-176B) in open-ended generation of Chinese song lyrics, and evaluate the resulting lyrics for coherence and creativity using human reviewers. We find that current computational metrics for evaluating large language model outputs (MAUVE) have limitations in evaluation of creative writing. We note that the human concept of creativity requires lyrics…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: Accepted to AAAI2023 creativeAI workshop

  4. arXiv:2210.07321  [pdf, other]

    cs.CL cs.CR cs.CY cs.LG

    Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods

    Authors: Evan Crothers, Nathalie Japkowicz, Herna Viktor

    Abstract: Machine generated text is increasingly difficult to distinguish from human authored text. Powerful open-source models are freely available, and user-friendly tools that democratize access to generative models are proliferating. ChatGPT, which was released shortly after the first edition of this survey, epitomizes these trends. The great potential of state-of-the-art natural language generation (NL…

    Submitted 7 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Manuscript submitted to ACM Special Session on Trustworthy AI. 2022/11/19 - Updated references

    ACM Class: I.2.7; K.4.2

  5. Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformers

    Authors: Evan Crothers, Nathalie Japkowicz, Herna Viktor, Paula Branco

    Abstract: The detection of computer-generated text is an area of rapidly increasing significance as nascent generative models allow for efficient creation of compelling human-like text, which may be abused for the purposes of spam, disinformation, phishing, or online influence campaigns. Past work has studied detection of current state-of-the-art models, but despite a developing threat landscape, there has…

    Submitted 2 March, 2022; originally announced March 2022.

  6. Towards Ethical Content-Based Detection of Online Influence Campaigns

    Authors: Evan Crothers, Nathalie Japkowicz, Herna Viktor

    Abstract: The detection of clandestine efforts to influence users in online communities is a challenging problem with significant active development. We demonstrate that features derived from the text of user comments are useful for identifying suspect activity, but lead to increased erroneous identifications when keywords over-represented in past influence campaigns are present. Drawing on research in nati…

    Submitted 28 August, 2019; originally announced August 2019.

    Comments: To appear in "Special Session on Machine learning for Knowledge Discovery in the Social Sciences" at IEEE Machine Learning for Signal Processing Workshop (MLSP) 2019

    Journal ref: 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), Pittsburgh, PA, USA, 2019, pp. 1-6

  7. arXiv:1907.04233  [pdf, other]

    cs.LG stat.ML

    Contextual One-Class Classification in Data Streams

    Authors: Richard Hugh Moulton, Herna L. Viktor, Nathalie Japkowicz, João Gama

    Abstract: In machine learning, the one-class classification problem occurs when training instances are only available from one class. It has been observed that making use of this class's structure, or its different contexts, may improve one-class classifier performance. Although this observation has been demonstrated for static data, a rigorous application of the idea within the data stream environment is l…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: 49 pages, 18 figures, 2 appendices

  8. arXiv:1710.02030  [pdf, other]

    stat.ML cs.DB cs.LG

    McDiarmid Drift Detection Methods for Evolving Data Streams

    Authors: Ali Pesaranghader, Herna Viktor, Eric Paquet

    Abstract: Increasingly, Internet of Things (IoT) domains, such as sensor networks, smart cities, and social networks, generate vast amounts of data. Such data are not only unbounded and rapidly evolving. Rather, the content thereof dynamically evolves over time, often in unforeseen ways. These variations are due to so-called concept drifts, caused by changes in the underlying data generation mechanisms. In…

    Submitted 17 January, 2018; v1 submitted 5 October, 2017; originally announced October 2017.

    Comments: 9 pages, 3 figures, 3 tables

  9. arXiv:1709.02457  [pdf, other]

    stat.ML cs.DB cs.LG

    Reservoir of Diverse Adaptive Learners and Stacking Fast Hoeffding Drift Detection Methods for Evolving Data Streams

    Authors: Ali Pesaranghader, Herna Viktor, Eric Paquet

    Abstract: The last decade has seen a surge of interest in adaptive learning algorithms for data stream classification, with applications ranging from predicting ozone level peaks, learning stock market indicators, to detecting computer security violations. In addition, a number of methods have been developed to detect concept drifts in these streams. Consider a scenario where we have a number of classifiers…

    Submitted 7 September, 2017; originally announced September 2017.

    Comments: 42 pages, and 14 figures