
Showing 1–17 of 17 results for author: Japkowicz, N

  1. arXiv:2403.05548  [pdf, other]

    cs.CY cs.AI cs.CL cs.IR cs.LG cs.SI

    Monitoring the evolution of antisemitic discourse on extremist social media using BERT

    Authors: Raza Ul Mustafa, Nathalie Japkowicz

    Abstract: Racism and intolerance on social media contribute to a toxic online environment which may spill offline to foster hatred, and eventually lead to physical violence. That is the case with online antisemitism, the specific category of hatred considered in this study. Tracking antisemitic themes and their associated terminology over time in online discussions could help monitor the sentiments of their…

    Submitted 6 February, 2024; originally announced March 2024.

    Comments: 11 pages; 4 figures; 4 pages

  2. arXiv:2401.10841  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Using LLMs to discover emerging coded antisemitic hate-speech in extremist social media

    Authors: Dhanush Kikkisetti, Raza Ul Mustafa, Wendy Melillo, Roberto Corizzo, Zois Boukouvalas, Jeff Gill, Nathalie Japkowicz

    Abstract: Online hate speech proliferation has created a difficult problem for social media platforms. A particular challenge relates to the use of coded language by groups interested in both creating a sense of belonging for its users and evading detection. Coded language evolves quickly and its use varies over time. This paper proposes a methodology for detecting emerging coded hate-laden terminology. The…

    Submitted 23 January, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

    Comments: 9 pages, 4 figures, 2 algorithms, 3 tables

  3. arXiv:2309.16257  [pdf]

    cs.CV cs.AI eess.IV

    Nondestructive chicken egg fertility detection using CNN-transfer learning algorithms

    Authors: Shoffan Saifullah, Rafal Drezewski, Anton Yudhana, Andri Pranolo, Wilis Kaswijanti, Andiko Putro Suryotomo, Seno Aji Putra, Alin Khaliduzzaman, Anton Satria Prabuwono, Nathalie Japkowicz

    Abstract: This study explored the application of CNN-Transfer Learning for nondestructive chicken egg fertility detection for precision poultry hatchery practices. Four models, VGG16, ResNet50, InceptionNet, and MobileNet, were trained and evaluated on a dataset (200 single egg images) using augmented images (rotation, flip, scale, translation, and reflection). Although the training results demonstrated tha…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 18 pages, 9 figures, 1 table, journal article published

    MSC Class: CS-Class: 68T07; 68T45; 68U10; ICT-Class: 94A08

    ACM Class: I.2; I.4; I.5

    Journal ref: Jurnal Ilmiah Teknik Elektro Komputer dan Informatika (JITEKI), Vol 9, No 3 (2023)

  4. arXiv:2308.06795  [pdf, other]

    cs.CL cs.LG

    Robust Infidelity: When Faithfulness Measures on Masked Language Models Are Misleading

    Authors: Evan Crothers, Herna Viktor, Nathalie Japkowicz

    Abstract: A common approach to quantifying neural text classifier interpretability is to calculate faithfulness metrics based on iteratively masking salient input tokens and measuring changes in the model prediction. We propose that this property is better described as "sensitivity to iterative masking", and highlight pitfalls in using this measure for comparing text classifier interpretability. We show tha…

    Submitted 31 May, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

  5. A Semi-Supervised Framework for Misinformation Detection

    Authors: Yueyang Liu, Zois Boukouvalas, Nathalie Japkowicz

    Abstract: The spread of misinformation in social media outlets has become a prevalent societal problem and is the cause of many kinds of social unrest. Curtailing its prevalence is of great importance and machine learning has shown significant promise. However, there are two main challenges when applying machine learning to this problem. First, while much too prevalent in one respect, misinformation, actual…

    Submitted 22 April, 2023; originally announced April 2023.

    Journal ref: In: Soares, C., Torgo, L. (eds) Discovery Science. DS 2021. Lecture Notes in Computer Science, vol 12986. Springer, Cham

  6. arXiv:2303.11076  [pdf, other]

    cs.LG cs.AI

    From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning

    Authors: Kamil Faber, Dominik Zurek, Marcin Pietron, Nathalie Japkowicz, Antonio Vergari, Roberto Corizzo

    Abstract: Continual learning (CL) is one of the most promising trends in recent machine learning research. Its goal is to go beyond classical assumptions in machine learning and develop models and learning strategies that present high robustness in dynamic environments. The landscape of CL research is fragmented into several learning evaluation protocols, comprising different learning tasks, datasets, and e…

    Submitted 16 March, 2023; originally announced March 2023.

  7. Lifelong Continual Learning for Anomaly Detection: New Challenges, Perspectives, and Insights

    Authors: Kamil Faber, Roberto Corizzo, Bartlomiej Sniezynski, Nathalie Japkowicz

    Abstract: Anomaly detection is of paramount importance in many real-world domains, characterized by evolving behavior. Lifelong learning represents an emerging trend, answering the need for machine learning models that continuously adapt to new challenges in dynamic environments while retaining past knowledge. However, limited efforts are dedicated to building foundations for lifelong anomaly detection, whi…

    Submitted 2 April, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

    Journal ref: IEEE Access, vol. 12, pp. 41364-41380, 2024

  8. arXiv:2301.05402  [pdf, other]

    cs.CL cs.LG

    In BLOOM: Creativity and Affinity in Artificial Lyrics and Art

    Authors: Evan Crothers, Herna Viktor, Nathalie Japkowicz

    Abstract: We apply a large multilingual language model (BLOOM-176B) in open-ended generation of Chinese song lyrics, and evaluate the resulting lyrics for coherence and creativity using human reviewers. We find that current computational metrics for evaluating large language model outputs (MAUVE) have limitations in evaluation of creative writing. We note that the human concept of creativity requires lyrics…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: Accepted to AAAI2023 creativeAI workshop

  9. System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games

    Authors: Indranil Sur, Zachary Daniels, Abrar Rahman, Kamil Faber, Gianmarco J. Gallardo, Tyler L. Hayes, Cameron E. Taylor, Mustafa Burak Gurbuz, James Smith, Sahana Joshi, Nathalie Japkowicz, Michael Baron, Zsolt Kira, Christopher Kanan, Roberto Corizzo, Ajay Divakaran, Michael Piacentino, Jesse Hostetler, Aswin Raghavan

    Abstract: As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically-changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new ta…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: The Second International Conference on AIML Systems, October 12–15, 2022, Bangalore, India

  10. arXiv:2210.07321  [pdf, other]

    cs.CL cs.CR cs.CY cs.LG

    Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods

    Authors: Evan Crothers, Nathalie Japkowicz, Herna Viktor

    Abstract: Machine generated text is increasingly difficult to distinguish from human authored text. Powerful open-source models are freely available, and user-friendly tools that democratize access to generative models are proliferating. ChatGPT, which was released shortly after the first edition of this survey, epitomizes these trends. The great potential of state-of-the-art natural language generation (NL…

    Submitted 7 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Manuscript submitted to ACM Special Session on Trustworthy AI. 2022/11/19 - Updated references

    ACM Class: I.2.7; K.4.2

  11. Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformers

    Authors: Evan Crothers, Nathalie Japkowicz, Herna Viktor, Paula Branco

    Abstract: The detection of computer-generated text is an area of rapidly increasing significance as nascent generative models allow for efficient creation of compelling human-like text, which may be abused for the purposes of spam, disinformation, phishing, or online influence campaigns. Past work has studied detection of current state-of-the-art models, but despite a developing threat landscape, there has…

    Submitted 2 March, 2022; originally announced March 2022.

  12. WATCH: Wasserstein Change Point Detection for High-Dimensional Time Series Data

    Authors: Kamil Faber, Roberto Corizzo, Bartlomiej Sniezynski, Michael Baron, Nathalie Japkowicz

    Abstract: Detecting relevant changes in dynamic time series data in a timely manner is crucially important for many data analysis tasks in real-world settings. Change point detection methods have the ability to discover changes in an unsupervised fashion, which represents a desirable property in the analysis of unbounded and unlabeled data streams. However, one limitation of most of the existing approaches…

    Submitted 18 January, 2022; originally announced January 2022.

    Journal ref: 2021 IEEE International Conference on Big Data (Big Data)

  13. arXiv:2107.14194  [pdf, other]

    cs.LG

    On the combined effect of class imbalance and concept complexity in deep learning

    Authors: Kushankur Ghosh, Colin Bellinger, Roberto Corizzo, Bartosz Krawczyk, Nathalie Japkowicz

    Abstract: Structural concept complexity, class overlap, and data scarcity are some of the most important factors influencing the performance of classifiers under class imbalance conditions. When these effects were uncovered in the early 2000s, understandably, the classifiers on which they were demonstrated belonged to the classical rather than Deep Learning categories of approaches. As Deep Learning is gain…

    Submitted 29 July, 2021; originally announced July 2021.

  14. arXiv:2012.02312  [pdf, other]

    cs.LG

    ReMix: Calibrated Resampling for Class Imbalance in Deep learning

    Authors: Colin Bellinger, Roberto Corizzo, Nathalie Japkowicz

    Abstract: Class imbalance is a problem of significant importance in applied deep learning where trained models are exploited for decision support and automated decisions in critical areas such as health and medicine, transportation, and finance. The challenge of learning deep models from imbalanced training data remains high, and the state-of-the-art solutions are typically data dependent and primarily focu…

    Submitted 3 December, 2020; originally announced December 2020.

  15. arXiv:2006.01284  [pdf, ps, other]

    cs.LG stat.ML

    Independent Component Analysis for Trustworthy Cyberspace during High Impact Events: An Application to Covid-19

    Authors: Zois Boukouvalas, Christine Mallinson, Evan Crothers, Nathalie Japkowicz, Aritran Piplai, Sudip Mittal, Anupam Joshi, Tülay Adalı

    Abstract: Social media has become an important communication channel during high impact events, such as the COVID-19 pandemic. As misinformation in social media can rapidly spread, creating social unrest, curtailing the spread of misinformation during such events is a significant data challenge. While recent solutions that are based on machine learning have shown promise for the detection of misinformation,…

    Submitted 30 June, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

  16. Towards Ethical Content-Based Detection of Online Influence Campaigns

    Authors: Evan Crothers, Nathalie Japkowicz, Herna Viktor

    Abstract: The detection of clandestine efforts to influence users in online communities is a challenging problem with significant active development. We demonstrate that features derived from the text of user comments are useful for identifying suspect activity, but lead to increased erroneous identifications when keywords over-represented in past influence campaigns are present. Drawing on research in nati…

    Submitted 28 August, 2019; originally announced August 2019.

    Comments: To appear in "Special Session on Machine learning for Knowledge Discovery in the Social Sciences" at IEEE Machine Learning for Signal Processing Workshop (MLSP) 2019

    Journal ref: 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing (MLSP), Pittsburgh, PA, USA, 2019, pp. 1-6

  17. arXiv:1907.04233  [pdf, other]

    cs.LG stat.ML

    Contextual One-Class Classification in Data Streams

    Authors: Richard Hugh Moulton, Herna L. Viktor, Nathalie Japkowicz, João Gama

    Abstract: In machine learning, the one-class classification problem occurs when training instances are only available from one class. It has been observed that making use of this class's structure, or its different contexts, may improve one-class classifier performance. Although this observation has been demonstrated for static data, a rigorous application of the idea within the data stream environment is l…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: 49 pages, 18 figures, 2 appendices