
Showing 1–6 of 6 results for author: Dozier, G

  1. arXiv:2406.11109  [pdf, other]

    cs.CL cs.AI cs.LG

    Investigating Annotator Bias in Large Language Models for Hate Speech Detection

    Authors: Amit Das, Zheng Zhang, Fatemeh Jamshidi, Vinija Jain, Aman Chadha, Nilanjana Raychawdhary, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals

    Abstract: Data annotation, the practice of assigning descriptive labels to raw data, is pivotal in optimizing the performance of machine learning models. However, it is a resource-intensive process susceptible to biases introduced by annotators. The emergence of sophisticated Large Language Models (LLMs), such as ChatGPT, presents a unique opportunity to modernize and streamline this complex procedure. While ex…

    Submitted 18 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2403.02472  [pdf, other]

    cs.CL

    OffensiveLang: A Community Based Implicit Offensive Language Dataset

    Authors: Amit Das, Mostafa Rahgouy, Dongji Feng, Zheng Zhang, Tathagata Bhattacharya, Nilanjana Raychawdhary, Fatemeh Jamshidi, Vinija Jain, Aman Chadha, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals

    Abstract: The widespread presence of hateful language on social media has resulted in adverse effects on societal well-being. As a result, addressing this issue has become a high priority. Hate speech or offensive language exists in both explicit and implicit forms, with the latter being more challenging to detect. Current research in this domain encounters several challenges. Firstly, th…

    Submitted 17 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  3. arXiv:2103.15897  [pdf]

    cs.CR cs.CV

    Automating Defense Against Adversarial Attacks: Discovery of Vulnerabilities and Application of Multi-INT Imagery to Protect Deployed Models

    Authors: Josh Kalin, David Noever, Matthew Ciolino, Dominick Hambrick, Gerry Dozier

    Abstract: Image classification is a common step in image recognition for machine learning in overhead applications. When applying popular model architectures like MobileNetV2, known vulnerabilities expose the model to counter-attacks, either mislabeling a known class or altering box location. This work proposes an automated approach to defend these models. We evaluate the use of multi-spectral image arrays…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: SPIE 2021, 8 Pages, 6 Figures

  4. arXiv:2101.01628  [pdf]

    cs.CL cs.LG

    Local Translation Services for Neglected Languages

    Authors: David Noever, Josh Kalin, Matt Ciolino, Dom Hambrick, Gerry Dozier

    Abstract: Taking advantage of computationally lightweight but high-quality translators prompts consideration of new applications that address neglected languages. Locally run translators for less popular languages may assist data projects with protected or personal data that may require specific compliance checks before posting to a public translation API, but which could render reasonable, cost-effective s…

    Submitted 13 January, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

  5. arXiv:2009.03136  [pdf, other]

    cs.LG stat.ML

    Black Box to White Box: Discover Model Characteristics Based on Strategic Probing

    Authors: Josh Kalin, Matthew Ciolino, David Noever, Gerry Dozier

    Abstract: In Machine Learning, White Box Adversarial Attacks rely on underlying knowledge of the model attributes. This work focuses on discovering two distinct pieces of model information: the underlying architecture and primary training dataset. With the process in this paper, a structured set of input probes and the output of the model become the training data for a deep classifier. Two subdo…

    Submitted 7 September, 2020; originally announced September 2020.

    Comments: 4 Pages, 3 Figures, IEEE Format, Ai4i 2020

  6. arXiv:2006.11130  [pdf]

    cs.CR cs.LG stat.ML

    Systematic Attack Surface Reduction For Deployed Sentiment Analysis Models

    Authors: Josh Kalin, David Noever, Gerry Dozier

    Abstract: This work proposes a structured approach to baselining a model, identifying attack vectors, and securing the machine learning models after deployment. This method for securing each model post deployment is called the BAD (Build, Attack, and Defend) Architecture. Two implementations of the BAD architecture are evaluated to quantify the adversarial life cycle for a black box Sentiment Analysis syste…

    Submitted 19 June, 2020; originally announced June 2020.

    Comments: 11 pages, 4 figures, 6th International Conference on Data Mining