Showing 1–2 of 2 results for author: Tafvizi, A

Searching in archive cs.
  1. arXiv:2406.13660  [pdf, other]

    cs.CL cs.AI

    Towards Minimal Targeted Updates of Language Models with Targeted Negative Training

    Authors: Lily H. Zhang, Rajesh Ranganath, Arya Tafvizi

    Abstract: Generative models of language exhibit impressive capabilities but still place non-negligible probability mass over undesirable outputs. In this work, we address the task of updating a model to avoid unwanted outputs while minimally changing model behavior otherwise, a challenge we refer to as a minimal targeted update. We first formalize the notion of a minimal targeted update and propose a method…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Published in Transactions of Machine Learning Research

  2. arXiv:2205.11781  [pdf, other]

    cs.LG

    Attributing AUC-ROC to Analyze Binary Classifier Performance

    Authors: Arya Tafvizi, Besim Avci, Mukund Sundararajan

    Abstract: Area Under the Receiver Operating Characteristic Curve (AUC-ROC) is a popular evaluation metric for binary classifiers. In this paper, we discuss techniques to segment the AUC-ROC along human-interpretable dimensions. AUC-ROC is not an additive/linear function over the data samples; therefore, such segmentation of the overall AUC-ROC is different from tabulating the AUC-ROC of data segments. To segment…

    Submitted 24 May, 2022; originally announced May 2022.