Showing 1–50 of 148 results for author: Lipton, Z

  1. arXiv:2407.02694  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    LLM-Select: Feature Selection with Large Language Models

    Authors: Daniel P. Jeong, Zachary C. Lipton, Pradeep Ravikumar

    Abstract: In this paper, we demonstrate a surprising capability of large language models (LLMs): given only input feature names and a description of a prediction task, they are capable of selecting the most predictive features, with performance rivaling the standard tools of data science. Remarkably, these models exhibit this capacity across various query mechanisms. For example, we zero-shot prompt an LLM…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Preprint

  2. arXiv:2406.09358  [pdf, other]

    cs.LG

    Understanding Hallucinations in Diffusion Models through Mode Interpolation

    Authors: Sumukh K Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

    Abstract: Colloquially speaking, image generation models based upon diffusion processes are frequently said to exhibit "hallucinations," samples that could never occur in the training data. But where do such hallucinations come from? In this paper, we study a particular failure mode in diffusion models, which we term mode interpolation. Specifically, we find that diffusion models smoothly "interpolate" betw…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2406.09264  [pdf, other]

    cs.HC cs.AI cs.CL

    Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions

    Authors: Hua Shen, Tiffany Knearem, Reshmi Ghosh, Kenan Alkiek, Kundan Krishna, Yachuan Liu, Ziqiao Ma, Savvas Petridis, Yi-Hao Peng, Li Qiwei, Sushrita Rakshit, Chenglei Si, Yutong Xie, Jeffrey P. Bigham, Frank Bentley, Joyce Chai, Zachary Lipton, Qiaozhu Mei, Rada Mihalcea, Michael Terry, Diyi Yang, Meredith Ringel Morris, Paul Resnick, David Jurgens

    Abstract: Recent advancements in general-purpose AI have highlighted the importance of guiding AI systems towards the intended goals, ethical principles, and values of individuals and groups, a concept broadly recognized as alignment. However, the lack of clarified definitions and scopes of human-AI alignment poses a significant obstacle, hampering collaborative efforts across research domains to achieve th…

    Submitted 17 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 56 pages

  4. arXiv:2406.03487  [pdf, other]

    cs.CL cs.AI

    Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends

    Authors: Sanjana Ramprasad, Elisa Ferracane, Zachary C. Lipton

    Abstract: Recent advancements in large language models (LLMs) have considerably advanced the capabilities of summarization systems. However, they continue to face concerns about hallucinations. While prior work has evaluated LLMs extensively in news domains, most evaluation of dialogue summarization has focused on BART-based models, leaving a gap in our understanding of their faithfulness. Our work benchmar…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ACL 2024

  5. arXiv:2404.15146  [pdf, other]

    cs.LG cs.CL

    Rethinking LLM Memorization through the Lens of Adversarial Compression

    Authors: Avi Schwarzschild, Zhili Feng, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter

    Abstract: Large language models (LLMs) trained on web-scale datasets raise substantial concerns regarding permissible data usage. One major question is whether these models "memorize" all their training data or they integrate many data sources in some way more akin to how a human would learn and synthesize information. The answer hinges, to a large degree, on how we define memorization. In this work, we pro…

    Submitted 1 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: https://locuslab.github.io/acr-memorization

  6. arXiv:2404.07815  [pdf, other]

    cs.LG cs.AI stat.ML

    Post-Hoc Reversal: Are We Selecting Models Prematurely?

    Authors: Rishabh Ranjan, Saurabh Garg, Mrigank Raman, Carlos Guestrin, Zachary Chase Lipton

    Abstract: Trained models are often composed with post-hoc transforms such as temperature scaling (TS), ensembling and stochastic weight averaging (SWA) to improve performance, robustness, uncertainty estimation, etc. However, such transforms are typically applied only after the base models have already been finalized by standard means. In this paper, we challenge this practice with an extensive empirical st…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 9 pages + references + appendix, 7 figures

  7. arXiv:2404.07177  [pdf, other]

    cs.LG

    Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic

    Authors: Sachin Goyal, Pratyush Maini, Zachary C. Lipton, Aditi Raghunathan, J. Zico Kolter

    Abstract: Vision-language models (VLMs) are trained for thousands of GPU hours on carefully curated web datasets. In recent times, data curation has gained prominence with several works developing strategies to retain 'high-quality' subsets of 'raw' scraped data. For instance, the LAION public dataset retained only 10% of the total crawled data. However, these strategies are typically developed agnostic of…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Published at CVPR 2024

  8. arXiv:2403.14713  [pdf, other]

    cs.LG cs.CY stat.ME stat.ML

    Auditing Fairness under Unobserved Confounding

    Authors: Yewon Byun, Dylan Sam, Michael Oberst, Zachary C. Lipton, Bryan Wilder

    Abstract: The presence of inequity is a fundamental problem in the outcomes of decision-making systems, especially when human lives are at stake. Yet, estimating notions of unfairness or inequity is difficult, particularly if they rely on hard-to-measure concepts such as risk. Such measurements of risk can be accurately obtained when no unobserved confounders have jointly influenced past decisions and outco…

    Submitted 24 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024

  9. arXiv:2402.12566  [pdf, other]

    cs.CL cs.LG

    GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence

    Authors: Kundan Krishna, Sanjana Ramprasad, Prakhar Gupta, Byron C. Wallace, Zachary C. Lipton, Jeffrey P. Bigham

    Abstract: LLMs can generate factually incorrect statements even when provided access to reference documents. Such errors can be dangerous in high-stakes applications (e.g., document-grounded QA for healthcare or finance). We present GenAudit -- a tool intended to assist fact-checking LLM responses for document-grounded tasks. GenAudit suggests edits to the LLM response by revising or removing claims that ar…

    Submitted 16 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Code and models available at https://genaudit.org

  10. arXiv:2402.08025  [pdf, other]

    cs.CV

    Beyond the Mud: Datasets and Benchmarks for Computer Vision in Off-Road Racing

    Authors: Jacob Tyo, Motolani Olarinre, Youngseog Chung, Zachary C. Lipton

    Abstract: Despite significant progress in optical character recognition (OCR) and computer vision systems, robustly recognizing text and identifying people in images taken in unconstrained in-the-wild environments remain an ongoing challenge. However, such obstacles must be overcome in practical applications of vision systems, such as identifying racers in photos taken during off-road racing events.…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.09256

  11. arXiv:2402.07685  [pdf, other]

    cs.CV cs.LG

    Contrastive Multiple Instance Learning for Weakly Supervised Person ReID

    Authors: Jacob Tyo, Zachary C. Lipton

    Abstract: The acquisition of large-scale, precisely labeled datasets for person re-identification (ReID) poses a significant challenge. Weakly supervised ReID has begun to address this issue, although its performance lags behind fully supervised methods. In response, we introduce Contrastive Multiple Instance Learning (CMIL), a novel framework tailored for more effective weakly supervised ReID. CMIL disting…

    Submitted 12 February, 2024; originally announced February 2024.

  12. arXiv:2402.05133  [pdf, other]

    cs.CL cs.AI cs.LG

    Personalized Language Modeling from Personalized Human Feedback

    Authors: Xinyu Li, Zachary C. Lipton, Liu Leqi

    Abstract: Reinforcement Learning from Human Feedback (RLHF) is the current dominating framework to fine-tune large language models to better align with human preferences. However, the underlying premise of algorithms developed under this framework can be problematic when user preferences encoded in human feedback are diverse. In this work, we aim to address this problem by developing methods for building pe…

    Submitted 5 February, 2024; originally announced February 2024.

  13. arXiv:2402.03509  [pdf, other]

    cs.CL cs.AI cs.LG

    Evaluating the Factuality of Zero-shot Summarizers Across Varied Domains

    Authors: Sanjana Ramprasad, Kundan Krishna, Zachary C Lipton, Byron C Wallace

    Abstract: Recent work has shown that large language models (LLMs) are capable of generating summaries zero-shot (i.e., without explicit supervision) that, under human assessment, are often comparable or even preferred to manually composed reference summaries. However, this prior work has focussed almost exclusively on evaluating news article summarization. How do zero-shot summarizers perform in other (pote…

    Submitted 5 February, 2024; originally announced February 2024.

  14. arXiv:2401.15897  [pdf, other]

    cs.CY cs.HC cs.LG

    Red-Teaming for Generative AI: Silver Bullet or Security Theater?

    Authors: Michael Feffer, Anusha Sinha, Wesley Hanwen Deng, Zachary C. Lipton, Hoda Heidari

    Abstract: In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, despite AI red-teaming's central role in policy discussions and corporate messaging, significant questions remain about what…

    Submitted 15 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  15. arXiv:2401.08788  [pdf, other]

    cs.LG cs.CY stat.ML

    The Impact of Differential Feature Under-reporting on Algorithmic Fairness

    Authors: Nil-Jana Akpinar, Zachary C. Lipton, Alexandra Chouldechova

    Abstract: Predictive risk models in the public sector are commonly developed using administrative data that is more complete for subpopulations that more greatly rely on public services. In the United States, for instance, information on health care utilization is routinely available to government agencies for individuals supported by Medicaid and Medicare, but not for the privately insured. Critiques of pu…

    Submitted 3 May, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: ACM Conference on Fairness, Accountability, and Transparency (FAccT 2024)

  16. arXiv:2401.06121  [pdf, other]

    cs.LG cs.CL

    TOFU: A Task of Fictitious Unlearning for LLMs

    Authors: Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter

    Abstract: Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data raising both legal and ethical concerns. Unlearning, or tuning models to forget information present in their training data, provides us with a way to protect private data after training. Although several methods exist for such unlearning, it is unclear to what extent they resu…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: https://locuslab.github.io/tofu/

  17. arXiv:2312.09323  [pdf, other]

    cs.AI cs.LG

    Perspectives on the State and Future of Deep Learning - 2023

    Authors: Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

    Abstract: The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time. The plan is to host this survey periodically until the AI singularity paperclip-frenzy-driven doomsday, keeping an updated list of topical questions and interviewing new community members for each edition. In this issue, we probed people's opinions on inter…

    Submitted 18 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

  18. arXiv:2312.03318  [pdf, other]

    cs.LG cs.CV stat.ML

    Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

    Authors: Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

    Abstract: Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning). However, despite the popularity and compatibility of these techniques, their efficacy in combination remains unexplored. In this paper, we undertake a systematic empirical investi…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  19. arXiv:2312.00234  [pdf, other]

    cs.LG math.NA stat.ML

    Deep Equilibrium Based Neural Operators for Steady-State PDEs

    Authors: Tanya Marwah, Ashwini Pokle, J. Zico Kolter, Zachary C. Lipton, Jianfeng Lu, Andrej Risteski

    Abstract: Data-driven machine learning approaches are being increasingly used to solve partial differential equations (PDEs). They have shown particularly striking successes when training an operator, which takes as input a PDE in some family, and outputs its solution. However, the architectural design space, especially given structural knowledge of the PDE family of interest, is still poorly understood. We…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: NeurIPS 2023

  20. arXiv:2311.09401  [pdf, other]

    cs.CV cs.LG

    MoCo-Transfer: Investigating out-of-distribution contrastive learning for limited-data domains

    Authors: Yuwen Chen, Helen Zhou, Zachary C. Lipton

    Abstract: Medical imaging data is often siloed within hospitals, limiting the amount of data available for specialized model development. With limited in-domain data, one might hope to leverage larger datasets from related domains. In this paper, we analyze the benefit of transferring self-supervised contrastive representations from moment contrast (MoCo) pretraining on out-of-distribution data to settings…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 4 pages

  21. arXiv:2311.09256  [pdf, other]

    cs.CV

    Reading Between the Mud: A Challenging Motorcycle Racer Number Dataset

    Authors: Jacob Tyo, Youngseog Chung, Motolani Olarinre, Zachary C. Lipton

    Abstract: This paper introduces the off-road motorcycle Racer number Dataset (RnD), a new challenging dataset for optical character recognition (OCR) research. RnD contains 2,411 images from professional motorsports photographers that depict motorcycle racers in off-road competitions. The images exhibit a wide variety of factors that make OCR difficult, including mud occlusions, motion blur, non-standard fo…

    Submitted 14 November, 2023; originally announced November 2023.

  22. arXiv:2311.08488  [pdf, other]

    cs.CV

    MUDD: A New Re-Identification Dataset with Efficient Annotation for Off-Road Racers in Extreme Conditions

    Authors: Jacob Tyo, Motolani Olarinre, Youngseog Chung, Zachary C. Lipton

    Abstract: Re-identifying individuals in unconstrained environments remains an open challenge in computer vision. We introduce the Muddy Racer re-IDentification Dataset (MUDD), the first large-scale benchmark for matching identities of motorcycle racers during off-road competitions. MUDD exhibits heavy mud occlusion, motion blurring, complex poses, and extreme lighting conditions previously unseen in existin…

    Submitted 14 November, 2023; originally announced November 2023.

  23. arXiv:2310.07935  [pdf, other]

    stat.ME stat.AP

    Estimating the Likelihood of Arrest from Police Records in Presence of Unreported Crimes

    Authors: Riccardo Fogliato, Arun Kumar Kuchibhotla, Zachary Lipton, Daniel Nagin, Alice Xiang, Alexandra Chouldechova

    Abstract: Many important policy decisions concerning policing hinge on our understanding of how likely various criminal offenses are to result in arrests. Since many crimes are never reported to law enforcement, estimates based on police records alone must be adjusted to account for the likelihood that each crime would have been reported to the police. In this paper, we present a methodological framework fo…

    Submitted 11 October, 2023; originally announced October 2023.

  24. arXiv:2308.14272  [pdf, other]

    cs.CL cs.LG

    Goodhart's Law Applies to NLP's Explanation Benchmarks

    Authors: Jennifer Hsia, Danish Pruthi, Aarti Singh, Zachary C. Lipton

    Abstract: Despite the rising popularity of saliency-based explanations, the research community remains at an impasse, facing doubts concerning their purpose, efficacy, and tendency to contradict each other. Seeking to unite the community's efforts around common goals, several recent works have proposed evaluation metrics. In this paper, we critically examine two sets of metrics: the ERASER metrics (comprehe…

    Submitted 27 August, 2023; originally announced August 2023.

  25. arXiv:2307.09542  [pdf, other]

    cs.LG cs.CV

    Can Neural Network Memorization Be Localized?

    Authors: Pratyush Maini, Michael C. Mozer, Hanie Sedghi, Zachary C. Lipton, J. Zico Kolter, Chiyuan Zhang

    Abstract: Recent efforts at explaining the interplay of memorization and generalization in deep overparametrized networks have posited that neural networks memorize "hard" examples in the final few layers of the model. Memorization refers to the ability to correctly predict on atypical examples of the training set. In this work, we show that rather than being confined to individual lay…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: Accepted at ICML 2023

  26. arXiv:2307.03132  [pdf, other]

    cs.CV cs.CL cs.LG

    T-MARS: Improving Visual Representations by Circumventing Text Feature Learning

    Authors: Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan

    Abstract: Large web-sourced multimodal datasets have powered a slew of new methods for learning general-purpose visual representations, advancing the state of the art in computer vision and revolutionizing zero- and few-shot recognition. One crucial decision facing practitioners is how, if at all, to curate these ever-larger datasets. For example, the creators of the LAION-5B dataset chose to retain only im…

    Submitted 18 March, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: Accepted to ICLR 2024. Oral at ICCV Datacomp 2023

  27. arXiv:2305.19570  [pdf, other]

    stat.ML cs.LG

    Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms

    Authors: Dheeraj Baby, Saurabh Garg, Tzu-Ching Yen, Sivaraman Balakrishnan, Zachary Chase Lipton, Yu-Xiang Wang

    Abstract: This paper focuses on supervised and unsupervised online label shift, where the class marginals $Q(y)$ vary but the class-conditionals $Q(x|y)$ remain invariant. In the unsupervised setting, our goal is to adapt a learner, trained on some offline labeled data, to changing label distributions given unlabeled online data. In the supervised setting, we must both learn a classifier and adapt to the…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: First three authors contributed equally

  28. arXiv:2305.17319  [pdf, other]

    cs.CY cs.AI cs.GT

    Moral Machine or Tyranny of the Majority?

    Authors: Michael Feffer, Hoda Heidari, Zachary C. Lipton

    Abstract: With Artificial Intelligence systems increasingly applied in consequential domains, researchers have begun to ask how these systems ought to act in ethically charged situations where even humans lack consensus. In the Moral Machine project, researchers crowdsourced answers to "Trolley Problems" concerning autonomous vehicles. Subsequently, Noothigattu et al. (2018) proposed inferring linear functi…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: To appear in the proceedings of AAAI 2023

  29. arXiv:2305.15444  [pdf, other]

    cs.CL cs.AI cs.LG

    PromptNER: Prompting For Named Entity Recognition

    Authors: Dhananjay Ashok, Zachary C. Lipton

    Abstract: In a surprising turn, Large Language Models (LLMs) together with a growing arsenal of prompt-based heuristics now offer powerful off-the-shelf approaches providing few-shot solutions to myriad classic NLP problems. However, despite promising early results, these LLM-based few-shot methods remain far from the state of the art in Named Entity Recognition (NER), where prevailing methods include learn…

    Submitted 20 June, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  30. arXiv:2305.14296  [pdf, other]

    cs.CL cs.LG

    USB: A Unified Summarization Benchmark Across Tasks and Domains

    Authors: Kundan Krishna, Prakhar Gupta, Sanjana Ramprasad, Byron C. Wallace, Jeffrey P. Bigham, Zachary C. Lipton

    Abstract: While the NLP community has produced numerous summarization benchmarks, none provide the rich annotations required to simultaneously address many important problems related to control and reliability. We introduce a Wikipedia-derived benchmark, complemented by a rich set of crowd-sourced annotations, that supports $8$ interrelated tasks: (i) extractive summarization; (ii) abstractive summarization…

    Submitted 4 December, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: EMNLP Findings 2023 Camera Ready

  31. arXiv:2305.13426  [pdf, other]

    cs.LG cs.AI

    Evaluating Model Performance in Medical Datasets Over Time

    Authors: Helen Zhou, Yuwen Chen, Zachary C. Lipton

    Abstract: Machine learning (ML) models deployed in healthcare systems must face data drawn from continually evolving environments. However, researchers proposing such models typically evaluate them in a time-agnostic manner, splitting datasets according to patients sampled randomly throughout the entire study time period. This work proposes the Evaluation on Medical Datasets Over Time (EMDOT) framework, whi…

    Submitted 16 July, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: To appear at Conference on Health, Inference, and Learning (CHIL) 2023. arXiv admin note: substantial text overlap with arXiv:2211.07165

  32. arXiv:2305.06884  [pdf, ps, other]

    stat.ME cs.AI cs.LG math.ST stat.AP stat.ML

    Risk-limiting Financial Audits via Weighted Sampling without Replacement

    Authors: Shubhanshu Shekhar, Ziyu Xu, Zachary C. Lipton, Pierre J. Liang, Aaditya Ramdas

    Abstract: We introduce the notion of risk-limiting financial auditing (RLFA): given $N$ transactions, the goal is to estimate the total misstated monetary fraction ($m^*$) to a given accuracy $ε$, with confidence $1-δ$. We do this by constructing new confidence sequences (CSs) for the weighted average of $N$ unknown values, based on samples drawn without replacement according to a (randomized) weighted sa…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: 23 pages, 8 figures, to appear in the Proceedings of Uncertainty in Artificial Intelligence (UAI) 2023

  33. arXiv:2304.09088  [pdf, other]

    cs.IR cs.HC cs.LG

    A Field Test of Bandit Algorithms for Recommendations: Understanding the Validity of Assumptions on Human Preferences in Multi-armed Bandits

    Authors: Liu Leqi, Giulio Zhou, Fatma Kılınç-Karzan, Zachary C. Lipton, Alan L. Montgomery

    Abstract: Personalized recommender systems suffuse modern life, shaping what media we read and what products we consume. Algorithms powering such systems tend to consist of supervised learning-based heuristics, such as latent factor models with a variety of heuristically chosen prediction targets. Meanwhile, theoretical treatments of recommendation frequently address the decision-theoretic nature of the pro…

    Submitted 16 April, 2023; originally announced April 2023.

    Comments: Accepted to CHI. 16 pages, 6 figures

  34. arXiv:2303.07320  [pdf, other]

    cs.CL cs.LG

    Model-tuning Via Prompts Makes NLP Models Adversarially Robust

    Authors: Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi

    Abstract: In recent years, NLP practitioners have converged on the following practice: (i) import an off-the-shelf pretrained (masked) language model; (ii) append a multilayer perceptron atop the CLS token's hidden representation (with randomly initialized weights); and (iii) fine-tune the entire model on a downstream task (MLP-FT). This procedure has produced massive gains on standard NLP benchmarks, but t…

    Submitted 5 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted to the EMNLP 2023 Conference

  35. arXiv:2303.05500  [pdf, ps, other]

    cs.CY cs.AI cs.HC

    Users are the North Star for AI Transparency

    Authors: Alex Mei, Michael Saxon, Shiyu Chang, Zachary C. Lipton, William Yang Wang

    Abstract: Despite widespread calls for transparent artificial intelligence systems, the term is too overburdened with disparate meanings to express precise policy aims or to orient concrete lines of research. Consequently, stakeholders often talk past each other, with policymakers expressing vague demands and practitioners devising solutions that may not address the underlying concerns. Part of why this hap…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: 9 pages, 3 tables

  36. arXiv:2302.08070  [pdf, other]

    cs.LG stat.ME

    Local Causal Discovery for Estimating Causal Effects

    Authors: Shantanu Gupta, David Childers, Zachary C. Lipton

    Abstract: Even when the causal graph underlying our data is unknown, we can use observational data to narrow down the possible values that an average treatment effect (ATE) can take by (1) identifying the graph up to a Markov equivalence class; and (2) estimating that ATE for each graph in the class. While the PC algorithm can identify this class under strong faithfulness assumptions, it can be computationa…

    Submitted 10 April, 2024; v1 submitted 15 February, 2023; originally announced February 2023.

    Comments: Accepted at CLeaR 2023

  37. arXiv:2302.06804  [pdf, other]

    cs.LG stat.ME

    Discovering Optimal Scoring Mechanisms in Causal Strategic Prediction

    Authors: Tom Yan, Shantanu Gupta, Zachary Lipton

    Abstract: Faced with data-driven policies, individuals will manipulate their features to obtain favorable decisions. While earlier works cast these manipulations as undesirable gaming, recent works have adopted a more nuanced causal framing in which manipulations can improve outcomes of interest, and setting coherent mechanisms requires accounting for both predictive accuracy and improvement of the outcome.…

    Submitted 20 February, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

  38. arXiv:2302.03020  [pdf, other]

    cs.LG cs.CV stat.ML

    RLSbench: Domain Adaptation Under Relaxed Label Shift

    Authors: Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously under explored. Meanwhile, popular deep domain adaptation heuristics tend to falter when faced with label proportions shifts. While several papers modify these heuristics in attempts to handle label proportions shifts, inconsistencies i…

    Submitted 5 June, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023. Paper website: https://sites.google.com/view/rlsbench/

  39. arXiv:2302.02551  [pdf, other]

    cs.CV cs.LG

    CHiLS: Zero-Shot Image Classification with Hierarchical Label Sets

    Authors: Zachary Novack, Julian McAuley, Zachary C. Lipton, Saurabh Garg

    Abstract: Open vocabulary models (e.g. CLIP) have shown strong performance on zero-shot classification through their ability to generate embeddings for each class based on their (natural language) names. Prior work has focused on improving the accuracy of these models through prompt engineering or by incorporating a small amount of labeled downstream data (via finetuning). However, there has been little focus…

    Submitted 31 May, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023

  40. arXiv:2301.11724  [pdf, other]

    cs.LG

    Meta-Learning Mini-Batch Risk Functionals

    Authors: Jacob Tyo, Zachary C. Lipton

    Abstract: Supervised learning typically optimizes the expected value risk functional of the loss, but in many cases, we want to optimize for other risk functionals. In full-batch gradient descent, this is done by taking gradients of a risk functional of interest, such as the Conditional Value at Risk (CVaR) which ignores some quantile of extreme losses. However, deep learning must almost always use mini-bat…

    Submitted 27 January, 2023; originally announced January 2023.

  41. arXiv:2211.15853  [pdf, other]

    cs.LG

    Disentangling the Mechanisms Behind Implicit Regularization in SGD

    Authors: Zachary Novack, Simran Kaur, Tanya Marwah, Saurabh Garg, Zachary C. Lipton

    Abstract: A number of competing hypotheses have been proposed to explain why small-batch Stochastic Gradient Descent (SGD) leads to improved generalization over the full-batch regime, with recent work crediting the implicit regularization of various quantities throughout training. However, to date, empirical evidence assessing the explanatory power of these hypotheses is lacking. In this paper, we conduct an…

    Submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted as Spotlight at the NeurIPS 2022 Workshop for Higher Order Optimization in Machine Learning

  42. arXiv:2211.07165  [pdf, other]

    cs.LG stat.AP

    Model Evaluation in Medical Datasets Over Time

    Authors: Helen Zhou, Yuwen Chen, Zachary C. Lipton

    Abstract: Machine learning models deployed in healthcare systems face data drawn from continually evolving environments. However, researchers proposing such models typically evaluate them in a time-agnostic manner, with train and test splits sampling patients throughout the entire study period. We introduce the Evaluation on Medical Datasets Over Time (EMDOT) framework and Python package, which evaluates th…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 6 pages

  43. arXiv:2211.02093  [pdf, other

    cs.LG stat.ML

    Domain Adaptation under Missingness Shift

    Authors: Helen Zhou, Sivaraman Balakrishnan, Zachary C. Lipton

    Abstract: Rates of missing data often depend on record-keeping policies and thus may change across times and locations, even when the underlying features are comparatively stable. In this paper, we introduce the problem of Domain Adaptation under Missingness Shift (DAMS). Here, (labeled) source data and (unlabeled) target data would be exchangeable but for different missing data mechanisms. We show that if…

    Submitted 3 May, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

  44. arXiv:2210.15031  [pdf, other

    cs.LG

    Characterizing Datapoints via Second-Split Forgetting

    Authors: Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter

    Abstract: Researchers investigating example hardness have increasingly focused on the dynamics by which neural networks learn and forget examples throughout training. Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022

  45. arXiv:2210.12101  [pdf, ps, other

    cs.LG math.NA

    Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective

    Authors: Tanya Marwah, Zachary C. Lipton, Jianfeng Lu, Andrej Risteski

    Abstract: A burgeoning line of research leverages deep neural networks to approximate the solutions to high dimensional PDEs, opening lines of theoretical inquiry focused on explaining how it is that these models appear to evade the curse of dimensionality. However, most prior theoretical analyses have been limited to linear PDEs. In this work, we take a step towards studying the representational power of n…

    Submitted 27 March, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  46. arXiv:2210.01422  [pdf, other

    cs.LG

    Time-Varying Propensity Score to Bridge the Gap between the Past and Present

    Authors: Rasool Fakoor, Jonas Mueller, Zachary C. Lipton, Pratik Chaudhari, Alexander J. Smola

    Abstract: Real-world deployment of machine learning models is challenging because data evolves over time. While no model can work when data evolves in an arbitrary fashion, if there is some pattern to these changes, we might be able to design methods to address it. This paper addresses situations when data evolves gradually. We introduce a time-varying propensity score that can detect gradual shifts in the…

    Submitted 2 May, 2024; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: Published at ICLR 2024

  47. arXiv:2209.14389  [pdf, other

    cs.CL cs.LG

    Downstream Datasets Make Surprisingly Good Pretraining Corpora

    Authors: Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary C. Lipton

    Abstract: For most natural language processing tasks, the dominant practice is to finetune large pretrained transformer models (e.g., BERT) using smaller downstream datasets. Despite the success of this approach, it remains unclear to what extent these gains are attributable to the massive background corpora employed for pretraining versus to the pretraining objectives themselves. This paper introduces a la…

    Submitted 26 May, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: ACL 2023 Camera Ready

  48. arXiv:2209.10444  [pdf, other

    cs.LG cs.AI stat.ML

    Off-Policy Risk Assessment in Markov Decision Processes

    Authors: Audrey Huang, Liu Leqi, Zachary Chase Lipton, Kamyar Azizzadenesheli

    Abstract: Addressing such diverse ends as safety, alignment with human preferences, and the efficiency of learning, a growing line of reinforcement learning research focuses on risk functionals that depend on the entire distribution of returns. Recent work on off-policy risk assessment (OPRA) for contextual bandits introduced consistent estimators for the target policy's CDF of returns along with fini…

    Submitted 21 September, 2022; originally announced September 2022.

  49. arXiv:2209.06869  [pdf, other

    cs.CL cs.AI cs.LG

    On the State of the Art in Authorship Attribution and Authorship Verification

    Authors: Jacob Tyo, Bhuwan Dhingra, Zachary C. Lipton

    Abstract: Despite decades of research on authorship attribution (AA) and authorship verification (AV), inconsistent dataset splits/filtering and mismatched evaluation methods make it difficult to assess the state of the art. In this paper, we present a survey of the fields, resolve points of confusion, introduce Valla that standardizes and benchmarks AA/AV datasets and metrics, provide a large-scale empiric…

    Submitted 5 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

  50. arXiv:2208.13126  [pdf, other

    cs.LG stat.ML

    Learning Clinical Concepts for Predicting Risk of Progression to Severe COVID-19

    Authors: Helen Zhou, Cheng Cheng, Kelly J. Shields, Gursimran Kochhar, Tariq Cheema, Zachary C. Lipton, Jeremy C. Weiss

    Abstract: With COVID-19 now pervasive, identification of high-risk individuals is crucial. Using data from a major healthcare provider in Southwestern Pennsylvania, we develop survival models predicting severe COVID-19 progression. In this endeavor, we face a tradeoff between more accurate models relying on many features and less accurate models relying on a few features aligned with clinician intuition. Co…

    Submitted 27 August, 2022; originally announced August 2022.