Showing 1–25 of 25 results for author: Santurkar, S

  1. arXiv:2303.17548

    cs.CL cs.AI cs.CY cs.LG

    Whose Opinions Do Language Models Reflect?

    Authors: Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto

    Abstract: Language models (LMs) are increasingly being used in open-ended contexts, where the opinions reflected by LMs in response to subjective queries can have a profound impact, both on user satisfaction, as well as shaping the views of society at large. In this work, we put forth a quantitative framework to investigate the opinions reflected by LMs -- by leveraging high-quality public opinion polls and…

    Submitted 30 March, 2023; originally announced March 2023.

  2. arXiv:2303.08774

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  3. arXiv:2302.03169

    cs.CL cs.LG

    Data Selection for Language Models via Importance Resampling

    Authors: Sang Michael Xie, Shibani Santurkar, Tengyu Ma, Percy Liang

    Abstract: Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting a subset of a large raw unlabeled dataset to match a desired target distribution given unlabeled target samples. Due to the scale and dimensionality of the raw text data, existing methods use simple heuristics or r…

    Submitted 18 November, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023

  4. arXiv:2211.09110

    cs.CL cs.AI cs.LG

    Holistic Evaluation of Language Models

    Authors: Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, et al. (25 additional authors not shown)

    Abstract: Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest fo…

    Submitted 1 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Project page: https://crfm.stanford.edu/helm/v1.0

    Journal ref: Published in Transactions on Machine Learning Research (TMLR), 2023

  5. arXiv:2207.07635

    cs.CV cs.LG stat.ML

    Is a Caption Worth a Thousand Images? A Controlled Study for Representation Learning

    Authors: Shibani Santurkar, Yann Dubois, Rohan Taori, Percy Liang, Tatsunori Hashimoto

    Abstract: The development of CLIP [Radford et al., 2021] has sparked a debate on whether language supervision can result in vision models with more transferable representations than traditional image-only methods. Our work studies this question through a carefully controlled comparison of two approaches in terms of their ability to learn representations that generalize to downstream classification tasks. We…

    Submitted 15 July, 2022; originally announced July 2022.

  6. arXiv:2112.01008

    cs.LG cs.CV

    Editing a classifier by rewriting its prediction rules

    Authors: Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry

    Abstract: We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments, and modifying it to ignore spurious features. Our code is available at https://github.com/MadryLab/EditingClassifiers.

    Submitted 2 December, 2021; originally announced December 2021.

  7. arXiv:2106.03805

    cs.CV cs.LG stat.ML

    3DB: A Framework for Debugging Computer Vision Models

    Authors: Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry

    Abstract: We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. We demonstrate, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions. 3DB captures and generalizes many robustness analyses from prior work, and enables one to study th…

    Submitted 7 June, 2021; originally announced June 2021.

  8. arXiv:2105.04857

    cs.LG stat.ML

    Leveraging Sparse Linear Layers for Debuggable Deep Networks

    Authors: Eric Wong, Shibani Santurkar, Aleksander Mądry

    Abstract: We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively via numerical and human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, expla…

    Submitted 11 May, 2021; originally announced May 2021.

  9. arXiv:2008.04859

    cs.CV cs.LG stat.ML

    BREEDS: Benchmarks for Subpopulation Shift

    Authors: Shibani Santurkar, Dimitris Tsipras, Aleksander Madry

    Abstract: We develop a methodology for assessing the robustness of models to subpopulation shift---specifically, their ability to generalize to novel data subpopulations that were not observed during training. Our approach leverages the class structure underlying existing datasets to control the data subpopulations that comprise the training and test distributions. This enables us to synthesize realistic di…

    Submitted 11 August, 2020; originally announced August 2020.

  10. arXiv:2005.12729

    cs.LG cs.RO stat.ML

    Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO

    Authors: Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

    Abstract: We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms: Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO). Specifically, we investigate the consequences of "code-level optimizations:" algorithm augmentations found only in implementations or described as auxiliary details to the core algorithm. Seemin…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: ICLR 2020 version. arXiv admin note: text overlap with arXiv:1811.02553

  11. arXiv:2005.11295

    cs.CV cs.LG stat.ML

    From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

    Authors: Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Andrew Ilyas, Aleksander Madry

    Abstract: Building rich machine learning datasets in a scalable manner often necessitates a crowd-sourced data collection pipeline. In this work, we use human studies to investigate the consequences of employing such a pipeline, focusing on the popular ImageNet dataset. We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset---including the introduc…

    Submitted 22 May, 2020; originally announced May 2020.

  12. arXiv:2005.09619

    stat.ML cs.CV cs.LG

    Identifying Statistical Bias in Dataset Replication

    Authors: Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Jacob Steinhardt, Aleksander Madry

    Abstract: Dataset replication is a useful tool for assessing whether improvements in test accuracy on a specific benchmark correspond to improvements in models' ability to generalize reliably. In this work, we present unintuitive yet significant ways in which standard approaches to dataset replication introduce statistical bias, skewing the resulting observations. We study ImageNet-v2, a replication of the…

    Submitted 2 September, 2020; v1 submitted 19 May, 2020; originally announced May 2020.

  13. arXiv:1906.09453

    cs.CV cs.LG cs.NE stat.ML

    Image Synthesis with a Single (Robust) Classifier

    Authors: Shibani Santurkar, Dimitris Tsipras, Brandon Tran, Andrew Ilyas, Logan Engstrom, Aleksander Madry

    Abstract: We show that the basic classification framework alone can be used to tackle some of the most challenging tasks in image synthesis. In contrast to other state-of-the-art approaches, the toolkit we develop is rather minimal: it uses a single, off-the-shelf classifier for all these tasks. The crux of our approach is that we train this classifier to be adversarially robust. It turns out that adversari…

    Submitted 8 August, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

  14. arXiv:1906.00945

    stat.ML cs.CV cs.LG cs.NE

    Adversarial Robustness as a Prior for Learned Representations

    Authors: Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Brandon Tran, Aleksander Madry

    Abstract: An important goal in deep learning is to learn versatile, high-level feature representations of input data. However, standard networks' representations seem to possess shortcomings that, as we illustrate, prevent them from fully realizing this goal. In this work, we show that robust optimization can be re-cast as a tool for enforcing priors on the features learned by deep neural networks. It turns…

    Submitted 27 September, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

  15. arXiv:1905.02175

    stat.ML cs.CR cs.CV cs.LG

    Adversarial Examples Are Not Bugs, They Are Features

    Authors: Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, Aleksander Madry

    Abstract: Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. After capturing…

    Submitted 12 August, 2019; v1 submitted 6 May, 2019; originally announced May 2019.

  16. arXiv:1811.02553

    cs.LG cs.NE cs.RO stat.ML

    A Closer Look at Deep Policy Gradients

    Authors: Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

    Abstract: We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. To this end, we propose a fine-grained analysis of state-of-the-art methods based on key elements of this framework: gradient estimation, value prediction, and optimization landscapes. Our results show that the behavior of deep policy gradient algorithms often deviates from…

    Submitted 25 May, 2020; v1 submitted 6 November, 2018; originally announced November 2018.

    Comments: ICLR 2020 version

  17. arXiv:1805.12152

    stat.ML cs.CV cs.LG cs.NE

    Robustness May Be at Odds with Accuracy

    Authors: Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry

    Abstract: We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in…

    Submitted 9 September, 2019; v1 submitted 30 May, 2018; originally announced May 2018.

    Comments: ICLR'19

  18. arXiv:1805.11604

    stat.ML cs.LG cs.NE

    How Does Batch Normalization Help Optimization?

    Authors: Shibani Santurkar, Dimitris Tsipras, Andrew Ilyas, Aleksander Madry

    Abstract: Batch Normalization (BatchNorm) is a widely adopted technique that enables faster and more stable training of deep neural networks (DNNs). Despite its pervasiveness, the exact reasons for BatchNorm's effectiveness are still poorly understood. The popular belief is that this effectiveness stems from controlling the change of the layers' input distributions during training to reduce the so-called "i…

    Submitted 14 April, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

    Comments: In NeurIPS'18

  19. arXiv:1804.11285

    cs.LG cs.NE stat.ML

    Adversarially Robust Generalization Requires More Data

    Authors: Ludwig Schmidt, Shibani Santurkar, Dimitris Tsipras, Kunal Talwar, Aleksander Mądry

    Abstract: Machine learning models are often susceptible to adversarial perturbations of their inputs. Even small perturbations can cause state-of-the-art classifiers with high "standard" accuracy to produce an incorrect prediction with high confidence. To better understand this phenomenon, we study adversarially robust learning from the viewpoint of generalization. We show that already in a simple natural d…

    Submitted 2 May, 2018; v1 submitted 30 April, 2018; originally announced April 2018.

    Comments: Small changes for biblatex compatibility

  20. arXiv:1711.00970

    cs.LG cs.CV cs.NE stat.ML

    A Classification-Based Study of Covariate Shift in GAN Distributions

    Authors: Shibani Santurkar, Ludwig Schmidt, Aleksander Mądry

    Abstract: A basic, and still largely unanswered, question in the context of Generative Adversarial Networks (GANs) is whether they are truly able to capture all the fundamental characteristics of the distributions they are trained on. In particular, evaluating the diversity of GAN distributions is challenging and existing methods provide only a partial understanding of this issue. In this paper, we develop…

    Submitted 5 June, 2018; v1 submitted 2 November, 2017; originally announced November 2017.

  21. arXiv:1703.01467

    cs.CV

    Generative Compression

    Authors: Shibani Santurkar, David Budden, Nir Shavit

    Abstract: Traditional image and video compression algorithms rely on hand-crafted encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the data being compressed. Here we describe the concept of generative compression, the compression of data using generative models, and suggest that it is a direction worth pursuing to produce more accurate and visually pleasing reconstructions at much d…

    Submitted 4 June, 2017; v1 submitted 4 March, 2017; originally announced March 2017.

  22. arXiv:1702.07386

    cs.CV

    Toward Streaming Synapse Detection with Compositional ConvNets

    Authors: Shibani Santurkar, David Budden, Alexander Matveev, Heather Berlin, Hayk Saribekyan, Yaron Meirovitch, Nir Shavit

    Abstract: Connectomics is an emerging field in neuroscience that aims to reconstruct the 3-dimensional morphology of neurons from electron microscopy (EM) images. Recent studies have successfully demonstrated the use of convolutional neural networks (ConvNets) for segmenting cell membranes to individuate neurons. However, there has been comparatively little success in high-throughput identification of the i…

    Submitted 23 February, 2017; originally announced February 2017.

    Comments: 10 pages, 9 figures

  23. arXiv:1611.06565

    cs.CV cs.DC cs.NE

    Deep Tensor Convolution on Multicores

    Authors: David Budden, Alexander Matveev, Shibani Santurkar, Shraman Ray Chaudhuri, Nir Shavit

    Abstract: Deep convolutional neural networks (ConvNets) of 3-dimensional kernels allow joint modeling of spatiotemporal features. These networks have improved performance of video and volumetric image analysis, but have been limited in size due to the low memory ceiling of GPU hardware. Existing CPU implementations overcome this constraint but are impractically slow. Here we extend and optimize the faster W…

    Submitted 11 June, 2017; v1 submitted 20 November, 2016; originally announced November 2016.

    Comments: 11 pages, 4 figures, 1 supplementary doc

  24. arXiv:1410.7883

    cs.NE q-bio.NC

    Sub-threshold CMOS Spiking Neuron Circuit Design for Navigation Inspired by C. elegans Chemotaxis

    Authors: Shibani Santurkar, Bipin Rajendran

    Abstract: We demonstrate a spiking neural network for navigation motivated by the chemotaxis network of Caenorhabditis elegans. Our network uses information regarding temporal gradients in the tracking variable's concentration to make navigational decisions. The gradient information is determined by mimicking the underlying mechanisms of the ASE neurons of C. elegans. Simulations show that our model is able…

    Submitted 29 October, 2014; originally announced October 2014.

  25. arXiv:1410.7881

    cs.NE q-bio.NC

    A neural circuit for navigation inspired by C. elegans Chemotaxis

    Authors: Shibani Santurkar, Bipin Rajendran

    Abstract: We develop an artificial neural circuit for contour tracking and navigation inspired by the chemotaxis of the nematode Caenorhabditis elegans. In order to harness the computational advantages spiking neural networks promise over their non-spiking counterparts, we develop a network comprising 7-spiking neurons with non-plastic synapses which we show is extremely robust in tracking a range of concen…

    Submitted 29 October, 2014; originally announced October 2014.