Showing 1–14 of 14 results for author: Taori, R

Searching in archive cs.
  1. arXiv:2403.00833  [pdf, other]

    cs.AI

    Position Paper: Agent AI Towards a Holistic Intelligence

    Authors: Qiuyuan Huang, Naoki Wake, Bidipta Sarkar, Zane Durante, Ran Gong, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Noboru Kuno, Ade Famoti, Ashley Llorens, John Langford, Hoi Vo, Li Fei-Fei, Katsu Ikeuchi, Jianfeng Gao

    Abstract: Recent advancements in large foundation models have remarkably enhanced our understanding of sensory information in open-world environments. In leveraging the power of foundation models, it is crucial for AI research to pivot away from excessive reductionism and toward an emphasis on systems that function as cohesive wholes. Specifically, we emphasize developing Agent AI -- an embodied system that…

    Submitted 28 February, 2024; originally announced March 2024.

    Comments: 22 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2401.03568

  2. arXiv:2402.05929  [pdf, other]

    cs.AI cs.LG cs.RO

    An Interactive Agent Foundation Model

    Authors: Zane Durante, Bidipta Sarkar, Ran Gong, Rohan Taori, Yusuke Noda, Paul Tang, Ehsan Adeli, Shrinidhi Kowshika Lakshmikanth, Kevin Schulman, Arnold Milstein, Demetri Terzopoulos, Ade Famoti, Noboru Kuno, Ashley Llorens, Hoi Vo, Katsu Ikeuchi, Li Fei-Fei, Jianfeng Gao, Naoki Wake, Qiuyuan Huang

    Abstract: The development of artificial intelligence systems is transitioning from creating static, task-specific models to dynamic, agent-based systems capable of performing well in a wide range of applications. We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents across a wide range of domains, datasets, and tasks. Our training paradi…

    Submitted 17 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  3. arXiv:2401.03568  [pdf, other]

    cs.AI cs.HC cs.LG

    Agent AI: Surveying the Horizons of Multimodal Interaction

    Authors: Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Li Fei-Fei, Jianfeng Gao

    Abstract: Multi-modal AI systems will likely become a ubiquitous presence in our everyday lives. A promising approach to making these systems more interactive is to embody them as agents within physical and virtual environments. At present, systems leverage existing foundation models as the basic building blocks for the creation of embodied agents. Embedding agents within such environments facilitates the a…

    Submitted 25 January, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  4. arXiv:2312.00364  [pdf, other]

    cs.LG cs.CV

    Benchmarking Multi-Domain Active Learning on Image Classification

    Authors: Jiayi Li, Rohan Taori, Tatsunori B. Hashimoto

    Abstract: Active learning aims to enhance model performance by strategically labeling informative data points. While extensively studied, its effectiveness on large-scale, real-world datasets remains underexplored. Existing research primarily focuses on single-source data, ignoring the multi-domain nature of real-world data. We introduce a multi-domain active learning benchmark to bridge this gap. Our bench…

    Submitted 1 December, 2023; originally announced December 2023.

  5. arXiv:2308.06595  [pdf, other]

    cs.CL cs.AI cs.CV

    VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

    Authors: Yonatan Bitton, Hritik Bansal, Jack Hessel, Rulin Shao, Wanrong Zhu, Anas Awadalla, Josh Gardner, Rohan Taori, Ludwig Schmidt

    Abstract: We introduce VisIT-Bench (Visual InsTruction Benchmark), a benchmark for evaluation of instruction-following vision-language models for real-world use. Our starting point is curating 70 'instruction families' that we envision instruction tuned vision-language models should be able to address. Extending beyond evaluations like VQAv2 and COCO, tasks range from basic recognition to game playing and c…

    Submitted 26 December, 2023; v1 submitted 12 August, 2023; originally announced August 2023.

    Comments: Accepted to NeurIPS 2023, Datasets and Benchmarks. Website: https://visit-bench.github.io/

  6. arXiv:2305.14387  [pdf, other]

    cs.LG cs.AI cs.CL

    AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback

    Authors: Yann Dubois, Xuechen Li, Rohan Taori, Tianyi Zhang, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto

    Abstract: Large language models (LLMs) such as ChatGPT have seen widespread adoption due to their strong instruction-following abilities. Developing these LLMs involves a complex yet poorly understood workflow requiring training with human feedback. Replicating and understanding this instruction-following requires tackling three major challenges: the high cost of data collection, the lack of trustworthy eva…

    Submitted 7 January, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Spotlight at NeurIPS 2023

  7. arXiv:2209.03942  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    Data Feedback Loops: Model-driven Amplification of Dataset Biases

    Authors: Rohan Taori, Tatsunori B. Hashimoto

    Abstract: Datasets scraped from the internet have been critical to the successes of large-scale machine learning. Yet, this very success puts the utility of future internet-derived datasets at potential risk, as model outputs begin to replace human annotations as a source of supervision. In this work, we first formalize a system where interactions with one model are recorded as history and scraped as trai…

    Submitted 8 September, 2022; originally announced September 2022.

  8. arXiv:2207.07635  [pdf, other]

    cs.CV cs.LG stat.ML

    Is a Caption Worth a Thousand Images? A Controlled Study for Representation Learning

    Authors: Shibani Santurkar, Yann Dubois, Rohan Taori, Percy Liang, Tatsunori Hashimoto

    Abstract: The development of CLIP [Radford et al., 2021] has sparked a debate on whether language supervision can result in vision models with more transferable representations than traditional image-only methods. Our work studies this question through a carefully controlled comparison of two approaches in terms of their ability to learn representations that generalize to downstream classification tasks. We…

    Submitted 15 July, 2022; originally announced July 2022.

  9. arXiv:2108.07258  [pdf, other]

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap…

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  10. arXiv:2107.04649  [pdf, other]

    cs.LG stat.ML

    Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization

    Authors: John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, Ludwig Schmidt

    Abstract: For machine learning systems to be reliable, we must understand their performance in unseen, out-of-distribution environments. In this paper, we empirically show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts. Specifically, we demonstrate strong correlations between in-distribution and out-of-distribut…

    Submitted 7 October, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

  11. arXiv:2007.07243  [pdf, other]

    cs.CV cs.GR

    Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter

    Authors: Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum A. Reda, Karan Sapra, Andrew Tao, Bryan Catanzaro

    Abstract: Conventional CNNs for texture synthesis consist of a sequence of (de)-convolution and up/down-sampling layers, where each layer operates locally and lacks the ability to capture the long-term structural dependency required by texture synthesis. Thus, they often simply enlarge the input texture, rather than perform reasonable synthesis. As a compromise, many recent methods sacrifice generalizabilit…

    Submitted 14 July, 2020; originally announced July 2020.

  12. arXiv:2007.00644  [pdf, other]

    cs.LG cs.CV stat.ML

    Measuring Robustness to Natural Distribution Shifts in Image Classification

    Authors: Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt

    Abstract: We study how robust current ImageNet models are to distribution shifts arising from natural variations in datasets. Most research on robustness focuses on synthetic image perturbations (noise, simulated weather artifacts, adversarial examples, etc.), which leaves open how robustness on synthetic distribution shift relates to distribution shift arising in real data. Informed by an evaluation of 204…

    Submitted 14 September, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

  13. arXiv:1910.07737  [pdf, other]

    cs.LG stat.ML

    Autoregressive Models: What Are They Good For?

    Authors: Murtaza Dalal, Alexander C. Li, Rohan Taori

    Abstract: Autoregressive (AR) models have become a popular tool for unsupervised learning, achieving state-of-the-art log likelihood estimates. We investigate the use of AR models as density estimators in two settings -- as a learning signal for image translation, and as an outlier detector -- and find that these density estimates are much less reliable than previously thought. We examine the underlying opt…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: Accepted for the Information Theory and Machine Learning workshop at NeurIPS 2019

  14. arXiv:1805.07820  [pdf, other]

    cs.LG cs.CR cs.SD eess.AS stat.ML

    Targeted Adversarial Examples for Black Box Audio Systems

    Authors: Rohan Taori, Amog Kamsetty, Brenton Chu, Nikita Vemuri

    Abstract: The application of deep recurrent networks to audio transcription has led to impressive gains in automatic speech recognition (ASR) systems. Many have demonstrated that small adversarial perturbations can fool deep neural networks into incorrectly predicting a specified target with high confidence. Current work on fooling ASR systems has focused on white-box attacks, in which the model architectu…

    Submitted 19 August, 2019; v1 submitted 20 May, 2018; originally announced May 2018.

    Comments: IEEE Deep Learning and Security Workshop 2019