Showing 1–50 of 262 results for author: González, J

Searching in archive cs.
  1. arXiv:2407.00207  [pdf, other]

    cs.AR

    CIS: Composable Instruction Set for Streaming Applications: Design, Modeling, and Scheduling

    Authors: Yu Yang, Jordi Altayó González, Ahmed Hemani

    Abstract: The efficiency improvement of hardware accelerators such as single-instruction-multiple-data (SIMD) and coarse-grained reconfigurable architecture (CGRA) empowers the rapid advancement of AI and machine learning applications. These streaming applications consist of numerous vector operations that can be naturally parallelized. Despite the outstanding achievements of today's hardware accelerators,…

    Submitted 28 June, 2024; originally announced July 2024.

  2. arXiv:2406.18665  [pdf, other]

    cs.LG cs.AI cs.CL

    RouteLLM: Learning to Route LLMs with Preference Data

    Authors: Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M Waleed Kadous, Ion Stoica

    Abstract: Large language models (LLMs) exhibit impressive capabilities across a wide range of tasks, yet the choice of which model to use often involves a trade-off between performance and cost. More powerful models, though effective, come with higher expenses, while less capable models are more cost-effective. To address this dilemma, we propose several efficient router models that dynamically select betwe…

    Submitted 1 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.18247  [pdf, other]

    eess.IV cs.CV cs.LG

    Generative artificial intelligence in ophthalmology: multimodal retinal images for the diagnosis of Alzheimer's disease with convolutional neural networks

    Authors: I. R. Slootweg, M. Thach, K. R. Curro-Tafili, F. D. Verbraak, F. H. Bouwman, Y. A. L. Pijnenburg, J. F. Boer, J. H. P. de Kwisthout, L. Bagheriye, P. J. González

    Abstract: Background/Aim. This study aims to predict Amyloid Positron Emission Tomography (AmyloidPET) status with multimodal retinal imaging and convolutional neural networks (CNNs) and to improve the performance through pretraining with synthetic data. Methods. Fundus autofluorescence, optical coherence tomography (OCT), and OCT angiography images from 328 eyes of 59 AmyloidPET positive subjects and 108 A…

    Submitted 26 June, 2024; originally announced June 2024.

  4. arXiv:2406.11939  [pdf, other]

    cs.LG cs.AI cs.CL

    From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

    Authors: Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica

    Abstract: The rapid evolution of language models has necessitated the development of more challenging benchmarks. Current static benchmarks often struggle to consistently distinguish between the capabilities of different models and fail to align with real-world user preferences. On the other hand, live crowd-sourced platforms like the Chatbot Arena collect a wide range of natural prompts and user feedback.…

    Submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2406.09579  [pdf, other]

    cs.HC

    Hovering Over the Key to Text Input in XR

    Authors: Mar Gonzalez-Franco, Diar Abdlkarim, Arpit Bhatia, Stuart Macgregor, Jason Alexander Fotso-Puepi, Eric J Gonzalez, Hasti Seifi, Massimiliano Di Luca, Karan Ahuja

    Abstract: Virtual, Mixed, and Augmented Reality (XR) technologies hold immense potential for transforming productivity beyond PC. Therefore there is a critical need for improved text input solutions for XR. However, achieving efficient text input in these environments remains a significant challenge. This paper examines the current landscape of XR text input techniques, focusing on the importance of keyboar…

    Submitted 13 June, 2024; originally announced June 2024.

  6. arXiv:2406.04271  [pdf, other]

    cs.CL

    Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

    Authors: Ling Yang, Zhaochen Yu, Tianjun Zhang, Shiyi Cao, Minkai Xu, Wentao Zhang, Joseph E. Gonzalez, Bin Cui

    Abstract: We introduce Buffer of Thoughts (BoT), a novel and versatile thought-augmented reasoning approach for enhancing accuracy, efficiency and robustness of large language models (LLMs). Specifically, we propose meta-buffer to store a series of informative high-level thoughts, namely thought-template, distilled from the problem-solving processes across various tasks. Then for each problem, we retrieve a…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Project: https://github.com/YangLing0818/buffer-of-thought-llm

  7. arXiv:2406.03636  [pdf, other]

    cs.PL cs.LG

    Synthetic Programming Elicitation and Repair for Text-to-Code in Very Low-Resource Programming Languages

    Authors: Federico Mora, Justin Wong, Haley Lepe, Sahil Bhatia, Karim Elmaaroufi, George Varghese, Joseph E. Gonzalez, Elizabeth Polgreen, Sanjit A. Seshia

    Abstract: Recent advances in large language models (LLMs) for code applications have demonstrated remarkable zero-shot fluency and instruction following on challenging code related tasks ranging from test case generation to self-repair. Unsurprisingly, however, models struggle to compose syntactically valid programs in programming languages unrepresented in pre-training, referred to as very low-resource Pro…

    Submitted 29 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: 15 pages, 6 figures, 1 table

  8. arXiv:2404.18928  [pdf, other]

    cs.CV cs.AI cs.CL cs.GR cs.LG

    Stylus: Automatic Adapter Selection for Diffusion Models

    Authors: Michael Luo, Justin Wong, Brandon Trabucco, Yanping Huang, Joseph E. Gonzalez, Zhifeng Chen, Ruslan Salakhutdinov, Ion Stoica

    Abstract: Beyond scaling base models with more data or parameters, fine-tuned adapters provide an alternative way to generate high fidelity, custom images at reduced costs. As such, adapters have been widely adopted by open-source communities, accumulating a database of over 100K adapters-most of which are highly customized with insufficient descriptions. This paper explores the problem of matching the prom…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Project Website: https://stylus-diffusion.github.io

  9. arXiv:2404.13274  [pdf, other]

    cs.HC cs.AI

    Augmented Object Intelligence: Making the Analog World Interactable with XR-Objects

    Authors: Mustafa Doga Dogan, Eric J. Gonzalez, Andrea Colaco, Karan Ahuja, Ruofei Du, Johnny Lee, Mar Gonzalez-Franco, David Kim

    Abstract: Seamless integration of physical objects as interactive digital entities remains a challenge for spatial computing. This paper introduces Augmented Object Intelligence (AOI), a novel XR interaction paradigm designed to blur the lines between digital and physical by equipping real-world objects with the ability to interact as if they were digital, where every object has the potential to serve as a…

    Submitted 22 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  10. arXiv:2404.08523  [pdf, other]

    cs.LG cs.AI

    Advancing Forest Fire Prevention: Deep Reinforcement Learning for Effective Firebreak Placement

    Authors: Lucas Murray, Tatiana Castillo, Jaime Carrasco, Andrés Weintraub, Richard Weber, Isaac Martín de Diego, José Ramón González, Jordi García-Gonzalo

    Abstract: Over the past decades, the increase in both frequency and intensity of large-scale wildfires due to climate change has emerged as a significant natural threat. The pressing need to design resilient landscapes capable of withstanding such disasters has become paramount, requiring the development of advanced decision-support tools. Existing methodologies, including Mixed Integer Programming, Stochas…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 20 pages, 15 figures

  11. arXiv:2404.07979  [pdf, other]

    cs.CL cs.AI cs.LG

    LLoCO: Learning Long Contexts Offline

    Authors: Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca Ada Popa

    Abstract: Processing long contexts remains a challenge for large language models (LLMs) due to the quadratic computational and memory overhead of the self-attention mechanism and the substantial KV cache sizes during generation. We propose a novel approach to address this problem by learning contexts offline through context compression and in-domain parameter-efficient finetuning. Our method enables an LLM…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: The first two authors contributed equally to this work

  12. arXiv:2404.06921  [pdf, other]

    cs.CL cs.AI

    GoEX: Perspectives and Designs Towards a Runtime for Autonomous LLM Applications

    Authors: Shishir G. Patil, Tianjun Zhang, Vivian Fang, Noppapon C., Roy Huang, Aaron Hao, Martin Casado, Joseph E. Gonzalez, Raluca Ada Popa, Ion Stoica

    Abstract: Large Language Models (LLMs) are evolving beyond their classical role of providing information within dialogue systems to actively engaging with tools and performing actions on real-world applications and services. Today, humans verify the correctness and appropriateness of the LLM-generated outputs (e.g., code, functions, or actions) before putting them into real-world execution. This poses signi…

    Submitted 10 April, 2024; originally announced April 2024.

  13. arXiv:2404.02904  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    ALOHa: A New Measure for Hallucination in Captioning Models

    Authors: Suzanne Petryk, David M. Chan, Anish Kachinthaya, Haodi Zou, John Canny, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Despite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The existing prominent metric for object hallucination, CHAIR, is limited to a fixed set of MS COCO objects and synonyms. In this work, we propose a modernized open-vocabulary metric, ALOHa, which leverage…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: To appear at NAACL 2024

  14. arXiv:2403.10131  [pdf, other]

    cs.CL cs.AI

    RAFT: Adapting Language Model to Domain Specific RAG

    Authors: Tianjun Zhang, Shishir G. Patil, Naman Jain, Sheng Shen, Matei Zaharia, Ion Stoica, Joseph E. Gonzalez

    Abstract: Pretraining Large Language Models (LLMs) on large corpora of textual data is now a standard paradigm. When using these LLMs for many downstream applications, it is common to additionally bake in new knowledge (e.g., time-critical news, or private domain knowledge) into the pretrained model either through RAG-based-prompting, or fine-tuning. However, the optimal methodology for the model to gain su…

    Submitted 5 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  15. arXiv:2403.05821  [pdf, other]

    cs.LG cs.DB

    Optimizing LLM Queries in Relational Workloads

    Authors: Shu Liu, Asim Biswal, Audrey Cheng, Xiangxi Mo, Shiyi Cao, Joseph E. Gonzalez, Ion Stoica, Matei Zaharia

    Abstract: Analytical database providers (e.g., Redshift, Databricks, BigQuery) have rapidly added support for invoking Large Language Models (LLMs) through native user-defined functions (UDFs) to help users perform natural language tasks, such as classification, entity extraction, and translation, inside analytical workloads. For instance, an analyst might want to extract customer sentiments on millions of…

    Submitted 9 March, 2024; originally announced March 2024.

  16. arXiv:2403.04132  [pdf, other]

    cs.AI cs.CL

    Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

    Authors: Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica

    Abstract: Large Language Models (LLMs) have unlocked new capabilities and applications; however, evaluating the alignment with human preferences still poses significant challenges. To address this issue, we introduce Chatbot Arena, an open platform for evaluating LLMs based on human preferences. Our methodology employs a pairwise comparison approach and leverages input from a diverse user base through crowd…

    Submitted 6 March, 2024; originally announced March 2024.

  17. arXiv:2402.15243  [pdf, other]

    cs.RO eess.SY

    Safety-Conscious Pushing on Diverse Oriented Surfaces with Underactuated Aerial Vehicles

    Authors: Tong Hui, Manuel J. Fernandez Gonzalez, Matteo Fumagalli

    Abstract: Pushing tasks performed by aerial manipulators can be used for contact-based industrial inspections. Underactuated aerial vehicles are widely employed in aerial manipulation due to their widespread availability and relatively low cost. Industrial infrastructures often consist of diverse oriented work surfaces. When interacting with such surfaces, the coupled gravity compensation and interaction fo…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to the 2024 IEEE International Conference on Robotics and Automation (ICRA2024)

  18. arXiv:2401.18075  [pdf, other]

    cs.CV

    CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting

    Authors: Jiezhi Yang, Khushi Desai, Charles Packer, Harshil Bhatia, Nicholas Rhinehart, Rowan McAllister, Joseph Gonzalez

    Abstract: We propose CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting, a method for predicting future 3D scenes given past observations, such as 2D ego-centric images. Our method maps an image to a distribution over plausible 3D latent scene configurations using a probabilistic encoder, and predicts the evolution of the hypothesized scenes through time. Our latent scene representation…

    Submitted 31 January, 2024; originally announced January 2024.

  19. arXiv:2401.03946  [pdf, other]

    cs.CL

    TextMachina: Seamless Generation of Machine-Generated Text Datasets

    Authors: Areg Mikael Sarvazyan, José Ángel González, Marc Franco-Salvador

    Abstract: Recent advancements in Large Language Models (LLMs) have led to high-quality Machine-Generated Text (MGT), giving rise to countless new use cases and applications. However, easy access to LLMs is posing new challenges due to misuse. To address malicious usage, researchers have released datasets to effectively train models on MGT-related tasks. Similar strategies are used to compile these datasets,…

    Submitted 12 April, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: 14 pages, 10 figures

  20. arXiv:2401.00588  [pdf, other]

    cs.AI cs.LG cs.PF

    Fairness in Serving Large Language Models

    Authors: Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica

    Abstract: High-demand LLM inference services (e.g., ChatGPT and BARD) support a wide range of requests from short chat conversations to long document reading. To ensure that all client requests are processed fairly, most major LLM inference services have request rate limits, to ensure that no client can dominate the request queue. However, this rudimentary notion of fairness also results in under-utilizatio…

    Submitted 5 June, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  21. arXiv:2312.08366  [pdf, other]

    cs.CV

    See, Say, and Segment: Teaching LMMs to Overcome False Premises

    Authors: Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image. We observe that existing methods that fine-tune an LMM to segment images significantly degrade their ability to reliably determine ("see") if an obje…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Project Page: https://see-say-segment.github.io

  22. arXiv:2312.07104  [pdf, other]

    cs.AI cs.PL

    SGLang: Efficient Execution of Structured Language Model Programs

    Authors: Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark Barrett, Ying Sheng

    Abstract: Large language models (LLMs) are increasingly used for complex tasks that require multiple generation calls, advanced prompting techniques, control flow, and structured inputs/outputs. However, efficient systems are lacking for programming and executing these applications. We introduce SGLang, a system for efficient execution of complex language model programs. SGLang consists of a frontend langua…

    Submitted 5 June, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  23. arXiv:2312.02974  [pdf, other]

    cs.CV cs.CL cs.CY cs.LG

    Describing Differences in Image Sets with Natural Language

    Authors: Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy

    Abstract: How do two sets of images differ? Discerning set-level differences is crucial for understanding model behaviors and analyzing datasets, yet manually sifting through thousands of images is impractical. To aid in this discovery process, we explore the task of automatically describing the differences between two $\textbf{sets}$ of images, which we term Set Difference Captioning. This task takes in im…

    Submitted 26 April, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Oral

  24. arXiv:2311.16090  [pdf, other]

    cs.CV

    Self-correcting LLM-controlled Diffusion Models

    Authors: Tsung-Han Wu, Long Lian, Joseph E. Gonzalez, Boyi Li, Trevor Darrell

    Abstract: Text-to-image generation has witnessed significant progress with the advent of diffusion models. Despite the ability to generate photorealistic images, current text-to-image diffusion models still often struggle to accurately interpret and follow complex input text prompts. In contrast to existing models that aim to generate images only with their best effort, we introduce Self-correcting LLM-cont…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 16 pages, 10 figures

  25. arXiv:2311.14904  [pdf, other]

    cs.LG cs.SE

    LLM-Assisted Code Cleaning For Training Accurate Code Generators

    Authors: Naman Jain, Tianjun Zhang, Wei-Lin Chiang, Joseph E. Gonzalez, Koushik Sen, Ion Stoica

    Abstract: Natural language to code generation is an important application area of LLMs and has received wide attention from the community. The majority of relevant studies have exclusively concentrated on increasing the quantity and functional correctness of training sets while disregarding other stylistic elements of programs. More recently, data quality has garnered a lot of interest and multiple works ha…

    Submitted 24 November, 2023; originally announced November 2023.

  26. arXiv:2311.04850  [pdf, other]

    cs.CL cs.AI

    Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

    Authors: Shuo Yang, Wei-Lin Chiang, Lianmin Zheng, Joseph E. Gonzalez, Ion Stoica

    Abstract: Large language models are increasingly trained on all the data ever produced by humans. Many have raised concerns about the trustworthiness of public benchmarks due to potential contamination in pre-training or fine-tuning datasets. While most data decontamination efforts apply string matching (e.g., n-gram overlap) to remove benchmark data, we show that these methods are insufficient, and simple…

    Submitted 11 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

  27. arXiv:2311.03285  [pdf, other]

    cs.LG cs.AI cs.DC

    S-LoRA: Serving Thousands of Concurrent LoRA Adapters

    Authors: Ying Sheng, Shiyi Cao, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E. Gonzalez, Ion Stoica

    Abstract: The "pretrain-then-finetune" paradigm is commonly adopted in the deployment of large language models. Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method, is often employed to adapt a base model to a multitude of tasks, resulting in a substantial collection of LoRA adapters derived from one base model. We observe that this paradigm presents significant opportunities for batched in…

    Submitted 5 June, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

  28. arXiv:2311.03033  [pdf, ps, other]

    cs.LG cs.AI

    Beyond Words: A Mathematical Framework for Interpreting Large Language Models

    Authors: Javier González, Aditya V. Nori

    Abstract: Large language models (LLMs) are powerful AI tools that can generate and comprehend natural language text and other complex information. However, the field lacks a mathematical framework to systematically describe, compare and improve LLMs. We propose Hex a framework that clarifies key terms and concepts in LLM research, such as hallucinations, alignment, self-verification and chain-of-thought rea…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 4 figures, 18 pages

  29. arXiv:2311.02700  [pdf, other]

    cs.CV

    A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping

    Authors: Hunor Laczkó, Meysam Madadi, Sergio Escalera, Jordi Gonzalez

    Abstract: RGB cloth generation has been deeply studied in the related literature, however, 3D garment generation remains an open problem. In this paper, we build a conditional variational autoencoder for 3D garment generation and draping. We propose a pyramid network to add garment details progressively in a canonical space, i.e. unposing and unshaping the garments w.r.t. the body. We study conditioning the…

    Submitted 15 January, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: WACV24, IEEE copyright

  30. arXiv:2311.01491  [pdf, other]

    physics.chem-ph cond-mat.mtrl-sci cs.LG physics.comp-ph

    Investigating the Behavior of Diffusion Models for Accelerating Electronic Structure Calculations

    Authors: Daniel Rothchild, Andrew S. Rosen, Eric Taw, Connie Robinson, Joseph E. Gonzalez, Aditi S. Krishnapriyan

    Abstract: We present an investigation into diffusion models for molecular generation, with the aim of better understanding how their predictions compare to the results of physics-based calculations. The investigation into these models is driven by their potential to significantly accelerate electronic structure calculations using machine learning, without requiring expensive first-principles datasets for tr…

    Submitted 2 November, 2023; originally announced November 2023.

  31. arXiv:2311.01301  [pdf, other]

    cs.LG cs.AI stat.ME

    TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence Generation with Biomedical Language Models

    Authors: Javier González, Cliff Wong, Zelalem Gero, Jass Bagga, Risa Ueno, Isabel Chien, Eduard Oravkin, Emre Kiciman, Aditya Nori, Roshanthi Weerasinghe, Rom S. Leidner, Brian Piening, Tristan Naumann, Carlo Bifulco, Hoifung Poon

    Abstract: The rapid digitization of real-world data offers an unprecedented opportunity for optimizing healthcare delivery and accelerating biomedical discovery. In practice, however, such data is most abundantly available in unstructured forms, such as clinical notes in electronic medical records (EMRs), and it is generally plagued by confounders. In this paper, we present TRIALSCOPE, a unifying framework…

    Submitted 6 November, 2023; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: 6 Figures, 22 Pages, 3 Tables

  32. arXiv:2310.12971  [pdf, other]

    cs.CV cs.AI cs.CL

    CLAIR: Evaluating Image Captions with Large Language Models

    Authors: David Chan, Suzanne Petryk, Joseph E. Gonzalez, Trevor Darrell, John Canny

    Abstract: The evaluation of machine-generated image captions poses an interesting yet persistent challenge. Effective evaluation measures must consider numerous dimensions of similarity, including semantic relevance, visual structure, object interactions, caption diversity, and specificity. Existing highly-engineered measures attempt to capture specific aspects, but fall short in providing a holistic score…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: To Appear at EMNLP 2023

  33. arXiv:2310.08560  [pdf, other]

    cs.AI

    MemGPT: Towards LLMs as Operating Systems

    Authors: Charles Packer, Sarah Wooders, Kevin Lin, Vivian Fang, Shishir G. Patil, Ion Stoica, Joseph E. Gonzalez

    Abstract: Large language models (LLMs) have revolutionized AI, but are constrained by limited context windows, hindering their utility in tasks like extended conversations and document analysis. To enable using context beyond limited context windows, we propose virtual context management, a technique drawing inspiration from hierarchical memory systems in traditional operating systems that provide the appea…

    Submitted 12 February, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Code and data available at https://research.memgpt.ai

  34. arXiv:2310.07898  [pdf, other]

    cs.SE cs.DB

    FlorDB: Multiversion Hindsight Logging for Continuous Training

    Authors: Rolando Garcia, Anusha Dandamudi, Gabriel Matute, Lehan Wan, Joseph Gonzalez, Joseph M. Hellerstein, Koushik Sen

    Abstract: Production Machine Learning involves continuous training: hosting multiple versions of models over time, often with many model versions running at once. When model performance does not meet expectations, Machine Learning Engineers (MLEs) debug issues by exploring and analyzing numerous prior versions of code and training data to identify root causes and mitigate problems. Traditional debugging and…

    Submitted 2 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  35. arXiv:2310.03294  [pdf, other]

    cs.LG cs.AI cs.DC

    DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training

    Authors: Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Xuezhe Ma, Ion Stoica, Joseph E. Gonzalez, Hao Zhang

    Abstract: FlashAttention (Dao, 2023) effectively reduces the quadratic peak memory usage to linear in training transformer-based large language models (LLMs) on a single GPU. In this paper, we introduce DISTFLASHATTN, a distributed memory-efficient attention mechanism optimized for long-context LLMs training. We propose three key techniques: token-level workload balancing, overlapping key-value communicatio…

    Submitted 31 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

  36. arXiv:2309.13188  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Masked Discriminators for Content-Consistent Unpaired Image-to-Image Translation

    Authors: Bonifaz Stuhr, Jürgen Brauer, Bernhard Schick, Jordi Gonzàlez

    Abstract: A common goal of unpaired image-to-image translation is to preserve content consistency between source images and translated images while mimicking the style of the target domain. Due to biases between the datasets of both domains, many methods suffer from inconsistencies caused by the translation process. Most approaches introduced to mitigate these inconsistencies do not constrain the discrimina…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 24 pages, 22 figures, under review

    ACM Class: I.2; I.3; I.4; I.5; I.6

  37. arXiv:2309.11998  [pdf, other]

    cs.CL cs.AI

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    Authors: Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang

    Abstract: Studying how people interact with large language models (LLMs) in real-world scenarios is increasingly important due to their widespread use in various applications. In this paper, we introduce LMSYS-Chat-1M, a large-scale dataset containing one million real-world conversations with 25 state-of-the-art LLMs. This dataset is collected from 210K unique IP addresses in the wild on our Vicuna demo and…

    Submitted 10 March, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  38. arXiv:2309.11285  [pdf, other]

    cs.CL cs.AI cs.LG

    Overview of AuTexTification at IberLEF 2023: Detection and Attribution of Machine-Generated Text in Multiple Domains

    Authors: Areg Mikael Sarvazyan, José Ángel González, Marc Franco-Salvador, Francisco Rangel, Berta Chulvi, Paolo Rosso

    Abstract: This paper presents the overview of the AuTexTification shared task as part of the IberLEF 2023 Workshop in Iberian Languages Evaluation Forum, within the framework of the SEPLN 2023 conference. AuTexTification consists of two subtasks: for Subtask 1, participants had to determine whether a text is human-authored or has been generated by a large language model. For Subtask 2, participants had to a…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted at SEPLN 2023

    Journal ref: Procesamiento del Lenguaje Natural, [S.l.], v. 71, p. 275-288, sep. 2023

  39. arXiv:2309.09476  [pdf, other]

    cs.AI cs.LG

    Mechanic Maker 2.0: Reinforcement Learning for Evaluating Generated Rules

    Authors: Johor Jara Gonzalez, Seth Cooper, Matthew Guzdial

    Abstract: Automated game design (AGD), the study of automatically generating game rules, has a long history in technical games research. AGD approaches generally rely on approximations of human play, either objective functions or AI agents. Despite this, the majority of these approximators are static, meaning they do not reflect human player's ability to learn and improve in a game. In this paper, we invest…

    Submitted 4 October, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 10 pages, 6 figures, Artificial Intelligence and Interactive Digital Entertainment

  40. arXiv:2309.09472  [pdf, other]

    cs.CV cs.LG

    Reconstructing Existing Levels through Level Inpainting

    Authors: Johor Jara Gonzalez, Matthew Guzdial

    Abstract: Procedural Content Generation (PCG) and Procedural Content Generation via Machine Learning (PCGML) have been used in prior work for generating levels in various games. This paper introduces Content Augmentation and focuses on the subproblem of level inpainting, which involves reconstructing and extending video game levels. Drawing inspiration from image inpainting, we adapt two techniques from thi…

    Submitted 4 October, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: 8 pages, 5 figures, Artificial Intelligence and Interactive Digital Entertainment

  41. arXiv:2309.06180  [pdf, other]

    cs.LG cs.DC

    Efficient Memory Management for Large Language Model Serving with PagedAttention

    Authors: Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica

    Abstract: High throughput serving of large language models (LLMs) requires batching sufficiently many requests at a time. However, existing systems struggle because the key-value cache (KV cache) memory for each request is huge and grows and shrinks dynamically. When managed inefficiently, this memory can be significantly wasted by fragmentation and redundant duplication, limiting the batch size. To address…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: SOSP 2023

  42. arXiv:2308.03204  [pdf, other]

    cs.RO

    Leveraging Cloud Computing to Make Autonomous Vehicles Safer

    Authors: Peter Schafhalter, Sukrit Kalra, Le Xu, Joseph E. Gonzalez, Ion Stoica

    Abstract: The safety of autonomous vehicles (AVs) depends on their ability to perform complex computations on high-volume sensor data in a timely manner. Their ability to run these computations with state-of-the-art models is limited by the processing power and slow update cycles of their onboard hardware. In contrast, cloud computing offers the ability to burst computation to vast amounts of the latest gen…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: IROS 2023 (to appear); 8 pages, 7 figures, 2 tables

  43. arXiv:2307.04427  [pdf, other]

    astro-ph.HE astro-ph.GA cs.LG

    Observation of high-energy neutrinos from the Galactic plane

    Authors: R. Abbasi, M. Ackermann, J. Adams, J. A. Aguilar, M. Ahlers, M. Ahrens, J. M. Alameddine, A. A. Alves Jr., N. M. Amin, K. Andeen, T. Anderson, G. Anton, C. Argüelles, Y. Ashida, S. Athanasiadou, S. Axani, X. Bai, A. Balagopal V., S. W. Barwick, V. Basu, S. Baur, R. Bay, J. J. Beatty, K. -H. Becker, J. Becker Tjus, et al. (364 additional authors not shown)

    Abstract: The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrin…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: Submitted on May 12th, 2022; Accepted on May 4th, 2023

    Journal ref: Science 380, 6652, 1338-1343 (2023)

  44. arXiv:2306.05685  [pdf, other]

    cs.CL cs.AI

    Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

    Authors: Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, Ion Stoica

    Abstract: Evaluating large language model (LLM) based chat assistants is challenging due to their broad capabilities and the inadequacy of existing benchmarks in measuring human preferences. To address this, we explore using strong LLMs as judges to evaluate these models on more open-ended questions. We examine the usage and limitations of LLM-as-a-judge, including position, verbosity, and self-enhancement…

    Submitted 23 December, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  45. arXiv:2305.16289  [pdf, other]

    cs.CV cs.AI

    Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation

    Authors: Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Many fine-grained classification tasks, like rare animal identification, have limited training data and consequently classifiers trained on these datasets often fail to generalize to variations in the domain like changes in weather or location. As such, we explore how natural language descriptions of the domains seen in training data can be used with large vision models trained on diverse pretrain…

    Submitted 29 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Update: replaced Planes dataset with Waterbirds & updated results after bug fix

  46. arXiv:2305.15334  [pdf, other]

    cs.CL cs.AI

    Gorilla: Large Language Model Connected with Massive APIs

    Authors: Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez

    Abstract: Large Language Models (LLMs) have seen an impressive wave of advances recently, with models now excelling in a variety of tasks, such as mathematical reasoning and program synthesis. However, their potential to effectively use tools via API calls remains unfulfilled. This is a challenging task even for today's state-of-the-art LLMs such as GPT-4, largely due to their inability to generate accurate…

    Submitted 24 May, 2023; originally announced May 2023.

  47. arXiv:2305.15053  [pdf, other]

    cs.CL cs.IR

    Decomposing Complex Queries for Tip-of-the-tongue Retrieval

    Authors: Kevin Lin, Kyle Lo, Joseph E. Gonzalez, Dan Klein

    Abstract: When re-finding items, users who forget or are uncertain about identifying details often rely on creative strategies for expressing their information needs -- complex queries that describe content elements (e.g., book characters or events), information beyond the document text (e.g., descriptions of book covers), or personal context (e.g., when they read a book). This retrieval setting, called tip…

    Submitted 24 May, 2023; originally announced May 2023.

  48. arXiv:2305.07021  [pdf, other]

    cs.CV

    Simple Token-Level Confidence Improves Caption Correctness

    Authors: Suzanne Petryk, Spencer Whitehead, Joseph E. Gonzalez, Trevor Darrell, Anna Rohrbach, Marcus Rohrbach

    Abstract: The ability to judge whether a caption correctly describes an image is a critical part of vision-language understanding. However, state-of-the-art models often misinterpret the correctness of fine-grained details, leading to errors in outputs such as hallucinating objects in generated captions or poor compositional reasoning. In this work, we explore Token-Level Confidence, or TLC, as a simple yet…

    Submitted 11 May, 2023; originally announced May 2023.

  49. arXiv:2304.13737  [pdf, other]

    q-bio.QM cs.LG

    AIRIVA: A Deep Generative Model of Adaptive Immune Repertoires

    Authors: Melanie F. Pradier, Niranjani Prasad, Paidamoyo Chapfuwa, Sahra Ghalebikesabi, Max Ilse, Steven Woodhouse, Rebecca Elyanow, Javier Zazo, Javier Gonzalez, Julia Greissl, Edward Meeds

    Abstract: Recent advances in immunomics have shown that T-cell receptor (TCR) signatures can accurately predict active or recent infection by leveraging the high specificity of TCR binding to disease antigens. However, the extreme diversity of the adaptive immune repertoire presents challenges in reliably identifying disease-specific TCRs. Population genetics and sequencing depth can also have strong system…

    Submitted 26 April, 2023; originally announced April 2023.

  50. arXiv:2303.06865  [pdf, other]

    cs.LG cs.AI cs.PF

    FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

    Authors: Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang

    Abstract: The high computational and memory requirements of large language model (LLM) inference make it feasible only with multiple high-end accelerators. Motivated by the emerging demand for latency-insensitive tasks with batched processing, this paper initiates the study of high-throughput LLM inference using limited resources, such as a single commodity GPU. We present FlexGen, a high-throughput generat…

    Submitted 12 June, 2023; v1 submitted 13 March, 2023; originally announced March 2023.