Showing 1–25 of 25 results for author: Khattab, O

  1. arXiv:2406.11706  [pdf, other]

    cs.IR cs.CL cs.LG

    Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels

    Authors: Jasper Xian, Saron Samuel, Faraz Khoubsirat, Ronak Pradeep, Md Arafat Sultan, Radu Florian, Salim Roukos, Avirup Sil, Christopher Potts, Omar Khattab

    Abstract: We develop a method for training small-scale (under 100M parameter) neural information retrieval models with as few as 10 gold relevance labels. The method depends on generating synthetic queries for documents using a language model (LM), and the key step is that we automatically optimize the LM prompt that is used to generate these queries based on training quality. In experiments with the BIRCO…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2406.11695  [pdf, other]

    cs.CL cs.AI cs.LG

    Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs

    Authors: Krista Opsahl-Ong, Michael J Ryan, Josh Purtell, David Broman, Christopher Potts, Matei Zaharia, Omar Khattab

    Abstract: Language Model Programs, i.e. sophisticated pipelines of modular language model (LM) calls, are increasingly advancing NLP tasks, but they require crafting prompts that are jointly effective for all modules. We study prompt optimization for LM programs, i.e. how to update these prompts to maximize a downstream metric without access to module-level labels or gradients. To make this tractable, we fa…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Krista and Michael contributed equally to this work

  3. arXiv:2403.03956  [pdf, other]

    cs.IR cs.CL

    Backtracing: Retrieving the Cause of the Query

    Authors: Rose E. Wang, Pawan Wirawarn, Omar Khattab, Noah Goodman, Dorottya Demszky

    Abstract: Many online content portals allow users to ask questions to supplement their understanding (e.g., of lectures). While information retrieval (IR) systems may provide answers for such user queries, they do not directly assist content creators -- such as lecturers who want to improve their content -- identify segments that _caused_ a user to ask those questions. We introduce the task of backtracing,…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Code: https://github.com/rosewang2008/backtracing; EACL 2024 Findings, Long Paper

  4. arXiv:2402.14207  [pdf, other]

    cs.CL cs.AI

    Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

    Authors: Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, Monica S. Lam

    Abstract: We study how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages. This underexplored problem poses new challenges at the pre-writing stage, including how to research the topic and prepare an outline prior to writing. We propose STORM, a writing system for the Synthesis of Topic Outlines through Retriev…

    Submitted 8 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 27 pages, NAACL 2024 Main Conference

  5. arXiv:2401.12178  [pdf, other]

    cs.CL cs.AI

    In-Context Learning for Extreme Multi-Label Classification

    Authors: Karel D'Oosterlinck, Omar Khattab, François Remy, Thomas Demeester, Chris Develder, Christopher Potts

    Abstract: Multi-label classification problems with thousands of classes are hard to solve with in-context learning alone, as language models (LMs) might lack prior knowledge about the precise classes or how to assign them, and it is generally infeasible to demonstrate every class in a prompt. We propose a general program, $\texttt{Infer--Retrieve--Rank}$, that defines multi-step interactions between LMs and…

    Submitted 22 January, 2024; originally announced January 2024.

  6. arXiv:2401.03590  [pdf, other]

    cs.CL

    Building Efficient and Effective OpenQA Systems for Low-Resource Languages

    Authors: Emrah Budur, Rıza Özçelik, Dilara Soylu, Omar Khattab, Tunga Güngör, Christopher Potts

    Abstract: Question answering (QA) is the task of answering questions posed in natural language with free-form natural language answers extracted from a given passage. In the OpenQA variant, only a question text is given, and the system must retrieve relevant passages from an unstructured knowledge source and use them to provide answers, which is the case in the mainstream QA systems on the Web. QA systems c…

    Submitted 4 June, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  7. arXiv:2312.13382  [pdf, ps, other]

    cs.CL cs.AI cs.PL

    DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines

    Authors: Arnav Singhvi, Manish Shetty, Shangyin Tan, Christopher Potts, Koushik Sen, Matei Zaharia, Omar Khattab

    Abstract: Chaining language model (LM) calls as composable modules is fueling a new way of programming, but ensuring LMs adhere to important constraints requires heuristic "prompt engineering". We introduce LM Assertions, a programming construct for expressing computational constraints that LMs should satisfy. We integrate our constructs into the recent DSPy programming model for LMs, and present new strate…

    Submitted 2 February, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Arnav*, Manish*, Shangyin* contributed equally to this work

  8. arXiv:2312.05468  [pdf]

    cs.AI cond-mat.mtrl-sci cs.CV cs.IR

    Image and Data Mining in Reticular Chemistry Using GPT-4V

    Authors: Zhiling Zheng, Zhiguo He, Omar Khattab, Nakul Rampal, Matei A. Zaharia, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi

    Abstract: The integration of artificial intelligence into scientific research has reached a new pinnacle with GPT-4V, a large language model featuring enhanced vision capabilities, accessible through ChatGPT or an API. This study demonstrates the remarkable ability of GPT-4V to navigate and obtain complex data for metal-organic frameworks, especially from graphical sources. Our approach involved an automate…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 36 pages, 24 figures

  9. arXiv:2311.09476  [pdf, other]

    cs.CL cs.AI cs.IR

    ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems

    Authors: Jon Saad-Falcon, Omar Khattab, Christopher Potts, Matei Zaharia

    Abstract: Evaluating retrieval-augmented generation (RAG) systems traditionally relies on hand annotations for input queries, passages to retrieve, and responses to generate. We introduce ARES, an Automated RAG Evaluation System, for evaluating RAG systems along the dimensions of context relevance, answer faithfulness, and answer relevance. By creating its own synthetic training data, ARES finetunes lightwe…

    Submitted 31 March, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  10. arXiv:2310.03714  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

    Authors: Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, Christopher Potts

    Abstract: The ML community is rapidly exploring techniques for prompting language models (LMs) and for stacking them into pipelines that solve complex tasks. Unfortunately, existing LM pipelines are typically implemented using hard-coded "prompt templates", i.e. lengthy strings discovered via trial and error. Toward a more systematic approach for developing and optimizing LM pipelines, we introduce DSPy, a…

    Submitted 5 October, 2023; originally announced October 2023.

  11. arXiv:2306.12601  [pdf, other]

    cs.IR cs.AI

    Resources and Evaluations for Multi-Distribution Dense Information Retrieval

    Authors: Soumya Chatterjee, Omar Khattab, Simran Arora

    Abstract: We introduce and define the novel problem of multi-distribution information retrieval (IR) where given a query, systems need to retrieve passages from within multiple collections, each drawn from a different distribution. Some of these collections and distributions might not be available at training time. To evaluate methods for multi-distribution retrieval, we design three benchmarks for this tas…

    Submitted 21 June, 2023; originally announced June 2023.

    Comments: REML @ SIGIR 2023; 9 pages, 8 figures

  12. arXiv:2303.00807  [pdf, other]

    cs.IR cs.CL

    UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers

    Authors: Jon Saad-Falcon, Omar Khattab, Keshav Santhanam, Radu Florian, Martin Franz, Salim Roukos, Avirup Sil, Md Arafat Sultan, Christopher Potts

    Abstract: Many information retrieval tasks require large labeled datasets for fine-tuning. However, such datasets are often unavailable, and their utility for real-world applications can diminish quickly due to domain shifts. To address this challenge, we develop and motivate a method for using large language models (LLMs) to generate large numbers of synthetic queries cheaply. The method begins by generati…

    Submitted 13 October, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: Long Paper at Empirical Methods in Natural Language Processing (EMNLP) 2023

  13. arXiv:2212.14024  [pdf, other]

    cs.CL cs.IR

    Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

    Authors: Omar Khattab, Keshav Santhanam, Xiang Lisa Li, David Hall, Percy Liang, Christopher Potts, Matei Zaharia

    Abstract: Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM). Existing work has combined these in simple "retrieve-then-read" pipelines in which the RM retrieves passages that are inserted into the LM prompt. To begin to fully realize the potential of frozen LMs and RMs, we propose De…

    Submitted 23 January, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

  14. arXiv:2212.01340  [pdf, other]

    cs.IR cs.CL

    Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking

    Authors: Keshav Santhanam, Jon Saad-Falcon, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts

    Abstract: Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks. Unfortunately, some dimensions of this progress are illusory: the majority of the popular IR benchmarks today focus exclusively on downstream task accuracy and thus conceal the costs incurred by systems that trade away efficiency for quality.…

    Submitted 2 December, 2022; originally announced December 2022.

  15. arXiv:2211.09110  [pdf, other]

    cs.CL cs.AI cs.LG

    Holistic Evaluation of Language Models

    Authors: Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, et al. (25 additional authors not shown)

    Abstract: Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest fo…

    Submitted 1 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Project page: https://crfm.stanford.edu/helm/v1.0

    Journal ref: Published in Transactions on Machine Learning Research (TMLR), 2023

  16. arXiv:2205.09707  [pdf, other]

    cs.IR cs.CL

    PLAID: An Efficient Engine for Late Interaction Retrieval

    Authors: Keshav Santhanam, Omar Khattab, Christopher Potts, Matei Zaharia

    Abstract: Pre-trained language models are increasingly important components across multiple information retrieval (IR) paradigms. Late interaction, introduced with the ColBERT model and recently refined in ColBERTv2, is a popular paradigm that holds state-of-the-art status across many benchmarks. To dramatically speed up the search latency of late interaction, we introduce the Performance-optimized Late Int…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: Preprint. Omar and Keshav contributed equally to this work

  17. arXiv:2203.13088  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Introducing Neural Bag of Whole-Words with ColBERTer: Contextualized Late Interactions using Enhanced Reduction

    Authors: Sebastian Hofstätter, Omar Khattab, Sophia Althammer, Mete Sertkan, Allan Hanbury

    Abstract: Recent progress in neural information retrieval has demonstrated large gains in effectiveness, while often sacrificing the efficiency and interpretability of the neural model compared to classical approaches. This paper proposes ColBERTer, a neural retrieval model using contextualized late interaction (ColBERT) with enhanced reduction. Along the effectiveness Pareto frontier, ColBERTer's reduction…

    Submitted 24 March, 2022; originally announced March 2022.

  18. arXiv:2112.01488  [pdf, other]

    cs.IR cs.CL

    ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction

    Authors: Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, Christopher Potts, Matei Zaharia

    Abstract: Neural information retrieval (IR) has greatly advanced search and other knowledge-intensive language tasks. While many neural IR methods encode queries and documents into single-vector representations, late interaction models produce multi-vector representations at the granularity of each token and decompose relevance modeling into scalable token-level computations. This decomposition has been sho…

    Submitted 10 July, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: NAACL 2022. Omar and Keshav contributed equally to this work

  19. arXiv:2110.07752  [pdf, other]

    cs.CL cs.IR

    Hindsight: Posterior-guided training of retrievers for improved open-ended generation

    Authors: Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D. Manning

    Abstract: Many text generation systems benefit from using a retriever to retrieve passages from a textual knowledge corpus (e.g., Wikipedia) which are then provided as additional context to the generator. For open-ended generation tasks (like generating informative utterances in conversations) many varied passages may be equally relevant and we find that existing methods that jointly train the retriever and…

    Submitted 20 October, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

  20. arXiv:2108.07258  [pdf, other]

    cs.LG cs.AI cs.CY

    On the Opportunities and Risks of Foundation Models

    Authors: Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, et al. (89 additional authors not shown)

    Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their cap…

    Submitted 12 July, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Report page with citation guidelines: https://crfm.stanford.edu/report.html

  21. arXiv:2104.12016  [pdf, other]

    cs.IR

    Learning Passage Impacts for Inverted Indexes

    Authors: Antonio Mallia, Omar Khattab, Nicola Tonellotto, Torsten Suel

    Abstract: Neural information retrieval systems typically use a cascading pipeline, in which a first-stage model retrieves a candidate set of documents and one or more subsequent stages re-rank this set using contextualized language models such as BERT. In this paper, we propose DeepImpact, a new document term-weighting scheme suitable for efficient retrieval using a standard inverted index. Compared to exis…

    Submitted 24 April, 2021; originally announced April 2021.

  22. arXiv:2101.00436  [pdf, other]

    cs.CL cs.IR

    Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval

    Authors: Omar Khattab, Christopher Potts, Matei Zaharia

    Abstract: Multi-hop reasoning (i.e., reasoning across two or more documents) is a key ingredient for NLP models that leverage large corpora to exhibit broad knowledge. To retrieve evidence passages, multi-hop models must contend with a fast-growing search space across the hops, represent complex queries that combine multiple information needs, and resolve ambiguity about the best order in which to hop betwe…

    Submitted 10 July, 2022; v1 submitted 2 January, 2021; originally announced January 2021.

    Comments: NeurIPS 2021 (Spotlight)

  23. arXiv:2007.00814  [pdf, other]

    cs.CL cs.IR

    Relevance-guided Supervision for OpenQA with ColBERT

    Authors: Omar Khattab, Christopher Potts, Matei Zaharia

    Abstract: Systems for Open-Domain Question Answering (OpenQA) generally depend on a retriever for finding candidate passages in a large corpus and a reader for extracting answers from those passages. In much recent work, the retriever is a learned component that uses coarse-grained vector representations of questions and passages. We argue that this modeling choice is insufficiently expressive for dealing w…

    Submitted 2 August, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2021. Author's final version. Oral presentation at ACL'21

  24. arXiv:2004.12832  [pdf, other]

    cs.IR cs.CL

    ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT

    Authors: Omar Khattab, Matei Zaharia

    Abstract: Recent progress in Natural Language Understanding (NLU) is driving fast-paced advances in Information Retrieval (IR), largely owed to fine-tuning deep language models (LMs) for document ranking. While remarkably effective, the ranking models based on these LMs increase computational cost by orders of magnitude over prior approaches, particularly as they must feed each query-document pair through a…

    Submitted 4 June, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: Accepted at SIGIR 2020

  25. arXiv:1306.1448  [pdf]

    cs.NI

    I am 4 vho: new approach to improve seamless vertical hanover in heterogeneous wireless networks

    Authors: Omar Khattab, Omar Alani

    Abstract: Two mechanisms have been proposed independently by IEEE and 3GPP; namely, Media Independent Handover (MIH) and Access Network Discovery and Selection Function (ANDSF), respectively. These mechanisms enable a seamless Vertical Handover (VHO) between the different types of technologies (3GPP and non-3GPP), such as GSM (Global System for Mobile Communication), Wireless Fidelity (Wi-Fi), Worldwide In…

    Submitted 6 June, 2013; originally announced June 2013.

    Comments: 11 pages, 5 figures, Journal