
Showing 1–13 of 13 results for author: Valmeekam, K

  1. arXiv:2405.20625  [pdf, other]

    cs.AI

    Robust Planning with LLM-Modulo Framework: Case Study in Travel Planning

    Authors: Atharva Gundawar, Mudit Verma, Lin Guan, Karthik Valmeekam, Siddhant Bhambri, Subbarao Kambhampati

    Abstract: As the applicability of Large Language Models (LLMs) extends beyond traditional text processing tasks, there is a burgeoning interest in their potential to excel in planning and reasoning assignments, realms traditionally reserved for System 2 cognitive competencies. Despite their perceived versatility, the research community is still unraveling effective strategies to harness these models in such…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2405.04776  [pdf, other]

    cs.AI

    Chain of Thoughtlessness? An Analysis of CoT in Planning

    Authors: Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution. Previous work has claimed that this can be mitigated with chain of thought prompting--a method of demonstrating solution procedures--with the intuition that it is possible to in-context teach an LLM an algorithm for solving the problem. This paper presents a case study of chain of thought…

    Submitted 5 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  3. arXiv:2402.08115  [pdf, other]

    cs.AI

    On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

    Authors: Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: There has been considerable divergence of opinion on the reasoning abilities of Large Language Models (LLMs). While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples--ranging from multiplication to simple planning--there persists a widespread belief that LLMs can self-critique and improve their own solutions in an itera…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.12397

  4. arXiv:2402.01817  [pdf, other]

    cs.AI cs.LG

    LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

    Authors: Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

    Abstract: There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the probl…

    Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  5. arXiv:2311.00226  [pdf, other]

    eess.SP cs.LG

    Transformers are Provably Optimal In-context Estimators for Wireless Communications

    Authors: Vishnu Teja Kunde, Vicram Rajagopalan, Chandra Shekhara Kaushik Valmeekam, Krishna Narayanan, Srinivas Shakkottai, Dileep Kalathil, Jean-Francois Chamberland

    Abstract: Pre-trained transformers exhibit the capability of adapting to new tasks through in-context learning (ICL), where they efficiently utilize a limited set of prompts without explicit model optimization. The canonical communication problem of estimating transmitted symbols from received observations can be modelled as an in-context learning problem: Received observations are essentially a noisy fun…

    Submitted 14 June, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

    Comments: 13 pages, 2 figures, 2 tables, preprint; abstract, references, theory updated

  6. arXiv:2310.08118  [pdf, other]

    cs.AI

    Can Large Language Models Really Improve by Self-critiquing Their Own Plans?

    Authors: Karthik Valmeekam, Matthew Marquez, Subbarao Kambhampati

    Abstract: There have been widespread claims about Large Language Models (LLMs) being able to successfully verify or self-critique their candidate solutions in reasoning problems in an iterative mode. Intrigued by those claims, in this paper we set out to investigate the verification/self-critiquing abilities of large language models in the context of planning. We evaluate a planning system that employs LLMs…

    Submitted 12 October, 2023; originally announced October 2023.

  7. arXiv:2306.04050  [pdf, ps, other]

    cs.IT cs.CL cs.LG

    LLMZip: Lossless Text Compression using Large Language Models

    Authors: Chandra Shekhara Kaushik Valmeekam, Krishna Narayanan, Dileep Kalathil, Jean-Francois Chamberland, Srinivas Shakkottai

    Abstract: We provide new estimates of an asymptotic upper bound on the entropy of English using the large language model LLaMA-7B as a predictor for the next token given a window of past tokens. This estimate is significantly smaller than currently available estimates in \cite{cover1978convergent}, \cite{lutati2023focus}. A natural byproduct is an algorithm for lossless compression of English text which com…

    Submitted 26 June, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 7 pages, 4 figures, 4 tables, preprint, added results on using LLMs with arithmetic coding

  8. arXiv:2305.15771  [pdf, other]

    cs.AI

    On the Planning Abilities of Large Language Models: A Critical Investigation

    Authors: Karthik Valmeekam, Matthew Marquez, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Intrigued by the claims of emergent reasoning capabilities in LLMs trained on general web corpora, in this paper, we set out to investigate their planning capabilities. We aim to evaluate (1) the effectiveness of LLMs in generating plans autonomously in commonsense planning tasks and (2) the potential of LLMs in LLM-Modulo settings where they act as a source of heuristic guidance for external plan…

    Submitted 6 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 Spotlight. arXiv admin note: substantial text overlap with arXiv:2206.10498

  9. arXiv:2305.14909  [pdf, other]

    cs.AI

    Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning

    Authors: Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: There is a growing interest in applying pre-trained large language models (LLMs) to planning problems. However, methods that use LLMs directly as planners are currently impractical due to several factors, including limited correctness of plans, strong reliance on feedback from interactions with simulators or even the actual environment, and the inefficiency in utilizing human feedback. In this wor…

    Submitted 1 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  10. arXiv:2302.06706  [pdf, other]

    cs.AI cs.CL cs.LG

    On the Planning Abilities of Large Language Models (A Critical Investigation with a Proposed Benchmark)

    Authors: Karthik Valmeekam, Sarath Sreedharan, Matthew Marquez, Alberto Olmo, Subbarao Kambhampati

    Abstract: Intrigued by the claims of emergent reasoning capabilities in LLMs trained on general web corpora, in this paper, we set out to investigate their planning capabilities. We aim to evaluate (1) how good LLMs are by themselves in generating and validating simple plans in commonsense planning tasks (of the type that humans are generally quite good at) and (2) how good LLMs are in being a source of heu…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: text overlap with arXiv:2206.10498

  11. arXiv:2210.15906  [pdf, other]

    cs.AI cs.HC cs.LG

    Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences

    Authors: Lin Guan, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: Generating complex behaviors that satisfy the preferences of non-expert users is a crucial requirement for AI agents. Interactive reward learning from trajectory comparisons (a.k.a. RLHF) is one way to allow non-expert users to convey complex objectives by expressing preferences over short clips of agent behaviors. Even though this parametric method can encode complex tacit knowledge present in th…

    Submitted 27 February, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: ICLR 2023 Camera Ready

  12. arXiv:2206.10498  [pdf, other]

    cs.CL cs.AI

    PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

    Authors: Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati

    Abstract: Generating plans of action and reasoning about change have long been considered a core competence of intelligent agents. It is thus no surprise that evaluating the planning and reasoning capabilities of large language models (LLMs) has become a hot topic of research. Most claims about LLM planning capabilities are however based on common sense tasks--where it becomes hard to tell whether LLMs are…

    Submitted 25 November, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2023 Track on Datasets and Benchmarks

  13. arXiv:2011.09644  [pdf, other]

    cs.AI

    RADAR-X: An Interactive Mixed Initiative Planning Interface Pairing Contrastive Explanations and Revised Plan Suggestions

    Authors: Karthik Valmeekam, Sarath Sreedharan, Sailik Sengupta, Subbarao Kambhampati

    Abstract: Decision support systems seek to enable informed decision-making. In recent years, automated planning techniques have been leveraged to empower such systems to better aid the human-in-the-loop. The central idea for such decision support systems is to augment the capabilities of the human-in-the-loop with automated planning techniques and enhance the quality of decision-making. In addition to p…

    Submitted 3 June, 2022; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: Accepted at ICAPS 2022