Showing 1–4 of 4 results for author: Stechly, K

  1. arXiv:2405.04776

    cs.AI

    Chain of Thoughtlessness? An Analysis of CoT in Planning

    Authors: Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: Large language model (LLM) performance on reasoning problems typically does not generalize out of distribution. Previous work has claimed that this can be mitigated with chain of thought prompting, a method of demonstrating solution procedures, with the intuition that it is possible to in-context teach an LLM an algorithm for solving the problem. This paper presents a case study of chain of thought…

    Submitted 5 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  2. arXiv:2402.08115

    cs.AI

    On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

    Authors: Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

    Abstract: There has been considerable divergence of opinion on the reasoning abilities of Large Language Models (LLMs). While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples, ranging from multiplication to simple planning, there persists a widespread belief that LLMs can self-critique and improve their own solutions in an itera…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.12397

  3. arXiv:2402.01817

    cs.AI cs.LG

    LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

    Authors: Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Mudit Verma, Kaya Stechly, Siddhant Bhambri, Lucas Saldyt, Anil Murthy

    Abstract: There is considerable confusion about the role of Large Language Models (LLMs) in planning and reasoning tasks. On one side are over-optimistic claims that LLMs can indeed do these tasks with just the right prompting or self-verification strategies. On the other side are perhaps over-pessimistic claims that all that LLMs are good for in planning/reasoning tasks are as mere translators of the probl…

    Submitted 11 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  4. arXiv:2310.12397

    cs.AI

    GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems

    Authors: Kaya Stechly, Matthew Marquez, Subbarao Kambhampati

    Abstract: There has been considerable divergence of opinion on the reasoning abilities of Large Language Models (LLMs). While the initial optimism that reasoning might emerge automatically with scale has been tempered thanks to a slew of counterexamples, a widespread belief in their iterative self-critique capabilities persists. In this paper, we set out to systematically investigate the effectiveness of i…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 18 pages, 3 figures