-
MinePlanner: A Benchmark for Long-Horizon Planning in Large Minecraft Worlds
Authors:
William Hill,
Ireton Liu,
Anita De Mello Koch,
Damion Harvey,
Nishanth Kumar,
George Konidaris,
Steven James
Abstract:
We propose a new benchmark for planning tasks based on the Minecraft game. Our benchmark contains 45 tasks overall, but also provides support for creating both propositional and numeric instances of new Minecraft tasks automatically. We benchmark numeric and propositional planning systems on these tasks, with results demonstrating that state-of-the-art planners are currently incapable of dealing with many of the challenges advanced by our new benchmark, such as scaling to instances with thousands of objects. Based on these results, we identify areas of improvement for future planners. Our framework is made available at https://github.com/IretonLiu/mine-pddl/.
Submitted 28 April, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
ProTIP: Progressive Tool Retrieval Improves Planning
Authors:
Raviteja Anantha,
Bortik Bandyopadhyay,
Anirudh Kashi,
Sayantan Mahinder,
Andrew W Hill,
Srinivas Chappidi
Abstract:
Large language models (LLMs) are increasingly employed for complex multi-step planning tasks, where the tool retrieval (TR) step is crucial for achieving successful outcomes. Two prevalent approaches for TR are single-step retrieval, which utilizes the complete query, and sequential retrieval using task decomposition (TD), where a full query is segmented into discrete atomic subtasks. While single-step retrieval lacks the flexibility to handle "inter-tool dependency," the TD approach necessitates maintaining "subtask-tool atomicity alignment," as the toolbox can evolve dynamically. To address these limitations, we introduce the Progressive Tool retrieval to Improve Planning (ProTIP) framework. ProTIP is a lightweight, contrastive learning-based framework that implicitly performs TD without the explicit requirement of subtask labels, while simultaneously maintaining subtask-tool atomicity. On the ToolBench dataset, ProTIP outperforms the ChatGPT task decomposition-based approach by a remarkable margin, achieving a 24% improvement in Recall@K=10 for TR and a 41% enhancement in tool accuracy for plan generation.
Submitted 16 December, 2023;
originally announced December 2023.
-
An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets
Authors:
Maya Srikanth,
Jeremy Irvin,
Brian Wesley Hill,
Felipe Godoy,
Ishan Sabane,
Andrew Y. Ng
Abstract:
Major advancements in computer vision can primarily be attributed to the use of labeled datasets. However, acquiring labels for datasets often results in errors which can harm model performance. Recent works have proposed methods to automatically identify mislabeled images, but developing strategies to effectively implement them in real world datasets has been sparsely explored. Towards improved data-centric methods for cleaning real world vision datasets, we first conduct more than 200 experiments carefully benchmarking recently developed automated mislabel detection methods on multiple datasets under a variety of synthetic and real noise settings with varying noise levels. We compare these methods to a Simple and Efficient Mislabel Detector (SEMD) that we craft, and find that SEMD performs similarly to or outperforms prior mislabel detection approaches. We then apply SEMD to multiple real world computer vision datasets and test how dataset size, mislabel removal strategy, and mislabel removal amount further affect model performance after retraining on the cleaned data. With careful design of the approach, we find that mislabel removal leads to per-class performance improvements of up to 8% of a retrained classifier in smaller data regimes.
Submitted 2 December, 2023;
originally announced December 2023.
-
Representing and Computing Uncertainty in Phonological Reconstruction
Authors:
Johann-Mattis List,
Nathan W. Hill,
Robert Forkel,
Frederic Blum
Abstract:
Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms. With the increasing success of recently proposed approaches to automating certain aspects of the traditional comparative method, the formal representation of proto-forms has also improved. This formalization makes it possible to address both the representation and the computation of uncertainty. Building on recent advances in supervised phonological reconstruction, during which an algorithm learns how to reconstruct words in a given proto-language relying on previously annotated data, and inspired by improved methods for automated word prediction from cognate sets, we present a new framework that allows for the representation of uncertainty in linguistic reconstruction and also includes a workflow for the computation of fuzzy reconstructions from linguistic data.
Submitted 19 October, 2023;
originally announced October 2023.
-
A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns
Authors:
Johann-Mattis List,
Robert Forkel,
Nathan W. Hill
Abstract:
Computational approaches in historical linguistics have been increasingly applied during the past decade and many new methods that implement parts of the traditional comparative method have been proposed. Despite these increased efforts, there are not many easy-to-use and fast approaches for the task of phonological reconstruction. Here we present a new framework that combines state-of-the-art techniques for automated sequence comparison with novel techniques for phonetic alignment analysis and sound correspondence pattern detection to allow for the supervised reconstruction of word forms in ancestral languages. We test the method on a new dataset covering six groups from three different language families. The results show that our method yields promising results while at the same time being not only fast but also easy to apply and expand.
Submitted 10 April, 2022;
originally announced April 2022.
-
Autonomous Situational Awareness for Robotic Swarms in High-Risk Environments
Authors:
Vincent W. Hill,
Ryan W. Thomas,
Jordan D. Larson
Abstract:
This paper describes a technique for the autonomous mission planning of robotic swarms in high risk environments where agent disablement is likely. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement or agent loss, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the swarm. The primary algorithms featured in this work are A* pathfinding and the Generalized Labeled Multi-Bernoulli multi-object tracking method.
Submitted 10 May, 2021;
originally announced May 2021.
-
Autonomous Situational Awareness for UAS Swarms
Authors:
Vincent W. Hill,
Ryan W. Thomas,
Jordan D. Larson
Abstract:
This paper describes a technique for the autonomous mission planning of unmanned aerial system swarms. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the swarm. The primary algorithms featured in this work are A* pathfinding and the Generalized Labeled Multi-Bernoulli multi-target tracking method.
Submitted 18 April, 2021;
originally announced April 2021.