Search | arXiv e-print repository

MinePlanner: A Benchmark for Long-Horizon Planning in Large Minecraft Worlds

Authors: William Hill, Ireton Liu, Anita De Mello Koch, Damion Harvey, Nishanth Kumar, George Konidaris, Steven James

Abstract: We propose a new benchmark for planning tasks based on the Minecraft game. Our benchmark contains 45 tasks overall, but also provides support for creating both propositional and numeric instances of new Minecraft tasks automatically. We benchmark numeric and propositional planning systems on these tasks, with results demonstrating that state-of-the-art planners are currently incapable of dealing w… ▽ More We propose a new benchmark for planning tasks based on the Minecraft game. Our benchmark contains 45 tasks overall, but also provides support for creating both propositional and numeric instances of new Minecraft tasks automatically. We benchmark numeric and propositional planning systems on these tasks, with results demonstrating that state-of-the-art planners are currently incapable of dealing with many of the challenges advanced by our new benchmark, such as scaling to instances with thousands of objects. Based on these results, we identify areas of improvement for future planners. Our framework is made available at https://github.com/IretonLiu/mine-pddl/. △ Less

Submitted 28 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: Accepted to the 6th ICAPS Workshop on the International Planning Competition (WIPC 2024)

arXiv:2312.10332 [pdf, other]

ProTIP: Progressive Tool Retrieval Improves Planning

Authors: Raviteja Anantha, Bortik Bandyopadhyay, Anirudh Kashi, Sayantan Mahinder, Andrew W Hill, Srinivas Chappidi

Abstract: Large language models (LLMs) are increasingly employed for complex multi-step planning tasks, where the tool retrieval (TR) step is crucial for achieving successful outcomes. Two prevalent approaches for TR are single-step retrieval, which utilizes the complete query, and sequential retrieval using task decomposition (TD), where a full query is segmented into discrete atomic subtasks. While single… ▽ More Large language models (LLMs) are increasingly employed for complex multi-step planning tasks, where the tool retrieval (TR) step is crucial for achieving successful outcomes. Two prevalent approaches for TR are single-step retrieval, which utilizes the complete query, and sequential retrieval using task decomposition (TD), where a full query is segmented into discrete atomic subtasks. While single-step retrieval lacks the flexibility to handle "inter-tool dependency," the TD approach necessitates maintaining "subtask-tool atomicity alignment," as the toolbox can evolve dynamically. To address these limitations, we introduce the Progressive Tool retrieval to Improve Planning (ProTIP) framework. ProTIP is a lightweight, contrastive learning-based framework that implicitly performs TD without the explicit requirement of subtask labels, while simultaneously maintaining subtask-tool atomicity. On the ToolBench dataset, ProTIP outperforms the ChatGPT task decomposition-based approach by a remarkable margin, achieving a 24% improvement in Recall@K=10 for TR and a 41% enhancement in tool accuracy for plan generation. △ Less

Submitted 16 December, 2023; originally announced December 2023.

Comments: preprint version

arXiv:2312.02200 [pdf, other]

An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets

Authors: Maya Srikanth, Jeremy Irvin, Brian Wesley Hill, Felipe Godoy, Ishan Sabane, Andrew Y. Ng

Abstract: Major advancements in computer vision can primarily be attributed to the use of labeled datasets. However, acquiring labels for datasets often results in errors which can harm model performance. Recent works have proposed methods to automatically identify mislabeled images, but develo** strategies to effectively implement them in real world datasets has been sparsely explored. Towards improved d… ▽ More Major advancements in computer vision can primarily be attributed to the use of labeled datasets. However, acquiring labels for datasets often results in errors which can harm model performance. Recent works have proposed methods to automatically identify mislabeled images, but develo** strategies to effectively implement them in real world datasets has been sparsely explored. Towards improved data-centric methods for cleaning real world vision datasets, we first conduct more than 200 experiments carefully benchmarking recently developed automated mislabel detection methods on multiple datasets under a variety of synthetic and real noise settings with varying noise levels. We compare these methods to a Simple and Efficient Mislabel Detector (SEMD) that we craft, and find that SEMD performs similarly to or outperforms prior mislabel detection approaches. We then apply SEMD to multiple real world computer vision datasets and test how dataset size, mislabel removal strategy, and mislabel removal amount further affect model performance after retraining on the cleaned data. With careful design of the approach, we find that mislabel removal leads per-class performance improvements of up to 8% of a retrained classifier in smaller data regimes. △ Less

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2310.12727 [pdf, other]

Representing and Computing Uncertainty in Phonological Reconstruction

Authors: Johann-Mattis List, Nathan W. Hill, Robert Forkel, Frederic Blum

Abstract: Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms. With the increasing success of recently proposed approaches to automating certain aspects of the traditional comparative method, the formal representation of proto-forms has also improved. This formalization makes it possible to address both… ▽ More Despite the inherently fuzzy nature of reconstructions in historical linguistics, most scholars do not represent their uncertainty when proposing proto-forms. With the increasing success of recently proposed approaches to automating certain aspects of the traditional comparative method, the formal representation of proto-forms has also improved. This formalization makes it possible to address both the representation and the computation of uncertainty. Building on recent advances in supervised phonological reconstruction, during which an algorithm learns how to reconstruct words in a given proto-language relying on previously annotated data, and inspired by improved methods for automated word prediction from cognate sets, we present a new framework that allows for the representation of uncertainty in linguistic reconstruction and also includes a workflow for the computation of fuzzy reconstructions from linguistic data. △ Less

Submitted 19 October, 2023; originally announced October 2023.

Comments: To appear in: Proceedings of the 4th Workshop on Computational Approaches to Historical Language Change

arXiv:2204.04619 [pdf, other]

A New Framework for Fast Automated Phonological Reconstruction Using Trimmed Alignments and Sound Correspondence Patterns

Authors: Johann-Mattis List, Robert Forkel, Nathan W. Hill

Abstract: Computational approaches in historical linguistics have been increasingly applied during the past decade and many new methods that implement parts of the traditional comparative method have been proposed. Despite these increased efforts, there are not many easy-to-use and fast approaches for the task of phonological reconstruction. Here we present a new framework that combines state-of-the-art tec… ▽ More Computational approaches in historical linguistics have been increasingly applied during the past decade and many new methods that implement parts of the traditional comparative method have been proposed. Despite these increased efforts, there are not many easy-to-use and fast approaches for the task of phonological reconstruction. Here we present a new framework that combines state-of-the-art techniques for automated sequence comparison with novel techniques for phonetic alignment analysis and sound correspondence pattern detection to allow for the supervised reconstruction of word forms in ancestral languages. We test the method on a new dataset covering six groups from three different language families. The results show that our method yields promising results while at the same time being not only fast but also easy to apply and expand. △ Less

Submitted 10 April, 2022; originally announced April 2022.

Comments: To appear at the 3rd Workshop on Computational Approaches to Historical Language Change, co-located with the ACL 2022 conference. https://www.aclweb.org/portal/content/3rd-workshop-computational-approaches-historical-language-change

arXiv:2105.04764 [pdf, other]

Autonomous Situational Awareness for Robotic Swarms in High-Risk Environments

Authors: Vincent W. Hill, Ryan W. Thomas, Jordan D. Larson

Abstract: This paper describes a technique for the autonomous mission planning of robotic swarms in high risk environments where agent disablement is likely. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement or agent loss, the swarm planning is updated to reflect the… ▽ More This paper describes a technique for the autonomous mission planning of robotic swarms in high risk environments where agent disablement is likely. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement or agent loss, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the swarm. The primary algorithms featured in this work are A* pathfinding and the Generalized Labeled Multi-Bernoulli multi-object tracking method. △ Less

Submitted 10 May, 2021; originally announced May 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2104.08904

arXiv:2104.08904 [pdf, other]

Autonomous Situational Awareness for UAS Swarms

Authors: Vincent W. Hill, Ryan W. Thomas, Jordan D. Larson

Abstract: This paper describes a technique for the autonomous mission planning of unmanned aerial system swarms. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the sw… ▽ More This paper describes a technique for the autonomous mission planning of unmanned aerial system swarms. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the swarm. The primary algorithms featured in this work are A* pathfinding and the Generalized Labeled Multi-Bernoulli multi-target tracking method. △ Less

Submitted 18 April, 2021; originally announced April 2021.

Comments: IEEE Aerospace 2021

Showing 1–7 of 7 results for author: Hill, W