Understanding and Estimating Domain Complexity Across Domains
Authors:
Katarina Doctor,
Mayank Kejriwal,
Lawrence Holder,
Eric Kildebeck,
Emma Resmini,
Christopher Pereyda,
Robert J. Steininger,
Daniel V. Olivença
Abstract:
Artificial Intelligence (AI) systems, trained in controlled environments, often struggle in real-world complexities. We propose a general framework for estimating domain complexity across diverse environments, like open-world learning and real-world applications. This framework distinguishes between intrinsic complexity (inherent to the domain) and extrinsic complexity (dependent on the AI agent). By analyzing dimensionality, sparsity, and diversity within these categories, we offer a comprehensive view of domain challenges. This approach enables quantitative predictions of AI difficulty during environment transitions, avoids bias in novel situations, and helps navigate the vast search spaces of open-world domains.
Submitted 20 December, 2023;
originally announced December 2023.
Polycraft World AI Lab (PAL): An Extensible Platform for Evaluating Artificial Intelligence Agents
Authors:
Stephen A. Goss,
Robert J. Steininger,
Dhruv Narayanan,
Daniel V. Olivença,
Yutong Sun,
Peng Qiu,
Jim Amato,
Eberhard O. Voit,
Walter E. Voit,
Eric J. Kildebeck
Abstract:
As artificial intelligence research advances, the platforms used to evaluate AI agents need to adapt and grow to continue to challenge them. We present the Polycraft World AI Lab (PAL), a task simulator with an API based on the Minecraft mod Polycraft World. Our platform is built to allow AI agents with different architectures to easily interact with the Minecraft world, train and be evaluated in multiple tasks. PAL enables the creation of tasks in a flexible manner as well as having the capability to manipulate any aspect of the task during an evaluation. All actions taken by AI agents and external actors (non-player-characters, NPCs) in the open-world environment are logged to streamline evaluation. Here we present two custom tasks on the PAL platform, one focused on multi-step planning and one focused on navigation, and evaluations of agents solving them. In summary, we report a versatile and extensible AI evaluation platform with a low barrier to entry for AI researchers to utilize.
Submitted 27 January, 2023;
originally announced January 2023.