-
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages
Authors:
Max Zuo,
Francisco Piedrahita Velez,
Xiaochen Li,
Michael L. Littman,
Stephen H. Bach
Abstract:
Many recent works have explored using language models for planning problems. One line of research focuses on translating natural language descriptions of planning tasks into structured planning languages, such as the planning domain definition language (PDDL). While this approach is promising, accurately measuring the quality of generated PDDL code continues to pose significant challenges. First, generated PDDL code is typically evaluated using planning validators that check whether the problem can be solved with a planner. This method is insufficient because a language model might generate valid PDDL code that does not align with the natural language description of the task. Second, existing evaluation sets often have natural language descriptions of the planning task that closely resemble the ground truth PDDL, reducing the challenge of the task. To bridge this gap, we introduce Planetarium, a benchmark designed to evaluate language models' ability to generate PDDL code from natural language descriptions of planning tasks. We begin by creating a PDDL equivalence algorithm that rigorously evaluates the correctness of PDDL code generated by language models by flexibly comparing it against a ground truth PDDL. Then, we present a dataset of $132,037$ text-to-PDDL pairs across 13 different tasks, with varying levels of difficulty. Finally, we evaluate several API-access and open-weight language models that reveal this task's complexity. For example, $87.6\%$ of the PDDL problem descriptions generated by GPT-4o are syntactically parseable, $82.2\%$ are valid, solvable problems, but only $35.1\%$ are semantically correct, highlighting the need for a more rigorous benchmark for this problem.
Submitted 3 July, 2024;
originally announced July 2024.
-
ConSOR: A Context-Aware Semantic Object Rearrangement Framework for Partially Arranged Scenes
Authors:
Kartik Ramachandruni,
Max Zuo,
Sonia Chernova
Abstract:
Object rearrangement is the problem of enabling a robot to identify the correct object placement in a complex environment. Prior work on object rearrangement has explored a diverse set of techniques for following user instructions to achieve some desired goal state. Logical predicates, images of the goal scene, and natural language descriptions have all been used to instruct a robot in how to arrange objects. In this work, we argue that burdening the user with specifying goal scenes is not necessary in partially-arranged environments, such as common household settings. Instead, we show that contextual cues from partially arranged scenes (i.e., the placement of some number of pre-arranged objects in the environment) provide sufficient context to enable robots to perform object rearrangement \textit{without any explicit user goal specification}. We introduce ConSOR, a Context-aware Semantic Object Rearrangement framework that utilizes contextual cues from a partially arranged initial state of the environment to complete the arrangement of new objects, without explicit goal specification from the user. We demonstrate that ConSOR strongly outperforms two baselines in generalizing to novel object arrangements and unseen object categories. The code and data can be found at https://github.com/kartikvrama/consor.
Submitted 30 September, 2023;
originally announced October 2023.
-
Traffic Flow Prediction via Variational Bayesian Inference-based Encoder-Decoder Framework
Authors:
Jianlei Kong,
Xiaomeng Fan,
Xue-Bo **,
Min Zuo
Abstract:
Accurate traffic flow prediction, a hotspot for intelligent transportation research, is the prerequisite for mastering traffic and making travel plans. The speed of traffic flow can be affected by road conditions, weather, holidays, etc. Furthermore, the sensors that capture traffic flow information are subject to interference from environmental factors such as illumination, collection time, occlusion, etc. Therefore, the traffic flow in a practical transportation system is complicated, uncertain, and challenging to predict accurately. This paper proposes a deep encoder-decoder prediction framework based on variational Bayesian inference. A Bayesian neural network is constructed by combining variational inference with gated recurrent units (GRU) and used as the deep neural network unit of the encoder-decoder framework to mine the intrinsic dynamics of traffic flow. Then, variational inference is introduced into the multi-head attention mechanism to avoid noise-induced deterioration of prediction accuracy. The proposed model achieves superior prediction performance on the Guangzhou urban traffic flow dataset over the benchmarks, particularly for long-term prediction.
Submitted 14 December, 2022;
originally announced December 2022.
-
ATCON: Attention Consistency for Vision Models
Authors:
Ali Mirzazadeh,
Florian Dubost,
Maxwell Pike,
Krish Maniar,
Max Zuo,
Christopher Lee-Messer,
Daniel Rubin
Abstract:
Attention (or attribution) map methods are designed to highlight regions of the model's input that were discriminative for its predictions. However, different attention map methods can highlight different regions of the input, with sometimes contradictory explanations for a prediction. This effect is exacerbated when the training set is small. This indicates that either the model learned incorrect representations or that the attention map methods did not accurately estimate the model's representations. We propose an unsupervised fine-tuning method that optimizes the consistency of attention maps and show that it improves both classification performance and the quality of attention maps. We propose an implementation for two state-of-the-art attention computation methods, Grad-CAM and Guided Backpropagation, which relies on an input masking technique. We also show results on Grad-CAM and Integrated Gradients in an ablation study. We evaluate this method on our own dataset of event detection in continuous video recordings of hospital patients aggregated and curated for this work. As a sanity check, we also evaluate the proposed method on PASCAL VOC and SVHN. With the proposed method, with small training sets, we achieve a 6.6-point lift in F1 score over the baselines on our video dataset, a 2.9-point lift in F1 score on PASCAL, and a 1.8-point lift in mean Intersection over Union over Grad-CAM for weakly supervised detection on PASCAL. These improved attention maps may help clinicians better understand vision model predictions and ease the deployment of machine learning systems into clinical care. We share part of the code for this article at the following repository: https://github.com/alimirzazadeh/SemisupervisedAttention.
Submitted 18 October, 2022;
originally announced October 2022.
-
Chemical Short-Range Ordering in a CrCoNi Medium-Entropy Alloy
Authors:
H. W. Hsiao,
R. Feng,
H. Ni,
K. An,
J. D. Poplawsky,
P. K. Liaw,
J. M. Zuo
Abstract:
The exceptional mechanical strengths of medium and high-entropy alloys have been attributed to hardening in random solid solutions. Here, we evidence non-random chemical mixings in CrCoNi alloys, resulting from short range ordering. A novel data-mining approach of electron nanodiffraction patterns enabled the study, which is assisted by neutron scattering, atom probe tomography, and diffraction simulation using first principles theory models. Results reveal two critical types of short range orders in nanoclusters that minimize the Cr and Cr nearest neighbors (L11) or segregate Cr on alternating close-packed planes (L12). The makeup of ordering-strengthened nanoclusters can be tuned by heat treatments to affect deformation mechanisms. These findings uncover a mixture of bonding preferences and their control at the nanoscopic scale in CrCoNi and provide general opportunities for an atomistic-structure study in concentrated alloys for the design of strong and ductile materials.
Submitted 4 June, 2022;
originally announced June 2022.
-
Efficient Exploration via First-Person Behavior Cloning Assisted Rapidly-Exploring Random Trees
Authors:
Max Zuo,
Logan Schick,
Matthew Gombolay,
Nakul Gopalan
Abstract:
Modern-day computer games have extremely large state and action spaces. To detect bugs in these games' models, human testers play the games repeatedly to explore them and find errors. Such gameplay is exhaustive and time-consuming. Moreover, since robotics simulators depend on similar methods of model specification and debugging, the problem of finding errors in the model is of interest to the robotics community to ensure robot behaviors and interactions are consistent in simulators. Previous methods have used reinforcement learning (arXiv:2103.13798) and search-based methods (Chang, 2019; Chang, 2021; arXiv:1811.06962), including Rapidly-exploring Random Trees (RRT), to explore a game's state-action space to find bugs. However, such search- and exploration-based methods are not efficient at exploring the state-action space without a pre-defined heuristic. In this work we attempt to combine a human tester's expertise in solving games and the RRT's exhaustiveness to search a game's state space efficiently with high coverage. This paper introduces Cloning Assisted RRT (CA-RRT) to test a game through search. We compare our methods to two existing baselines: 1) a weighted-RRT as described by arXiv:1812.03125; 2) human demonstration seeded RRT as described by Chang et al. We find CA-RRT is applicable to more game maps and explores more game states in fewer tree expansions/iterations when compared to the existing baselines. In each test, CA-RRT reached more states on average in the same number of iterations as weighted-RRT. In our tested environments, CA-RRT reached the same number of states as weighted-RRT in more than 5000 fewer iterations on average, almost a 50% reduction, and applied to more scenarios. Moreover, as a consequence of our first-person behavior cloning approach, CA-RRT worked better on unseen game maps than just seeding the RRT with human-demonstrated states.
Submitted 19 April, 2022; v1 submitted 23 March, 2022;
originally announced March 2022.
-
A differential evolution-based optimization tool for interplanetary transfer trajectory design
Authors:
Mingcheng Zuo,
Guangming Dai,
Lei Peng,
Zhe Tang
Abstract:
The extremely sensitive and highly nonlinear search space of interplanetary transfer trajectory design poses major challenges for global optimization. As a representative example, the best known solution of the global trajectory optimization problem (GTOP) designed by the European Space Agency (ESA) is very hard to find. To deal with this difficulty, a powerful differential evolution-based optimization tool named COoperative Differential Evolution (CODE) is proposed in this paper. CODE employs a two-stage evolutionary process, which concentrates on learning global structure in the earlier stage and then tends to self-adaptively learn the structures of different local spaces. In addition, considering the spatial distribution of the global optimum on different problems and the gradient information on different variables, a multiple boundary check technique is employed. Covariance Matrix Adaptation Evolutionary Strategies (CMA-ES) is also used as a local optimizer. Previous studies have shown that a specific swarm intelligence optimization algorithm can usually solve only one or two GTOP problems. However, the experimental results show that CODE can find the best known solutions of Cassini1 and Sagas directly, and that in cooperation with CMA-ES it can solve Cassini2, GTOC1, Messenger (reduced) and Rosetta. For the most complicated Messenger (full) problem, even though CODE cannot find the best known solution, the best solution found, with an objective function value of 3.38 km/s, is still at a level that other swarm intelligence algorithms cannot easily reach.
Submitted 13 April, 2021; v1 submitted 13 November, 2020;
originally announced November 2020.
-
Response to Comment (arXiv:1506.02787v1) on Selective Interface Control of Order Parameters in Complex Oxides
Authors:
D. Meyers,
Jian Liu,
J. W. Freeland,
S. Middey,
M. Kareev,
J. M. Zuo,
Yi-De Chuang,
Jong Woo Kim,
P. Ryan,
J. Chakhalian
Abstract:
In response to Lu et al. (arXiv:1506.02787v1), here we present a detailed writeup concerning the questions raised in their comment on our eprint (arXiv:1505.07451). The key question raised by Lu et al. was whether the bulk-like charge ordered state becomes undetectable with resonant scattering due to ultrathin film thickness. In this reply, we first detail the relation of our work to past work on the same compound by Staub et al. to demonstrate that the presented data are indeed sufficient to support our claims of no charge order on ultra thin films of NdNiO3 (NNO) on NdGaO3 (NGO). Further, we demonstrate that if a well defined charge ordered phase exists in ultra thin films, it is indeed resolvable, such as that in EuNiO3 (ENO).
Submitted 21 June, 2015;
originally announced June 2015.
-
Selective Interface Control of Order Parameters in Complex Oxides
Authors:
D. Meyers,
Jian Liu,
J. W. Freeland,
S. Middey,
M. Kareev,
J. M. Zuo,
Yi-De Chuang,
Jong Woo Kim,
P. J. Ryan,
J. Chakhalian
Abstract:
In complex materials, observed electronic phases and transitions between them often involve coupling between many degrees of freedom whose entanglement convolutes understanding of the instigating mechanism. Metal-insulator transitions are one such problem where coupling to the structural, orbital, charge, and magnetic order parameters frequently obscures the underlying physics. Here, we demonstrate a way to unravel this conundrum by heterostructuring a prototypical multi-ordered complex oxide NdNiO3 in ultra thin geometry, which preserves the metal-to-insulator transition and bulk-like magnetic order parameter, but entirely suppresses the symmetry lowering and charge order parameter. These findings illustrate the utility of heterointerfaces as a powerful method for removing competing order parameters to gain greater insight into the nature of the transition, here revealing that the magnetic order generates the transition independently, leading to a purely electronic Mott metal-insulator transition.
Submitted 22 July, 2015; v1 submitted 27 May, 2015;
originally announced May 2015.
-
Superconductivity with topological surface state in SrxBi2Se3
Authors:
Zhongheng Liu,
Xiong Yao,
Jifeng Shao,
Ming Zuo,
Li Pi,
Shun Tan,
Chang** Zhang,
Yuheng Zhang
Abstract:
By intercalation of alkaline-earth metal Sr in Bi2Se3, superconductivity with large shielding volume fraction (~91.5% at 0.5 K) has been achieved in Sr0.065Bi2Se3. The analysis of the Shubnikov-de Haas oscillations confirms the 1/2-shift expected from a Dirac spectrum, giving transport evidence of the existence of surface states. Importantly, the SrxBi2Se3 superconductor is stable under air, making the SrxBi2Se3 compound an ideal material base for investigating topological superconductivity.
Submitted 17 August, 2015; v1 submitted 4 February, 2015;
originally announced February 2015.
-
Superconducting fiber with transition temperature up to 7.43 K in Nb2PdxS5-delta (0< x <0.6)
Authors:
Hongyan Yu,
Ming Zuo,
Lei Zhang,
Shun Tan,
Chang** Zhang,
Yuheng Zhang
Abstract:
Wiring systems powered by highly efficient superconductors have long been a dream of scientists, but researchers have faced practical challenges such as finding flexible materials. Here we report superconductivity in Nb2PdxS5-delta fibers with transition temperature up to 7.43 K, which have typical diameters of 0.3-3 micrometers. Superconductivity occurs in a wide range of Pd and S contents, suggesting that the superconductivity in this system is very robust. Long fibers with suitable size provide a new route to high-power transmission cables and electronic devices.
Submitted 23 August, 2013;
originally announced August 2013.
-
Phase Separation and Chemical Inhomogeneity in the Iron Chalcogenide Superconductor Fe1+yTexSe1-x
Authors:
Hefei Hu,
J. M. Zuo,
J. S. Wen,
Z. J. Xu,
Z. W. Lin,
Q. Li,
Genda Gu,
W. K. Park,
L. H. Greene
Abstract:
We report an investigation of Fe1+yTexSe1-x single crystals using scanning transmission electron microscopy (STEM) and electron energy loss spectroscopy (EELS). Both nonsuperconducting samples with excess iron and superconducting samples demonstrate nanoscale phase separation and chemical inhomogeneity of Te/Se content, which we attribute to a miscibility gap. The line scan EELS technique indicates ~20% or less fluctuation of Te concentration from the nominal compositions.
Submitted 29 September, 2010;
originally announced September 2010.
-
Magnetically asymmetric interfaces in a (LaMnO$_3$)/(SrMnO$_3$) superlattice due to structural asymmetries
Authors:
S. J. May,
A. B. Shah,
S. G. E. te Velthuis,
M. R. Fitzsimmons,
J. M. Zuo,
X. Zhai,
J. N. Eckstein,
S. D. Bader,
A. Bhattacharya
Abstract:
Polarized neutron reflectivity measurements of a ferromagnetic [(LaMnO$_3$)$_{11.8}$/(SrMnO$_3$)$_{4.4}$]$_6$ superlattice reveal a modulated magnetic structure with an enhanced magnetization at the interfaces where LaMnO$_3$ was deposited on SrMnO$_3$ (LMO/SMO). However, the opposite interfaces (SMO/LMO) are found to have a reduced ferromagnetic moment. The magnetic asymmetry arises from the difference in lateral structural roughness of the two interfaces observed via electron microscopy, with strong ferromagnetism present at the interfaces that are atomically smooth over tens of nanometers. This result demonstrates that atomic-scale roughness can destabilize interfacial phases in complex oxide heterostructures.
Submitted 25 February, 2008; v1 submitted 11 September, 2007;
originally announced September 2007.
-
Lamellar phase separation and dynamic competition in La0.23Ca0.77MnO3
Authors:
J. Tao,
D. Niebieskikwiat,
M. B. Salamon,
J. M. Zuo
Abstract:
We report the coexistence of lamellar charge-ordered (CO) and charge-disordered (CD) domains, and their dynamical behavior, in La0.23Ca0.77MnO3. Using high resolution transmission electron microscopy (TEM), we show that below Tcd~170K a CD-monoclinic phase forms within the established CO-orthorhombic matrix. The CD phase has a sheet-like morphology, perpendicular to the q vector of the CO superlattice (a axis of the Pnma structure). For temperatures between 64K and 130K, both the TEM and resistivity experiments show a dynamic competition between the two phases: at constant T, the CD phase slowly advances over the CO one. This slow dynamics appears to be linked to the magnetic transitions occurring in this compound, suggesting important magnetoelastic effects.
Submitted 27 September, 2004;
originally announced September 2004.
-
Nanometer-sized Regions of Charge Ordering and Charge Melting in La2/3Ca1/3MnO3 Revealed by Electron Micro-diffraction
Authors:
J. M. Zuo,
J. Tao
Abstract:
Electron microdiffraction study of phase transition in La2/3Ca1/3MnO3 revealed temperature dependent (h+1/2,0,l) diffraction spots. Their intensity peaks at Tc. Quantitative electron diffraction intensity analysis shows that they come from nanometer-sized domains with modulated transverse atomic displacements in the orthorhombic a-c plane, which has two types of Mn ions and thus charge ordering. The average domain is ~3.6 nm in diameter and ~1.5 nm in height (along the b-axis). The number of domains increases and then decreases, as the sample is cooled through Tc.
Submitted 29 September, 2000;
originally announced September 2000.