CaT-BENCH: Benchmarking Language Model Understanding of Causal and Temporal Dependencies in Plans

Authors: Yash Kumar Lal, Vanya Cohen, Nathanael Chambers, Niranjan Balasubramanian, Raymond Mooney

Abstract: Understanding the abilities of LLMs to reason about natural language plans, such as instructional text and recipes, is critical to reliably using them in decision-making systems. A fundamental aspect of plans is the temporal order in which their steps need to be executed, which reflects the underlying causal dependencies between them. We introduce CaT-Bench, a benchmark of Step Order Prediction questions, which test whether a step must necessarily occur before or after another in cooking recipe plans. We use this to evaluate how well frontier LLMs understand causal and temporal dependencies. We find that SOTA LLMs are underwhelming (best zero-shot is only 0.59 in F1), and are biased towards predicting dependence more often, perhaps relying on temporal order of steps as a heuristic. While prompting for explanations and using few-shot examples improve performance, the best F1 result is only 0.73. Further, human evaluation of explanations along with answer correctness shows that, on average, humans do not agree with model reasoning. Surprisingly, we also find that explaining after answering leads to better performance than normal chain-of-thought prompting, and LLM answers are not consistent across questions about the same step pairs. Overall, results show that LLMs' ability to detect dependence between steps has significant room for improvement.

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2302.07139 [pdf, other]

Modeling Complex Event Scenarios via Simple Entity-focused Questions

Authors: Mahnaz Koupaee, Greg Durrett, Nathanael Chambers, Niranjan Balasubramanian

Abstract: Event scenarios are often complex and involve multiple event sequences connected through different entity participants. Exploring such complex scenarios requires an ability to branch through different sequences, something that is difficult to achieve with standard event language modeling. To address this, we propose a question-guided generation framework that models events in complex scenarios as answers to questions about participants. At any step in the generation process, the framework uses the previously generated events as context, but generates the next event as an answer to one of three questions: what else a participant did, what else happened to a participant, or what else happened. The participants and the questions themselves can be sampled or be provided as input from a user, allowing for controllable exploration. Our empirical evaluation shows that this question-guided generation provides better coverage of participants, diverse events within a domain, comparable perplexities for modeling event sequences, and more effective control for interactive schema generation.

Submitted 14 February, 2023; originally announced February 2023.

Comments: To be published in proceedings of EACL 2023

arXiv:2208.00329 [pdf, other]

PASTA: A Dataset for Modeling Participant States in Narratives

Authors: Sayontan Ghosh, Mahnaz Koupaee, Isabella Chen, Francis Ferraro, Nathanael Chambers, Niranjan Balasubramanian

Abstract: The events in a narrative are understood as a coherent whole via the underlying states of their participants. Often, these participant states are not explicitly mentioned, instead left to be inferred by the reader. A model that understands narratives should likewise infer these implicit states, and even reason about the impact of changes to these states on the narrative. To facilitate this goal, we introduce a new crowdsourced English-language, Participant States dataset, PASTA. This dataset contains inferable participant states; a counterfactual perturbation to each state; and the changes to the story that would be necessary if the counterfactual were true. We introduce three state-based reasoning tasks that test for the ability to infer when a state is entailed by a story, to revise a story conditioned on a counterfactual state, and to explain the most likely state change given a revised story. Experiments show that today's LLMs can reason about states to some degree, but there is large room for improvement, especially in problems requiring access and ability to reason with diverse types of knowledge (e.g. physical, numerical, factual).

Submitted 1 July, 2023; v1 submitted 30 July, 2022; originally announced August 2022.

arXiv:2106.07117 [pdf, other]

Toward Diverse Precondition Generation

Authors: Heeyoung Kwon, Nathanael Chambers, Niranjan Balasubramanian

Abstract: Language understanding must identify the logical connections between events in a discourse, but core events are often unstated due to their commonsense nature. This paper fills in these missing events by generating precondition events. Precondition generation can be framed as a sequence-to-sequence problem: given a target event, generate a possible precondition. However, in most real-world scenarios, an event can have several preconditions, requiring diverse generation -- a challenge for standard seq2seq approaches. We propose DiP, a Diverse Precondition generation system that can generate unique and diverse preconditions. DiP uses a generative process with three components -- an event sampler, a candidate generator, and a post-processor. The event sampler provides control codes (precondition triggers) which the candidate generator uses to focus its generation. Unlike other conditional generation systems, DiP automatically generates control codes without training on diverse examples. Analysis against baselines reveals that DiP improves the diversity of preconditions significantly while also generating more preconditions.

Submitted 13 June, 2021; originally announced June 2021.

arXiv:2106.06132 [pdf, other]

doi 10.18653/v1/2021.findings-acl.53

TellMeWhy: A Dataset for Answering Why-Questions in Narratives

Authors: Yash Kumar Lal, Nathanael Chambers, Raymond Mooney, Niranjan Balasubramanian

Abstract: Answering questions about why characters perform certain actions is central to understanding and reasoning about narratives. Despite recent progress in QA, it is not clear if existing models have the ability to answer "why" questions that may require commonsense knowledge external to the input narrative. In this work, we introduce TellMeWhy, a new crowd-sourced dataset that consists of more than 30k questions and free-form answers concerning why characters in short narratives perform the actions described. For a third of this dataset, the answers are not present within the narrative. Given the limitations of automated evaluation for this task, we also present a systematized human evaluation interface for this dataset. Our evaluation of state-of-the-art models shows that they are far below human performance on answering such questions. They are especially worse on questions whose answers are external to the narrative, thus providing a challenge for future QA and narrative understanding research.

Submitted 17 August, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

Comments: Accepted to Findings of ACL, 2021. Data and evaluation suite available at http://lunr.cs.stonybrook.edu/tellmewhy

arXiv:2106.04107 [pdf, other]

doi 10.1093/mnras/stab1659

X-ray burst ignition location on the surface of accreting X-ray pulsars: Can bursts preferentially ignite at the hotspot?

Authors: A. J. Goodwin, A. Heger, F. R. N. Chambers, A. L. Watts, Y. Cavecchi

Abstract: Hotspots on the surface of accreting neutron stars have been directly observed via pulsations in the lightcurves of X-ray pulsars. They are thought to occur due to magnetic channelling of the accreted fuel to the neutron star magnetic poles. Some X-ray pulsars exhibit burst oscillations during Type I thermonuclear X-ray bursts which are thought to be caused by asymmetries in the burning. In rapidly rotating neutron stars, it has been shown that the lower gravity at the equator can lead to preferential ignition of X-ray bursts at this location. These models, however, do not include the effect of accretion hotspots at the neutron star surface. There are two accreting neutron star sources in which burst oscillations have been observed to track exactly the neutron star spin period. We analyse whether this could be due to the X-ray bursts igniting at the magnetic pole of the neutron star, because of heating in the accreted layers under the hotspot causing ignition conditions to be reached earlier. We investigate heat transport in the accreted layers using a 2D model and study the prevalence of heating down to the ignition depth of X-ray bursts for different hotspot temperatures and sizes. We perform calculations for accretion at the pole and at the equator, and infer that ignition could occur away from the equator at the magnetic pole for hotspots with temperatures greater than $1\times10^8$ K. However, current observations have not identified such high temperatures in accreting X-ray pulsars.

Submitted 8 June, 2021; originally announced June 2021.

Comments: 13 pages, 8 figures, accepted for publication in MNRAS

arXiv:2104.11273 [pdf, other]

Real-Time Trajectory Optimization in Robot-Assisted Exercise and Rehabilitation

Authors: Humberto De las Casas, Nicholas Chambers, Hanz Richter, Kenneth Sparks

Abstract: This work focuses on the optimization of the training trajectory orientation using a robot as an advanced exercise machine (AEM) and muscle activations as biofeedback. Muscle recruitment patterns depend on trajectory parameters of the AEMs and correlate with the efficiency of exercise. Thus, improvements to training efficiency may be achieved by optimizing these parameters. The optimal regulation of these parameters is challenging because of the complexity of the physiological dynamics from person to person as a result of the unique physical features such as musculoskeletal distribution. Furthermore, these effects can vary due to fatigue, body temperature, and other physiological factors. In this paper, a model-free optimization method using Extremum Seeking Control (ESC) as a real-time optimizer is proposed. After selecting a muscle objective, this method seeks the optimal combination of parameters using the muscle activations as biofeedback. The muscle objective can be selected by a therapist to emphasize or de-emphasize certain muscle groups. The feasibility of this method has been proven for the automatic regulation of an ellipsoidal curve orientation, suggesting the existence of two local optimal orientations. This methodology can also be applied to other parameter regulations using different physiological effects such as oxygen consumption and heart rate as biofeedback.

Submitted 22 April, 2021; originally announced April 2021.

arXiv:2012.15786 [pdf, other]

Conditional Generation of Temporally-ordered Event Sequences

Authors: Shih-Ting Lin, Nathanael Chambers, Greg Durrett

Abstract: Models of narrative schema knowledge have proven useful for a range of event-related tasks, but they typically do not capture the temporal relationships between events. We propose a single model that addresses both temporal ordering, sorting given events into the order they occurred, and event infilling, predicting new events which fit into an existing temporally-ordered sequence. We use a BART-based conditional generation model that can capture both temporality and common event co-occurrence, meaning it can be flexibly applied to different tasks in this space. Our model is trained as a denoising autoencoder: we take temporally-ordered event sequences, shuffle them, delete some events, and then attempt to recover the original event sequence. This task teaches the model to make inferences given incomplete knowledge about the events in an underlying scenario. On the temporal ordering task, we show that our model is able to unscramble event sequences from existing datasets without access to explicitly labeled temporal training data, outperforming both a BERT-based pairwise model and a BERT-based pointer network. On event infilling, human evaluation shows that our model is able to generate events that fit better temporally into the input events when compared to GPT-2 story completion models.

Submitted 1 July, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

Comments: ACL 2021

arXiv:2010.02429 [pdf, ps, other]

Modeling Preconditions in Text with a Crowd-sourced Dataset

Authors: Heeyoung Kwon, Mahnaz Koupaee, Pratyush Singh, Gargi Sawhney, Anmol Shukla, Keerthi Kumar Kallur, Nathanael Chambers, Niranjan Balasubramanian

Abstract: Preconditions provide a form of logical connection between events that explains why some events occur together and information that is complementary to the more widely studied relations such as causation, temporal ordering, entailment, and discourse relations. Modeling preconditions in text has been hampered in part due to the lack of large scale labeled data grounded in text. This paper introduces PeKo, a crowd-sourced annotation of preconditions between event pairs in newswire, an order of magnitude larger than prior text annotations. To complement this new corpus, we also introduce two challenge tasks aimed at modeling preconditions: (i) Precondition Identification -- a standard classification task defined over pairs of event mentions, and (ii) Precondition Generation -- a generative task aimed at testing a more general ability to reason about a given event. Evaluation on both tasks shows that modeling preconditions is challenging even for today's large language models (LM). This suggests that precondition knowledge is not easily accessible in LM-derived representations alone. Our generation results show that fine-tuning an LM on PeKo yields better conditional relations than when trained on raw text or temporally-ordered corpora.

Submitted 14 October, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

arXiv:2006.14817 [pdf, other]

doi 10.1093/mnras/staa2962

Deep model simulation of polar vortices in gas giant atmospheres

Authors: Ferran Garcia, Frank R. N. Chambers, Anna L. Watts

Abstract: The Cassini and Juno probes have revealed large coherent cyclonic vortices in the polar regions of Saturn and Jupiter, a dramatic contrast from the east-west banded jet structure seen at lower latitudes. Debate has centered on whether the jets are shallow, or extend to greater depths in the planetary envelope. Recent experiments and observations have demonstrated the relevance of deep convection models to a successful explanation of jet structure and cyclonic coherent vortices away from the polar regions have been simulated recently including an additional stratified shallow layer. Here we present new convective models able to produce long-lived polar vortices. Using simulation parameters relevant for giant planet atmospheres we find flow regimes that are in agreement with geostrophic turbulence (GT) theory in rotating convection for the formation of large scale coherent structures via an upscale energy transfer fully three-dimensional. Our simulations generate polar characteristics qualitatively similar to those seen by Juno and Cassini: they match the structure of cyclonic vortices seen on Jupiter; or can account for the existence of a strong polar vortex extending downwards to lower latitudes with a marked spiral morphology and the hexagonal pattern seen on Saturn. Our findings indicate that these vortices can be generated deep in the planetary interior. A transition differentiating these two polar flow regimes is described, interpreted in terms of different force balances and compared with previous shallow atmospheric models which characterised polar vortex dynamics in giant planets. In addition, the heat transport properties are investigated confirming recent scaling laws obtained in the context of reduced models of GT.

Submitted 23 September, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

Comments: 18 pages, 13 figures and 3 tables

MSC Class: 76-10; 76F35; 86-10

arXiv:2006.06382 [pdf, other]

doi 10.1093/mnras/staa1699

Waves in Thin Oceans on Oblate Neutron Stars

Authors: Bart F. A. van Baal, Frank R. N. Chambers, Anna L. Watts

Abstract: Waves in thin fluid layers are important in various stellar and planetary problems. Due to rapid rotation such systems will become oblate, with a latitudinal variation in the gravitational acceleration across the surface of the object. In the case of accreting neutron stars, rapid rotation could lead to a polar radius smaller than the equatorial radius by a factor $\sim 0.8$. We investigate how the oblateness and a changing gravitational acceleration affect different hydrodynamic modes that exist in such fluid layers through analytic approximations and numerical calculations. The wave vectors of $g$-modes and Yanai modes increase for more oblate systems compared to spherical counterparts, although the impact of variations in the changing gravitational acceleration is effectively negligible. We find that for increased oblateness, Kelvin modes show less equatorial confinement and little change in their wave vector. For $r$-modes, we find that for more oblate systems the wave vector decreases. The exact manner of these changes for the $r$-modes depends on the model for the gravitational acceleration across the surface.

Submitted 11 June, 2020; originally announced June 2020.

Comments: 10 pages, 8 figures. Accepted for publication in MNRAS

arXiv:2006.05489 [pdf, other]

Modeling Label Semantics for Predicting Emotional Reactions

Authors: Radhika Gaonkar, Heeyoung Kwon, Mohaddeseh Bastan, Niranjan Balasubramanian, Nathanael Chambers

Abstract: Predicting how events induce emotions in the characters of a story is typically seen as a standard multi-label classification task, which usually treats labels as anonymous classes to predict. They ignore information that may be conveyed by the emotion labels themselves. We propose that the semantics of emotion labels can guide a model's attention when representing the input story. Further, we observe that the emotions evoked by an event are often related: an event that evokes joy is unlikely to also evoke sadness. In this work, we explicitly model label classes via label embeddings, and add mechanisms that track label-label correlations both during training and inference. We also introduce a new semi-supervision strategy that regularizes for the correlations on unlabeled data. Our empirical evaluations show that modeling label semantics yields consistent benefits, and we advance the state-of-the-art on an emotion inference task.

Submitted 28 June, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: 6 pages, 2 figures, published in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

arXiv:2004.03762 [pdf, other]

Generating Narrative Text in a Switching Dynamical System

Authors: Noah Weber, Leena Shekhar, Heeyoung Kwon, Niranjan Balasubramanian, Nathanael Chambers

Abstract: Early work on narrative modeling used explicit plans and goals to generate stories, but the language generation itself was restricted and inflexible. Modern methods use language models for more robust generation, but often lack an explicit representation of the scaffolding and dynamics that guide a coherent narrative. This paper introduces a new model that integrates explicit narrative structure with neural language models, formalizing narrative modeling as a Switching Linear Dynamical System (SLDS). A SLDS is a dynamical system in which the latent dynamics of the system (i.e. how the state vector transforms over time) is controlled by top-level discrete switching variables. The switching variables represent narrative structure (e.g., sentiment or discourse states), while the latent state vector encodes information on the current state of the narrative. This probabilistic formulation allows us to control generation, and can be learned in a semi-supervised fashion using both labeled and unlabeled data. Additionally, we derive a Gibbs sampler for our model that can fill in arbitrary parts of the narrative, guided by the switching variables. Our filled-in (English language) narratives outperform several baselines on both automatic and human evaluations.

Submitted 7 April, 2020; originally announced April 2020.

arXiv:1912.05369 [pdf, ps, other]

doi 10.1093/mnras/stz3449

Relativistic ocean $r$-modes during type-I X-ray bursts

Authors: Frank R. N. Chambers, Anna L. Watts

Abstract: Accreting neutron stars (NS) can exhibit high frequency modulations in their lightcurves during thermonuclear X-ray bursts, known as burst oscillations. These frequencies can be offset from the NS spin frequency by several Hz (where known independently) and can drift by 1-3 Hz. One plausible explanation is that a wave is present in the bursting ocean, the rotating frame frequency of which is the offset. The frequency of the wave should decrease (in the rotating frame) as the burst cools hence explaining the drift. A strong candidate is a buoyant $r$-mode. To date, models that calculated the frequency of this mode taking into account the radial structure neglected relativistic effects and predicted rotating frame frequencies of $\sim$ 4 Hz and frequency drifts of > 5 Hz; too large to be consistent with observations. We present a calculation that includes frame-dragging and gravitational redshift that reduces the rotating frame frequency by up to 30 % and frequency drift by up to 20 %. Updating previous models for the ocean cooling in the aftermath of the burst to a model more representative of detailed calculations of thermonuclear X-ray bursts reduces the frequency of the mode still further. This model, combined with relativistic effects, can reduce the rotating frequency of the mode to $\sim$ 2 Hz and frequency drift to $\sim$ 2 Hz, which is closer to the observed values.

Submitted 11 December, 2019; originally announced December 2019.

arXiv:1811.12111 [pdf, ps, other]

doi 10.3847/1538-4357/aaf501

Burning in the tail: implications for a burst oscillation model

Authors: Frank R. N. Chambers, Anna L. Watts, L. Keek, Yuri Cavecchi, F. Garcia

Abstract: Accreting neutron stars (NS) can exhibit high-frequency modulations, known as burst oscillations, in their lightcurves during thermonuclear X-ray bursts. Their frequencies can be offset from the spin frequency of the NS (known independently) by several Hz, and can drift by 1-3 Hz. One plausible explanation for this phenomenon is that a wave is present in the bursting ocean that decreases in frequency (in the rotating frame) as the burst cools. The strongest candidate is the buoyant $r$-mode; however, models for the burning ocean background used in previous studies over-predict frequency drifts by several Hz. Using new background models (which include shallow heating, and burning in the tail of the burst) the evolution of the buoyant $r$-mode is calculated. The resulting frequency drifts are smaller, in line with observations. This illustrates the importance of accounting for the detailed nuclear physics in these bursts.

Submitted 29 November, 2018; originally announced November 2018.

Comments: 7 pages, 2 figures

arXiv:1808.09542 [pdf, other]

Hierarchical Quantized Representations for Script Generation

Authors: Noah Weber, Leena Shekhar, Niranjan Balasubramanian, Nathanael Chambers

Abstract: Scripts define knowledge about how everyday scenarios (such as going to a restaurant) are expected to unfold. One of the challenges to learning scripts is the hierarchical nature of the knowledge. For example, a suspect arrested might plead innocent or guilty, and a very different track of events is then expected to happen. To capture this type of information, we propose an autoencoder model with a latent space defined by a hierarchy of categorical variables. We utilize a recently proposed vector quantization based approach, which allows continuous embeddings to be associated with each latent variable value. This permits the decoder to softly decide what portions of the latent hierarchy to condition on by attending over the value embeddings for a given setting. Our model effectively encodes and generates scripts, outperforming a recent language modeling-based method on several standard tasks, and allowing the autoencoder model to achieve substantially lower perplexity scores compared to the previous language modeling-based method.

Submitted 28 August, 2018; originally announced August 2018.

Comments: EMNLP 2018

arXiv:1807.05120 [pdf, other]

doi 10.1103/PhysRevFluids.3.123501

Thermal convection in rotating spherical shells: temperature-dependent internal heat generation using the example of triple-$α$ burning in neutron stars

Authors: F. Garcia, F. R. N. Chambers, A. L. Watts

Abstract: We present an extensive study of Boussinesq thermal convection including a temperature-dependent internal heating source, based on numerical three-dimensional simulations. The temperature dependence mimics triple-$α$ nuclear reactions and the fluid geometry is a rotating spherical shell. These are key ingredients for the study of convective accreting neutron star oceans. A dimensionless parameter ${\rm Ra}_n$, measuring the relevance of nuclear heating, is defined. We explore how flow characteristics change with increasing ${\rm Ra}_n$ and give an astrophysical motivation. The onset of convection is investigated with respect to this parameter and periodic, quasiperiodic, chaotic flows with coherent structures, and fully turbulent flows are exhibited as ${\rm Ra}_n$ is varied. Several regime transitions are identified and compared with previous results on differentially heated convection. Finally, we explore (tentatively) the potential applicability of our results to the evolution of thermonuclear bursts in accreting neutron star oceans.

Submitted 14 December, 2018; v1 submitted 13 July, 2018; originally announced July 2018.

Comments: 9 Figures, 6 Tables. Published in the Physical Review Fluids journal

MSC Class: 76E20; 76F06; 76U05

Journal ref: Phys. Rev. Fluids 3, 123501 (2018)

arXiv:1806.10679 [pdf, other]

doi 10.1103/PhysRevFluids.4.074802

Polar waves and chaotic flows in thin rotating spherical shells

Authors: F. Garcia, F. R. N. Chambers, A. L. Watts

Abstract: Convection in rotating spherical geometries is an important physical process in planetary and stellar systems. Using continuation methods at low Prandtl number, we find both strong equatorially asymmetric and symmetric polar nonlinear rotating waves in a model of thermal convection in thin rotating spherical shells with stress-free boundary conditions. For the symmetric waves convection is confined to high latitude in both hemispheres but is only restricted to one hemisphere close to the pole in the case of asymmetric waves. This is in contrast to what is previously known from studies in the field. These periodic flows, in which the pattern is rotating steadily in the azimuthal direction, develop a strong axisymmetric component very close to onset. Using stability analysis of periodic orbits the regions of stability are determined and the topology of the stable/unstable oscillatory flows bifurcated from the branches of rotating waves is described. By means of direct numerical simulations of these oscillatory chaotic flows, we show that these three-dimensional convective polar flows exhibit characteristics, such as force balance or mean physical properties, which are similar to flows occurring in planetary atmospheres. We show that these results may open a route to understanding unexplained features of gas giant atmospheres, in particular for the case of Jupiter. These include the observed equatorial asymmetry with a pronounced decrease at the equator (the so-called dimple), and the coherent vortices surrounding the poles recently observed by the Juno mission.

Submitted 6 July, 2019; v1 submitted 27 June, 2018; originally announced June 2018.

Comments: Published in Physical Review Fluids (2019). Contains 2 tables and 8 figures

MSC Class: 76E20 37G35 37G40 65P30 65P40

Journal ref: Phys. Rev. Fluids 4, 074802 (2019)

arXiv:1804.02189 [pdf, ps, other]

doi 10.1093/mnras/sty895

Superburst oscillations: ocean and crustal modes excited by Carbon-triggered Type I X-ray bursts

Authors: Frank R. N. Chambers, Anna L. Watts, Yuri Cavecchi, F. Garcia, L. Keek

Abstract: Accreting neutron stars (NS) can exhibit high frequency modulations in their lightcurves during thermonuclear X-ray bursts, known as burst oscillations. The frequencies can be offset from the spin frequency of the NS by several Hz, and can drift by 1-3 Hz. One possible explanation is a mode in the bursting ocean, the frequency of which would decrease (in the rotating frame) as the burst cools, hence explaining the drifts. Most burst oscillations have been observed during H/He triggered bursts; however, there has been one observation of oscillations during a superburst, an hours-long Type I X-ray burst caused by unstable carbon burning deeper in the ocean. This paper calculates the frequency evolution of an oceanic r-mode during a superburst. The rotating frame frequency varies during the burst from 4-14 Hz, and is sensitive to the background parameters, in particular the temperature of the ocean and ignition depth. This calculation is compared to the superburst oscillations observed on 4U-1636-536. The predicted mode frequencies ($\sim$ 10 Hz) would require a spin frequency of $\sim$ 592 Hz to match observations; 6 Hz higher than the spin inferred from an oceanic r-mode model for the H/He triggered burst oscillations. This model also over-predicts the frequency drift during the superburst by 90 %.

Submitted 6 April, 2018; originally announced April 2018.

Comments: Accepted for publication in MNRAS

arXiv:1802.03071 [pdf, other]

The onset of low Prandtl number thermal convection in thin spherical shells

Authors: F. Garcia, F. R. N. Chambers, A. L. Watts

Abstract: This study considers the onset of stress-free Boussinesq thermal convection in rotating spherical shells with aspect ratio $η=r_i/r_o=0.9$ ($r_i$ and $r_o$ being the inner and outer radius), Prandtl numbers ${\rm Pr} \in[10^{-4},10^{-1}]$, and Taylor numbers ${\rm Ta}\in[10^{4},10^{12}]$. We are particularly interested in the form of the convective cell pattern that develops, and in its time scales, since this may have observational consequences. For a fixed ${\rm Ta}<10^{9}$ and by decreasing ${\rm Pr}$ from 0.1 to $10^{-4}$ a transition between spiralling columnar (SC) and equatorially-attached (EA) modes, and a transition between EA and equatorially antisymmetric or symmetric polar (AP/SP) weakly multicellular modes are found. The latter modes are preferred at very low ${\rm Pr}$. Surprisingly, for ${\rm Ta}>3\times 10^{9}$ the unicellular polar modes also become preferred at moderate ${\rm Pr}\sim10^{-2}$ because two new transition curves between EA and AP/SP and between AP/SP and SC modes are born at a triple-point bifurcation. The dependence on ${\rm Pr}$ and ${\rm Ta}$ of the transitions is studied to estimate the type of modes, and their critical parameters, preferred at different stellar regimes.

Submitted 15 May, 2018; v1 submitted 8 February, 2018; originally announced February 2018.

Comments: Accepted for publication in Physical Review Fluids. Contains 17 pages, 8 figures and 3 tables. Added brief erratum correcting values used for estimates of neutron star ocean viscosity

MSC Class: 76E20

arXiv:1711.07611 [pdf, other]

Event Representations with Tensor-based Compositions

Authors: Noah Weber, Niranjan Balasubramanian, Nathanael Chambers

Abstract: Robust and flexible event representations are important to many core areas in language understanding. Scripts were proposed early on as a way of representing sequences of events for such understanding, and have recently attracted renewed attention. However, obtaining effective representations for modeling script-like event sequences is challenging. It requires representations that can capture event-level and scenario-level semantics. We propose a new tensor-based composition method for creating event representations. The method captures more subtle semantic interactions between an event and its entities and yields representations that are effective at multiple event-related tasks. With the continuous representations, we also devise a simple schema generation method which produces better schemas compared to a prior discrete representation based method. Our analysis shows that the tensors capture distinct usages of a predicate even when there are only subtle differences in their surface realizations.

Submitted 20 November, 2017; originally announced November 2017.

Comments: Accepted at AAAI 2018

arXiv:1604.01696 [pdf, other]

A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories

Authors: Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, James Allen

Abstract: Representation and learning of commonsense knowledge is one of the foundational problems in the quest to enable deep language understanding. This issue is particularly challenging for understanding causal and correlational relationships between events. While this topic has received a lot of interest in the NLP community, research has been hindered by the lack of a proper evaluation framework. This paper attempts to address this problem with a new framework for evaluating story understanding and script learning: the 'Story Cloze Test'. This test requires a system to choose the correct ending to a four-sentence story. We created a new corpus of ~50k five-sentence commonsense stories, ROCStories, to enable this evaluation. This corpus is unique in two ways: (1) it captures a rich set of causal and temporal commonsense relations between daily events, and (2) it is a high quality collection of everyday life stories that can also be used for story generation. Experimental evaluation shows that a host of baselines and state-of-the-art models based on shallow language understanding struggle to achieve a high score on the Story Cloze Test. We discuss these implications for script and story learning, and offer suggestions for deeper language understanding.

Submitted 6 April, 2016; originally announced April 2016.

Comments: In Proceedings of the 2016 North American Chapter of the ACL (NAACL HLT), 2016

Showing 1–22 of 22 results for author: Chambers, N