-
Causal interventions expose implicit situation models for commonsense language understanding
Authors:
Takateru Yamakoshi,
James L. McClelland,
Adele E. Goldberg,
Robert D. Hawkins
Abstract:
Accounts of human language processing have long appealed to implicit ``situation models'' that enrich comprehension with relevant but unstated world knowledge. Here, we apply causal intervention techniques to recent transformer models to analyze performance on the Winograd Schema Challenge (WSC), where a single context cue shifts interpretation of an ambiguous pronoun. We identify a relatively small circuit of attention heads that are responsible for propagating information from the context word that guides which of the candidate noun phrases the pronoun ultimately attends to. We then compare how this circuit behaves in a closely matched ``syntactic'' control where the situation model is not strictly necessary. These analyses suggest distinct pathways through which implicit situation models are constructed to guide pronoun resolution.
Submitted 7 June, 2023; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Probing BERT's priors with serial reproduction chains
Authors:
Takateru Yamakoshi,
Thomas L. Griffiths,
Robert D. Hawkins
Abstract:
Sampling is a promising bottom-up method for exposing what generative models have learned about language, but it remains unclear how to generate representative samples from popular masked language models (MLMs) like BERT. The MLM objective yields a dependency network with no guarantee of consistent conditional distributions, posing a problem for naive approaches. Drawing from theories of iterated learning in cognitive science, we explore the use of serial reproduction chains to sample from BERT's priors. In particular, we observe that a unique and consistent estimator of the ground-truth joint distribution is given by a Generative Stochastic Network (GSN) sampler, which randomly selects which token to mask and reconstruct on each step. We show that the lexical and syntactic statistics of sentences from GSN chains closely match the ground-truth corpus distribution and perform better than other methods in a large corpus of naturalness judgments. Our findings establish a firmer theoretical foundation for bottom-up probing and highlight richer deviations from human priors.
Submitted 18 March, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
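The sampling step described in the abstract above (randomly choose one token position, mask it, and reconstruct it from the model's conditional distribution) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_conditional` is a hypothetical stand-in for a masked language model such as BERT, and `VOCAB`, `gsn_step`, and `gsn_chain` are names invented here for illustration.

```python
import random

# Hypothetical toy vocabulary; a real MLM would use its own tokenizer vocabulary.
VOCAB = ["the", "cat", "dog", "sat", "ran"]


def toy_conditional(tokens, position):
    """Hypothetical stand-in for an MLM's p(token at position | other tokens).

    It simply favors words that do not already appear elsewhere in the sentence.
    """
    others = tokens[:position] + tokens[position + 1:]
    weights = [1.0 if word in others else 3.0 for word in VOCAB]
    total = sum(weights)
    return [w / total for w in weights]


def gsn_step(tokens, conditional, rng):
    """One step: pick a random position, 'mask' it, and resample it."""
    position = rng.randrange(len(tokens))
    probs = conditional(tokens, position)
    new_token = rng.choices(VOCAB, weights=probs, k=1)[0]
    updated = list(tokens)
    updated[position] = new_token
    return updated


def gsn_chain(initial, conditional, n_steps, seed=0):
    """Run a serial reproduction chain and return every visited state."""
    rng = random.Random(seed)
    state = list(initial)
    samples = []
    for _ in range(n_steps):
        state = gsn_step(state, conditional, rng)
        samples.append(list(state))
    return samples


if __name__ == "__main__":
    for sentence in gsn_chain(["the", "cat", "sat"], toy_conditional, 5):
        print(" ".join(sentence))
```

Because the position is chosen at random on every step, each state differs from the previous one in at most one token; swapping `toy_conditional` for a real model's masked-token distribution would be the natural next step, under the assumption that the model exposes such a distribution.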
-
Loading ultracold atoms onto nonlinear Bloch states and soliton states in bichromatic lattices
Authors:
Tomotake Yamakoshi,
Shinichi Watanabe
Abstract:
We simulate and analyze an experimental method of loading interacting ultracold atoms onto nontrivial quantum states such as nonlinear Bloch wave and soliton solutions in a 1-dimensional bichromatic lattice. Of the standard bands, inverted bands, and bands with Dirac-like points permitted by a bichromatic lattice, we consider the case of an inverted band and examine the loading process in terms of nonlinear Bloch waves formed by an aggregate of ultracold atoms described by the mean-field model. Specifically, we solved the Gross-Pitaevskii equation numerically and found an appropriate standing-wave pulse sequence for the inverted band; this sequence proved to be a suitable protocol for producing soliton solutions. In addition, we examined the effect of an external potential and dynamical instabilities on the post-loading process. We also provide an appropriate data set for future experimental realization of our findings.
Submitted 4 June, 2021; v1 submitted 9 March, 2021;
originally announced March 2021.
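For orientation, the mean-field model named in the abstract above is usually written as the 1-dimensional time-dependent Gross-Pitaevskii equation. The form below is the standard textbook one, not taken from the paper: the symbols $\psi$ (condensate wave function), $m$ (atomic mass), $V(x,t)$ (external potential, here assumed to contain the pulsed bichromatic lattice plus any trap), and $g$ (effective 1-dimensional interaction strength) are notational assumptions.

```latex
i\hbar \frac{\partial \psi(x,t)}{\partial t}
  = \left[ -\frac{\hbar^{2}}{2m}\frac{\partial^{2}}{\partial x^{2}}
           + V(x,t) + g\,\lvert \psi(x,t) \rvert^{2} \right] \psi(x,t)
```

Setting $g = 0$ recovers the linear Schrödinger equation, which is the non-interacting limit.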
-
Investigating representations of verb bias in neural language models
Authors:
Robert D. Hawkins,
Takateru Yamakoshi,
Thomas L. Griffiths,
Adele E. Goldberg
Abstract:
Languages typically provide more than one grammatical construction to express certain types of messages. A speaker's choice of construction is known to depend on multiple factors, including the choice of main verb -- a phenomenon known as \emph{verb bias}. Here we introduce DAIS, a large benchmark dataset containing 50K human judgments for 5K distinct sentence pairs in the English dative alternation. This dataset includes 200 unique verbs and systematically varies the definiteness and length of arguments. We use this dataset, as well as an existing corpus of naturally occurring data, to evaluate how well recent neural language models capture human preferences. Results show that larger models perform better than smaller models, and transformer architectures (e.g. GPT-2) tend to out-perform recurrent architectures (e.g. LSTMs) even under comparable parameter and training settings. Additional analyses of internal feature representations suggest that transformers may better integrate specific lexical information with grammatical constructions.
Submitted 15 October, 2020; v1 submitted 5 October, 2020;
originally announced October 2020.
-
Fast and selective inter-band transfer of ultracold atoms in bichromatic lattices permitting Dirac points
Authors:
Tomotake Yamakoshi,
Shinichi Watanabe
Abstract:
An experimental group in Beijing [Yueyang Zhai ${\it et}$ ${\it al.}$, Phys. Rev. A ${\bf 87}$, 063638 (2013)] introduced a standing-wave pulse-sequence method for efficiently preparing ultracold bosonic atoms in a specific excited band of a 1-dimensional optical lattice. Here, we report a theoretical extension of their work to a 1-dimensional bichromatic superlattice. We find that varying the lattice parameters leads to the so-called Dirac point, where a pair of excited bands crosses. This paper thus discusses ${\it simultaneously}$ the efficient excitation of the wave packet to the proximity of the Dirac point and its subsequent dynamics in the force field of a parabolic trap. With the aid of a toy model, we theoretically unravel the mechanism of the efficient preparation, and then numerically explore optimal pulse-sequence parameters for a realistic situation. We find an optimized sequence of a bichromatic optical lattice that excites more than 99% of the atoms to the 1st and 2nd excited bands within 100 $\mu$s without the harmonic trap. Our main finding is that the system permitting the Dirac point possesses a region of parameters where the excited energy bands become nearly parabolic, conducive to robust coherence and isochronicity. We also provide an appropriate data set for future experimentation, including effects of the atom-atom interaction by way of the mean-field nonlinear term.
Submitted 12 June, 2018;
originally announced June 2018.
-
A significantly stable mode of the ultracold atomic wave packet in amplitude modulated parabolic optical lattices
Authors:
Tomotake Yamakoshi,
Farhan Saif,
Shinichi Watanabe
Abstract:
We show that a conspicuous wave packet of ultracold noninteracting bosonic atoms emerges in a 1-dimensional parabolic optical lattice as in the setup of the Aarhus experiment [P. L. Pedersen ${\it et}$ ${\it al.}$, Phys. Rev. A ${\bf 88}$, 023620 (2013)], provided the lattice height is harmonically modulated with a particular amplitude at a resonant frequency. We show that this wave packet, coined "${\it 4bandPWP}$" here, executes stable time-periodic motion for an infinitely long time. We apply Floquet theory to analyze the parameter dependence of ${\it 4bandPWP}$ in detail. Our analysis shows that it consists mainly of two principal Floquet eigenstates of the periodically driven Hamiltonian. The informative Husimi representation yields temporal slices of the phase space of ${\it 4bandPWP}$, visually identifying moments where the inter-band transitions take place. The provided data should aid experiments in locating ${\it 4bandPWP}$.
Submitted 29 September, 2017;
originally announced September 2017.
-
Dynamics of fermions in an amplitude modulated lattice
Authors:
Tomotake Yamakoshi,
Shinichi Watanabe,
Shun Ohgoda,
Alexander P. Itin
Abstract:
We study the dynamics of fermions loaded in an optical lattice with a superimposed parabolic trap potential. In the recent Hamburg experiments [J. Heinze et al., Phys. Rev. Lett. 110, 085302 (2013)] on quantum simulation of photoconductivity, a modulation pulse on the optical lattice transferred part of the population of the lowest band to an excited band, leaving a hole in the particle distribution of the lowest band. The subsequent intricate dynamics of both excited particles and holes can be explained by a semiclassical approach based on the evolution of the Wigner function. Here we provide a more detailed analysis of the dynamics, taking into account the dimensionality of the system and finite-temperature effects, aiming at reproducing experimental results on longer timescales. A semiclassical wave packet is constructed more accurately than in the previous theory. As a result, the semiclassical dynamics reproduces experimental data and full quantum numerical calculations with much better accuracy. In particular, the fascinating phenomenon of collapse and revival of holes is investigated in more detail. We presume the experimental setup can be used for deeper exploration of nonlinear waves in fermionic gases.
Submitted 6 June, 2016; v1 submitted 31 March, 2016;
originally announced March 2016.
-
Single-particle Analysis of Non-interacting Ultracold Bosons in Amplitude Modulated Parabolic Optical Lattice
Authors:
Tomotake Yamakoshi,
Shinichi Watanabe
Abstract:
Ultracold atoms in the combined potential of a parabolic trap and an optical lattice are considered a promising tool for coherent manipulation of matter wave packets. The recent Aarhus experiment [P. L. Pedersen et al., Phys. Rev. A 88, 023620 (2013)] produced wave packets by applying amplitude modulation of the optical lattice to a Bose-Einstein condensate (BEC) of $^{87}$Rb. The present paper renders a theoretical account, based on single-particle analysis, of this experimental production of the wave packets and their subsequent time evolution. We focus on the one-dimensional non-interacting bosonic system as a fundamental starting point for accurate quantum analysis and for further investigation of similar experiments. We show that a simple Rabi-oscillation model gives a good description of the wave packet production in terms of the inter-band transition, while first-order perturbation theory proves inadequate; that is, the recent experiment already reached the realm of high-order couplings. As a natural extension, we demonstrate enhancement of the wave packet production by the two-step Rabi-oscillation method using either a single frequency or dual frequencies. We assess the high-order Bragg reflection and Landau-Zener transition at a band gap with the aid of rigorous quantum time propagation as well as the semiclassical theory exploited earlier by the Hamburg experiment [J. Heinze et al., PRL 107, 135303 (2011)]. Complicated reflections and splittings of the wave packet during free evolution may be largely attributed to the intertwining of these two effects.
Submitted 26 September, 2014;
originally announced September 2014.
-
Stochastic and equilibrium pictures of the ultracold FFR molecular conversion rate
Authors:
Tomotake Yamakoshi,
Shinichi Watanabe,
Chen Zhang,
Chris H. Greene
Abstract:
The ultracold molecular conversion rate occurring in an adiabatic ramp through a Fano-Feshbach resonance is studied and compared in two statistical models. One model, the so-called stochastic phase space sampling (SPSS) [E. Hodby et al., PRL 94, 120402 (2005)], evaluates the overlap of two atomic distributions in phase space by sampling atomic pairs according to a phase-space criterion. The other model, the chemical equilibrium theory (ChET) [S. Watabe and T. Nikuni, PRA 77, 013616 (2008)], considers atomic and molecular distributions in the limit of chemical and thermal equilibrium. The present study applies SPSS and ChET to a prototypical system of K+K $\rightarrow$ K$_2$ in all the symmetry combinations, namely Fermi-Fermi, Bose-Bose, and Bose-Fermi cases. To examine implications of the phase-space criterion for SPSS, the behavior of molecular conversion is analyzed using four distinct geometrical constraints. Our comparison of the results of SPSS with those of ChET shows that while they appear similar in most situations, the two models give rise to rather dissimilar behaviors when the presence of a Bose-Einstein condensate (BEC) strongly affects the molecule formation.
Submitted 8 March, 2013;
originally announced March 2013.