-
Personalizing Dialogue Agents: I have a dog, do you have pets too?
Authors:
Saizheng Zhang,
Emily Dinan,
Jack Urbanek,
Arthur Szlam,
Douwe Kiela,
Jason Weston
Abstract:
Chit-chat models are known to have several problems: they lack specificity, do not display a consistent personality and are often not very captivating. In this work we present the task of making chit-chat more engaging by conditioning on profile information. We collect data and train models to (i) condition on their given profile information; and (ii) information about the person they are talking to, resulting in improved dialogues, as measured by next utterance prediction. Since (ii) is initially unknown our model is trained to engage its partner with personal topics, and we show the resulting dialogue can be used to predict profile information about the interlocutors.
Submitted 25 September, 2018; v1 submitted 22 January, 2018;
originally announced January 2018.
-
Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent
Authors:
Zhilin Yang,
Saizheng Zhang,
Jack Urbanek,
Will Feng,
Alexander H. Miller,
Arthur Szlam,
Douwe Kiela,
Jason Weston
Abstract:
Contrary to most natural language processing research, which makes use of static datasets, humans learn language interactively, grounded in an environment. In this work we propose an interactive learning procedure called Mechanical Turker Descent (MTD) and use it to train agents to execute natural language commands grounded in a fantasy text adventure game. In MTD, Turkers compete to train better agents in the short term, and collaborate by sharing their agents' skills in the long term. This results in a gamified, engaging experience for the Turkers and a better quality teaching signal for the agents compared to static datasets, as the Turkers naturally adapt the training data to the agent's abilities.
Submitted 16 April, 2018; v1 submitted 21 November, 2017;
originally announced November 2017.
-
Emergent Translation in Multi-Agent Communication
Authors:
Jason Lee,
Kyunghyun Cho,
Jason Weston,
Douwe Kiela
Abstract:
While most machine translation systems to date are trained on large parallel corpora, humans learn language in a different way: by being grounded in an environment and interacting with other humans. In this work, we propose a communication game where two agents, native speakers of their own respective languages, jointly learn to solve a visual referential task. We find that the ability to understand and translate a foreign language emerges as a means to achieve shared goals. The emergent translation is interactive and multimodal, and crucially does not require parallel corpora, but only monolingual, independent text and corresponding images. Our proposed translation model achieves this by grounding the source and target languages into a shared visual modality, and outperforms several baselines on both word-level and sentence-level translation tasks. Furthermore, we show that agents in a multilingual community learn to translate better and faster than in a bilingual communication setting.
Submitted 10 April, 2018; v1 submitted 11 October, 2017;
originally announced October 2017.
-
Transient and Sharvin resistances of Luttinger liquids
Authors:
Thomas Kloss,
Joseph Weston,
Xavier Waintal
Abstract:
Although the intrinsic conductance of an interacting one-dimensional system is renormalized by the electron-electron correlations, it has been known for some time that this renormalization is washed out by the presence of the (non-interacting) electrodes to which the wire is connected. Here, we study the transient conductance of such a wire: a finite voltage bias is suddenly applied across the wire and we measure the current before it has enough time to reach its stationary value. These calculations allow us to extract the Sharvin (contact) resistance of Luttinger and Fermi liquids. In particular, we find that a perfect junction between a Fermi liquid electrode and a Luttinger liquid electrode is characterized by a contact resistance that consists of half the quantum of conductance in series with half the intrinsic resistance of an infinite Luttinger liquid. These results were obtained using two different methods: a dynamical Hartree-Fock approach and a self-consistent Boltzmann approach. Although these methods are formally approximate we find a perfect match with the exact results of Luttinger/Fermi liquid theory.
Submitted 26 April, 2018; v1 submitted 2 October, 2017;
originally announced October 2017.
-
StarSpace: Embed All The Things!
Authors:
Ledell Wu,
Adam Fisch,
Sumit Chopra,
Keith Adams,
Antoine Bordes,
Jason Weston
Abstract:
We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task. Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not.
Submitted 20 November, 2017; v1 submitted 12 September, 2017;
originally announced September 2017.
-
ParlAI: A Dialog Research Software Platform
Authors:
Alexander H. Miller,
Will Feng,
Adam Fisch,
Jiasen Lu,
Dhruv Batra,
Antoine Bordes,
Devi Parikh,
Jason Weston
Abstract:
We introduce ParlAI (pronounced "par-lay"), an open-source software platform for dialog research implemented in Python, available at http://parl.ai. Its goal is to provide a unified framework for sharing, training and testing of dialog models, integration of Amazon Mechanical Turk for data collection, human evaluation, and online/reinforcement learning; and a repository of machine learning models for comparing with others' models, and improving upon existing architectures. Over 20 tasks are supported in the first release, including popular datasets such as SQuAD, bAbI tasks, MCTest, WikiQA, QACNN, QADailyMail, CBT, bAbI Dialog, Ubuntu, OpenSubtitles and VQA. Several models are integrated, including neural models such as memory networks, seq2seq and attentive LSTMs.
Submitted 8 March, 2018; v1 submitted 18 May, 2017;
originally announced May 2017.
-
Reading Wikipedia to Answer Open-Domain Questions
Authors:
Danqi Chen,
Adam Fisch,
Jason Weston,
Antoine Bordes
Abstract:
This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article. This task of machine reading at scale combines the challenges of document retrieval (finding the relevant articles) with that of machine comprehension of text (identifying the answer spans from those articles). Our approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs. Our experiments on multiple existing QA datasets indicate that (1) both modules are highly competitive with respect to existing counterparts and (2) multitask learning using distant supervision on their combination is an effective complete system on this challenging task.
Submitted 27 April, 2017; v1 submitted 31 March, 2017;
originally announced April 2017.
-
Determination of muon momentum in the MicroBooNE LArTPC using an improved model of multiple Coulomb scattering
Authors:
MicroBooNE collaboration,
P. Abratenko,
R. Acciarri,
C. Adams,
R. An,
J. Asaadi,
M. Auger,
L. Bagby,
S. Balasubramanian,
B. Baller,
C. Barnes,
G. Barr,
M. Bass,
F. Bay,
M. Bishai,
A. Blake,
T. Bolton,
L. Bugel,
L. Camilleri,
D. Caratelli,
B. Carls,
R. Castillo Fernandez,
F. Cavanna,
H. Chen,
E. Church
, et al. (123 additional authors not shown)
Abstract:
We discuss a technique for measuring a charged particle's momentum by means of multiple Coulomb scattering (MCS) in the MicroBooNE liquid argon time projection chamber (LArTPC). This method does not require the full particle ionization track to be contained inside of the detector volume as other track momentum reconstruction methods do (range-based momentum reconstruction and calorimetric momentum reconstruction). We motivate use of this technique, describe a tuning of the underlying phenomenological formula, quantify its performance on fully contained beam-neutrino-induced muon tracks both in simulation and in data, and quantify its performance on exiting muon tracks in simulation. Using simulation, we have shown that the standard Highland formula should be re-tuned specifically for scattering in liquid argon, which significantly improves the bias and resolution of the momentum measurement. With the tuned formula, we find agreement between data and simulation for contained tracks, with a small bias in the momentum reconstruction and with resolutions that vary as a function of track length, improving from about 10% for the shortest (one meter long) tracks to 5% for longer (several meter) tracks. For simulated exiting muons with at least one meter of track contained, we find a similarly small bias, and a resolution which is less than 15% for muons with momentum below 2 GeV/c. Above 2 GeV/c, results are given as a first estimate of the MCS momentum measurement capabilities of MicroBooNE for high momentum exiting tracks.
Submitted 5 October, 2017; v1 submitted 17 March, 2017;
originally announced March 2017.
-
The Peculiar Multi-Wavelength Evolution Of V1535 Sco
Authors:
J. D. Linford,
L. Chomiuk,
T. Nelson,
T. Finzell,
F. M. Walter,
J. L. Sokoloski,
K. Mukai,
A. J. Mioduszewski,
A. J. van der Horst,
J. H. S. Weston,
M. P. Rupen
Abstract:
We present multi-wavelength observations of the unusual nova V1535 Sco throughout its outburst in 2015. Early radio observations were consistent with synchrotron emission, and early X-ray observations revealed the presence of high-energy (>1 keV) photons. These indicated that strong shocks were present during the first ~2 weeks of the nova's evolution. The radio spectral energy distribution was consistent with thermal emission from week 2 to week 6. Starting in week 7, the radio emission again showed evidence of synchrotron emission and there was an increase in X-ray emission, indicating a second shock event. The optical spectra show evidence for at least two separate outflows, with the faster outflow possibly having a bipolar morphology. The optical and near infrared light curves and the X-ray measurements of the hydrogen column density indicated that the companion star is likely a K giant.
Submitted 18 May, 2017; v1 submitted 9 March, 2017;
originally announced March 2017.
-
A Detailed Observational Analysis of V1324 Sco, the Most Gamma-Ray Luminous Classical Nova to Date
Authors:
Thomas Finzell,
Laura Chomiuk,
Brian D. Metzger,
Frederick M. Walter,
Justin D. Linford,
Koji Mukai,
Thomas Nelson,
Jennifer H. S. Weston,
Yong Zheng,
Jennifer L. Sokoloski,
Amy Mioduszewski,
Michael P. Rupen,
Subo Dong,
Sumner Starrfield,
C. C. Cheung,
Terry Bohlsen,
Charles E. Woodward,
Gregory B. Taylor,
Terry Bohlsen,
Christian Buil,
Jose Prieto,
R. Mark Wagner,
Thomas Bensby,
I. A. Bond,
T. Sumi
, et al. (11 additional authors not shown)
Abstract:
It has recently been discovered that some, if not all, classical novae emit GeV gamma rays during outburst, but the mechanisms involved in the production of the gamma rays are still not well understood. We present here a comprehensive multi-wavelength dataset, from radio to X-rays, for the most gamma-ray luminous classical nova to date, V1324 Sco. Using this dataset, we show that V1324 Sco is a canonical dusty Fe-II type nova, with a maximum ejecta velocity of 2600 km s$^{-1}$ and an ejecta mass of a few $\times 10^{-5}$ M$_{\odot}$. There is also evidence for complex shock interactions, including a double-peaked radio light curve which shows high brightness temperatures at early times. To explore why V1324 Sco was so gamma-ray luminous, we present a model of the nova ejecta featuring strong internal shocks, and find that higher gamma-ray luminosities result from higher ejecta velocities and/or mass-loss rates. Comparison of V1324 Sco with other gamma-ray detected novae does not show clear signatures of either, and we conclude that a larger sample of similarly well-observed novae is needed to understand the origin and variation of gamma rays in novae.
Submitted 21 November, 2017; v1 submitted 11 January, 2017;
originally announced January 2017.
-
Design and Construction of the MicroBooNE Detector
Authors:
MicroBooNE Collaboration,
R. Acciarri,
C. Adams,
R. An,
A. Aparicio,
S. Aponte,
J. Asaadi,
M. Auger,
N. Ayoub,
L. Bagby,
B. Baller,
R. Barger,
G. Barr,
M. Bass,
F. Bay,
K. Biery,
M. Bishai,
A. Blake,
V. Bocean,
D. Boehnlein,
V. D. Bogert,
T. Bolton,
L. Bugel,
C. Callahan,
L. Camilleri
, et al. (215 additional authors not shown)
Abstract:
This paper describes the design and construction of the MicroBooNE liquid argon time projection chamber and associated systems. MicroBooNE is the first phase of the Short Baseline Neutrino program, located at Fermilab, and will utilize the capabilities of liquid argon detectors to examine a rich assortment of physics topics. In this document details of design specifications, assembly procedures, and acceptance tests are reported.
Submitted 17 January, 2017; v1 submitted 17 December, 2016;
originally announced December 2016.
-
Learning through Dialogue Interactions by Asking Questions
Authors:
Jiwei Li,
Alexander H. Miller,
Sumit Chopra,
Marc'Aurelio Ranzato,
Jason Weston
Abstract:
A good dialogue agent should have the ability to interact with users by both responding to questions and by asking questions, and importantly to learn from both types of interaction. In this work, we explore this direction by designing a simulator and a set of synthetic tasks in the movie domain that allow such interactions between a learner and a teacher. We investigate how a learner can benefit from asking questions in both offline and online reinforcement learning settings, and demonstrate that the learner improves when asking questions. Finally, real experiments with Mechanical Turk validate the approach. Our work represents a first step in developing such end-to-end learned interactive dialogue agents.
Submitted 13 February, 2017; v1 submitted 15 December, 2016;
originally announced December 2016.
-
Tracking the World State with Recurrent Entity Networks
Authors:
Mikael Henaff,
Jason Weston,
Arthur Szlam,
Antoine Bordes,
Yann LeCun
Abstract:
We introduce a new model, the Recurrent Entity Network (EntNet). It is equipped with a dynamic long-term memory which allows it to maintain and update a representation of the state of the world as it receives new data. For language understanding tasks, it can reason on-the-fly as it reads text, not just when it is required to answer a question or respond as is the case for a Memory Network (Sukhbaatar et al., 2015). Like a Neural Turing Machine or Differentiable Neural Computer (Graves et al., 2014; 2016) it maintains a fixed size memory and can learn to perform location and content-based read and write operations. However, unlike those models it has a simple parallel architecture in which several memory locations can be updated simultaneously. The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting. We also demonstrate that it can solve a reasoning task which requires a large number of supporting facts, which other methods are not able to solve, and can generalize past its training horizon. It can also be practically used on large scale datasets such as Children's Book Test, where it obtains competitive performance, reading the story in a single pass.
Submitted 10 May, 2017; v1 submitted 12 December, 2016;
originally announced December 2016.
-
Dialogue Learning With Human-In-The-Loop
Authors:
Jiwei Li,
Alexander H. Miller,
Sumit Chopra,
Marc'Aurelio Ranzato,
Jason Weston
Abstract:
An important aspect of developing conversational agents is to give a bot the ability to improve through communicating with humans and to learn from the mistakes that it makes. Most research has focused on learning from fixed training sets of labeled data rather than interacting with a dialogue partner in an online fashion. In this paper we explore this direction in a reinforcement learning setting where the bot improves its question-answering ability from feedback a teacher gives following its generated responses. We build a simulator that tests various aspects of such learning in a synthetic environment, and introduce models that work in this regime. Finally, real experiments with Mechanical Turk validate the approach.
Submitted 13 January, 2017; v1 submitted 29 November, 2016;
originally announced November 2016.
-
Convolutional Neural Networks Applied to Neutrino Events in a Liquid Argon Time Projection Chamber
Authors:
MicroBooNE collaboration,
R. Acciarri,
C. Adams,
R. An,
J. Asaadi,
M. Auger,
L. Bagby,
B. Baller,
G. Barr,
M. Bass,
F. Bay,
M. Bishai,
A. Blake,
T. Bolton,
L. Bugel,
L. Camilleri,
D. Caratelli,
B. Carls,
R. Castillo Fernandez,
F. Cavanna,
H. Chen,
E. Church,
D. Cianci,
G. H. Collin,
J. M. Conrad
, et al. (114 additional authors not shown)
Abstract:
We present several studies of convolutional neural networks applied to data coming from the MicroBooNE detector, a liquid argon time projection chamber (LArTPC). The algorithms studied include the classification of single particle images, the localization of single particle and neutrino interactions in an image, and the detection of a simulated neutrino event overlaid with cosmic ray backgrounds taken from real detector data. These studies demonstrate the potential of convolutional neural networks for particle identification or event detection on simulated neutrino interactions. We also address technical issues that arise when applying this technique to data from a large LArTPC at or near ground level.
Submitted 16 November, 2016;
originally announced November 2016.
-
Key-Value Memory Networks for Directly Reading Documents
Authors:
Alexander Miller,
Adam Fisch,
Jesse Dodge,
Amir-Hossein Karimi,
Antoine Bordes,
Jason Weston
Abstract:
Directly reading documents and being able to answer questions from them is an unsolved challenge. To avoid its inherent difficulty, question answering (QA) has been directed towards using Knowledge Bases (KBs) instead, which has proven effective. Unfortunately KBs often suffer from being too restrictive, as the schema cannot support certain types of answers, and too sparse, e.g. Wikipedia contains much more information than Freebase. In this work we introduce a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation. To compare using KBs, information extraction or Wikipedia documents directly in a single framework we construct an analysis tool, WikiMovies, a QA dataset that contains raw text alongside a preprocessed KB, in the domain of movies. Our method reduces the gap between all three settings. It also achieves state-of-the-art results on the existing WikiQA benchmark.
Submitted 10 October, 2016; v1 submitted 9 June, 2016;
originally announced June 2016.
-
Learning End-to-End Goal-Oriented Dialog
Authors:
Antoine Bordes,
Y-Lan Boureau,
Jason Weston
Abstract:
Traditional dialog systems used in goal-oriented applications require a lot of domain-specific handcrafting, which hinders scaling up to new domains. End-to-end dialog systems, in which all components are trained from the dialogs themselves, escape this limitation. But the encouraging success recently obtained in chit-chat dialog may not carry over to goal-oriented settings. This paper proposes a testbed to break down the strengths and shortcomings of end-to-end dialog systems in goal-oriented applications. Set in the context of restaurant reservation, our tasks require manipulating sentences and symbols, so as to properly conduct conversations, issue API calls and use the outputs of such calls. We show that an end-to-end dialog system based on Memory Networks can reach promising, yet imperfect, performance and learn to perform non-trivial operations. We confirm those results by comparing our system to a hand-crafted slot-filling baseline on data from the second Dialog State Tracking Challenge (Henderson et al., 2014a). We show similar result patterns on data extracted from an online concierge service.
Submitted 30 March, 2017; v1 submitted 24 May, 2016;
originally announced May 2016.
-
Dialog-based Language Learning
Authors:
Jason Weston
Abstract:
A long-term goal of machine learning research is to build an intelligent dialog agent. Most research in natural language understanding has focused on learning from fixed training sets of labeled data, with supervision either at the word level (tagging, parsing tasks) or sentence level (question answering, machine translation). This kind of supervision is not realistic of how humans learn, where language is both learned by, and used for, communication. In this work, we study dialog-based language learning, where supervision is given naturally and implicitly in the response of the dialog partner during the conversation. We study this setup in two domains: the bAbI dataset of (Weston et al., 2015) and large-scale question answering from (Dodge et al., 2015). We evaluate a set of baseline learning strategies on these tasks, and show that a novel model incorporating predictive lookahead is a promising approach for learning from a teacher's response. In particular, a surprising result is that it can learn to answer questions correctly without any reward-based supervision at all.
Submitted 24 October, 2016; v1 submitted 20 April, 2016;
originally announced April 2016.
-
Towards Realistic Time-Resolved Simulations of Quantum Devices
Authors:
Joseph Weston,
Xavier Waintal
Abstract:
We report on our recent efforts to perform realistic simulations of large quantum devices in the time domain. In contrast to d.c. transport where the calculations are explicitly performed at the Fermi level, the presence of time-dependent terms in the Hamiltonian makes the system inelastic so that it is necessary to explicitly enforce the Pauli principle in the simulations. We illustrate our approach with calculations for a flying qubit interferometer, a nanoelectronic device that is currently under experimental investigation. Our calculations illustrate the fact that many degrees of freedom (16,700 tight-binding sites in the scattering region) and long simulation times (80,000 times the inverse Bandwidth of the tight-binding model) can be easily achieved on a local computer.
Submitted 5 April, 2016;
originally announced April 2016.
-
Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems
Authors:
Jesse Dodge,
Andreea Gane,
Xiang Zhang,
Antoine Bordes,
Sumit Chopra,
Alexander Miller,
Arthur Szlam,
Jason Weston
Abstract:
A long-term goal of machine learning is to build intelligent conversational agents. One recent popular approach is to train end-to-end models on a large amount of real dialog transcripts between humans (Sordoni et al., 2015; Vinyals & Le, 2015; Shang et al., 2015). However, this approach leaves many questions unanswered, as the precise successes and shortcomings of each model are hard to assess. A contrasting recent proposal is the bAbI tasks (Weston et al., 2015b), which are synthetic data that measure the ability of learning machines at various reasoning tasks over toy language. Unfortunately, those tests are very small and hence may encourage methods that do not scale. In this work, we propose a suite of new tasks of a much larger scale that attempt to bridge the gap between the two regimes. Choosing the domain of movies, we provide tasks that test the ability of models to answer factual questions (utilizing OMDB), provide personalization (utilizing MovieLens), carry short conversations about the two, and finally to perform on natural dialogs from Reddit. We provide a dataset covering 75k movie entities and with 3.5M training examples. We present results of various models on these tasks, and evaluate their performance.
Submitted 19 April, 2016; v1 submitted 21 November, 2015;
originally announced November 2015.
-
The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations
Authors:
Felix Hill,
Antoine Bordes,
Sumit Chopra,
Jason Weston
Abstract:
We introduce a new test of how well language models capture meaning in children's books. Unlike standard language modelling benchmarks, it distinguishes the task of predicting syntactic function words from that of predicting lower-frequency words, which carry greater semantic content. We compare a range of state-of-the-art models, each with a different way of encoding what has been previously read. We show that models which store explicit representations of long-term contexts outperform state-of-the-art neural language models at predicting semantic content words, although this advantage is not observed for syntactic function words. Interestingly, we find that the amount of text encoded in a single memory representation is highly influential to the performance: there is a sweet-spot, not too big and not too small, between single words and full sentences that allows the most meaningful information in a text to be effectively retained and recalled. Further, the attention over such window-based memories can be trained effectively through self-supervision. We then assess the generality of this principle by applying it to the CNN QA benchmark, which involves identifying named entities in paraphrased summaries of news articles, and achieve state-of-the-art performance.
Submitted 1 April, 2016; v1 submitted 6 November, 2015;
originally announced November 2015.
-
Shock-powered radio emission from V5589 Sagittarii (Nova Sgr 2012 1)
Authors:
Jennifer H. S. Weston,
J. L. Sokoloski,
Laura Chomiuk,
Justin D. Linford,
Thomas Nelson,
Koji Mukai,
Tom Finzell,
Amy Mioduszewski,
Michael P. Rupen,
Frederick M. Walter
Abstract:
Since the Fermi discovery of $γ$-rays from novae, one of the biggest questions in the field has been how novae generate such high-energy emission. Shocks must be a fundamental ingredient. Six months of radio observations of the 2012 nova V5589 Sgr with the VLA and 15 weeks of X-ray observations with Swift/XRT show that the radio emission consisted of: 1) a shock-powered, non-thermal flare; and 2) weak thermal emission from $10^{-5}$ M$_\odot$ of freely expanding, photoionized ejecta. Absorption features in the optical spectrum and the peak optical brightness suggest that V5589 Sgr lies 4 kpc away (3.2-4.6 kpc). The shock-powered flare dominated the radio light curve at low frequencies before day 100. The spectral evolution of the radio flare, its high radio brightness temperature, the presence of unusually hard ($kT_x > 33$ keV) X-rays, and the ratio of radio to X-ray flux near radio maximum all support the conclusions that the flare was shock-powered and non-thermal. Unlike most other novae with strong shock-powered radio emission, V5589 Sgr is not embedded in the wind of a red-giant companion. Based on the similar inclinations and optical line profiles of V5589 Sgr and V959 Mon, we propose that shocks in V5589 Sgr formed from collisions between a slow flow with an equatorial density enhancement and a subsequent faster flow. We speculate that the relatively high speed and low mass of the ejecta led to the unusual radio emission from V5589 Sgr, and perhaps also to the non-detection of $γ$-rays.
Submitted 16 May, 2016; v1 submitted 22 October, 2015;
originally announced October 2015.
-
A linear-scaling source-sink algorithm for simulating time-resolved quantum transport and superconductivity
Authors:
Joseph Weston,
Xavier Waintal
Abstract:
We report on a "source-sink" algorithm which allows one to calculate time-resolved physical quantities from a general nanoelectronic quantum system (described by an arbitrary time-dependent quadratic Hamiltonian) connected to infinite electrodes. Although mathematically equivalent to the non equilibrium Green's function formalism, the approach is based on the scattering wave functions of the system. It amounts to solving a set of generalized Schrödinger equations which include an additional "source" term (coming from the time dependent perturbation) and an absorbing "sink" term (the electrodes). The algorithm execution time scales linearly with both system size and simulation time allowing one to simulate large systems (currently around $10^6$ degrees of freedom) and/or large times (currently around $10^5$ times the smallest time scale of the system). As an application we calculate the current-voltage characteristics of a Josephson junction for both short and long junctions, and recover the multiple Andreev reflexion (MAR) physics. We also discuss two intrinsically time-dependent situations: the relaxation time of a Josephson junction after a quench of the voltage bias, and the propagation of voltage pulses through a Josephson junction. In the case of a ballistic, long Josephson junction, we predict that a fast voltage pulse creates an oscillatory current whose frequency is controlled by the Thouless energy of the normal part. A similar effect is found for short junctions, a voltage pulse produces an oscillating current which, in the absence of electromagnetic environment, does not relax.
Submitted 20 October, 2015;
originally announced October 2015.
-
High Frequency Radio Network Simulation Using OMNeT++
Authors:
Jeffery Weston,
Eric Koski
Abstract:
Harris Corporation has an interest in making HF radios a suitable medium for wireless information networks using standard Internet protocols. Although HF radio links have many unique characteristics, HF wireless subnets can be subject to many of the same traffic flow characteristics and topologies as existing line-of-sight (LOS) radio networks, giving rise to similar issues (media access, connectivity, routing) which lend themselves to investigation through simulation. Accordingly, we have undertaken to develop efficient, high-fidelity simulations of various aspects of HF radio communications and networking using the OMNeT++ framework. Essential aspects of these simulations include HF channel models simulating relevant channel attributes such as Signal to Noise Ratio, multipath, and Doppler spread; a calibrated physical layer model reproducing the error statistics (including burst error distributions) of the MIL-STD-188-110B/C HF modem waveforms, both narrowband (3 kHz) and wideband (up to 24 kHz) on the simulated HF channels; a model of the NATO STANAG 5066 data link protocol; and integration of these models with the OMNeT++ network simulation framework and its INET library of Internet protocol models. This simulation is used to evaluate the impacts of different STANAG 5066 configuration settings on TCP network performance, and to evaluate strategies for optimizing throughput over HF links using TCP Performance Enhancing Proxy (PEP) techniques.
Submitted 10 September, 2015;
originally announced September 2015.
-
A Neural Attention Model for Abstractive Sentence Summarization
Authors:
Alexander M. Rush,
Sumit Chopra,
Jason Weston
Abstract:
Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
Submitted 3 September, 2015; v1 submitted 2 September, 2015;
originally announced September 2015.
-
Probing (topological) Floquet states through DC transport
Authors:
Michel Fruchart,
Pierre Delplace,
Joseph Weston,
Xavier Waintal,
David Carpentier
Abstract:
We consider the differential conductance of a periodically driven system connected to infinite electrodes. We focus on the situation where the dissipation occurs predominantly in these electrodes. Using analytical arguments and a detailed numerical study we relate the differential conductances of such a system in two and three terminal geometries to the spectrum of quasi-energies of the Floquet operator. Moreover these differential conductances are found to provide an accurate probe of the existence of gaps in this quasi-energy spectrum, being quantized when topological edge states occur within these gaps. Our analysis opens the perspective to describe the intermediate time dynamics of driven mesoscopic conductors as topological Floquet filters.
Submitted 6 October, 2015; v1 submitted 1 July, 2015;
originally announced July 2015.
-
Large-scale Simple Question Answering with Memory Networks
Authors:
Antoine Bordes,
Nicolas Usunier,
Sumit Chopra,
Jason Weston
Abstract:
Training large-scale question answering systems is complicated because training sources usually cover a small portion of the range of possible questions. This paper studies the impact of multitask and transfer learning for simple question answering; a setting for which the reasoning required to answer is quite easy, as long as one can retrieve the correct evidence given a question, which can be difficult in large-scale conditions. To this end, we introduce a new dataset of 100k questions that we use in conjunction with existing benchmarks. We conduct our study within the framework of Memory Networks (Weston et al., 2015) because this perspective allows us to eventually scale up to more complex reasoning, and show that Memory Networks can be successfully trained to achieve excellent performance.
Submitted 5 June, 2015;
originally announced June 2015.
-
Non-Thermal Radio Emission from Colliding Flows in Classical Nova V1723 Aql
Authors:
Jennifer H. S. Weston,
J. L. Sokoloski,
Brian D. Metzger,
Yong Zheng,
Laura Chomiuk,
Miriam I. Krauss,
Justin Linford,
Thomas Nelson,
Amy Mioduszewski,
Michael P. Rupen,
Tom Finzell,
Koji Mukai
Abstract:
The importance of shocks in nova explosions has been highlighted by Fermi's discovery of γ-ray producing novae. Over three years of multi-band VLA radio observations of the 2010 nova V1723 Aql show that shocks between fast and slow flows within the ejecta led to the acceleration of particles and the production of synchrotron radiation. Soon after the start of the eruption, shocks in the ejecta produced an unexpected radio flare, resulting in a multi-peaked radio light curve. The emission eventually became consistent with an expanding thermal remnant with mass $2 \times 10^{-4} M_\odot$ and temperature $10^4$ K. However, during the first two months, the $\gtrsim 10^6$ K brightness temperature at low frequencies was too high to be due to thermal emission from the small amount of X-ray producing shock-heated gas. Radio imaging showed structures with velocities of 400 km s$^{-1}$ (d/6 kpc) in the plane of the sky, perpendicular to a more elongated 1500 km s$^{-1}$ (d/6 kpc) flow. The morpho-kinematic structure of the ejecta from V1723 Aql appears similar to nova V959 Mon, where collisions between a slow torus and a faster flow collimated the fast flow and gave rise to γ-ray producing shocks. Optical spectroscopy and X-ray observations of V1723 Aql during the radio flare are consistent with this picture. Our observations support the idea that shocks in novae occur when a fast flow collides with a slow collimating torus. Such shocks could be responsible for hard X-ray emission, γ-ray production, and double-peaked radio light curves from some classical novae.
Submitted 30 December, 2015; v1 submitted 21 May, 2015;
originally announced May 2015.
-
End-To-End Memory Networks
Authors:
Sainbayar Sukhbaatar,
Arthur Szlam,
Jason Weston,
Rob Fergus
Abstract:
We introduce a neural network with a recurrent attention model over a possibly large external memory. The architecture is a form of Memory Network (Weston et al., 2015) but unlike the model in that work, it is trained end-to-end, and hence requires significantly less supervision during training, making it more generally applicable in realistic settings. It can also be seen as an extension of RNNsearch to the case where multiple computational steps (hops) are performed per output symbol. The flexibility of the model allows us to apply it to tasks as diverse as (synthetic) question answering and to language modeling. For the former our approach is competitive with Memory Networks, but with less supervision. For the latter, on the Penn TreeBank and Text8 datasets our approach demonstrates comparable performance to RNNs and LSTMs. In both cases we show that the key concept of multiple computational hops yields improved results.
Submitted 24 November, 2015; v1 submitted 30 March, 2015;
originally announced March 2015.
-
The Distance to Nova V959 Mon from VLA Imaging
Authors:
J. D. Linford,
V. A. R. M. Ribeiro,
L. Chomiuk,
T. Nelson,
J. L. Sokoloski,
M. P. Rupen,
K. Mukai,
T. J. O'Brien,
A. J. Mioduszewski,
J. Weston
Abstract:
Determining reliable distances to classical novae is a challenging but crucial step in deriving their ejected masses and explosion energetics. Here we combine radio expansion measurements from the Karl G. Jansky Very Large Array with velocities derived from optical spectra to estimate an expansion parallax for nova V959 Mon, the first nova discovered through its gamma-ray emission. We spatially resolve the nova at frequencies of 4.5-36.5 GHz in nine different imaging epochs. The first five epochs cover the expansion of the ejecta from 2012 October to 2013 January, while the final four epochs span 2014 February to 2014 May. These observations correspond to days 126 through 199 and days 615 through 703 after the first detection of the nova. The images clearly show a non-spherical ejecta geometry. Utilizing ejecta velocities derived from 3D modelling of optical spectroscopy, the radio expansion implies a distance between 0.9 +/- 0.2 and 2.2 +/- 0.4 kpc, with a most probable distance of 1.4 +/- 0.4 kpc. This distance implies a gamma-ray luminosity much less than the prototype gamma-ray-detected nova, V407 Cyg, possibly due to the lack of a red giant companion in the V959 Mon system. V959 Mon also has a much lower gamma-ray luminosity than other classical novae detected in gamma-rays to date, indicating a range of at least a factor of 10 in the gamma-ray luminosities for these explosions.
Submitted 12 March, 2015;
originally announced March 2015.
-
Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
Authors:
Jason Weston,
Antoine Bordes,
Sumit Chopra,
Alexander M. Rush,
Bart van Merriënboer,
Armand Joulin,
Tomas Mikolov
Abstract:
One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent. To measure progress towards that goal, we argue for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. Our tasks measure understanding in several ways: whether a system is able to answer questions via chaining facts, simple induction, deduction and many more. The tasks are designed to be prerequisites for any system that aims to be capable of conversing with a human. We believe many existing learning systems can currently not solve them, and hence our aim is to classify these tasks into skill sets, so that researchers can identify (and then rectify) the failings of their systems. We also extend and improve the recently introduced Memory Networks model, and show it is able to solve some, but not all, of the tasks.
Submitted 31 December, 2015; v1 submitted 19 February, 2015;
originally announced February 2015.
-
Manipulating Andreev and Majorana Bound States with microwaves
Authors:
Joseph Weston,
Benoit Gaury,
Xavier Waintal
Abstract:
We study the interplay between Andreev (Majorana) bound states that form at the boundary of a (topological) superconductor and a train of microwave pulses. We find that the extra dynamical phase coming from the pulses can shift the phase of the Andreev reflection, resulting in the appearance of dynamical Andreev states. As an application we study the presence of the zero bias peak in the differential conductance of a normal-topological superconductor junction - the simplest, yet somehow ambiguous, experimental signature for Majorana states. Adding microwave radiation to the measuring electrodes provides an unambiguous probe of the Andreev nature of the zero bias peak.
Submitted 30 July, 2015; v1 submitted 25 November, 2014;
originally announced November 2014.
-
Memory Networks
Authors:
Jason Weston,
Sumit Chopra,
Antoine Bordes
Abstract:
We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively acts as a (dynamic) knowledge base, and the output is a textual response. We evaluate them on a large-scale QA task, and a smaller, but more complex, toy task generated from a simulated world. In the latter, we show the reasoning power of such models by chaining multiple supporting sentences to answer questions that require understanding the intension of verbs.
Submitted 29 November, 2015; v1 submitted 14 October, 2014;
originally announced October 2014.
-
Binary orbits as the driver of gamma-ray emission and mass ejection in classical novae
Authors:
Laura Chomiuk,
Justin D. Linford,
Jun Yang,
T. J. O'Brien,
Zsolt Paragi,
Amy J. Mioduszewski,
R. J. Beswick,
C. C. Cheung,
Koji Mukai,
Thomas Nelson,
Valerio A. R. M. Ribeiro,
Michael P. Rupen,
J. L. Sokoloski,
Jennifer Weston,
Yong Zheng,
Michael F. Bode,
Stewart Eyres,
Nirupam Roy,
Gregory B. Taylor
Abstract:
Classical novae are the most common astrophysical thermonuclear explosions, occurring on the surfaces of white dwarf stars accreting gas from companions in binary star systems. Novae typically expel ~10^(-4) solar masses of material at velocities exceeding 1,000 kilometres per second. However, the mechanism of mass ejection in novae is poorly understood, and could be dominated by the impulsive flash of thermonuclear energy, prolonged optically thick winds, or binary interaction with the nova envelope. Classical novae are now routinely detected in gigaelectronvolt gamma-ray wavelengths, suggesting that relativistic particles are accelerated by strong shocks in the ejecta. Here we report high-resolution radio imaging of the gamma-ray-emitting nova V959 Mon. We find that its ejecta were shaped by the motion of the binary system: some gas was expelled rapidly along the poles as a wind from the white dwarf, while denser material drifted out along the equatorial plane, propelled by orbital motion. At the interface between the equatorial and polar regions, we observe synchrotron emission indicative of shocks and relativistic particle acceleration, thereby pinpointing the location of gamma-ray production. Binary shaping of the nova ejecta and associated internal shocks are expected to be widespread among novae, explaining why many novae are gamma-ray emitters.
Submitted 13 October, 2014;
originally announced October 2014.
-
AC Josephson effect without superconductivity
Authors:
Benoit Gaury,
Joseph Weston,
Xavier Waintal
Abstract:
Superconductivity derives its most salient features from the coherence of its macroscopic wave function. The associated physical phenomena have now moved from exotic subjects to fundamental building blocks for quantum circuits such as qubits or single photonic modes. Here, we theoretically find that the AC Josephson effect---which transforms a DC voltage $V_b$ into an oscillating signal $\cos(2eV_b t/\hbar)$---has a mesoscopic counterpart in normal conductors. We show that on applying a DC voltage $V_b$ to an electronic interferometer, there exists a universal transient regime where the current oscillates at frequency $eV_b/h$. This effect is not limited by a superconducting gap and could, in principle, be used to produce tunable AC signals in the elusive $0.1-10$ THz "terahertz gap".
Submitted 15 July, 2014;
originally announced July 2014.
-
Classical and quantum spreading of a charge pulse
Authors:
Benoit Gaury,
Joseph Weston,
Christoph Groth,
Xavier Waintal
Abstract:
With the technical progress of radio-frequency setups, high frequency quantum transport experiments have moved from theory to the lab. So far the standard theoretical approach used to treat such problems numerically--known as Keldysh or NEGF (Non Equilibrium Green's Functions) formalism--has not been very successful mainly because of a prohibitive computational cost. We propose a reformulation of the non-equilibrium Green's function technique in terms of the electronic wave functions of the system in an energy-time representation. The numerical algorithm we obtain scales now linearly with the simulated time and the volume of the system, and makes simulation of systems with 10^5 - 10^6 atoms/sites feasible. We illustrate our method with the propagation and spreading of a charge pulse in the quantum Hall regime. We identify a classical and a quantum regime for the spreading, depending on the number of particles contained in the pulse. This numerical experiment is the condensed matter analogue to the spreading of a Gaussian wavepacket discussed in quantum mechanics textbooks.
Submitted 15 July, 2014; v1 submitted 27 June, 2014;
originally announced June 2014.
-
Question Answering with Subgraph Embeddings
Authors:
Antoine Bordes,
Sumit Chopra,
Jason Weston
Abstract:
This paper presents a system which learns to answer questions on a broad range of topics from a knowledge base using few hand-crafted features. Our model learns low-dimensional embeddings of words and knowledge base constituents; these representations are used to score natural language questions against candidate answers. Training our system using pairs of questions and structured representations of their answers, and pairs of question paraphrases, yields competitive results on a competitive benchmark of the literature.
Submitted 3 September, 2014; v1 submitted 13 June, 2014;
originally announced June 2014.
-
Stopping electrons with radio-frequency pulses in the quantum Hall regime
Authors:
Benoit Gaury,
Joseph Weston,
Xavier Waintal
Abstract:
Most functionalities of modern electronic circuits rely on the possibility to modify the path followed by the electrons using, e.g. field effect transistors. Here we discuss the interplay between the modification of this path and the quantum dynamics of the electronic flow. Specifically, we study the propagation of charge pulses through the edge states of a two-dimensional electron gas in the quantum Hall regime. By sending radio-frequency (RF) excitations on a top gate capacitively coupled to the electron gas, we manipulate these edge states dynamically. We find that a fast RF change of the gate voltage can stop the propagation of the charge pulse inside the sample. This effect is intimately linked to the vanishing velocity of bulk states in the quantum Hall regime and the peculiar connection between momentum and transverse confinement of Landau levels. Our findings suggest new possibilities for stopping, releasing and switching the trajectory of charge pulses in quantum Hall systems.
Submitted 14 May, 2014;
originally announced May 2014.
-
Open Question Answering with Weakly Supervised Embedding Models
Authors:
Antoine Bordes,
Jason Weston,
Nicolas Usunier
Abstract:
Building computers able to answer questions on any subject is a long standing goal of artificial intelligence. Promising progress has recently been achieved by methods that learn to map questions to logical forms or database queries. Such approaches can be effective but at the cost of either large amounts of human-labeled data or by defining lexicons and grammars tailored by practitioners. In this paper, we instead take the radical approach of learning to map questions to vectorial feature representations. By mapping answers into the same space one can query any knowledge base independent of its schema, without requiring any grammar or lexicon. Our method is trained with a new optimization procedure combining stochastic gradient descent followed by a fine-tuning step using the weak supervision provided by blending automatically and collaboratively generated resources. We empirically demonstrate that our model can capture meaningful signals from its noisy supervision leading to major improvements over paralex, the only existing method able to be trained on similar weakly labeled data.
Submitted 16 April, 2014;
originally announced April 2014.
-
The 2011 Outburst of Recurrent Nova T Pyx: X-ray Observations Expose the White Dwarf Mass and Ejection Dynamics
Authors:
Laura Chomiuk,
Thomas Nelson,
Koji Mukai,
J. L. Sokoloski,
Michael P. Rupen,
Kim L. Page,
Julian P. Osborne,
Erik Kuulkers,
Amy J. Mioduszewski,
Nirupam Roy,
Jennifer Weston,
Miriam I. Krauss
Abstract:
The recurrent nova T Pyx underwent its sixth historical outburst in 2011, and became the subject of an intensive multi-wavelength observational campaign. We analyze data from the Swift and Suzaku satellites to produce a detailed X-ray light curve augmented by epochs of spectral information. X-ray observations yield mostly non-detections in the first four months of outburst, but both a super-soft and hard X-ray component rise rapidly after Day 115. The super-soft X-ray component, attributable to the photosphere of the nuclear-burning white dwarf, is relatively cool (~45 eV) and implies that the white dwarf in T Pyx is significantly below the Chandrasekhar mass (~1 M_sun). The late turn-on time of the super-soft component yields a large nova ejecta mass (>~10^-5 M_sun), consistent with estimates at other wavelengths. The hard X-ray component is well fit by a ~1 keV thermal plasma, and is attributed to shocks internal to the 2011 nova ejecta. The presence of a strong oxygen line in this thermal plasma on Day 194 requires a significantly super-solar abundance of oxygen and implies that the ejecta are polluted by white dwarf material. The X-ray light curve can be explained by a dual-phase ejection, with a significant delay between the first and second ejection phases, and the second ejection finally released two months after outburst. A delayed ejection is consistent with optical and radio observations of T Pyx, but the physical mechanism producing such a delay remains a mystery.
Submitted 11 April, 2014;
originally announced April 2014.
-
Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction
Authors:
Jason Weston,
Antoine Bordes,
Oksana Yakhnenko,
Nicolas Usunier
Abstract:
This paper proposes a novel approach for relation extraction from free text which is trained to jointly use information from the text and from existing knowledge. Our model is based on two scoring functions that operate by learning low-dimensional embeddings of words and of entities and relationships from a knowledge base. We empirically show on New York Times articles aligned with Freebase relations that our approach is able to efficiently use the extra information provided by a large subset of Freebase data (4M entities, 23k relationships) to improve over existing methods that rely on text features alone.
Submitted 30 July, 2013;
originally announced July 2013.
-
Numerical simulations of time resolved quantum electronics
Authors:
Benoit Gaury,
Joseph Weston,
Matthieu Santin,
Manuel Houzet,
Christoph Groth,
Xavier Waintal
Abstract:
This paper discusses the technical aspects - mathematical and numerical - associated with the numerical simulations of a mesoscopic system in the time domain (i.e. beyond the single frequency AC limit). After a short review of the state of the art, we develop a theoretical framework for the calculation of time resolved observables in a general multiterminal system subject to an arbitrary time dependent perturbation (oscillating electrostatic gates, voltage pulses, time-varying magnetic fields). The approach is mathematically equivalent to (i) the time dependent scattering formalism, (ii) the time resolved Non Equilibrium Green Function (NEGF) formalism and (iii) the partition-free approach. The central object of our theory is a wave function that obeys a simple Schrödinger equation with an additional source term that accounts for the electrons injected from the electrodes. The time resolved observables (current, density, ...) and the (inelastic) scattering matrix are simply expressed in terms of this wave function. We use our approach to develop a numerical technique for simulating time resolved quantum transport. We find that the use of this wave function is advantageous for numerical simulations resulting in a speed up of many orders of magnitude with respect to the direct integration of NEGF equations. Our technique allows one to simulate realistic situations beyond simple models, a subject that was until now beyond the simulation capabilities of available approaches.
Submitted 18 February, 2014; v1 submitted 24 July, 2013;
originally announced July 2013.
-
Shocks and Ejecta Mass: Radio Observations of Nova V1723 Aql
Authors:
Jennifer H. S. Weston,
Jennifer L. Sokoloski,
Yong Zheng,
Laura Chomiuk,
Amy Mioduszewski,
Koji Mukai,
Michael P. Rupen,
Miriam I. Krauss,
Nirupam Roy,
Thomas Nelson
Abstract:
The radio light curves of novae rise and fall over the course of months to years, allowing for detailed observations of the evolution of the nova shell. However, the main parameter determined by radio models of nova explosions - the mass of the ejecta - often seems to exceed theoretical expectations by an order of magnitude. With the recent technological improvements on the Karl G. Jansky Very Large Array (VLA), new observations can test the assumptions upon which ejecta mass estimates are based. Early observations of the classical nova V1723 Aql showed an unexpectedly rapid rise in radio flux density and a distinct bump in the radio light curve on the rise to radio maximum, which is inconsistent with the simple model of spherical ejecta expelled in a single discrete event. This initial bump appears to indicate the presence of shocked material in the outer region of the ejected shell, with the emission from the shocks fading over time. We explore possible origins for this emission and its relation to the mass loss history of the nova. The evolution of the radio spectrum also reveals the density profile, the mass of the ejected shell, and other properties of the ejecta. These observations comprise one of the most complete long-term sets of multi-wavelength radio observations for any classical nova to date.
Submitted 17 July, 2013; v1 submitted 10 June, 2013;
originally announced June 2013.
-
Irreflexive and Hierarchical Relations as Translations
Authors:
Antoine Bordes,
Nicolas Usunier,
Alberto Garcia-Duran,
Jason Weston,
Oksana Yakhnenko
Abstract:
We consider the problem of embedding entities and relations of knowledge bases in low-dimensional vector spaces. Unlike most existing approaches, which are primarily efficient for modeling equivalence relations, our approach is designed to explicitly model irreflexive relations, such as hierarchies, by interpreting them as translations operating on the low-dimensional embeddings of the entities. Preliminary experiments show that, despite its simplicity and a smaller number of parameters than previous approaches, our approach achieves state-of-the-art performance according to standard evaluation protocols on data from WordNet and Freebase.
Submitted 26 April, 2013;
originally announced April 2013.
-
Radio studies of novae: a current status report and highlights of new results
Authors:
Nirupam Roy,
Laura Chomiuk,
Jennifer L. Sokoloski,
Jennifer Weston,
Michael P. Rupen,
Traci Johnson,
Miriam I. Krauss,
Thomas Nelson,
Koji Mukai,
Amy Mioduszewski,
Michael F. Bode,
Stewart P. S. Eyres,
Tim J. O'Brien
Abstract:
Novae, which are the sudden visual brightening triggered by runaway thermonuclear burning on the surface of an accreting white dwarf, are fairly common and bright events. Despite their astronomical significance as nearby laboratories for the study of nuclear burning and accretion phenomena, many aspects of these common stellar explosions are observationally not well-constrained and remain poorly understood. Radio observations, modeling and interpretation can potentially play a crucial role in addressing some of these puzzling issues. In this review on radio studies of novae, we focus on the possibility of testing and improving the nova models with radio observations, and present a current status report on the progress in both the observational front and theoretical developments. We specifically address the issues of accurate estimation of ejecta mass, multi-phase and complex ejection phenomena, and the effect of a dense environment around novae. With highlights of new observational results, we illustrate how radio observations can shed light on some of these long-standing puzzles.
Submitted 18 February, 2013;
originally announced February 2013.
-
Affinity Weighted Embedding
Authors:
Jason Weston,
Ron Weiss,
Hector Yee
Abstract:
Supervised (linear) embedding models like Wsabie and PSI have proven successful at ranking, recommendation and annotation tasks. However, despite being scalable to large datasets they do not take full advantage of the extra data due to their linear nature, and typically underfit. We propose a new class of models which aim to provide improved performance while retaining many of the benefits of the existing class of embedding models. Our new approach works by iteratively learning a linear embedding model where the next iteration's features and labels are reweighted as a function of the previous iteration. We describe several variants of the family, and give some initial results.
Submitted 17 January, 2013;
originally announced January 2013.
-
A Semantic Matching Energy Function for Learning with Multi-relational Data
Authors:
Xavier Glorot,
Antoine Bordes,
Jason Weston,
Yoshua Bengio
Abstract:
Large-scale relational learning becomes crucial for handling the huge amounts of structured data generated daily in many application domains ranging from computational biology or information retrieval, to natural language processing. In this paper, we present a new neural network architecture designed to embed multi-relational graphs into a flexible continuous vector space in which the original data is kept and enhanced. The network is trained to encode the semantics of these graphs in order to assign high probabilities to plausible components. We empirically show that it reaches competitive performance in link prediction on standard datasets from the literature.
Submitted 21 March, 2013; v1 submitted 15 January, 2013;
originally announced January 2013.
-
The 2011 Outburst of Recurrent Nova T Pyx: Radio Observations Reveal the Ejecta Mass and Hint at Complex Mass Loss
Authors:
Thomas Nelson,
Laura Chomiuk,
Nirupam Roy,
J. L. Sokoloski,
Koji Mukai,
Miriam I. Krauss,
Amy J. Mioduszewski,
Michael P. Rupen,
Jennifer Weston
Abstract:
Despite being the prototype of its class, T Pyx is arguably the most unusual and poorly understood recurrent nova. Here, we use radio observations from the Karl G. Jansky Very Large Array to trace the evolution of the ejecta over the course of the 2011 outburst of T Pyx. The radio emission is broadly consistent with thermal emission from the nova ejecta. However, the radio flux began rising surprisingly late in the outburst, indicating that the bulk of the radio-emitting material was either very cold, or expanding very slowly, for the first ~50 days of the outburst. Considering a plausible range of volume filling factors and geometries for the ejecta, we find that the high peak flux densities of the radio emission require a massive ejection of 1-30 x 10^{-5} solar masses. This ejecta mass is much higher than the values normally associated with recurrent novae, and is more consistent with a nova on a white dwarf well below the Chandrasekhar limit.
Submitted 3 March, 2014; v1 submitted 13 November, 2012;
originally announced November 2012.
-
The Radio Light Curve of the Gamma-Ray Nova in V407 Cyg: Thermal Emission from the Ionized Symbiotic Envelope, Devoured from Within by the Nova Blast
Authors:
Laura Chomiuk,
Miriam I. Krauss,
Michael P. Rupen,
Thomas Nelson,
Nirupam Roy,
Jennifer L. Sokoloski,
Koji Mukai,
Ulisse Munari,
Amy Mioduszewski,
Jennifer Weston,
Tim J. O'Brien,
Stewart P. S. Eyres,
Michael F. Bode
Abstract:
We present multi-frequency radio observations of the 2010 nova event in the symbiotic binary V407 Cygni, obtained with the Karl G. Jansky Very Large Array and spanning 1-45 GHz and 17-770 days following discovery. This nova---the first ever detected in gamma rays---shows a radio light curve dominated by the wind of the Mira giant companion, rather than the nova ejecta themselves. The radio luminosity grew as the wind became increasingly ionized by the nova outburst, and faded as the wind was violently heated from within by the nova shock. This study marks the first time that this physical mechanism has been shown to dominate the radio light curve of an astrophysical transient. We do not observe a thermal signature from the nova ejecta or synchrotron emission from the shock, due to the fact that these components were hidden behind the absorbing screen of the Mira wind.
We estimate a mass loss rate for the Mira wind of Mdot_w ~ 10^-6 M_sun/yr. We also present the only radio detection of V407 Cyg before the 2010 nova, gleaned from unpublished 1993 archival VLA data, which shows that the radio luminosity of the Mira wind varies by a factor of >~20 even in quiescence. Although V407 Cyg likely hosts a massive accreting white dwarf, making it a candidate progenitor system for a Type Ia supernova, the dense and radially continuous circumbinary material surrounding V407 Cyg is inconsistent with observational constraints on the environments of most Type Ia supernovae.
Submitted 22 October, 2012;
originally announced October 2012.
-
Latent Structured Ranking
Authors:
Jason Weston,
John Blitzer
Abstract:
Many latent (factorized) models have been proposed for recommendation tasks like collaborative filtering and for ranking tasks like document or image retrieval and annotation. Common to all those methods is that during inference the items are scored independently by their similarity to the query in the latent embedding space. The structure of the ranked list (i.e. considering the set of items returned as a whole) is not taken into account. This can be a problem because the set of top predictions can be either too diverse (contain results that contradict each other) or not diverse enough. In this paper we introduce a method for learning latent structured rankings that improves over existing methods by providing the right blend of predictions at the top of the ranked list. Particular emphasis is put on making this method scalable. Empirical results on large scale image annotation and music recommendation tasks show improvements over existing approaches.
△ Less
Submitted 16 October, 2012;
originally announced October 2012.