-
Black hole solutions surrounded by anisotropic fluid in $f(\mathbb{T},\CMcal{T})$ gravity
Authors:
Franciele M. da Silva,
Luis C. N. Santos,
V. B. Bezerra
Abstract:
In this work, we investigate some extensions of the Kiselev black hole solutions in the context of $f(\mathbb{T},\CMcal{T})$ gravity. By mapping the components of the Kiselev energy-momentum tensor into the anisotropic energy-momentum tensor and assuming a particular form of $f(\mathbb{T},\CMcal{T})$, we obtain exact solutions for the field equation in this theory that carry a dependence on the coupling constant and on the parameter of the equation of state of the fluid. We show that in this scenario of modified gravity some new structure is added to the geometry of spacetime as compared to the Kiselev black hole. We analyse the energy conditions, mass, horizons and the Hawking temperature considering particular values for the parameter of the equation of state.
Submitted 31 May, 2024;
originally announced May 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Memory Augmented Language Models through Mixture of Word Experts
Authors:
Cicero Nogueira dos Santos,
James Lee-Thorp,
Isaac Noble,
Chung-Ching Chang,
David Uthus
Abstract:
Scaling up the number of parameters of language models has proven to be an effective approach to improve performance. For dense models, increasing model size proportionally increases the model's computation footprint. In this work, we seek to aggressively decouple learning capacity and FLOPs through Mixture-of-Experts (MoE) style models with large knowledge-rich vocabulary-based routing functions and experts. Our proposed approach, dubbed Mixture of Word Experts (MoWE), can be seen as a memory-augmented model, where a large set of word-specific experts plays the role of a sparse memory. We demonstrate that MoWE performs significantly better than the T5 family of models with a similar number of FLOPs in a variety of NLP tasks. Additionally, MoWE outperforms regular MoE models on knowledge-intensive tasks and has similar performance to more complex memory-augmented approaches that often require invoking custom mechanisms to search the sparse memory.
Submitted 15 November, 2023;
originally announced November 2023.
-
A Bayesian study of quark models in view of recent astrophysical constraints
Authors:
Franciele M. da Silva,
Adamu Issifu,
Luiz L. Lopes,
Luis C. N. Santos,
Débora P. Menezes
Abstract:
In this work, we perform a comparative analysis between the density-dependent quark model and the vector MIT bag model using Bayesian analysis. We use the equations of state generated by these two models to describe quark stars. We impose four recent observational astrophysical constraints on both models to determine their model-dependent parameters in an optimized manner assuming that the compact objects observed are composed entirely of self-bound quarks. The restrictions are aimed at producing stars with maximum masses $2 - 2.35$ M$_\odot$ and a mass-radii diagram compatible with the observed pulsars: PSR J0740+6620, PSR J0952-0607, PSR J0030+0451 and the compact object XMMU J173203.3-344518. With this analysis, the parameter dependence of the nuclear equation of state (EoS) of both models is restricted.
Submitted 23 April, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Triggering Multi-Hop Reasoning for Question Answering in Language Models using Soft Prompts and Random Walks
Authors:
Kanishka Misra,
Cicero Nogueira dos Santos,
Siamak Shakeri
Abstract:
Despite readily memorizing world knowledge about entities, pre-trained language models (LMs) struggle to compose together two or more facts to perform multi-hop reasoning in question-answering tasks. In this work, we propose techniques that improve upon this limitation by relying on random walks over structured knowledge graphs. Specifically, we use soft prompts to guide LMs to chain together their encoded knowledge by learning to map multi-hop questions to random walk paths that lead to the answer. Applying our methods on two T5 LMs shows substantial improvements over standard tuning approaches in answering questions that require 2-hop reasoning.
Submitted 6 June, 2023;
originally announced June 2023.
-
Quantum Configuration and Phase Spaces: Finsler and Hamilton Geometries
Authors:
Saulo Albuquerque,
Valdir B. Bezerra,
Iarley P. Lobo,
Gabriel Macedo,
Pedro H. Morais,
Ernesto Rodrigues,
Luis C. N. Santos,
Gislaine Varão
Abstract:
In this paper, we review two approaches that can describe, in a geometrical way, the kinematics of particles that are affected by Planck-scale departures, namely Finsler and Hamilton geometries. By relying on maps that connect the spaces of velocities and momenta, we discuss the properties of configuration and phase spaces induced by these two distinct geometries. In particular, we exemplify this approach by considering the so-called $q$-de Sitter-inspired modified dispersion relation as a laboratory for this study. We conclude with some points that we consider to be the positive and negative aspects of each approach for the description of quantum configuration and phase spaces.
Submitted 23 January, 2023;
originally announced January 2023.
-
Neutron stars in the context of $f$($\mathbb{T}$,$\mathcal{T}$) gravity
Authors:
Clésio E. Mota,
Luis C. N. Santos,
Franciele M. da Silva,
Cesar V. Flores,
Iarley P. Lobo,
Valdir B. Bezerra
Abstract:
In this work, we investigate the existence of neutron stars (NS) in the framework of $f$($\mathbb{T}$,$\mathcal{T}$) gravity, where $\mathbb{T}$ is the torsion tensor and $\mathcal{T}$ is the trace of the energy-momentum tensor. The hydrostatic equilibrium equations are obtained with the quantities $p$ and $ρ$ replaced by effective quantities $\bar{p}$ and $\bar{ρ}$. The corresponding mass-radius diagrams are obtained using modern equations of state (EoS) of nuclear matter derived from relativistic mean field models and compared with the ones computed by the Tolman-Oppenheimer-Volkoff (TOV) equations. Substantial changes in the mass-radius profiles of NS are obtained even for small changes in the free parameter of this modified theory. The results indicate that the use of $f$($\mathbb{T}$,$\mathcal{T}$) gravity in the study of NS provides good results for the masses and radii of some important astrophysical objects, for example, the low-mass X-ray binary (LMXB) NGC 6397 and the millisecond pulsar PSR J0740+6620. In addition, radii results inferred from the Lead Radius EXperiment (PREX-2) can also be described for certain parameter values.
Submitted 10 January, 2023; v1 submitted 8 January, 2023;
originally announced January 2023.
-
Kiselev black holes in $f(R,T)$ gravity
Authors:
L. C. N. Santos,
F. M. da Silva,
C. E. Mota,
I. P. Lobo,
V. B. Bezerra
Abstract:
We obtain new exact solutions for the gravitational field equations in the context of $f(R,T)$ gravity, thereby obtaining different classes of black holes surrounded by fluids, taking into account some specific values of the parameter of the equation of state, $w$. In order to obtain these solutions in the context of $f(R,T)$ gravity, we consider viable particular choices of $f(R,T)$. Considering an anisotropic energy-momentum tensor, we write the field equations with the required symmetries for this type of solution. Then, we analyze the energy conditions in a general way and also for particular values of the parameter $w$ of the equation of state. In addition, thermodynamic quantities, such as the Hawking temperature and the mass associated with the horizons of the solutions, are taken into account in our analysis.
Submitted 5 January, 2023;
originally announced January 2023.
-
Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts
Authors:
Cicero Nogueira dos Santos,
Zhe Dong,
Daniel Cer,
John Nham,
Siamak Shakeri,
Jianmo Ni,
Yun-hsuan Sung
Abstract:
Soft prompts have been recently proposed as a tool for adapting large frozen language models (LMs) to new tasks. In this work, we repurpose soft prompts to the task of injecting world knowledge into LMs. We introduce a method to train soft prompts via self-supervised learning on data from knowledge bases. The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs. We perform qualitative and quantitative experiments and demonstrate that: (1) KPs can effectively model the structure of the training data; (2) KPs can be used to improve the performance of LMs in different knowledge intensive tasks.
Submitted 10 October, 2022;
originally announced October 2022.
-
Non-inertial effects on a non-relativistic quantum harmonic oscillator in the presence of a screw dislocation
Authors:
L. C. N. Santos,
F. M. da Silva,
C. E. Mota,
V. B. Bezerra
Abstract:
We investigate the non-inertial effects induced by a rotating frame on a non-relativistic quantum harmonic oscillator, as well as the effects of the topology associated with a screw dislocation, which corresponds to a distortion of a vertical line into a vertical spiral. To do this, we obtain the analytical solutions of the time-independent Schrödinger equation for this harmonic oscillator potential in this background. The expressions for the energy spectrum are obtained and the solutions for four quantum states, namely $n=0,1,2$ and $3$, are analysed. Our results show that the presence of the topological defect (screw dislocation), as well as the fact that we are analysing the system from the point of view of a rotating frame, changes the solutions of the Schrödinger equation and the corresponding spectrum. These quantities now depend on the angular velocity of the rotating frame, $Ω$, and also on the parameter $β$, which encodes the presence of the screw dislocation. In particular, with respect to the energy spectrum of the system, the change is such that when $Ω$ increases, the energy can increase or decrease depending on the values we assign to the eigenvalues of the angular and linear momenta. Additionally, we observe that the value of the parameter $β$ that characterizes the screw dislocation causes a shift in the energy spectrum.
Submitted 4 October, 2022;
originally announced October 2022.
-
Rapidly rotating neutron stars in $f(R,T)$ gravity
Authors:
F. M. da Silva,
L. C. N. Santos,
C. E. Mota,
T. O. F. da Costa,
J. C. Fabris
Abstract:
In this work, we study the influence of $f(R,T)$ gravity on rapidly rotating neutron stars. First we discuss the main aspects of this modified theory of gravity where the gravitational Lagrangian is an arbitrary function of the Ricci scalar $R$ and of the trace of the energy-momentum tensor $T$. Then we present the basic equations for neutron stars including the equations of state used in the present work to describe the hadronic matter. Some physical quantities of interest are calculated such as mass-radius relations, moments of inertia, angular momentum, and compactness. By considering four different rotation regimes, we obtain results that indicate substantial modifications in the physical properties of neutron stars in $f(R,T)$ gravity when compared to those in the context of general relativity. In particular, the mass-radius relation for sequences of stars indicates that $f(R,T)$ gravity increases the mass and the equatorial radius of the neutron stars for stars rotating with an angular velocity smaller than the Kepler limit.
Submitted 16 June, 2022;
originally announced June 2022.
-
Counterfactual Data Augmentation improves Factuality of Abstractive Summarization
Authors:
Dheeraj Rajagopal,
Siamak Shakeri,
Cicero Nogueira dos Santos,
Eduard Hovy,
Chung-Ching Chang
Abstract:
Abstractive summarization systems based on pretrained language models often generate coherent but factually inconsistent sentences. In this paper, we present a counterfactual data augmentation approach where we augment data with perturbed summaries that increase the training data diversity. Specifically, we present three augmentation approaches based on replacing (i) entities from other and the same category and (ii) nouns with their corresponding WordNet hypernyms. We show that augmenting the training data with our approach improves the factual correctness of summaries without significantly affecting the ROUGE score. We show that in two commonly used summarization datasets (CNN/Dailymail and XSum), we improve the factual correctness by about 2.5 points on average.
Submitted 24 May, 2022;
originally announced May 2022.
-
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
Authors:
Kai Hui,
Honglei Zhuang,
Tao Chen,
Zhen Qin,
Jing Lu,
Dara Bahri,
Ji Ma,
Jai Prakash Gupta,
Cicero Nogueira dos Santos,
Yi Tay,
Don Metzler
Abstract:
State-of-the-art neural models typically encode document-query pairs using cross-attention for re-ranking. To this end, models generally utilize an encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach. These paradigms, however, are not without flaws, i.e., running the model on all query-document pairs at inference-time incurs a significant computational cost. This paper proposes a new training and inference paradigm for re-ranking. We propose to finetune a pretrained encoder-decoder model in the form of document-to-query generation. Subsequently, we show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference. This results in significant inference time speedups since the decoder-only architecture only needs to learn to interpret static encoder embeddings during inference. Our experiments show that this new paradigm achieves results that are comparable to the more expensive cross-attention ranking approaches while being up to 6.8X faster. We believe this work paves the way for more efficient neural rankers that leverage large pretrained models.
Submitted 25 April, 2022;
originally announced April 2022.
-
Spectroscopic assessment of short-term nitric acid doping of epitaxial graphene
Authors:
Ngoc Thanh Mai Tran,
Swapnil M. Mhatre,
Cristiane N. Santos,
Adam J. Biacchi,
Mathew L. Kelley,
Heather M. Hill,
Dipanjan Saha,
Chi-Te Liang,
Randolph E. Elmquist,
David B. Newell,
Benoit Hackens,
Christina A. Hacker,
Albert F. Rigosi
Abstract:
This work reports information on the transience of hole doping in epitaxial graphene devices when nitric acid is used as an adsorbent. Under vacuum conditions, desorption processes are monitored by electrical and spectroscopic means to extract the relevant timescales from the corresponding data. It is of vital importance to understand the reversible nature of hole doping because such device processing can be a suitable alternative to large-scale, metallic gating. Most measurements are performed post-exposure at room temperature, and, for some electrical transport measurements, at 1.5 K. Vacuum conditions are applied to many measurements to replicate the laboratory conditions under which devices using this doping method would be measured. The relevant timescales from transport measurements are compared with results from X-ray photoelectron spectroscopy and Fourier transform infrared spectroscopy measurements, with the latter performed at ambient conditions and accompanied by calculations of the spectra in the Reststrahlen band.
Submitted 15 April, 2022;
originally announced April 2022.
-
The extended phase space thermodynamics of Planck-scale-corrected Reissner-Nordström-anti-de Sitter black hole
Authors:
Iarley P. Lobo,
Luis C. N. Santos,
V. B. Bezerra,
J. P. Morais Graça,
H. Moradpour
Abstract:
We analyze the effect of a Planck-scale modified radiation equation of state on the Reissner-Nordström-anti-de Sitter black hole inspired by Kiselev's ansatz. Deformed thermodynamic quantities are found, and phase transitions and black holes as heat engines are described for the Carnot and square cycles. Non-trivial differences between linear and quadratic Planck-scale corrections are discussed in detail.
Submitted 11 October, 2021;
originally announced October 2021.
-
Effects of modified dispersion relations on free Fermi gas: equations of state and applications in astrophysics
Authors:
Luis C. N. Santos,
Clésio E. Mota,
Franciele M. da Silva,
Guilherme Grams,
I. P. Lobo
Abstract:
Deformed dispersion relations are considered in the study of equations of state of a Fermi gas with applications to compact objects. Different choices of deformed energy relations are used in the formulation of our model. As a first test, we consider a relativistic star with a simple internal structure. The mass-radius diagrams obtained suggest a positive influence of the deformed Fermi gas, depending on the functions employed. In addition, we comment on how realistic equations of state, in which interactions between nucleons are taken into account, can be addressed.
Submitted 25 September, 2021;
originally announced October 2021.
-
Contrastive Fine-tuning Improves Robustness for Neural Rankers
Authors:
Xiaofei Ma,
Cicero Nogueira dos Santos,
Andrew O. Arnold
Abstract:
The performance of state-of-the-art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain. In this paper, we present a novel method for fine-tuning neural rankers that can significantly improve their robustness to out-of-domain data and query perturbations. Specifically, a contrastive loss that compares data points in the representation space is combined with the standard ranking loss during fine-tuning. We use relevance labels to denote similar/dissimilar pairs, which allows the model to learn the underlying matching semantics across different query-document pairs and leads to improved robustness. In experiments with four passage ranking datasets, the proposed contrastive fine-tuning method obtains improvements on robustness to query reformulations, noise perturbations, and zero-shot transfer for both BERT and BART based rankers. Additionally, our experiments show that contrastive fine-tuning outperforms data augmentation for robustifying neural rankers.
Submitted 27 May, 2021;
originally announced May 2021.
-
Joint Text and Label Generation for Spoken Language Understanding
Authors:
Yang Li,
Ben Athiwaratkun,
Cicero Nogueira dos Santos,
Bing Xiang
Abstract:
Generalization is a central problem in machine learning, especially when data is limited. Using prior information to enforce constraints is the principled way of encouraging generalization. In this work, we propose to leverage the prior information embedded in pretrained language models (LM) to improve generalization for intent classification and slot labeling tasks with limited training data. Specifically, we extract prior knowledge from pretrained LM in the form of synthetic data, which encode the prior implicitly. We fine-tune the LM to generate an augmented language, which contains not only text but also encodes both intent labels and slot labels. The generated synthetic data can be used to train a classifier later. Since the generated data may contain noise, we rephrase the learning from generated data as learning with noisy labels. We then utilize the mixout regularization for the classifier and prove its effectiveness to resist label noise in generated data. Empirically, our method demonstrates superior performance and outperforms the baseline by a large margin.
Submitted 11 May, 2021;
originally announced May 2021.
-
Improving Factual Consistency of Abstractive Summarization via Question Answering
Authors:
Feng Nan,
Cicero Nogueira dos Santos,
Henghui Zhu,
Patrick Ng,
Kathleen McKeown,
Ramesh Nallapati,
Dejiao Zhang,
Zhiguo Wang,
Andrew O. Arnold,
Bing Xiang
Abstract:
A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving factual consistency and even overall quality of the summaries, as judged by both automatic metrics and human evaluation.
Submitted 10 May, 2021;
originally announced May 2021.
-
Generative Context Pair Selection for Multi-hop Question Answering
Authors:
Dheeru Dua,
Cicero Nogueira dos Santos,
Patrick Ng,
Ben Athiwaratkun,
Bing Xiang,
Matt Gardner,
Sameer Singh
Abstract:
Compositional reasoning tasks like multi-hop question answering require making latent decisions to get the final answer, given a question. However, crowdsourced datasets often capture only a slice of the underlying task distribution, which can induce unanticipated biases in models performing compositional reasoning. Furthermore, discriminatively trained models exploit such biases to get a better held-out performance without learning the right way to reason, as they do not need to pay attention to the question representation (conditioning variable) in its entirety to estimate the answer likelihood. In this work, we propose a generative context selection model for multi-hop question answering that reasons about how the given question could have been generated given a context pair. While being comparable to the state-of-the-art answering performance, our proposed generative passage selection model has a better performance (4.9% higher than baseline) on an adversarial held-out set which tests the robustness of the model's multi-hop reasoning capabilities.
△ Less
Submitted 18 April, 2021;
originally announced April 2021.
-
Entity-level Factual Consistency of Abstractive Text Summarization
Authors:
Feng Nan,
Ramesh Nallapati,
Zhiguo Wang,
Cicero Nogueira dos Santos,
Henghui Zhu,
Dejiao Zhang,
Kathleen McKeown,
Bing Xiang
Abstract:
A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose adding a summary-worthy entity classification task to the training process as well as a joint entity and summary generation approach, which yield further improvements in entity-level metrics.
Submitted 17 February, 2021;
originally announced February 2021.
-
Structured Prediction as Translation between Augmented Natural Languages
Authors:
Giovanni Paolini,
Ben Athiwaratkun,
Jason Krone,
Jie Ma,
Alessandro Achille,
Rishita Anubhai,
Cicero Nogueira dos Santos,
Bing Xiang,
Stefano Soatto
Abstract:
We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative classifiers, we frame it as a translation task between augmented natural languages, from which the task-relevant information can be easily extracted. Our approach can match or outperform task-specific models on all tasks, and in particular, achieves new state-of-the-art results on joint entity and relation extraction (CoNLL04, ADE, NYT, and ACE2005 datasets), relation classification (FewRel and TACRED), and semantic role labeling (CoNLL-2005 and CoNLL-2012). We accomplish this while using the same architecture and hyperparameters for all tasks and even when training a single model to solve all tasks at the same time (multi-task learning). Finally, we show that our framework can also significantly improve the performance in a low-resource regime, thanks to better use of label semantics.
Submitted 2 December, 2021; v1 submitted 14 January, 2021;
originally announced January 2021.
-
Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training
Authors:
Peng Shi,
Patrick Ng,
Zhiguo Wang,
Henghui Zhu,
Alexander Hanbo Li,
Jun Wang,
Cicero Nogueira dos Santos,
Bing Xiang
Abstract:
Most recently, there has been significant interest in learning contextual representations for various NLP tasks, by leveraging large-scale text corpora to train large neural language models with self-supervised learning objectives, such as Masked Language Model (MLM). However, based on a pilot study, we observe three issues of existing general-purpose language models when they are applied to text-to-SQL semantic parsers: they fail to detect column mentions in the utterances, fail to infer column mentions from cell values, and fail to compose complex SQL queries. To mitigate these issues, we present a model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-training data. GAP MODEL is trained on 2M utterance-schema pairs and 30K utterance-schema-SQL triples, whose utterances are produced by generative models. Based on experimental results, neural semantic parsers that leverage GAP MODEL as a representation encoder obtain new state-of-the-art results on both SPIDER and CRITERIA-TO-SQL benchmarks.
Submitted 18 December, 2020;
originally announced December 2020.
-
Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction
Authors:
Yifan Gao,
Henghui Zhu,
Patrick Ng,
Cicero Nogueira dos Santos,
Zhiguo Wang,
Feng Nan,
Dejiao Zhang,
Ramesh Nallapati,
Andrew O. Arnold,
Bing Xiang
Abstract:
In open-domain question answering, questions are highly likely to be ambiguous because users may not know the scope of relevant topics when formulating them. Therefore, a system needs to find possible interpretations of the question, and predict one or multiple plausible answers. When multiple plausible answers are found, the system should rewrite the question for each answer to resolve the ambiguity. In this paper, we present a model that aggregates and combines evidence from multiple passages to adaptively predict a single answer or a set of question-answer pairs for ambiguous questions. In addition, we propose a novel round-trip prediction approach to iteratively generate additional interpretations that our model fails to find in the first pass, and then verify and filter out the incorrect question-answer pairs to arrive at the final disambiguated output. Our model, named Refuel, achieves a new state-of-the-art performance on the AmbigQA dataset, and shows competitive performance on NQ-Open and TriviaQA. The proposed round-trip prediction is a model-agnostic general approach for answering ambiguous open-domain questions, which improves our Refuel as well as several baseline models. We release source code for our models and experiments at https://github.com/amzn/refuel-open-domain-qa.
Submitted 30 May, 2021; v1 submitted 26 November, 2020;
originally announced November 2020.
-
DualTKB: A Dual Learning Bridge between Text and Knowledge Base
Authors:
Pierre L. Dognin,
Igor Melnyk,
Inkit Padhi,
Cicero Nogueira dos Santos,
Payel Das
Abstract:
In this work, we present a dual learning approach for unsupervised text-to-path and path-to-text transfers in Commonsense Knowledge Bases (KBs). We investigate the impact of weak supervision by creating a weakly supervised dataset and show that even a slight amount of supervision can significantly improve the model performance and enable better-quality transfers. We examine different model architectures and evaluation metrics, proposing a novel Commonsense KB completion metric tailored for generative models. Extensive experimental results show that the proposed method compares very favorably to the existing baselines. This approach is a viable step towards a more advanced system for automatic KB construction/expansion and the reverse operation of KB conversion to coherent textual descriptions.
Submitted 27 October, 2020;
originally announced October 2020.
-
End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems
Authors:
Siamak Shakeri,
Cicero Nogueira dos Santos,
Henry Zhu,
Patrick Ng,
Feng Nan,
Zhiguo Wang,
Ramesh Nallapati,
Bing Xiang
Abstract:
We propose an end-to-end approach for synthetic QA data generation. Our model comprises a single transformer-based encoder-decoder network that is trained end-to-end to generate both answers and questions. In a nutshell, we feed a passage to the encoder and ask the decoder to generate a question and an answer token-by-token. The likelihood produced in the generation process is used as a filtering score, which avoids the need for a separate filtering model. Our generator is trained by fine-tuning a pretrained LM using maximum likelihood estimation. The experimental results indicate significant improvements in the domain adaptation of QA models, outperforming current state-of-the-art methods.
Submitted 12 October, 2020;
originally announced October 2020.
-
Beyond [CLS] through Ranking by Generation
Authors:
Cicero Nogueira dos Santos,
Xiaofei Ma,
Ramesh Nallapati,
Zhiheng Huang,
Bing Xiang
Abstract:
Generative models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document's language model, were very successful in various IR tasks in the past. However, with the advent of modern deep neural networks, attention has shifted to discriminative ranking functions that model the semantic similarity of documents and queries instead. Recently, deep generative models such as GPT2 and BART have been shown to be excellent text generators, but their effectiveness as rankers has not been demonstrated yet. In this work, we revisit the generative framework for information retrieval and show that our generative approaches are as effective as state-of-the-art semantic similarity-based discriminative models for the answer selection task. Additionally, we demonstrate the effectiveness of unlikelihood losses for IR.
Submitted 6 October, 2020;
originally announced October 2020.
-
Rapidly rotating compact stars in Rastall's gravity
Authors:
F. M. da Silva,
L. C. N. Santos,
C. C. Barros Jr
Abstract:
In this work we study rapidly rotating stars by considering the Rastall theory of gravity. We obtain and solve the equations by numerical methods for two usual parametrizations of polytropic stars. Then the mass-radius relations, moments of inertia and other results of interest are obtained and compared with the ones for non-rotating stars.
Submitted 4 January, 2022; v1 submitted 30 September, 2020;
originally announced October 2020.
-
Augmented Natural Language for Generative Sequence Labeling
Authors:
Ben Athiwaratkun,
Cicero Nogueira dos Santos,
Jason Krone,
Bing Xiang
Abstract:
We propose a generative framework for joint sequence labeling and sentence-level classification. Our model performs multiple sequence labeling tasks at once using a single, shared natural language output space. Unlike prior discriminative methods, our model naturally incorporates label semantics and shares knowledge across tasks. Our framework is general purpose, performing well on few-shot, low-resource, and high-resource tasks. We demonstrate these advantages on popular named entity recognition, slot labeling, and intent classification benchmarks. We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot ($75.0\% \rightarrow 90.9\%$) and 1-shot ($70.4\% \rightarrow 81.0\%$) state-of-the-art results. Furthermore, our model generates large improvements ($46.27\% \rightarrow 63.83\%$) in low-resource slot labeling over a BERT baseline by incorporating label semantics. We also maintain competitive results on high-resource tasks, performing within two points of the state-of-the-art on all tasks and setting a new state-of-the-art on the SNIPS dataset.
Submitted 15 September, 2020;
originally announced September 2020.
-
Embedding-based Zero-shot Retrieval through Query Generation
Authors:
Davis Liang,
Peng Xu,
Siamak Shakeri,
Cicero Nogueira dos Santos,
Ramesh Nallapati,
Zhiheng Huang,
Bing Xiang
Abstract:
Passage retrieval addresses the problem of locating relevant passages, usually from a large corpus, given a query. In practice, lexical term-matching algorithms like BM25 are popular choices for retrieval owing to their efficiency. However, term-based matching algorithms often miss relevant passages that have no lexical overlap with the query and cannot be finetuned to downstream datasets. In this work, we consider the embedding-based two-tower architecture as our neural retrieval model. Since labeled data can be scarce and because neural retrieval models require vast amounts of data to train, we propose a novel method for generating synthetic training data for retrieval. Our system produces remarkable results, significantly outperforming BM25 on 5 out of 6 datasets tested, by an average of 2.45 points for Recall@1. In some cases, our model trained on synthetic data can even outperform the same model trained on real data.
Submitted 21 September, 2020;
originally announced September 2020.
-
Generalized Rastall's gravity and its effects on compact objects
Authors:
Clésio E. Mota,
Luis C. N. Santos,
Franciele M. da Silva,
Guilherme Grams,
Iarley P. Lobo,
Débora P. Menezes
Abstract:
We present a generalization of Rastall's gravity in which the conservation law of the energy-momentum tensor is altered, and as a result, the trace of the energy-momentum tensor is taken into account together with the Ricci scalar in the expression for the covariant derivative. Afterwards, we obtain the field equations in this theory and solve them by considering a spherically symmetric space-time. We show that the external solution has two possible classes of solutions with spherical symmetry in the vacuum in generalized Rastall's gravity, and we analyse one of them explicitly. The generalization, in contrast to the constant value $k=8πG$ in general relativity, has a gravitational parameter $k$ that depends on the Rastall constant $α$. As an application, we perform a careful analysis of the effects of the theory on neutron stars using realistic equations of state (EoS) as input. Our results show that important differences in the profiles of neutron stars are obtained for two representative EoS.
Submitted 23 February, 2022; v1 submitted 3 July, 2020;
originally announced July 2020.
-
Learning Implicit Text Generation via Feature Matching
Authors:
Inkit Padhi,
Pierre Dognin,
Ke Bai,
Cicero Nogueira dos Santos,
Vijil Chenthamarakshan,
Youssef Mroueh,
Payel Das
Abstract:
Generative feature matching network (GFMN) is an approach for training implicit generative models for images by performing moment matching on features from pre-trained neural networks. In this paper, we present new GFMN formulations that are effective for sequential data. Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English: unconditional text generation, class-conditional text generation, and unsupervised text style transfer. SeqGFMN is stable to train and outperforms various adversarial approaches for text generation and text style transfer.
Submitted 8 May, 2020; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Covering the News with (AI) Style
Authors:
Michele Merler,
Cicero Nogueira dos Santos,
Mauro Martino,
Alfio M. Gliozzo,
John R. Smith
Abstract:
We introduce a multi-modal discriminative and generative framework capable of assisting humans in producing visual content related to a given theme, starting from a collection of documents (textual, visual, or both). This framework can be used by editors to generate images for articles, as well as book or music album covers. Motivated by a request from The New York Times (NYT) seeking help to use AI to create art for their special section on Artificial Intelligence, we demonstrated the application of our system in producing such an image.
Submitted 5 January, 2020;
originally announced February 2020.
-
Quantum dynamics of scalar particles in the space-time of a cosmic string in the context of gravity's rainbow
Authors:
L. C. N. Santos,
C. E. Mota,
C. C. Barros Jr.,
L. B. Castro,
V. B. Bezerra
Abstract:
We analyze the dynamics of scalar particles in gravity's rainbow considering the space-time of a cosmic string in this modified gravity. Thus, we solve the Klein-Gordon equation for two types of potential in which two possible rainbow functions are selected. In all cases studied we obtain exact solutions for the wave equation, derive the energy spectrum, analyze it in detail and discuss the dependence of the results on the choice of the rainbow functions.
Submitted 19 December, 2019;
originally announced December 2019.
-
Effects of Planck-scale-modified dispersion relations on the thermodynamics of charged black holes
Authors:
I. P. Lobo,
V. B. Bezerra,
J. P. Morais Graça,
Luis C. N. Santos,
M. Ronco
Abstract:
Considering corrections produced by modified dispersion relations on the equation of state parameter of radiation, we study the induced black hole metric inspired by Kiselev's ansatz, thus defining a deformed Reissner-Nordström metric. In particular, we consider thermodynamic properties of such a black hole from the combined viewpoints of the modified equation of state parameter and the phenomenological approach to the quantum gravity problem called rainbow gravity.
Submitted 2 April, 2020; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Anisotropic Compact Stars in Rastall-Rainbow Gravity
Authors:
Clésio E. Mota,
Luis C. N. Santos,
Franciele M. da Silva,
Cesar V. Flores,
Tiago J. N. da Silva,
Débora P. Menezes
Abstract:
In this work, we have investigated anisotropic neutron stars in the framework of Rastall-Rainbow gravity. All our calculations were performed using the IU-FSU realistic equation of state (EoS), for which two cases were considered: standard nucleonic composition and the one with the eight lightest baryons. From the neutron star masses and radii obtained we conclude that anisotropic pressure has significant consequences on the structure of stellar objects. In particular, when anisotropy is considered within the general relativity framework, it significantly modifies the maximum stellar mass. On the other hand, when Rastall-Rainbow gravity and anisotropy are simultaneously considered, they provide the best results for mass and radius values, including important astrophysical objects such as the LMXB NGC 6397 and the extremely massive millisecond pulsar MSP J0740+6620. Although the expected inclusion of hyperons in the nuclear model reproduces stellar masses smaller than those produced by standard nucleonic matter, we show that the hyperon puzzle problem can be solved by including anisotropic effects on compact stars in the context of the Rastall-Rainbow gravity.
Submitted 8 November, 2019;
originally announced November 2019.
-
Electrostatic self-interaction of charged particles in the space-time of a cosmic string in the context of gravity's rainbow
Authors:
L. C. N. Santos,
V. B. Bezerra
Abstract:
We analyze the electrostatic self-energy of a point-like electrically charged particle induced by a cosmic string in the context of gravity's rainbow, as well as the electrostatic self-force on this particle. The possibility of the solution associated with a charged particle being altered by modifications in dispersion relations of the space-time is discussed. We show that the self-energy depends on the rainbow functions and that this quantity can either increase or decrease depending on the rainbow function chosen, as compared with the analogous result in the framework of general relativity. With respect to the self-force, its dependence on the rainbow functions is also pointed out.
Submitted 5 November, 2019;
originally announced November 2019.
-
Scalar bosons with Coulomb potentials in a cosmic string background: Scattering and bound states
Authors:
Francisco A. Cruz Neto,
Franciele M. da Silva,
Luis C. N. Santos,
Luis B. Castro
Abstract:
The relativistic quantum motion of scalar bosons under the influence of full vector (minimal $A^μ$ and nonminimal $X^μ$) and scalar ($V_{s}$) interactions embedded in the background of a cosmic string is explored in the context of the Klein-Gordon equation. Considering Coulomb interactions, the effects of this topological defect on the equation of motion, phase shift and S-matrix are analyzed and discussed. Bound-state solutions are obtained from poles of the S-matrix and it is shown that bound-state solutions are possible only for a restricted range of coupling constants.
Submitted 2 October, 2019;
originally announced October 2019.
-
Effects of quantum corrections on the criticality and efficiency of black holes surrounded by a perfect fluid
Authors:
V. B. Bezerra,
I. P. Lobo,
J. P. Morais Graça,
Luis C. N. Santos
Abstract:
We study some properties of the extended phase space of a quantum-corrected Schwarzschild black hole surrounded by a perfect fluid. In particular we demonstrate that, due to the quantum correction, there exist first and second order phase transitions for a certain range of the state parameter of the perfect fluid, and we explicitly analyze some cases. Besides that, we describe the efficiency of this system as a heat engine and the effect of quantum corrections for different surrounding fluids.
Submitted 23 August, 2019; v1 submitted 21 August, 2019;
originally announced August 2019.
-
Combined Rastall and Rainbow theories of gravity with applications to neutron stars
Authors:
Clésio E. Mota,
Luis C. N. Santos,
Guilherme Grams,
Franciele M. da Silva,
Débora P. Menezes
Abstract:
The possibility of modifications to general relativity is investigated. We propose an alternative theory of gravity constructed with the combination of Rastall and Rainbow theories. The hydrostatic equilibrium equations are obtained in order to test the new theory in neutron stars, whose mass-radius diagrams are obtained using modern equations of state of nuclear matter derived from relativistic mean field models and compared with the ones computed by the Tolman-Oppenheimer-Volkoff equations. We conclude that substantial modifications are obtained even for very small alterations of the two free parameters, making the reproduction of astrophysical observations an easy task.
Submitted 3 May, 2019;
originally announced May 2019.
-
Learning Implicit Generative Models by Matching Perceptual Features
Authors:
Cicero Nogueira dos Santos,
Youssef Mroueh,
Inkit Padhi,
Pierre Dognin
Abstract:
Perceptual features (PFs) have been used with great success in tasks such as transfer learning, style transfer, and super-resolution. However, the efficacy of PFs as a key source of information for learning generative models is not well studied. We investigate here the use of PFs in the context of learning implicit generative models through moment matching (MM). More specifically, we propose a new effective MM approach that learns implicit generative models by performing mean and covariance matching of features extracted from pretrained ConvNets. Our proposed approach improves upon existing MM methods by: (1) breaking away from the problematic min/max game of adversarial learning; (2) avoiding online learning of kernel functions; and (3) being efficient with respect to both the number of moments used and the required minibatch size. Our experimental results demonstrate that, due to the expressiveness of PFs from pretrained deep ConvNets, our method achieves state-of-the-art results for challenging benchmarks.
Submitted 4 April, 2019;
originally announced April 2019.
-
Klein-Gordon oscillator in a topologically nontrivial space-time
Authors:
L. C. N. Santos,
C. E. Mota,
C. C. Barros Jr
Abstract:
In this study, we analyze solutions of the wave equation for scalar particles in a space-time with nontrivial topology. Solutions for the Klein--Gordon oscillator are found considering two configurations of this space-time. In the first one, it is assumed the $S^{1}\times R^{3}$ space where the metric is written in the usual inertial frame of reference. In the second case, we consider a rotating reference frame adapted to the circle S1. We obtained compact expressions for the energy spectrum and for the particles wave functions in both configurations. Additionally, we show that the energy spectrum of the solution associated to the rotating system has an additional term that breaks the symmetry around $E = 0$.
Submitted 13 January, 2019;
originally announced January 2019.
-
Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer
Authors:
Cicero Nogueira dos Santos,
Igor Melnyk,
Inkit Padhi
Abstract:
We introduce a new approach to tackle the problem of offensive language in online social media. Our approach uses unsupervised text style transfer to translate offensive sentences into non-offensive ones. We propose a new method for training encoder-decoders using non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss. Experimental results on data from Twitter and Reddit show that our method outperforms a state-of-the-art text style transfer system in two out of three quantitative metrics and produces reliable non-offensive transferred sentences.
Submitted 19 May, 2018;
originally announced May 2018.
-
Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
Authors:
Rui Zhang,
Cicero Nogueira dos Santos,
Michihiro Yasunaga,
Bing Xiang,
Dragomir Radev
Abstract:
Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity. The state-of-the-art end-to-end neural coreference model considers all text spans in a document as potential mentions and learns to link an antecedent to each possible mention. In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and the mention clustering log-likelihood given the mention cluster labels. Our model achieves state-of-the-art performance on the CoNLL-2012 Shared Task English test set.
Submitted 13 May, 2018;
originally announced May 2018.
-
Relativistic quantum motion of spin-0 particles under the influence of non-inertial effects in the cosmic string space-time
Authors:
L. C. N. Santos,
C. C. Barros Jr
Abstract:
We study solutions of the Klein-Gordon equation with vector and scalar potentials of the Coulomb type under the influence of non-inertial effects in the space-time of topological defects. We also investigate a quantum particle described by the Klein-Gordon oscillator in the background space-time generated by a string. An important result obtained is that the non-inertial effects restrict the physical region of the space-time where the particle can be placed. In addition, we show that these potentials can form bound states for the relativistic wave equation in this kind of background.
Submitted 30 December, 2017;
originally announced January 2018.
-
Improved Neural Text Attribute Transfer with Non-parallel Data
Authors:
Igor Melnyk,
Cicero Nogueira dos Santos,
Kahini Wadhawan,
Inkit Padhi,
Abhishek Kumar
Abstract:
Text attribute transfer using non-parallel data requires methods that can perform disentanglement of content and linguistic attributes. In this work, we propose multiple improvements over the existing approaches that enable the encoder-decoder framework to cope with the text attribute transfer from non-parallel data. We perform experiments on the sentiment transfer task using two datasets. For both datasets, our proposed method outperforms a strong baseline in two of the three employed evaluation metrics.
Submitted 4 December, 2017; v1 submitted 26 November, 2017;
originally announced November 2017.
-
Improved Answer Selection with Pre-Trained Word Embeddings
Authors:
Rishav Chakravarti,
Jiri Navratil,
Cicero Nogueira dos Santos
Abstract:
This paper evaluates existing and newly proposed answer selection methods based on pre-trained word embeddings. Word embeddings are highly effective in various natural language processing tasks and their integration into traditional information retrieval (IR) systems allows for the capture of semantic relatedness between questions and answers. Empirical results on three publicly available data sets show significant gains over traditional term frequency based approaches in both supervised and unsupervised settings. We show that combining these word embedding features with traditional learning-to-rank techniques can achieve similar performance to state-of-the-art neural networks trained for the answer selection task.
Submitted 14 August, 2017;
originally announced August 2017.
-
Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks
Authors:
Cicero Nogueira dos Santos,
Kahini Wadhawan,
Bowen Zhou
Abstract:
We propose discriminative adversarial networks (DAN) for semi-supervised learning and loss function learning. Our DAN approach builds upon generative adversarial networks (GANs) and conditional GANs but includes the key differentiator of using two discriminators instead of a generator and a discriminator. DAN can be seen as a framework to learn loss functions for predictors that also implements semi-supervised learning in a straightforward manner. We propose instantiations of DAN for two different prediction tasks: classification and ranking. Our experimental results on three datasets of different tasks demonstrate that DAN is a promising framework for both semi-supervised learning and learning loss functions for predictors. For all tasks, the semi-supervised capability of DAN can significantly boost the predictor performance for small labeled sets with minor architecture changes across tasks. Moreover, the loss functions automatically learned by DANs are very competitive and usually outperform the standard pairwise and negative log-likelihood loss functions for both semi-supervised and supervised learning.
Submitted 7 July, 2017;
originally announced July 2017.
-
Fermions in the Rindler spacetime
Authors:
L. C. N. Santos,
C. C. Barros Jr
Abstract:
In this paper we study the Dirac equation in the Rindler spacetime. The solution of the wave equation in an accelerated reference frame is obtained. The differential equation associated with this wave equation is mapped into a Sturm-Liouville problem of a Schrödinger-like equation. We derive a compact expression for the energy spectrum associated with the Dirac equation in an accelerated reference frame. It is shown that the noninertial effect of the accelerated reference frame mimics an external potential in the Dirac equation and, moreover, allows the formation of bound states.
Submitted 27 April, 2019; v1 submitted 2 April, 2017;
originally announced April 2017.
-
A Structured Self-attentive Sentence Embedding
Authors:
Zhouhan Lin,
Minwei Feng,
Cicero Nogueira dos Santos,
Mo Yu,
Bing Xiang,
Bowen Zhou,
Yoshua Bengio
Abstract:
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We evaluate our model on 3 different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods in all of the 3 tasks.
Submitted 8 March, 2017;
originally announced March 2017.