Search | arXiv e-print repository

arXiv:2405.20966 [pdf, other]

Black hole solutions surrounded by anisotropic fluid in $f(\mathbb{T},\CMcal{T})$ gravity

Authors: Franciele M. da Silva, Luis C. N. Santos, V. B. Bezerra

Abstract: In this work, we investigate some extensions of the Kiselev black hole solutions in the context of $f(\mathbb{T},\CMcal{T})$ gravity. By mapping the components of the Kiselev energy-momentum tensor into the anisotropic energy-momentum tensor and assuming a particular form of $f(\mathbb{T},\CMcal{T})$, we obtain exact solutions for the field equation in this theory that carries dependence on the coupling constant and on the parameter of the equation of state of the fluid. We show that in this scenario of modified gravity some new structure is added to the geometry of spacetime as compared to the Kiselev black hole. We analyse the energy conditions, mass, horizons and the Hawking temperature considering particular values for the parameter of the equation of state.

Submitted 31 May, 2024; originally announced May 2024.

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k).
Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2311.10768 [pdf, other]

Memory Augmented Language Models through Mixture of Word Experts

Authors: Cicero Nogueira dos Santos, James Lee-Thorp, Isaac Noble, Chung-Ching Chang, David Uthus

Abstract: Scaling up the number of parameters of language models has proven to be an effective approach to improve performance. For dense models, increasing model size proportionally increases the model's computation footprint. In this work, we seek to aggressively decouple learning capacity and FLOPs through Mixture-of-Experts (MoE) style models with large knowledge-rich vocabulary based routing functions and experts. Our proposed approach, dubbed Mixture of Word Experts (MoWE), can be seen as a memory augmented model, where a large set of word-specific experts play the role of a sparse memory. We demonstrate that MoWE performs significantly better than the T5 family of models with similar number of FLOPs in a variety of NLP tasks. Additionally, MoWE outperforms regular MoE models on knowledge intensive tasks and has similar performance to more complex memory augmented approaches that often require to invoke custom mechanisms to search the sparse memory.

Submitted 15 November, 2023; originally announced November 2023.

Comments: 14 pages

arXiv:2309.16865 [pdf, other]

doi 10.1103/PhysRevD.109.043054

A Bayesian study of quark models in view of recent astrophysical constraints

Authors: Franciele M. da Silva, Adamu Issifu, Luiz L. Lopes, Luis C. N. Santos, Débora P. Menezes

Abstract: In this work, we perform a comparative analysis between the density-dependent quark model and the vector MIT bag model using Bayesian analysis. We use the equations of state generated by these two models to describe quark stars. We impose four recent observational astrophysical constraints on both models to determine their model-dependent parameters in an optimized manner assuming that the compact objects observed are composed entirely of self-bound quarks. The restrictions are aimed at producing stars with maximum masses $2 - 2.35$ M$_\odot$ and a mass-radii diagram compatible with the observed pulsars: PSR J0740+6620, PSR J0952-0607, PSR J0030+0451 and the compact object XMMU J173203.3-344518. With this analysis, the parameter dependence of the nuclear equation of state (EoS) of both models is restricted.

Submitted 23 April, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: Published Version

Journal ref: Phys. Rev. D 109, 043054 (2024)

arXiv:2306.04009 [pdf, other]

Triggering Multi-Hop Reasoning for Question Answering in Language Models using Soft Prompts and Random Walks

Authors: Kanishka Misra, Cicero Nogueira dos Santos, Siamak Shakeri

Abstract: Despite readily memorizing world knowledge about entities, pre-trained language models (LMs) struggle to compose together two or more facts to perform multi-hop reasoning in question-answering tasks. In this work, we propose techniques that improve upon this limitation by relying on random walks over structured knowledge graphs. Specifically, we use soft prompts to guide LMs to chain together their encoded knowledge by learning to map multi-hop questions to random walk paths that lead to the answer. Applying our methods on two T5 LMs shows substantial improvements over standard tuning approaches in answering questions that require 2-hop reasoning.

Submitted 6 June, 2023; originally announced June 2023.

Comments: Findings of ACL 2023

arXiv:2301.09448 [pdf, ps, other]

doi 10.3390/physics5010008

Quantum Configuration and Phase Spaces: Finsler and Hamilton Geometries

Authors: Saulo Albuquerque, Valdir B. Bezerra, Iarley P. Lobo, Gabriel Macedo, Pedro H. Morais, Ernesto Rodrigues, Luis C. N. Santos, Gislaine Varão

Abstract: In this paper, we review two approaches that can describe, in a geometrical way, the kinematics of particles that are affected by Planck-scale departures, named Finsler and Hamilton geometries. By relying on maps that connect the spaces of velocities and momenta, we discuss the properties of configuration and phase spaces induced by these two distinct geometries. In particular, we exemplify this approach by considering the so-called $q$-de Sitter-inspired modified dispersion relation as a laboratory for this study. We finalize with some points that we consider as positive and negative ones of each approach for the description of quantum configuration and phase spaces.

Submitted 23 January, 2023; originally announced January 2023.

Comments: 22 pages. Matches published version. Invited contribution for Physics. Special Issue "New Advances in Quantum Geometry"

Journal ref: Physics 2023, 5(1), 90-115

arXiv:2301.03067 [pdf, ps, other]

Neutron stars in the context of $f$($\mathbb{T}$,$\mathcal{T}$) gravity

Authors: Clésio E. Mota, Luis C. N. Santos, Franciele M. da Silva, Cesar V. Flores, Iarley P. Lobo, Valdir B. Bezerra

Abstract: In this work, we investigate the existence of neutron stars (NS) in the framework of $f$($\mathbb{T}$,$\mathcal{T}$) gravity, where $\mathbb{T}$ is the torsion tensor and $\mathcal{T}$ is the trace of the energy-momentum tensor. The hydrostatic equilibrium equations are obtained, however, with $p$ and $ρ$ quantities passed on by effective quantities $\bar{p}$ and $\bar{ρ}$, whose mass-radius diagrams are obtained using modern equations of state (EoS) of nuclear matter derived from relativistic mean field models and compared with the ones computed by the Tolman-Oppenheimer-Volkoff (TOV) equations. Substantial changes in the mass-radius profiles of NS are obtained even for small changes in the free parameter of this modified theory. The results indicate that the use of $f$($\mathbb{T}$,$\mathcal{T}$) gravity in the study of NS provides good results for the masses and radii of some important astrophysical objects, as for example, the low-mass X-ray binary (LMXB) NGC 6397 and the millisecond pulsar PSR J0740+6620. In addition, radii results inferred from the Lead Radius EXperiment (PREX-2) can also be described for certain parameter values.

Submitted 10 January, 2023; v1 submitted 8 January, 2023; originally announced January 2023.

arXiv:2301.02534 [pdf, other]

Kiselev black holes in $f(R,T)$ gravity

Authors: L. C. N. Santos, F. M. da Silva, C. E. Mota, I. P. Lobo, V. B. Bezerra

Abstract: We obtain new exact solutions for the gravitational field equations in the context of $f(R,T)$ gravity, thereby obtaining different classes of black holes surrounded by fluids, taking into account some specific values of the parameter of the equations of state, $w$. In order to obtain these solutions in the context of $f(R,T)$ gravity, we consider viable particular choices of the $f(R,T)$. Considering an anisotropic energy-momentum tensor, we write the field equations with the required symmetries for this type of solution. Then, we analyze the conditions of energy in a general way and also for particular values of the parameter $w$ of the equation of state. In addition, thermodynamic quantities, such as Hawking temperature and mass associated to the horizons of solutions, are taken into account in our analysis.

Submitted 5 January, 2023; originally announced January 2023.

arXiv:2210.04726 [pdf, other]

Knowledge Prompts: Injecting World Knowledge into Language Models through Soft Prompts

Authors: Cicero Nogueira dos Santos, Zhe Dong, Daniel Cer, John Nham, Siamak Shakeri, Jianmo Ni, Yun-hsuan Sung

Abstract: Soft prompts have been recently proposed as a tool for adapting large frozen language models (LMs) to new tasks. In this work, we repurpose soft prompts to the task of injecting world knowledge into LMs. We introduce a method to train soft prompts via self-supervised learning on data from knowledge bases. The resulting soft knowledge prompts (KPs) are task independent and work as an external memory of the LMs. We perform qualitative and quantitative experiments and demonstrate that: (1) KPs can effectively model the structure of the training data; (2) KPs can be used to improve the performance of LMs in different knowledge intensive tasks.

Submitted 10 October, 2022; originally announced October 2022.

arXiv:2210.02559 [pdf, other]

doi 10.1142/S0219887823500676

Non-inertial effects on a non-relativistic quantum harmonic oscillator in the presence of a screw dislocation

Authors: L. C. N. Santos, F. M. da Silva, C. E. Mota, V. B. Bezerra

Abstract: We investigate non-inertial effects induced by a rotating frame on a non-relativistic quantum harmonic oscillator as well as of the topology associated to a screw dislocation, which corresponds to a distortion of a vertical line into a vertical spiral. To do this, we obtain the analytical solutions of the time-independent Schrödinger equation for this harmonic oscillator potential in this background. The expressions for the energy spectrum are obtained and the solutions for four quantum states, namely $n=0,1,2$ and $3$, are analysed. Our results show that the presence of the topological defect (screw dislocation), as well as the fact that we are analysing the system from the point of view of a rotating frame, changes the solutions of the Schrödinger equation and the corresponding spectrum. Now these quantities depend on the angular velocity of the rotating frame, $Ω$, and also on the parameter $β$, which codifies the presence of the screw dislocation. Particularly, with respect to the energy spectrum of the system the changing is such that when $Ω$ increases, the energy can increase or decrease depending on the values we assign to the eigenvalues of the angular and linear momenta. Additionally, we observe that the value of the parameter $β$ that characterizes the screw dislocation causes a shift in the energy spectrum.

Submitted 4 October, 2022; originally announced October 2022.

arXiv:2206.08469 [pdf, other]

doi 10.1140/epjc/s10052-023-11466-2

Rapidly rotating neutron stars in $f(R,T)$ gravity

Authors: F. M. da Silva, L. C. N. Santos, C. E. Mota, T. O. F. da Costa, J. C. Fabris

Abstract: In this work, we study the influence of $f(R,T)$ gravity on rapidly rotating neutron stars. First we discuss the main aspects of this modified theory of gravity where the gravitational Lagrangian is an arbitrary function of the Ricci scalar $R$ and of the trace of the energy-momentum tensor $T$. Then we present the basic equations for neutron stars including the equations of state used in the present work to describe the hadronic matter. Some physical quantities of interest are calculated such as mass-radius relations, moments of inertia, angular momentum, and compactness. By considering four different rotation regimes, we obtain results that indicate substantial modifications in the physical properties of neutron stars in $f(R,T)$ gravity when compared to those in the context of general relativity. In particular, the mass-radius relation for sequences of stars indicates that $f(R,T)$ gravity increases the mass and the equatorial radius of the neutron stars for stars rotating with an angular velocity smaller than the Kepler limit.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2205.12416 [pdf, other]

Counterfactual Data Augmentation improves Factuality of Abstractive Summarization

Authors: Dheeraj Rajagopal, Siamak Shakeri, Cicero Nogueira dos Santos, Eduard Hovy, Chung-Ching Chang

Abstract: Abstractive summarization systems based on pretrained language models often generate coherent but factually inconsistent sentences. In this paper, we present a counterfactual data augmentation approach where we augment data with perturbed summaries that increase the training data diversity. Specifically, we present three augmentation approaches based on replacing (i) entities from other and the same category and (ii) nouns with their corresponding WordNet hypernyms. We show that augmenting the training data with our approach improves the factual correctness of summaries without significantly affecting the ROUGE score. We show that in two commonly used summarization datasets (CNN/Dailymail and XSum), we improve the factual correctness by about 2.5 points on average.

Submitted 24 May, 2022; originally announced May 2022.

arXiv:2204.11458 [pdf, other]

ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference

Authors: Kai Hui, Honglei Zhuang, Tao Chen, Zhen Qin, Jing Lu, Dara Bahri, Ji Ma, Jai Prakash Gupta, Cicero Nogueira dos Santos, Yi Tay, Don Metzler

Abstract: State-of-the-art neural models typically encode document-query pairs using cross-attention for re-ranking. To this end, models generally utilize an encoder-only (like BERT) paradigm or an encoder-decoder (like T5) approach. These paradigms, however, are not without flaws, i.e., running the model on all query-document pairs at inference-time incurs a significant computational cost. This paper proposes a new training and inference paradigm for re-ranking. We propose to finetune a pretrained encoder-decoder model in the form of document to query generation. Subsequently, we show that this encoder-decoder architecture can be decomposed into a decoder-only language model during inference. This results in significant inference time speedups since the decoder-only architecture only needs to learn to interpret static encoder embeddings during inference. Our experiments show that this new paradigm achieves results that are comparable to the more expensive cross-attention ranking approaches while being up to 6.8X faster. We believe this work paves the way for more efficient neural rankers that leverage large pretrained models.

Submitted 25 April, 2022; originally announced April 2022.

Comments: Findings of ACL 2022

arXiv:2204.07647 [pdf]

Spectroscopic assessment of short-term nitric acid doping of epitaxial graphene

Authors: Ngoc Thanh Mai Tran, Swapnil M. Mhatre, Cristiane N. Santos, Adam J. Biacchi, Mathew L. Kelley, Heather M. Hill, Dipanjan Saha, Chi-Te Liang, Randolph E. Elmquist, David B. Newell, Benoit Hackens, Christina A. Hacker, Albert F. Rigosi

Abstract: This work reports information on the transience of hole doping in epitaxial graphene devices when nitric acid is used as an adsorbent. Under vacuum conditions, desorption processes are monitored by electrical and spectroscopic means to extract the relevant timescales from the corresponding data. It is of vital importance to understand the reversible nature of hole doping because such device processing can be a suitable alternative to large-scale, metallic gating. Most measurements are performed post-exposure at room temperature, and, for some electrical transport measurements, at 1.5 K. Vacuum conditions are applied to many measurements to replicate the laboratory conditions under which devices using this doping method would be measured. The relevant timescales from transport measurements are compared with results from X-ray photoelectron spectroscopy and Fourier transform infrared spectroscopy measurements, with the latter performed at ambient conditions and accompanied by calculations of the spectra in the Reststrahlen band.

Submitted 15 April, 2022; originally announced April 2022.

arXiv:2110.05396 [pdf, other]

doi 10.1016/j.nuclphysb.2021.115568

The extended phase space thermodynamics of Planck-scale-corrected Reissner-Nordström-anti-de Sitter black hole

Authors: Iarley P. Lobo, Luis C. N. Santos, V. B. Bezerra, J. P. Morais Graça, H. Moradpour

Abstract: We analyze the effect of Planck-scale modified radiation equation of state on the Reissner-Nordström-anti-de Sitter black hole inspired by Kiselev's ansatz. Deformed thermodynamic quantities are found, phase transitions and black holes as heat engines are described for the Carnot and square cycles. Non-trivial differences between linear and quadratic Planck-scale corrections are discussed in detail.

Submitted 11 October, 2021; originally announced October 2021.

Comments: 13 pages, 7 figures. Matches published version

Journal ref: Nuclear Physics B 972 (2021) 115568

arXiv:2110.00371 [pdf, ps, other]

doi 10.1016/j.physletb.2021.136684

Effects of modified dispersion relations on free Fermi gas: equations of state and applications in astrophysics

Authors: Luis C. N. Santos, Clésio E. Mota, Franciele M. da Silva, Guilherme Grams, I. P. Lobo

Abstract: Deformed dispersion relations are considered in the study of equations of state of Fermi gas with applications to compact objects. Different choices of deformed energy relations are used in the formulation of our model. As a first test, we consider a relativistic star with a simple internal structure. The mass-radius diagrams obtained suggest a positive influence of deformed Fermi gas, depending on the functions employed. In addition, we comment on how realistic equations of state, in which interactions between nucleons are taken into account, can be addressed.

Submitted 25 September, 2021; originally announced October 2021.

arXiv:2105.12932 [pdf, other]

Contrastive Fine-tuning Improves Robustness for Neural Rankers

Authors: Xiaofei Ma, Cicero Nogueira dos Santos, Andrew O. Arnold

Abstract: The performance of state-of-the-art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain. In this paper, we present a novel method for fine-tuning neural rankers that can significantly improve their robustness to out-of-domain data and query perturbations. Specifically, a contrastive loss that compares data points in the representation space is combined with the standard ranking loss during fine-tuning. We use relevance labels to denote similar/dissimilar pairs, which allows the model to learn the underlying matching semantics across different query-document pairs and leads to improved robustness. In experiments with four passage ranking datasets, the proposed contrastive fine-tuning method obtains improvements on robustness to query reformulations, noise perturbations, and zero-shot transfer for both BERT and BART based rankers. Additionally, our experiments show that contrastive fine-tuning outperforms data augmentation for robustifying neural rankers.

Submitted 27 May, 2021; originally announced May 2021.

Journal ref: Findings of ACL 2021

arXiv:2105.05052 [pdf, other]

Joint Text and Label Generation for Spoken Language Understanding

Authors: Yang Li, Ben Athiwaratkun, Cicero Nogueira dos Santos, Bing Xiang

Abstract: Generalization is a central problem in machine learning, especially when data is limited. Using prior information to enforce constraints is the principled way of encouraging generalization. In this work, we propose to leverage the prior information embedded in pretrained language models (LM) to improve generalization for intent classification and slot labeling tasks with limited training data. Specifically, we extract prior knowledge from pretrained LM in the form of synthetic data, which encode the prior implicitly. We fine-tune the LM to generate an augmented language, which contains not only text but also encodes both intent labels and slot labels. The generated synthetic data can be used to train a classifier later. Since the generated data may contain noise, we rephrase the learning from generated data as learning with noisy labels. We then utilize the mixout regularization for the classifier and prove its effectiveness to resist label noise in generated data. Empirically, our method demonstrates superior performance and outperforms the baseline by a large margin.

Submitted 11 May, 2021; originally announced May 2021.

arXiv:2105.04623 [pdf, other]

Improving Factual Consistency of Abstractive Summarization via Question Answering

Authors: Feng Nan, Cicero Nogueira dos Santos, Henghui Zhu, Patrick Ng, Kathleen McKeown, Ramesh Nallapati, Dejiao Zhang, Zhiguo Wang, Andrew O. Arnold, Bing Xiang

Abstract: A commonly observed problem with the state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents. The fact that automatic summarization may produce plausible-sounding yet inaccurate summaries is a major concern that limits its wide application. In this paper we present an approach to address factual consistency in summarization. We first propose an efficient automatic evaluation metric to measure factual consistency; next, we propose a novel learning algorithm that maximizes the proposed metric during model training. Through extensive experiments, we confirm that our method is effective in improving factual consistency and even overall quality of the summaries, as judged by both automatic metrics and human evaluation.

Submitted 10 May, 2021; originally announced May 2021.

Comments: ACL-IJCNLP 2021

arXiv:2104.08744 [pdf, other]

Generative Context Pair Selection for Multi-hop Question Answering

Authors: Dheeru Dua, Cicero Nogueira dos Santos, Patrick Ng, Ben Athiwaratkun, Bing Xiang, Matt Gardner, Sameer Singh

Abstract: Compositional reasoning tasks like multi-hop question answering require making latent decisions to get the final answer, given a question. However, crowdsourced datasets often capture only a slice of the underlying task distribution, which can induce unanticipated biases in models performing compositional reasoning. Furthermore, discriminatively trained models exploit such biases to get a better held-out performance, without learning the right way to reason, as they do not necessitate paying attention to the question representation (conditioning variable) in its entirety, to estimate the answer likelihood. In this work, we propose a generative context selection model for multi-hop question answering that reasons about how the given question could have been generated given a context pair. While being comparable to the state-of-the-art answering performance, our proposed generative passage selection model has a better performance (4.9% higher than baseline) on an adversarial held-out set, which tests the robustness of the model's multi-hop reasoning capabilities.

Submitted 18 April, 2021; originally announced April 2021.

arXiv:2102.09130 [pdf, other]

Entity-level Factual Consistency of Abstractive Text Summarization

Authors: Feng Nan, Ramesh Nallapati, Zhiguo Wang, Cicero Nogueira dos Santos, Henghui Zhu, Dejiao Zhang, Kathleen McKeown, Bing Xiang

Abstract: A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document. For example, state-of-the-art models trained on existing datasets exhibit entity hallucination, generating names of entities that are not present in the source document. We propose a set of new metrics to quantify the entity-level factual consistency of generated summaries and we show that the entity hallucination problem can be alleviated by simply filtering the training data. In addition, we propose adding a summary-worthy entity classification task to the training process as well as a joint entity and summary generation approach, which yield further improvements in entity-level metrics.

Submitted 17 February, 2021; originally announced February 2021.

Comments: EACL 2021

arXiv:2101.05779 [pdf, other]

Structured Prediction as Translation between Augmented Natural Languages

Authors: Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto

Abstract: We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative classifiers, we frame it as a translation task between augmented natural languages, from which the task-relevant information can be easily extracted. Our approach can match or outperform task-specific models on all tasks, and in particular, achieves new state-of-the-art results on joint entity and relation extraction (CoNLL04, ADE, NYT, and ACE2005 datasets), relation classification (FewRel and TACRED), and semantic role labeling (CoNLL-2005 and CoNLL-2012). We accomplish this while using the same architecture and hyperparameters for all tasks and even when training a single model to solve all tasks at the same time (multi-task learning). Finally, we show that our framework can also significantly improve the performance in a low-resource regime, thanks to better use of label semantics.

Submitted 2 December, 2021; v1 submitted 14 January, 2021; originally announced January 2021.

Journal ref: International Conference on Learning Representations (ICLR) 2021

arXiv:2012.10309 [pdf, other]

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Authors: Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang

Abstract: Most recently, there has been significant interest in learning contextual representations for various NLP tasks, by leveraging large scale text corpora to train large neural language models with self-supervised learning objectives, such as Masked Language Model (MLM). However, based on a pilot study, we observe three issues of existing general-purpose language models when they are applied to text-to-SQL semantic parsers: they fail to detect column mentions in the utterances, fail to infer column mentions from cell values, and fail to compose complex SQL queries. To mitigate these issues, we present a model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate pre-train data. GAP MODEL is trained on 2M utterance-schema pairs and 30K utterance-schema-SQL triples, whose utterances are produced by generative models. Based on experimental results, neural semantic parsers that leverage GAP MODEL as a representation encoder obtain new state-of-the-art results on both SPIDER and CRITERIA-TO-SQL benchmarks.

Submitted 18 December, 2020; originally announced December 2020.

Comments: Accepted to AAAI 2021

arXiv:2011.13137 [pdf, other]

Answering Ambiguous Questions through Generative Evidence Fusion and Round-Trip Prediction

Authors: Yifan Gao, Henghui Zhu, Patrick Ng, Cicero Nogueira dos Santos, Zhiguo Wang, Feng Nan, Dejiao Zhang, Ramesh Nallapati, Andrew O. Arnold, Bing Xiang

Abstract: In open-domain question answering, questions are highly likely to be ambiguous because users may not know the scope of relevant topics when formulating them. Therefore, a system needs to find possible interpretations of the question, and predict one or multiple plausible answers. When multiple plausible answers are found, the system should rewrite the question for each answer to resolve the ambiguity. In this paper, we present a model that aggregates and combines evidence from multiple passages to adaptively predict a single answer or a set of question-answer pairs for ambiguous questions. In addition, we propose a novel round-trip prediction approach to iteratively generate additional interpretations that our model fails to find in the first pass, and then verify and filter out the incorrect question-answer pairs to arrive at the final disambiguated output. Our model, named Refuel, achieves a new state-of-the-art performance on the AmbigQA dataset, and shows competitive performance on NQ-Open and TriviaQA. The proposed round-trip prediction is a model-agnostic general approach for answering ambiguous open-domain questions, which improves our Refuel as well as several baseline models. We release source code for our models and experiments at https://github.com/amzn/refuel-open-domain-qa.

Submitted 30 May, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

Comments: ACL 2021 main conference, 14 pages, 7 figures. Code will be released at https://github.com/amzn/refuel-open-domain-qa

arXiv:2010.14660 [pdf, other]

DualTKB: A Dual Learning Bridge between Text and Knowledge Base

Authors: Pierre L. Dognin, Igor Melnyk, Inkit Padhi, Cicero Nogueira dos Santos, Payel Das

Abstract: In this work, we present a dual learning approach for unsupervised text to path and path to text transfers in Commonsense Knowledge Bases (KBs). We investigate the impact of weak supervision by creating a weakly supervised dataset and show that even a slight amount of supervision can significantly improve the model performance and enable better-quality transfers. We examine different model architectures and evaluation metrics, proposing a novel Commonsense KB completion metric tailored for generative models. Extensive experimental results show that the proposed method compares very favorably to the existing baselines. This approach is a viable step towards a more advanced system for automatic KB construction/expansion and the reverse operation of KB conversion to coherent textual descriptions.

Submitted 27 October, 2020; originally announced October 2020.

Comments: Equal Contributions of Authors Pierre L. Dognin, Igor Melnyk, and Inkit Padhi. Accepted at EMNLP'20

arXiv:2010.06028 [pdf, other]

End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems

Authors: Siamak Shakeri, Cicero Nogueira dos Santos, Henry Zhu, Patrick Ng, Feng Nan, Zhiguo Wang, Ramesh Nallapati, Bing Xiang

Abstract: We propose an end-to-end approach for synthetic QA data generation. Our model comprises a single transformer-based encoder-decoder network that is trained end-to-end to generate both answers and questions. In a nutshell, we feed a passage to the encoder and ask the decoder to generate a question and an answer token-by-token. The likelihood produced in the generation process is used as a filtering score, which avoids the need for a separate filtering model. Our generator is trained by fine-tuning a pretrained LM using maximum likelihood estimation. The experimental results indicate significant improvements in the domain adaptation of QA models, outperforming current state-of-the-art methods.

Submitted 12 October, 2020; originally announced October 2020.

Comments: EMNLP 2020

arXiv:2010.03073 [pdf, other]

Beyond [CLS] through Ranking by Generation

Authors: Cicero Nogueira dos Santos, Xiaofei Ma, Ramesh Nallapati, Zhiheng Huang, Bing Xiang

Abstract: Generative models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document's language model, were very successful in various IR tasks in the past. However, with the advent of modern deep neural networks, attention has shifted to discriminative ranking functions that model the semantic similarity of documents and queries instead. Recently, deep generative models such as GPT2 and BART have been shown to be excellent text generators, but their effectiveness as rankers has not been demonstrated yet. In this work, we revisit the generative framework for information retrieval and show that our generative approaches are as effective as state-of-the-art semantic similarity-based discriminative models for the answer selection task. Additionally, we demonstrate the effectiveness of unlikelihood losses for IR.

Submitted 6 October, 2020; originally announced October 2020.

Comments: EMNLP 2020

arXiv:2010.00086 [pdf, other]

doi 10.1088/1361-6382/ac129d

Rapidly rotating compact stars in Rastall's gravity

Authors: F. M. da Silva, L. C. N. Santos, C. C. Barros Jr

Abstract: In this work we study rapidly rotating stars by considering the Rastall theory of gravity. We obtain and solve the equations by numerical methods for two usual parametrizations of polytropic stars. Then the mass-radius relations, moments of inertia and other results of interest are obtained and compared with the ones for non-rotating stars.

Submitted 4 January, 2022; v1 submitted 30 September, 2020; originally announced October 2020.

arXiv:2009.13272 [pdf, other]

Augmented Natural Language for Generative Sequence Labeling

Authors: Ben Athiwaratkun, Cicero Nogueira dos Santos, Jason Krone, Bing Xiang

Abstract: We propose a generative framework for joint sequence labeling and sentence-level classification. Our model performs multiple sequence labeling tasks at once using a single, shared natural language output space. Unlike prior discriminative methods, our model naturally incorporates label semantics and shares knowledge across tasks. Our framework is general purpose, performing well on few-shot, low-resource, and high-resource tasks. We demonstrate these advantages on popular named entity recognition, slot labeling, and intent classification benchmarks. We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot ($75.0\% \rightarrow 90.9\%$) and 1-shot ($70.4\% \rightarrow 81.0\%$) state-of-the-art results. Furthermore, our model generates large improvements ($46.27\% \rightarrow 63.83\%$) in low-resource slot labeling over a BERT baseline by incorporating label semantics. We also maintain competitive results on high-resource tasks, performing within two points of the state-of-the-art on all tasks and setting a new state-of-the-art on the SNIPS dataset.

Submitted 15 September, 2020; originally announced September 2020.

Comments: To appear at EMNLP 2020

arXiv:2009.10270 [pdf, other]

Embedding-based Zero-shot Retrieval through Query Generation

Authors: Davis Liang, Peng Xu, Siamak Shakeri, Cicero Nogueira dos Santos, Ramesh Nallapati, Zhiheng Huang, Bing Xiang

Abstract: Passage retrieval addresses the problem of locating relevant passages, usually from a large corpus, given a query. In practice, lexical term-matching algorithms like BM25 are popular choices for retrieval owing to their efficiency. However, term-based matching algorithms often miss relevant passages that have no lexical overlap with the query and cannot be finetuned to downstream datasets. In this work, we consider the embedding-based two-tower architecture as our neural retrieval model. Since labeled data can be scarce and because neural retrieval models require vast amounts of data to train, we propose a novel method for generating synthetic training data for retrieval. Our system produces remarkable results, significantly outperforming BM25 on 5 out of 6 datasets tested, by an average of 2.45 points for Recall@1. In some cases, our model trained on synthetic data can even outperform the same model trained on real data.

Submitted 21 September, 2020; originally announced September 2020.

arXiv:2007.01968 [pdf, ps, other]

doi 10.1142/S0218271822500237

Generalized Rastall's gravity and its effects on compact objects

Authors: Clésio E. Mota, Luis C. N. Santos, Franciele M. da Silva, Guilherme Grams, Iarley P. Lobo, Débora P. Menezes

Abstract: We present a generalization of Rastall's gravity in which the conservation law of the energy-momentum tensor is altered, and as a result, the trace of the energy-momentum tensor is taken into account together with the Ricci scalar in the expression for the covariant derivative. Afterwards, we obtain the field equations in this theory and solve them by considering a spherically symmetric space-time. We show that the external solution has two possible classes of solutions with spherical symmetry in the vacuum in generalized Rastall's gravity, and we analyse one of them explicitly. The generalization, in contrast to the constant value $k=8πG$ in general relativity, has a gravitational parameter $k$ that depends on the Rastall constant $α$. As an application, we perform a careful analysis of the effects of the theory on neutron stars using realistic equations of state (EoS) as input. Our results show that important differences in the profile of neutron stars are obtained within two representative EoS.

Submitted 23 February, 2022; v1 submitted 3 July, 2020; originally announced July 2020.

arXiv:2005.03588 [pdf, other]

Learning Implicit Text Generation via Feature Matching

Authors: Inkit Padhi, Pierre Dognin, Ke Bai, Cicero Nogueira dos Santos, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

Abstract: Generative feature matching network (GFMN) is an approach for training implicit generative models for images by performing moment matching on features from pre-trained neural networks. In this paper, we present new GFMN formulations that are effective for sequential data. Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English: unconditional text generation, class-conditional text generation, and unsupervised text style transfer. SeqGFMN is stable to train and outperforms various adversarial approaches for text generation and text style transfer.

Submitted 8 May, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

Comments: ACL 2020

arXiv:2002.02369 [pdf, other]

Covering the News with (AI) Style

Authors: Michele Merler, Cicero Nogueira dos Santos, Mauro Martino, Alfio M. Gliozzo, John R. Smith

Abstract: We introduce a multi-modal discriminative and generative framework capable of assisting humans in producing visual content related to a given theme, starting from a collection of documents (textual, visual, or both). This framework can be used by editors to generate images for articles, as well as books or music album covers. Motivated by a request from The New York Times (NYT) seeking help to use AI to create art for their special section on Artificial Intelligence, we demonstrated the application of our system in producing such an image.

Submitted 5 January, 2020; originally announced February 2020.

arXiv:1912.10923 [pdf, ps, other]

doi 10.1016/j.aop.2020.168276

Quantum dynamics of scalar particles in the space-time of a cosmic string in the context of gravity's rainbow

Authors: L. C. N. Santos, C. E. Mota, C. C. Barros Jr., L. B. Castro, V. B. Bezerra

Abstract: We analyze the dynamics of scalar particles in gravity's rainbow considering the space-time of a cosmic string in this modified gravity. Thus, we solve the Klein-Gordon equation for two types of potential in which two possible rainbow functions are selected. In all cases studied we obtain exact solutions for the wave equation, derive the energy spectrum, analyze it in detail and discuss the dependence of the results on the choice of the rainbow functions.

Submitted 19 December, 2019; originally announced December 2019.

Journal ref: Ann. Phys. (N.Y.) 421 (2020) 168276

arXiv:1911.08624 [pdf, other]

doi 10.1103/PhysRevD.101.084004

Effects of Planck-scale-modified dispersion relations on the thermodynamics of charged black holes

Authors: I. P. Lobo, V. B. Bezerra, J. P. Morais Graça, Luis C. N. Santos, M. Ronco

Abstract: Considering corrections produced by modified dispersion relations on the equation of state parameter of radiation, we study the induced black hole metric inspired by Kiselev's ansatz, thus defining a deformed Reissner-Nordström metric. In particular, we consider thermodynamic properties of such a black hole from the combined viewpoints of the modified equation of state parameter and the phenomenological approach to the quantum gravity problem called rainbow gravity.

Submitted 2 April, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: 16 pages, 8 figures

Journal ref: Phys. Rev. D 101, 084004 (2020)

arXiv:1911.03208 [pdf, ps, other]

Anisotropic Compact Stars in Rastall-Rainbow Gravity

Authors: Clésio E. Mota, Luis C. N. Santos, Franciele M. da Silva, Cesar V. Flores, Tiago J. N. da Silva, Débora P. Menezes

Abstract: In this work, we have investigated anisotropic neutron stars in the framework of Rastall-Rainbow gravity. All our calculations were computed using the IU-FSU realistic equation of state (EoS), for which two cases were considered: standard nucleonic composition and the one with the eight lightest baryons. From the neutron star masses and radii obtained we conclude that anisotropic pressure has significant consequences on the structure of stellar objects. In particular, when anisotropy is considered within the general relativity framework, it significantly modifies the maximum stellar mass. On the other hand, when Rastall-Rainbow gravity and anisotropy are simultaneously considered, they provide the best results for mass and radius values, including important astrophysical objects such as the LMXB NGC 6397 and the extremely massive millisecond pulsar MSP J0740+6620. Although the expected inclusion of hyperons in the nuclear model reproduces stellar masses smaller than those produced by standard nucleonic matter, we show that the hyperon puzzle problem can be solved by including anisotropic effects on compact stars in the context of the Rastall-Rainbow gravity.

Submitted 8 November, 2019; originally announced November 2019.

arXiv:1911.01996 [pdf, ps, other]

doi 10.1007/s10714-019-2633-z

Electrostatic self-interaction of charged particles in the space-time of a cosmic string in the context of gravity's rainbow

Authors: L. C. N. Santos, V. B. Bezerra

Abstract: We analyze the electrostatic self-energy of a point-like electrically charged particle induced by a cosmic string in the context of gravity's rainbow, as well as the electrostatic self-force on this particle. The possibility of the solution associated with a charged particle being altered by modifications in dispersion relations of the space-time is discussed. We show that the self-energy depends on the rainbow functions and that this quantity can either increase or decrease depending on the rainbow function chosen, as compared with the analogous result in the framework of general relativity. With respect to the self-force, its dependence on the rainbow functions is also pointed out.

Submitted 5 November, 2019; originally announced November 2019.

arXiv:1910.00997 [pdf, ps, other]

doi 10.1140/epjp/s13360-019-00062-7

Scalar bosons with Coulomb potentials in a cosmic string background: Scattering and bound states

Authors: Francisco A. Cruz Neto, Franciele M. da Silva, Luis C. N. Santos, Luis B. Castro

Abstract: The relativistic quantum motion of scalar bosons under the influence of a full vector (minimal $A^μ$ and nonminimal $X^μ$) and scalar ($V_{s}$) interactions embedded in the background of a cosmic string is explored in the context of the Klein-Gordon equation. Considering Coulomb interactions, the effects of this topological defect on the equation of motion, phase shift and S-matrix are analyzed and discussed. Bound-state solutions are obtained from poles of the S-matrix and it is shown that bound-state solutions are possible only for a restricted range of coupling constants.

Submitted 2 October, 2019; originally announced October 2019.

Comments: 6 pages, 2 figures

Journal ref: Eur. Phys. J. Plus 135, 25 (2020)

arXiv:1908.08140 [pdf, other]

doi 10.1140/epjc/s10052-019-7482-0

Effects of quantum corrections on the criticality and efficiency of black holes surrounded by a perfect fluid

Authors: V. B. Bezerra, I. P. Lobo, J. P. Morais Graça, Luis C. N. Santos

Abstract: We study some properties of the extended phase space of a quantum-corrected Schwarzschild black hole surrounded by a perfect fluid. In particular we demonstrate that, due to the quantum correction, there exist first and second order phase transitions for a certain range of the state parameter of the perfect fluid, and we explicitly analyze some cases. Besides that, we describe the efficiency of this system as a heat engine and the effect of quantum corrections for different surrounding fluids.

Submitted 23 August, 2019; v1 submitted 21 August, 2019; originally announced August 2019.

Comments: 10 pages, 12 figures. References added

arXiv:1905.01250 [pdf, ps, other]

doi 10.1103/PhysRevD.100.024043

Combined Rastall and Rainbow theories of gravity with applications to neutron stars

Authors: Clésio E. Mota, Luis C. N. Santos, Guilherme Grams, Franciele M. da Silva, Débora P. Menezes

Abstract: The possibility of modifications on general relativity is investigated. We propose an alternative theory of gravity constructed with the combination of Rastall and Rainbow theories. The hydrostatic equilibrium equations are obtained in order to test the new theory in neutron stars, whose mass-radius diagrams are obtained using modern equations of state of nuclear matter derived from relativistic mean field models and compared with the ones computed by the Tolman-Oppenheimer-Volkoff equations. We conclude that substantial modifications are obtained even for very small alterations on the two free parameters, making the reproduction of astrophysical observations an easy task.

Submitted 3 May, 2019; originally announced May 2019.

Journal ref: Phys. Rev. D 100, 024043 (2019)

arXiv:1904.02762 [pdf, other]

Learning Implicit Generative Models by Matching Perceptual Features

Authors: Cicero Nogueira dos Santos, Youssef Mroueh, Inkit Padhi, Pierre Dognin

Abstract: Perceptual features (PFs) have been used with great success in tasks such as transfer learning, style transfer, and super-resolution. However, the efficacy of PFs as key source of information for learning generative models is not well studied. We investigate here the use of PFs in the context of learning implicit generative models through moment matching (MM). More specifically, we propose a new effective MM approach that learns implicit generative models by performing mean and covariance matching of features extracted from pretrained ConvNets. Our proposed approach improves upon existing MM methods by: (1) breaking away from the problematic min/max game of adversarial learning; (2) avoiding online learning of kernel functions; and (3) being efficient with respect to both number of used moments and required minibatch size. Our experimental results demonstrate that, due to the expressiveness of PFs from pretrained deep ConvNets, our method achieves state-of-the-art results for challenging benchmarks.

Submitted 4 April, 2019; originally announced April 2019.

Comments: 16 pages

Journal ref: ICCV 2019

arXiv:1901.04104 [pdf, ps, other]

doi 10.1155/2019/2729352

Klein-Gordon oscillator in a topologically nontrivial space-time

Authors: L. C. N. Santos, C. E. Mota, C. C. Barros Jr

Abstract: In this study, we analyze solutions of the wave equation for scalar particles in a space-time with nontrivial topology. Solutions for the Klein-Gordon oscillator are found considering two configurations of this space-time. In the first one, the $S^{1}\times R^{3}$ space is assumed, where the metric is written in the usual inertial frame of reference. In the second case, we consider a rotating reference frame adapted to the circle $S^{1}$. We obtained compact expressions for the energy spectrum and for the particles' wave functions in both configurations. Additionally, we show that the energy spectrum of the solution associated to the rotating system has an additional term that breaks the symmetry around $E = 0$.

Submitted 13 January, 2019; originally announced January 2019.

arXiv:1805.07685 [pdf, other]

Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer

Authors: Cicero Nogueira dos Santos, Igor Melnyk, Inkit Padhi

Abstract: We introduce a new approach to tackle the problem of offensive language in online social media. Our approach uses unsupervised text style transfer to translate offensive sentences into non-offensive ones. We propose a new method for training encoder-decoders using non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss. Experimental results on data from Twitter and Reddit show that our method outperforms a state-of-the-art text style transfer system in two out of three quantitative metrics and produces reliable non-offensive transferred sentences.

Submitted 19 May, 2018; originally announced May 2018.

Comments: ACL 2018

arXiv:1805.04893 [pdf, other]

Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering

Authors: Rui Zhang, Cicero Nogueira dos Santos, Michihiro Yasunaga, Bing Xiang, Dragomir Radev

Abstract: Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity. The state-of-the-art end-to-end neural coreference model considers all text spans in a document as potential mentions and learns to link an antecedent for each possible mention. In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and the mention clustering log-likelihood given the mention cluster labels. Our model achieves the state-of-the-art performance on the CoNLL-2012 Shared Task English test set.

Submitted 13 May, 2018; originally announced May 2018.

Comments: ACL 2018

arXiv:1801.01024 [pdf, ps, other]

doi 10.1140/epjc/s10052-017-5476-3

Relativistic quantum motion of spin-0 particles under the influence of non-inertial effects in the cosmic string space-time

Authors: L. C. N. Santos, C. C. Barros Jr

Abstract: We study solutions for the Klein-Gordon equation with vector and scalar potentials of the Coulomb types under the influence of non-inertial effects in the space-time of topological defects. We also investigate a quantum particle described by the Klein-Gordon oscillator in the background space-time generated by a string. An important result obtained is that the non-inertial effects restrict the physical region of the space-time where the particle can be placed. In addition, we show that these potentials can form bound states for the relativistic wave equation in this kind of background.

Submitted 30 December, 2017; originally announced January 2018.

Comments: arXiv admin note: text overlap with arXiv:1608.06692

arXiv:1711.09395 [pdf, other]

Improved Neural Text Attribute Transfer with Non-parallel Data

Authors: Igor Melnyk, Cicero Nogueira dos Santos, Kahini Wadhawan, Inkit Padhi, Abhishek Kumar

Abstract: Text attribute transfer using non-parallel data requires methods that can perform disentanglement of content and linguistic attributes. In this work, we propose multiple improvements over the existing approaches that enable the encoder-decoder framework to cope with the text attribute transfer from non-parallel data. We perform experiments on the sentiment transfer task using two datasets. For both datasets, our proposed method outperforms a strong baseline in two of the three employed evaluation metrics.

Submitted 4 December, 2017; v1 submitted 26 November, 2017; originally announced November 2017.

Comments: NIPS 2017 Workshop on Learning Disentangled Representations: from Perception to Control

arXiv:1708.04326 [pdf, ps, other]

Improved Answer Selection with Pre-Trained Word Embeddings

Authors: Rishav Chakravarti, Jiri Navratil, Cicero Nogueira dos Santos

Abstract: This paper evaluates existing and newly proposed answer selection methods based on pre-trained word embeddings. Word embeddings are highly effective in various natural language processing tasks and their integration into traditional information retrieval (IR) systems allows for the capture of semantic relatedness between questions and answers. Empirical results on three publicly available data sets show significant gains over traditional term frequency based approaches in both supervised and unsupervised settings. We show that combining these word embedding features with traditional learning-to-rank techniques can achieve similar performance to state-of-the-art neural networks trained for the answer selection task.

Submitted 14 August, 2017; originally announced August 2017.

arXiv:1707.02198 [pdf, other]

Learning Loss Functions for Semi-supervised Learning via Discriminative Adversarial Networks

Authors: Cicero Nogueira dos Santos, Kahini Wadhawan, Bowen Zhou

Abstract: We propose discriminative adversarial networks (DAN) for semi-supervised learning and loss function learning. Our DAN approach builds upon generative adversarial networks (GANs) and conditional GANs but includes the key differentiator of using two discriminators instead of a generator and a discriminator. DAN can be seen as a framework to learn loss functions for predictors that also implements semi-supervised learning in a straightforward manner. We propose instantiations of DAN for two different prediction tasks: classification and ranking. Our experimental results on three datasets of different tasks demonstrate that DAN is a promising framework for both semi-supervised learning and learning loss functions for predictors. For all tasks, the semi-supervised capability of DAN can significantly boost the predictor performance for small labeled sets with minor architecture changes across tasks. Moreover, the loss functions automatically learned by DANs are very competitive and usually outperform the standard pairwise and negative log-likelihood loss functions for both semi-supervised and supervised learning.

Submitted 7 July, 2017; originally announced July 2017.

Comments: 11 pages

arXiv:1704.00408 [pdf, ps, other]

doi 10.1142/S0219887819501408

Fermions in the Rindler spacetime

Authors: L. C. N. Santos, C. C. Barros Jr

Abstract: In this paper we study the Dirac equation in the Rindler spacetime. The solution of the wave equation in an accelerated reference frame is obtained. The differential equation associated to this wave equation is mapped into a Sturm-Liouville problem of a Schrödinger-like equation. We derive a compact expression for the energy spectrum associated with the Dirac equation in an accelerated reference frame. It is shown that the noninertial effect of the accelerated reference frame mimics an external potential in the Dirac equation and, moreover, allows the formation of bound states.

Submitted 27 April, 2019; v1 submitted 2 April, 2017; originally announced April 2017.

arXiv:1703.03130 [pdf, other]

A Structured Self-attentive Sentence Embedding

Authors: Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio

Abstract: This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We evaluate our model on 3 different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods in all of the 3 tasks.

Submitted 8 March, 2017; originally announced March 2017.

Comments: 15 pages with appendix, 7 figures, 4 tables. Conference paper in 5th International Conference on Learning Representations (ICLR 2017)

Showing 1–50 of 59 results for author: Santos, C N