Showing 1–50 of 72 results for author: Ballesteros, M

  1. arXiv:2405.00204  [pdf, other]

    cs.CL cs.AI

    General Purpose Verification for Chain of Thought Prompting

    Authors: Robert Vacareanu, Anurag Pratik, Evangelia Spiliopoulou, Zheng Qi, Giovanni Paolini, Neha Anna John, Jie Ma, Yassine Benajiba, Miguel Ballesteros

    Abstract: Many of the recent capabilities demonstrated by Large Language Models (LLMs) arise primarily from their ability to exploit contextual information. In this paper, we explore ways to improve reasoning capabilities of LLMs through (1) exploration of different chains of thought and (2) validation of the individual steps of the reasoning process. We propose three general principles that a model should…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 22 pages, preprint

  2. arXiv:2402.18479  [pdf, other]

    cs.CL

    NewsQs: Multi-Source Question Generation for the Inquiring Mind

    Authors: Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown

    Abstract: We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents. To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web corpus. We show that fine-tuning a model with control codes produces questions that are judg…

    Submitted 15 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: minor wording change

  3. arXiv:2311.18112  [pdf, ps, other]

    math.PR math.ST

    Detection of an Arbitrary Number of Communities in a Block Spin Ising Model

    Authors: Miguel Ballesteros, Ramsés H. Mena, José Luis Pérez, Gabor Toth

    Abstract: We study the problem of community detection in a general version of the block spin Ising model featuring M groups, a model inspired by the Curie-Weiss model of ferromagnetism in statistical mechanics. We solve the general problem of identifying any number of groups with any possible coupling constants. Up to now, the problem was only solved for the specific situation with two groups of identical s…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 31 pages

    MSC Class: 62H22; 82B20; 60F05

  4. arXiv:2305.17127  [pdf, other]

    cs.CL

    Characterizing and Measuring Linguistic Dataset Drift

    Authors: Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth

    Abstract: NLP models often degrade in performance when real world data distributions differ markedly from training data. However, existing dataset drift metrics in NLP have generally not considered specific dimensions of linguistic drift that affect model performance, and they have not been validated in their ability to predict model performance at the individual example level, where such metrics are often…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  5. arXiv:2305.13191  [pdf, other]

    cs.CL cs.AI cs.LG

    Taxonomy Expansion for Named Entity Recognition

    Authors: Karthikeyan K, Yogarshi Vyas, Jie Ma, Giovanni Paolini, Neha Anna John, Shuai Wang, Yassine Benajiba, Vittorio Castelli, Dan Roth, Miguel Ballesteros

    Abstract: Training a Named Entity Recognition (NER) model often involves fixing a taxonomy of entity types. However, requirements evolve and we might need the NER model to recognize additional entity types. A simple approach is to re-annotate the entire dataset with both existing and additional entity types and then train the model on the re-annotated dataset. However, this is an extremely laborious task. To re…

    Submitted 22 May, 2023; originally announced May 2023.

  6. arXiv:2305.11979  [pdf, other]

    cs.CL

    A Weak Supervision Approach for Few-Shot Aspect Based Sentiment

    Authors: Robert Vacareanu, Siddharth Varia, Kishaloy Halder, Shuai Wang, Giovanni Paolini, Neha Anna John, Miguel Ballesteros, Smaranda Muresan

    Abstract: We explore how weak supervision on abundant unlabeled data can be leveraged to improve few-shot performance in aspect-based sentiment analysis (ABSA) tasks. We propose a pipeline approach to construct a noisy ABSA dataset, and we use it to adapt a pre-trained sequence-to-sequence model to the ABSA tasks. We test the resulting model on three widely used ABSA datasets, before and after fine-tuning.…

    Submitted 19 May, 2023; originally announced May 2023.

  7. arXiv:2305.11242  [pdf, other]

    cs.CL

    Comparing Biases and the Impact of Multilingual Training across Multiple Languages

    Authors: Sharon Levy, Neha Anna John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth

    Abstract: Studies in bias and fairness in natural language processing have primarily examined social biases within a single language and/or across few attributes (e.g. gender, race). However, biases can manifest differently across various languages for individual attributes. As a result, it is critical to examine biases within each language and attribute. Of equal importance is to study how these biases com…

    Submitted 18 May, 2023; originally announced May 2023.

  8. arXiv:2303.11660  [pdf, other]

    cs.CL

    Simple Yet Effective Synthetic Dataset Construction for Unsupervised Opinion Summarization

    Authors: Ming Shen, Jie Ma, Shuai Wang, Yogarshi Vyas, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba

    Abstract: Opinion summarization provides an important solution for summarizing opinions expressed among a large number of reviews. However, generating aspect-specific and general summaries is challenging due to the lack of annotated data. In this work, we propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries by training on synthetic datasets…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: EACL 2023 Findings

  9. arXiv:2302.12297  [pdf, other]

    cs.CL

    Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views

    Authors: Katerina Margatina, Shuai Wang, Yogarshi Vyas, Neha Anna John, Yassine Benajiba, Miguel Ballesteros

    Abstract: Temporal concept drift refers to the problem of data changing over time. In NLP, that would entail that language (e.g. new expressions, meaning shifts) and factual knowledge (e.g. new concepts, updated facts) evolve over time. Focusing on the latter, we benchmark 11 pretrained masked language models (MLMs) on a series of tests designed to evaluate the effect of temporal concept drift, as it is c…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: To appear at EACL 2023. Our code will be available at https://github.com/amazon-science/temporal-robustness

  10. arXiv:2211.05021  [pdf, ps, other]

    math-ph

    Levinson theorem for discrete Schrödinger operators on the line with matrix potentials having a first moment

    Authors: Miguel Ballesteros, Gerardo Franco Córdova, Ivan Naumkin, Hermann Schulz-Baldes

    Abstract: This paper proves new results on spectral and scattering theory for matrix-valued Schrödinger operators on the discrete line with non-compactly supported perturbations whose first moments are assumed to exist. In particular, a Levinson theorem is proved, in which a relation between scattering data and spectral properties (bound and half bound states) of the corresponding Hamiltonians is derived. T…

    Submitted 9 November, 2022; originally announced November 2022.

  11. arXiv:2211.04903  [pdf, other]

    cs.CL

    Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

    Authors: Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown

    Abstract: Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter. We present a pipelined extractive-abstractive approach where the extractive step filters the content that is passed to the abstractive component. Extremely lengthy input also results in a highly skewed data…

    Submitted 9 November, 2022; originally announced November 2022.

  12. arXiv:2210.06629  [pdf, other]

    cs.CL

    Instruction Tuning for Few-Shot Aspect-Based Sentiment Analysis

    Authors: Siddharth Varia, Shuai Wang, Kishaloy Halder, Robert Vacareanu, Miguel Ballesteros, Yassine Benajiba, Neha Anna John, Rishita Anubhai, Smaranda Muresan, Dan Roth

    Abstract: Aspect-based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task which involves four elements from user-generated texts: aspect term, aspect category, opinion term, and sentiment polarity. Most computational approaches focus on some of the ABSA sub-tasks such as tuple (aspect term, sentiment polarity) or triplet (aspect term, opinion term, sentiment polarity) extraction using eithe…

    Submitted 11 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Camera ready copy for WASSA at ACL 2023

  13. arXiv:2210.05613  [pdf, other]

    cs.CL cs.AI

    Contrastive Training Improves Zero-Shot Classification of Semi-structured Documents

    Authors: Muhammad Khalifa, Yogarshi Vyas, Shuai Wang, Graham Horwood, Sunil Mallya, Miguel Ballesteros

    Abstract: We investigate semi-structured document classification in a zero-shot setting. Classification of semi-structured documents is more challenging than that of standard unstructured documents, as positional, layout, and style information play a vital role in interpreting such documents. The standard classification setting where categories are fixed during both training and testing falls short in dynam…

    Submitted 11 October, 2022; originally announced October 2022.

  14. arXiv:2204.11117  [pdf, other]

    cs.CL cs.LG

    Exploring the Role of Task Transferability in Large-Scale Multi-Task Learning

    Authors: Vishakh Padmakumar, Leonard Lausen, Miguel Ballesteros, Sheng Zha, He He, George Karypis

    Abstract: Recent work has found that multi-task training with a large number of diverse tasks can uniformly improve downstream performance on unseen target tasks. In contrast, literature on task transferability has established that the choice of intermediate tasks can heavily affect downstream task performance. In this work, we aim to disentangle the effect of scale and relatedness of tasks in multi-task re…

    Submitted 12 July, 2022; v1 submitted 23 April, 2022; originally announced April 2022.

    Comments: NAACL 2022 - Camera ready version

  15. arXiv:2203.08985  [pdf, other]

    cs.CL

    Label Semantics for Few Shot Named Entity Recognition

    Authors: Jie Ma, Miguel Ballesteros, Srikanth Doss, Rishita Anubhai, Sunil Mallya, Yaser Al-Onaizan, Dan Roth

    Abstract: We study the problem of few shot learning for named entity recognition. Specifically, we leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Findings of ACL 2022

  16. arXiv:2112.08345  [pdf, other]

    cs.CV

    Reliable Multi-Object Tracking in the Presence of Unreliable Detections

    Authors: Travis Mandel, Mark Jimenez, Emily Risley, Taishi Nammoto, Rebekka Williams, Max Panoff, Meynard Ballesteros, Bobbie Suarez

    Abstract: Recent multi-object tracking (MOT) systems have leveraged highly accurate object detectors; however, training such detectors requires large amounts of labeled data. Although such data is widely available for humans and vehicles, it is significantly more scarce for other animal species. We present Robust Confidence Tracking (RCT), an algorithm designed to maintain robust performance even when detec…

    Submitted 7 November, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: The full journal version of this article (published in Pattern Recognition, Vol. 135) can be found at https://www.sciencedirect.com/science/article/pii/S0031320322005878. The article is open access. The source code and dataset can be found at https://github.com/tmandel/fish-detrac

  17. arXiv:2109.08232  [pdf, other]

    cs.CL

    A Bag of Tricks for Dialogue Summarization

    Authors: Muhammad Khalifa, Miguel Ballesteros, Kathleen McKeown

    Abstract: Dialogue summarization comes with its own peculiar challenges as opposed to news or scientific article summarization. In this work, we explore four different challenges of the task: handling and differentiating parts of the dialogue belonging to multiple speakers, negation understanding, reasoning about the situation, and informal language understanding. Using a pretrained sequence-to-sequence la…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 - short paper

  18. arXiv:2109.03160  [pdf, other]

    cs.CL

    How much pretraining data do language models need to learn syntax?

    Authors: Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner

    Abstract: Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks. However, while pretraining methods are very convenient, they are expensive in terms of time and resources. This calls for a study of the impact of pretraining data size on the knowledge of the models. We explore this impact on the syntactic capabilities of RoBERTa, using models trained on i…

    Submitted 9 September, 2021; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: To be published in proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)

  19. arXiv:2104.08413  [pdf, other]

    cs.CL

    Sequential Cross-Document Coreference Resolution

    Authors: Emily Allaway, Shuai Wang, Miguel Ballesteros

    Abstract: Relating entities and events in text is a key component of natural language understanding. Cross-document coreference resolution, in particular, is important for the growing interest in multi-document analysis tasks. In this work we propose a new model that extends the efficient sequential prediction paradigm for coreference resolution to cross-document settings and achieves competitive results fo…

    Submitted 16 April, 2021; originally announced April 2021.

  20. arXiv:2101.11492  [pdf, other]

    cs.CL

    On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations

    Authors: Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

    Abstract: The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. Among other information, it has been shown that entire syntax trees are implicitly embedded in the geometry of such models. As these models are often fine-tuned, it becom…

    Submitted 10 February, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

  21. arXiv:2101.11059  [pdf, other]

    cs.CL cs.AI cs.IR

    Event-Driven News Stream Clustering using Entity-Aware Contextual Embeddings

    Authors: Kailash Karthik Saravanakumar, Miguel Ballesteros, Muthu Kumar Chandrasekaran, Kathleen McKeown

    Abstract: We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm. Our model uses a combination of sparse and dense document representations, aggregates document-cluster similarity along these multiple representations and makes the clustering decision using a neural classifier. The weighted document-cluster similarity model is learned using a…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: To appear in Proceedings of The 16th Conference of the European Chapter of the Association for Computational Linguistics

    ACM Class: I.2.7

  22. arXiv:2010.14042  [pdf, other]

    cs.CL

    To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

    Authors: Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

    Abstract: Leveraging large amounts of unlabeled data using Transformer-like architectures, like BERT, has gained popularity in recent times owing to their effectiveness in learning general representations that can then be further fine-tuned for downstream tasks to much success. However, training these models can be costly both from an economic and environmental standpoint. In this work, we investigate how t…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: Accepted in the Proceedings of 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020) (https://2020.emnlp.org/papers/main)

  23. arXiv:2010.11333  [pdf, other]

    cs.CL

    Linking Entities to Unseen Knowledge Bases with Arbitrary Schemas

    Authors: Yogarshi Vyas, Miguel Ballesteros

    Abstract: In entity linking, mentions of named entities in raw text are disambiguated against a knowledge base (KB). This work focuses on linking to unseen KBs that do not have training data and whose schema is unknown during training. Our approach relies on methods to flexibly convert entities from arbitrary KBs with several attribute-value pairs into flat strings, which we use in conjunction with state-of…

    Submitted 21 October, 2020; originally announced October 2020.

  24. arXiv:2010.10669  [pdf, other]

    cs.CL

    Transition-based Parsing with Stack-Transformers

    Authors: Ramon Fernandez Astudillo, Miguel Ballesteros, Tahira Naseem, Austin Blodgett, Radu Florian

    Abstract: Modeling the parser state is key to good performance in transition-based parsing. Recurrent Neural Networks considerably improved the performance of transition-based systems by modelling the global state, e.g. stack-LSTM parsers, or local state modeling of contextualized features, e.g. Bi-LSTM parsers. Given the success of Transformer architectures in recent parsing systems, this work explores mod…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP2020, open review https://openreview.net/forum?id=b36spsuUAde, code https://github.com/IBM/transition-amr-parser

  25. arXiv:2010.05725  [pdf, other]

    cs.CL

    Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

    Authors: Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros

    Abstract: Humans can learn structural properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: To appear at EMNLP 2020

  26. arXiv:2010.03022  [pdf, other]

    cs.CL

    Resource-Enhanced Neural Model for Event Argument Extraction

    Authors: Jie Ma, Shuai Wang, Rishita Anubhai, Miguel Ballesteros, Yaser Al-Onaizan

    Abstract: Event argument extraction (EAE) aims to identify the arguments of an event and classify the roles that those arguments play. Despite great efforts made in prior work, there remain many challenges: (1) Data scarcity. (2) Capturing the long-range dependency, specifically, the connection between an event trigger and a distant event argument. (3) Integrating event trigger information into candidate ar…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Findings of EMNLP 2020

  27. arXiv:2008.02177  [pdf, ps, other]

    math-ph

    Band edge limit of the scattering matrix for quasi-one-dimensional discrete Schrödinger operators

    Authors: Miguel Ballesteros, Gerardo Franco Córdova, Guillermo Garro, Hermann Schulz-Baldes

    Abstract: This paper is about the scattering theory for one-dimensional matrix Schrödinger operators with a matrix potential having a finite first moment. The transmission coefficients are analytically continued and extended to the band edges. An explicit expression is given for these extensions. The limits of the reflection coefficients at the band edges are also calculated.

    Submitted 28 March, 2022; v1 submitted 5 August, 2020; originally announced August 2020.

    Comments: minor corrections; appears in Complex Analysis and Operator Theory

  28. The appearance of particle tracks in detectors

    Authors: Miguel Ballesteros, Tristan Benoist, Martin Fraas, Jürg Fröhlich

    Abstract: The phenomenon that a quantum particle propagating in a detector, such as a Wilson cloud chamber, leaves a track close to a classical trajectory is analyzed. We introduce an idealized quantum-mechanical model of a charged particle that is periodically illuminated by pulses of laser light resulting in repeated indirect measurements of the approximate position of the particle. For this model we pres…

    Submitted 1 July, 2020; originally announced July 2020.

    Comments: 40 pages

  29. arXiv:2004.13099  [pdf, other]

    math-ph

    Analyticity properties of the scattering matrix for matrix Schrödinger operators on the discrete line

    Authors: Miguel Ballesteros, Gerardo Franco Córdova, Hermann Schulz-Baldes

    Abstract: Explicit formulas for the analytic extensions of the scattering matrix and the time delay of a quasi-one-dimensional discrete Schrödinger operator with a potential of finite support are derived. This includes a careful analysis of the band edge singularities and allows one to prove a Levinson-type theorem. The main algebraic tools are the plane wave transfer matrices.

    Submitted 22 January, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: minor corrections, to appear in J. Math. Analysis and its Applications

  30. arXiv:2004.04295  [pdf, ps, other]

    cs.CL

    Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

    Authors: Miguel Ballesteros, Rishita Anubhai, Shuai Wang, Nima Pourdamghani, Yogarshi Vyas, Jie Ma, Parminder Bhatia, Kathleen McKeown, Yaser Al-Onaizan

    Abstract: In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and they identify temporal relations (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on either pretrain…

    Submitted 8 April, 2020; originally announced April 2020.

  31. arXiv:2001.08279   

    cs.CL cs.AI cs.LG

    Transition-Based Dependency Parsing using Perceptron Learner

    Authors: Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking

    Abstract: Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora. In this paper, we tackle transition-based dependency parsing using a Perceptron Learner. Our proposed model, which adds more relevant features to the Perceptron Learn…

    Submitted 28 January, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: This was part of an assignment at my graduate course at LTI. This does not offer any major novelties

  32. arXiv:1907.03013  [pdf, other]

    math-ph

    One-Boson Scattering Processes in the Massless Spin-Boson Model -- A Non-Perturbative Formula

    Authors: Miguel Ballesteros, Dirk-André Deckert, Felix Hänle

    Abstract: In scattering experiments, physicists observe so-called resonances as peaks at certain energy values in the measured scattering cross sections per solid angle. These peaks are usually associated with certain scattering processes, e.g., emission, absorption, or excitation of certain particles and systems. On the other hand, mathematicians define resonances as poles of an analytic continuation of the…

    Submitted 15 May, 2020; v1 submitted 5 July, 2019; originally announced July 2019.

    Comments: 26 pages, 3 figures. arXiv admin note: text overlap with arXiv:1801.04843

  33. arXiv:1905.13370  [pdf, ps, other]

    cs.CL

    Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning

    Authors: Tahira Naseem, Abhishek Shah, Hui Wan, Radu Florian, Salim Roukos, Miguel Ballesteros

    Abstract: Our work involves enriching the Stack-LSTM transition-based AMR parser (Ballesteros and Al-Onaizan, 2017) by augmenting training with Policy Learning and rewarding the Smatch score of sampled graphs. In addition, we also combined several AMR-to-text alignments with an attention mechanism and we supplemented the parser with pre-processed concept identification, named entities and contextualized emb…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: Accepted as short paper at ACL 2019

  34. arXiv:1903.03260  [pdf, other]

    cs.CL

    Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State

    Authors: Richard Futrell, Ethan Wilcox, Takashi Morita, Peng Qian, Miguel Ballesteros, Roger Levy

    Abstract: We deploy the methods of controlled psycholinguistic experimentation to shed light on the extent to which the behavior of neural network language models reflects incremental representations of syntactic state. To do so, we examine model behavior on artificial sentences containing a variety of syntactically complex structures. We test four models: two publicly available LSTM sequence models of Engl…

    Submitted 7 March, 2019; originally announced March 2019.

    Comments: Accepted to NAACL 2019. Not yet edited into the camera-ready version

  35. arXiv:1903.00943  [pdf, other]

    cs.CL

    Structural Supervision Improves Learning of Non-Local Grammatical Dependencies

    Authors: Ethan Wilcox, Peng Qian, Richard Futrell, Miguel Ballesteros, Roger Levy

    Abstract: State-of-the-art LSTM language models trained on large corpora learn sequential contingencies in impressive detail and have been shown to acquire a number of non-local grammatical dependencies with some success. Here we investigate whether supervision with hierarchical structure enhances learning of a range of grammatical dependencies, a question that has previously been addressed only for subject…

    Submitted 6 April, 2019; v1 submitted 3 March, 2019; originally announced March 2019.

    Comments: To appear: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

  36. arXiv:1902.09781  [pdf, other]

    cs.CL

    Recursive Subtree Composition in LSTM-Based Dependency Parsing

    Authors: Miryam de Lhoneux, Miguel Ballesteros, Joakim Nivre

    Abstract: The need for tree structure modelling on top of sequence modelling is an open issue in neural dependency parsing. We investigate the impact of adding a tree layer on top of a sequential model by recursively composing subtree representations (composition) in a transition-based parser that uses features extracted by a BiLSTM. Composition seems superfluous with such a model, suggesting that BiLSTMs c…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: Accepted at NAACL 2019

  37. arXiv:1902.02848  [pdf, ps, other]

    math.OA

    Conditionally Free Reduced Products of Hilbert Spaces

    Authors: Octavio Arizmendi, Miguel Ballesteros, Francisco Torres-Ayala

    Abstract: We present a product of pairs of pointed Hilbert spaces that, in the context of Bożejko, Leinert and Speicher's theory of conditionally free probability, plays the role of the reduced free product of pointed Hilbert spaces, and thus gives a unified construction for the natural notions of independence defined by Muraki. We additionally provide important applications of this construction. We prove…

    Submitted 7 February, 2019; originally announced February 2019.

  38. arXiv:1810.09135  [pdf, ps, other]

    math-ph

    One-Boson Scattering Processes in the massive Spin-Boson Model

    Authors: Miguel Ballesteros, Dirk-André Deckert, Jérémy Faupin, Felix Hänle

    Abstract: The Spin-Boson model describes a two-level quantum system that interacts with a second-quantized boson scalar field. Recently the relation between the integral kernel of the scattering matrix and the resonance in this model has been established in [14] for the case of massless bosons. In the present work, we treat the massive case. On the one hand, one might rightfully expect that the massive case…

    Submitted 10 May, 2019; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: 49 pages

  39. arXiv:1806.03280  [pdf, other]

    cs.CL

    Multilingual Neural Machine Translation with Task-Specific Attention

    Authors: Graeme Blackwood, Miguel Ballesteros, Todd Ward

    Abstract: Multilingual machine translation addresses the task of translating between multiple source and target languages. We propose task-specific attention models, a simple but effective technique for improving the quality of sequence-to-sequence neural multilingual translation. Our approach seeks to retain as much of the parameter sharing generalization of NMT models as possible, while still allowing for…

    Submitted 8 June, 2018; originally announced June 2018.

    Comments: COLING 2018

  40. arXiv:1804.08915  [pdf, other]

    cs.CL

    Scheduled Multi-Task Learning: From Syntax to Translation

    Authors: Eliyahu Kiperwasser, Miguel Ballesteros

    Abstract: Neural encoder-decoder models of machine translation have achieved impressive results, while learning linguistic knowledge of both the source and target languages in an implicit end-to-end manner. We propose a framework in which our model begins learning syntax and translation interleaved, gradually putting more focus on translation. Using this approach, we achieve considerable improvements in ter…

    Submitted 24 April, 2018; originally announced April 2018.

    Journal ref: Transactions of the Association for Computational Linguistics, 6:225-240 (2018)

  41. arXiv:1804.05038  [pdf, ps, other]

    cs.CL

    Pieces of Eight: 8-bit Neural Machine Translation

    Authors: Jerry Quinn, Miguel Ballesteros

    Abstract: Neural machine translation has achieved levels of fluency and adequacy that would have been surprising a short time ago. Output quality is extremely relevant for industry purposes; however, it is equally important to produce results in the shortest time possible, mainly for latency-sensitive applications and to control cloud hosting costs. In this paper we show the effectiveness of translating with…

    Submitted 13 April, 2018; originally announced April 2018.

    Comments: To appear at NAACL 2018 Industry Track

  42. arXiv:1803.02392  [pdf, other]

    cs.CL

    Multimodal Emoji Prediction

    Authors: Francesco Barbieri, Miguel Ballesteros, Francesco Ronzano, Horacio Saggion

    Abstract: Emojis are small images that are commonly included in social media text messages. The combination of visual and textual content in the same message builds up a modern way of communication, that automatic systems are not used to deal with. In this paper we extend recent advances in emoji prediction by putting forward a multimodal approach that is able to predict emojis in Instagram posts. Instagram…

    Submitted 17 April, 2018; v1 submitted 6 March, 2018; originally announced March 2018.

    Comments: NAACL 2018 (short)

  43. Relation between the Resonance and the Scattering Matrix in the massless Spin-Boson Model

    Authors: Miguel Ballesteros, Dirk-André Deckert, Felix Hänle

    Abstract: We establish the precise relation between the integral kernel of the scattering matrix and the resonance in the massless Spin-Boson model which describes the interaction of a two-level quantum system with a second-quantized scalar field. For this purpose, we derive an explicit formula for the two-body scattering matrix. We impose an ultraviolet cut-off and assume a slightly less singular behavior…

    Submitted 23 February, 2019; v1 submitted 15 January, 2018; originally announced January 2018.

    Comments: 46 pages, 2 figures. arXiv admin note: text overlap with arXiv:1810.09135

    Journal ref: Commun. Math. Phys. (2019) 370: 249; The final publication is available at link.springer.com

  44. Analyticity of Resonances and Eigenvalues and Spectral Properties of the massless Spin-Boson Model

    Authors: Miguel Ballesteros, Dirk-André Deckert, Felix Hänle

    Abstract: We extend the method of multiscale analysis for resonances introduced in [5] in order to infer analytic properties of resonances and eigenvalues (and their eigenprojections) as well as estimates for the localization of the spectrum of dilated Hamiltonians and norm-bounds for the corresponding resolvent operators, in neighborhoods of resonances and eigenvalues. We apply our method to the massless S…

    Submitted 17 February, 2019; v1 submitted 11 January, 2018; originally announced January 2018.

    Comments: 47 pages, 3 figures

    Journal ref: Journal of Functional Analysis, Vol. 276(8), 2019, Pages 2524-2581

  45. arXiv:1709.03149  [pdf, ps, other]

    math-ph quant-ph

    Perturbation Theory for Weak Measurements in Quantum Mechanics, I -- Systems with Finite-Dimensional State Space

    Authors: M. Ballesteros, N. Crawford, M. Fraas, J. Fröhlich, B. Schubnel

    Abstract: The quantum theory of indirect measurements in physical systems is studied. The example of an indirect measurement of an observable represented by a self-adjoint operator $\mathcal{N}$ with finite spectrum is analysed in detail. The Hamiltonian generating the time evolution of the system in the absence of direct measurements is assumed to be given by the sum of a term commuting with $\mathcal{N}$…

    Submitted 10 September, 2017; originally announced September 2017.

    Comments: 42 pages

  46. arXiv:1709.00489  [pdf, ps, other]

    cs.CL

    Arc-Standard Spinal Parsing with Stack-LSTMs

    Authors: Miguel Ballesteros, Xavier Carreras

    Abstract: We present a neural transition-based parser for spinal trees, a dependency representation of constituent trees. The parser uses Stack-LSTMs that compose constituent nodes with dependency-based derivations. In experiments, we show that this model adapts to different styles of dependency relations, but this choice has little effect for predicting constituent structure, suggesting that LSTMs induce u…

    Submitted 1 September, 2017; originally announced September 2017.

    Comments: IWPT 2017

  47. arXiv:1707.07755  [pdf, ps, other]

    cs.CL

    AMR Parsing using Stack-LSTMs

    Authors: Miguel Ballesteros, Yaser Al-Onaizan

    Abstract: We present a transition-based AMR parser that directly generates AMR parses from plain text. We use Stack-LSTMs to represent our parser state and make decisions greedily. In our experiments, we show that our parser achieves very competitive scores on English using only AMR training data. Adding additional information, such as POS tags and dependency trees, improves the results further.

    Submitted 2 August, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

    Comments: EMNLP 2017

  48. arXiv:1706.09584  [pdf, ps, other]

    math-ph quant-ph

    Non-demolition measurements of observables with general spectra

    Authors: M. Ballesteros, N. Crawford, M. Fraas, J. Fröhlich, B. Schubnel

    Abstract: It has recently been established that, in a non-demolition measurement of an observable $\mathcal{N}$ with a finite point spectrum, the density matrix of the system approaches an eigenstate of $\mathcal{N}$, i.e., it "purifies" over the spectrum of $\mathcal{N}$. We extend this result to observables with general spectra. It is shown that the spectral density of the state of the system converges to…

    Submitted 29 June, 2017; originally announced June 2017.

    Comments: 22 pages

  49. arXiv:1702.07285  [pdf, other]

    cs.CL

    Are Emojis Predictable?

    Authors: Francesco Barbieri, Miguel Ballesteros, Horacio Saggion

    Abstract: Emojis are ideograms which are naturally combined with plain text to visually complement or condense the meaning of a message. Despite being widely used in social media, their underlying semantics have received little attention from a Natural Language Processing standpoint. In this paper, we investigate the relation between words and emojis, studying the novel task of predicting which emojis are e…

    Submitted 24 February, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

    Comments: To appear at EACL 2017

  50. arXiv:1701.03980  [pdf, other]

    stat.ML cs.CL cs.MS

    DyNet: The Dynamic Neural Network Toolkit

    Authors: Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

    Abstract: We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its deriva…

    Submitted 14 January, 2017; originally announced January 2017.

    Comments: 33 pages