Showing 1–50 of 96 results for author: Cohen, B

  1. arXiv:2405.20838  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    einspace: Searching for Neural Architectures from Fundamental Operations

    Authors: Linus Ericsson, Miguel Espinosa, Chenhongyi Yang, Antreas Antoniou, Amos Storkey, Shay B. Cohen, Steven McDonagh, Elliot J. Crowley

    Abstract: Neural architecture search (NAS) finds high performing networks for a given task. Yet the results of NAS are fairly prosaic; they did not e.g. create a shift from convolutional structures to transformers. This is not least because the search spaces in NAS often aren't diverse enough to include such transformations a priori. Instead, for NAS to provide greater potential for fundamental design shift…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Project page at https://linusericsson.github.io/einspace/

  2. arXiv:2405.09719  [pdf, other]

    cs.CL cs.AI cs.LG

    Spectral Editing of Activations for Large Language Model Alignment

    Authors: Yifu Qiu, Zheng Zhao, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen

    Abstract: Large language models (LLMs) often exhibit undesirable behaviours, such as generating untruthful or biased content. Editing their internal representations has been shown to be effective in mitigating such behaviours on top of the existing alignment methods. We propose a novel inference-time editing method, namely spectral editing of activations (SEA), to project the input representations into dire…

    Submitted 25 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  3. arXiv:2403.13312  [pdf, other]

    cs.CL

    LeanReasoner: Boosting Complex Logical Reasoning with Lean

    Authors: Dongwei Jiang, Marcio Fonseca, Shay B. Cohen

    Abstract: Large language models (LLMs) often struggle with complex logical reasoning due to logical inconsistencies and the inherent difficulty of such reasoning. We use Lean, a theorem proving framework, to address these challenges. By formalizing logical reasoning problems into theorems within Lean, we can solve them by proving or disproving the corresponding theorems. This method reduces the risk of logi…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted to NAACL 2024 main conference

  4. arXiv:2403.08828  [pdf, other]

    cs.HC cs.AI cs.RO

    People Attribute Purpose to Autonomous Vehicles When Explaining Their Behavior

    Authors: Balint Gyevnar, Stephanie Droop, Tadeg Quillien, Shay B. Cohen, Neil R. Bramley, Christopher G. Lucas, Stefano V. Albrecht

    Abstract: Cognitive science can help us understand which explanations people might expect, and in which format they frame these explanations, whether causal, counterfactual, or teleological (i.e., purpose-oriented). Understanding the relevance of these concepts is crucial for building good explainable AI (XAI) which offers recourse and actionability. Focusing on autonomous driving, a complex decision-making…

    Submitted 30 April, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  5. arXiv:2402.15055  [pdf, other]

    cs.CL cs.AI cs.LG

    Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions

    Authors: Clement Neo, Shay B. Cohen, Fazl Barez

    Abstract: In this paper, we investigate the interplay between attention heads and specialized "next-token" neurons in the Multilayer Perceptron that predict specific tokens. By prompting an LLM like GPT-4 to explain these model internals, we can elucidate attention mechanisms that activate certain next-token neurons. Our analysis identifies attention heads that recognize contexts relevant to predicting a pa…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 15 pages, 11 figures

  6. arXiv:2402.10643  [pdf, other]

    cs.CL cs.AI

    `Keep it Together': Enforcing Cohesion in Extractive Summaries by Simulating Human Memory

    Authors: Ronald Cardenas, Matthias Galle, Shay B. Cohen

    Abstract: Extractive summaries are usually presented as lists of sentences with no expected cohesion between them. In this paper, we aim to enforce cohesion whilst controlling for informativeness and redundancy in summaries, in cases where the input exhibits high redundancy. The pipeline controls for redundancy in long inputs as it is consumed, and balances informativeness and cohesion during sentence selec…

    Submitted 16 February, 2024; originally announced February 2024.

  7. arXiv:2401.10415  [pdf, other]

    cs.CL cs.AI

    Can Large Language Model Summarizers Adapt to Diverse Scientific Communication Goals?

    Authors: Marcio Fonseca, Shay B. Cohen

    Abstract: In this work, we investigate the controllability of large language models (LLMs) on scientific summarization tasks. We identify key stylistic and content coverage factors that characterize different types of summaries such as paper reviews, abstracts, and lay summaries. By controlling stylistic features, we find that non-fine-tuned LLMs outperform humans in the MuP review generation task, both in…

    Submitted 27 June, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: ACL 2024 camera ready

  8. arXiv:2401.07353  [pdf, other]

    cs.SE cs.AI cs.LG

    Towards Engineering Fair and Equitable Software Systems for Managing Low-Altitude Airspace Authorizations

    Authors: Usman Gohar, Michael C. Hunter, Agnieszka Marczak-Czajka, Robyn R. Lutz, Myra B. Cohen, Jane Cleland-Huang

    Abstract: Small Unmanned Aircraft Systems (sUAS) have gained widespread adoption across a diverse range of applications. This has introduced operational complexities within shared airspaces and an increase in reported incidents, raising safety concerns. In response, the U.S. Federal Aviation Administration (FAA) is developing a UAS Traffic Management (UTM) system to control access to airspace based on an sU…

    Submitted 3 February, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

    Journal ref: ICSE-SEIS 2024

  9. arXiv:2401.01814  [pdf, other]

    cs.AI

    Large Language Models Relearn Removed Concepts

    Authors: Michelle Lo, Shay B. Cohen, Fazl Barez

    Abstract: Advances in model editing through neuron pruning hold promise for removing undesirable concepts from large language models. However, it remains unclear whether models have the capacity to reacquire pruned concepts after editing. To investigate this, we evaluate concept relearning in models by tracking concept saliency and similarity in pruned neurons during retraining. Our findings reveal that mod…

    Submitted 3 January, 2024; originally announced January 2024.

  10. arXiv:2312.03480  [pdf, other]

    cs.CL

    AMR Parsing is Far from Solved: GrAPES, the Granular AMR Parsing Evaluation Suite

    Authors: Jonas Groschwitz, Shay B. Cohen, Lucia Donatelli, Meaghan Fowlie

    Abstract: We present the Granular AMR Parsing Evaluation Suite (GrAPES), a challenge set for Abstract Meaning Representation (AMR) parsing with accompanying evaluation metrics. AMR parsers now obtain high scores on the standard AMR evaluation metric Smatch, close to or even above reported inter-annotator agreement. But that does not mean that AMR parsing is solved; in fact, human evaluation in previous work…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted at EMNLP 2023. For the associated GitHub repository, see https://github.com/jgroschwitz/GrAPES

    ACM Class: J.5

  11. arXiv:2311.09467  [pdf, other]

    cs.CL cs.AI

    Think While You Write: Hypothesis Verification Promotes Faithful Knowledge-to-Text Generation

    Authors: Yifu Qiu, Varun Embar, Shay B. Cohen, Benjamin Han

    Abstract: Knowledge-to-text generators often struggle to faithfully generate descriptions for the input facts: they may produce hallucinations that contradict the input, or describe facts not present in the input. To reduce hallucinations, we propose a decoding-only method, TWEAK (Think While Effectively Articulating Knowledge), which can be integrated with any generator without retraining. TWEAK treats the…

    Submitted 3 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 (Findings)

  12. arXiv:2311.08704  [pdf, other]

    cs.CL cs.AI

    Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial Domains

    Authors: Marcio Fonseca, Shay B. Cohen

    Abstract: Although large language models (LLMs) exhibit remarkable capacity to leverage in-context demonstrations, it is still unclear to what extent they can learn new concepts or facts from ground-truth labels. To address this question, we examine the capacity of instruction-tuned LLMs to follow in-context concept guidelines for sentence labeling tasks. We design guidelines that present different types of…

    Submitted 26 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACL 2024 camera ready

  13. arXiv:2311.08398  [pdf, other]

    cs.CL cs.AI

    Are Large Language Models Temporally Grounded?

    Authors: Yifu Qiu, Zheng Zhao, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen

    Abstract: Are Large language models (LLMs) temporally grounded? Since LLMs cannot perceive and interact with the environment, it is impossible to answer this question directly. Instead, we provide LLMs with textual narratives and probe them with respect to their common-sense knowledge of the structure and duration of events, their ability to order events along a timeline, and self-consistency within their t…

    Submitted 16 November, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

  14. arXiv:2310.15513  [pdf, other]

    cs.CL

    A Joint Matrix Factorization Analysis of Multilingual Representations

    Authors: Zheng Zhao, Yftah Ziser, Bonnie Webber, Shay B. Cohen

    Abstract: We present an analysis tool based on joint matrix factorization for comparing latent representations of multilingual and monolingual models. An alternative to probing, this tool allows us to analyze multiple sets of representations in a joint manner. Using this tool, we study to what extent and how morphosyntactic features are reflected in the representations learned by multilingual pre-trained mo…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  15. HIFuzz: Human Interaction Fuzzing for small Unmanned Aerial Vehicles

    Authors: Theodore Chambers, Michael Vierhauser, Ankit Agrawal, Michael Murphy, Jason Matthew Brauer, Salil Purandare, Myra B. Cohen, Jane Cleland-Huang

    Abstract: Small Unmanned Aerial Systems (sUAS) must meet rigorous safety standards when deployed in high-stress emergency response scenarios; however many reported accidents have involved humans in the loop. In this paper, we, therefore, present the HiFuzz testing framework, which uses fuzz testing to identify system vulnerabilities associated with human interactions. HiFuzz includes three distinct levels t…

    Submitted 7 April, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

  16. arXiv:2305.19734  [pdf, other]

    cs.AI cs.CL cs.DB

    Knowledge Base Question Answering for Space Debris Queries

    Authors: Paul Darm, Antonio Valerio Miceli-Barone, Shay B. Cohen, Annalisa Riccardi

    Abstract: Space agencies execute complex satellite operations that need to be supported by the technical knowledge contained in their extensive information systems. Knowledge bases (KB) are an effective way of storing and accessing such information at scale. In this work we present a system, developed for the European Space Agency (ESA), that can answer complex natural language queries, to support engineers…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 7 pages, ACL 2023 industry track

    ACM Class: I.2.7

  17. arXiv:2305.16947  [pdf, other]

    cs.CL

    Sentence-Incremental Neural Coreference Resolution

    Authors: Matt Grenander, Shay B. Cohen, Mark Steedman

    Abstract: We propose a sentence-incremental neural coreference resolution system which incrementally builds clusters after marking mention boundaries in a shift-reduce method. The system is aimed at bridging two recent approaches at coreference resolution: (1) state-of-the-art non-incremental models that incur quadratic complexity in document length with high computational cost, and (2) memory network-based…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2022

  18. arXiv:2305.15507  [pdf, other]

    cs.CL cs.AI

    The Larger They Are, the Harder They Fail: Language Models do not Recognize Identifier Swaps in Python

    Authors: Antonio Valerio Miceli-Barone, Fazl Barez, Ioannis Konstas, Shay B. Cohen

    Abstract: Large Language Models (LLMs) have successfully been applied to code generation tasks, raising the question of how well these models understand programming. Typical programming languages have invariances and equivariances in their semantics that human programmers intuitively understand and exploit, such as the (near) invariance to the renaming of identifiers. We show that LLMs not only fail to prop…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 17 pages, 5 figure, ACL 2023

  19. arXiv:2305.13632  [pdf, other]

    cs.CL cs.AI cs.LG

    Detecting and Mitigating Hallucinations in Multilingual Summarisation

    Authors: Yifu Qiu, Yftah Ziser, Anna Korhonen, Edoardo M. Ponti, Shay B. Cohen

    Abstract: Hallucinations pose a significant challenge to the reliability of neural models for abstractive summarisation. While automatically generated summaries may be fluent, they often lack faithfulness to the original document. This issue becomes even more pronounced in low-resource settings, such as cross-lingual transfer. With the existing faithful metrics focusing on English, even measuring the extent…

    Submitted 26 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  20. arXiv:2305.08828  [pdf, other]

    cs.CL

    PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India

    Authors: Ashok Urlana, Pinzhen Chen, Zheng Zhao, Shay B. Cohen, Manish Shrivastava, Barry Haddow

    Abstract: This paper introduces PMIndiaSum, a multilingual and massively parallel summarization corpus focused on languages in India. Our corpus provides a training and testing ground for four language families, 14 languages, and the largest to date with 196 language pairs. We detail our construction workflow including data acquisition, processing, and quality assurance. Furthermore, we publish benchmarks f…

    Submitted 19 October, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Findings of EMNLP 2023

    ACM Class: I.2.7

  21. arXiv:2302.10809  [pdf, other]

    cs.AI cs.RO

    Causal Explanations for Sequential Decision-Making in Multi-Agent Systems

    Authors: Balint Gyevnar, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht

    Abstract: We present CEMA: Causal Explanations in Multi-Agent systems; a framework for creating causal natural language explanations of an agent's decisions in dynamic sequential multi-agent systems to build more trustworthy autonomous agents. Unlike prior work that assumes a fixed causal structure, CEMA only requires a probabilistic model for forward-simulating the state of the system. Using such a model,…

    Submitted 14 February, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted in 23rd International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), 2024

    ACM Class: I.2.9

  22. arXiv:2302.09350  [pdf, other]

    cs.CL

    BERT is not The Count: Learning to Match Mathematical Statements with Proofs

    Authors: Weixian Waylon Li, Yftah Ziser, Maximin Coavoux, Shay B. Cohen

    Abstract: We introduce a task consisting in matching a proof to a given mathematical statement. The task fits well within current research on Mathematical Information Retrieval and, more generally, mathematical article analysis (Mathematical Sciences, 2014). We present a dataset for the task (the MATcH dataset) consisting of over 180k statement-proof pairs extracted from modern mathematical research article…

    Submitted 18 February, 2023; originally announced February 2023.

    Comments: Accepted to the Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023; 14 pages. arXiv admin note: substantial text overlap with arXiv:2102.02110

  23. arXiv:2211.13807  [pdf, other]

    cs.CV

    GEFF: Improving Any Clothes-Changing Person ReID Model using Gallery Enrichment with Face Features

    Authors: Daniel Arkushin, Bar Cohen, Shmuel Peleg, Ohad Fried

    Abstract: In the Clothes-Changing Re-Identification (CC-ReID) problem, given a query sample of a person, the goal is to determine the correct identity based on a labeled gallery in which the person appears in different clothes. Several models tackle this challenge by extracting clothes-independent features. However, the performance of these models is still lower for the clothes-changing setting compared to…

    Submitted 21 November, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  24. arXiv:2211.09458  [pdf, other]

    cs.CL

    Abstractive Summarization Guided by Latent Hierarchical Document Structure

    Authors: Yifu Qiu, Shay B. Cohen

    Abstract: Sequential abstractive neural summarizers often do not use the underlying structure in the input article or dependencies between the input sentences. This structure is essential to integrate and consolidate information from different parts of the text. To address this shortcoming, we propose a hierarchy-aware graph neural network (HierGNN) which captures such dependencies through three main steps:…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022, 15 pages

  25. arXiv:2210.12553  [pdf, other]

    cs.CL cs.LG

    Understanding Domain Learning in Language Models Through Subpopulation Analysis

    Authors: Zheng Zhao, Yftah Ziser, Shay B. Cohen

    Abstract: We investigate how different domains are encoded in modern neural network architectures. We analyze the relationship between natural language domains, model size, and the amount of training data used. The primary analysis tool we develop is based on subpopulation analysis with Singular Vector Canonical Correlation Analysis (SVCCA), which we apply to Transformer-based language models (LMs). We comp…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: Accepted to BlackboxNLP 2022

  26. A Human-Centric Method for Generating Causal Explanations in Natural Language for Autonomous Vehicle Motion Planning

    Authors: Balint Gyevnar, Massimiliano Tamborski, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht

    Abstract: Inscrutable AI systems are difficult to trust, especially if they operate in safety-critical settings like autonomous driving. Therefore, there is a need to build transparent and queryable systems to increase trust levels. We propose a transparent, human-centric explanation generation method for autonomous vehicle motion planning and prediction based on an existing white-box system called IGP2. Ou…

    Submitted 27 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: IJCAI Workshop on Artificial Intelligence for Autonomous Driving (AI4AD), 2022

  27. arXiv:2205.12486  [pdf, other]

    cs.CL

    Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents

    Authors: Marcio Fonseca, Yftah Ziser, Shay B. Cohen

    Abstract: We argue that disentangling content selection from the budget used to cover salient content improves the performance and applicability of abstractive summarizers. Our method, FactorSum, does this disentanglement by factorizing summarization into two steps through an energy function: (1) generation of abstractive summary views; (2) combination of these views into a final summary, following a budget…

    Submitted 26 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022 camera ready

  28. On the Trade-off between Redundancy and Local Coherence in Summarization

    Authors: Ronald Cardenas, Matthias Galle, Shay B. Cohen

    Abstract: Extractive summaries are usually presented as lists of sentences with no expected cohesion between them and with plenty of redundant information if not accounted for. In this paper, we investigate the trade-offs incurred when aiming to control for inter-sentential cohesion and redundancy in extracted summaries, and their impact on their informativeness. As case study, we focus on the summarization…

    Submitted 6 June, 2024; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted to JAIR

    Journal ref: Journal of Artificial Intelligence Research, 80, 273-326 (2024)

  29. arXiv:2203.07893  [pdf, other]

    cs.CL cs.LG

    Gold Doesn't Always Glitter: Spectral Removal of Linear and Nonlinear Guarded Attribute Information

    Authors: Shun Shao, Yftah Ziser, Shay B. Cohen

    Abstract: We describe a simple and effective method (Spectral Attribute removaL; SAL) to remove private or guarded information from neural representations. Our method uses matrix decomposition to project the input representations into directions with reduced covariance with the guarded information rather than maximal covariance as factorization methods normally use. We begin with linear information removal…

    Submitted 20 April, 2023; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: Accepted to the Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023; 12 pages (minor formatting corrections)

  30. arXiv:2110.02283  [pdf, other]

    cs.CL cs.AI cs.LG

    Co-training an Unsupervised Constituency Parser with Weak Supervision

    Authors: Nickil Maveli, Shay B. Cohen

    Abstract: We introduce a method for unsupervised parsing that relies on bootstrapping classifiers to identify if a node dominates a specific span in a sentence. There are two types of classifiers, an inside classifier that acts on a span, and an outside classifier that acts on everything outside of a given span. Through self-training and co-training with the two classifiers, we show that the interplay betwe…

    Submitted 18 March, 2022; v1 submitted 5 October, 2021; originally announced October 2021.

    Comments: Accepted to Findings of ACL 2022

  31. arXiv:2108.12075  [pdf, other]

    cs.SE cs.CR

    HyperGI: Automated Detection and Repair of Information Flow Leakage

    Authors: Ibrahim Mesecan, Daniel Blackwell, David Clark, Myra B. Cohen, Justyna Petke

    Abstract: Maintaining confidential information control in software is a persistent security problem where failure means secrets can be revealed via program behaviors. Information flow control techniques traditionally have been based on static or symbolic analyses -- limited in scalability and specialized to particular languages. When programs do leak secrets there are no approaches to automatically repair t…

    Submitted 26 August, 2021; originally announced August 2021.

  32. arXiv:2104.08392  [pdf, other]

    cs.CL

    Unsupervised Extractive Summarization by Human Memory Simulation

    Authors: Ronald Cardenas, Matthias Galle, Shay B. Cohen

    Abstract: Summarization systems face the core challenge of identifying and selecting important information. In this paper, we tackle the problem of content selection in unsupervised extractive summarization of long, structured documents. We introduce a wide range of heuristics that leverage cognitive representations of content units and how these are retained or forgotten in human memory. We find that prope…

    Submitted 16 April, 2021; originally announced April 2021.

  33. arXiv:2102.02110  [pdf, other]

    cs.CL

    Learning to Match Mathematical Statements with Proofs

    Authors: Maximin Coavoux, Shay B. Cohen

    Abstract: We introduce a novel task consisting in assigning a proof to a given mathematical statement. The task is designed to improve the processing of research-level mathematical texts. Applying Natural Language Processing (NLP) tools to research level mathematical articles is both challenging, since it is a highly specialized domain which mixes natural language and mathematical formulae. It is also an im…

    Submitted 3 February, 2021; originally announced February 2021.

  34. arXiv:2101.06803  [pdf, other]

    cs.CL

    Narration Generation for Cartoon Videos

    Authors: Nikos Papasarantopoulos, Shay B. Cohen

    Abstract: Research on text generation from multimodal inputs has largely focused on static images, and less on video data. In this paper, we propose a new task, narration generation, that is complementing videos with narration texts that are to be interjected in several places. The narrations are part of the video and contribute to the storyline unfolding in it. Moreover, they are context-informed, since th…

    Submitted 17 January, 2021; originally announced January 2021.

  35. arXiv:2011.06572  [pdf, ps, other]

    math.OC cs.DS cs.LG

    Relative Lipschitzness in Extragradient Methods and a Direct Recipe for Acceleration

    Authors: Michael B. Cohen, Aaron Sidford, Kevin Tian

    Abstract: We show that standard extragradient methods (i.e. mirror prox and dual extrapolation) recover optimal accelerated rates for first-order minimization of smooth convex functions. To obtain this result we provide a fine-grained characterization of the convergence rates of extragradient methods for solving monotone variational inequalities in terms of a natural condition we call relative Lipschitzness…

    Submitted 14 July, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: 32 pages. This is the full version of a paper appearing in ITCS 2021. v2 addresses reviewer comments and adds citations

  36. arXiv:2010.12676  [pdf, other]

    cs.CL cs.LG

    A Differentiable Relaxation of Graph Segmentation and Alignment for AMR Parsing

    Authors: Chunchuan Lyu, Shay B. Cohen, Ivan Titov

    Abstract: Abstract Meaning Representations (AMR) are a broad-coverage semantic formalism which represents sentence meaning as a directed acyclic graph. To train most AMR parsers, one needs to segment the graph into subgraphs and align each such subgraph to a word in a sentence; this is normally done at preprocessing, relying on hand-crafted rules. In contrast, we treat both alignment and segmentation as lat…

    Submitted 24 October, 2022; v1 submitted 23 October, 2020; originally announced October 2020.

  37. arXiv:2010.04383  [pdf, other]

    cs.CL

    Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation

    Authors: Yan Zhang, Zhijiang Guo, Zhiyang Teng, Wei Lu, Shay B. Cohen, Zuozhu Liu, Lidong Bing

    Abstract: AMR-to-text generation is used to transduce Abstract Meaning Representation structures (AMR) into text. A key challenge in this task is to efficiently learn effective graph representations. Previously, Graph Convolution Networks (GCNs) were used to encode input AMRs, however, vanilla GCNs are not able to capture non-local information and additionally, they follow a local (first-order) information…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: Accepted to EMNLP 2020, long paper

  38. arXiv:2009.13312  [pdf, other]

    cs.CL

    Reducing Quantity Hallucinations in Abstractive Summarization

    Authors: Zheng Zhao, Shay B. Cohen, Bonnie Webber

    Abstract: It is well-known that abstractive summaries are subject to hallucination---including material that is not supported by the original text. While summaries can be made hallucination-free by limiting them to general phrases, such summaries would fail to be very informative. Alternatively, one can try to avoid hallucinations by verifying that any specific entities in the summary appear in the original…

    Submitted 28 September, 2020; originally announced September 2020.

    Comments: Accepted to Findings of EMNLP 2020

  39. arXiv:2008.07648  [pdf, other]

    cs.LG stat.ML

    Nonparametric Learning of Two-Layer ReLU Residual Units

    Authors: Zhunxuan Wang, Linyun He, Chunchuan Lyu, Shay B. Cohen

    Abstract: We describe an algorithm that learns two-layer residual units using rectified linear unit (ReLU) activation: suppose the input $\mathbf{x}$ is from a distribution with support space $\mathbb{R}^d$ and the ground-truth generative model is a residual unit of this type, given by $\mathbf{y} = \boldsymbol{B}^\ast\left[\left(\boldsymbol{A}^\ast\mathbf{x}\right)^+ + \mathbf{x}\right]$, where ground-trut…

    Submitted 10 December, 2022; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: Published in Transactions on Machine Learning Research (11/2022), slightly typographically revised

  40. arXiv:2007.15987  [pdf, other]

    cs.SE cs.AI cs.NE

    Genetic Improvement @ ICSE 2020

    Authors: William B. Langdon, Westley Weimer, Justyna Petke, Erik Fredericks, Seongmin Lee, Emily Winter, Michail Basios, Myra B. Cohen, Aymeric Blot, Markus Wagner, Bobby R. Bruce, Shin Yoo, Simos Gerasimou, Oliver Krauss, Yu Huang, Michael Gerten

    Abstract: Following Prof. Mark Harman of Facebook's keynote and formal presentations (which are recorded in the proceedings) there was a wide ranging discussion at the eighth international Genetic Improvement workshop, GI-2020 @ ICSE (held as part of the 42nd ACM/IEEE International Conference on Software Engineering on Friday 3rd July 2020). Topics included industry take up, human factors, explainability (…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 7 pages, 2 figures. Write up of GI @ ICSE 2020 workshop. Submitted to ACM SIGSOFT Software Engineering Notes

  41. arXiv:2004.11054  [pdf, other]

    cs.CL cs.LG cs.NE

    Learning Dialog Policies from Weak Demonstrations

    Authors: Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen

    Abstract: Deep reinforcement learning is a promising approach to training a dialog manager, but current methods struggle with the large state and action spaces of multi-domain dialog systems. Building upon Deep Q-learning from Demonstrations (DQfD), an algorithm that scores highly in difficult Atari games, we leverage dialog data to guide the agent to successfully respond to a user's requests. We make progr…

    Submitted 13 August, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: 9 pages + 2 pages references + 1 page appendices, 6 figures, 2 tables, 1 algorithm, accepted as long paper at ACL2020

  42. Multi-Step Inference for Reasoning Over Paragraphs

    Authors: Jiangming Liu, Matt Gardner, Shay B. Cohen, Mirella Lapata

    Abstract: Complex reasoning over text requires understanding and chaining together free-form predicates and logical connectives. Prior work has largely tried to do this either symbolically or with black-box transformers. We present a middle ground between these two extremes: a compositional model reminiscent of neural module networks that can perform chained logical reasoning. This model first finds relevan…

    Submitted 7 June, 2021; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: accepted by EMNLP 2020

  43. arXiv:2002.01365  [pdf, other

    cs.CL cs.AI cs.LG

    Compositional Languages Emerge in a Neural Iterated Learning Model

    Authors: Yi Ren, Shangmin Guo, Matthieu Labeau, Shay B. Cohen, Simon Kirby

    Abstract: The principle of compositionality, which enables natural language to represent complex concepts via a structured combination of simpler ones, allows us to convey an open-ended set of messages using a limited vocabulary. If compositionality is indeed a natural property of language, we may expect it to appear in communication protocols that are created by neural agents in language games. In this pap…

    Submitted 17 February, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

    Comments: accepted by ICLR-2020

    Journal ref: ICLR-2020

  44. arXiv:1909.03285  [pdf, other

    cs.CL

    Semantic Role Labeling with Iterative Structure Refinement

    Authors: Chunchuan Lyu, Shay B. Cohen, Ivan Titov

    Abstract: Modern state-of-the-art Semantic Role Labeling (SRL) methods rely on expressive sentence encoders (e.g., multi-layer LSTMs) but tend to model only local (if any) interactions between individual argument labeling decisions. This contrasts with earlier work and also with the intuition that the labels of individual arguments are strongly interdependent. We model interactions between argument labeling…

    Submitted 7 September, 2019; originally announced September 2019.

    Journal ref: EMNLP 2019

  45. arXiv:1907.08722  [pdf, ps, other

    cs.CL

    What is this Article about? Extreme Summarization with Topic-aware Convolutional Neural Networks

    Authors: Shashi Narayan, Shay B. Cohen, Mirella Lapata

    Abstract: We introduce 'extreme summarization', a new single-document summarization task which aims at creating a short, one-sentence news summary answering the question "What is the article about?". We argue that extreme summarization, by nature, is not amenable to extractive strategies and requires an abstractive modeling approach. In the hope of driving research on this task further: (a) we collect a r…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: Accepted to appear in Journal of Artificial Intelligence Research (JAIR), 37 pages

  46. arXiv:1905.11580  [pdf, ps, other

    cs.DS

    A near-optimal algorithm for approximating the John Ellipsoid

    Authors: Michael B. Cohen, Ben Cousins, Yin Tat Lee, Xin Yang

    Abstract: We develop a simple and efficient algorithm for approximating the John Ellipsoid of a symmetric polytope. Our algorithm is near optimal in the sense that our time complexity matches the current best verification algorithm. We also provide the MATLAB code for further research.

    Submitted 18 February, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: COLT 2019

  47. arXiv:1904.09585  [pdf, other

    cs.CL cs.AI cs.LG stat.ML

    Obfuscation for Privacy-preserving Syntactic Parsing

    Authors: Zhifeng Hu, Serhii Havrylov, Ivan Titov, Shay B. Cohen

    Abstract: The goal of homomorphic encryption is to encrypt data such that another party can operate on it without being explicitly exposed to the content of the original data. We introduce an idea for a privacy-preserving transformation on natural language data, inspired by homomorphic encryption. Our primary tool is obfuscation, relying on the properties of natural language. Specifically, a given Eng…

    Submitted 27 May, 2020; v1 submitted 21 April, 2019; originally announced April 2019.

    Comments: Accepted to IWPT 2020

  48. arXiv:1904.02020  [pdf, other

    cs.IR cs.CL cs.LG

    Jointly Extracting and Compressing Documents with Summary State Representations

    Authors: Afonso Mendes, Shashi Narayan, Sebastião Miranda, Zita Marinho, André F. T. Martins, Shay B. Cohen

    Abstract: We present a new neural model for text summarization that first extracts sentences from a document and then compresses them. The proposed model offers a balance that sidesteps the difficulties in abstractive methods while generating more concise summaries than extractive methods. In addition, our model dynamically determines the length of the output summary based on the gold summaries it observes…

    Submitted 5 April, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

    Journal ref: NAACL 2019

  49. arXiv:1904.00615  [pdf, other

    cs.CL

    Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle

    Authors: Maximin Coavoux, Shay B. Cohen

    Abstract: We introduce a novel transition system for discontinuous constituency parsing. Instead of storing subtrees in a stack --i.e. a data structure with linear-time sequential access-- the proposed system uses a set of parsing items, with constant-time random access. This change makes it possible to construct any discontinuous constituency tree in exactly $4n - 2$ transitions for a sentence of length…

    Submitted 1 April, 2019; originally announced April 2019.

    Comments: Accepted for publication at NAACL 2019; 14 pages

  50. arXiv:1903.11410  [pdf, ps, other

    cs.CL

    Structural Neural Encoders for AMR-to-text Generation

    Authors: Marco Damonte, Shay B. Cohen

    Abstract: AMR-to-text generation is a problem recently introduced to the NLP community, in which the goal is to generate sentences from Abstract Meaning Representation (AMR) graphs. Sequence-to-sequence models can be used to this end by converting the AMR graphs to strings. Approaching the problem while working directly with graphs requires the use of graph-to-sequence models that encode the AMR graph into…

    Submitted 20 May, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

    Comments: Proceedings of NAACL 2019