
Showing 1–12 of 12 results for author: Hosseini, M J

  1. arXiv:2406.19803

    cs.CL

    Scalable and Domain-General Abstractive Proposition Segmentation

    Authors: Mohammad Javad Hosseini, Yang Gao, Tim Baumgärtner, Alex Fabrikant, Reinald Kim Amplayo

    Abstract: Segmenting text into fine-grained units of meaning is important to a wide range of NLP applications. The default approach of segmenting text into sentences is often insufficient, especially since sentences are usually complex enough to include multiple units of meaning that merit separate treatment in the downstream task. We focus on the task of abstractive proposition segmentation: transforming t…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2402.12368

    cs.CL

    A synthetic data approach for domain generalization of NLI models

    Authors: Mohammad Javad Hosseini, Andrey Petrov, Alex Fabrikant, Annie Louis

    Abstract: Natural Language Inference (NLI) remains an important benchmark task for LLMs. NLI datasets are a springboard for transfer learning to other semantic tasks, and NLI models are standard tools for identifying the faithfulness of model-generated text. There are several large scale NLI datasets today, and models have improved greatly by hill-climbing on these collections. Yet their realistic performan…

    Submitted 28 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  3. arXiv:2305.19585

    cs.CL cs.LG

    LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction

    Authors: Jeremiah Milbauer, Annie Louis, Mohammad Javad Hosseini, Alex Fabrikant, Donald Metzler, Tal Schuster

    Abstract: Transformer encoders contextualize token representations by attending to all other tokens at each layer, leading to quadratic increase in compute effort with the input length. In practice, however, the input text of many NLP tasks can be seen as a sequence of related segments (e.g., the sequence of sentences within a passage, or the hypothesis and premise in NLI). While attending across these segm…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  4. arXiv:2305.14552

    cs.CL cs.AI

    Sources of Hallucination by Large Language Models on Inference Tasks

    Authors: Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman

    Abstract: Large Language Models (LLMs) are claimed to be capable of Natural Language Inference (NLI), necessary for applied tasks like question answering and summarization. We present a series of behavioral studies on several LLM families (LLaMA, GPT-3.5, and PaLM) which probe their behavior using controlled experiments. We establish two biases originating from pretraining which predict much of their behavi…

    Submitted 22 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Findings of EMNLP 2023

  5. arXiv:2212.10933

    cs.CL

    Resolving Indirect Referring Expressions for Entity Selection

    Authors: Mohammad Javad Hosseini, Filip Radlinski, Silvia Pareti, Annie Louis

    Abstract: Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address this problem of reference resolution, when people use natural expressions to choose between the entities. For example, given the choice 'Should we make a Simnel cake or a Pandan cake?' a natural res…

    Submitted 26 May, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

  6. arXiv:2210.04695

    cs.CL

    Language Models Are Poor Learners of Directional Inference

    Authors: Tianyi Li, Mohammad Javad Hosseini, Sabine Weber, Mark Steedman

    Abstract: We examine LMs' competence of directional predicate entailments by supervised fine-tuning with prompts. Our analysis shows that contrary to their apparent success on standard NLI, LMs show limited ability to learn such directional inference; moreover, existing datasets fail to test directionality, and/or are infested by artefacts that can be learnt as proxy for entailments, yielding over-optimisti…

    Submitted 14 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022

  7. Organic log-domain integrator synapse

    Authors: Mohammad Javad Mirshojaeian Hosseini, Elisa Donati, Giacomo Indiveri, Robert A. Nawrocki

    Abstract: Synapses play a critical role in memory, learning, and cognition. Their main functions include converting pre-synaptic voltage spikes to post-synaptic currents, as well as scaling the input signal. Several brain-inspired architectures have been proposed to emulate the behavior of biological synapses. While these are useful to explore the properties of nervous systems, the challenge of making bioco…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted by Advanced Electronic Materials (18 pages, 17 figures)

  8. arXiv:2203.06264

    cs.CL

    Cross-lingual Inference with A Chinese Entailment Graph

    Authors: Tianyi Li, Sabine Weber, Mohammad Javad Hosseini, Liane Guillou, Mark Steedman

    Abstract: Predicate entailment detection is a crucial task for question-answering from text, where previous work has explored unsupervised learning of entailment graphs from typed open relation triples. In this paper, we present the first pipeline for building Chinese entailment graphs, which involves a novel high-recall open relation extraction (ORE) method and the first Chinese fine-grained entity typing…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted to Findings of ACL 2022

  9. arXiv:2109.09412

    cs.CL

    Incorporating Temporal Information in Entailment Graph Mining

    Authors: Liane Guillou, Sander Bijl de Vroe, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman

    Abstract: We present a novel method for injecting temporality into entailment graphs to address the problem of spurious entailments, which may arise from similar but temporally distinct events involving the same pair of entities. We focus on the sports domain in which the same pairs of teams play on different occasions, with different outcomes. We present an unsupervised model that aims to learn entailments…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: L. Guillou, S. Bijl de Vroe, M.J. Hosseini, M. Johnson, and M. Steedman. 2020. Incorporating temporal information in entailment graph mining. In Proceedings of the Graph-based Methods for Natural Language Processing (TextGraphs), pages 60-71, Barcelona, Spain (Online). Association for Computational Linguistics

    Journal ref: In Proceedings of TextGraphs 2020, pages 60-71, Barcelona, Spain (Online)

  10. arXiv:2104.07846

    cs.CL

    Multivalent Entailment Graphs for Question Answering

    Authors: Nick McKenna, Liane Guillou, Mohammad Javad Hosseini, Sander Bijl de Vroe, Mark Johnson, Mark Steedman

    Abstract: Drawing inferences between open-domain natural language predicates is a necessity for true language understanding. There has been much progress in unsupervised learning of entailment graphs for this purpose. We make three contributions: (1) we reinterpret the Distributional Inclusion Hypothesis to model entailment between predicates of different valencies, like DEFEAT(Biden, Trump) entails WIN(Bid…

    Submitted 19 September, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP 2021

  11. arXiv:2008.09236

    cs.CL

    Spatial Language Representation with Multi-Level Geocoding

    Authors: Sayali Kulkarni, Shailee Jain, Mohammad Javad Hosseini, Jason Baldridge, Eugene Ie, Li Zhang

    Abstract: We present a multi-level geocoding model (MLG) that learns to associate texts to geographic locations. The Earth's surface is represented using space-filling curves that decompose the sphere into a hierarchy of similarly sized, non-overlapping cells. MLG balances generalization and accuracy by combining losses across multiple levels and predicting cells at each level simultaneously. Without using…

    Submitted 20 August, 2020; originally announced August 2020.

  12. arXiv:1908.08672

    cs.CL

    Jointly Modeling Hierarchical and Horizontal Features for Relational Triple Extraction

    Authors: Zhepei Wei, Yantao Jia, Yuan Tian, Mohammad Javad Hosseini, Sujian Li, Mark Steedman, Yi Chang

    Abstract: Recent works on relational triple extraction have shown the superiority of jointly extracting entities and relations over the pipelined extraction manner. However, most existing joint models fail to balance the modeling of entity features and the joint decoding strategy, and thus the interactions between the entity level and triple level are not fully investigated. In this work, we first introduce…

    Submitted 3 May, 2022; v1 submitted 23 August, 2019; originally announced August 2019.

    Comments: 20 pages, 5 figures