
Showing 1–4 of 4 results for author: Roof, B

  1. arXiv:2208.08000

    cs.CL

    DeeperDive: The Unreasonable Effectiveness of Weak Supervision in Document Understanding A Case Study in Collaboration with UiPath Inc

    Authors: Emad Elwany, Allison Hegel, Marina Shah, Brendan Roof, Genevieve Peaslee, Quentin Rivet

    Abstract: Weak supervision has been applied to various Natural Language Understanding tasks in recent years. Due to technical challenges with scaling weak supervision to work on long-form documents, spanning up to hundreds of pages, applications in the document understanding space have been limited. At Lexion, we built a weak supervision-based system tailored for long-form (10-200 pages long) PDF documents.…

    Submitted 16 August, 2022; originally announced August 2022.

  2. arXiv:2107.08128

    cs.CL

    The Law of Large Documents: Understanding the Structure of Legal Contracts Using Visual Cues

    Authors: Allison Hegel, Marina Shah, Genevieve Peaslee, Brendan Roof, Emad Elwany

    Abstract: Large, pre-trained transformer models like BERT have achieved state-of-the-art results on document understanding tasks, but most implementations can only consider 512 tokens at a time. For many real-world applications, documents can be much longer, and the segmentation strategies typically used on longer documents miss out on document structure and contextual information, hurting their results on…

    Submitted 16 July, 2021; originally announced July 2021.

    Journal ref: Document Intelligence Workshop at KDD, 2021

  3. arXiv:1908.11047

    cs.CL

    Shallow Syntax in Deep Water

    Authors: Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, Noah A. Smith

    Abstract: Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain. We investigate the role of shallow syntax-aware representations for NLP tasks using two techniques. First, we enhance the ELMo architecture to allow pretraining on predicted shallow syntactic parses, instead of just raw text, so that co…

    Submitted 29 August, 2019; originally announced August 2019.

  4. arXiv:1811.00146

    cs.CL

    ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning

    Authors: Maarten Sap, Ronan LeBras, Emily Allaway, Chandra Bhagavatula, Nicholas Lourie, Hannah Rashkin, Brendan Roof, Noah A. Smith, Yejin Choi

    Abstract: We present ATOMIC, an atlas of everyday commonsense reasoning, organized through 877k textual descriptions of inferential knowledge. Compared to existing resources that center around taxonomic knowledge, ATOMIC focuses on inferential knowledge organized as typed if-then relations with variables (e.g., "if X pays Y a compliment, then Y will likely return the compliment"). We propose nine if-then re…

    Submitted 7 February, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: AAAI 2019 camera-ready