Showing 1–4 of 4 results for author: Finegan-Dollak, C

  1. arXiv:2305.17219  [pdf]

    cs.CV cs.CL cs.LG

    GVdoc: Graph-based Visual Document Classification

    Authors: Fnu Mohbat, Mohammed J. Zaki, Catherine Finegan-Dollak, Ashish Verma

    Abstract: The robustness of a model for real-world deployment is decided by how well it performs on unseen data and distinguishes between in-domain and out-of-domain samples. Visual document classifiers have shown impressive performance on in-distribution test sets. However, they tend to have a hard time correctly classifying and differentiating out-of-distribution examples. Image-based classifiers lack the…

    Submitted 26 May, 2023; originally announced May 2023.

  2. arXiv:2109.00442  [pdf, other]

    cs.CL cs.LG

    Position Masking for Improved Layout-Aware Document Understanding

    Authors: Anik Saha, Catherine Finegan-Dollak, Ashish Verma

    Abstract: Natural language processing for document scans and PDFs has the potential to enormously improve the efficiency of business processes. Layout-aware word embeddings such as LayoutLM have shown promise for classification of and information extraction from such documents. This paper proposes a new pre-training task called position masking that can improve performance of layout-aware word embeddings that incorporate 2-…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: Document Intelligence Workshop at KDD, 2021

  3. Increasing the Speed and Accuracy of Data Labeling Through an AI Assisted Interface

    Authors: Michael Desmond, Zahra Ashktorab, Michelle Brachman, Kristina Brimijoin, Evelyn Duesterwald, Casey Dugan, Catherine Finegan-Dollak, Michael Muller, Narendra Nath Joshi, Qian Pan, Aabhas Sharma

    Abstract: Labeling data is an important step in the supervised machine learning lifecycle. It is a laborious human activity comprised of repeated decision making: the human labeler decides which of several potential labels to apply to each example. Prior work has shown that providing AI assistance can improve the accuracy of binary decision tasks. However, the role of AI assistance in more complex data-labe…

    Submitted 8 April, 2021; originally announced April 2021.

  4. arXiv:1806.09029  [pdf, other]

    cs.CL cs.AI cs.DB

    Improving Text-to-SQL Evaluation Methodology

    Authors: Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam, Rui Zhang, Dragomir Radev

    Abstract: To be informative, an evaluation must measure how well systems generalize to realistic unseen data. We identify limitations of and propose improvements to current evaluations of text-to-SQL systems. First, we compare human-generated and automatically generated questions, characterizing properties of queries necessary for real-world applications. To facilitate evaluation on multiple datasets, we re…

    Submitted 23 June, 2018; originally announced June 2018.

    Comments: To appear at ACL 2018

    ACM Class: I.2.7; I.2.1; H.2.3; H.3.4

    Journal ref: ACL (2018) 351-360