Showing 1–50 of 59 results for author: Fabbri, R

  1. arXiv:2407.01370  [pdf, other]

    cs.CL

    Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

    Authors: Philippe Laban, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu

    Abstract: LLMs and RAG systems are now capable of handling millions of input tokens or more. However, evaluating the output quality of such systems on long-context tasks remains challenging, as tasks like Needle-in-a-Haystack lack complexity. In this work, we argue that summarization can play a central role in such evaluation. We design a procedure to synthesize Haystacks of documents, ensuring that specifi…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2404.16251  [pdf, ps, other]

    cs.CR cs.AI cs.CL

    Investigating the prompt leakage effect and black-box defenses for multi-turn LLM interactions

    Authors: Divyansh Agarwal, Alexander R. Fabbri, Philippe Laban, Ben Risher, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

    Abstract: Prompt leakage in large language models (LLMs) poses a significant security and privacy threat, particularly in retrieval-augmented generation (RAG) systems. However, leakage in multi-turn LLM interactions along with mitigation strategies has not been studied in a standardized manner. This paper investigates LLM vulnerabilities against prompt leakage across 4 diverse domains and 10 closed- and ope…

    Submitted 26 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  3. arXiv:2311.09458  [pdf, other]

    cs.CL

    Lexical Repetitions Lead to Rote Learning: Unveiling the Impact of Lexical Overlap in Train and Test Reference Summaries

    Authors: Prafulla Kumar Choubey, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu

    Abstract: Ideal summarization models should generalize to novel summary-worthy content without remembering reference training summaries by rote. However, a single average performance score on the entire test set is inadequate in determining such model competencies. We propose a fine-grained evaluation protocol by partitioning a test set based on the lexical similarity of reference test summaries with traini…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023-Findings

  4. arXiv:2311.09184  [pdf, other]

    cs.CL cs.LG

    Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

    Authors: Yixin Liu, Alexander R. Fabbri, Jiawen Chen, Yilun Zhao, Simeng Han, Shafiq Joty, Pengfei Liu, Dragomir Radev, Chien-Sheng Wu, Arman Cohan

    Abstract: While large language models (LLMs) already achieve strong performance on standard generic summarization benchmarks, their performance on more complex summarization task settings is less studied. Therefore, we benchmark LLMs on instruction controllable text summarization, where the model input consists of both a source article and a natural language requirement for the desired summary characteristi…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: GitHub Repo: https://github.com/yale-nlp/InstruSum

  5. arXiv:2309.09369  [pdf, other]

    cs.CL

    Embrace Divergence for Richer Insights: A Multi-document Summarization Benchmark and a Case Study on Summarizing Diverse Information from News Articles

    Authors: Kung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu

    Abstract: Previous research in multi-document news summarization has typically concentrated on collating information that all sources agree upon. However, the summarization of diverse information dispersed across multiple articles about an event remains underexplored. In this paper, we propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event. To…

    Submitted 22 March, 2024; v1 submitted 17 September, 2023; originally announced September 2023.

    Comments: NAACL 2024

  6. arXiv:2305.17779  [pdf, other]

    cs.CL

    Generating EDU Extracts for Plan-Guided Summary Re-Ranking

    Authors: Griffin Adams, Alexander R. Fabbri, Faisal Ladhak, Kathleen McKeown, Noémie Elhadad

    Abstract: Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach. Yet, standard decoding methods (i.e., beam search, nucleus sampling, and diverse beam search) produce candidates with redundant, and often low quality, content. In this paper, we design a novel method to generate candidates for re…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  7. arXiv:2305.14540  [pdf, other]

    cs.CL

    LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond

    Authors: Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu

    Abstract: With the recent appearance of LLMs in practical settings, having methods that can effectively detect factual inconsistencies is crucial to reduce the propagation of misinformation and improve trust in model outputs. When testing on existing factual consistency benchmarks, we find that a few large language models (LLMs) perform competitively on classification benchmarks for factual inconsistency de…

    Submitted 23 May, 2023; originally announced May 2023.

  8. arXiv:2305.14239  [pdf, other]

    cs.CL

    On Learning to Summarize with Large Language Models as References

    Authors: Yixin Liu, Kejian Shi, Katherine S He, Longtian Ye, Alexander R. Fabbri, Pengfei Liu, Dragomir Radev, Arman Cohan

    Abstract: Recent studies have found that summaries generated by large language models (LLMs) are favored by human annotators over the original reference summaries in commonly used summarization datasets. Therefore, we investigate a new learning setting of text summarization models that considers the LLMs as the reference or the gold-standard oracle on these datasets. To examine the standard practices that a…

    Submitted 16 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: GitHub Repo: https://github.com/yixinL7/SumLLM

  9. arXiv:2303.03608  [pdf, other]

    cs.CL

    Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation

    Authors: Yixin Liu, Alexander R. Fabbri, Yilun Zhao, Pengfei Liu, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

    Abstract: Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics. In this work, we develop strong-performing automatic metrics for reference-based summarization evaluation, based on a two-stage evaluation pipeline that first extracts basic information units from one text sequence and then checks the extracted units in another sequence. The metrics we de…

    Submitted 16 November, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: EMNLP 2023 Camera Ready Version

  10. arXiv:2212.10449  [pdf, other]

    cs.CL

    Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization

    Authors: Artidoro Pagnoni, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu

    Abstract: In long document controllable summarization, where labeled data is scarce, pretrained models struggle to adapt to the task and effectively respond to user queries. In this paper, we introduce Socratic pretraining, a question-driven, unsupervised pretraining objective specifically designed to improve controllability in summarization tasks. By training a model to generate and answer relevant questio…

    Submitted 8 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: To appear at ACL 2023

  11. arXiv:2212.07981  [pdf, other]

    cs.CL

    Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

    Authors: Yixin Liu, Alexander R. Fabbri, Pengfei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

    Abstract: Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests. However, existing human evaluation studies for summarization either exhibit a low inter-annotator agreement or have insufficient scale, and an in-depth analysis of human evaluation is lacking. Therefore, we address the shortcomings of existing summarization evaluation along the f…

    Submitted 6 June, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ACL 2023 Camera Ready

  12. arXiv:2211.15914  [pdf, other]

    cs.CL

    Prompted Opinion Summarization with GPT-3.5

    Authors: Adithya Bhaskar, Alexander R. Fabbri, Greg Durrett

    Abstract: Large language models have shown impressive performance across a wide variety of tasks, including text summarization. In this paper, we show that this strong performance extends to opinion summarization. We explore several pipeline methods for applying GPT-3.5 to summarize a large collection of user reviews in a prompted fashion. To handle arbitrarily large numbers of user reviews, we explore recu…

    Submitted 23 May, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Accepted to ACL (Findings) 2023

  13. arXiv:2211.06196  [pdf, other]

    cs.CL

    Improving Factual Consistency in Summarization with Compression-Based Post-Editing

    Authors: Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong

    Abstract: State-of-the-art summarization models still struggle to be factually consistent with the input text. A model-agnostic way to address this problem is post-editing the generated summaries. However, existing approaches typically fail to remove entity errors if a suitable input entity replacement is not available or may insert erroneous content. In our work, we focus on removing extrinsic entity error…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: EMNLP 2022

  14. arXiv:2211.05886  [pdf, ps, other]

    cs.CL

    CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing

    Authors: Divyansh Agarwal, Alexander R. Fabbri, Simeng Han, Wojciech Kryściński, Faisal Ladhak, Bryan Li, Kathleen McKeown, Dragomir Radev, Tianyi Zhang, Sam Wiseman

    Abstract: This paper introduces the shared task of summarizing documents in several creative domains, namely literary texts, movie scripts, and television scripts. Summarizing these creative documents requires making complex literary interpretations, as well as understanding non-trivial temporal dependencies in texts containing varied styles of plot development and narrative structure. This poses unique cha…

    Submitted 6 December, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: 4 pages + 3 for references and appendix

  15. arXiv:2209.00840  [pdf, other]

    cs.CL

    FOLIO: Natural Language Reasoning with First-Order Logic

    Authors: Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Wenfei Zhou, James Coady, David Peng, Yujie Qiao, Luke Benson, Lucy Sun, Alex Wardle-Solano, Hannah Szabo, Ekaterina Zubova, Matthew Burtell, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Alexander R. Fabbri, et al. (10 additional authors not shown)

    Abstract: Large language models (LLMs) have achieved remarkable performance on a variety of natural language understanding tasks. However, existing benchmarks are inadequate in measuring the complex logical reasoning capabilities of a model. We present FOLIO, a human-annotated, logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations. FO…

    Submitted 17 May, 2024; v1 submitted 2 September, 2022; originally announced September 2022.

  16. arXiv:2205.12854  [pdf, other]

    cs.CL cs.AI

    Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors

    Authors: Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett

    Abstract: The propensity of abstractive summarization models to make factual errors has been studied extensively, including design of metrics to detect factual errors and annotation of errors in current systems' outputs. However, the ever-evolving nature of summarization systems, metrics, and annotated benchmarks makes factuality evaluation a moving target, and drawing clear comparisons among metrics has be…

    Submitted 25 May, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2023

  17. arXiv:2112.08542  [pdf, other]

    cs.CL

    QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization

    Authors: Alexander R. Fabbri, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong

    Abstract: Factual consistency is an essential quality of text summarization models in practical settings. Existing work in evaluating this dimension can be broadly categorized into two lines of research, entailment-based and question answering (QA)-based metrics, and different experimental setups often lead to contrasting conclusions as to which paradigm performs the best. In this work, we conduct an extens…

    Submitted 29 April, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: NAACL 2022

  18. arXiv:2112.07637  [pdf, other]

    cs.CL

    Exploring Neural Models for Query-Focused Summarization

    Authors: Jesse Vig, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu, Wenhao Liu

    Abstract: Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization. While recently released datasets, such as QMSum or AQuaMuSe, facilitate research efforts in QFS, the field lacks a comprehensive study of the broad space of applicable modeling methods. In this paper we conduct a systematic exploration of neur…

    Submitted 26 April, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Findings of NAACL 2022

  19. arXiv:2112.04139  [pdf, other]

    cs.CL

    Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Yejin Choi, Noah A. Smith

    Abstract: Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models tend to depend on simple n-gram overlap metrics (e.g., BLEU, ROUGE). We argue that new advances on models and metrics should each more direc…

    Submitted 18 May, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Proc. of NAACL 2022

  20. arXiv:2111.06474  [pdf, other]

    cs.CL

    AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

    Authors: Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Haoran Li, Mona Diab

    Abstract: Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects the range of answer perspectives. A major obstacle for this task is the absenc…

    Submitted 29 April, 2022; v1 submitted 11 November, 2021; originally announced November 2021.

    Comments: NAACL 2022; arXiv admin note: substantial text overlap with arXiv:2104.08536

  21. arXiv:2110.07166  [pdf, other]

    cs.CL

    CaPE: Contrastive Parameter Ensembling for Reducing Hallucination in Abstractive Summarization

    Authors: Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Fatema Rajani

    Abstract: Hallucination is a known issue for neural abstractive summarization models. Recent work suggests that the degree of hallucination may depend on errors in the training data. In this work, we propose a new method called Contrastive Parameter Ensembling (CaPE) to use training data more effectively, utilizing variations in noise in training samples to reduce hallucination. We first select clean and no…

    Submitted 20 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  22. arXiv:2106.00829  [pdf, other]

    cs.CL

    ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining

    Authors: Alexander R. Fabbri, Faiaz Rahman, Imad Rizvi, Borui Wang, Haoran Li, Yashar Mehdad, Dragomir Radev

    Abstract: While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles. This research gap is due, in part, to the lack of standardized datasets for summarizing online discussions. To address this gap, we design annotation protocols motivated by an issues--viewpoints--assertions framework to…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  23. arXiv:2104.08536  [pdf, other]

    cs.CL

    Multi-Perspective Abstractive Answer Summarization

    Authors: Alexander R. Fabbri, Xiaojian Wu, Srini Iyer, Mona Diab

    Abstract: Community Question Answering (CQA) forums such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of questions. Each question thread can receive a large number of answers with different perspectives. The goal of multi-perspective answer summarization is to produce a summary that includes all perspectives of the answer. A major obstacle for multi-perspective, ab…

    Submitted 17 April, 2021; originally announced April 2021.

  24. arXiv:2010.12836  [pdf, other]

    cs.CL

    Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

    Authors: Alexander R. Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad

    Abstract: Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks. However, these models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains. In this work, we introduce a novel and generalizable method, called WikiTransfer, for fin…

    Submitted 11 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  25. arXiv:2007.12626  [pdf, other]

    cs.CL

    SummEval: Re-evaluating Summarization Evaluation

    Authors: Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev

    Abstract: The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evaluation methods along five dimensions: 1) we re-evaluate 14 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization mode…

    Submitted 1 February, 2021; v1 submitted 24 July, 2020; originally announced July 2020.

    Comments: 11 pages, 4 tables, 2 figures; pre-MIT Press publication version

  26. arXiv:2004.11892  [pdf, other]

    cs.CL

    Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering

    Authors: Alexander R. Fabbri, Patrick Ng, Zhiguo Wang, Ramesh Nallapati, Bing Xiang

    Abstract: Question Answering (QA) is in increasing demand as the amount of information available online and the desire for quick access to this content grows. A common approach to QA has been to fine-tune a pretrained language model on a task-specific labeled dataset. This paradigm, however, relies on scarce, and costly to obtain, large-scale human-labeled data. We propose an unsupervised approach to traini…

    Submitted 24 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  27. arXiv:1909.01716  [pdf, other]

    cs.CL cs.IR cs.LG

    ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks

    Authors: Michihiro Yasunaga, Jungo Kasai, Rui Zhang, Alexander R. Fabbri, Irene Li, Dan Friedman, Dragomir R. Radev

    Abstract: Scientific article summarization is challenging: large, annotated corpora are not available, and the summary should ideally include the article's impacts on research community. This paper provides novel solutions to these two challenges. We 1) develop and release the first large-scale manually-annotated corpus for scientific papers (on computational linguistics) by enabling faster annotation, and…

    Submitted 15 September, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: AAAI 2019

  28. arXiv:1906.10910  [pdf, other]

    cs.LG cs.CL stat.ML

    Creating A Neural Pedagogical Agent by Jointly Learning to Review and Assess

    Authors: Youngnam Lee, Youngduck Choi, Junghyun Cho, Alexander R. Fabbri, Hyunbin Loh, Chanyou Hwang, Yongku Lee, Sang-Wook Kim, Dragomir Radev

    Abstract: Machine learning plays an increasing role in intelligent tutoring systems as both the amount of data available and specialization among students grow. Nowadays, these systems are frequently deployed on mobile applications. Users on such mobile education platforms are dynamic, frequently being added, accessing the application with varying levels of focus, and changing while using the service. The e…

    Submitted 1 July, 2019; v1 submitted 26 June, 2019; originally announced June 2019.

    Comments: 9 pages, 9 figures, 7 tables

  29. arXiv:1906.01749  [pdf, other]

    cs.CL

    Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model

    Authors: Alexander R. Fabbri, Irene Li, Tianwei She, Suyi Li, Dragomir R. Radev

    Abstract: Automatic generation of summaries from multiple news articles is a valuable tool as the number of online publications grows rapidly. Single document summarization (SDS) systems have benefited from advances in neural encoder-decoder model thanks to the availability of large datasets. However, multi-document summarization (MDS) of news articles has been limited to datasets of a couple of hundred exa…

    Submitted 19 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: ACL 2019, 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 2019

  30. arXiv:1903.09755  [pdf, other]

    cs.CV

    Trifocal Relative Pose from Lines at Points and its Efficient Solution

    Authors: Ricardo Fabbri, Timothy Duff, Hongyi Fan, Margaret Regan, David da Costa de Pinho, Elias Tsigaridas, Charles Wampler, Jonathan Hauenstein, Benjamin Kimia, Anton Leykin, Tomas Pajdla

    Abstract: We present a method for solving two minimal problems for relative camera pose estimation from three views, which are based on three view correspondences of i) three points and one line and the novel case of ii) three points and two lines through two of the points. These problems are too difficult to be efficiently solved by the state of the art Groebner basis methods. Our method is based on a new…

    Submitted 29 November, 2022; v1 submitted 23 March, 2019; originally announced March 2019.

    Comments: First appeared at CVPR - Computer Vision and Pattern Recognition Conference 2020. This material is based upon work supported by the National Science Foundation under Grant No. DMS-1439786 while most authors were in residence at Brown University's Institute for Computational and Experimental Research in Mathematics -- ICERM, in Providence, RI

    MSC Class: 14Qxx; 12Yxx; 51N15; 14N05; 53A20; 17B81; 22E70; 53A04; 53A55; 53Bxx; 53B5; 57R25; 58C25; 68T40; 68U05; 70B1; 70G55; 70G65; 90C30 ACM Class: I.4.5; I.4.8; I.2.9; I.2.10; I.1.2; G.1.3; G.1.5

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, preprint available December 2022

  31. arXiv:1811.12181  [pdf, other]

    cs.CY cs.CL cs.IR cs.LG stat.ML

    What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning

    Authors: Irene Li, Alexander R. Fabbri, Robert R. Tung, Dragomir R. Radev

    Abstract: Recent years have witnessed the rising popularity of Natural Language Processing (NLP) and related fields such as Artificial Intelligence (AI) and Machine Learning (ML). Many online courses and resources are available even for those without a strong background in the field. Often the student is curious about a specific topic but does not quite know where to begin studying. To answer the question o…

    Submitted 26 November, 2018; originally announced November 2018.

  32. arXiv:1808.07531  [pdf, other]

    cs.CL

    Sarcasm Analysis using Conversation Context

    Authors: Debanjan Ghosh, Alexander R. Fabbri, Smaranda Muresan

    Abstract: Computational models for sarcasm detection have often relied on the content of utterances in isolation. However, the speaker's sarcastic intent is not always apparent without additional context. Focusing on social media discussions, we investigate three issues: (1) does modeling conversation context help in sarcasm detection; (2) can we identify what part of conversation context triggered the sarc…

    Submitted 28 August, 2018; v1 submitted 22 August, 2018; originally announced August 2018.

    Comments: Computational Linguistics (journal)

  33. arXiv:1805.04617  [pdf, other]

    cs.CL

    TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

    Authors: Alexander R. Fabbri, Irene Li, Prawat Trairatvorakul, Yijiao He, Wei Tai Ting, Robert Tung, Caitlin Westerfield, Dragomir R. Radev

    Abstract: The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. In order to learn this dynamic field or stay up-to-date on the latest research, students as well as educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address…

    Submitted 11 May, 2018; originally announced May 2018.

    Comments: ACL 2018, 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 2018

  34. arXiv:1712.09359  [pdf, other]

    cs.CY cs.CL

    Basic concepts and tools for the Toki Pona minimal and constructed language: description of the language and main issues; analysis of the vocabulary; text synthesis and syntax highlighting; Wordnet synsets

    Authors: Renato Fabbri

    Abstract: A minimal constructed language (conlang) is useful for experiments and comfortable for making tools. The Toki Pona (TP) conlang is minimal both in the vocabulary (with only 14 letters and 124 lemmas) and in the (about) 10 syntax rules. The language is useful for being a used and somewhat established minimal conlang with at least hundreds of fluent speakers. This article exposes current concepts an…

    Submitted 3 July, 2018; v1 submitted 26 December, 2017; originally announced December 2017.

    Comments: Python and Vim scripts in this repository: https://github.com/ttm/prv/

  35. arXiv:1712.06933  [pdf, ps, other]

    cs.HC

    An anthropological account of the Vim text editor: features and tweaks after 10 years of usage

    Authors: Renato Fabbri

    Abstract: The Vim text editor is very rich in capabilities and thus complex. This article is a description of Vim and a set of considerations about its usage and design. It results from more than ten years of experience in using Vim for writing and editing various types of documents, e.g. Python, C++, JavaScript, ChucK programs; LaTeX, Markdown, HTML, RDF, Make and other markup files; % TTM binary files. I…

    Submitted 18 December, 2017; originally announced December 2017.

    Comments: Scripts and other files are in this repository: https://github.com/ttm/vim

  36. arXiv:1711.04612  [pdf, other]

    cs.CY

    The Algorithmic-Autoregulation (AA) Methodology and Software: a collective focus on self-transparency

    Authors: Renato Fabbri

    Abstract: There are numerous efforts to achieve a lightweight and systematic account of what is done by a group and its individuals. The Algorithmic-Autoregulation (AA) is a special case, in which a technical community embraced the challenge of registering their own dedication for sharing processes, self-transparency, and documenting the efforts. AA is used since June/2011 by dozens of researchers and softw…

    Submitted 26 October, 2017; originally announced November 2017.

    Comments: Scripts and data in https://github.com/ttm/ensaaio

    Report number: ISSN 2527-2357, ISBN 978-85-5676-019-7

    Journal ref: Anais do XX ENMC - Encontro Nacional de Modelagem Computacional e VIII ECTM - Encontro de Ciências e Tecnologia de Materiais, Nova Friburgo, RJ - 16 a 19 Outubro 2017

  37. arXiv:1711.04609  [pdf, other]

    cs.CY

    Text Mining Descriptions Of Dreams: aesthetic and clinical efforts

    Authors: Renato Fabbri, Fabiane M. Borges

    Abstract: Dreams are highly valued in both Freudian psychoanalysis and less conservative clinical traditions. Text mining enables the extraction of meaning from writings in powerful and unexpected ways. In this work, we report methods, uses and results obtained by mining descriptions of dreams. The texts were collected as part of a course in Schizoanalysis (Clinical Psychology) from dozens of participants.…

    Submitted 26 October, 2017; originally announced November 2017.

    Comments: Scripts and corpus in https://github.com/ttm/sonhos, Anais do XX ENMC - Encontro Nacional de Modelagem Computacional e VIII ECTM - Encontro de Ciências e Tecnologia de Materiais, Nova Friburgo, RJ - 16 a 19 Outubro 2017

    Report number: ISSN 2527-2357, ISBN 978-85-5676-019-7

  38. arXiv:1710.09954  [pdf, other]

    cs.CY cs.AI

    Audiovisual Analytics Vocabulary and Ontology (AAVO): initial core and example expansion

    Authors: Renato Fabbri, Maria Cristina Ferreira de Oliveira

    Abstract: Visual Analytics might be defined as data mining assisted by interactive visual interfaces. The field has been receiving prominent consideration by researchers, developers and the industry. The literature, however, is complex because it involves multiple fields of knowledge and is considerably recent. In this article we describe an initial tentative organization of the knowledge in the field as an…

    Submitted 26 October, 2017; originally announced October 2017.

    Comments: Scripts in https://github.com/ttm/aavo/

    Report number: ISSN 2527-2357, ISBN 978-85-5676-019-7

    Journal ref: Anais do XX ENMC - Encontro Nacional de Modelagem Computacional e VIII ECTM - Encontro de Ciências e Tecnologia de Materiais, Nova Friburgo, RJ - 16 a 19 Outubro 2017

  39. arXiv:1710.09952  [pdf, other]

    cs.AI

    Enhancements of linked data expressiveness for ontologies

    Authors: Renato Fabbri

    Abstract: The semantic web has received many contributions of researchers as ontologies which, in this context, i.e. within RDF linked data, are formalized conceptualizations that might use different protocols, such as RDFS, OWL DL and OWL FULL. In this article, we describe new expressive techniques which were found necessary after elaborating dozens of OWL ontologies for the scientific academy, the State a…

    Submitted 26 October, 2017; originally announced October 2017.

    Report number: ISSN 2527-2357, ISBN 978-85-5676-019-7

    Journal ref: Anais do XX ENMC - Encontro Nacional de Modelagem Computacional e VIII ECTM - Encontro de Ciências e Tecnologia de Materiais, Nova Friburgo, RJ - 16 a 19 Outubro 2017

  40. arXiv:1710.09233  [pdf, other]

    cs.CL

    A Simple Text Analytics Model To Assist Literary Criticism: comparative approach and example on James Joyce against Shakespeare and the Bible

    Authors: Renato Fabbri, Luis Henrique Garcia

    Abstract: Literary analysis, criticism or studies is a largely valued field with dedicated journals and researchers which remains mostly within the humanities scope. Text analytics is the computer-aided process of deriving information from texts. In this article we describe a simple and generic model for performing literary analysis using text analytics. The method relies on statistical measures of: 1) toke…

    Submitted 24 October, 2017; originally announced October 2017.

    Comments: Scripts and corpus in https://github.com/ttm/joyce

    Report number: ISSN 2527-2357, ISBN 978-85-5676-019-7

    Journal ref: Anais do XX ENMC - Encontro Nacional de Modelagem Computacional e VIII ECTM - Encontro de Ciências e Tecnologia de Materiais, Nova Friburgo, RJ - 16 a 19 Outubro 2017

  41. arXiv:1707.06226  [pdf, other]

    cs.CL cs.AI cs.LG

    The Role of Conversation Context for Sarcasm Detection in Online Interactions

    Authors: Debanjan Ghosh, Alexander Richard Fabbri, Smaranda Muresan

    Abstract: Computational models for sarcasm detection have often relied on the content of utterances in isolation. However, a speaker's sarcastic intent is not always obvious without additional context. Focusing on social media discussions, we investigate two issues: (1) does modeling of conversation context help in sarcasm detection and (2) can we understand what part of conversation context triggered the sar…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: SIGDial 2017

  42. arXiv:1707.03946  [pdf, other]

    cs.CV

    The Surfacing of Multiview 3D Drawings via Lofting and Occlusion Reasoning

    Authors: Anil Usumezbas, Ricardo Fabbri, Benjamin Kimia

    Abstract: The three-dimensional reconstruction of scenes from multiple views has made impressive strides in recent years, chiefly by methods correlating isolated feature points, intensities, or curvilinear structure. In the general setting, i.e., without requiring controlled acquisition, limited number of objects, abundant patterns on objects, or object curves to follow particular models, the majority of th…

    Submitted 12 July, 2017; originally announced July 2017.

    Comments: CVPR 2017 expanded version with improvements over camera ready, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition CVPR, 2017

  43. arXiv:1609.05561  [pdf, other]

    cs.CV cs.CG cs.GR cs.RO

    From Multiview Image Curves to 3D Drawings

    Authors: Anil Usumezbas, Ricardo Fabbri, Benjamin B. Kimia

    Abstract: Reconstructing 3D scenes from multiple views has made impressive strides in recent years, chiefly by correlating isolated feature points, intensity patterns, or curvilinear structures. In the general setting - without controlled acquisition, abundant texture, curves and surfaces following specific models or limiting scene complexity - most methods produce unorganized point clouds, meshes, or voxel…

    Submitted 18 September, 2016; originally announced September 2016.

    Comments: Expanded ECCV 2016 version with tweaked figures and including an overview of the supplementary material available at multiview-3d-drawing.sourceforge.net

    MSC Class: 65D17; 68U05; 68U10; 53A20 ACM Class: I.4.8; I.4.10; I.4.6; I.3.5; J.6

    Journal ref: Lecture Notes in Computer Science, 9908, pp. 70-87, September 2016

  44. arXiv:1604.08256  [pdf, other]

    cs.CV cs.CG cs.GR math.DG

    Multiview Differential Geometry of Curves

    Authors: Ricardo Fabbri, Benjamin Kimia

    Abstract: The field of multiple view geometry has seen tremendous progress in reconstruction and calibration due to methods for extracting reliable point features and key developments in projective geometry. Point features, however, are not available in certain applications and result in unstructured point cloud reconstructions. General image curves provide a complementary feature when keypoints are scarce,…

    Submitted 27 April, 2016; originally announced April 2016.

    Comments: Final accepted version, International Journal of Computer Vision, 2016. The final publication is available at Springer via http://dx.doi.org/10.1007/s11263-016-0912-7

    MSC Class: 53A04; 53A17; 53A20 ACM Class: I.4.8; I.3.5

  45. The Algorithmic Autoregulation Software Development Methodology

    Authors: Renato Fabbri, Ricardo Fabbri, Vilson Vieira, Daniel Penalva, Danilo Shiga, Marcos Mendonca, Alexandre Negrao, Lucas Zambianchi, Gabriela Thume

    Abstract: We present a new self-regulating methodology for coordinating distributed team work called Algorithmic Autoregulation (AA), based on recent social networking concepts and individual merit. Team members take on an egalitarian role, and stay voluntarily logged into so-called AA sessions for part of their time (e.g. 2 hours per day), during which they create periodical logs - short text sentences - t…

    Submitted 27 April, 2016; originally announced April 2016.

    ACM Class: D.2.9

    Journal ref: RESI, v. 13, n. 2, 2014

  46. arXiv:1505.06640  [pdf, ps, other]

    cs.CY cs.SI

    Continuous voting by approval and participation

    Authors: Renato Fabbri, Ricardo Poppi

    Abstract: In finding the adequate way to prioritize proposals, the Brazilian participation community agreed about the measurement of two indexes, one of approval and one of participation. Both practice and literature is constantly handled by the experts involved, and the formalization of such model and metrics seems novel. Also, the relevance of this report is strengthened by the nearby use of these indexes…

    Submitted 24 April, 2015; originally announced May 2015.

  47. arXiv:1502.01312  [pdf, other]

    cs.CY cs.HC

    Vivace: a collaborative live coding language and platform

    Authors: Vilson Vieira, Guilherme Lunhani, Geraldo Magela de Castro Rocha Junior, Caleb Mascarenhas Luporini, Daniel Penalva, Ricardo Fabbri, Renato Fabbri

    Abstract: Live coding is a performance and creative technique based on improvised and interactive coding. Many recent endeavors have focused on live coding both because of aesthetics and as a way to alleviate performance drawbacks when the musical instrument is a computer. This paper describes the principles and the design of Vivace, a live coding language and environment built with Web technologies to be e…

    Submitted 30 October, 2017; v1 submitted 13 January, 2015; originally announced February 2015.

    Report number: ISSN 2175-6759

    Journal ref: Proceedings of the 16th Brazilian Symposium on Computer Music, SBCM 2017

  48. arXiv:1501.02662  [pdf, other]

    cs.CY cs.AI

    Social Participation Ontology: community documentation, enhancements and use examples

    Authors: Renato Fabbri, Henrique Parra Parra Filho, Rodrigo Bandeira de Luna, Ricardo Augusto Poppi Martins, Flor Karina Mamani Amanqui, Dilvan de Abreu Moreira, Osvaldo Novais de Oliveira Junior

    Abstract: Participatory democracy advances in virtually all governments and especially in South America which exhibits a mixed culture and social predisposition. This article presents the "Social Participation Ontology" (OPS from the Brazilian name Ontologia de Participação Social) implemented in compliance with the Web Ontology Language standard (OWL) for fostering social participation, especially in…

    Submitted 30 October, 2017; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: See ancillary for table of terms, OPS code and figures. Further information is at https://github.com/ttm/ops

  49. arXiv:1412.7311  [pdf, other]

    cs.SI physics.comp-ph physics.soc-ph

    Versinus: a visualization method for graphs in evolution

    Authors: Renato Fabbri

    Abstract: This article presents a novel visualization approach for dynamic graphs, the versinus method, especially useful for real-world networks exhibiting scale-free properties. With a simple and fixed layout, and a small set of visual markups, the method has been useful for understanding network dynamics. Local community often suggests that it be reported, which motivated this article. Online resources de…

    Submitted 23 December, 2014; originally announced December 2014.

    Comments: Article written at the request of research colleagues who appreciated these visualizations. arXiv admin note: text overlap with arXiv:1310.7769

  50. arXiv:1412.7309  [pdf, other]

    cs.SI physics.data-an physics.soc-ph

    A connective differentiation of textual production in interaction networks

    Authors: Renato Fabbri

    Abstract: This paper explores textual production in interaction networks, with special emphasis on its relation to topological measures. Four email lists were selected, in which measures were taken from the texts participants wrote. Peripheral, intermediary and hub sectors of these networks were observed to have discrepant linguistic elaborations. For completeness of exposition, correlation of textual and t…

    Submitted 23 December, 2014; originally announced December 2014.