Showing 1–10 of 10 results for author: Sotudeh, S

  1. arXiv:2307.07586  [pdf, other]

    cs.CL

    QontSum: On Contrasting Salient Content for Query-focused Summarization

    Authors: Sajad Sotudeh, Nazli Goharian

    Abstract: Query-focused summarization (QFS) is a challenging task in natural language processing that generates summaries to address specific queries. The broader field of Generative Information Retrieval (Gen-IR) aims to revolutionize information extraction from vast document corpora through generative approaches, encompassing Generative Document Retrieval (GDR) and Grounded Answer Retrieval (GAR). This pa…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: 9 pages, Long paper accepted at Gen-IR@SIGIR23

  2. arXiv:2302.01342  [pdf, other]

    cs.CL

    Curriculum-Guided Abstractive Summarization

    Authors: Sajad Sotudeh, Hanieh Deilamsalehy, Franck Dernoncourt, Nazli Goharian

    Abstract: Recent Transformer-based summarization models have provided a promising approach to abstractive summarization. They go beyond sentence selection and extractive strategies to deal with more complicated tasks such as novel word generation and sentence paraphrasing. Nonetheless, these models have two shortcomings: (1) they often perform poorly in content selection, and (2) their training strategy is…

    Submitted 8 February, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: 8 pages, Long paper. arXiv admin note: text overlap with arXiv:2302.00954

  3. arXiv:2302.00954  [pdf, other]

    cs.CL cs.AI

    Curriculum-guided Abstractive Summarization for Mental Health Online Posts

    Authors: Sajad Sotudeh, Nazli Goharian, Hanieh Deilamsalehy, Franck Dernoncourt

    Abstract: Automatically generating short summaries from users' online mental health posts could save counselors' reading time and reduce their fatigue so that they can provide timely responses to those seeking help for improving their mental state. Recent Transformers-based summarization models have presented a promising approach to abstractive summarization. They go beyond sentence selection and extractive…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: 4 pages, short paper, accepted to The 13th International Workshop on Health Text Mining and Information Analysis (LOUHI 2022)

  4. arXiv:2206.00856  [pdf, other]

    cs.CL

    MentSum: A Resource for Exploring Summarization of Mental Health Online Posts

    Authors: Sajad Sotudeh, Nazli Goharian, Zachary Young

    Abstract: Mental health remains a significant challenge of public health worldwide. With increasing popularity of online platforms, many use the platforms to share their mental health conditions, express their feelings, and seek help from the community and counselors. Some of these platforms, such as Reachout, are dedicated forums where the users register to seek help. Others such as Reddit provide subreddi…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 8 pages, LREC 2022 Long Paper

  5. arXiv:2206.00847  [pdf, other]

    cs.CL

    TSTR: Too Short to Represent, Summarize with Details! Intro-Guided Extended Summary Generation

    Authors: Sajad Sotudeh, Nazli Goharian

    Abstract: Many scientific papers such as those in arXiv and PubMed data collections have abstracts with varying lengths of 50-1000 words and average length of approximately 200 words, where longer abstracts typically convey more information about the source paper. Up to recently, scientific summarization research has typically focused on generating short, abstract-like summaries following the existing datas…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: 9 pages, NAACL 2022 Long Paper

  6. arXiv:2110.01159  [pdf, other]

    cs.CL

    TLDR9+: A Large Scale Resource for Extreme Summarization of Social Media Posts

    Authors: Sajad Sotudeh, Hanieh Deilamsalehy, Franck Dernoncourt, Nazli Goharian

    Abstract: Recent models in developing summarization systems consist of millions of parameters and the model performance is highly dependent on the abundance of training data. While most existing summarization corpora contain data in the order of thousands to one million, generation of large-scale summarization datasets in order of couple of millions is yet to be explored. Practically, more data is better at…

    Submitted 5 October, 2021; v1 submitted 3 October, 2021; originally announced October 2021.

    Comments: Accepted to New Frontiers in Summarization Workshop (EMNLP 2021)

  7. arXiv:2012.14136  [pdf, other]

    cs.CL

    On Generating Extended Summaries of Long Documents

    Authors: Sajad Sotudeh, Arman Cohan, Nazli Goharian

    Abstract: Prior work in document summarization has mainly focused on generating short summaries of a document. While this type of summary helps get a high-level view of a given document, it is desirable in some cases to know more detailed information about its salient points that can't fit in a short summary. This is typically the case for longer documents such as a research paper, legal document, or a book…

    Submitted 28 December, 2020; originally announced December 2020.

    Comments: Accepted at SDU 2021

  8. arXiv:2007.14477  [pdf, ps, other]

    cs.CL

    GUIR at SemEval-2020 Task 12: Domain-Tuned Contextualized Models for Offensive Language Detection

    Authors: Sajad Sotudeh, Tong Xiang, Hao-Ren Yao, Sean MacAvaney, Eugene Yang, Nazli Goharian, Ophir Frieder

    Abstract: Offensive language detection is an important and challenging task in natural language processing. We present our submissions to the OffensEval 2020 shared task, which includes three English sub-tasks: identifying the presence of offensive language (Sub-task A), identifying the presence of target in offensive language (Sub-task B), and identifying the categories of the target (Sub-task C). Our expe…

    Submitted 28 July, 2020; originally announced July 2020.

    Comments: SemEval 2020

  9. arXiv:2005.00163  [pdf, other]

    cs.CL

    Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization

    Authors: Sajad Sotudeh, Nazli Goharian, Ross W. Filice

    Abstract: Sequence-to-sequence (seq2seq) network is a well-established model for text summarization task. It can learn to produce readable content; however, it falls short in effectively identifying key regions of the source. In this paper, we approach the content selection problem for clinical abstractive summarization by augmenting salient ontological terms into the summarizer. Our experiments on two publ…

    Submitted 30 April, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020

  10. arXiv:1905.05818  [pdf, other]

    cs.CL cs.IR

    Ontology-Aware Clinical Abstractive Summarization

    Authors: Sean MacAvaney, Sajad Sotudeh, Arman Cohan, Nazli Goharian, Ish Talati, Ross W. Filice

    Abstract: Automatically generating accurate summaries from clinical reports could save a clinician's time, improve summary coverage, and reduce errors. We propose a sequence-to-sequence abstractive summarization model augmented with domain-specific ontological information to enhance content selection and summary generation. We apply our method to a dataset of radiology reports and show that it significantly…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 4 pages; SIGIR 2019 Short Paper