Showing 1–9 of 9 results for author: Golshan, B

  1. arXiv:2005.06133

    cs.DB cs.LG

    Adaptive Rule Discovery for Labeling Text Data

    Authors: Sainyam Galhotra, Behzad Golshan, Wang-Chiew Tan

    Abstract: Creating and collecting labeled data is one of the major bottlenecks in machine learning pipelines and the emergence of automated feature generation techniques such as deep learning, which typically requires a lot of training data, has further exacerbated the problem. While weak-supervision techniques have circumvented this bottleneck, existing frameworks either require users to write a set of div…

    Submitted 12 May, 2020; originally announced May 2020.

  2. arXiv:2004.14283

    cs.CL cs.AI

    SubjQA: A Dataset for Subjectivity and Review Comprehension

    Authors: Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein

    Abstract: Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to be important for sentiment analysis and word-sense disambiguation. Furthermore, subjectivity is an important aspect of user-generated data. In spite of this, subjectivity has not been investigated in contexts where such data is widespread, such as in question answe…

    Submitted 6 October, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: EMNLP 2020 Long Paper - Camera Ready

  3. arXiv:2004.03020

    cs.CL

    Enhancing Review Comprehension with Domain-Specific Commonsense

    Authors: Aaron Traylor, Chen Chen, Behzad Golshan, Xiaolan Wang, Yuliang Li, Yoshihiko Suhara, Jinfeng Li, Cagatay Demiralp, Wang-Chiew Tan

    Abstract: Review comprehension has played an increasingly important role in improving the quality of online services and products and commonsense knowledge can further enhance review comprehension. However, existing general-purpose commonsense knowledge bases lack sufficient coverage and precision to meaningfully improve the comprehension of domain-specific reviews. In this paper, we introduce xSense, an ef…

    Submitted 6 April, 2020; originally announced April 2020.

    Comments: 8 pages

  4. arXiv:1910.00637

    cs.CL

    Essentia: Mining Domain-Specific Paraphrases with Word-Alignment Graphs

    Authors: Danni Ma, Chen Chen, Behzad Golshan, Wang-Chiew Tan

    Abstract: Paraphrases are important linguistic resources for a wide variety of NLP applications. Many techniques for automatic paraphrase mining from general corpora have been proposed. While these techniques are successful at discovering generic paraphrases, they often fail to identify domain-specific paraphrases (e.g., {staff, concierge} in the hospitality domain). This is because current techniques are o…

    Submitted 4 October, 2019; v1 submitted 1 October, 2019; originally announced October 2019.

    Comments: Accepted at the 13th Workshop on Graph-Based Natural Language Processing

  5. arXiv:1909.06731

    cs.CL cs.LG

    Emu: Enhancing Multilingual Sentence Embeddings with Semantic Specialization

    Authors: Wataru Hirota, Yoshihiko Suhara, Behzad Golshan, Wang-Chiew Tan

    Abstract: We present Emu, a system that semantically enhances multilingual sentence embeddings. Our framework fine-tunes pre-trained multilingual sentence embeddings using two main components: a semantic classifier and a language discriminator. The semantic classifier improves the semantic similarity of related sentences, whereas the language discriminator enhances the multilinguality of the embeddings via…

    Submitted 24 November, 2019; v1 submitted 15 September, 2019; originally announced September 2019.

    Comments: AAAI 2020

  6. A Team-Formation Algorithm for Faultline Minimization

    Authors: Sanaz Bahargam, Behzad Golshan, Theodoros Lappas, Evimaria Terzi

    Abstract: In recent years, the proliferation of online resumes and the need to evaluate large populations of candidates for on-site and virtual teams have led to a growing interest in automated team-formation. Given a large pool of candidates, the general problem requires the selection of a team of experts to complete a given task. Surprisingly, while ongoing research has studied numerous variations with di…

    Submitted 12 November, 2018; originally announced November 2018.

  7. arXiv:1805.01083

    cs.DB cs.CL

    Scalable Semantic Querying of Text

    Authors: Xiaolan Wang, Aaron Feng, Behzad Golshan, Alon Halevy, George Mihaila, Hidekazu Oiwa, Wang-Chiew Tan

    Abstract: We present the KOKO system that takes declarative information extraction to a new level by incorporating advances in natural language processing techniques in its extraction language. KOKO is novel in that its extraction language simultaneously supports conditions on the surface of the text and on the structure of the dependency parse tree of sentences, thereby allowing for more refined extraction…

    Submitted 2 May, 2018; originally announced May 2018.

  8. arXiv:1801.07746

    cs.CL

    HappyDB: A Corpus of 100,000 Crowdsourced Happy Moments

    Authors: Akari Asai, Sara Evensen, Behzad Golshan, Alon Halevy, Vivian Li, Andrei Lopatenko, Daniela Stepanov, Yoshihiko Suhara, Wang-Chiew Tan, Yinzhan Xu

    Abstract: The science of happiness is an area of positive psychology concerned with understanding what behaviors make people happy in a sustainable fashion. Recently, there has been interest in developing technologies that help incorporate the findings of the science of happiness into users' daily lives by steering them towards behaviors that increase happiness. With the goal of building technology that can…

    Submitted 25 January, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

    Comments: Typos fixed

  9. arXiv:1701.05352

    cs.SI physics.soc-ph

    Finding low-tension communities

    Authors: Esther Galbrun, Behzad Golshan, Aristides Gionis, Evimaria Terzi

    Abstract: Motivated by applications that arise in online social media and collaboration networks, there has been a lot of work on community-search and team-formation problems. In the former class of problems, the goal is to find a subgraph that satisfies a certain connectivity requirement and contains a given collection of seed nodes. In the latter class of problems, on the other hand, the goal is to find i…

    Submitted 19 January, 2017; originally announced January 2017.

    Comments: A short version of this paper appeared in the 2017 SIAM International Conference on Data Mining (SDM'17). In this extended version, we discuss the team-formation variant of the problem alongside the original community-search problem, and include additional experimental results.