Showing 1–12 of 12 results for author: Karamanolakis, G

  1. arXiv:2312.14943  [pdf, other]

    cs.IR cs.CL cs.LG

    Flood Event Extraction from News Media to Support Satellite-Based Flood Insurance

    Authors: Tejit Pabari, Beth Tellman, Giannis Karamanolakis, Mitchell Thomas, Max Mauerman, Eugene Wu, Upmanu Lall, Marco Tedesco, Michael S Steckler, Paolo Colosio, Daniel E Osgood, Melody Braun, Jens de Bruijn, Shammun Islam

    Abstract: Floods cause large losses to property, life, and livelihoods across the world every year, hindering sustainable development. Safety nets to help absorb financial shocks in disasters, such as insurance, are often unavailable in regions of the world most vulnerable to floods, like Bangladesh. Index-based insurance has emerged as an affordable solution, which considers weather data or information fro…

    Submitted 5 December, 2023; originally announced December 2023.

  2. arXiv:2204.07705  [pdf, other]

    cs.CL cs.AI

    Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

    Authors: Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, et al. (15 additional authors not shown)

    Abstract: How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting,…

    Submitted 24 October, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: Accepted to EMNLP 2022, 25 pages

  3. arXiv:2108.12603  [pdf, other]

    cs.CL cs.LG

    WALNUT: A Benchmark on Semi-weakly Supervised Learning for Natural Language Understanding

    Authors: Guoqing Zheng, Giannis Karamanolakis, Kai Shu, Ahmed Hassan Awadallah

    Abstract: Building machine learning models for natural language understanding (NLU) tasks relies heavily on labeled data. Weak supervision has been proven valuable when large amount of labeled data is unavailable or expensive to obtain. Existing works studying weak supervision for NLU either mostly focus on a specific task or simulate weak supervision signals from ground-truth labels. It is thus hard to com…

    Submitted 22 May, 2022; v1 submitted 28 August, 2021; originally announced August 2021.

    Comments: Accepted to NAACL 2022 (Long Paper)

  4. arXiv:2104.05514  [pdf, other]

    cs.CL stat.ML

    Self-Training with Weak Supervision

    Authors: Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng, Ahmed Hassan Awadallah

    Abstract: State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks. Weak supervision in the form of domain-specific rules has been shown to be useful in such settings to automatically generate weakly labeled training data. However, learning with weak rules is challenging due to their inherent heuristic and noisy nature.…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted to NAACL 2021 (Long Paper)

  5. arXiv:2010.05194  [pdf, other]

    cs.CL

    Detecting Foodborne Illness Complaints in Multiple Languages Using English Annotations Only

    Authors: Ziyi Liu, Giannis Karamanolakis, Daniel Hsu, Luis Gravano

    Abstract: Health departments have been deploying text classification systems for the early detection of foodborne illness complaints in social media documents such as Yelp restaurant reviews. Current systems have been successfully applied for documents in English and, as a result, a promising direction is to increase coverage and recall by considering documents in additional languages, such as Spanish or Ch…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: Accepted for the 11th International Workshop on Health Text Mining and Information Analysis (LOUHI@EMNLP 2020)

  6. arXiv:2010.02562  [pdf, other]

    cs.CL

    Cross-Lingual Text Classification with Minimal Resources by Transferring a Sparse Teacher

    Authors: Giannis Karamanolakis, Daniel Hsu, Luis Gravano

    Abstract: Cross-lingual text classification alleviates the need for manually labeled documents in a target language by leveraging labeled documents from other languages. Existing approaches for transferring supervision across languages require expensive cross-lingual resources, such as parallel corpora, while less expensive cross-lingual representation learning approaches train classifiers without target la…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP 2020 (Long Paper)

  7. AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types

    Authors: Xin Luna Dong, Xiang He, Andrey Kan, Xian Li, Yan Liang, Jun Ma, Yifan Ethan Xu, Chenwei Zhang, Tong Zhao, Gabriel Blanco Saldana, Saurabh Deshpande, Alexandre Michetti Manduca, Jay Ren, Surender Pal Singh, Fan Xiao, Haw-Shiuan Chang, Giannis Karamanolakis, Yuning Mao, Yaqing Wang, Christos Faloutsos, Andrew McCallum, Jiawei Han

    Abstract: Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products p…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: KDD 2020

  8. arXiv:2004.13852  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories

    Authors: Giannis Karamanolakis, Jun Ma, Xin Luna Dong

    Abstract: Extracting structured knowledge from product profiles is crucial for various applications in e-Commerce. State-of-the-art approaches for knowledge extraction were each designed for a single category of product, and thus do not apply to real-life e-Commerce scenarios, which often contain thousands of diverse categories. This paper proposes TXtract, a taxonomy-aware knowledge extraction model that a…

    Submitted 1 May, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

    Comments: Accepted to ACL 2020 (Long Paper)

  9. arXiv:1910.00054  [pdf, other]

    cs.LG cs.CL cs.IR stat.ML

    Weakly Supervised Attention Networks for Fine-Grained Opinion Mining and Public Health

    Authors: Giannis Karamanolakis, Daniel Hsu, Luis Gravano

    Abstract: In many review classification applications, a fine-grained analysis of the reviews is desirable, because different segments (e.g., sentences) of a review may focus on different aspects of the entity in question. However, training supervised models for segment-level classification requires segment labels, which may be more difficult or expensive to obtain than review labels. In this paper, we emplo…

    Submitted 30 September, 2019; originally announced October 2019.

    Comments: Accepted for the 5th Workshop on Noisy User-generated Text (W-NUT 2019), held in conjunction with EMNLP 2019

  10. arXiv:1909.00415  [pdf, other]

    cs.LG cs.CL stat.ML

    Leveraging Just a Few Keywords for Fine-Grained Aspect Detection Through Weakly Supervised Co-Training

    Authors: Giannis Karamanolakis, Daniel Hsu, Luis Gravano

    Abstract: User-generated reviews can be decomposed into fine-grained segments (e.g., sentences, clauses), each evaluating a different aspect of the principal entity (e.g., price, quality, appearance). Automatically detecting these aspects can be useful for both users and downstream opinion mining applications. Current supervised approaches for learning aspect classifiers require many fine-grained aspect lab…

    Submitted 1 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP 2019

  11. arXiv:1807.06651  [pdf, other]

    stat.ML cs.IR cs.LG

    Item Recommendation with Variational Autoencoders and Heterogenous Priors

    Authors: Giannis Karamanolakis, Kevin Raji Cherian, Ananth Ravi Narayan, Jie Yuan, Da Tang, Tony Jebara

    Abstract: In recent years, Variational Autoencoders (VAEs) have been shown to be highly effective in both standard collaborative filtering applications and extensions such as incorporation of implicit feedback. We extend VAEs to collaborative filtering with side information, for instance when ratings are combined with explicit text feedback from the user. Instead of using a user-agnostic standard Gaussian p…

    Submitted 6 October, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: Accepted for the 3rd Workshop on Deep Learning for Recommender Systems (DLRS 2018), held in conjunction with the 12th ACM Conference on Recommender Systems (RecSys 2018) in Vancouver, Canada

  12. arXiv:1612.08391  [pdf, other]

    cs.IR

    Audio-based Distributional Semantic Models for Music Auto-tagging and Similarity Measurement

    Authors: Giannis Karamanolakis, Elias Iosif, Athanasia Zlatintsi, Aggelos Pikrakis, Alexandros Potamianos

    Abstract: The recent development of Audio-based Distributional Semantic Models (ADSMs) enables the computation of audio and lexical vector representations in a joint acoustic-semantic space. In this work, these joint representations are applied to the problem of automatic tag generation. The predicted tags together with their corresponding acoustic representation are exploited for the construction of acoust…

    Submitted 26 December, 2016; originally announced December 2016.