Showing 1–6 of 6 results for author: Akhtar, S S

  1. arXiv:1806.05600

    cs.CL

    Gender Prediction in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System

    Authors: Ankush Khandelwal, Sahil Swami, Syed Sarfaraz Akhtar, Manish Shrivastava

    Abstract: The rapid expansion in the use of social networking sites has produced a huge amount of unprocessed user-generated data that can be used for text mining. Author profiling, the problem of automatically determining aspects of an author's profile such as gender and age group from text, is gaining popularity in computational linguistics. Most of the past research in author profiling is co…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: 10 pages, CICLing 2018

  2. arXiv:1806.05513

    cs.CL

    Humor Detection in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System

    Authors: Ankush Khandelwal, Sahil Swami, Syed S. Akhtar, Manish Shrivastava

    Abstract: The tremendous amount of user-generated data on social networking sites has made automatic text classification increasingly popular in computational linguistics over the past decade. Within this domain, one problem that has drawn the attention of many researchers is automatic humor detection in text. In-depth semantic understanding of the text is required to detect humor whic…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: 5 pages, 1 figure, LREC 2018

    Journal ref: Khandelwal, Ankush, et al., "Humor Detection in English-Hindi Code-Mixed Social Media Content : Corpus and Baseline System". Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

  3. arXiv:1805.11869

    cs.CL

    A Corpus of English-Hindi Code-Mixed Tweets for Sarcasm Detection

    Authors: Sahil Swami, Ankush Khandelwal, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava

    Abstract: Social media platforms like Twitter and Facebook have become two of the largest mediums used by people to express their views towards different topics. The generation of such large volumes of user data has made NLP tasks like sentiment analysis and opinion mining much more important. Using sarcasm in texts on social media has become a popular trend lately. Using sarcasm reverses the meaning and polarity of w…

    Submitted 30 May, 2018; originally announced May 2018.

    Comments: 9 pages, CICLing 2018

  4. arXiv:1805.11868

    cs.CL

    An English-Hindi Code-Mixed Corpus: Stance Annotation and Baseline System

    Authors: Sahil Swami, Ankush Khandelwal, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava

    Abstract: Social media has become one of the main channels for people to communicate and share their views with society. We can often detect from these views whether the person is in favor of, against, or neutral towards a given topic. These opinions from social media are very useful for various companies. We present a new dataset that consists of 3545 English-Hindi code-mixed tweets with opinion toward…

    Submitted 30 May, 2018; originally announced May 2018.

    Comments: 9 pages, CICLing 2018

  5. arXiv:1711.05680

    cs.CL

    An Unsupervised Approach for Mapping between Vector Spaces

    Authors: Syed Sarfaraz Akhtar, Arihant Gupta, Avijit Vajpayee, Arjit Srivastava, Madan Gopal Jhawar, Manish Shrivastava

    Abstract: We present a language-independent, unsupervised approach for transforming word embeddings from a source language to a target language using a transformation matrix. Our model handles the problem of data scarcity, which is faced by many languages in the world, and yields improved word embeddings for words in the target language by relying on transformed embeddings of words of the source language. We init…

    Submitted 20 November, 2017; v1 submitted 15 November, 2017; originally announced November 2017.

    Comments: CICLing 2017

  6. arXiv:1711.05678

    cs.CL

    Unsupervised Morphological Expansion of Small Datasets for Improving Word Embeddings

    Authors: Syed Sarfaraz Akhtar, Arihant Gupta, Avijit Vajpayee, Arjit Srivastava, Manish Shrivastava

    Abstract: We present a language-independent, unsupervised method for building word embeddings using morphological expansion of text. Our model handles the problem of data sparsity and yields improved word embeddings by training them on artificially generated sentences. We evaluate our method using small training sets on eleven test sets for the word similarity task across seven l…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: CICLing 2017