Showing 51–100 of 107 results for author: Ramakrishnan, G

  1. arXiv:2103.05457  [pdf, other]

    cs.IR

    Rudder: A Cross Lingual Video and Text Retrieval Dataset

    Authors: Jayaprakash A, Abhishek, Rishabh Dabral, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Video retrieval using natural language queries requires learning semantically meaningful joint embeddings between the text and the audio-visual input. Often, such joint embeddings are learnt using pairwise (or triplet) contrastive loss objectives which cannot give enough attention to 'difficult-to-retrieve' samples during training. This problem is especially pronounced in data-scarce settings wher…

    Submitted 9 March, 2021; originally announced March 2021.

  2. arXiv:2103.00128  [pdf, other]

    cs.CV

    PRISM: A Rich Class of Parameterized Submodular Information Measures for Guided Subset Selection

    Authors: Suraj Kothawade, Vishal Kaushal, Ganesh Ramakrishnan, Jeff Bilmes, Rishabh Iyer

    Abstract: With ever-increasing dataset sizes, subset selection techniques are becoming increasingly important for a plethora of tasks. It is often necessary to guide the subset selection to achieve certain desiderata, which includes focusing or targeting certain data points, while avoiding others. Examples of such problems include: i) targeted learning, where the goal is to find subsets with rare classes or…

    Submitted 8 March, 2022; v1 submitted 26 February, 2021; originally announced March 2021.

    Comments: To Appear In 36th AAAI Conference on Artificial Intelligence, AAAI 2022

  3. arXiv:2103.00123  [pdf, other]

    cs.LG

    GRAD-MATCH: Gradient Matching based Data Subset Selection for Efficient Deep Model Training

    Authors: Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Abir De, Rishabh Iyer

    Abstract: The great success of modern machine learning models on large datasets is contingent on extensive computational resources with high financial and environmental costs. One way to address this is by extracting subsets that generalize on par with the full data. In this work, we propose a general framework, GRAD-MATCH, which finds subsets that closely match the gradient of the training or validation se…

    Submitted 11 June, 2021; v1 submitted 26 February, 2021; originally announced March 2021.

    Comments: To appear in Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  4. Towards Robustness to Label Noise in Text Classification via Noise Modeling

    Authors: Siddhant Garg, Goutham Ramakrishnan, Varun Thumbe

    Abstract: Large datasets in NLP suffer from noisy labels, due to erroneous automatic and human annotation procedures. We study the problem of text classification with label noise, and aim to capture this noise through an auxiliary noise model over the classifier. We first assign a probability score to each training sample of having a noisy label, through a beta mixture model fitted on the losses at an early…

    Submitted 7 November, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Accepted at CIKM'21 (30th ACM International Conference on Information & Knowledge Management). Accepted at ICLR 2021 RobustML and S2D-OLAD Workshops

  5. arXiv:2101.10514  [pdf, other]

    cs.CV cs.MM

    How Good is a Video Summary? A New Benchmarking Dataset and Evaluation Framework Towards Realistic Video Summarization

    Authors: Vishal Kaushal, Suraj Kothawade, Anshul Tomar, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: Automatic video summarization is still an unsolved problem due to several challenges. The currently available datasets either have very short videos or have few long videos of only a particular type. We introduce a new benchmarking video dataset called VISIOCITY (VIdeo SummarIzatiOn based on Continuity, Intent and DiversiTY) which comprises of longer videos across six different categories with den…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: 19 pages, 6 tables, 4 figures. arXiv admin note: substantial text overlap with arXiv:2007.14560

  6. arXiv:2101.10368  [pdf, other]

    cs.CL

    Meta-Learning for Effective Multi-task and Multilingual Modelling

    Authors: Ishan Tarunesh, Sushil Khyalia, Vishwajeet Kumar, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Natural language processing (NLP) tasks (e.g. question-answering in English) benefit from knowledge of other tasks (e.g. named entity recognition in English) and knowledge of other languages (e.g. question-answering in Spanish). Such shared representations are typically learned in isolation, either across tasks or across languages. In this work, we propose a meta-learning approach to learn the int…

    Submitted 22 March, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

    Comments: In Proceedings of The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)

  7. arXiv:2101.04997  [pdf, other]

    cs.LG cs.CL

    Joint Learning of Hyperbolic Label Embeddings for Hierarchical Multi-label Classification

    Authors: Soumya Chatterjee, Ayush Maheshwari, Ganesh Ramakrishnan, Saketha Nath Jagaralpudi

    Abstract: We consider the problem of multi-label classification where the labels lie in a hierarchy. However, unlike most existing works in hierarchical multi-label classification, we do not assume that the label-hierarchy is known. Encouraged by the recent success of hyperbolic embeddings in capturing hierarchical relations, we propose to jointly learn the classifier parameters as well as the label embeddi…

    Submitted 13 January, 2021; originally announced January 2021.

    Comments: 10 pages, 2 figures. To appear at EACL 2021

  8. arXiv:2012.10630  [pdf, other]

    cs.LG cs.AI

    GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

    Authors: Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: Large scale machine learning and deep models are extremely data-hungry. Unfortunately, obtaining large amounts of labeled data is expensive, and training state-of-the-art models (with hyperparameter tuning) requires significant computing resources and time. Secondly, real-world data is noisy and imbalanced. As a result, several recent papers try to make the training process more efficient and robu…

    Submitted 11 June, 2021; v1 submitted 19 December, 2020; originally announced December 2020.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence 35(9) (2021): 8110-8118

  9. LIGHTEN: Learning Interactions with Graph and Hierarchical TEmporal Networks for HOI in videos

    Authors: Sai Praneeth Reddy Sunkesula, Rishabh Dabral, Ganesh Ramakrishnan

    Abstract: Analyzing the interactions between humans and objects from a video includes identification of the relationships between humans and the objects present in the video. It can be thought of as a specialized version of Visual Relationship Detection, wherein one of the objects must be a human. While traditional methods formulate the problem as inference on a sequence of video segments, we present a hier…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: 9 pages, 6 figures, ACM Multimedia Conference 2020

    ACM Class: I.2.10

    Journal ref: MM20 Proceedings of the 28th ACM International Conference on Multimedia, October 2020, Pages 691 to 699

  10. arXiv:2011.07555  [pdf, other]

    cs.CR cs.CY

    Towards Compliant Data Management Systems for Healthcare ML

    Authors: Goutham Ramakrishnan, Aditya Nori, Hannah Murfet, Pashmina Cameron

    Abstract: The increasing popularity of machine learning approaches and the rising awareness of data protection and data privacy presents an opportunity to build truly secure and trustworthy healthcare systems. Regulations such as GDPR and HIPAA present broad guidelines and frameworks, but the implementation can present technical challenges. Compliant data management systems require enforcement of a number o…

    Submitted 15 November, 2020; originally announced November 2020.

  11. arXiv:2010.05631  [pdf, other]

    cs.LG cs.CV

    A Unified Framework for Generic, Query-Focused, Privacy Preserving and Update Summarization using Submodular Information Measures

    Authors: Vishal Kaushal, Suraj Kothawade, Ganesh Ramakrishnan, Jeff Bilmes, Himanshu Asnani, Rishabh Iyer

    Abstract: We study submodular information measures as a rich framework for generic, query-focused, privacy sensitive, and update summarization tasks. While past work generally treats these problems differently (e.g., different models are often used for generic and query-focused summarization), the submodular information measures allow us to study each of these problems via a unified approach. We first…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 35 pages, 14 figures, 5 tables

  12. arXiv:2008.09887  [pdf, other]

    cs.LG stat.ML

    Semi-Supervised Data Programming with Subset Selection

    Authors: Ayush Maheshwari, Oishik Chatterjee, KrishnaTeja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer

    Abstract: The paradigm of data programming, which uses weak supervision in the form of rules/labelling functions, and semi-supervised learning, which augments small amounts of labelled data with a large unlabelled dataset, have shown great promise in several text classification scenarios. In this work, we argue that by not using any labelled data, data programming based approaches can yield sub-optimal perf…

    Submitted 12 June, 2021; v1 submitted 22 August, 2020; originally announced August 2020.

    Comments: Findings of ACL, 2021

  13. arXiv:2007.14560  [pdf, other]

    cs.CV cs.IR cs.LG cs.MM

    Realistic Video Summarization through VISIOCITY: A New Benchmark and Evaluation Framework

    Authors: Vishal Kaushal, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: Automatic video summarization is still an unsolved problem due to several challenges. We take steps towards making automatic video summarization more realistic by addressing them. Firstly, the currently available datasets either have very short videos or have few long videos of only a particular type. We introduce a new benchmarking dataset VISIOCITY which comprises of longer videos across six dif…

    Submitted 25 August, 2020; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: 19 pages, 1 figure, 14 tables

  14. Backdoors in Neural Models of Source Code

    Authors: Goutham Ramakrishnan, Aws Albarghouthi

    Abstract: Deep neural networks are vulnerable to a range of adversaries. A particularly pernicious class of vulnerabilities are backdoors, where model predictions diverge in the presence of subtle triggers in inputs. An attacker can implant a backdoor by poisoning the training data to yield a desired target prediction on triggered inputs. We study backdoors in the context of deep-learning for source code. (…

    Submitted 11 June, 2020; originally announced June 2020.

  15. arXiv:2005.04316  [pdf, other]

    quant-ph cs.LG

    Advances in Quantum Deep Learning: An Overview

    Authors: Siddhant Garg, Goutham Ramakrishnan

    Abstract: The last few decades have seen significant breakthroughs in the fields of deep learning and quantum computing. Research at the junction of the two fields has garnered an increasing amount of interest, which has led to the development of quantum deep learning and quantum-inspired deep learning techniques in recent times. In this work, we present an overview of advances in the intersection of quantu…

    Submitted 8 May, 2020; originally announced May 2020.

  16. BAE: BERT-based Adversarial Examples for Text Classification

    Authors: Siddhant Garg, Goutham Ramakrishnan

    Abstract: Modern text classification models are susceptible to adversarial examples, perturbed versions of the original text indiscernible by humans which get misclassified by the model. Recent works in NLP use rule-based synonym replacement strategies to generate adversarial examples. These strategies can lead to out-of-context and unnaturally complex token replacements, which are easily identifiable by hu…

    Submitted 7 October, 2020; v1 submitted 4 April, 2020; originally announced April 2020.

    Comments: Accepted at EMNLP 2020 Main Conference

  17. arXiv:2003.10433  [pdf, ps, other]

    q-bio.NC cs.LG eess.SP

    Decoding Imagined Speech using Wavelet Features and Deep Neural Networks

    Authors: Jerrin Thomas Panachakel, A. G. Ramakrishnan

    Abstract: This paper proposes a novel approach that uses deep neural networks for classifying imagined speech, significantly increasing the classification accuracy. The proposed approach employs only the EEG channels over specific areas of the brain for classification, and derives distinct feature vectors from each of those channels. This gives us more data to train a classifier, enabling us to use deep lea…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: Preprint of the paper presented in 2019 IEEE 16th India Council International Conference (INDICON). arXiv admin note: substantial text overlap with arXiv:2003.09374

  18. arXiv:2003.10212  [pdf, other]

    q-bio.NC cs.AI eess.SP

    An Improved EEG Acquisition Protocol Facilitates Localized Neural Activation

    Authors: Jerrin Thomas Panachakel, Nandagopal Netrakanti Vinayak, Maanvi Nunna, A. G. Ramakrishnan, Kanishka Sharma

    Abstract: This work proposes improvements in the electroencephalogram (EEG) recording protocols for motor imagery through the introduction of actual motor movement and/or somatosensory cues. The results obtained demonstrate the advantage of requiring the subjects to perform motor actions following the trials of imagery. By introducing motor actions in the protocol, the subjects are able to perform actual mo…

    Submitted 13 March, 2020; originally announced March 2020.

    Comments: Preprint of the paper presented at ComNet 2019

  19. arXiv:2003.09374  [pdf, other]

    eess.SP cs.LG stat.ML

    A Novel Deep Learning Architecture for Decoding Imagined Speech from EEG

    Authors: Jerrin Thomas Panachakel, A. G. Ramakrishnan, T. V. Ananthapadmanabha

    Abstract: The recent advances in the field of deep learning have not been fully utilised for decoding imagined speech primarily because of the unavailability of sufficient training samples to train a deep network. In this paper, we present a novel architecture that employs deep neural network (DNN) for classifying the words "in" and "cooperate" from the corresponding EEG signals in the ASU imagined speech d…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: Preprint of the paper presented at IEEE AIBEC 2019, Austria

  20. Semantic Robustness of Models of Source Code

    Authors: Goutham Ramakrishnan, Jordan Henkel, Zi Wang, Aws Albarghouthi, Somesh Jha, Thomas Reps

    Abstract: Deep neural networks are vulnerable to adversarial examples - small input perturbations that result in incorrect predictions. We study this problem for models of source code, where we want the network to be robust to source-code modifications that preserve code functionality. (1) We define a powerful adversary that can employ sequences of parametric, semantics-preserving program transformations; (…

    Submitted 11 June, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  21. arXiv:1911.09860  [pdf, other]

    cs.LG cs.CL stat.ML

    Data Programming using Continuous and Quality-Guided Labeling Functions

    Authors: Oishik Chatterjee, Ganesh Ramakrishnan, Sunita Sarawagi

    Abstract: Scarcity of labeled data is a bottleneck for supervised learning models. A paradigm that has evolved for dealing with this problem is data programming. An existing data programming paradigm allows human supervision to be provided as a set of discrete labeling functions (LF) that output possibly noisy labels to input instances and a generative model for consolidating the weak labels. We enhance and…

    Submitted 22 November, 2019; originally announced November 2019.

    Comments: Accepted paper at the 34th AAAI Conference on Artificial Intelligence (AAAI-20), New York, USA

  22. arXiv:1911.03407  [pdf, other]

    cs.CL

    Question Generation from Paragraphs: A Tale of Two Hierarchical Models

    Authors: Vishwajeet Kumar, Raktim Chaki, Sai Teja Talluri, Ganesh Ramakrishnan, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Automatic question generation from paragraphs is an important and challenging problem, particularly due to the long context from paragraphs. In this paper, we propose and study two hierarchical models for the task of question generation from paragraphs. Specifically, we propose (a) a novel hierarchical BiLSTM model with selective attention and (b) a novel hierarchical Transformer architecture, bot…

    Submitted 8 November, 2019; originally announced November 2019.

  23. Synthesizing Action Sequences for Modifying Model Decisions

    Authors: Goutham Ramakrishnan, Yun Chan Lee, Aws Albarghouthi

    Abstract: When a model makes a consequential decision, e.g., denying someone a loan, it needs to additionally generate actionable, realistic feedback on what the person can do to favorably change the decision. We cast this problem through the lens of program synthesis, in which our goal is to synthesize an optimal (realistically cheapest or simplest) sequence of actions that if a person executes successfull…

    Submitted 9 October, 2019; v1 submitted 30 September, 2019; originally announced October 2019.

  24. arXiv:1909.10854  [pdf, other]

    cs.CV

    Multi-Person 3D Human Pose Estimation from Monocular Images

    Authors: Rishabh Dabral, Nitesh B Gundavarapu, Rahul Mitra, Abhishek Sharma, Ganesh Ramakrishnan, Arjun Jain

    Abstract: Multi-person 3D human pose estimation from a single image is a challenging problem, especially for in-the-wild settings due to the lack of 3D annotated data. We propose HG-RCNN, a Mask-RCNN based network that also leverages the benefits of the Hourglass architecture for multi-person 3D Human Pose Estimation. A two-staged approach is presented that first estimates the 2D keypoints in every Region o…

    Submitted 24 September, 2019; originally announced September 2019.

    Comments: 3DV 2019

  25. arXiv:1909.01642  [pdf, other]

    cs.CL

    ParaQG: A System for Generating Questions and Answers from Paragraphs

    Authors: Vishwajeet Kumar, Sivaanandh Muneeswaran, Ganesh Ramakrishnan, Yuan-Fang Li

    Abstract: Generating syntactically and semantically valid and relevant questions from paragraphs is useful with many applications. Manual generation is a labour-intensive task, as it requires the reading, parsing and understanding of long passages of text. A number of question generation models based on sequence-to-sequence techniques have recently been proposed. Most of them generate questions from sentenc…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  26. arXiv:1908.07018  [pdf, other]

    cs.IR cs.CL cs.LG

    Tale of tails using rule augmented sequence labeling for event extraction

    Authors: Ayush Maheshwari, Hrishikesh Patel, Nandan Rathod, Ritesh Kumar, Ganesh Ramakrishnan, Pushpak Bhattacharyya

    Abstract: The problem of event extraction is a relatively difficult task for low resource languages due to the non-availability of sufficient annotated data. Moreover, the task becomes complex for tail (rarely occurring) labels wherein extremely less data is available. In this paper, we present a new dataset (InDEE-2019) in the disaster domain for multiple Indic languages, collected from news websites. Usin…

    Submitted 31 January, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

    Comments: 9 pages, 4 figures, 6 tables

    Journal ref: StarAI Workshop at AAAI 2020

  27. arXiv:1906.02525  [pdf, other]

    cs.CL

    Cross-Lingual Training for Automatic Question Generation

    Authors: Vishwajeet Kumar, Nitish Joshi, Arijit Mukherjee, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Automatic question generation (QG) is a challenging problem in natural language understanding. QG systems are typically built assuming access to a large number of training instances where each instance is a question and its corresponding answer. For a new language, such training instances are hard to obtain making the QG problem even more challenging. Using this as our motivation, we study the reu…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  28. arXiv:1902.05411  [pdf, other]

    cs.CV cs.LG stat.ML

    Improving Facial Emotion Recognition Systems Using Gradient and Laplacian Images

    Authors: Ram Krishna Pandey, Souvik Karmakar, A G Ramakrishnan, Nabagata Saha

    Abstract: In this work, we have proposed several enhancements to improve the performance of any facial emotion recognition (FER) system. We believe that the changes in the positions of the fiducial points and the intensities capture the crucial information regarding the emotion of a face image. We propose the use of the gradient and the Laplacian of the input image together with the original input into a co…

    Submitted 12 February, 2019; originally announced February 2019.

  29. arXiv:1901.03088  [pdf, other]

    cs.CV

    Fast GPU-Enabled Color Normalization for Digital Pathology

    Authors: Goutham Ramakrishnan, Deepak Anand, Amit Sethi

    Abstract: Normalizing unwanted color variations due to differences in staining processes and scanner responses has been shown to aid machine learning in computational pathology. Of the several popular techniques for color normalization, structure preserving color normalization (SPCN) is well-motivated, convincingly tested, and published with its code base. However, SPCN makes occasional errors in color basi…

    Submitted 10 January, 2019; originally announced January 2019.

  30. arXiv:1901.01153  [pdf, other]

    cs.CV

    Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance

    Authors: Vishal Kaushal, Rishabh Iyer, Khoshrav Doctor, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan

    Abstract: This paper addresses automatic summarization of videos in a unified manner. In particular, we propose a framework for multi-faceted summarization for extractive, query base and entity summarization (summarization at the level of entities like objects, scenes, humans and faces in the video). We investigate several summarization models which capture notions of diversity, coverage, representation and…

    Submitted 3 January, 2019; originally announced January 2019.

    Comments: Accepted to WACV 2019. arXiv admin note: substantial text overlap with arXiv:1704.01466, arXiv:1809.08846

  31. arXiv:1901.01151  [pdf, other]

    cs.CV

    Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision

    Authors: Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, Ganesh Ramakrishnan

    Abstract: Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry. Their data curation poses the challenges of expensive human labeling, inadequate computing resources and larger experiment turn around times. Training data subset selection and active learning techniques have been proposed as possible solutions to these challenges. A special class of subset se…

    Submitted 3 January, 2019; originally announced January 2019.

    Comments: Accepted to WACV 2019. arXiv admin note: substantial text overlap with arXiv:1805.11191

  32. arXiv:1812.02475  [pdf, other]

    cs.CV

    Binary Document Image Super Resolution for Improved Readability and OCR Performance

    Authors: Ram Krishna Pandey, K Vignesh, A G Ramakrishnan, Chandrahasa B

    Abstract: There is a need for information retrieval from large collections of low-resolution (LR) binary document images, which can be found in digital libraries across the world, where the high-resolution (HR) counterpart is not available. This gives rise to the problem of binary document image super-resolution (BDISR). The objective of this paper is to address the interesting and challenging problem of su…

    Submitted 6 December, 2018; originally announced December 2018.

  33. arXiv:1812.02447  [pdf, other]

    eess.AS cs.SD

    Pitch-synchronous DCT features: A pilot study on speaker identification

    Authors: Amit Meghanani, A G Ramakrishnan

    Abstract: We propose a new feature, namely, pitch-synchronous discrete cosine transform (PS-DCT), for the task of speaker identification. These features are obtained directly from the voiced segments of the speech signal, without any preemphasis or windowing. The feature vectors are vector quantized, to create one separate codebook for each speaker during training. The performance of the PS-DCT features is s…

    Submitted 6 December, 2018; originally announced December 2018.

  34. arXiv:1809.08854  [pdf, other]

    cs.CV cs.LG

    A Framework towards Domain Specific Video Summarization

    Authors: Vishal Kaushal, Sandeep Subramanian, Suraj Kothawade, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: In the light of exponentially increasing video content, video summarization has attracted a lot of attention recently due to its ability to optimize time and storage. Characteristics of a good summary of a video depend on the particular domain under question. We propose a novel framework for domain specific video summarization. Given a video of a particular domain, our system can produce a summary…

    Submitted 28 December, 2018; v1 submitted 24 September, 2018; originally announced September 2018.

    Comments: Accepted to WACV 2019

  35. arXiv:1809.00961  [pdf, other]

    cs.CV cs.LG stat.ML

    MSCE: An edge preserving robust loss function for improving super-resolution algorithms

    Authors: Ram Krishna Pandey, Nabagata Saha, Samarjit Karmakar, A G Ramakrishnan

    Abstract: With the recent advancement in the deep learning technologies such as CNNs and GANs, there is significant improvement in the quality of the images reconstructed by deep learning based super-resolution (SR) techniques. In this work, we propose a robust loss function based on the preservation of edges obtained by the Canny operator. This loss function, when combined with the existing loss function s…

    Submitted 25 August, 2018; originally announced September 2018.

    Comments: Accepted in ICONIP-2018

  36. arXiv:1808.09432  [pdf, other]

    eess.AS cs.SD

    Using Monte Carlo dropout for non-stationary noise reduction from speech

    Authors: Nazreen P. M., A. G. Ramakrishnan

    Abstract: In this work, we propose the use of dropout as a Bayesian estimator for increasing the generalizability of a deep neural network (DNN) for speech enhancement. By using Monte Carlo (MC) dropout, we show that the DNN performs better enhancement in unseen noise and SNR conditions. The DNN is trained on speech corrupted with Factory2, M109, Babble, Leopard and Volvo noises at SNRs of 0, 5 and 10 dB. S…

    Submitted 28 August, 2018; originally announced August 2018.

    Comments: This article draws from our previous work arXiv:1806.00516

  37. arXiv:1808.04961  [pdf, other]

    cs.CL

    Putting the Horse Before the Cart: A Generator-Evaluator Framework for Question Generation from Text

    Authors: Vishwajeet Kumar, Ganesh Ramakrishnan, Yuan-Fang Li

    Abstract: Automatic question generation (QG) is a useful yet challenging task in NLP. Recent neural network-based approaches represent the state-of-the-art in this task. In this work, we attempt to strengthen them significantly by adopting a holistic and novel generator-evaluator framework that directly optimizes objectives that reward semantics and structure. The generator is a sequence-to-sequence m…

    Submitted 15 September, 2019; v1 submitted 15 August, 2018; originally announced August 2018.

    Comments: 10 pages, The SIGNLL Conference on Computational Natural Language Learning (CoNLL 2019)

  38. arXiv:1807.05927  [pdf, other]

    cs.CV

    Computationally Efficient Approaches for Image Style Transfer

    Authors: Ram Krishna Pandey, Samarjit Karmakar, A G Ramakrishnan

    Abstract: In this work, we have investigated various style transfer approaches and (i) examined how the stylized reconstruction changes with the change of loss function and (ii) provided a computationally efficient solution for the same. We have used elegant techniques like depth-wise separable convolution in place of convolution and nearest neighbor interpolation in place of transposed convolution. Further…

    Submitted 16 July, 2018; originally announced July 2018.

  39. arXiv:1807.05813  [pdf, other]

    cs.SD eess.AS

    Subjective and objective experiments on the influence of speaker's gender on the unvoiced segments

    Authors: A Madhavaraj, T V Ananthapadmanabha, A G Ramakrishnan

    Abstract: Subjective and objective experiments are conducted to understand the extent to which a speaker's gender influences the acoustics of unvoiced (U) sounds. U segments of utterances are replaced by the corresponding segments of a speaker of opposite gender to prepare modified utterances. Humans are asked to judge if the modified utterance is spoken by one or two speakers. The experiments show that hum…

    Submitted 16 July, 2018; originally announced July 2018.

    Comments: 2 Figures, 5 Pages

  40. arXiv:1806.00516  [pdf, other]

    eess.AS cs.SD

    DNN Based Speech Enhancement for Unseen Noises Using Monte Carlo Dropout

    Authors: Nazreen P M, A G Ramakrishnan

    Abstract: In this work, we propose the use of dropouts as a Bayesian estimator for increasing the generalizability of a deep neural network (DNN) for speech enhancement. By using Monte Carlo (MC) dropout, we show that the DNN performs better enhancement in unseen noise and SNR conditions. The DNN is trained on speech corrupted with Factory2, M109, Babble, Leopard and Volvo noises at SNRs of 0, 5 and 10 dB a…

    Submitted 1 June, 2018; originally announced June 2018.

  41. arXiv:1805.11191  [pdf, other]

    cs.CV cs.LG stat.ML

    Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks

    Authors: Vishal Kaushal, Anurag Sahoo, Khoshrav Doctor, Narasimha Raju, Suyash Shetty, Pankaj Singh, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry and pose the challenges of not having adequate computing resources and of high costs involved in human labeling efforts. Training data subset selection and active learning techniques have been proposed as possible solutions to these challenges respectively. A special class of subset selection f…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: 15 pages, 7 figures

  42. arXiv:1805.09400  [pdf, other]

    cs.CV

    A hybrid approach of interpolations and CNN to obtain super-resolution

    Authors: Ram Krishna Pandey, A G Ramakrishnan

    Abstract: We propose a novel architecture that learns an end-to-end mapping function to improve the spatial resolution of the input natural images. The model is unique in forming a nonlinear combination of three traditional interpolation techniques using the convolutional neural network. Another proposed architecture uses a skip connection with nearest neighbor interpolation, achieving almost similar result…

    Submitted 23 May, 2018; originally announced May 2018.

    Report number: TIP-19077-2018

  43. arXiv:1805.09233  [pdf, other]

    cs.CV

    Segmentation of Liver Lesions with Reduced Complexity Deep Models

    Authors: Ram Krishna Pandey, Aswin Vasan, A G Ramakrishnan

    Abstract: We propose a computationally efficient architecture that learns to segment lesions from CT images of the liver. The proposed architecture uses bilinear interpolation with sub-pixel convolution at the last layer to upscale the coarse feature in bottleneck architecture. Since bilinear interpolation and sub-pixel convolution do not have any learnable parameter, our overall model is faster and occupi…

    Submitted 23 May, 2018; originally announced May 2018.

  44. arXiv:1803.03664  [pdf, other]

    cs.CL cs.AI

    Automating Reading Comprehension by Generating Question and Answer Pairs

    Authors: Vishwajeet Kumar, Kireeti Boorla, Yogesh Meena, Ganesh Ramakrishnan, Yuan-Fang Li

    Abstract: Neural network-based methods represent the state-of-the-art in question generation from text. Existing work focuses on generating only questions from text without concerning itself with answer generation. Moreover, our analysis shows that handling rare words and generating the most appropriate question given a candidate answer are still challenges facing existing approaches. We present a novel two…

    Submitted 7 March, 2018; originally announced March 2018.

    Comments: 12 pages, 3 figures, 2 tables, Accepted for publication at 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2018

  45. arXiv:1706.00973  [pdf, ps, other]

    cs.IR

    Neural Architecture for Question Answering Using a Knowledge Graph and Web Corpus

    Authors: Uma Sawant, Saurabh Garg, Soumen Chakrabarti, Ganesh Ramakrishnan

    Abstract: In Web search, entity-seeking queries often trigger a special Question Answering (QA) system. It may use a parser to interpret the question to a structured query, execute that on a knowledge graph (KG), and return direct entity responses. QA systems based on precise parsing tend to be brittle: minor syntax variations may dramatically change the response. Moreover, KG coverage is patchy. At the oth…

    Submitted 6 December, 2018; v1 submitted 3 June, 2017; originally announced June 2017.

    Comments: Accepted to Information Retrieval Journal

  46. arXiv:1705.02562  [pdf, ps, other]

    cs.LG

    Learning Discriminative Relational Features for Sequence Labeling

    Authors: Naveen Nair, Ajay Nagesh, Ganesh Ramakrishnan

    Abstract: Discovering relational structure between input features in sequence labeling models has been shown to improve their accuracy in several problem settings. However, the search space of relational features is exponential in the number of basic input features. Consequently, approaches that learn relational features tend to follow a greedy search strategy. In this paper, we study the possibility of optimal…

    Submitted 7 May, 2017; originally announced May 2017.

    Comments: 13 pages, technical report

  47. arXiv:1704.01466  [pdf, other]

    cs.CV cs.DM

    A Unified Multi-Faceted Video Summarization System

    Authors: Anurag Sahoo, Vishal Kaushal, Khoshrav Doctor, Suyash Shetty, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: This paper addresses automatic summarization and search in visual data comprising videos, live streams and image collections in a unified manner. In particular, we propose a framework for multi-faceted summarization which extracts key-frames (image summaries), skims (video summaries) and entity summaries (summarization at the level of entities like objects, scenes, humans and faces in the video…

    Submitted 4 April, 2017; originally announced April 2017.

    Comments: 18 pages, 11 Figures

  48. arXiv:1701.08835  [pdf, other]

    cs.CV

    Language Independent Single Document Image Super-Resolution using CNN for improved recognition

    Authors: Ram Krishna Pandey, A G Ramakrishnan

    Abstract: Recognition of document images has important applications in restoring old and classical texts. The problem involves quality improvement before passing it to a properly trained OCR to get accurate recognition of the text. The image enhancement and quality improvement constitute important steps as subsequent recognition depends upon the quality of the input image. There are scenarios when high res…

    Submitted 30 January, 2017; originally announced January 2017.

  49. arXiv:1609.09764  [pdf, ps, other]

    cs.SD

    Adaptive dictionary based approach for background noise and speaker classification and subsequent source separation

    Authors: K V Vijay Girish, A G Ramakrishnan, T V Ananthapadmanabha

    Abstract: A judicious combination of dictionary learning methods, block sparsity and a source recovery algorithm is used in a hierarchical manner to identify the noises and the speakers from a noisy conversation between two people. Conversations are simulated using speech from two speakers, each with a different background noise, with varied SNR values, down to -10 dB. Ten each of randomly chosen male and fe…

    Submitted 28 October, 2016; v1 submitted 30 September, 2016; originally announced September 2016.

    Comments: 12 pages

  50. arXiv:1609.05104  [pdf, other]

    cs.SD cs.CL

    Intrinsic normalization and extrinsic denormalization of formant data of vowels

    Authors: T. V. Ananthapadmanabha, A. G. Ramakrishnan

    Abstract: Using a known speaker-intrinsic normalization procedure, formant data are scaled by the reciprocal of the geometric mean of the first three formant frequencies. This reduces the influence of the talker but results in a distorted vowel space. The proposed speaker-extrinsic procedure re-scales the normalized values by the mean formant values of vowels. When tested on the formant data of vowels publi…

    Submitted 10 December, 2016; v1 submitted 16 September, 2016; originally announced September 2016.

    Comments: 18 pages, 8 figures. Title has been revised. Appendix has been added to include more figures and to clarify 'hypothesize-test' procedure, JASA-EL, 2016