Search | arXiv e-print repository

Pseudo-OOD training for robust language models

Authors: Dhanasekar Sundararaman, Nikhil Mehta, Lawrence Carin

Abstract: While pre-trained large-scale deep models have garnered attention as an important topic for many downstream natural language processing (NLP) tasks, such models often make unreliable predictions on out-of-distribution (OOD) inputs. As such, OOD detection is a key component of a reliable machine-learning model for any industry-scale application. Common approaches often assume access to additional OOD samples during the training stage, however, outlier distribution is often unknown in advance. Instead, we propose a post hoc framework called POORE - POsthoc pseudo-Ood REgularization, that generates pseudo-OOD samples using in-distribution (IND) data. The model is fine-tuned by introducing a new regularization loss that separates the embeddings of IND and OOD data, which leads to significant gains on the OOD prediction task during testing. We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.

Submitted 17 October, 2022; originally announced October 2022.

Comments: Work in progress

arXiv:2208.01755 [pdf, ps, other]

Debiasing Gender Bias in Information Retrieval Models

Authors: Dhanasekar Sundararaman, Vivek Subramanian

Abstract: Biases in culture, gender, ethnicity, etc. have existed for decades and have affected many areas of human social interaction. These biases have been shown to impact machine learning (ML) models, and for natural language processing (NLP), this can have severe consequences for downstream tasks. Mitigating gender bias in information retrieval (IR) is important to avoid propagating stereotypes. In this work, we employ a dataset consisting of two components: (1) relevance of a document to a query and (2) "gender" of a document, in which pronouns are replaced by male, female, and neutral conjugations. We definitively show that pre-trained models for IR do not perform well in zero-shot retrieval tasks when full fine-tuning of a large pre-trained BERT encoder is performed and that lightweight fine-tuning performed with adapter networks improves zero-shot retrieval performance almost by 20% over baseline. We also illustrate that pre-trained models have gender biases that result in retrieved articles tending to be more often male than female. We overcome this by introducing a debiasing technique that penalizes the model when it prefers males over females, resulting in an effective model that retrieves articles in a balanced fashion across genders.

Submitted 20 September, 2022; v1 submitted 2 August, 2022; originally announced August 2022.

Comments: Updated title to be reflective of the methods

arXiv:2205.03559 [pdf, other]

Improving Downstream Task Performance by Treating Numbers as Entities

Authors: Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Liyan Xu, Lawrence Carin

Abstract: Numbers are essential components of text, like any other word tokens, from which natural language processing (NLP) models are built and deployed. Though numbers are typically not accounted for distinctly in most NLP tasks, there is still an underlying amount of numeracy already exhibited by NLP models. In this work, we attempt to tap this potential of state-of-the-art NLP models and transfer their ability to boost performance in related tasks. Our proposed classification of numbers into entities helps NLP models perform well on several tasks, including a handcrafted Fill-In-The-Blank (FITB) task and on question answering using joint embeddings, outperforming the BERT and RoBERTa baseline classification.

Submitted 18 September, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

Comments: Accepted to CIKM 2022

arXiv:2201.00075 [pdf, other]

How do lexical semantics affect translation? An empirical study

Authors: Vivek Subramanian, Dhanasekar Sundararaman

Abstract: Neural machine translation (NMT) systems aim to map text from one language into another. While there are a wide variety of applications of NMT, one of the most important is translation of natural language. A distinguishing factor of natural language is that words are typically ordered according to the rules of the grammar of a given language. Although many advances have been made in developing NMT systems for translating natural language, little research has been done on understanding how the word ordering of and lexical similarity between the source and target language affect translation performance. Here, we investigate these relationships on a variety of low-resource language pairs from the OpenSubtitles2016 database, where the source language is English, and find that the more similar the target language is to English, the greater the translation performance. In addition, we study the impact of providing NMT models with part of speech of words (POS) in the English sequence and find that, for Transformer-based models, the more dissimilar the target language is from English, the greater the benefit provided by POS.

Submitted 31 December, 2021; originally announced January 2022.

arXiv:2110.12345 [pdf]

Quantitative Analysis of Demand Response Using Thermostatically Controlled Loads

Authors: Praveen Dhanasekar, Cunzhi Zhao, Xingpeng Li

Abstract: The flexible power consumption feature of thermostatically controlled loads (TCLs) such as heating, ventilation, and air-conditioning (HVAC) systems makes them attractive targets for demand response (DR). TCLs possess a brief period where their power utilization can be altered without any significant impact on customer comfort level. This indicates TCLs are hidden potentials for providing ancillary services. This paper proposes a novel metric of demand response support time (DRST) for HVAC enabled demand response and a novel algorithm for the quantification of such HVAC-DR. The consumers' comfort will not be compromised with the proposed DRST-based HVAC-DR. Case studies demonstrate its benefits in terms of cost saving in microgrid day-ahead scheduling and reduction of forced load shedding during a grid-microgrid tie-line outage event. This illustrates the reserve potential benefits and the increase of microgrid reliability when DRST-based HVAC-DR is considered.

Submitted 23 October, 2021; originally announced October 2021.

arXiv:1911.06156 [pdf, other]

Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding

Authors: Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin

Abstract: Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks. The Transformer, for instance, is an illustrative example that generates abstract representations of tokens inputted to an encoder based on their relationships to all tokens in a sequence. Recent studies have shown that although such models are capable of learning syntactic features purely by seeing examples, explicitly feeding this information to deep learning models can significantly enhance their performance. Leveraging syntactic information like part of speech (POS) may be particularly beneficial in limited training data settings for complex models such as the Transformer. We show that the syntax-infused Transformer with multiple features achieves an improvement of 0.7 BLEU when trained on the full WMT 14 English to German translation dataset and a maximum improvement of 1.99 BLEU points when trained on a fraction of the dataset. In addition, we find that the incorporation of syntax into BERT fine-tuning outperforms baseline on a number of downstream tasks from the GLUE benchmark.

Submitted 9 November, 2019; originally announced November 2019.

arXiv:1911.01562 [pdf, other]

DeepRacer: Educational Autonomous Racing Platform for Experimentation with Sim2Real Reinforcement Learning

Authors: Bharathan Balaji, Sunil Mallya, Sahika Genc, Saurabh Gupta, Leo Dirac, Vineet Khare, Gourav Roy, Tao Sun, Yunzhe Tao, Brian Townsend, Eddie Calleja, Sunil Muralidhara, Dhanasekar Karuppasamy

Abstract: DeepRacer is a platform for end-to-end experimentation with RL and can be used to systematically investigate the key challenges in developing intelligent control systems. Using the platform, we demonstrate how a 1/18th scale car can learn to drive autonomously using RL with a monocular camera. It is trained in simulation with no additional tuning in physical world and demonstrates: 1) formulation and solution of a robust reinforcement learning algorithm, 2) narrowing the reality gap through joint perception and dynamics, 3) distributed on-demand compute architecture for training optimal policies, and 4) a robust evaluation method to identify when to stop training. It is the first successful large-scale deployment of deep reinforcement learning on a robotic control agent that uses only raw camera images as observations and a model-free learning method to perform robust path planning. We open source our code and video demo on GitHub: https://git.io/fjxoJ.

Submitted 4 November, 2019; originally announced November 2019.

arXiv:1906.08340 [pdf, other]

Learning Compressed Sentence Representations for On-Device Text Processing

Authors: Dinghan Shen, Pengyu Cheng, Dhanasekar Sundararaman, Xinyuan Zhang, Qian Yang, Meng Tang, Asli Celikyilmaz, Lawrence Carin

Abstract: Vector representations of sentences, trained on massive text corpora, are widely used as generic sentence embeddings across a variety of NLP problems. The learned representations are generally assumed to be continuous and real-valued, giving rise to a large memory footprint and slow retrieval speed, which hinders their applicability to low-resource (memory and computation) platforms, such as mobile devices. In this paper, we propose four different strategies to transform continuous and generic sentence embeddings into a binarized form, while preserving their rich semantic information. The introduced methods are evaluated across a wide range of downstream tasks, where the binarized sentence embeddings are demonstrated to degrade performance by only about 2% relative to their continuous counterparts, while reducing the storage requirement by over 98%. Moreover, with the learned binary representations, the semantic relatedness of two sentences can be evaluated by simply calculating their Hamming distance, which is more computationally efficient compared with the inner product operation between continuous embeddings. Detailed analysis and case study further validate the effectiveness of proposed methods.

Submitted 19 June, 2019; originally announced June 2019.

Comments: To appear at ACL 2019
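
Illustrative note (not part of the listing): the abstract above says that binarized sentence embeddings can be compared by Hamming distance instead of an inner product. The sketch below shows that comparison in plain Python. The `binarize` helper uses simple hard thresholding at zero, which is an assumption made here for illustration and is not claimed to be one of the paper's four strategies; the function names and the toy vectors are hypothetical.

```python
def binarize(embedding, threshold=0.0):
    """Pack a real-valued vector into an int, one bit per dimension.

    Hard thresholding is an illustrative assumption, not the paper's method.
    """
    bits = 0
    for value in embedding:
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two packed codes differ."""
    return bin(a ^ b).count("1")


# Toy 4-dimensional "sentence embeddings" (made-up values).
s1 = binarize([0.7, -0.2, 0.1, -0.9])    # bits 1010
s2 = binarize([0.5, -0.4, -0.3, -0.8])   # bits 1000
s3 = binarize([-0.6, 0.3, -0.1, 0.9])    # bits 0101

print(hamming_distance(s1, s2))  # small distance: codes mostly agree
print(hamming_distance(s1, s3))  # large distance: codes disagree everywhere
```

Packing the bits into an integer means the comparison reduces to one XOR and a bit count, which is the efficiency argument the abstract makes against floating-point inner products.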

arXiv:1806.01104 [pdf]

Consolidating the innovative concepts towards Exascale computing for Co-Design of Co-Applications ll: Co-Design Automation - Workload Characterization

Authors: Dhanasekar, Anirudh Seshadri, Sudharshan Srinivasan, Suryanarayanan, Akash Sridhar

Abstract: Many-core co-design is a complex task in which application complexity design space, heterogeneous many-core architecture design space, parallel programming language design space, simulator design space and optimizer design space should get integrated through a binding process and these design spaces, an ensemble of what is called many-core co-design spaces. It is indispensable to build a co-design automation process to dominate over the co-design complexity to cut down the turnaround time. The co-design automation is frame worked to comprehend the dependencies across the many-core co-design spaces and devise the logic behind these interdependencies using a set of algorithms. The software modules of these algorithms and the rest from the many-core co-design spaces interact to crop up the power-performance optimized heterogeneous many-core architecture specific for the simultaneous execution of co applications without space-time sharing. It is essential that such co-design automation has a built-in user-customizable workload generator to benchmark the emerging many-core architecture. This customizability benefits the generation of complex workloads with the desired computation complexity, communication complexity, control flow complexity, and locality of reference, specified under a distribution and established on quantitative models. In addition, the customizable workload model aids the generation of what is called computational and communication surges. None of the current day benchmark suites encompasses applications and kernels that can match the attributes of customizable workload model proposed in this paper. Aforementioned concepts are exemplified in the case study supported by simulation results gathered from the simulator.

Submitted 29 April, 2018; originally announced June 2018.

Comments: Revised Submission 2

arXiv:1711.10002 [pdf]

TweetIT- Analyzing Topics for Twitter Users to garner Maximum Attention

Authors: Dhanasekar Sundararaman, Priya Arora, Vishwanath Seshagiri

Abstract: Twitter, a microblogging service, is today's most popular platform for communication in the form of short text messages, called Tweets. Users use Twitter to publish their content either for expressing concerns on information news or views on daily conversations. When this expression emerges, they are experienced by the worldwide distribution network of users and not only by the interlocutor(s). Depending upon the impact of the tweet in the form of the likes, retweets and percentage of followers increases for the user considering a window of time frame, we compute attention factor for each tweet for the selected user profiles. This factor is used to select the top 1000 Tweets, from each user profile, to form a document. Topic modelling is then applied to this document to determine the intent of the user behind the Tweets. After topics are modelled, the similarity is determined between the BBC news data-set containing the modelled topic, and the user document under evaluation. Finally, we determine the top words for a user which would enable us to find the topics which garnered attention and has been posted recently. The experiment is performed using more than 1.1M Tweets from around 500 Twitter profiles spanning Politics, Entertainment, Sports etc. and hundreds of BBC news articles. The results show that our analysis is efficient enough to enable us to find the topics which would act as a suggestion for users to get higher popularity rating for the user in the future.

Submitted 27 November, 2017; originally announced November 2017.

arXiv:1711.09737 [pdf]

Rating the online review rating system using Yelp

Authors: Dhanasekar S, Balaji

Abstract: The impact of ratings on a restaurant plays a major role in attracting future customers to that restaurant. The word of mouth has been systematically replaced with the online reviews. It gives a sense of satisfaction for people to know beforehand about the number of average stars the restaurant has acquired before stepping into a restaurant. However, these ratings are indirectly biased based on the location, amenities, and the perception of individual people. In this work, we analyze the ratings of restaurants available through the Yelp public data for the discrepancies in the rating system and attempt to provide an optimized global rating system. For a frequent visitor to a high-end restaurant with lavish amenities, even a slightest of reduction in the expected ambiance may prompt a 4 star rating, while a restaurant, which guarantees a minimum taste for its food, may get a 5 star rating. These discrepancies can often be attributed to three factors: the perspective of individual people, features of the restaurant and Location. The perspective of individual people is always subjective and what seems good for one person may be poor for another. In this work, we focus on the other two important factors, Reviews and the features.

Submitted 10 May, 2018; v1 submitted 17 November, 2017; originally announced November 2017.

Comments: Version 1

arXiv:1711.06970 [pdf]

How much is my car worth? A methodology for predicting used cars prices using Random Forest

Authors: Nabarun Pal, Priya Arora, Dhanasekar Sundararaman, Puneet Kohli, Sai Sumanth Palakurthy

Abstract: Cars are being sold more than ever. Developing countries adopt the lease culture instead of buying a new car due to affordability. Therefore, the rise of used cars sales is exponentially increasing. Car sellers sometimes take advantage of this scenario by listing unrealistic prices owing to the demand. Therefore, arises a need for a model that can assign a price for a vehicle by evaluating its features taking the prices of other cars into consideration. In this paper, we use supervised learning method namely Random Forest to predict the prices of used cars. The model has been chosen after careful exploratory data analysis to determine the impact of each feature on price. A Random Forest with 500 Decision Trees was created to train the data. From experimental results, the training accuracy was found out to be 95.82%, and the testing accuracy was 83.63%. The model can predict the price of cars accurately by choosing the most correlated features.

Submitted 19 November, 2017; originally announced November 2017.

Comments: FICC Camera Ready

arXiv:1706.05361 [pdf]

Twigraph: Discovering and Visualizing Influential Words between Twitter Profiles

Authors: Dhanasekar Sundararaman, Sudharshan Srinivasan

Abstract: The social media craze is on an ever increasing spree, and people are connected with each other like never before, but these vast connections are visually unexplored. We propose a methodology Twigraph to explore the connections between persons using their Twitter profiles. First, we propose a hybrid approach of recommending social media profiles, articles, and advertisements to a user. The profiles are recommended based on the similarity score between the user profile, and profile under evaluation. The similarity between a set of profiles is investigated by finding the top influential words thus causing a high similarity through an Influence Term Metric for each word. Then, we group profiles of various domains such as politics, sports, and entertainment based on the similarity score through a novel clustering algorithm. The connectivity between profiles is envisaged using word graphs that help in finding the words that connect a set of profiles and the profiles that are connected to a word. Finally, we analyze the top influential words over a set of profiles through clustering by finding the similarity of that profiles enabling to break down a Twitter profile with a lot of followers to fine level word connections using word graphs. The proposed method was implemented on datasets comprising 1.1 M Tweets obtained from Twitter. Experimental results show that the resultant influential words were highly representative of the relationship between two profiles or a set of profiles.

Submitted 29 June, 2017; v1 submitted 16 June, 2017; originally announced June 2017.

arXiv:1509.07543 [pdf, other]

On Optimizing Human-Machine Task Assignments

Authors: Andreas Veit, Michael Wilber, Rajan Vaish, Serge Belongie, James Davis, Vishal Anand, Anshu Aviral, Prithvijit Chakrabarty, Yash Chandak, Sidharth Chaturvedi, Chinmaya Devaraj, Ankit Dhall, Utkarsh Dwivedi, Sanket Gupte, Sharath N. Sridhar, Karthik Paga, Anuj Pahuja, Aditya Raisinghani, Ayush Sharma, Shweta Sharma, Darpana Sinha, Nisarg Thakkar, K. Bala Vignesh, Utkarsh Verma, Kanniganti Abhishek, et al. (26 additional authors not shown)

Abstract: When crowdsourcing systems are used in combination with machine inference systems in the real world, they benefit the most when the machine system is deeply integrated with the crowd workers. However, if researchers wish to integrate the crowd with "off-the-shelf" machine classifiers, this deep integration is not always possible. This work explores two strategies to increase accuracy and decrease cost under this setting. First, we show that reordering tasks presented to the human can create a significant accuracy improvement. Further, we show that greedily choosing parameters to maximize machine accuracy is sub-optimal, and joint optimization of the combined system improves performance.

Submitted 24 September, 2015; originally announced September 2015.

Comments: HCOMP 2015 Work in Progress

Showing 1–14 of 14 results for author: Dhanasekar