Search | arXiv e-print repository

Human Shape and Clothing Estimation

Authors: Aayush Gupta, Aditya Gulati, Himanshu, Lakshya LNU

Abstract: Human shape and clothing estimation has gained significant prominence in various domains, including online shopping, fashion retail, augmented reality (AR), virtual reality (VR), and gaming. The visual representation of human shape and clothing has become a focal point for computer vision researchers in recent years. This paper presents a comprehensive survey of the major works in the field, focusing on four key aspects: human shape estimation, fashion generation, landmark detection, and attribute recognition. For each of these tasks, the survey paper examines recent advancements, discusses their strengths and limitations, and highlights qualitative differences in approaches and outcomes. By exploring the latest developments in human shape and clothing estimation, this survey aims to provide a comprehensive understanding of the field and inspire future research in this rapidly evolving domain.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2309.00144 [pdf, other]

Multi Agent DeepRL based Joint Power and Subchannel Allocation in IAB networks

Authors: Lakshya Jagadish, Banashree Sarma, R. Manivasakan

Abstract: Integrated Access and Backhauling (IAB) is a viable approach for meeting the unprecedented need for higher data rates of future generations, acting as a cost-effective alternative to dense fiber-wired links. The design of such networks with constraints usually results in an optimization problem of non-convex and combinatorial nature. Under those situations, it is challenging to obtain an optimal strategy for the joint Subchannel Allocation and Power Allocation (SAPA) problem. In this paper, we develop a multi-agent Deep Reinforcement Learning (DeepRL) based framework for joint optimization of power and subchannel allocation in an IAB network to maximize the downlink data rate. SAPA using DDQN (Double Deep Q-Learning Network) can handle computationally expensive problems with huge action spaces associated with multiple users and nodes. Unlike the conventional methods such as game theory, fractional programming, and convex optimization, which in practice demand more and more accurate network information, the multi-agent DeepRL approach requires less environment network information. Simulation results show the proposed scheme's promising performance when compared with baseline (Deep Q-Learning Network and Random) schemes.

Submitted 31 August, 2023; originally announced September 2023.

Comments: 7 pages, 6 figures, Accepted at the European Conference on Communication Systems (ECCS) 2023
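
The abstract above names DDQN (Double Deep Q-Learning Network) as the learning method. As a rough illustration only, the standard Double DQN bootstrap target can be sketched as follows: the online network picks the next action and the target network scores it. The function name, argument names, and the default discount factor are assumptions made for this sketch and are not taken from the paper.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard Double DQN target for a single transition.

    next_q_online / next_q_target are lists of Q-values for the next state,
    one entry per action, from the online and target networks respectively.
    All names here are hypothetical and chosen for illustration.
    """
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    print(double_dqn_target(1.0, [0.2, 0.9], [0.5, 0.3], gamma=0.5))
```

Separating selection from evaluation is what distinguishes this target from the plain DQN target, which takes the maximum over the target network's own values.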

arXiv:2306.10763 [pdf, other]

Guiding Language Models of Code with Global Context using Monitors

Authors: Lakshya A Agrawal, Aditya Kanade, Navin Goyal, Shuvendu K. Lahiri, Sriram K. Rajamani

Abstract: Language models of code (LMs) work well when the surrounding code provides sufficient context. This is not true when it becomes necessary to use types, functionality or APIs defined elsewhere in the repository or a linked library, especially those not seen during training. LMs suffer from limited awareness of such global context and end up hallucinating. Integrated development environments (IDEs) assist developers in understanding repository context using static analysis. We extend this assistance, enjoyed by developers, to LMs. We propose monitor-guided decoding (MGD) where a monitor uses static analysis to guide the decoding. We construct a repository-level dataset PragmaticCode for method-completion in Java and evaluate MGD on it. On models of varying parameter scale, by monitoring for type-consistent object dereferences, MGD consistently improves compilation rates and agreement with ground truth. Further, LMs with fewer parameters, when augmented with MGD, can outperform larger LMs. With MGD, SantaCoder-1.1B achieves better compilation rate and next-identifier match than the much larger text-davinci-003 model. We also conduct a generalizability study to evaluate the ability of MGD to generalize to multiple programming languages (Java, C# and Rust), coding scenarios (e.g., correct number of arguments to method calls), and to enforce richer semantic constraints (e.g., stateful API protocols). Our data and implementation are available at https://github.com/microsoft/monitors4codegen .

Submitted 3 November, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

Comments: Accepted to NeurIPS 2023 and to appear as "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context" at https://neurips.cc/virtual/2023/poster/70362 . Contents: 11 pages, 15 additional pages of appendix, 13 figures, 3 tables

ACM Class: I.2.2; I.2.7; I.2.5
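
The idea of a monitor guiding decoding, as described in the abstract above, can be illustrated with a toy sketch. This is not the authors' implementation: it assumes whole-identifier candidates rather than subword tokens, and it assumes the set of type-consistent member names has already been produced by some static analysis. The names `mask_logits`, `greedy_pick`, and the example identifiers are all hypothetical.

```python
import math


def mask_logits(logits, allowed):
    """Set the score of every candidate not approved by the monitor to -inf.

    logits:  dict mapping candidate identifier -> model score (assumed input)
    allowed: set of identifiers the (hypothetical) static analysis accepts
    """
    return {tok: (score if tok in allowed else -math.inf)
            for tok, score in logits.items()}


def greedy_pick(logits):
    """Pick the highest-scoring candidate."""
    return max(logits, key=logits.get)


if __name__ == "__main__":
    # The unconstrained model prefers a member that does not exist on the type.
    scores = {"getName": 1.0, "getFoo": 3.0, "size": 2.0}
    members = {"getName", "size"}  # assumed output of static analysis
    print(greedy_pick(scores), "->", greedy_pick(mask_logits(scores, members)))
```

In this toy setting the constrained pick can only ever be a name the analysis accepts, which mirrors the abstract's claim about type-consistent dereferences without reproducing the actual system.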

arXiv:2303.12320 [pdf, other]

GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering

Authors: Dhaval Taunk, Lakshya Khanna, Pavan Kandru, Vasudeva Varma, Charu Sharma, Makarand Tapaswi

Abstract: Commonsense question-answering (QA) methods combine the power of pre-trained Language Models (LM) with the reasoning provided by Knowledge Graphs (KG). A typical approach collects nodes relevant to the QA pair from a KG to form a Working Graph (WG) followed by reasoning using Graph Neural Networks (GNNs). This faces two major challenges: (i) it is difficult to capture all the information from the QA in the WG, and (ii) the WG contains some irrelevant nodes from the KG. To address these, we propose GrapeQA with two simple improvements on the WG: (i) Prominent Entities for Graph Augmentation identifies relevant text chunks from the QA pair and augments the WG with corresponding latent representations from the LM, and (ii) Context-Aware Node Pruning removes nodes that are less relevant to the QA pair. We evaluate our results on OpenBookQA, CommonsenseQA and MedQA-USMLE and see that GrapeQA shows consistent improvements over its LM + KG predecessor (QA-GNN in particular) and large improvements on OpenBookQA.

Submitted 18 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

arXiv:2301.11722 [pdf, other]

Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines?

Authors: Victor Boutin, Thomas Fel, Lakshya Singhal, Rishav Mukherji, Akash Nagaraj, Julien Colin, Thomas Serre

Abstract: An important milestone for AI is the development of algorithms that can produce drawings that are indistinguishable from those of humans. Here, we adapt the 'diversity vs. recognizability' scoring framework from Boutin et al, 2022 and find that one-shot diffusion models have indeed started to close the gap between humans and machines. However, using a finer-grained measure of the originality of individual samples, we show that strengthening the guidance of diffusion models helps improve the humanness of their drawings, but they still fall short of approximating the originality and recognizability of human drawings. Comparing human category diagnostic features, collected through an online psychophysics experiment, against those derived from diffusion models reveals that humans rely on fewer and more localized features. Overall, our study suggests that diffusion models have significantly helped improve the quality of machine-generated drawings; however, a gap between humans and machines remains -- in part explainable by discrepancies in visual strategies.

Submitted 31 May, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

arXiv:2212.03725 [pdf, other]

Learning-To-Embed: Adopting Transformer based models for E-commerce Products Representation Learning

Authors: Lakshya Kumar, Sreekanth Vempati

Abstract: Learning low-dimensional representations for the large number of products present in an e-commerce catalogue plays a vital role, as they are helpful in tasks like product ranking, product recommendation, finding similar products, modelling user behaviour, etc. Recently, many tasks in the NLP field have been tackled using Transformer based models, and these deep models are widely applicable in industry settings to solve various problems. With this motivation, we apply a transformer based model for learning contextual representations of products in an e-commerce setting. In this work, we propose a novel approach of pre-training a transformer based model on a user-generated sessions dataset obtained from a large fashion e-commerce platform to obtain latent product representations. Once pre-trained, we show that the low-dimensional representation of the products can be obtained given the product attribute information as a textual sentence. We mainly pre-train BERT, RoBERTa, ALBERT and XLNET variants of the transformer model and show a quantitative analysis of the product representations obtained from these models with respect to the Next Product Recommendation (NPR) and Content Ranking (CR) tasks. For both tasks, we collect evaluation data from the fashion e-commerce platform and observe that the XLNET model outperforms the other variants with an MRR of 0.5 for NPR and an NDCG of 0.634 for CR. The XLNET model also outperforms the Word2Vec based non-transformer baseline on both downstream tasks.
To the best of our knowledge, this is the first work on pre-training transformer based models using user-generated sessions data containing products that are represented with rich attribute information for adoption in an e-commerce setting. These models can be further fine-tuned in order to solve various downstream tasks in e-commerce, thereby eliminating the need to train a model from scratch.

Submitted 7 December, 2022; originally announced December 2022.

Comments: 8 pages, 2 figures

arXiv:2210.15451 [pdf, other]

Fine-Grained Session Recommendations in E-commerce using Deep Reinforcement Learning

Authors: Diddigi Raghu Ram Bharadwaj, Lakshya Kumar, Saif Jawaid, Sreekanth Vempati

Abstract: Sustaining users' interest and keeping them engaged in the platform is very important for the success of an e-commerce business. A session encompasses different activities of a user between logging into the platform and logging out or making a purchase. User activities in a session can be classified into two groups: Known Intent and Unknown intent. Known intent activity pertains to the session where the intent of a user to browse/purchase a specific product can be easily captured. Whereas in unknown intent activity, the intent of the user is not known. For example, consider the scenario where a user enters the session to casually browse the products over the platform, similar to the window shopping experience in the offline setting. While recommending similar products is essential in the former, accurately understanding the intent and recommending interesting products is essential in the latter setting in order to retain a user. In this work, we focus primarily on the unknown intent setting where our objective is to recommend a sequence of products to a user in a session to sustain their interest, keep them engaged and possibly drive them towards purchase. We formulate this problem in the framework of the Markov Decision Process (MDP), a popular mathematical framework for sequential decision making and solve it using Deep Reinforcement Learning (DRL) techniques. However, training the next product recommendation is difficult in the RL paradigm due to large variance in browse/purchase behavior of the users.
Therefore, we break the problem down into predicting various product attributes, where a pattern/trend can be identified and exploited to build accurate models. We show that the DRL agent provides better performance compared to a greedy strategy.

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.09962 [pdf, other]

Nighttime Dehaze-Enhancement

Authors: Harshan Baskar, Anirudh S Chakravarthy, Prateek Garg, Divyam Goel, Abhijith S Raj, Kshitij Kumar, Lakshya, Ravichandra Parvatham, V Sushant, Bijay Kumar Rout

Abstract: In this paper, we introduce a new computer vision task called nighttime dehaze-enhancement. This task aims to jointly perform dehazing and lightness enhancement. Our task fundamentally differs from nighttime dehazing -- our goal is to jointly dehaze and enhance scenes, while nighttime dehazing aims to dehaze scenes under a nighttime setting. In order to facilitate further research on this task, we release a new benchmark dataset called Reside-$β$ Night dataset, consisting of 4122 nighttime hazed images from 2061 scenes and 2061 ground truth images. Moreover, we also propose a new network called NDENet (Nighttime Dehaze-Enhancement Network), which jointly performs dehazing and low-light enhancement in an end-to-end manner. We evaluate our method on the proposed benchmark and achieve SSIM of 0.8962 and PSNR of 26.25. We also compare our network with other baseline networks on our benchmark to demonstrate the effectiveness of our approach. We believe that nighttime dehaze-enhancement is an essential task particularly for autonomous navigation applications, and hope that our work will open up new frontiers in research. Our dataset and code will be made publicly available upon acceptance of our paper.

Submitted 18 October, 2022; originally announced October 2022.

arXiv:2206.15198 [pdf, other]

ListBERT: Learning to Rank E-commerce products with Listwise BERT

Authors: Lakshya Kumar, Sagnik Sarkar

Abstract: Efficient search is a critical component for an e-commerce platform with an innumerable number of products. Every day millions of users search for products pertaining to their needs. Thus, showing the relevant products on the top will enhance the user experience. In this work, we propose a novel approach of fusing a transformer-based model with various listwise loss functions for ranking e-commerce products, given a user query. We pre-train a RoBERTa model over a fashion e-commerce corpus and fine-tune it using different listwise loss functions. Our experiments indicate that the RoBERTa model fine-tuned with an NDCG based surrogate loss function (approxNDCG) achieves an NDCG improvement of 13.9% compared to other popular listwise loss functions like ListNET and ListMLE. The approxNDCG based RoBERTa model also achieves an NDCG improvement of 20.6% compared to the pairwise RankNet based RoBERTa model. We call our methodology of directly optimizing the RoBERTa model in an end-to-end manner with a listwise surrogate loss function as ListBERT. Since there is a low latency requirement in a real-time search setting, we show how these models can be easily adopted by using a knowledge distillation technique to learn a representation-focused student model that can be easily deployed and leads to ~10 times lower ranking latency.

Submitted 30 June, 2022; originally announced June 2022.

Comments: 5 Pages, 1 Figure, accepted in SigirEcom'22, Madrid, Spain
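
The abstract above reports its gains in NDCG. For readers unfamiliar with the metric, a minimal sketch of NDCG@k follows. It assumes the linear-gain variant (gain equal to the relevance label) and a log2 position discount; the paper's exact gain function and cutoff are not stated in the abstract, and the function names are chosen here for illustration.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain assumed)."""
    # Position 1 has discount log2(2) = 1, position 2 has log2(3), and so on.
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Relevance labels listed in the order the ranker returned the products.
    print(ndcg_at_k([0, 1], k=2))
```

A ranking that already lists items in decreasing relevance scores 1.0, and any misordering lowers the score, which is why the metric is a natural target for listwise surrogate losses.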

arXiv:2205.10370 [pdf, other]

Diversity vs. Recognizability: Human-like generalization in one-shot generative models

Authors: Victor Boutin, Lakshya Singhal, Xavier Thomas, Thomas Serre

Abstract: Robust generalization to new concepts has long remained a distinctive feature of human intelligence. However, recent progress in deep generative models has now led to neural architectures capable of synthesizing novel instances of unknown visual concepts from a single training example. Yet, a more precise comparison between these models and humans is not possible because existing performance metrics for generative models (i.e., FID, IS, likelihood) are not appropriate for the one-shot generation scenario. Here, we propose a new framework to evaluate one-shot generative models along two axes: sample recognizability vs. diversity (i.e., intra-class variability). Using this framework, we perform a systematic evaluation of representative one-shot generative models on the Omniglot handwritten dataset. We first show that GAN-like and VAE-like models fall on opposite ends of the diversity-recognizability space. Extensive analyses of the effect of key model parameters further revealed that spatial attention and context integration have a linear contribution to the diversity-recognizability trade-off. In contrast, disentanglement transports the model along a parabolic curve that could be used to maximize recognizability. Using the diversity-recognizability framework, we were able to identify models and parameters that closely approximate human data.

Submitted 7 October, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

arXiv:2202.13203 [pdf, other]

Dropout can Simulate Exponential Number of Models for Sample Selection Techniques

Authors: Lakshya

Abstract: Following Coteaching, generally in the literature, two models are used in sample selection based approaches for training with noisy labels. Meanwhile, it is also well known that Dropout when present in a network trains an ensemble of sub-networks. We show how to leverage this property of Dropout to train an exponential number of shared models, by training a single model with Dropout. We show how we can modify existing two model-based sample selection methodologies to use an exponential number of shared models. Not only is it more convenient to use a single model with Dropout, but this approach also combines the natural benefits of Dropout with that of training an exponential number of models, leading to improved results.

Submitted 26 February, 2022; originally announced February 2022.

arXiv:2201.05229 [pdf, other]

Examining and Mitigating the Impact of Crossbar Non-idealities for Accurate Implementation of Sparse Deep Neural Networks

Authors: Abhiroop Bhattacharjee, Lakshya Bhatnagar, Priyadarshini Panda

Abstract: Recently, several structured pruning techniques have been introduced for energy-efficient implementation of Deep Neural Networks (DNNs) with fewer crossbars. Although these techniques have claimed to preserve the accuracy of the sparse DNNs on crossbars, none have studied the impact of the inexorable crossbar non-idealities on the actual performance of the pruned networks. To this end, we perform a comprehensive study to show how highly sparse DNNs, which result in a significant crossbar-compression-rate, can lead to severe accuracy losses compared to unpruned DNNs mapped onto non-ideal crossbars. We perform experiments with multiple structured-pruning approaches (such as C/F pruning, XCS and XRS) on VGG11 and VGG16 DNNs with benchmark datasets (CIFAR10 and CIFAR100). We propose two mitigation approaches - Crossbar column rearrangement and Weight-Constrained-Training (WCT) - that can be integrated with the crossbar-mapping of the sparse DNNs to minimize accuracy losses incurred by the pruned models. These help in mitigating non-idealities by increasing the proportion of low conductance synapses on crossbars, thereby improving their computational accuracies.

Submitted 13 January, 2022; originally announced January 2022.

Comments: Accepted in Design, Automation and Test in Europe (DATE) Conference, 2022

Journal ref: Design, Automation and Test in Europe (DATE) Conference, 2022

arXiv:2107.08291 [pdf, other]

Neural Search: Learning Query and Product Representations in Fashion E-commerce

Authors: Lakshya Kumar, Sagnik Sarkar

Abstract: Typical e-commerce platforms contain millions of products in the catalog. Users visit these platforms and enter search queries to retrieve their desired products. Therefore, showing the relevant products at the top is essential for the success of e-commerce platforms. We approach this problem by learning low dimension representations for queries and product descriptions by leveraging user click-stream data as our main source of signal for product relevance. Starting from GRU-based architectures as our baseline model, we move towards a more advanced transformer-based architecture. This helps the model to learn contextual representations of queries and products to serve better search results and understand the user intent in an efficient manner. We perform experiments related to pre-training of the Transformer based RoBERTa model using a fashion corpus and fine-tuning it over the triplet loss. Our experiments on the product ranking task show that the RoBERTa model is able to give an improvement of 7.8% in Mean Reciprocal Rank (MRR), 15.8% in Mean Average Precision (MAP) and 8.8% in Normalized Discounted Cumulative Gain (NDCG), thus outperforming our GRU based baselines. For the product retrieval task, RoBERTa model is able to outperform other two models with an improvement of 164.7% in Precision@50 and 145.3% in Recall@50.
In order to highlight the importance of pre-training RoBERTa for fashion domain, we qualitatively compare already pre-trained RoBERTa on standard datasets with our custom pre-trained RoBERTa over a fashion corpus for the query token prediction task. Finally, we also show a qualitative comparison between GRU and RoBERTa results for product retrieval task for some test queries.

Submitted 17 July, 2021; originally announced July 2021.

Comments: 10 pages, accepted at SIGIR eCommerce 2021
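
Mean Reciprocal Rank (MRR), one of the ranking metrics cited in the abstract above, can be sketched as follows. The data layout (a dict of ranked product lists per query and a dict of relevant product sets) and the function name are assumptions made for this illustration; the abstract does not describe the authors' evaluation code.

```python
def mean_reciprocal_rank(ranked_lists, relevant):
    """Average of 1/rank of the first relevant item per query.

    ranked_lists: dict query -> list of product ids in ranked order (assumed)
    relevant:     dict query -> set of relevant product ids (assumed)
    A query with no relevant item in its list contributes 0.
    """
    if not ranked_lists:
        return 0.0
    total = 0.0
    for query, ranking in ranked_lists.items():
        for rank, item in enumerate(ranking, start=1):
            if item in relevant.get(query, set()):
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


if __name__ == "__main__":
    rankings = {"red dress": ["p2", "p7"], "blue jeans": ["p4", "p1"]}
    clicks = {"red dress": {"p7"}, "blue jeans": {"p4"}}
    print(mean_reciprocal_rank(rankings, clicks))
```

Because only the first relevant hit matters, MRR rewards placing one good product at the top, whereas MAP and NDCG also account for the remaining relevant items.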

arXiv:2101.03235 [pdf]

Key Phrase Extraction & Applause Prediction

Authors: Krishna Yadav, Lakshya Choudhary

Abstract: With the increase in content availability over the internet, it is very difficult to get noticed. It has become the utmost priority of blog writers to get some feedback on their creations to be confident about the impact of their articles. We are training a machine learning model to learn popular article styles, in the form of vector space representations using various word embeddings, and their popularity based on claps and tags.

Submitted 1 January, 2021; originally announced January 2021.

Comments: 4 pages, 8 figures best project award winner. https://krishna19039.medium.com/key-phrase-extraction-applause-prediction-7b397c7ad76d

arXiv:2012.00261 [pdf, other]

NEAT: Non-linearity Aware Training for Accurate and Energy-Efficient Implementation of Neural Networks on 1T-1R Memristive Crossbars

Authors: Abhiroop Bhattacharjee, Lakshya Bhatnagar, Youngeun Kim, Priyadarshini Panda

Abstract: Memristive crossbars suffer from non-idealities (such as sneak paths) that degrade computational accuracy of the Deep Neural Networks (DNNs) mapped onto them. A 1T-1R synapse, adding a transistor (1T) in series with the memristive synapse (1R), has been proposed to mitigate such non-idealities. We observe that the non-linear characteristics of the transistor affect the overall conductance of the 1T-1R cell which in turn affects the Matrix-Vector-Multiplication (MVM) operation in crossbars. This 1T-1R non-ideality arising from the input voltage-dependent non-linearity is not only difficult to model or formulate, but also causes a drastic performance degradation of DNNs when mapped onto crossbars. In this paper, we analyse the non-linearity of the 1T-1R crossbar and propose a novel Non-linearity Aware Training (NEAT) method to address the non-idealities. Specifically, we first identify the range of network weights, which can be mapped into the 1T-1R cell within the linear operating region of the transistor. Thereafter, we regularize the weights of the DNNs to exist within the linear operating range by using an iterative training algorithm. Our iterative training significantly recovers the classification accuracy drop caused by the non-linearity. Moreover, we find that each layer has a different weight distribution and in turn requires different gate voltage of transistor to guarantee linear operation.
Based on this observation, we achieve energy efficiency while preserving classification accuracy by applying heterogeneous gate voltage control to the 1T-1R cells across different layers. Finally, we conduct various experiments on CIFAR10 and CIFAR100 benchmark datasets to demonstrate the effectiveness of our non-linearity aware training. Overall, NEAT yields ~20% energy gain with less than 1% accuracy loss (with homogeneous gate control) when mapping ResNet18 networks on 1T-1R crossbars.

Submitted 30 November, 2020; originally announced December 2020.

Comments: 7 pages, 11 figures

arXiv:2011.14280 [pdf, other]

A Novel Sentiment Analysis Engine for Preliminary Depression Status Estimation on Social Media

Authors: Sudhir Kumar Suman, Hrithwik Shalu, Lakshya A Agrawal, Archit Agrawal, Juned Kadiwala

Abstract: Text sentiment analysis for preliminary depression status estimation of users on social media is a widely exercised and feasible method. However, the immense variety of users accessing the social media websites and their ample mix of vocabularies makes it difficult for commonly applied deep learning-based classifiers to perform. To add to the situation, the lack of adaptability of traditional supervised machine learning could hurt at many levels. We propose a cloud-based smartphone application, with a deep learning-based backend to primarily perform depression detection on Twitter social media. The backend model consists of a RoBERTa based siamese sentence classifier that compares a given tweet (Query) with a labeled set of tweets with known sentiment (Standard Corpus). The standard corpus is varied over time with expert opinion so as to improve the model's reliability. A psychologist (with the patient's permission) could leverage the application to assess the patient's depression status prior to counseling, which provides better insight into the mental health status of a patient. In addition, the psychologist could be referred to cases of similar characteristics, which could in turn help in more effective treatment. We evaluate our backend model after fine-tuning it on a publicly available dataset. The fine-tuned model is made to predict depression on a large set of tweet samples with random noise factors. The model achieved pinnacle results, with a testing accuracy of 87.23% and an AUC of 0.8621.

Submitted 28 November, 2020; originally announced November 2020.

arXiv:2007.03020

Deep Contextual Embeddings for Address Classification in E-commerce

Authors: Shreyas Mangalgi, Lakshya Kumar, Ravindra Babu Tallamraju

Abstract: E-commerce customers in developing nations like India tend to follow no fixed format while entering shipping addresses. Parsing such addresses is challenging because of a lack of inherent structure or hierarchy. It is imperative to understand the language of addresses, so that shipments can be routed without delays. In this paper, we propose a novel approach towards understanding customer addresses by deriving motivation from recent advances in Natural Language Processing (NLP). We also formulate different pre-processing steps for addresses using a combination of edit distance and phonetic algorithms. Then we approach the task of creating vector representations for addresses using Word2Vec with TF-IDF, Bi-LSTM and BERT based approaches. We compare these approaches with respect to a sub-region classification task for North and South Indian cities. Through experiments, we demonstrate the effectiveness of a generalized RoBERTa model, pre-trained over a large address corpus for a language modelling task. Our proposed RoBERTa model achieves a classification accuracy of around 90% with minimal text preprocessing for the sub-region classification task, outperforming all other approaches. Once pre-trained, the RoBERTa model can be fine-tuned for various downstream tasks in supply chain like pincode suggestion and geo-coding. The model generalizes well for such tasks even with limited labelled data.
To the best of our knowledge, this is the first research of its kind to propose understanding customer addresses in the e-commerce domain by pre-training language models and fine-tuning them for different purposes.

Submitted 6 July, 2020; originally announced July 2020.

Comments: 9 Pages, 8 Figures, AI for fashion supply chain, KDD2020 Workshop

arXiv:1910.00727

Analyzing and Improving Neural Networks by Generating Semantic Counterexamples through Differentiable Rendering

Authors: Lakshya Jain, Varun Chandrasekaran, Uyeong Jang, Wilson Wu, Andrew Lee, Andy Yan, Steven Chen, Somesh Jha, Sanjit A. Seshia

Abstract: Even as deep neural networks (DNNs) have achieved remarkable success on vision-related tasks, their performance is brittle to transformations in the input. Of particular interest are semantic transformations that model changes that have a basis in the physical world, such as rotations, translations, changes in lighting or camera pose. In this paper, we show how differentiable rendering can be utilized to generate images that are informative, yet realistic, and which can be used to analyze DNN performance and improve its robustness through data augmentation. Given a differentiable renderer and a DNN, we show how to use off-the-shelf attacks from adversarial machine learning to generate semantic counterexamples -- images where semantic features are changed so as to produce misclassifications or misdetections. We validate our approach on DNNs for image classification and object detection. For classification, we show that semantic counterexamples, when used to augment the dataset, (i) improve generalization performance, (ii) enhance robustness to semantic transformations, and (iii) transfer between models. Additionally, in comparison to sampling-based semantic augmentation, our technique generates more informative data in a sample-efficient manner.

Submitted 17 July, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

arXiv:1804.00086

HCAP: A History-Based Capability System for IoT Devices

Authors: Lakshya Tandon, Philip W. L. Fong, Reihaneh Safavi-Naini

Abstract: Permissions are highly sensitive in Internet-of-Things (IoT) applications, as IoT devices collect our personal data and control the safety of our environment. Rather than simply granting permissions, further constraints shall be imposed on permission usage so as to realize the Principle of Least Privilege. Since IoT devices are physically embedded, they are often accessed in a particular sequence based on their relative physical positions. Monitoring if such sequencing constraints are honoured when IoT devices are accessed provides a means to fence off malicious accesses. This paper proposes a history-based capability system, HCAP, for enforcing permission sequencing constraints in a distributed authorization environment. We formally establish the security guarantees of HCAP, and empirically evaluate its performance.

Submitted 30 March, 2018; originally announced April 2018.

arXiv:1709.01950

"Having 2 hours to write a paper is fun!": Detecting Sarcasm in Numerical Portions of Text

Authors: Lakshya Kumar, Arpan Somani, Pushpak Bhattacharyya

Abstract: Sarcasm occurring due to the presence of numerical portions in text has been quoted as an error made by automatic sarcasm detection approaches in the past. We present a first study in detecting sarcasm in numbers, as in the case of the sentence 'Love waking up at 4 am'. We analyze the challenges of the problem, and present Rule-based, Machine Learning and Deep Learning approaches to detect sarcasm in numerical portions of text. Our Deep Learning approach outperforms four past works for sarcasm detection and Rule-based and Machine learning approaches on a dataset of tweets, obtaining an F1-score of 0.93. This shows that special attention to text containing numbers may be useful to improve state-of-the-art in sarcasm detection.

Submitted 6 September, 2017; originally announced September 2017.

Showing 1–20 of 20 results for author: Lakshya