-
InsightNet: Structured Insight Mining from Customer Feedback
Authors:
Sandeep Sricharan Mukku,
Manan Soni,
Jitenkumar Rana,
Chetan Aggarwal,
Promod Yenigalla,
Rashmi Patange,
Shyam Mohan
Abstract:
We propose InsightNet, a novel approach for the automated extraction of structured insights from customer reviews. Our end-to-end machine learning framework is designed to overcome the limitations of current solutions, including the absence of structure for identified topics, non-standard aspect names, and lack of abundant training data. The proposed solution builds a semi-supervised multi-level taxonomy from raw reviews, uses a semantic similarity heuristic approach to generate labelled data, and employs a multi-task insight extraction architecture by fine-tuning an LLM. InsightNet identifies granular actionable topics with customer sentiments and verbatim for each topic. Evaluations on real-world customer review data show that InsightNet performs better than existing solutions in terms of structure, hierarchy and completeness. We empirically demonstrate that InsightNet outperforms the current state-of-the-art methods in multi-label topic classification, achieving an F1 score of 0.85, which is an improvement of 11% F1-score over the previous best results. Additionally, InsightNet generalises well for unseen aspects and suggests new topics to be added to the taxonomy.
Submitted 12 May, 2024;
originally announced May 2024.
-
EXPLORE -- Explainable Song Recommendation
Authors:
Abhinav Arun,
Mehul Soni,
Palash Choudhary,
Saksham Arora
Abstract:
This study explores the development of an explainable music recommendation system with enhanced user control. Leveraging a hybrid of collaborative filtering and content-based filtering, we address the challenges of opaque recommendation logic and lack of user influence on results. We present a novel approach combining advanced algorithms and an interactive user interface. Our methodology integrates Spotify data with user preference analytics to tailor music suggestions. Evaluation through RMSE and user studies underscores the efficacy and user satisfaction with our system. The paper concludes with potential directions for future enhancements in group recommendations and dynamic feedback integration.
Submitted 30 December, 2023;
originally announced January 2024.
-
Towards a Unified Multimodal Reasoning Framework
Authors:
Abhinav Arun,
Dipendra Singh Mal,
Mehul Soni,
Tomohiro Sawada
Abstract:
Recent advancements in deep learning have led to the development of powerful language models (LMs) that excel in various tasks. Despite these achievements, there is still room for improvement, particularly in enhancing reasoning abilities and incorporating multimodal data. This report investigates the potential impact of combining Chain-of-Thought (CoT) reasoning and Visual Question Answering (VQA) techniques to improve LM's accuracy in solving multiple-choice questions. By employing TextVQA and ScienceQA datasets, we assessed the effectiveness of three text embedding methods and three visual embedding approaches. Our experiments aimed to fill the gap in current research by investigating the combined impact of CoT and VQA, contributing to the understanding of how these techniques can improve the reasoning capabilities of state-of-the-art models like GPT-4. Results from our experiments demonstrated the potential of these approaches in enhancing LM's reasoning and question-answering capabilities, providing insights for further research and development in the field, and paving the way for more accurate and reliable AI systems that can handle complex reasoning tasks across multiple modalities.
Submitted 22 December, 2023;
originally announced December 2023.
-
Numerical Reasoning for Financial Reports
Authors:
Abhinav Arun,
Ashish Dhiman,
Mehul Soni,
Yibei Hu
Abstract:
Financial reports offer critical insights into a company's operations, yet their extensive length, typically spanning 30-40 pages, poses challenges for swift decision making in dynamic markets. To address this, we leveraged fine-tuned Large Language Models (LLMs) to distill key indicators and operational metrics from these reports based on questions from the user. We devised a method to locate critical data, and leveraged the FinQA dataset to fine-tune both Llama-2 7B and T5 models for customized question answering. We achieved results comparable to baseline on the final numerical answer, a competitive accuracy in numerical reasoning and calculation.
Submitted 22 December, 2023;
originally announced December 2023.
-
Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries Through Blinded Reviewers and Text Classification Algorithms
Authors:
Mayank Soni,
Vincent Wade
Abstract:
Large Language Models (LLMs) have gathered significant attention due to their impressive performance on a variety of tasks. ChatGPT, developed by OpenAI, is a recent addition to the family of language models and is being called a disruptive technology by a few, owing to its human-like text-generation capabilities. Although many anecdotal examples across the internet have evaluated ChatGPT's strengths and weaknesses, only a few systematic research studies exist. To contribute to the body of literature of systematic research on ChatGPT, we evaluate the performance of ChatGPT on Abstractive Summarization by means of automated metrics and blinded human reviewers. We also build automatic text classifiers to detect ChatGPT generated summaries. We found that while text classification algorithms can distinguish between real and generated summaries, humans are unable to distinguish between real summaries and those produced by ChatGPT.
Submitted 28 August, 2023; v1 submitted 30 March, 2023;
originally announced March 2023.
-
An Empirical Study of Topic Transition in Dialogue
Authors:
Mayank Soni,
Brendan Spillane,
Emer Gilmartin,
Christian Saam,
Benjamin R. Cowan,
Vincent Wade
Abstract:
Transitioning between topics is a natural component of human-human dialog. Although topic transition has been studied in dialogue for decades, only a handful of corpus-based studies have been performed to investigate the subtleties of topic transitions. Thus, this study annotates 215 conversations from the Switchboard corpus and investigates how variables such as length, number of topic transitions, topic transitions share by participants and turns/topic are related. This work presents an empirical study on topic transition in the Switchboard corpus followed by modelling topic transition with a precision of 83% for in-domain (id) test set and 82% on 10 out-of-domain (ood) test set. It is envisioned that this work will help in emulating human-human like topic transition in open-domain dialog systems.
Submitted 19 July, 2022; v1 submitted 28 November, 2021;
originally announced November 2021.
-
Enhancing Self-Disclosure In Neural Dialog Models By Candidate Re-ranking
Authors:
Mayank Soni,
Benjamin Cowan,
Vincent Wade
Abstract:
Neural language modelling has progressed the state-of-the-art in different downstream Natural Language Processing (NLP) tasks. One such area is open-domain dialog modelling, where neural dialog models based on GPT-2, such as DialoGPT, have shown promising performance in single-turn conversation. However, such (neural) dialog models have been criticized for generating responses which, although they may have relevance to the previous human response, tend to quickly dissipate human interest and descend into trivial conversation. One reason for such performance is the lack of explicit conversation strategy being employed in human-machine conversation. Humans employ a range of conversation strategies while engaging in a conversation; one such key social strategy is self-disclosure (SD), the phenomenon of revealing information about oneself to others. Social penetration theory (SPT) proposes that communication between two people moves from shallow to deeper levels as the relationship progresses, primarily through self-disclosure. Disclosure helps in creating rapport among the participants engaged in a conversation. In this paper, Self-disclosure enhancement architecture (SDEA) is introduced, utilizing Self-disclosure Topic Model (SDTM) during the inference stage of a neural dialog model to re-rank response candidates to enhance self-disclosure in single-turn responses from the model.
Submitted 28 August, 2023; v1 submitted 10 September, 2021;
originally announced September 2021.
-
Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge
Authors:
Spyridon Bakas,
Mauricio Reyes,
Andras Jakab,
Stefan Bauer,
Markus Rempfler,
Alessandro Crimi,
Russell Takeshi Shinohara,
Christoph Berger,
Sung Min Ha,
Martin Rozycki,
Marcel Prastawa,
Esther Alberts,
Jana Lipkova,
John Freymann,
Justin Kirby,
Michel Bilello,
Hassan Fathallah-Shaykh,
Roland Wiest,
Jan Kirschke,
Benedikt Wiestler,
Rivka Colen,
Aikaterini Kotrotsou,
Pamela Lamontagne,
Daniel Marcus,
Mikhail Milchenko, et al. (402 additional authors not shown)
Abstract:
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset.
Submitted 23 April, 2019; v1 submitted 5 November, 2018;
originally announced November 2018.
-
A Systematic Review of Automated Grammar Checking in English Language
Authors:
Madhvi Soni,
Jitendra Singh Thakur
Abstract:
Grammar checking is the task of detecting and correcting grammatical errors in text. English is the dominating language in the field of science and technology. Therefore, non-native English speakers must be able to use correct English grammar while reading, writing or speaking. This creates a need for automatic grammar checking tools. So far, many approaches have been proposed and implemented, but little effort has been made to survey the literature in the past decade. The objective of this systematic review is to examine the existing literature, highlighting the current issues and suggesting the potential directions of future research. This systematic review is a result of analysis of 12 primary studies obtained after designing a search strategy for selecting papers found on the web. We also present a possible scheme for the classification of grammar errors. Among the main observations, we found that there is a lack of efficient and robust grammar checking tools for real time applications. We present several useful illustrations; the most prominent are the schematic diagrams that we provide for each approach and a table that summarizes these approaches along different dimensions such as target error types, linguistic dataset used, strengths and limitations of the approach. This facilitates better understandability, comparison and evaluation of previous research.
Submitted 29 March, 2018;
originally announced April 2018.
-
Design and Development of an automated Robotic Pick & Stow System for an e-Commerce Warehouse
Authors:
Swagat Kumar,
Anima Majumder,
Samrat Dutta,
Rekha Raja,
Sharath Jotawar,
Ashish Kumar,
Manish Soni,
Venkat Raju,
Olyvia Kundu,
Ehtesham Hassan,
Laxmidhar Behera,
K. S. Venkatesh,
Rajesh Sinha
Abstract:
In this paper, we provide details of a robotic system that can automate the task of picking and stowing objects from and to a rack in an e-commerce fulfillment warehouse. The system primarily comprises four main modules: (1) Perception module responsible for recognizing query objects and localizing them in the 3-dimensional robot workspace; (2) Planning module that generates necessary paths that the robot end-effector has to take for reaching the objects in the rack or in the tote; (3) Calibration module that defines the physical workspace for the robot visible through the on-board vision system; and (4) Gripping and suction system for picking and stowing different kinds of objects. The perception module uses a faster region-based Convolutional Neural Network (R-CNN) to recognize objects. We designed a novel two finger gripper that incorporates pneumatic valve based suction effect to enhance its ability to pick different kinds of objects. The system was developed by IITK-TCS team for participation in the Amazon Picking Challenge 2016 event. The team secured fifth place in the stowing task in the event. The purpose of this article is to share our experiences with students and practicing engineers and enable them to build similar systems. The overall efficacy of the system is demonstrated through several simulation as well as real-world experiments with actual robots.
Submitted 7 March, 2017;
originally announced March 2017.
-
An Efficient Vein Pattern-based Recognition System
Authors:
Mohit Soni,
Sandesh Gupta,
M. S. Rao,
Phalguni Gupta
Abstract:
This paper presents an efficient human recognition system based on vein pattern from the palma dorsa. A new absorption based technique has been proposed to collect good quality images with the help of a low cost camera and light source. The system automatically detects the region of interest from the image and does the necessary preprocessing to extract features. A Euclidean Distance based matching technique has been used for making the decision. It has been tested on a data set of 1750 image samples collected from 341 individuals. The accuracy of the verification system is found to be 99.26% with false rejection rate (FRR) of 0.03%.
Submitted 6 May, 2010;
originally announced May 2010.
-
Flare: Architecture for rapid and easy development of Internet-based Applications
Authors:
Shashank Shekhar,
Mohit Soni,
NVSN Kalyan Chakravarthy
Abstract:
We propose an architecture, Flare, that is a structured and easy way to develop applications rapidly, in a multitude of languages, which make use of online storage of data and management of users. The architecture eliminates the need for server-side programming in most cases, creation and management of online database storage servers, re-creation of user management schemes and writing a lot of unnecessary code for accessing different web-based services using their APIs. A Web API provides a common API for various web-based services like Blogger [2], Wordpress, MSN Live, Facebook [3] etc. Access Libraries provided for major programming languages and platforms make it easy to develop applications using the Flare Web Service. We demonstrate a simple micro-blogging service developed using these APIs in two modes: a graphical browser-based mode, and a command-line mode in C++, which provide two different interfaces to the same account and data.
Submitted 17 November, 2009;
originally announced November 2009.
-
Relational Grid Monitoring Architecture (R-GMA)
Authors:
Rob Byrom,
Brian Coghlan,
Andrew W Cooke,
Roney Cordenonsi,
Linda Cornwall,
Abdeslem Djaoui,
Laurence Field,
Steve Fisher,
Steve Hicks,
Stuart Kenny,
Jason Leake,
James Magowan,
Werner Nutt,
David O'Callaghan,
Norbert Podhorszki,
John Ryan,
Manish Soni,
Paul Taylor,
Antony J Wilson
Abstract:
We describe R-GMA (Relational Grid Monitoring Architecture) which has been developed within the European DataGrid Project as a Grid Information and Monitoring System. It is based on the GMA from GGF, which is a simple Consumer-Producer model. The special strength of this implementation comes from the power of the relational model. We offer a global view of the information as if each Virtual Organisation had one large relational database. We provide a number of different Producer types with different characteristics; for example some support streaming of information. We also provide combined Consumer/Producers, which are able to combine information and republish it. At the heart of the system is the mediator, which for any query is able to find and connect to the best Producers for the job. We have developed components to allow a measure of inter-working between MDS and R-GMA. We have used it both for information about the grid (primarily to find out about what services are available at any one time) and for application monitoring. R-GMA has been deployed in various testbeds; we describe some preliminary results and experiences of this deployment.
Submitted 15 August, 2003;
originally announced August 2003.
-
R-GMA: First results after deployment
Authors:
Rob Byrom,
Brian Coghlan,
Andrew W Cooke,
Roney Cordenonsi,
Linda Cornwall,
Ari Datta,
Abdeslem Djaoui,
Laurence Field,
Steve Fisher,
Steve Hicks,
Stuart Kenny,
James Magowan,
Werner Nutt,
David O'Callaghan,
Manfred Oevers,
Norbert Podhorszki,
John Ryan,
Manish Soni,
Paul Taylor,
Antony J. Wilson,
Xiaomei Zhu
Abstract:
We describe R-GMA (Relational Grid Monitoring Architecture) which is being developed within the European DataGrid Project as a Grid Information and Monitoring System. It is based on the GMA from GGF, which is a simple Consumer-Producer model. The special strength of this implementation comes from the power of the relational model. We offer a global view of the information as if each VO had one large relational database. We provide a number of different Producer types with different characteristics; for example some support streaming of information. We also provide combined Consumer/Producers, which are able to combine information and republish it. At the heart of the system is the mediator, which for any query is able to find and connect to the best Producers to do the job. We are able to invoke MDS info-provider scripts and publish the resulting information via R-GMA in addition to having some of our own sensors. APIs are available which allow the user to deploy monitoring and information services for any application that may be needed in the future. We have used it both for information about the grid (primarily to find what services are available at any one time) and for application monitoring. R-GMA has been deployed in Grid testbeds; we describe the results and experiences of this deployment.
Submitted 12 June, 2003; v1 submitted 30 May, 2003;
originally announced June 2003.