-
MahaSQuAD: Bridging Linguistic Divides in Marathi Question-Answering
Authors:
Ruturaj Ghatage,
Aditya Kulkarni,
Rajlaxmi Patil,
Sharvi Endait,
Raviraj Joshi
Abstract:
Question-answering systems have revolutionized information retrieval, but linguistic and cultural boundaries limit their widespread accessibility. This research addresses the lack of efficient QnA datasets in low-resource languages by translating the English Question Answering Dataset (SQuAD) using a robust data curation approach. We introduce MahaSQuAD, the first-ever full SQuAD dataset for the Indic language Marathi, consisting of 118,516 training, 11,873 validation, and 11,803 test samples. We also present a gold test set of 500 manually verified examples. Challenges in maintaining context and handling linguistic nuances are addressed, ensuring accurate translations. Moreover, because a QnA dataset cannot be converted into a low-resource language by translation alone, a robust method is needed to map the translated answer to its span in the translated passage. To address this challenge, we also present a generic approach for translating SQuAD into any low-resource language. Thus, we offer a scalable approach to bridging the linguistic and cultural gaps present in low-resource languages in the realm of question-answering systems. The datasets and models are shared publicly at https://github.com/l3cube-pune/MarathiNLP .
Submitted 20 April, 2024;
originally announced April 2024.
-
Handling and extracting key entities from customer conversations using Speech recognition and Named Entity recognition
Authors:
Sharvi Endait,
Ruturaj Ghatage,
Prof. DD Kadam
Abstract:
In this modern era of technology, with e-commerce developing at a rapid pace, it is very important to understand customer requirements and details from a business conversation. This understanding is crucial for customer retention and satisfaction. Extracting key insights from these conversations is very important when it comes to developing a product or solving a customer's issue. Understanding customer feedback, responses, and important details of the product is essential, and it would be done using Named Entity Recognition (NER). To extract the entities, we would convert the conversations to text using the optimal speech-to-text model. The model would be a two-stage network: first, the conversation is converted to text; then, suitable entities are extracted using a NER BERT transformer model. This will help enrich the customer experience when customers face an issue. A customer who faces a problem will call and register a complaint. The model will then extract from this conversation the key features needed to look into the problem. These features would include details such as the order number and the exact problem. All of these would be extracted directly from the conversation, which would reduce the effort of going through the conversation again.
Submitted 28 November, 2022;
originally announced November 2022.
-
Dwelling Type Classification for Disaster Risk Assessment Using Satellite Imagery
Authors:
Md Nasir,
Tina Sederholm,
Anshu Sharma,
Sundeep Reddy Mallu,
Sumedh Ranjan Ghatage,
Rahul Dodhia,
Juan Lavista Ferres
Abstract:
Vulnerability and risk assessment of neighborhoods is essential for effective disaster preparedness. Existing traditional systems, due to their dependency on time-consuming and cost-intensive field surveying, do not provide a scalable way to decipher warnings and assess the precise extent of the risk at a hyper-local level. In this work, machine learning was used to automate the process of identifying dwellings and their type to build a potentially more effective disaster vulnerability assessment system. First, satellite images of low-income settlements and vulnerable areas in India were used to identify 7 different dwelling types. Specifically, we formulated the dwelling type classification as a semantic segmentation task and trained a U-net based neural network model, namely TernausNet, with the data we collected. Then a risk score assessment model was employed, using the determined dwelling type along with an inundation model of the regions. The entire pipeline was deployed to multiple locations prior to natural hazards in India in 2020. Post hoc ground-truth data from those regions was collected to validate the efficacy of this model, which showed promising performance. This work can aid disaster response organizations and communities at risk by providing household-level risk information that can inform preemptive actions.
Submitted 15 November, 2022;
originally announced November 2022.
-
Time and Frequency Domain Investigation of Selected Memristor based Analog Circuits
Authors:
G. S. Patil,
S. R. Ghatage,
P. K. Gaikwad,
R. K. Kamat,
T. D. Dongale
Abstract:
In this paper, we investigate a few memristor-based analog circuits, namely the phase shift oscillator, integrator, and differentiator, which have been explored extensively using traditional lumped components. We use the LTspice-IV platform to simulate these circuits. The investigation relies on the nonlinear dopant drift model of the memristor and the window function described in the literature for nonlinearity realization. The results of our investigations show good agreement with the conventional lumped-component-based phase shift oscillator, integrator, and differentiator circuits. The results showcase the potential of the memristor as a promising candidate for next-generation analog circuits.
Submitted 30 March, 2017; v1 submitted 6 February, 2016;
originally announced February 2016.