
Showing 51–100 of 211 results for author: Bhattacharyya, P

  1. Anisotropic Coulomb exchange as source of Kitaev and off-diagonal symmetric anisotropic couplings

    Authors: Pritam Bhattacharyya, Thorben Petersen, Nikolay A. Bogdanov, Liviu Hozoi

    Abstract: Exchange underpins the magnetic properties of quantum matter. In its most basic form, it occurs through the interplay of Pauli's exclusion principle and Coulomb repulsion, being referred to as Coulomb exchange. Pauli's exclusion principle combined with inter-atomic electron hopping additionally leads to kinetic exchange and superexchange. Here we disentangle the different exchange channels in anis… ▽ More

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: text overlap with arXiv:2302.00540, arXiv:2212.09365

    Journal ref: Commun. Phys. 7, 121 (2024)

  2. arXiv:2306.17180  [pdf, other]

    cs.CL cs.AI cs.CV

    Replace and Report: NLP Assisted Radiology Report Generation

    Authors: Kaveri Kale, Pushpak Bhattacharyya, Kshitij Jadhav

    Abstract: Clinical practice frequently uses medical imaging for diagnosis and treatment. A significant challenge for automatic radiology report generation is that the radiology reports are long narratives consisting of multiple sentences for both abnormal and normal findings. Therefore, applying conventional image captioning approaches to generate the whole report proves to be insufficient, as these are des… ▽ More

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: The 61st Annual Meeting of the Association for Computational Linguistics

  3. arXiv:2306.06384  [pdf, other]

    cs.CL

    Adversarial Training For Low-Resource Disfluency Correction

    Authors: Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya

    Abstract: Disfluencies commonly occur in conversational speech. Speech with disfluencies can result in noisy Automatic Speech Recognition (ASR) transcripts, which affects downstream tasks like machine translation. In this paper, we propose an adversarially-trained sequence-tagging model for Disfluency Correction (DC) that utilizes a small amount of labeled real disfluent data in conjunction with a large amo… ▽ More

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: Accepted for Findings of ACL 2023

  4. arXiv:2306.03507  [pdf, other]

    cs.CL

    "A Little is Enough": Few-Shot Quality Estimation based Corpus Filtering improves Machine Translation

    Authors: Akshay Batheja, Pushpak Bhattacharyya

    Abstract: Quality Estimation (QE) is the task of evaluating the quality of a translation when reference translation is not available. The goal of QE aligns with the task of corpus filtering, where we assign the quality score to the sentence pairs present in the pseudo-parallel corpus. We propose a Quality Estimation based Filtering approach to extract high-quality parallel data from the pseudo-parallel corp… ▽ More

    Submitted 6 June, 2023; originally announced June 2023.

  5. arXiv:2306.03122  [pdf, other]

    cond-mat.supr-con cond-mat.dis-nn quant-ph

    Imaging the Meissner effect and flux trapping in a hydride superconductor at megabar pressures using a nanoscale quantum sensor

    Authors: Prabudhya Bhattacharyya, Wuhao Chen, Xiaoli Huang, Shubhayu Chatterjee, Benchen Huang, Bryce Kobrin, Yuanqi Lyu, Thomas J. Smart, Maxwell Block, Esther Wang, Zhipan Wang, Weijie Wu, Satcher Hsieh, He Ma, Srinivas Mandyam, Bijuan Chen, Emily Davis, Zachary M. Geballe, Chong Zu, Viktor Struzhkin, Raymond Jeanloz, Joel E. Moore, Tian Cui, Giulia Galli, Bertrand I. Halperin, et al. (2 additional authors not shown)

    Abstract: By directly altering microscopic interactions, pressure provides a powerful tuning knob for the exploration of condensed phases and geophysical phenomena. The megabar regime represents an exciting frontier, where recent discoveries include novel high-temperature superconductors, as well as structural and valence phase transitions. However, at such high pressures, many conventional measurement tech… ▽ More

    Submitted 5 June, 2023; originally announced June 2023.

    Journal ref: Nature 627, 73-79 (2024)

  6. arXiv:2306.00931  [pdf, other]

    cs.CV cs.CL

    "Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning

    Authors: Abisek Rajakumar Kalarani, Pushpak Bhattacharyya, Niyati Chhaya, Sumit Shekhar

    Abstract: Well-formed context aware image captions and tags in enterprise content such as marketing material are critical to ensure their brand presence and content recall. Manual creation and updates to ensure the same is non trivial given the scale and the tedium towards this task. We propose a new unified Vision-Language (VL) model based on the One For All (OFA) model, with a focus on context-assisted im… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

  7. arXiv:2305.17480  [pdf, other]

    cs.CL cs.AI

    A Match Made in Heaven: A Multi-task Framework for Hyperbole and Metaphor Detection

    Authors: Naveen Badathala, Abisek Rajakumar Kalarani, Tejpalsingh Siledar, Pushpak Bhattacharyya

    Abstract: Hyperbole and metaphor are common in day-to-day communication (e.g., "I am in deep trouble": how does trouble have depth?), which makes their detection important, especially in a conversational AI setting. Existing approaches to automatically detect metaphor and hyperbole have studied these language phenomena independently, but their relationship has hardly, if ever, been explored computationally.… ▽ More

    Submitted 30 May, 2023; v1 submitted 27 May, 2023; originally announced May 2023.

  8. arXiv:2305.16957  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction

    Authors: Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya

    Abstract: Conversational speech often consists of deviations from the speech plan, producing disfluent utterances that affect downstream NLP tasks. Removing these disfluencies is necessary to create fluent and coherent speech. This paper presents DisfluencyFixer, a tool that performs speech-to-speech disfluency correction in English and Hindi using a pipeline of Automatic Speech Recognition (ASR), Disfluenc… ▽ More

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: To be published in Interspeech 2023 - Show and Tell Demonstrations

  9. arXiv:2305.12518  [pdf, other]

    cs.CL

    VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages

    Authors: Shivam Mhaskar, Vineet Bhat, Akshay Batheja, Sourabh Deoghare, Paramveer Choudhary, Pushpak Bhattacharyya

    Abstract: In this work, we present our deployment-ready Speech-to-Speech Machine Translation (SSMT) system for English-Hindi, English-Marathi, and Hindi-Marathi language pairs. We develop the SSMT system by cascading Automatic Speech Recognition (ASR), Disfluency Correction (DC), Machine Translation (MT), and Text-to-Speech Synthesis (TTS) models. We discuss the challenges faced during the research and deve… ▽ More

    Submitted 21 May, 2023; originally announced May 2023.

  10. arXiv:2303.01191  [pdf, other]

    cs.CL

    Denoising-based UNMT is more robust to word-order divergence than MASS-based UNMT

    Authors: Tamali Banerjee, Rudra Murthy V, Pushpak Bhattacharyya

    Abstract: We aim to investigate whether UNMT approaches with self-supervised pre-training are robust to word-order divergence between language pairs. We achieve this by comparing two models pre-trained with the same self-supervised pre-training objective. The first model is trained on language pairs with different word-orders, and the second model is trained on the same language pairs with source language r… ▽ More

    Submitted 2 March, 2023; originally announced March 2023.

  11. Sweet spot in the RuCl$_3$ magnetic system: nearly ideal $j_{\mathrm{eff}}\!=\!1/2$ moments and maximized $K/J$ ratio under pressure

    Authors: Pritam Bhattacharyya, Liviu Hozoi, Quirin Stahl, Jochen Geck, Nikolay A. Bogdanov

    Abstract: Maximizing the ratio between Kitaev and residual Heisenberg interactions is a major goal in nowadays research on Kitaev-Heisenberg quantum magnets. Here we investigate Kitaev-Heisenberg exchange in a recently discovered crystalline phase of RuCl$_3$ under pressure -- it displays unusually high symmetry, with only one type of Ru-Ru links, and uniform Ru-Cl-Ru bond angles of $\approx$93$^{\circ}$. By… ▽ More

    Submitted 1 February, 2023; originally announced February 2023.

    Journal ref: Phys. Rev. B 108, L161107 (2023)

  12. arXiv:2301.08008  [pdf, other]

    cs.CL cs.LG

    Improving Machine Translation with Phrase Pair Injection and Corpus Filtering

    Authors: Akshay Batheja, Pushpak Bhattacharyya

    Abstract: In this paper, we show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems. We extract parallel phrases and sentences from the pseudo-parallel corpus and augment it with the parallel corpus to train the NMT models. With the proposed approach, we observe an improvement in the Machine Translation (MT) system for 3 low-… ▽ More

    Submitted 19 January, 2023; originally announced January 2023.

  13. arXiv:2301.04013  [pdf, other]

    cs.CL cs.AI cs.LG

    There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering

    Authors: Ankush Agarwal, Sakharam Gawade, Sachin Channabasavarajendra, Pushpak Bhattacharyya

    Abstract: The integration of knowledge graphs with deep learning is thriving in improving the performance of various natural language processing (NLP) tasks. In this paper, we focus on knowledge-infused link prediction and question answering using language models, T5, and BLOOM across three domains: Aviation, Movie, and Web. In this context, we infuse knowledge in large and small language models and study t… ▽ More

    Submitted 10 January, 2023; originally announced January 2023.

  14. Resonating holes vs molecular spin-orbit coupled states in group-5 lacunar spinels

    Authors: Thorben Petersen, Pritam Bhattacharyya, Ulrich K. Rößler, Liviu Hozoi

    Abstract: The valence electronic structure of magnetic centers is one of the factors that determines the characteristics of a magnet. It may refer to orbital degeneracy, as for $j_\text{eff}=1/2$ Kitaev magnets, or near-degeneracy, e.g. involving the third and fourth shells in cuprate superconductors. Here we explore the inner structure of magnetic moments in group-5 lacunar spinels, fascinating materials f… ▽ More

    Submitted 28 August, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: 6 pages, 3 figures

    Journal ref: Nat. Commun. 14, 5218 (2023)

  15. arXiv:2301.02413  [pdf, other]

    physics.chem-ph cond-mat.mtrl-sci physics.atm-clus physics.comp-ph

    Benchmarking Gaussian Basis Sets in Quantum-Chemical Calculations of Photoabsorption Spectra of Light Atomic Clusters

    Authors: Vikram Mahamiya, Pritam Bhattacharyya, Alok Shukla

    Abstract: The choice of Gaussian basis functions for computing the ground-state properties of molecules, and clusters, employing wave-function-based electron-correlated approaches, is a well-studied subject. However, the same cannot be said when it comes to the excited-state properties of such systems, in general, and optical properties, in particular. The aim of the present study is to understand how the c… ▽ More

    Submitted 6 January, 2023; originally announced January 2023.

    Comments: Main manuscript 36 pages, 5 figures (included); Supporting information 17 pages, 3 figures (included)

    Journal ref: ACS Omega 7, 48261 (2022)

  16. NaRuO$_2$: Kitaev-Heisenberg exchange in triangular-lattice setting

    Authors: Pritam Bhattacharyya, Nikolay A. Bogdanov, Satoshi Nishimoto, Stephen D. Wilson, Liviu Hozoi

    Abstract: Kitaev exchange, a new paradigm in quantum magnetism research, occurs for 90$^{\circ}$ metal-ligand-metal links, $t_{2g}^5$ transition ions, and sizable spin-orbit coupling. It is being studied in honeycomb compounds but also on triangular lattices. While for the former it is known by now that the Kitaev intersite couplings are ferromagnetic, for the latter the situation is unclear. Here we pin do… ▽ More

    Submitted 3 July, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Journal ref: npj Quantum Materials 8, 52 (2023)

  17. arXiv:2210.11762  [pdf, other]

    cs.CL

    Detecting Unintended Social Bias in Toxic Language Datasets

    Authors: Nihar Sahoo, Himanshu Gupta, Pushpak Bhattacharyya

    Abstract: With the rise of online hate speech, automatic detection of Hate Speech, Offensive texts as a natural language processing task is getting popular. However, very little research has been done to detect unintended social bias from these toxic language datasets. This paper introduces a new dataset ToxicBias curated from the existing dataset of Kaggle competition named "Jigsaw Unintended Bias in Toxic… ▽ More

    Submitted 21 October, 2022; originally announced October 2022.

  18. arXiv:2206.14116  [pdf, other]

    cs.CV cs.AI cs.RO

    SSL-Lanes: Self-Supervised Learning for Motion Forecasting in Autonomous Driving

    Authors: Prarthana Bhattacharyya, Chengjie Huang, Krzysztof Czarnecki

    Abstract: Self-supervised learning (SSL) is an emerging technique that has been successfully employed to train convolutional neural networks (CNNs) and graph neural networks (GNNs) for more transferable, generalizable, and robust representation learning. However its potential in motion forecasting for autonomous driving has rarely been explored. In this study, we report the first systematic exploration and… ▽ More

    Submitted 10 September, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

    Comments: Accepted to CoRL-2022

  19. arXiv:2206.06308  [pdf, other]

    cs.CL cs.AI

    Knowledge Graph Construction and Its Application in Automatic Radiology Report Generation from Radiologist's Dictation

    Authors: Kaveri Kale, Pushpak Bhattacharyya, Aditya Shetty, Milind Gune, Kush Shrivastava, Rustom Lawyer, Spriha Biswas

    Abstract: Conventionally, the radiologist prepares the diagnosis notes and shares them with the transcriptionist. Then the transcriptionist prepares a preliminary formatted report referring to the notes, and finally, the radiologist reviews the report, corrects the errors, and signs off. This workflow causes significant delays and errors in the report. In current research work, we focus on applications of N… ▽ More

    Submitted 13 June, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  20. arXiv:2206.06141  [pdf, other]

    cs.CL cs.CY

    Am I No Good? Towards Detecting Perceived Burdensomeness and Thwarted Belongingness from Suicide Notes

    Authors: Soumitra Ghosh, Asif Ekbal, Pushpak Bhattacharyya

    Abstract: The World Health Organization (WHO) has emphasized the importance of significantly accelerating suicide prevention efforts to fulfill the United Nations' Sustainable Development Goal (SDG) objective of 2030. In this paper, we present an end-to-end multitask system to address a novel task of detection of two interpersonal risk factors of suicide, Perceived Burdensomeness (PB) and Thwarted Belonging… ▽ More

    Submitted 20 May, 2022; originally announced June 2022.

    Comments: Accepted for publication at IJCAI-ECAI 2022 (AI for Good Track)

  21. arXiv:2206.02119  [pdf, other]

    cs.CL

    A Multimodal Corpus for Emotion Recognition in Sarcasm

    Authors: Anupama Ray, Shubham Mishra, Apoorva Nunna, Pushpak Bhattacharyya

    Abstract: While sentiment and emotion analysis have been studied extensively, the relationship between sarcasm and emotion has largely remained unexplored. A sarcastic expression may have a variety of underlying emotions. For example, "I love being ignored" belies sadness, while "my mobile is fabulous with a battery backup of only 15 minutes!" expresses frustration. Detecting the emotion behind a sarcastic… ▽ More

    Submitted 5 June, 2022; originally announced June 2022.

  22. arXiv:2206.02066  [pdf, other]

    cs.CV cs.AI

    PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers

    Authors: Jiacong Xu, Zixiang Xiong, Shankar P. Bhattacharyya

    Abstract: Two-branch network architecture has shown its efficiency and effectiveness in real-time semantic segmentation tasks. However, direct fusion of high-resolution details and low-frequency context has the drawback of detailed features being easily overwhelmed by surrounding contextual information. This overshoot phenomenon limits the improvement of the segmentation accuracy of existing two-branch mode… ▽ More

    Submitted 6 April, 2023; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: 11 pages, 9 figures; This paper will be published by CVPR2023 soon, please refer to the official version then

  23. arXiv:2205.15952  [pdf, other]

    cs.CL cs.AI cs.LG

    Knowledge Graph - Deep Learning: A Case Study in Question Answering in Aviation Safety Domain

    Authors: Ankush Agarwal, Raj Gite, Shreya Laddha, Pushpak Bhattacharyya, Satyanarayan Kar, Asif Ekbal, Prabhjit Thind, Rajesh Zele, Ravi Shankar

    Abstract: In the commercial aviation domain, there are a large number of documents, like, accident reports (NTSB, ASRS) and regulatory directives (ADs). There is a need for a system to access these diverse repositories efficiently in order to service needs in the aviation industry, like maintenance, compliance, and safety. In this paper, we propose a Knowledge Graph (KG) guided Deep Learning (DL) based Ques… ▽ More

    Submitted 9 June, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

    Comments: LREC 2022 Main Conference Accepted Paper

  24. arXiv:2205.15951  [pdf, other]

    cs.CL cs.CY cs.LG

    Hollywood Identity Bias Dataset: A Context Oriented Bias Analysis of Movie Dialogues

    Authors: Sandhya Singh, Prapti Roy, Nihar Sahoo, Niteesh Mallela, Himanshu Gupta, Pushpak Bhattacharyya, Milind Savagaonkar, Nidhi, Roshni Ramnani, Anutosh Maitra, Shubhashis Sengupta

    Abstract: Movies reflect society and also hold power to transform opinions. Social biases and stereotypes present in movies can cause extensive damage due to their reach. These biases are not always found to be the need of storyline but can creep in as the author's bias. Movie production houses would prefer to ascertain that the bias present in a script is the story's demand. Today, when deep learning model… ▽ More

    Submitted 1 June, 2022; v1 submitted 31 May, 2022; originally announced May 2022.

  25. arXiv:2205.13908  [pdf, other]

    cs.CL

    EmoInHindi: A Multi-label Emotion and Intensity Annotated Dataset in Hindi for Emotion Recognition in Dialogues

    Authors: Gopendra Vikram Singh, Priyanshu Priya, Mauajama Firdaus, Asif Ekbal, Pushpak Bhattacharyya

    Abstract: The long-standing goal of Artificial Intelligence (AI) has been to create human-like conversational systems. Such systems should have the ability to develop an emotional connection with the users, hence emotion recognition in dialogues is an important task. Emotion detection in dialogues is a challenging task because humans usually convey multiple emotions with varying degrees of intensities in a… ▽ More

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: This paper is accepted at LREC 2022

  26. arXiv:2204.13743  [pdf, other]

    cs.CL

    HiNER: A Large Hindi Named Entity Recognition Dataset

    Authors: Rudra Murthy, Pallab Bhattacharjee, Rahul Sharnagat, Jyotsana Khatri, Diptesh Kanojia, Pushpak Bhattacharyya

    Abstract: Named Entity Recognition (NER) is a foundational NLP task that aims to provide class labels like Person, Location, Organisation, Time, and Number to words in free text. Named Entities can also be multi-word expressions where the additional I-O-B annotation information helps label them during the NER annotation process. While English and European languages have considerable annotated data for the N… ▽ More

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: Accepted at LREC 2022, 8 pages

  27. Yb$^{3+}$ $f$-$f$ excitations in NaYbSe$_2$: benchmarking embedded-cluster quantum chemical schemes for 4$f$ insulators

    Authors: Pritam Bhattacharyya, Liviu Hozoi

    Abstract: $\tilde{S}\!=\!1/2$ triangular-lattice $f$-electron materials define a dynamic research area in condensed matter magnetism. In various Yb 4$f^{13}$ triangular-lattice compounds, for example, spin-liquid ground states seem to be realized. Using {\it ab initio} quantum chemical methods, we here investigate how correlation effects involving the 4$f$ electrons affect the on-site $f$-$f… ▽ More

    Submitted 18 June, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

  28. arXiv:2204.12086  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Electronic and structural properties of RbCeX$_2$ (X$_2$: O$_2$, S$_2$, SeS, Se$_2$, TeSe, Te$_2$)

    Authors: Brenden R. Ortiz, Mitchell M. Bordelon, Pritam Bhattacharyya, Ganesh Pokharel, Paul M. Sarte, Lorenzo Posthuma, Thorben Petersen, Mohamed S. Eldeeb, Garrett E. Granroth, Clarina R. Dela Cruz, Stuart Calder, Douglas L. Abernathy, Liviu Hozoi, Stephen D. Wilson

    Abstract: Triangular lattice delafossite compounds built from magnetic lanthanide ions are a topic of recent interest due to their frustrated magnetism and realization of quantum disordered magnetic ground states. Here we report the evolution of the structure and electronic ground states of RbCe$X_2$ compounds, built from a triangular lattice of Ce$^{3+}$ ions, upon varying their anion character ($X_2$= O… ▽ More

    Submitted 20 July, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: 12 pages, 7 figures

    Journal ref: Phys. Rev. Materials 6, 084402 (2022)

  29. Crystal-field effects competing with spin-orbit interactions in NaCeO$_2$

    Authors: Pritam Bhattacharyya, Ulrich K. Rößler, Liviu Hozoi

    Abstract: Ce compounds feature a remarkable diversity of electronic properties, which motivated extensive investigations over the last decades. Inelastic neutron scattering represents an important tool for understanding their underlying electronic structures but in certain cases a straightforward interpretation of the measured spectra is hampered by the presence of strong vibronic couplings. The latter may… ▽ More

    Submitted 18 June, 2022; v1 submitted 12 March, 2022; originally announced March 2022.

  30. arXiv:2202.02125  [pdf, other]

    cs.AI

    OntoSeer -- A Recommendation System to Improve the Quality of Ontologies

    Authors: Pramit Bhattacharyya, Raghava Mutharaju

    Abstract: Building an ontology is not only a time-consuming process, but it is also confusing, especially for beginners and the inexperienced. Although ontology developers can take the help of domain experts in building an ontology, they are not readily available in several cases for a variety of reasons. Ontology developers have to grapple with several questions related to the choice of classes, properties… ▽ More

    Submitted 4 February, 2022; originally announced February 2022.

  31. arXiv:2201.08951  [pdf, other]

    cs.CV cs.LG

    Visual Representation Learning with Self-Supervised Attention for Low-Label High-data Regime

    Authors: Prarthana Bhattacharyya, Chenge Li, Xiaonan Zhao, István Fehérvári, Jason Sun

    Abstract: Self-supervision has shown outstanding results for natural language processing, and more recently, for image recognition. Simultaneously, vision transformers and its variants have emerged as a promising and scalable alternative to convolutions on various computer vision tasks. In this paper, we are the first to question if self-supervised vision transformers (SSL-ViTs) can be adapted to two import… ▽ More

    Submitted 30 January, 2022; v1 submitted 21 January, 2022; originally announced January 2022.

    Comments: Accepted to ICASSP-2022

  32. arXiv:2201.02977  [pdf, other]

    cs.CL

    Indian Language Wordnets and their Linkages with Princeton WordNet

    Authors: Diptesh Kanojia, Kevin Patel, Pushpak Bhattacharyya

    Abstract: Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those based on knowledge-based approaches. In such approaches, these resources are considered as gold standard/oracle. Thus, it is crucial that t… ▽ More

    Submitted 9 January, 2022; originally announced January 2022.

    Comments: Published at LREC 2018

  33. arXiv:2201.01747  [pdf, other]

    cs.CL

    Semi-automatic WordNet Linking using Word Embeddings

    Authors: Kevin Patel, Diptesh Kanojia, Pushpak Bhattacharyya

    Abstract: Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those based on knowledge-based approaches. In such approaches, these resources are considered as gold standard/oracle. Thus, it is crucial that t… ▽ More

    Submitted 5 January, 2022; originally announced January 2022.

    Comments: Published at GWC 2018

  34. arXiv:2201.01693  [pdf]

    cs.CL

    Strategies of Effective Digitization of Commentaries and Sub-commentaries: Towards the Construction of Textual History

    Authors: Diptesh Kanojia, Malhar Kulkarni, Sayali Ghodekar, Eivind Kahrs, Pushpak Bhattacharyya

    Abstract: This paper describes additional aspects of a digital tool called the 'Textual History Tool'. We describe its various salient features with special reference to those of its features that may help the philologist digitize commentaries and sub-commentaries on a text. This tool captures the historical evolution of a text through various temporal stages, and interrelated data culled from various types… ▽ More

    Submitted 5 January, 2022; originally announced January 2022.

    Comments: Accepted at TCDK @ SSSU 2020; ISBN: 978-93-83097-43-2; Pages 477--489

  35. arXiv:2112.15471  [pdf, other]

    cs.CL

    A Survey on Using Gaze Behaviour for Natural Language Processing

    Authors: Sandeep Mathias, Diptesh Kanojia, Abhijit Mishra, Pushpak Bhattacharyya

    Abstract: Gaze behaviour has been used as a way to gather cognitive information for a number of years. In this paper, we discuss the use of gaze behaviour in solving different tasks in natural language processing (NLP) without having to record it at test time. This is because the collection of gaze behaviour is a costly task, both in terms of time and money. Hence, in this paper, we focus on research done t… ▽ More

    Submitted 3 January, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: Published at IJCAI-PRICAI 2020; Full Link: https://www.ijcai.org/proceedings/2020/683; The sole copyright holder is IJCAI (International Joint Conferences on Artificial Intelligence), all rights reserved

  36. arXiv:2112.15124  [pdf, other]

    cs.CL

    Utilizing Wordnets for Cognate Detection among Indian Languages

    Authors: Diptesh Kanojia, Kevin Patel, Pushpak Bhattacharyya, Malhar Kulkarni, Gholamreza Haffari

    Abstract: Automatic Cognate Detection (ACD) is a challenging task which has been utilized to help NLP applications like Machine Translation, Information Retrieval and Computational Phylogenetics. Unidentified cognate pairs can pose a challenge to these applications and result in a degradation of performance. In this paper, we detect cognate word pairs among ten Indian languages with Hindi and use deep learn… ▽ More

    Submitted 30 December, 2021; originally announced December 2021.

    Comments: Published at GWC 2019

  37. arXiv:2112.13800  [pdf, other]

    cs.CL

    "A Passage to India": Pre-trained Word Embeddings for Indian Languages

    Authors: Kumar Saurav, Kumar Saunack, Diptesh Kanojia, Pushpak Bhattacharyya

    Abstract: Dense word vectors or 'word embeddings' which encode semantic properties of words, have now become integral to NLP tasks like Machine Translation (MT), Question Answering (QA), Word Sense Disambiguation (WSD), and Information Retrieval (IR). In this paper, we use various existing approaches to create multiple word embeddings for 14 Indian languages. We place these embeddings for all these language… ▽ More

    Submitted 27 December, 2021; originally announced December 2021.

    Comments: Published at LREC 2020

  38. arXiv:2112.09526  [pdf, other]

    cs.CL

    Challenge Dataset of Cognates and False Friend Pairs from Indian Languages

    Authors: Diptesh Kanojia, Pushpak Bhattacharyya, Malhar Kulkarni, Gholamreza Haffari

    Abstract: Cognates are present in multiple variants of the same text across different languages (e.g., "hund" in German and "hound" in English language mean "dog"). They pose a challenge to various Natural Language Processing (NLP) applications such as Machine Translation, Cross-lingual Sense Disambiguation, Computational Phylogenetics, and Information Retrieval. A possible solution to address this challeng… ▽ More

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: Published at LREC 2020

  39. arXiv:2112.08789  [pdf, other]

    cs.CL

    Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages

    Authors: Diptesh Kanojia, Raj Dabre, Shubham Dewangan, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni

    Abstract: Cognates are variants of the same lexical form across different languages; for example 'fonema' in Spanish and 'phoneme' in English are cognates, both of which mean 'a unit of sound'. The task of automatic detection of cognates among any two languages can help downstream NLP tasks such as Cross-lingual Information Retrieval, Computational Phylogenetics, and Machine Translation. In this paper, we d… ▽ More

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: Published at COLING 2020

  40. arXiv:2112.08087  [pdf, other]

    cs.CL cs.AI

    Cognition-aware Cognate Detection

    Authors: Diptesh Kanojia, Prashant Sharma, Sayali Ghodekar, Pushpak Bhattacharyya, Gholamreza Haffari, Malhar Kulkarni

    Abstract: Automatic detection of cognates helps downstream NLP tasks of Machine Translation, Cross-lingual Information Retrieval, Computational Phylogenetics and Cross-lingual Named Entity Recognition. Previous approaches for the task of cognate detection use orthographic, phonetic and semantic similarity based features sets. In this paper, we propose a novel method for enriching the feature sets, with cogn… ▽ More

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: Published at EACL 2021

  41. 3D Scene Understanding at Urban Intersection using Stereo Vision and Digital Map

    Authors: Prarthana Bhattacharyya, Yanlei Gu, Jiali Bao, Xu Liu, Shunsuke Kamijo

    Abstract: The driving behavior at urban intersections is very complex. It is thus crucial for autonomous vehicles to comprehensively understand challenging urban traffic scenes in order to navigate intersections and prevent accidents. In this paper, we introduce a stereo vision and 3D digital map based approach to spatially and temporally analyze the traffic situation at urban intersections. Stereo vision i… ▽ More

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 6 pages, 6 figures

    Journal ref: 2017 IEEE 85th Vehicular Technology Conference (VTC Spring)

  42. arXiv:2112.03849  [pdf, ps, other]

    cs.CL cs.AI

    Natural Answer Generation: From Factoid Answer to Full-length Answer using Grammar Correction

    Authors: Manas Jain, Sriparna Saha, Pushpak Bhattacharyya, Gladvin Chinnadurai, Manish Kumar Vatsa

    Abstract: Question Answering systems these days typically use template-based language generation. Though adequate for a domain-specific task, these systems are too restrictive and predefined for domain-independent systems. This paper proposes a system that outputs a full-length answer given a question and the extracted factoid answer (short spans such as named entities) as the input. Our system uses constit…

    Submitted 7 December, 2021; originally announced December 2021.

  43. arXiv:2112.02590  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Crystal Growth, Exfoliation and Magnetic Properties of Quaternary Quasi-Two-Dimensional CuCrP$_2$S$_6$

    Authors: Sebastian Selter, Kranthi K. Bestha, Pritam Bhattacharyya, Burak Özer, Yuliia Shemerliuk, Laura T. Corredor, Louis Veyrat, Anja U. B. Wolter, Liviu Hozoi, Bernd Büchner, Saicharan Aswartham

    Abstract: We report optimized crystal growth conditions for the quaternary compound CuCrP$_2$S$_6$ by chemical vapor transport. Compositional and structural characterization of the obtained crystals were carried out by means of energy-dispersive X-ray spectroscopy and powder X-ray diffraction. CuCrP$_2$S$_6$ is structurally closely related to the $M_2$P$_2$S$_6$ family ($M$: transition metal), which contain…

    Submitted 28 April, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 12 pages, 11 figures, 46 references

  44. arXiv:2111.13972  [pdf, other]

    cs.CL

    Tapping BERT for Preposition Sense Disambiguation

    Authors: Siddhesh Pawar, Shyam Thombre, Anirudh Mittal, Girishkumar Ponkiya, Pushpak Bhattacharyya

    Abstract: Prepositions are frequently occurring polysemous words. Disambiguation of prepositions is crucial in tasks like semantic role labelling, question answering, text entailment, and noun compound paraphrasing. In this paper, we propose a novel methodology for preposition sense disambiguation (PSD), which does not use any linguistic tools. In a supervised setting, the machine learning model is presente…

    Submitted 27 November, 2021; originally announced November 2021.

    ACM Class: I.2.7

  45. arXiv:2110.12765  [pdf, other]

    cs.CL cs.AI

    "So You Think You're Funny?": Rating the Humour Quotient in Standup Comedy

    Authors: Anirudh Mittal, Pranav Jeevan, Prerak Gandhi, Diptesh Kanojia, Pushpak Bhattacharyya

    Abstract: Computational Humour (CH) has attracted the interest of Natural Language Processing and Computational Linguistics communities. Creating datasets for automatic measurement of humour quotient is difficult due to multiple possible interpretations of the content. In this work, we create a multi-modal humour-annotated dataset ($\sim$40 hours) using stand-up comedy clips. We devise a novel scoring mecha…

    Submitted 25 October, 2021; originally announced October 2021.

    Comments: Accepted at EMNLP 2021 Main Conference (short papers); 4 pages, 1 figure, 3 tables

  46. arXiv:2110.09321  [pdf, other]

    cs.CL cs.AI

    COVIDRead: A Large-scale Question Answering Dataset on COVID-19

    Authors: Tanik Saikh, Sovan Kumar Sahoo, Asif Ekbal, Pushpak Bhattacharyya

    Abstract: During this pandemic situation, extracting any relevant information related to COVID-19 will be immensely beneficial to the community at large. In this paper, we present a very important resource, COVIDRead, a Stanford Question Answering Dataset (SQuAD) like dataset over more than 100k question-answer pairs. The dataset consists of Context-Answer-Question triples. Primarily the questions from the…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: 20 pages, 7 figures

  47. arXiv:2109.12792  [pdf, other]

    physics.chem-ph physics.atm-clus

    Why does B$_{12}$H$_{12}$-icosahedron need two electrons to be stable: A first-principles electron-correlated investigation of B$_{12}$H$_{n}$ ($n=$6,12) clusters

    Authors: Pritam Bhattacharyya, Ihsan Boustani, Alok Shukla

    Abstract: In this work, we present large-scale electron-correlated computations on various conformers of B$_{12}$H$_{12}$ and B$_{12}$H$_{6}$ clusters, to understand the reasons behind the high stability of di-anion icosahedron ($I_{h}$) and cage-like B$_{12}$H$_{6}$ geometries. Although the B$_{12}$-icosahedron is the basic building block in some structures of bulk boron, it is unstable in its free form. F…

    Submitted 31 December, 2021; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: 26 pages, 9 figures (included)

    Journal ref: J. Phys. Chem. A 125, 10734 (2021)

  48. arXiv:2109.10534  [pdf, other]

    cs.CL

    Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages

    Authors: Tejas Indulal Dhamecha, Rudra Murthy V, Samarth Bharadwaj, Karthik Sankaranarayanan, Pushpak Bhattacharyya

    Abstract: We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning. We hypothesize and validate that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications, compared to models fine-tuned on individual languages. A first of its kind detailed study is presented to tr…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: Accepted in EMNLP 2021

  49. arXiv:2108.01260  [pdf, other]

    cs.CL

    M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations

    Authors: Dushyant Singh Chauhan, Gopendra Vikram Singh, Navonil Majumder, Amir Zadeh, Asif Ekbal, Pushpak Bhattacharyya, Louis-Philippe Morency, Soujanya Poria

    Abstract: Humor recognition in conversations is a challenging task that has recently gained popularity due to its importance in dialogue understanding, including in multimodal settings (i.e., text, acoustics, and visual). The few existing datasets for humor are mostly in English. However, due to the tremendous growth in multilingual content, there is a great demand to build models and systems that support m…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: ICMI 2021

  50. arXiv:2106.04995  [pdf, other]

    cs.CL cs.LG

    Crosslingual Embeddings are Essential in UNMT for Distant Languages: An English to IndoAryan Case Study

    Authors: Tamali Banerjee, Rudra Murthy V, Pushpak Bhattacharyya

    Abstract: Recent advances in Unsupervised Neural Machine Translation (UNMT) have minimized the gap between supervised and unsupervised machine translation performance for closely related language pairs. However, the situation is very different for distant language pairs. Lack of lexical overlap and low syntactic similarities such as between English and Indo-Aryan languages leads to poor translation quality…

    Submitted 9 June, 2021; originally announced June 2021.