-
Anisotropic Coulomb exchange as source of Kitaev and off-diagonal symmetric anisotropic couplings
Authors:
Pritam Bhattacharyya,
Thorben Petersen,
Nikolay A. Bogdanov,
Liviu Hozoi
Abstract:
Exchange underpins the magnetic properties of quantum matter. In its most basic form, it occurs through the interplay of Pauli's exclusion principle and Coulomb repulsion, being referred to as Coulomb exchange. Pauli's exclusion principle combined with inter-atomic electron hopping additionally leads to kinetic exchange and superexchange. Here we disentangle the different exchange channels in anisotropic Kitaev-Heisenberg context. By quantum chemical computations, we show that anisotropic Coulomb exchange, completely neglected so far in the field, may be as large as (or even larger than) other contributions -- kinetic exchange and superexchange. This opens new perspectives onto anisotropic exchange mechanisms and sets the proper conceptual framework for further research on tuning Kitaev-Heisenberg magnetism.
Submitted 30 June, 2023;
originally announced June 2023.
-
Replace and Report: NLP Assisted Radiology Report Generation
Authors:
Kaveri Kale,
Pushpak Bhattacharyya,
Kshitij Jadhav
Abstract:
Clinical practice frequently uses medical imaging for diagnosis and treatment. A significant challenge for automatic radiology report generation is that the radiology reports are long narratives consisting of multiple sentences for both abnormal and normal findings. Therefore, applying conventional image captioning approaches to generate the whole report proves to be insufficient, as these are designed to briefly describe images with short sentences. We propose a template-based approach to generate radiology reports from radiographs. Our approach involves the following: i) using a multilabel image classifier, produce the tags for the input radiograph; ii) using a transformer-based model, generate pathological descriptions (a description of abnormal findings seen on radiographs) from the tags generated in step (i); iii) using a BERT-based multi-label text classifier, find the spans in the normal report template to replace with the generated pathological descriptions; and iv) using a rule-based system, replace the identified span with the generated pathological description. We performed experiments with the two most popular radiology report datasets, IU Chest X-ray and MIMIC-CXR and demonstrated that the BLEU-1, ROUGE-L, METEOR, and CIDEr scores are better than the State-of-the-Art models by 25%, 36%, 44% and 48% respectively, on the IU X-RAY dataset. To the best of our knowledge, this is the first attempt to generate chest X-ray radiology reports by first creating small sentences for abnormal findings and then replacing them in the normal report template.
Submitted 19 June, 2023;
originally announced June 2023.
-
Adversarial Training For Low-Resource Disfluency Correction
Authors:
Vineet Bhat,
Preethi Jyothi,
Pushpak Bhattacharyya
Abstract:
Disfluencies commonly occur in conversational speech. Speech with disfluencies can result in noisy Automatic Speech Recognition (ASR) transcripts, which affects downstream tasks like machine translation. In this paper, we propose an adversarially-trained sequence-tagging model for Disfluency Correction (DC) that utilizes a small amount of labeled real disfluent data in conjunction with a large amount of unlabeled data. We show the benefit of our proposed technique, which crucially depends on synthetically generated disfluent data, by evaluating it for DC in three Indian languages- Bengali, Hindi, and Marathi (all from the Indo-Aryan family). Our technique also performs well in removing stuttering disfluencies in ASR transcripts introduced by speech impairments. We achieve an average 6.15 points improvement in F1-score over competitive baselines across all three languages mentioned. To the best of our knowledge, we are the first to utilize adversarial training for DC and use it to correct stuttering disfluencies in English, establishing a new benchmark for this task.
Submitted 10 June, 2023;
originally announced June 2023.
-
"A Little is Enough": Few-Shot Quality Estimation based Corpus Filtering improves Machine Translation
Authors:
Akshay Batheja,
Pushpak Bhattacharyya
Abstract:
Quality Estimation (QE) is the task of evaluating the quality of a translation when a reference translation is not available. The goal of QE aligns with the task of corpus filtering, where we assign a quality score to the sentence pairs present in the pseudo-parallel corpus. We propose a Quality Estimation based Filtering approach to extract high-quality parallel data from the pseudo-parallel corpus. To the best of our knowledge, this is a novel adaptation of the QE framework to extract a quality parallel corpus from the pseudo-parallel corpus. By training with this filtered corpus, we observe an improvement in the Machine Translation (MT) system's performance by up to 1.8 BLEU points, for English-Marathi, Chinese-English, and Hindi-Bengali language pairs, over the baseline model. The baseline model is the one that is trained on the whole pseudo-parallel corpus. Our Few-shot QE model, transfer learned from the English-Marathi QE model and fine-tuned on only 500 Hindi-Bengali training instances, shows an improvement of up to 0.6 BLEU points for the Hindi-Bengali language pair, compared to the baseline model. This demonstrates the promise of transfer learning in the setting under discussion. QE systems typically require on the order of 7K-25K training instances. Our Hindi-Bengali QE model is trained on only 500 instances, that is, 1/40th of the normal requirement, and achieves comparable performance. All the scripts and datasets utilized in this study will be publicly available.
Submitted 6 June, 2023;
originally announced June 2023.
-
Imaging the Meissner effect and flux trapping in a hydride superconductor at megabar pressures using a nanoscale quantum sensor
Authors:
Prabudhya Bhattacharyya,
Wuhao Chen,
Xiaoli Huang,
Shubhayu Chatterjee,
Benchen Huang,
Bryce Kobrin,
Yuanqi Lyu,
Thomas J. Smart,
Maxwell Block,
Esther Wang,
Zhipan Wang,
Weijie Wu,
Satcher Hsieh,
He Ma,
Srinivas Mandyam,
Bijuan Chen,
Emily Davis,
Zachary M. Geballe,
Chong Zu,
Viktor Struzhkin,
Raymond Jeanloz,
Joel E. Moore,
Tian Cui,
Giulia Galli,
Bertrand I. Halperin
, et al. (2 additional authors not shown)
Abstract:
By directly altering microscopic interactions, pressure provides a powerful tuning knob for the exploration of condensed phases and geophysical phenomena. The megabar regime represents an exciting frontier, where recent discoveries include novel high-temperature superconductors, as well as structural and valence phase transitions. However, at such high pressures, many conventional measurement techniques fail. Here, we demonstrate the ability to perform local magnetometry inside of a diamond anvil cell with sub-micron spatial resolution at megabar pressures. Our approach utilizes a shallow layer of Nitrogen-Vacancy (NV) color centers implanted directly within the anvil; crucially, we choose a crystal cut compatible with the intrinsic symmetries of the NV center to enable functionality at megabar pressures. We apply our technique to characterize a recently discovered hydride superconductor, CeH$_9$. By performing simultaneous magnetometry and electrical transport measurements, we observe the dual signatures of superconductivity: local diamagnetism characteristic of the Meissner effect and a sharp drop of the resistance to near zero. By locally mapping the Meissner effect and flux trapping, we directly image the geometry of superconducting regions, revealing significant inhomogeneities at the micron scale. Our work brings quantum sensing to the megabar frontier and enables the closed loop optimization of superhydride materials synthesis.
Submitted 5 June, 2023;
originally announced June 2023.
-
"Let's not Quote out of Context": Unified Vision-Language Pretraining for Context Assisted Image Captioning
Authors:
Abisek Rajakumar Kalarani,
Pushpak Bhattacharyya,
Niyati Chhaya,
Sumit Shekhar
Abstract:
Well-formed context aware image captions and tags in enterprise content such as marketing material are critical to ensure their brand presence and content recall. Manual creation and updates to ensure the same is non trivial given the scale and the tedium towards this task. We propose a new unified Vision-Language (VL) model based on the One For All (OFA) model, with a focus on context-assisted image captioning where the caption is generated based on both the image and its context. Our approach aims to overcome the context-independent (image and text are treated independently) nature of the existing approaches. We exploit context by pretraining our model with datasets of three tasks: news image captioning where the news article is the context, contextual visual entailment, and keyword extraction from the context. The second pretraining task is a new VL task, and we construct and release two datasets for the task with 1.1M and 2.2K data instances. Our system achieves state-of-the-art results with an improvement of up to 8.34 CIDEr score on the benchmark news image captioning datasets. To the best of our knowledge, ours is the first effort at incorporating contextual information in pretraining the models for the VL tasks.
Submitted 1 June, 2023;
originally announced June 2023.
-
A Match Made in Heaven: A Multi-task Framework for Hyperbole and Metaphor Detection
Authors:
Naveen Badathala,
Abisek Rajakumar Kalarani,
Tejpalsingh Siledar,
Pushpak Bhattacharyya
Abstract:
Hyperbole and metaphor are common in day-to-day communication (e.g., "I am in deep trouble": how does trouble have depth?), which makes their detection important, especially in a conversational AI setting. Existing approaches to automatically detect metaphor and hyperbole have studied these language phenomena independently, but their relationship has hardly, if ever, been explored computationally. In this paper, we propose a multi-task deep learning framework to detect hyperbole and metaphor simultaneously. We hypothesize that metaphors help in hyperbole detection, and vice-versa. To test this hypothesis, we annotate two hyperbole datasets- HYPO and HYPO-L- with metaphor labels. Simultaneously, we annotate two metaphor datasets- TroFi and LCC- with hyperbole labels. Experiments using these datasets give an improvement of the state of the art of hyperbole detection by 12%. Additionally, our multi-task learning (MTL) approach shows an improvement of up to 17% over single-task learning (STL) for both hyperbole and metaphor detection, supporting our hypothesis. To the best of our knowledge, ours is the first demonstration of computational leveraging of linguistic intimacy between metaphor and hyperbole, leading to showing the superiority of MTL over STL for hyperbole and metaphor detection.
Submitted 30 May, 2023; v1 submitted 27 May, 2023;
originally announced May 2023.
-
DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction
Authors:
Vineet Bhat,
Preethi Jyothi,
Pushpak Bhattacharyya
Abstract:
Conversational speech often consists of deviations from the speech plan, producing disfluent utterances that affect downstream NLP tasks. Removing these disfluencies is necessary to create fluent and coherent speech. This paper presents DisfluencyFixer, a tool that performs speech-to-speech disfluency correction in English and Hindi using a pipeline of Automatic Speech Recognition (ASR), Disfluency Correction (DC) and Text-To-Speech (TTS) models. Our proposed system removes disfluencies from input speech and returns fluent speech as output along with its transcript, disfluency type and total disfluency count in source utterance, providing a one-stop destination for language learners to improve the fluency of their speech. We evaluate the performance of our tool subjectively and receive scores of 4.26, 4.29 and 4.42 out of 5 in ASR performance, DC performance and ease-of-use of the system. Our tool can be accessed openly at the following link.
Submitted 26 May, 2023;
originally announced May 2023.
-
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages
Authors:
Shivam Mhaskar,
Vineet Bhat,
Akshay Batheja,
Sourabh Deoghare,
Paramveer Choudhary,
Pushpak Bhattacharyya
Abstract:
In this work, we present our deployment-ready Speech-to-Speech Machine Translation (SSMT) system for English-Hindi, English-Marathi, and Hindi-Marathi language pairs. We develop the SSMT system by cascading Automatic Speech Recognition (ASR), Disfluency Correction (DC), Machine Translation (MT), and Text-to-Speech Synthesis (TTS) models. We discuss the challenges faced during the research and development stage and the scalable deployment of the SSMT system as a publicly accessible web service. On the MT part of the pipeline too, we create a Text-to-Text Machine Translation (TTMT) service in all six translation directions involving English, Hindi, and Marathi. To mitigate data scarcity, we develop a LaBSE-based corpus filtering tool to select high-quality parallel sentences from a noisy pseudo-parallel corpus for training the TTMT system. All the data used for training the SSMT and TTMT systems and the best models are being made publicly available. Users of our system are (a) Govt. of India in the context of its new education policy (NEP), (b) tourists who criss-cross the multilingual landscape of India, (c) Indian Judiciary where a leading cause of the pendency of cases (to the order of 10 million as on date) is the translation of case papers, (d) farmers who need weather and price information and so on. We also share the feedback received from various stakeholders when our SSMT and TTMT systems were demonstrated in large public events.
Submitted 21 May, 2023;
originally announced May 2023.
-
Denoising-based UNMT is more robust to word-order divergence than MASS-based UNMT
Authors:
Tamali Banerjee,
Rudra Murthy V,
Pushpak Bhattacharyya
Abstract:
We aim to investigate whether UNMT approaches with self-supervised pre-training are robust to word-order divergence between language pairs. We achieve this by comparing two models pre-trained with the same self-supervised pre-training objective. The first model is trained on language pairs with different word-orders, and the second model is trained on the same language pairs with source language re-ordered to match the word-order of the target language. Ideally, UNMT approaches which are robust to word-order divergence should exhibit no visible performance difference between the two configurations. In this paper, we investigate two such self-supervised pre-training based UNMT approaches, namely Masked Sequence-to-Sequence Pre-Training (MASS), which does not have shuffling noise, and Denoising AutoEncoder (DAE), which has shuffling noise.
We experiment with five English$\rightarrow$Indic language pairs (i.e., en-hi, en-bn, en-gu, en-kn, and en-ta), where the word-order of the source language is SVO (Subject-Verb-Object) and the word-order of the target languages is SOV (Subject-Object-Verb). We observed that for these language pairs, the DAE-based UNMT approach consistently outperforms MASS in terms of translation accuracies. Moreover, bridging the word-order gap using reordering improves the translation accuracy of MASS-based UNMT models, while it cannot improve the translation accuracy of DAE-based UNMT models. This observation indicates that DAE-based UNMT is more robust to word-order divergence than MASS-based UNMT. Word-shuffling noise in the DAE approach could be the reason for the approach being robust to word-order divergence.
Submitted 2 March, 2023;
originally announced March 2023.
-
Sweet spot in the RuCl$_3$ magnetic system: nearly ideal $j_{\mathrm{eff}}\!=\!1/2$ moments and maximized $K/J$ ratio under pressure
Authors:
Pritam Bhattacharyya,
Liviu Hozoi,
Quirin Stahl,
Jochen Geck,
Nikolay A. Bogdanov
Abstract:
Maximizing the ratio between Kitaev and residual Heisenberg interactions is a major goal in current research on Kitaev-Heisenberg quantum magnets. Here we investigate Kitaev-Heisenberg exchange in a recently discovered crystalline phase of RuCl$_3$ under pressure -- it displays unusually high symmetry, with only one type of Ru-Ru links, and uniform Ru-Cl-Ru bond angles of $\approx$93$^{\circ}$. By quantum chemical calculations in this particular honeycomb-lattice setting we find a very small $J$, which yields a $K/J$ ratio as large as $\sim$100. Interestingly, we also find that this is associated with vanishingly small $d$-shell trigonal splittings, i.\,e., minimal departure from ideal $j_{\mathrm{eff}}\!=\!1/2$ moments. This reconfirms RuCl$_3$ as a most promising platform for materializing the much sought-after Kitaev spin-liquid phase and stimulates further experiments under strain and pressure.
Submitted 1 February, 2023;
originally announced February 2023.
-
Improving Machine Translation with Phrase Pair Injection and Corpus Filtering
Authors:
Akshay Batheja,
Pushpak Bhattacharyya
Abstract:
In this paper, we show that the combination of Phrase Pair Injection and Corpus Filtering boosts the performance of Neural Machine Translation (NMT) systems. We extract parallel phrases and sentences from the pseudo-parallel corpus and augment it with the parallel corpus to train the NMT models. With the proposed approach, we observe an improvement in the Machine Translation (MT) system for 3 low-resource language pairs, Hindi-Marathi, English-Marathi, and English-Pashto, and 6 translation directions by up to 2.7 BLEU points, on the FLORES test data. These BLEU score improvements are over the models trained using the whole pseudo-parallel corpus augmented with the parallel corpus.
Submitted 19 January, 2023;
originally announced January 2023.
-
There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering
Authors:
Ankush Agarwal,
Sakharam Gawade,
Sachin Channabasavarajendra,
Pushpak Bhattacharyya
Abstract:
The integration of knowledge graphs with deep learning is thriving in improving the performance of various natural language processing (NLP) tasks. In this paper, we focus on knowledge-infused link prediction and question answering using language models, T5, and BLOOM across three domains: Aviation, Movie, and Web. In this context, we infuse knowledge in large and small language models and study their performance, and find the performance to be similar. For the link prediction task on the Aviation Knowledge Graph, we obtain a 0.2 hits@1 score using T5-small, T5-base, T5-large, and BLOOM. Using template-based scripts, we create a set of 1 million synthetic factoid QA pairs in the aviation domain from National Transportation Safety Board (NTSB) reports. On our curated QA pairs, the three models of T5 achieve a 0.7 hits@1 score. We validate our findings with the paired student t-test and Cohen's kappa scores. For link prediction on Aviation Knowledge Graph using T5-small and T5-large, we obtain a Cohen's kappa score of 0.76, showing substantial agreement between the models. Thus, we infer that small language models perform similarly to large language models with the infusion of knowledge.
Submitted 10 January, 2023;
originally announced January 2023.
-
Resonating holes vs molecular spin-orbit coupled states in group-5 lacunar spinels
Authors:
Thorben Petersen,
Pritam Bhattacharyya,
Ulrich K. Rößler,
Liviu Hozoi
Abstract:
The valence electronic structure of magnetic centers is one of the factors that determines the characteristics of a magnet. It may refer to orbital degeneracy, as for $j_\text{eff}=1/2$ Kitaev magnets, or near-degeneracy, e.g. involving the third and fourth shells in cuprate superconductors. Here we explore the inner structure of magnetic moments in group-5 lacunar spinels, fascinating materials featuring multisite magnetic units in the form of tetrahedral tetramers. Our quantum chemical analysis reveals a very colorful landscape, much richer than the single-electron, single-configuration description applied so far to all group-5 Ga$M_4X_8$ chalcogenides, and clarifies the basic multiorbital correlations on $M_4$ tetrahedral clusters: while for V strong correlations yield a wave-function that can be well described in terms of four V$^{4+}$V$^{3+}$V$^{3+}$V$^{3+}$ resonant valence structures, for Nb and Ta a picture of dressed molecular-orbital-like $j_\text{eff}=3/2$ entities is more appropriate. These internal degrees of freedom likely shape vibronic couplings, phase transitions, and magneto-electric properties in each of these systems.
Submitted 28 August, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
-
Benchmarking Gaussian Basis Sets in Quantum-Chemical Calculations of Photoabsorption Spectra of Light Atomic Clusters
Authors:
Vikram Mahamiya,
Pritam Bhattacharyya,
Alok Shukla
Abstract:
The choice of Gaussian basis functions for computing the ground-state properties of molecules, and clusters, employing wave-function-based electron-correlated approaches, is a well-studied subject. However, the same cannot be said when it comes to the excited-state properties of such systems, in general, and optical properties, in particular. The aim of the present study is to understand how the choice of basis functions affects the calculations of linear optical absorption in clusters, qualitatively, and quantitatively. For this purpose, we have calculated linear optical absorption spectra of several small charged and neutral clusters, namely, Li$_{2}$, Li$_{3}$, Li$_{4}$, B$_{2}^{+}$, B$_{3}^{+}$, Be$_{2}^{+}$, and Be$_{3}^{+}$, using a variety of Gaussian basis sets. The calculations were performed within the frozen-core approximation, and a rigorous account of electron correlation effects in the valence sector was taken by employing various levels of configuration interaction (CI) approach both for the ground and excited states. Our results on the peak locations in the absorption spectra of Li$_{3}$ and Li$_{4}$ are in very good agreement with the experiments. Our general recommendation is that for excited-state calculations, it is very important to utilize those basis sets which contain augmented functions. Relatively smaller aug-cc-pVDZ basis sets also yield high-quality results for photoabsorption spectra, and are recommended for such calculations if the computational resources are limited.
Submitted 6 January, 2023;
originally announced January 2023.
-
NaRuO$_2$: Kitaev-Heisenberg exchange in triangular-lattice setting
Authors:
Pritam Bhattacharyya,
Nikolay A. Bogdanov,
Satoshi Nishimoto,
Stephen D. Wilson,
Liviu Hozoi
Abstract:
Kitaev exchange, a new paradigm in quantum magnetism research, occurs for 90$^{\circ}$ metal-ligand-metal links, $t_{2g}^5$ transition ions, and sizable spin-orbit coupling. It is being studied in honeycomb compounds but also on triangular lattices. While for the former it is known by now that the Kitaev intersite couplings are ferromagnetic, for the latter the situation is unclear. Here we pin down the exchange mechanisms and determine the effective coupling constants in the $t_{2g}^5$ triangular-lattice material NaRuO$_2$, recently found to host a quantum spin liquid ground state. We show that, compared to honeycomb compounds, the characteristic triangular-lattice cation surroundings dramatically affect exchange paths and effective coupling parameters, changing the Kitaev interactions to antiferromagnetic. The quantum chemical analysis and subsequent effective spin model computations provide perspective onto the nature of the experimentally observed quantum spin liquid -- it seemingly implies finite longer-range exchange, and the atypical proximity to ferromagnetic order is related to sizable ferromagnetic Heisenberg nearest-neighbor couplings.
Submitted 3 July, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Detecting Unintended Social Bias in Toxic Language Datasets
Authors:
Nihar Sahoo,
Himanshu Gupta,
Pushpak Bhattacharyya
Abstract:
With the rise of online hate speech, automatic detection of hate speech and offensive texts as a natural language processing task is getting popular. However, very little research has been done to detect unintended social bias from these toxic language datasets. This paper introduces a new dataset ToxicBias curated from the existing dataset of Kaggle competition named "Jigsaw Unintended Bias in Toxicity Classification". We aim to detect social biases, their categories, and targeted groups. The dataset contains instances annotated for five different bias categories, viz., gender, race/ethnicity, religion, political, and LGBTQ. We train transformer-based models using our curated datasets and report baseline performance for bias identification, target generation, and bias implications. Model biases and their mitigation are also discussed in detail. Our study motivates a systematic extraction of social bias data from toxic language datasets. All the codes and dataset used for experiments in this work are publicly available.
Submitted 21 October, 2022;
originally announced October 2022.
-
SSL-Lanes: Self-Supervised Learning for Motion Forecasting in Autonomous Driving
Authors:
Prarthana Bhattacharyya,
Chengjie Huang,
Krzysztof Czarnecki
Abstract:
Self-supervised learning (SSL) is an emerging technique that has been successfully employed to train convolutional neural networks (CNNs) and graph neural networks (GNNs) for more transferable, generalizable, and robust representation learning. However, its potential in motion forecasting for autonomous driving has rarely been explored. In this study, we report the first systematic exploration and assessment of incorporating self-supervision into motion forecasting. We first propose to investigate four novel self-supervised learning tasks for motion forecasting with theoretical rationale and quantitative and qualitative comparisons on the challenging large-scale Argoverse dataset. Secondly, we point out that our auxiliary SSL-based learning setup not only outperforms forecasting methods which use transformers, complicated fusion mechanisms and sophisticated online dense goal candidate optimization algorithms in terms of performance accuracy, but also has low inference time and architectural complexity. Lastly, we conduct several experiments to understand why SSL improves motion forecasting. Code is open-sourced at https://github.com/AutoVision-cloud/SSL-Lanes.
Submitted 10 September, 2022; v1 submitted 28 June, 2022;
originally announced June 2022.
-
Knowledge Graph Construction and Its Application in Automatic Radiology Report Generation from Radiologist's Dictation
Authors:
Kaveri Kale,
Pushpak Bhattacharyya,
Aditya Shetty,
Milind Gune,
Kush Shrivastava,
Rustom Lawyer,
Spriha Biswas
Abstract:
Conventionally, the radiologist prepares the diagnosis notes and shares them with the transcriptionist. Then the transcriptionist prepares a preliminary formatted report referring to the notes, and finally, the radiologist reviews the report, corrects the errors, and signs off. This workflow causes significant delays and errors in the report. In the current research work, we focus on applications of NLP techniques like Information Extraction (IE) and domain-specific Knowledge Graph (KG) to automatically generate radiology reports from radiologist's dictation. This paper focuses on KG construction for each organ by extracting information from an existing large corpus of free-text radiology reports. We develop an information extraction pipeline that combines rule-based, pattern-based, and dictionary-based techniques with lexical-semantic features to extract entities and relations. Missing information in short dictation can be accessed from the KGs to generate pathological descriptions and hence the radiology report. Generated pathological descriptions are evaluated using semantic similarity metrics, which show 97% similarity with gold standard pathological descriptions. Also, our analysis shows that our IE module is performing better than the OpenIE tool for the radiology domain. Furthermore, we include a manual qualitative analysis from radiologists, which shows that 80-85% of the generated reports are correctly written, and the remaining are partially correct.
Submitted 13 June, 2022; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Am I No Good? Towards Detecting Perceived Burdensomeness and Thwarted Belongingness from Suicide Notes
Authors:
Soumitra Ghosh,
Asif Ekbal,
Pushpak Bhattacharyya
Abstract:
The World Health Organization (WHO) has emphasized the importance of significantly accelerating suicide prevention efforts to fulfill the United Nations' Sustainable Development Goal (SDG) objective of 2030. In this paper, we present an end-to-end multitask system to address a novel task of detection of two interpersonal risk factors of suicide, Perceived Burdensomeness (PB) and Thwarted Belongingness (TB) from suicide notes. We also introduce a manually translated code-mixed suicide notes corpus, CoMCEASE-v2.0, based on the benchmark CEASE-v2.0 dataset, annotated with temporal orientation, PB and TB labels. We exploit the temporal orientation and emotion information in the suicide notes to boost overall performance. For comprehensive evaluation of our proposed method, we compare it to several state-of-the-art approaches on the existing CEASE-v2.0 dataset and the newly announced CoMCEASE-v2.0 dataset. Empirical evaluation suggests that temporal and emotional information can substantially improve the detection of PB and TB.
Submitted 20 May, 2022;
originally announced June 2022.
-
A Multimodal Corpus for Emotion Recognition in Sarcasm
Authors:
Anupama Ray,
Shubham Mishra,
Apoorva Nunna,
Pushpak Bhattacharyya
Abstract:
While sentiment and emotion analysis have been studied extensively, the relationship between sarcasm and emotion has largely remained unexplored. A sarcastic expression may have a variety of underlying emotions. For example, "I love being ignored" belies sadness, while "my mobile is fabulous with a battery backup of only 15 minutes!" expresses frustration. Detecting the emotion behind a sarcastic expression is non-trivial yet an important task. We undertake the task of detecting the emotion in a sarcastic statement, which to the best of our knowledge, is hitherto unexplored. We start with the recently released multimodal sarcasm detection dataset (MUStARD) pre-annotated with 9 emotions. We identify and correct 343 incorrect emotion labels (out of 690). We double the size of the dataset, label it with emotions along with valence and arousal which are important indicators of emotional intensity. Finally, we label each sarcastic utterance with one of the four sarcasm types-Propositional, Embedded, Likeprefixed and Illocutionary, with the goal of advancing sarcasm detection research. Exhaustive experimentation with multimodal (text, audio, and video) fusion models establishes a benchmark for exact emotion recognition in sarcasm and outperforms the state-of-art sarcasm detection. We release the dataset enriched with various annotations and the code for research purposes: https://github.com/apoorva-nunna/MUStARD_Plus_Plus
Submitted 5 June, 2022;
originally announced June 2022.
-
PIDNet: A Real-time Semantic Segmentation Network Inspired by PID Controllers
Authors:
Jiacong Xu,
Zixiang Xiong,
Shankar P. Bhattacharyya
Abstract:
Two-branch network architecture has shown its efficiency and effectiveness in real-time semantic segmentation tasks. However, direct fusion of high-resolution details and low-frequency context has the drawback of detailed features being easily overwhelmed by surrounding contextual information. This overshoot phenomenon limits the improvement of the segmentation accuracy of existing two-branch models. In this paper, we make a connection between Convolutional Neural Networks (CNN) and Proportional-Integral-Derivative (PID) controllers and reveal that a two-branch network is equivalent to a Proportional-Integral (PI) controller, which inherently suffers from similar overshoot issues. To alleviate this problem, we propose a novel three-branch network architecture: PIDNet, which contains three branches to parse detailed, context and boundary information, respectively, and employs boundary attention to guide the fusion of detailed and context branches. Our family of PIDNets achieve the best trade-off between inference speed and accuracy and their accuracy surpasses all the existing models with similar inference speed on the Cityscapes and CamVid datasets. Specifically, PIDNet-S achieves 78.6% mIOU with inference speed of 93.2 FPS on Cityscapes and 80.1% mIOU with speed of 153.7 FPS on CamVid.
Submitted 6 April, 2023; v1 submitted 4 June, 2022;
originally announced June 2022.
-
Knowledge Graph - Deep Learning: A Case Study in Question Answering in Aviation Safety Domain
Authors:
Ankush Agarwal,
Raj Gite,
Shreya Laddha,
Pushpak Bhattacharyya,
Satyanarayan Kar,
Asif Ekbal,
Prabhjit Thind,
Rajesh Zele,
Ravi Shankar
Abstract:
In the commercial aviation domain, there are a large number of documents, like, accident reports (NTSB, ASRS) and regulatory directives (ADs). There is a need for a system to access these diverse repositories efficiently in order to service needs in the aviation industry, like maintenance, compliance, and safety. In this paper, we propose a Knowledge Graph (KG) guided Deep Learning (DL) based Question Answering (QA) system for aviation safety. We construct a Knowledge Graph from Aircraft Accident reports and contribute this resource to the community of researchers. The efficacy of this resource is tested and proved by the aforesaid QA system. Natural Language Queries constructed from the documents mentioned above are converted into SPARQL (the interface language of the RDF graph database) queries and answered. On the DL side, we have two different QA models: (i) BERT QA which is a pipeline of Passage Retrieval (Sentence-BERT based) and Question Answering (BERT based), and (ii) the recently released GPT-3. We evaluate our system on a set of queries created from the accident reports. Our combined QA system achieves 9.3% increase in accuracy over GPT-3 and 40.3% increase over BERT QA. Thus, we infer that KG-DL performs better than either singly.
Submitted 9 June, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
Hollywood Identity Bias Dataset: A Context Oriented Bias Analysis of Movie Dialogues
Authors:
Sandhya Singh,
Prapti Roy,
Nihar Sahoo,
Niteesh Mallela,
Himanshu Gupta,
Pushpak Bhattacharyya,
Milind Savagaonkar,
Nidhi,
Roshni Ramnani,
Anutosh Maitra,
Shubhashis Sengupta
Abstract:
Movies reflect society and also hold power to transform opinions. Social biases and stereotypes present in movies can cause extensive damage due to their reach. These biases are not always found to be the need of storyline but can creep in as the author's bias. Movie production houses would prefer to ascertain that the bias present in a script is the story's demand. Today, when deep learning models can give human-level accuracy in multiple tasks, having an AI solution to identify the biases present in the script at the writing stage can help them avoid the inconvenience of stalled release, lawsuits, etc. Since AI solutions are data intensive and there exists no domain specific data to address the problem of biases in scripts, we introduce a new dataset of movie scripts that are annotated for identity bias. The dataset contains dialogue turns annotated for (i) bias labels for seven categories, viz., gender, race/ethnicity, religion, age, occupation, LGBTQ, and other, which contains biases like body shaming, personality bias, etc. (ii) labels for sensitivity, stereotype, sentiment, emotion, emotion intensity, (iii) all labels annotated with context awareness, (iv) target groups and reason for bias labels and (v) expert-driven group-validation process for high quality annotations. We also report various baseline performances for bias identification and category detection on our dataset.
Submitted 1 June, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.
-
EmoInHindi: A Multi-label Emotion and Intensity Annotated Dataset in Hindi for Emotion Recognition in Dialogues
Authors:
Gopendra Vikram Singh,
Priyanshu Priya,
Mauajama Firdaus,
Asif Ekbal,
Pushpak Bhattacharyya
Abstract:
The long-standing goal of Artificial Intelligence (AI) has been to create human-like conversational systems. Such systems should have the ability to develop an emotional connection with the users, hence emotion recognition in dialogues is an important task. Emotion detection in dialogues is a challenging task because humans usually convey multiple emotions with varying degrees of intensities in a single utterance. Moreover, emotion in an utterance of a dialogue may be dependent on previous utterances making the task more complex. Emotion recognition has always been in great demand. However, most of the existing datasets for multi-label emotion and intensity detection in conversations are in English. To this end, we create a large conversational dataset in Hindi named EmoInHindi for multi-label emotion and intensity recognition in conversations containing 1,814 dialogues with a total of 44,247 utterances. We prepare our dataset in a Wizard-of-Oz manner for mental health and legal counselling of crime victims. Each utterance of the dialogue is annotated with one or more emotion categories from the 16 emotion classes including neutral, and their corresponding intensity values. We further propose strong contextual baselines that can detect emotion(s) and the corresponding intensity of an utterance given the conversational context.
Submitted 27 May, 2022;
originally announced May 2022.
-
HiNER: A Large Hindi Named Entity Recognition Dataset
Authors:
Rudra Murthy,
Pallab Bhattacharjee,
Rahul Sharnagat,
Jyotsana Khatri,
Diptesh Kanojia,
Pushpak Bhattacharyya
Abstract:
Named Entity Recognition (NER) is a foundational NLP task that aims to provide class labels like Person, Location, Organisation, Time, and Number to words in free text. Named Entities can also be multi-word expressions where the additional I-O-B annotation information helps label them during the NER annotation process. While English and European languages have considerable annotated data for the NER task, Indian languages lack on that front -- both in terms of quantity and following annotation standards. This paper releases a significantly sized standard-abiding Hindi NER dataset containing 109,146 sentences and 2,220,856 tokens, annotated with 11 tags. We discuss the dataset statistics in all their essential detail and provide an in-depth analysis of the NER tag-set used with our data. The statistics of tag-set in our dataset show a healthy per-tag distribution, especially for prominent classes like Person, Location and Organisation. Since the proof of resource-effectiveness is in building models with the resource and testing the model on benchmark data and against the leader-board entries in shared tasks, we do the same with the aforesaid data. We use different language models to perform the sequence labelling task for NER and show the efficacy of our data by performing a comparative evaluation with models trained on another dataset available for the Hindi NER task. Our dataset helps achieve a weighted F1 score of 88.78 with all the tags and 92.22 when we collapse the tag-set, as discussed in the paper. To the best of our knowledge, no available dataset meets the standards of volume (amount) and variability (diversity), as far as Hindi NER is concerned. We fill this gap through this work, which we hope will significantly help NLP for Hindi. We release this dataset with our code and models at https://github.com/cfiltnlp/HiNER
Submitted 28 April, 2022;
originally announced April 2022.
-
Yb$^{3+}$ $f$-$f$ excitations in NaYbSe$_2$: benchmarking embedded-cluster quantum chemical schemes for 4$f$ insulators
Authors:
Pritam Bhattacharyya,
Liviu Hozoi
Abstract:
$\tilde{S}\!=\!1/2$ triangular-lattice $f$-electron materials define a dynamic research area in condensed matter magnetism. In various Yb 4$f^{13}$ triangular-lattice compounds, for example, spin-liquid ground states seem to be realized. Using {\it ab initio} quantum chemical methods, we here investigate how correlation effects involving the 4$f$ electrons affect the on-site $f$-$f$ excitation spectrum in NaYbSe$_2$. The system is well suited for such a study since unambiguous inelastic neutron scattering data are available for the Yb$^{3+}$ $f$-$f$ transitions. The excitation energies obtained by configuration-interaction calculations with single and double substitutions agree within 3-4 meV with experimental values, which provides a not so expensive alternative to fitting experimental data at the model-Hamiltonian level in order to analyze $f$-center multiplet structures.
Submitted 18 June, 2022; v1 submitted 26 April, 2022;
originally announced April 2022.
-
Electronic and structural properties of RbCeX$_2$ (X$_2$: O$_2$, S$_2$, SeS, Se$_2$, TeSe, Te$_2$)
Authors:
Brenden R. Ortiz,
Mitchell M. Bordelon,
Pritam Bhattacharyya,
Ganesh Pokharel,
Paul M. Sarte,
Lorenzo Posthuma,
Thorben Petersen,
Mohamed S. Eldeeb,
Garrett E. Granroth,
Clarina R. Dela Cruz,
Stuart Calder,
Douglas L. Abernathy,
Liviu Hozoi,
Stephen D. Wilson
Abstract:
Triangular lattice delafossite compounds built from magnetic lanthanide ions are a topic of recent interest due to their frustrated magnetism and realization of quantum disordered magnetic ground states. Here we report the evolution of the structure and electronic ground states of RbCe$X_2$ compounds, built from a triangular lattice of Ce$^{3+}$ ions, upon varying their anion character ($X_2$= O$_2$, S$_2$, SeS, Se$_2$, TeSe, Te$_2$). This includes the discovery of a new member of this series, RbCeO$_2$, that potentially realizes a quantum disordered ground state analogous to NaYbO$_2$. Magnetization and susceptibility measurements reveal that all compounds manifest mean-field antiferromagnetic interactions and, with the exception of the oxide, possess signatures of magnetic correlations onset below 1 K. The crystalline electric field level scheme is explored via neutron scattering and \textit{ab initio} calculations in order to model the intramultiplet splitting of the $J=5/2$ multiplet. In addition to the two excited doublets expected within the $J=5/2$ manifold, we observe one extra, local mode present across the sample series. This added mode shifts downward in energy with increasing anion mass and decreasing crystal field strength, suggesting a long-lived anomalous mode endemic to anion motion about the Ce$^{3+}$ sites.
Submitted 20 July, 2022; v1 submitted 26 April, 2022;
originally announced April 2022.
-
Crystal-field effects competing with spin-orbit interactions in NaCeO$_2$
Authors:
Pritam Bhattacharyya,
Ulrich K. Rößler,
Liviu Hozoi
Abstract:
Ce compounds feature a remarkable diversity of electronic properties, which motivated extensive investigations over the last decades. Inelastic neutron scattering represents an important tool for understanding their underlying electronic structures but in certain cases a straightforward interpretation of the measured spectra is hampered by the presence of strong vibronic couplings. The latter may give rise to extra spectral features, which complicates the mapping of experimental data onto standard multiplet diagrams. To benchmark the performance of embedded-cluster quantum chemical computational schemes for the case of $4f$ systems, we here address the Ce 4$f^1$ multiplet structure of NaCeO$_2$, an antiferromagnet with $D_{2d}$ magnetic-site symmetry for which neutron scattering measurements indicate only weak vibronic effects. Very good agreement with the experimental results is found in the computations, which validates our computational approach and confirms NaCeO$_2$ as a 4$f$ magnet in the intermediate coupling regime with equally strong 4$f$-shell spin-orbit and crystal-field interactions.
Submitted 18 June, 2022; v1 submitted 12 March, 2022;
originally announced March 2022.
-
OntoSeer -- A Recommendation System to Improve the Quality of Ontologies
Authors:
Pramit Bhattacharyya,
Raghava Mutharaju
Abstract:
Building an ontology is not only a time-consuming process, but it is also confusing, especially for beginners and the inexperienced. Although ontology developers can take the help of domain experts in building an ontology, they are not readily available in several cases for a variety of reasons. Ontology developers have to grapple with several questions related to the choice of classes, properties, and the axioms that should be included. Apart from this, there are aspects such as modularity and reusability that should be taken care of. From among the thousands of publicly available ontologies and vocabularies in repositories such as Linked Open Vocabularies (LOV) and BioPortal, it is hard to know the terms (classes and properties) that can be reused in the development of an ontology. A similar problem exists in implementing the right set of ontology design patterns (ODPs) from among the several available. Generally, ontology developers make use of their experience in handling these issues, and the inexperienced ones have a hard time. In order to bridge this gap, we propose a tool named OntoSeer, that monitors the ontology development process and provides suggestions in real-time to improve the quality of the ontology under development. It can provide suggestions on the naming conventions to follow, vocabulary to reuse, ODPs to implement, and axioms to be added to the ontology. OntoSeer has been implemented as a Protégé plug-in.
Submitted 4 February, 2022;
originally announced February 2022.
-
Visual Representation Learning with Self-Supervised Attention for Low-Label High-data Regime
Authors:
Prarthana Bhattacharyya,
Chenge Li,
Xiaonan Zhao,
István Fehérvári,
Jason Sun
Abstract:
Self-supervision has shown outstanding results for natural language processing, and more recently, for image recognition. Simultaneously, vision transformers and their variants have emerged as a promising and scalable alternative to convolutions on various computer vision tasks. In this paper, we are the first to question if self-supervised vision transformers (SSL-ViTs) can be adapted to two important computer vision tasks in the low-label, high-data regime: few-shot image classification and zero-shot image retrieval. The motivation is to reduce the number of manual annotations required to train a visual embedder, and to produce generalizable and semantically meaningful embeddings. For few-shot image classification we train SSL-ViTs without any supervision, on external data, and use this trained embedder to adapt quickly to novel classes with a limited number of labels. For zero-shot image retrieval, we use SSL-ViTs pre-trained on a large dataset without any labels and fine-tune them with several metric learning objectives. Our self-supervised attention representations outperform the state-of-the-art on several public benchmarks for both tasks, namely miniImageNet and CUB200 for few-shot image classification by up to 6%-10%, and Stanford Online Products, Cars196 and CUB200 for zero-shot image retrieval by up to 4%-11%. Code is available at https://github.com/AutoVision-cloud/SSL-ViT-lowlabel-highdata.
Submitted 30 January, 2022; v1 submitted 21 January, 2022;
originally announced January 2022.
-
Indian Language Wordnets and their Linkages with Princeton WordNet
Authors:
Diptesh Kanojia,
Kevin Patel,
Pushpak Bhattacharyya
Abstract:
Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those based on knowledge-based approaches. In such approaches, these resources are considered as gold standard/oracle. Thus, it is crucial that these resources hold correct information. Thereby, they are created by human experts. However, human experts in multiple languages are hard to come by. Thus, the community would benefit from sharing of such manually created resources. In this paper, we release mappings of 18 Indian language wordnets linked with Princeton WordNet. We believe that availability of such resources will have a direct impact on the progress in NLP for these languages.
Submitted 9 January, 2022;
originally announced January 2022.
-
Semi-automatic WordNet Linking using Word Embeddings
Authors:
Kevin Patel,
Diptesh Kanojia,
Pushpak Bhattacharyya
Abstract:
Wordnets are rich lexico-semantic resources. Linked wordnets are extensions of wordnets, which link similar concepts in wordnets of different languages. Such resources are extremely useful in many Natural Language Processing (NLP) applications, primarily those based on knowledge-based approaches. In such approaches, these resources are considered as gold standard/oracle. Thus, it is crucial that these resources hold correct information. Thereby, they are created by human experts. However, manual maintenance of such resources is a tedious and costly affair. Thus techniques that can aid the experts are desirable. In this paper, we propose an approach to link wordnets. Given a synset of the source language, the approach returns a ranked list of potential candidate synsets in the target language from which the human expert can choose the correct one(s). Our technique is able to retrieve a winner synset in the top 10 ranked list for 60% of all synsets and 70% of noun synsets.
Submitted 5 January, 2022;
originally announced January 2022.
-
Strategies of Effective Digitization of Commentaries and Sub-commentaries: Towards the Construction of Textual History
Authors:
Diptesh Kanojia,
Malhar Kulkarni,
Sayali Ghodekar,
Eivind Kahrs,
Pushpak Bhattacharyya
Abstract:
This paper describes additional aspects of a digital tool called the 'Textual History Tool'. We describe its various salient features with special reference to those of its features that may help the philologist digitize commentaries and sub-commentaries on a text. This tool captures the historical evolution of a text through various temporal stages, and interrelated data culled from various types of related texts. We use the text of the Kāśikāvrtti (KV) as a sample text, and with the help of philologists, we digitize the commentaries available to us. We digitize the Nyāsa (Ny), the Padamañjarī (Pm) and sub-commentaries on the KV text known as the Tantrapradīpa (Tp), and the Makaranda (Mk). We divide each commentary and sub-commentary into functional units and describe the methodology and motivation behind the functional unit division. Our functional unit division helps generate more accurate phylogenetic trees for the text, based on distance methods using the data entered in the tool.
Submitted 5 January, 2022;
originally announced January 2022.
-
A Survey on Using Gaze Behaviour for Natural Language Processing
Authors:
Sandeep Mathias,
Diptesh Kanojia,
Abhijit Mishra,
Pushpak Bhattacharyya
Abstract:
Gaze behaviour has been used as a way to gather cognitive information for a number of years. In this paper, we discuss the use of gaze behaviour in solving different tasks in natural language processing (NLP) without having to record it at test time. This is because the collection of gaze behaviour is a costly task, both in terms of time and money. Hence, in this paper, we focus on research done to alleviate the need for recording gaze behaviour at run time. We also mention different eye tracking corpora in multiple languages, which are currently available and can be used in natural language processing. We conclude our paper by discussing applications in a domain - education - and how learning gaze behaviour can help in solving the tasks of complex word identification and automatic essay grading.
Submitted 3 January, 2022; v1 submitted 21 December, 2021;
originally announced December 2021.
-
Utilizing Wordnets for Cognate Detection among Indian Languages
Authors:
Diptesh Kanojia,
Kevin Patel,
Pushpak Bhattacharyya,
Malhar Kulkarni,
Gholamreza Haffari
Abstract:
Automatic Cognate Detection (ACD) is a challenging task which has been utilized to help NLP applications like Machine Translation, Information Retrieval and Computational Phylogenetics. Unidentified cognate pairs can pose a challenge to these applications and result in a degradation of performance. In this paper, we detect cognate word pairs among ten Indian languages with Hindi and use deep learning methodologies to predict whether a word pair is cognate or not. We identify IndoWordnet as a potential resource to detect cognate word pairs based on orthographic similarity-based methods and train neural network models using the data obtained from it. We identify parallel corpora as another potential resource and perform the same experiments for them. We also validate the contribution of Wordnets through further experimentation and report improved performance of up to 26%. We discuss the nuances of cognate detection among closely related Indian languages and release the lists of detected cognates as a dataset. We also observe the behaviour of, to an extent, unrelated Indian language pairs and release the lists of detected cognates among them as well.
Submitted 30 December, 2021;
originally announced December 2021.
-
"A Passage to India": Pre-trained Word Embeddings for Indian Languages
Authors:
Kumar Saurav,
Kumar Saunack,
Diptesh Kanojia,
Pushpak Bhattacharyya
Abstract:
Dense word vectors, or 'word embeddings', which encode semantic properties of words, have now become integral to NLP tasks like Machine Translation (MT), Question Answering (QA), Word Sense Disambiguation (WSD), and Information Retrieval (IR). In this paper, we use various existing approaches to create multiple word embeddings for 14 Indian languages. We place these embeddings for all these languages, viz., Assamese, Bengali, Gujarati, Hindi, Kannada, Konkani, Malayalam, Marathi, Nepali, Odiya, Punjabi, Sanskrit, Tamil, and Telugu in a single repository. Relatively newer approaches that emphasize catering to context (BERT, ELMo, etc.) have shown significant improvements, but require a large amount of resources to generate usable models. We release pre-trained embeddings generated using both contextual and non-contextual approaches. We also use MUSE and XLM to train cross-lingual embeddings for all pairs of the aforementioned languages. To show the efficacy of our embeddings, we evaluate our embedding models on XPOS, UPOS and NER tasks for all these languages. We release a total of 436 models using 8 different approaches. We hope they are useful for the resource-constrained Indian language NLP. The title of this paper refers to the famous novel 'A Passage to India' by E.M. Forster, published initially in 1924.
Submitted 27 December, 2021;
originally announced December 2021.
-
Challenge Dataset of Cognates and False Friend Pairs from Indian Languages
Authors:
Diptesh Kanojia,
Pushpak Bhattacharyya,
Malhar Kulkarni,
Gholamreza Haffari
Abstract:
Cognates are present in multiple variants of the same text across different languages (e.g., "hund" in German and "hound" in English language mean "dog"). They pose a challenge to various Natural Language Processing (NLP) applications such as Machine Translation, Cross-lingual Sense Disambiguation, Computational Phylogenetics, and Information Retrieval. A possible solution to address this challenge is to identify cognates across language pairs. In this paper, we describe the creation of two cognate datasets for twelve Indian languages, namely Sanskrit, Hindi, Assamese, Oriya, Kannada, Gujarati, Tamil, Telugu, Punjabi, Bengali, Marathi, and Malayalam. We digitize the cognate data from an Indian language cognate dictionary and utilize linked Indian language Wordnets to generate cognate sets. Additionally, we use the Wordnet data to create a False Friends dataset for eleven language pairs. We also evaluate the efficacy of our dataset using previously available baseline cognate detection approaches. We also perform a manual evaluation with the help of lexicographers and release the curated gold-standard dataset with this paper.
Submitted 17 December, 2021;
originally announced December 2021.
-
Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages
Authors:
Diptesh Kanojia,
Raj Dabre,
Shubham Dewangan,
Pushpak Bhattacharyya,
Gholamreza Haffari,
Malhar Kulkarni
Abstract:
Cognates are variants of the same lexical form across different languages; for example 'fonema' in Spanish and 'phoneme' in English are cognates, both of which mean 'a unit of sound'. The task of automatic detection of cognates between any two languages can help downstream NLP tasks such as Cross-lingual Information Retrieval, Computational Phylogenetics, and Machine Translation. In this paper, we demonstrate the use of cross-lingual word embeddings for detecting cognates among fourteen Indian Languages. Our approach introduces the use of context from a knowledge graph to generate improved feature representations for cognate detection. We then evaluate the impact of our cognate detection mechanism on neural machine translation (NMT), as a downstream task. We evaluate our methods to detect cognates on a challenging dataset of twelve Indian languages, namely, Sanskrit, Hindi, Assamese, Oriya, Kannada, Gujarati, Tamil, Telugu, Punjabi, Bengali, Marathi, and Malayalam. Additionally, we create evaluation datasets for two more Indian languages, Konkani and Nepali. We observe an improvement of up to 18 percentage points, in terms of F-score, for cognate detection. Furthermore, we observe that cognates extracted using our method help improve NMT quality by up to 2.76 BLEU. We also release our code, newly constructed datasets and cross-lingual models publicly.
Submitted 16 December, 2021;
originally announced December 2021.
-
Cognition-aware Cognate Detection
Authors:
Diptesh Kanojia,
Prashant Sharma,
Sayali Ghodekar,
Pushpak Bhattacharyya,
Gholamreza Haffari,
Malhar Kulkarni
Abstract:
Automatic detection of cognates helps downstream NLP tasks of Machine Translation, Cross-lingual Information Retrieval, Computational Phylogenetics and Cross-lingual Named Entity Recognition. Previous approaches for the task of cognate detection use orthographic, phonetic and semantic similarity-based feature sets. In this paper, we propose a novel method for enriching the feature sets with cognitive features extracted from human readers' gaze behaviour. We collect gaze behaviour data for a small sample of cognates and show that extracted cognitive features help the task of cognate detection. However, gaze data collection and annotation is a costly task. We use the collected gaze behaviour data to predict cognitive features for a larger sample and show that predicted cognitive features also significantly improve the task performance. We report improvements of 10% with the collected gaze features, and 12% using the predicted gaze features, over the previously proposed approaches. Furthermore, we release the collected gaze behaviour data along with our code and cross-lingual models.
Submitted 15 December, 2021;
originally announced December 2021.
-
3D Scene Understanding at Urban Intersection using Stereo Vision and Digital Map
Authors:
Prarthana Bhattacharyya,
Yanlei Gu,
Jiali Bao,
Xu Liu,
Shunsuke Kamijo
Abstract:
The driving behavior at urban intersections is very complex. It is thus crucial for autonomous vehicles to comprehensively understand challenging urban traffic scenes in order to navigate intersections and prevent accidents. In this paper, we introduce a stereo vision and 3D digital map based approach to spatially and temporally analyze the traffic situation at urban intersections. Stereo vision is used to detect, classify and track obstacles, while a 3D digital map is used to improve ego-localization and provide context in terms of road-layout information. A probabilistic approach that temporally integrates these geometric, semantic, dynamic and contextual cues is presented. We qualitatively and quantitatively evaluate our proposed technique on real traffic data collected at an urban canyon in Tokyo to demonstrate the efficacy of the system in providing comprehensive awareness of the traffic surroundings.
Submitted 9 December, 2021;
originally announced December 2021.
-
Natural Answer Generation: From Factoid Answer to Full-length Answer using Grammar Correction
Authors:
Manas Jain,
Sriparna Saha,
Pushpak Bhattacharyya,
Gladvin Chinnadurai,
Manish Kumar Vatsa
Abstract:
Question Answering systems these days typically use template-based language generation. Though adequate for a domain-specific task, these systems are too restrictive and predefined for domain-independent systems. This paper proposes a system that outputs a full-length answer given a question and the extracted factoid answer (short spans such as named entities) as the input. Our system uses constituency and dependency parse trees of questions. A transformer-based Grammar Error Correction model, GECToR (2020), is used as a post-processing step for better fluency. We compare our system with (i) Modified Pointer Generator (SOTA) and (ii) Fine-tuned DialoGPT for factoid questions. We also test our approach on existential (yes-no) questions with better results. Our model generates more accurate and fluent answers than the state-of-the-art (SOTA) approaches. The evaluation is done on NewsQA and SQuAD datasets with an increment of 0.4 and 0.9 percentage points in ROUGE-1 score respectively. Also, the inference time is reduced by 85% as compared to the SOTA. The improved datasets used for our evaluation will be released as part of the research contribution.
Submitted 7 December, 2021;
originally announced December 2021.
-
Crystal Growth, Exfoliation and Magnetic Properties of Quaternary Quasi-Two-Dimensional CuCrP$_2$S$_6$
Authors:
Sebastian Selter,
Kranthi K. Bestha,
Pritam Bhattacharyya,
Burak Özer,
Yuliia Shemerliuk,
Laura T. Corredor,
Louis Veyrat,
Anja U. B. Wolter,
Liviu Hozoi,
Bernd Büchner,
Saicharan Aswartham
Abstract:
We report optimized crystal growth conditions for the quaternary compound CuCrP$_2$S$_6$ by chemical vapor transport. Compositional and structural characterization of the obtained crystals were carried out by means of energy-dispersive X-ray spectroscopy and powder X-ray diffraction. CuCrP$_2$S$_6$ is structurally closely related to the $M_2$P$_2$S$_6$ family ($M$: transition metal), which contains several compounds that are under investigation as 2D magnets. As-grown crystals exhibit a plate-like, layered morphology as well as a hexagonal habitus. We present successful exfoliation of such as-grown crystals down to thicknesses of 2.8 nm corresponding to 4 layers. CuCrP$_2$S$_6$ crystallizes in the monoclinic space group $C2/c$. Magnetization measurements reveal an antiferromagnetic ground state with $T_\textrm{N} \approx 30$ K and a positive Curie-Weiss temperature in agreement with dominant ferromagnetic intralayer coupling. Specific heat measurements confirm this magnetic phase transition and the magnetic order is suppressed in an external magnetic field of about 6 T (8 T) applied parallel (perpendicular) to the $ab$ plane. At higher temperatures between 140-200 K additional broad anomalies associated with structural changes accompanying antiferroelectric ordering are detected in our specific heat studies.
Submitted 28 April, 2022; v1 submitted 5 December, 2021;
originally announced December 2021.
-
Tapping BERT for Preposition Sense Disambiguation
Authors:
Siddhesh Pawar,
Shyam Thombre,
Anirudh Mittal,
Girishkumar Ponkiya,
Pushpak Bhattacharyya
Abstract:
Prepositions are frequently occurring polysemous words. Disambiguation of prepositions is crucial in tasks like semantic role labelling, question answering, text entailment, and noun compound paraphrasing. In this paper, we propose a novel methodology for preposition sense disambiguation (PSD), which does not use any linguistic tools. In a supervised setting, the machine learning model is presented with sentences wherein prepositions have been annotated with senses. These senses are IDs in what is called The Preposition Project (TPP). We use the hidden layer representations from pre-trained BERT and BERT variants. The latent representations are then classified into the correct sense ID using a Multi Layer Perceptron. The dataset used for this task is from SemEval-2007 Task-6. Our methodology gives an accuracy of 86.85% which is better than the state-of-the-art.
Submitted 27 November, 2021;
originally announced November 2021.
-
"So You Think You're Funny?": Rating the Humour Quotient in Standup Comedy
Authors:
Anirudh Mittal,
Pranav Jeevan,
Prerak Gandhi,
Diptesh Kanojia,
Pushpak Bhattacharyya
Abstract:
Computational Humour (CH) has attracted the interest of Natural Language Processing and Computational Linguistics communities. Creating datasets for automatic measurement of humour quotient is difficult due to multiple possible interpretations of the content. In this work, we create a multi-modal humour-annotated dataset ($\sim$40 hours) using stand-up comedy clips. We devise a novel scoring mechanism to annotate the training data with a humour quotient score using the audience's laughter. The normalized duration (laughter duration divided by the clip duration) of laughter in each clip is used to compute this humour coefficient score on a five-point scale (0-4). This method of scoring is validated by comparing with manually annotated scores, wherein a quadratic weighted kappa of 0.6 is obtained. We use this dataset to train a model that provides a "funniness" score, on a five-point scale, given the audio and its corresponding text. We compare various neural language models for the task of humour-rating and achieve an accuracy of $0.813$ in terms of Quadratic Weighted Kappa (QWK). Our "Open Mic" dataset is released for further research along with the code.
Submitted 25 October, 2021;
originally announced October 2021.
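The "Open Mic" abstract above reports agreement and model quality as Quadratic Weighted Kappa (QWK) on a five-point scale (0-4). As a reading aid, here is a minimal sketch of the standard QWK formula in plain Python. It is not the authors' released code; the function name `quadratic_weighted_kappa` and the example ratings are illustrative assumptions.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_ratings=5):
    """Standard QWK between two lists of integer ratings in [0, num_ratings)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    if num_ratings < 2:
        raise ValueError("need at least two rating categories")

    n = len(rater_a)
    k = num_ratings

    # Observed confusion matrix: observed[i][j] counts items rated i by A and j by B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal histograms of each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic disagreement weight, 0 on the diagonal, 1 at the extremes.
            weight = ((i - j) ** 2) / ((k - 1) ** 2)
            # Expected count under independence of the two raters.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters used one identical category throughout; treated here as
        # perfect agreement (a convention chosen for this sketch).
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on the 0-4 scale, for illustration only.
manual_scores = [0, 1, 2, 3, 4]
laughter_scores = [0, 1, 2, 3, 4]
print(quadratic_weighted_kappa(manual_scores, laughter_scores))
```

Identical rating lists give a kappa of 1, and larger disagreements are penalised quadratically, so a 0-versus-4 mismatch costs far more than a 2-versus-3 mismatch.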
-
COVIDRead: A Large-scale Question Answering Dataset on COVID-19
Authors:
Tanik Saikh,
Sovan Kumar Sahoo,
Asif Ekbal,
Pushpak Bhattacharyya
Abstract:
During this pandemic situation, extracting any relevant information related to COVID-19 will be immensely beneficial to the community at large. In this paper, we present a very important resource, COVIDRead, a Stanford Question Answering Dataset (SQuAD)-like dataset of more than 100k question-answer pairs. The dataset consists of Context-Answer-Question triples. Primarily the questions from the context are constructed in an automated way. After that, the system-generated questions are manually checked by human annotators. This is a precious resource that could serve many purposes, ranging from common people's queries regarding this very uncommon disease to managing articles by editors/associate editors of a journal. We establish several end-to-end neural network based baseline models that attain the lowest F1 of 32.03% and the highest F1 of 37.19%. To the best of our knowledge, we are the first to provide this kind of QA dataset in such a large volume on COVID-19. This dataset creates a new avenue of carrying out research on COVID-19 by providing a benchmark dataset and a baseline model.
Submitted 5 October, 2021;
originally announced October 2021.
-
Why does B$_{12}$H$_{12}$-icosahedron need two electrons to be stable: A first-principles electron-correlated investigation of B$_{12}$H$_{n}$ ($n=$6,12) clusters
Authors:
Pritam Bhattacharyya,
Ihsan Boustani,
Alok Shukla
Abstract:
In this work, we present large-scale electron-correlated computations on various conformers of B$_{12}$H$_{12}$ and B$_{12}$H$_{6}$ clusters, to understand the reasons behind the high stability of dianion icosahedron ($I_{h}$) and cage-like B$_{12}$H$_{6}$ geometries. Although the B$_{12}$-icosahedron is the basic building block in some structures of bulk boron, it is unstable in its free form. Furthermore, its H-passivated entity, i.e., B$_{12}$H$_{12}$ icosahedron is also unstable in free form. However, dianion B$_{12}$H$_{12}$ has been predicted to be stable as a perfect icosahedron in the free-standing form. In order to capture the correct picture for the stability of B$_{12}$H$_{12}^{2-}$ and B$_{12}$H$_{6}$ clusters, we optimized these structures by employing the coupled-cluster singles-doubles (CCSD) approach and cc-pVDZ basis set. We also performed vibrational frequency analysis of the isomers of these clusters, using the same level of theory to ensure the stability of the structures. For all the stable geometries obtained from the vibrational frequency analysis, we additionally computed their optical absorption spectra using the time-dependent density functional theory (TDDFT) approach, at the B3LYP/6-31G* level of theory. Our calculated absorption spectra could be probed in future experiments on these clusters.
Submitted 31 December, 2021; v1 submitted 27 September, 2021;
originally announced September 2021.
-
Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages
Authors:
Tejas Indulal Dhamecha,
Rudra Murthy V,
Samarth Bharadwaj,
Karthik Sankaranarayanan,
Pushpak Bhattacharyya
Abstract:
We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning. We hypothesize and validate that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications, compared to models fine-tuned on individual languages. A first-of-its-kind detailed study is presented to track performance change as languages are added to a base language in a graded and greedy (in the sense of best boost of performance) manner, which reveals that careful selection of a subset of related languages can improve performance significantly more than utilizing all related languages. The Indo-Aryan (IA) language family is chosen for the study, the exact languages being Bengali, Gujarati, Hindi, Marathi, Oriya, Punjabi and Urdu. The script barrier is crossed by simple rule-based transliteration of the text of all languages to Devanagari. Experiments are performed on mBERT, IndicBERT, MuRIL and two RoBERTa-based LMs, the last two being pre-trained by us. Low resource languages, such as Oriya and Punjabi, are found to be the largest beneficiaries of multilingual fine-tuning. Textual Entailment, Entity Classification, Section Title Prediction, tasks of IndicGLUE and POS tagging form our test bed. Compared to monolingual fine-tuning, we get a relative performance improvement of up to 150% in the downstream tasks. The surprise take-away is that for any language there is a particular combination of other languages which yields the best performance, and any additional language is in fact detrimental.
Submitted 22 September, 2021;
originally announced September 2021.
-
M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations
Authors:
Dushyant Singh Chauhan,
Gopendra Vikram Singh,
Navonil Majumder,
Amir Zadeh,
Asif Ekbal,
Pushpak Bhattacharyya,
Louis-Philippe Morency,
Soujanya Poria
Abstract:
Humor recognition in conversations is a challenging task that has recently gained popularity due to its importance in dialogue understanding, including in multimodal settings (i.e., text, acoustics, and visual). The few existing datasets for humor are mostly in English. However, due to the tremendous growth in multilingual content, there is a great demand to build models and systems that support multilingual information access. To this end, we propose a dataset for Multimodal Multiparty Hindi Humor (M2H2) recognition in conversations containing 6,191 utterances from 13 episodes of a very popular TV series "Shrimaan Shrimati Phir Se". Each utterance is annotated with humor/non-humor labels and encompasses acoustic, visual, and textual modalities. We propose several strong multimodal baselines and show the importance of contextual and multimodal information for humor recognition in conversations. The empirical results on M2H2 dataset demonstrate that multimodal information complements unimodal information for humor recognition. The dataset and the baselines are available at http://www.iitp.ac.in/~ai-nlp-ml/resources.html and https://github.com/declare-lab/M2H2-dataset.
Submitted 2 August, 2021;
originally announced August 2021.
-
Crosslingual Embeddings are Essential in UNMT for Distant Languages: An English to IndoAryan Case Study
Authors:
Tamali Banerjee,
Rudra Murthy V,
Pushpak Bhattacharyya
Abstract:
Recent advances in Unsupervised Neural Machine Translation (UNMT) have minimized the gap between supervised and unsupervised machine translation performance for closely related language pairs. However, the situation is very different for distant language pairs. Lack of lexical overlap and low syntactic similarities such as between English and Indo-Aryan languages leads to poor translation quality in existing UNMT systems. In this paper, we show that initializing the embedding layer of UNMT models with cross-lingual embeddings shows significant improvements in BLEU score over existing approaches with embeddings randomly initialized. Further, static embeddings (freezing the embedding layer weights) lead to better gains compared to updating the embedding layer weights during training (non-static). We experimented using Masked Sequence to Sequence (MASS) and Denoising Autoencoder (DAE) UNMT approaches for three distant language pairs. The proposed cross-lingual embedding initialization yields BLEU score improvement of as much as ten times over the baseline for English-Hindi, English-Bengali, and English-Gujarati. Our analysis shows the importance of cross-lingual embedding, comparisons between approaches, and the scope of improvements in these systems.
Submitted 9 June, 2021;
originally announced June 2021.