Search | arXiv e-print repository

Jetsons at FinNLP 2024: Towards Understanding the ESG Impact of a News Article using Transformer-based Models

Authors: Parag Pravin Dakle, Alolika Gon, Sihan Zha, Liang Wang, SaiKrishna Rallabandi, Preethi Raghavan

Abstract: In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. The shared task focuses on predicting the duration and type of the ESG impact of a news article. The shared task dataset consists of 2,059 news titles and articles in English, French, Korean, and Japanese languages. For the impact duration classification task, we fine-tuned XLM-RoBERTa with a custom fine-tuning strategy and using self-training and DeBERTa-v3 using only English translations. These models individually ranked first on the leaderboard for Korean and Japanese and in an ensemble for the English language, respectively. For the impact type classification task, our XLM-RoBERTa model fine-tuned using a custom fine-tuning strategy ranked first for the English language.

Submitted 30 March, 2024; originally announced April 2024.

arXiv:2402.17882 [pdf, other]

BlendSQL: A Scalable Dialect for Unifying Hybrid Question Answering in Relational Algebra

Authors: Parker Glenn, Parag Pravin Dakle, Liang Wang, Preethi Raghavan

Abstract: Many existing end-to-end systems for hybrid question answering tasks can often be boiled down to a "prompt-and-pray" paradigm, where the user has limited control and insight into the intermediate reasoning steps used to achieve the final result. Additionally, due to the context size limitation of many transformer-based LLMs, it is often not reasonable to expect that the full structured and unstructured context will fit into a given prompt in a zero-shot setting, let alone a few-shot setting. We introduce BlendSQL, a superset of SQLite to act as a unified dialect for orchestrating reasoning across both unstructured and structured data. For hybrid question answering tasks involving multi-hop reasoning, we encode the full decomposed reasoning roadmap into a single interpretable BlendSQL query. Notably, we show that BlendSQL can scale to massive datasets and improve the performance of end-to-end systems while using 35% fewer tokens. Our code is available and installable as a package at https://github.com/parkervg/blendsql.

Submitted 10 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

Comments: For associated codebase, see https://github.com/parkervg/blendsql

arXiv:2402.16882 [pdf, other]

Substrate Scope Contrastive Learning: Repurposing Human Bias to Learn Atomic Representations

Authors: Wenhao Gao, Priyanka Raghavan, Ron Shprints, Connor W. Coley

Abstract: Learning molecular representation is a critical step in molecular machine learning that significantly influences modeling success, particularly in data-scarce situations. The concept of broadly pre-training neural networks has advanced fields such as computer vision, natural language processing, and protein engineering. However, similar approaches for small organic molecules have not achieved comparable success. In this work, we introduce a novel pre-training strategy, substrate scope contrastive learning, which learns atomic representations tailored to chemical reactivity. This method considers the grouping of substrates and their yields in published substrate scope tables as a measure of their similarity or dissimilarity in terms of chemical reactivity. We focus on 20,798 aryl halides in the CAS Content Collection spanning thousands of publications to learn a representation of aryl halide reactivity. We validate our pre-training approach through both intuitive visualizations and comparisons to traditional reactivity descriptors and physical organic chemistry principles. The versatility of these embeddings is further evidenced in their application to yield prediction, regioselectivity prediction, and the diverse selection of new substrates. This work not only presents a chemistry-tailored neural network pre-training strategy to learn reactivity-aligned atomic representations, but also marks a first-of-its-kind approach to benefit from the human bias in substrate scope design.

Submitted 18 February, 2024; originally announced February 2024.

arXiv:2312.01143 [pdf, other]

Towards leveraging LLMs for Conditional QA

Authors: Syed-Amad Hussain, Parag Pravin Dakle, SaiKrishna Rallabandi, Preethi Raghavan

Abstract: This study delves into the capabilities and limitations of Large Language Models (LLMs) in the challenging domain of conditional question-answering. Utilizing the Conditional Question Answering (CQA) dataset and focusing on generative models like T5 and UL2, we assess the performance of LLMs across diverse question types. Our findings reveal that fine-tuned LLMs can surpass the state-of-the-art (SOTA) performance in some cases, even without fully encoding all input context, with an increase of 7-8 points in Exact Match (EM) and F1 scores for Yes/No questions. However, these models encounter challenges in extractive question answering, where they lag behind the SOTA by over 10 points, and in mitigating the risk of injecting false information. A study with oracle-retrievers emphasizes the critical role of effective evidence retrieval, underscoring the necessity for advanced solutions in this area. Furthermore, we highlight the significant influence of evaluation metrics on performance assessments and advocate for a more comprehensive evaluation framework. The complexity of the task, the observed performance discrepancies, and the need for effective evidence retrieval underline the ongoing challenges in this field and underscore the need for future work focusing on refining training tasks and exploring prompt-based techniques to enhance LLM performance in conditional question-answering tasks.

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2309.08777 [pdf, other]

Self-training Strategies for Sentiment Analysis: An Empirical Study

Authors: Haochen Liu, Sai Krishna Rallabandi, Yijing Wu, Parag Pravin Dakle, Preethi Raghavan

Abstract: Sentiment analysis is a crucial task in natural language processing that involves identifying and extracting subjective sentiment from text. Self-training has recently emerged as an economical and efficient technique for developing sentiment analysis models by leveraging a small amount of labeled data and a large amount of unlabeled data. However, given a set of training data, how to utilize them to conduct self-training makes a significant difference in the final performance of the model. We refer to this methodology as the self-training strategy. In this paper, we present an empirical study of various self-training strategies for sentiment analysis. First, we investigate the influence of the self-training strategy and hyper-parameters on the performance of traditional small language models (SLMs) in various few-shot settings. Second, we also explore the feasibility of leveraging large language models (LLMs) to help self-training. We propose and empirically compare several self-training strategies with the intervention of LLMs. Extensive experiments are conducted on three real-world sentiment analysis datasets.

Submitted 3 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: Accepted by EACL Findings 2024

arXiv:2305.19974 [pdf, other]

Correcting Semantic Parses with Natural Language through Dynamic Schema Encoding

Authors: Parker Glenn, Parag Pravin Dakle, Preethi Raghavan

Abstract: In addressing the task of converting natural language to SQL queries, there are several semantic and syntactic challenges. It becomes increasingly important to understand and remedy the points of failure as the performance of semantic parsing systems improves. We explore semantic parse correction with natural language feedback, proposing a new solution built on the success of autoregressive decoders in text-to-SQL tasks. By separating the semantic and syntactic difficulties of the task, we show that the accuracy of text-to-SQL parsers can be boosted by up to 26% with only one turn of correction with natural language. Additionally, we show that a T5-base model is capable of correcting the errors of a T5-large model in a zero-shot, cross-parser setting.

Submitted 31 May, 2023; originally announced May 2023.

Comments: ACL 2023 Workshop on NLP for Conversational AI

arXiv:2304.13689 [pdf, other]

HeySQuAD: A Spoken Question Answering Dataset

Authors: Yijing Wu, SaiKrishna Rallabandi, Ravisutha Srinivasamurthy, Parag Pravin Dakle, Alolika Gon, Preethi Raghavan

Abstract: Spoken question answering (SQA) systems are critical for digital assistants and other real-world use cases, but evaluating their performance is a challenge due to the importance of human-spoken questions. This study presents a new large-scale community-shared SQA dataset called HeySQuAD, which includes 76k human-spoken questions, 97k machine-generated questions, and their corresponding textual answers from the SQuAD QA dataset. Our goal is to measure the ability of machines to accurately understand noisy spoken questions and provide reliable answers. Through extensive testing, we demonstrate that training with transcribed human-spoken and original SQuAD questions leads to a significant improvement (12.51%) in answering human-spoken questions compared to training with only the original SQuAD textual questions. Moreover, evaluating with a higher-quality transcription can lead to a further improvement of 2.03%. This research has significant implications for the development of SQA systems and their ability to meet the needs of users in real-world scenarios.

Submitted 27 February, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

arXiv:2211.14865 [pdf, other]

Understanding BLOOM: An empirical study on diverse NLP tasks

Authors: Parag Pravin Dakle, SaiKrishna Rallabandi, Preethi Raghavan

Abstract: We view the landscape of large language models (LLMs) through the lens of the recently released BLOOM model to understand the performance of BLOOM and other decoder-only LLMs compared to BERT-style encoder-only models. We achieve this by evaluating the smaller BLOOM model variants (350m/560m and 1b3/1b7) on several NLP benchmark datasets and popular leaderboards. We make the following observations: (1) BLOOM performance does not scale with parameter size, unlike other LLMs like GPT and BERT. Experiments fine-tuning BLOOM models show that the 560m variant performs similarly to or better than the 1b7 variant, (2) Zero-shot cross-lingual and multi-lingual fine-tuning experiments show that BLOOM is at par or worse than monolingual GPT-2 models, and (3) Toxicity analysis of prompt-based text generation using the RealToxicityPrompts dataset shows that the text generated by BLOOM is at least 17% less toxic than GPT-2 and GPT-3 models.

Submitted 14 March, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

arXiv:2206.02696 [pdf, other]

Learning to Ask Like a Physician

Authors: Eric Lehman, Vladislav Lialin, Katelyn Y. Legaspi, Anne Janelle R. Sy, Patricia Therese S. Pile, Nicole Rose I. Alberto, Richard Raymund R. Ragasa, Corinna Victoria M. Puyat, Isabelle Rose I. Alberto, Pia Gabrielle I. Alfonso, Marianne Taliño, Dana Moukheiber, Byron C. Wallace, Anna Rumshisky, Jenifer J. Liang, Preethi Raghavan, Leo Anthony Celi, Peter Szolovits

Abstract: Existing question answering (QA) datasets derived from electronic health records (EHR) are artificially generated and consequently fail to capture realistic physician information needs. We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions paired with the snippets of text (triggers) that prompted each question. The questions are generated by medical experts from 100+ MIMIC-III discharge summaries. We analyze this dataset to characterize the types of information sought by medical experts. We also train baseline models for trigger detection and question generation (QG), paired with unsupervised answer retrieval over EHRs. Our baseline model is able to generate high quality questions in over 62% of cases when prompted with human selected triggers. We release this dataset (and all code to reproduce baseline model results) to facilitate further research into realistic clinical QA and QG: https://github.com/elehman16/discq.

Submitted 6 June, 2022; originally announced June 2022.

arXiv:2106.07059 [pdf, other]

Multi-Resource List Scheduling of Moldable Parallel Jobs under Precedence Constraints

Authors: Lucas Perotin, Hongyang Sun, Padma Raghavan

Abstract: The scheduling literature has traditionally focused on a single type of resource (e.g., computing nodes). However, scientific applications in modern High-Performance Computing (HPC) systems process large amounts of data, hence have diverse requirements on different types of resources (e.g., cores, cache, memory, I/O). All of these resources could potentially be exploited by the runtime scheduler to improve the application performance. In this paper, we study multi-resource scheduling to minimize the makespan of computational workflows comprised of parallel jobs subject to precedence constraints. The jobs are assumed to be moldable, allowing the scheduler to flexibly select a variable set of resources before execution. We propose a multi-resource, list-based scheduling algorithm, and prove that, on a system with $d$ types of schedulable resources, our algorithm achieves an approximation ratio of $1.619d+2.545\sqrt{d}+1$ for any $d$, and a ratio of $d+O(\sqrt[3]{d^2})$ for large $d$. We also present improved results for independent jobs and for jobs with special precedence constraints (e.g., series-parallel graphs and trees). Finally, we prove a lower bound of $d$ on the approximation ratio of any list scheduling scheme with local priority considerations. To the best of our knowledge, these are the first approximation results for moldable workflows with multiple resource requirements.

Submitted 13 June, 2021; originally announced June 2021.

arXiv:2007.00271 [pdf, other]

TransINT: Embedding Implication Rules in Knowledge Graphs with Isomorphic Intersections of Linear Subspaces

Authors: So Yeon Min, Preethi Raghavan, Peter Szolovits

Abstract: Knowledge Graphs (KG), composed of entities and relations, provide a structured representation of knowledge. For easy access to statistical approaches on relational data, multiple methods to embed a KG into $f(KG) \in \mathbb{R}^d$ have been introduced. We propose TransINT, a novel and interpretable KG embedding method that isomorphically preserves the implication ordering among relations in the embedding space. Given implication rules, TransINT maps sets of entities (tied by a relation) to continuous sets of vectors that are inclusion-ordered isomorphically to relation implications. With a novel parameter sharing scheme, TransINT enables automatic training on missing but implied facts without rule grounding. On a benchmark dataset, we outperform the best existing state-of-the-art rule integration embedding methods with significant margins in link prediction and triple classification. The angles between the continuous sets embedded by TransINT provide an interpretable way to mine semantic relatedness and implication rules among relations.

Submitted 1 July, 2020; originally announced July 2020.

Comments: Conference Paper published in the proceedings of AKBC (Automated Knowledge Base Construction) 2020 (https://openreview.net/forum?id=shkmWLRBXH)

arXiv:2005.06587 [pdf, other]

Entity-Enriched Neural Models for Clinical Question Answering

Authors: Bhanu Pratap Singh Rawat, Wei-Hung Weng, So Yeon Min, Preethi Raghavan, Peter Szolovits

Abstract: We explore state-of-the-art neural models for question answering on electronic medical records and improve their ability to generalize better on previously unseen (paraphrased) questions at test time. We enable this by learning to predict logical forms as an auxiliary task along with the main task of answer span detection. The predicted logical forms also serve as a rationale for the answer. Further, we also incorporate medical entity information in these models via the ERNIE architecture. We train our models on the large-scale emrQA dataset and observe that our multi-task entity-enriched models generalize to paraphrased questions ~5% better than the baseline BERT model.

Submitted 19 February, 2021; v1 submitted 13 May, 2020; originally announced May 2020.

Journal ref: BioNLP Workshop, ACL'2020

arXiv:1911.03322 [pdf, other]

doi 10.1002/qute.202000050

High temperature annealing enhanced diamond 13C hyperpolarization at room temperature

Authors: M. Gierth, V. Krespach, A. I. Shames, P. Raghavan, E. Druga, N. Nunn, M. Torelli, R. Nirodi, S. Le, R. Zhao, A. Aguilar, X. Lv, M. Shen, C. A. Meriles, J. A. Reimer, A. Zaitsev, A. Pines, O. Shenderova, A. Ajoy

Abstract: Methods of optical dynamic nuclear polarization (DNP) open the door to the replenishable hyperpolarization of nuclear spins, boosting their NMR/MRI signature by orders of magnitude. Nanodiamond powder rich in negatively charged Nitrogen Vacancy (NV) defect centers has recently emerged as one such promising platform, wherein 13C nuclei can be hyperpolarized through the optically pumped defects completely at room temperature and at low magnetic fields. Given the compelling possibility of relaying this 13C polarization to nuclei in external liquids, there is an urgent need for the engineered production of highly "hyperpolarizable" diamond particles. In this paper, we report on a systematic study of various material dimensions affecting optical 13C hyperpolarization in diamond particles, especially electron irradiation and annealing conditions that drive NV center formation. We discover, surprisingly, that diamond annealing at elevated temperatures close to 1720C has remarkable effects on the hyperpolarization levels, enhancing them by up to 36-fold over materials annealed through conventional means. We unravel the intriguing material origins of these gains, and demonstrate they arise from a simultaneous improvement in NV electron relaxation time and coherence time, as well as the reduction of paramagnetic content, and an increase in 13C relaxation lifetimes. Overall this points to significant recovery of the diamond lattice from radiation damage as a result of the high-temperature annealing. Our work suggests methods for the guided materials production of fluorescent, 13C hyperpolarized, nanodiamonds and pathways for their use as multi-modal (optical and MRI) imaging and hyperpolarization agents.

Submitted 8 November, 2019; originally announced November 2019.

Comments: 10+4 pages

Journal ref: Advanced Quantum Technologies 2000050 (2020)

arXiv:1907.09146 [pdf, other]

Motion Browser: Visualizing and Understanding Complex Upper Limb Movement Under Obstetrical Brachial Plexus Injuries

Authors: Gromit Yeuk-Yin Chan, Luis Gustavo Nonato, Alice Chu, Preeti Raghavan, Viswanath Aluru, Claudio T. Silva

Abstract: The brachial plexus is a complex network of peripheral nerves that enables sensing from and control of the movements of the arms and hand. Nowadays, the coordination between the muscles to generate simple movements is still not well understood, hindering the knowledge of how to best treat patients with this type of peripheral nerve injury. To acquire enough information for medical data analysis, physicians conduct motion analysis assessments with patients to produce a rich dataset of electromyographic signals from multiple muscles recorded with joint movements during real-world tasks. However, tools for the analysis and visualization of the data in a succinct and interpretable manner are currently not available. Without the ability to integrate, compare, and compute multiple data sources in one platform, physicians can only compute simple statistical values to describe patient's behavior vaguely, which limits the possibility to answer clinical questions and generate hypotheses for research. To address this challenge, we have developed Motion Browser, an interactive visual analytics system which provides an efficient framework to extract and compare muscle activity patterns from the patient's limbs and coordinated views to help users analyze muscle signals, motion data, and video information to address different tasks. The system was developed as a result of a collaborative endeavor between computer scientists and orthopedic surgery and rehabilitation physicians. We present case studies showing physicians can utilize the information displayed to understand how individuals coordinate their muscles to initiate appropriate treatment and generate new hypotheses for future research.

Submitted 22 July, 2019; originally announced July 2019.

Comments: IEEE Transactions on Visualization and Computer Graphics (VAST 2019, to appear)

arXiv:1902.06204 [pdf, other]

doi 10.1038/s41467-019-13042-3

Hyperpolarized relaxometry based nuclear T1 noise spectroscopy in hybrid diamond quantum registers

Authors: Ashok Ajoy, Ben Safvati, Raffi Nazaryan, J. T. Oon, Ben Han, Priyanka Raghavan, Ruhee Nirodi, Alessandra Aguilar, Kristina Liu, Xiao Cai, Xudong Lv, Emanuel Druga, Chandrasekhar Ramanathan, Jeffrey A. Reimer, Carlos A. Meriles, Dieter Suter, Alexander Pines

Abstract: Understanding the origins of spin lifetimes in hybrid quantum systems is a matter of current importance in several areas of quantum information and sensing. Methods that spectrally map spin relaxation processes provide insight into their origin and can motivate methods to mitigate them. In this paper, using a combination of hyperpolarization and precision field cycling over a wide range (1mT-7T), we map frequency dependent relaxation in a prototypical hybrid system of 13C nuclear spins in diamond coupled to Nitrogen Vacancy centers. Nuclear hyperpolarization through the optically pumped NV electrons allows signal time savings for the measurements exceeding million-fold over conventional methods. We observe that 13C lifetimes show a dramatic field dependence, growing rapidly with field up to 100mT and saturating thereafter. Through a systematic study with increasing substitutional electron (P1 center) concentration as well as 13C enrichment levels, we identify the operational relaxation channels for the nuclei in different field regimes. In particular, we demonstrate the dominant role played by the 13C nuclei coupling to the interacting P1 electronic spin bath. These results pave the way for quantum control techniques for dissipation engineering to boost spin lifetimes in diamond, with applications ranging from engineered quantum memories to hyperpolarized 13C imaging.

Submitted 16 February, 2019; originally announced February 2019.

Comments: Contains supplementary info

arXiv:1809.00732 [pdf, other]

emrQA: A Large Corpus for Question Answering on Electronic Medical Records

Authors: Anusri Pampari, Preethi Raghavan, Jennifer Liang, Jian Peng

Abstract: We propose a novel methodology to generate domain-specific large-scale question answering (QA) datasets by re-purposing existing annotations for other NLP tasks. We demonstrate an instance of this methodology in generating a large-scale QA dataset for electronic medical records by leveraging existing expert annotations on clinical notes for various NLP tasks from the community shared i2b2 datasets. The resulting corpus (emrQA) has 1 million question-logical form and 400,000+ question-answer evidence pairs. We characterize the dataset and explore its learning potential by training baseline models for question to logical form and question to answer mapping.

Submitted 3 September, 2018; originally announced September 2018.

Comments: Accepted at Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018

arXiv:1806.09812 [pdf, other]

doi 10.1126/sciadv.aar5492

Orientation independent room-temperature optical 13C hyperpolarization in powdered diamond

Authors: A. Ajoy, K. Liu, R. Nazaryan, X. Lv, P. R. Zangara, B. Safvati, G. Wang, D. Arnold, G. Li, A. Lin, P. Raghavan, E. Druga, S. Dhomkar, D. Pagliero, J. A. Reimer, D. Suter, C. A. Meriles, A. Pines

Abstract: Dynamic nuclear polarization via contact with electronic spins has emerged as an attractive route to enhance the sensitivity of nuclear magnetic resonance (NMR) beyond the traditional limits imposed by magnetic field strength and temperature. Among the various alternative implementations, the use of nitrogen vacancy (NV) centers in diamond - a paramagnetic point defect whose spin can be optically polarized at room temperature - has attracted widespread attention, but applications have been hampered by the need to align the NV axis with the external magnetic field. Here we overcome this hurdle through the combined use of continuous optical illumination and a microwave sweep over a broad frequency range. As a proof of principle, we demonstrate our approach using powdered diamond where we attain bulk 13C spin polarization in excess of 0.25 percent under ambient conditions. Remarkably, our technique acts efficiently on diamond crystals of all orientations, and polarizes nuclear spins with a sign that depends exclusively on the direction of the microwave sweep. Our work paves the way towards the use of hyperpolarized diamond particles as imaging contrast agents for biosensing and, ultimately, for the hyperpolarization of nuclear spins in arbitrary liquids brought in contact with their surface.

Submitted 26 June, 2018; originally announced June 2018.

Comments: Contains supplementary info

Journal ref: Science Advances 18 May 2018: Vol. 4, no. 5, eaar5492

arXiv:1805.06816 [pdf]

Annotating Electronic Medical Records for Question Answering

Authors: Preethi Raghavan, Siddharth Patwardhan, Jennifer J. Liang, Murthy V. Devarakonda

Abstract: Our research is in the relatively unexplored area of question answering technologies for patient-specific questions over their electronic health records. A large dataset of human expert curated question and answer pairs is an important prerequisite for developing, training and evaluating any question answering system that is powered by machine learning. In this paper, we describe a process for creating such a dataset of questions and answers. Our methodology is replicable, can be conducted by medical students as annotators, and results in high inter-annotator agreement (0.71 Cohen's kappa). Over the course of 11 months, 11 medical students followed our annotation methodology, resulting in a question answering dataset of 5696 questions over 71 patient records, of which 1747 questions have corresponding answers generated by the medical students.

Submitted 17 May, 2018; originally announced May 2018.

Comments: 10 pages, 2016

arXiv:1612.02170 [pdf, other]

doi 10.1063/1.4975693

Non-volatile spin wave majority gate at the nanoscale

Authors: Odysseas Zografos, Sourav Dutta, Mauricio Manfrini, Adrien Vaysset, Bart Sorée, Azad Naeemi, Praveen Raghavan, Rudy Lauwereins, Iuliana P. Radu

Abstract: A spin wave majority fork-like structure with a feature size of 40 nm is presented and investigated through micromagnetic simulations. The structure consists of three merging out-of-plane magnetization spin wave buses and four magneto-electric cells serving as three inputs and an output. The information of the logic signals is encoded in the phase of the transmitted spin waves and subsequently stored as the direction of magnetization of the magneto-electric cells upon detection. The minimum dimensions of the structure that produce an operational majority gate are identified. For all input combinations, the detection scheme employed manages to capture the majority phase result of the spin wave interference and ignore all reflection effects induced by the geometry of the structure.

Submitted 7 December, 2016; originally announced December 2016.

Journal ref: AIP Advances, Volume 7, Issue 5, 2017

arXiv:1610.02608 [pdf, other]

Research and Education in Computational Science and Engineering

Authors: Ulrich Rüde, Karen Willcox, Lois Curfman McInnes, Hans De Sterck, George Biros, Hans Bungartz, James Corones, Evin Cramer, James Crowley, Omar Ghattas, Max Gunzburger, Michael Hanke, Robert Harrison, Michael Heroux, Jan Hesthaven, Peter Jimack, Chris Johnson, Kirk E. Jordan, David E. Keyes, Rolf Krause, Vipin Kumar, Stefan Mayer, Juan Meza, Knut Martin Mørken, J. Tinsley Oden, et al. (8 additional authors not shown)

Abstract: Over the past two decades the field of computational science and engineering (CSE) has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers of all persuasions with algorithmic inventions and software systems that transcend disciplines and scales. Carried on a wave of digital technology, CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society; and the CSE community is at the core of this transformation. However, a combination of disruptive developments (including the architectural complexity of extreme-scale computing, the data revolution that engulfs the planet, and the specialization required to follow the applications to new frontiers) is redefining the scope and reach of the CSE endeavor. This report describes the rapid expansion of CSE and the challenges to sustaining its bold advances. The report also presents strategies and directions for CSE research and education for the next decade.

Submitted 31 December, 2017; v1 submitted 8 October, 2016; originally announced October 2016.

Comments: Major revision, to appear in SIAM Review

Report number: Argonne National Laboratory Preprint ANL/MCS-P6054-0916

MSC Class: 00A72; 62-07; 68U20; 68W01; 68W10; 97A99; 97M10; 97N80; 97R20; 97R30

ACM Class: G.0; G.4; I.6; J.0; J.2; J.3; J.4; J.6; J.7; K.3.2

arXiv:1607.04263 [pdf, other]

doi 10.1145/2939672.2939797

The Limits of Popularity-Based Recommendations, and the Role of Social Ties

Authors: Marco Bressan, Stefano Leucci, Alessandro Panconesi, Prabhakar Raghavan, Erisa Terolli

Abstract: In this paper we introduce a mathematical model that captures some of the salient features of recommender systems that are based on popularity and that try to exploit social ties among the users. We show that, under very general conditions, the market always converges to a steady state, for which we are able to give an explicit form. Thanks to this we can tell rather precisely how much a market is altered by a recommendation system, and determine the power of users to influence others. Our theoretical results are complemented by experiments with real world social networks showing that social graphs prevent large market distortions in spite of the presence of highly influential users.

Submitted 14 July, 2016; originally announced July 2016.

Comments: 10 pages, 9 figures, KDD 2016

arXiv:1606.02638 [pdf, other]

Addressing Limited Data for Textual Entailment Across Domains

Authors: Chaitanya Shivade, Preethi Raghavan, Siddharth Patwardhan

Abstract: We seek to address the lack of labeled data (and high cost of annotation) for textual entailment in some domains. To that end, we first create (for experimental purposes) an entailment dataset for the clinical domain, and a highly competitive supervised entailment system, ENT, that is effective (out of the box) on two domains. We then explore self-training and active learning strategies to address the lack of labeled data. With self-training, we successfully exploit unlabeled data to improve over ENT by 15% F-score on the newswire domain, and 13% F-score on clinical data. On the other hand, our active learning experiments demonstrate that we can match (and even beat) ENT using only 6.6% of the training data in the clinical domain, and only 5.8% of the training data in the newswire domain.

Submitted 8 June, 2016; originally announced June 2016.

arXiv:1606.00803 [pdf, other]

Locality-Aware Laplacian Mesh Smoothing

Authors: Guillaume Aupy, JeongHyung Park, Padma Raghavan

Abstract: In this paper, we propose a novel reordering scheme to improve the performance of a Laplacian Mesh Smoothing (LMS). While the Laplacian smoothing algorithm is well optimized and studied, we show how a simple reordering of the vertices of the mesh can greatly improve the execution time of the smoothing algorithm. The idea of our reordering is based on (i) the postulate that cache misses are a very time consuming part of the execution of LMS, and (ii) the study of the reuse distance patterns of various executions of the LMS algorithm. Our reordering algorithm is very simple but allows for huge performance improvement. We ran it on a Westmere-EX platform and obtained a speedup of 75 on 32 cores compared to the single core execution without reordering, and a gain in execution of 32% on 32 cores compared to state of the art reordering. Finally, we show that we leave little room for a better ordering by reducing the L2 and L3 cache misses to a bare minimum.

Submitted 2 June, 2016; originally announced June 2016.

Comments: Accepted to ICPP'16

arXiv:1602.03855 [pdf, other]

A Statistical Framework for Single Subject Design with an Application in Post-stroke Rehabilitation

Authors: Ying Lu, Marc Scott, Preeti Raghavan

Abstract: This paper proposes a practical yet novel solution to a longstanding statistical testing problem regarding single subject design. In particular, we aim to resolve an important clinical question: does a new patient behave the same as one from a healthy population? This question cannot be answered using the traditional single subject design when only test subject information is used, nor can it be satisfactorily resolved by comparing a single-subject's data with the mean value of a healthy population without proper assessment of the impact of between and within subject variability. Here, we use Bayesian posterior predictive draws based on a training set of healthy subjects to generate a template null distribution of the statistic of interest to test whether the test subject belongs to the healthy population. This method also provides an estimate of the error rate associated with the decision and provides a confidence interval for the point estimate of interest. Taken together, this information will enable clinicians to conduct evidence-based clinical decision making by directly comparing the observed measures with a precalculated null distribution for such measures. Simulation studies show that the proposed test performs satisfactorily under controlled conditions.

Submitted 11 February, 2016; originally announced February 2016.

Comments: 31 pages, 3 figures, 2 tables

MSC Class: 62C05

arXiv:1502.04049 [pdf]

How essential are unstructured clinical narratives and information fusion to clinical trial recruitment?

Authors: Preethi Raghavan, James L. Chen, Eric Fosler-Lussier, Albert M. Lai

Abstract: Electronic health records capture patient information using structured controlled vocabularies and unstructured narrative text. While structured data typically encodes lab values, encounters and medication lists, unstructured data captures the physician's interpretation of the patient's condition, prognosis, and response to therapeutic intervention. In this paper, we demonstrate that information extraction from unstructured clinical narratives is essential to most clinical applications. We perform an empirical study to validate the argument and show that structured data alone is insufficient in resolving eligibility criteria for recruiting patients onto clinical trials for chronic lymphocytic leukemia (CLL) and prostate cancer. Unstructured data is essential to solving 59% of the CLL trial criteria and 77% of the prostate cancer trial criteria. More specifically, for resolving eligibility criteria with temporal constraints, we show the need for temporal reasoning and information integration with medical events within and across unstructured clinical narratives and structured data.

Submitted 13 February, 2015; originally announced February 2015.

Comments: AMIA TBI 2014, 6 pages

arXiv:1304.7793 [pdf, other]

Co-Scheduling Algorithms for High-Throughput Workload Execution

Authors: Guillaume Aupy, Manu Shantharam, Anne Benoit, Yves Robert, Padma Raghavan

Abstract: This paper investigates co-scheduling algorithms for processing a set of parallel applications. Instead of executing each application one by one, using a maximum degree of parallelism for each of them, we aim at scheduling several applications concurrently. We partition the original application set into a series of packs, which are executed one by one. A pack comprises several applications, each of them with an assigned number of processors, with the constraint that the total number of processors assigned within a pack does not exceed the maximum number of available processors. The objective is to determine a partition into packs, and an assignment of processors to applications, that minimize the sum of the execution times of the packs. We thoroughly study the complexity of this optimization problem, and propose several heuristics that exhibit very good performance on a variety of workloads, whose application execution times model profiles of parallel scientific codes. We show that co-scheduling leads to faster workload completion time and to faster response times on average (hence increasing system throughput and saving energy), for significant benefits over traditional scheduling from both the user and system perspectives.

Submitted 29 April, 2013; originally announced April 2013.

Report number: INRIA RR-8293

arXiv:math/9409223 [pdf, ps]

On the minimum latency problem

Authors: Avrim Blum, Prasad Chalasani, Don Coppersmith, Bill Pulleyblank, Prabhakar Raghavan, Madhu Sudan

Abstract: We are given a set of points $p_1, \ldots, p_n$ and a symmetric distance matrix $(d_{ij})$ giving the distance between $p_i$ and $p_j$. We wish to construct a tour that minimizes $\sum_{i=1}^n \ell(i)$, where $\ell(i)$ is the latency of $p_i$, defined to be the distance traveled before first visiting $p_i$. This problem is also known in the literature as the deliveryman problem or the traveling repairman problem. It arises in a number of applications including disk-head scheduling, and turns out to be surprisingly different from the traveling salesman problem in character. We give exact and approximate solutions to a number of cases, including a constant-factor approximation algorithm whenever the distance matrix satisfies the triangle inequality.

Submitted 20 September, 1994; originally announced September 1994.

Comments: 9 pages

Report number: LACES 68Q-94-18

MSC Class: 68Q25

Showing 1–27 of 27 results for author: Raghavan, P