
IncDSI: Incrementally Updatable Document Retrieval

Authors: Varsha Kishore, Chao Wan, Justin Lovelace, Yoav Artzi, Kilian Q. Weinberger

Abstract: Differentiable Search Index is a recently proposed paradigm for document retrieval, that encodes information about a corpus of documents within the parameters of a neural network and directly maps queries to corresponding documents. These models have achieved state-of-the-art performances for document retrieval across many benchmarks. These kinds of models have a significant limitation: it is not easy to add new documents after a model is trained. We propose IncDSI, a method to add documents in real time (about 20-50ms per document), without retraining the model on the entire dataset (or even parts thereof). Instead we formulate the addition of documents as a constrained optimization problem that makes minimal changes to the network parameters. Although orders of magnitude faster, our approach is competitive with re-training the model on the whole dataset and enables the development of document retrieval systems that can be updated with new information in real-time. Our code for IncDSI is available at https://github.com/varshakishore/IncDSI.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2212.09462

Latent Diffusion for Language Generation

Authors: Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Q. Weinberger

Abstract: Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models.

Submitted 7 November, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: NeurIPS 2023

arXiv:2106.06555

Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network

Authors: Justin Lovelace, Denis Newman-Griffis, Shikhar Vashishth, Jill Fain Lehman, Carolyn Penstein Rosé

Abstract: Knowledge Graph (KG) completion research usually focuses on densely connected benchmark datasets that are not representative of real KGs. We curate two KG datasets that include biomedical and encyclopedic knowledge and use an existing commonsense KG dataset to explore KG completion in the more realistic setting where dense connectivity is not guaranteed. We develop a deep convolutional network that utilizes textual entity representations and demonstrate that our model outperforms recent KG completion methods in this challenging setting. We find that our model's performance improvements stem primarily from its robustness to sparsity. We then distill the knowledge from the convolutional network into a student network that re-ranks promising candidate entities. This re-ranking stage leads to further improvements in performance and demonstrates the effectiveness of entity re-ranking for KG completion.

Submitted 11 June, 2021; originally announced June 2021.

Comments: The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)

arXiv:2008.01197

Dynamically Extracting Outcome-Specific Problem Lists from Clinical Notes with Guided Multi-Headed Attention

Authors: Justin Lovelace, Nathan C. Hurley, Adrian D. Haimovich, Bobak J. Mortazavi

Abstract: Problem lists are intended to provide clinicians with a relevant summary of patient medical issues and are embedded in many electronic health record systems. Despite their importance, problem lists are often cluttered with resolved or currently irrelevant conditions. In this work, we develop a novel end-to-end framework that first extracts diagnosis and procedure information from clinical notes and subsequently uses the extracted medical problems to predict patient outcomes. This framework is both more performant and more interpretable than existing models used within the domain, achieving an AU-ROC of 0.710 for bounceback readmission and 0.869 for in-hospital mortality occurring after ICU discharge. We identify risk factors for both readmission and mortality outcomes and demonstrate that our framework can be used to develop dynamic problem lists that present clinical problems along with their quantitative importance. We conduct a qualitative user study with medical experts and demonstrate that they view the lists produced by our framework favorably and find them to be a more effective clinical decision support tool than a strong baseline.

Submitted 25 July, 2020; originally announced August 2020.

Comments: To appear in the proceedings of the Machine Learning for Healthcare Conference (MLHC) 2020. Accepted papers can be viewed at https://www.mlforhc.org/accepted-papers

arXiv:1910.14095

Explainable Prediction of Adverse Outcomes Using Clinical Notes

Authors: Justin R. Lovelace, Nathan C. Hurley, Adrian D. Haimovich, Bobak J. Mortazavi

Abstract: Clinical notes contain a large amount of clinically valuable information that is ignored in many clinical decision support systems due to the difficulty that comes with mining that information. Recent work has found success leveraging deep learning models for the prediction of clinical outcomes using clinical notes. However, these models fail to provide clinically relevant and interpretable information that clinicians can utilize for informed clinical care. In this work, we augment a popular convolutional model with an attention mechanism and apply it to unstructured clinical notes for the prediction of ICU readmission and mortality. We find that the addition of the attention mechanism leads to competitive performance while allowing for the straightforward interpretation of predictions. We develop clear visualizations to present important spans of text for both individual predictions and high-risk cohorts. We then conduct a qualitative analysis and demonstrate that our model is consistently attending to clinically meaningful portions of the narrative for all of the outcomes that we explore.

Submitted 12 November, 2019; v1 submitted 30 October, 2019; originally announced October 2019.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

Showing 1–5 of 5 results for author: Lovelace, J