-
IncDSI: Incrementally Updatable Document Retrieval
Authors:
Varsha Kishore,
Chao Wan,
Justin Lovelace,
Yoav Artzi,
Kilian Q. Weinberger
Abstract:
Differentiable Search Index is a recently proposed paradigm for document retrieval that encodes information about a corpus of documents within the parameters of a neural network and directly maps queries to corresponding documents. These models have achieved state-of-the-art performance for document retrieval across many benchmarks. However, they have a significant limitation: it is not easy to add new documents after a model is trained. We propose IncDSI, a method to add documents in real time (about 20-50 ms per document), without retraining the model on the entire dataset (or even parts thereof). Instead, we formulate the addition of documents as a constrained optimization problem that makes minimal changes to the network parameters. Although orders of magnitude faster, our approach is competitive with re-training the model on the whole dataset and enables the development of document retrieval systems that can be updated with new information in real time. Our code for IncDSI is available at https://github.com/varshakishore/IncDSI.
Submitted 19 July, 2023;
originally announced July 2023.
-
Latent Diffusion for Language Generation
Authors:
Justin Lovelace,
Varsha Kishore,
Chao Wan,
Eliot Shekhtman,
Kilian Q. Weinberger
Abstract:
Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing pretrained language models. We view diffusion and existing language models as complementary. We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders. We then demonstrate that continuous diffusion models can be learned in the latent space of the language autoencoder, enabling us to sample continuous latent representations that can be decoded into natural language with the pretrained decoder. We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation. We demonstrate across multiple diverse data sets that our latent language diffusion models are significantly more effective than previous diffusion language models.
Submitted 7 November, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Robust Knowledge Graph Completion with Stacked Convolutions and a Student Re-Ranking Network
Authors:
Justin Lovelace,
Denis Newman-Griffis,
Shikhar Vashishth,
Jill Fain Lehman,
Carolyn Penstein Rosé
Abstract:
Knowledge Graph (KG) completion research usually focuses on densely connected benchmark datasets that are not representative of real KGs. We curate two KG datasets that include biomedical and encyclopedic knowledge and use an existing commonsense KG dataset to explore KG completion in the more realistic setting where dense connectivity is not guaranteed. We develop a deep convolutional network that utilizes textual entity representations and demonstrate that our model outperforms recent KG completion methods in this challenging setting. We find that our model's performance improvements stem primarily from its robustness to sparsity. We then distill the knowledge from the convolutional network into a student network that re-ranks promising candidate entities. This re-ranking stage leads to further improvements in performance and demonstrates the effectiveness of entity re-ranking for KG completion.
Submitted 11 June, 2021;
originally announced June 2021.
-
Dynamically Extracting Outcome-Specific Problem Lists from Clinical Notes with Guided Multi-Headed Attention
Authors:
Justin Lovelace,
Nathan C. Hurley,
Adrian D. Haimovich,
Bobak J. Mortazavi
Abstract:
Problem lists are intended to provide clinicians with a relevant summary of patient medical issues and are embedded in many electronic health record systems. Despite their importance, problem lists are often cluttered with resolved or currently irrelevant conditions. In this work, we develop a novel end-to-end framework that first extracts diagnosis and procedure information from clinical notes and subsequently uses the extracted medical problems to predict patient outcomes. This framework is both more performant and more interpretable than existing models used within the domain, achieving an AU-ROC of 0.710 for bounceback readmission and 0.869 for in-hospital mortality occurring after ICU discharge. We identify risk factors for both readmission and mortality outcomes and demonstrate that our framework can be used to develop dynamic problem lists that present clinical problems along with their quantitative importance. We conduct a qualitative user study with medical experts and demonstrate that they view the lists produced by our framework favorably and find them to be a more effective clinical decision support tool than a strong baseline.
Submitted 25 July, 2020;
originally announced August 2020.
-
Explainable Prediction of Adverse Outcomes Using Clinical Notes
Authors:
Justin R. Lovelace,
Nathan C. Hurley,
Adrian D. Haimovich,
Bobak J. Mortazavi
Abstract:
Clinical notes contain a large amount of clinically valuable information that many clinical decision support systems ignore because it is difficult to mine. Recent work has found success leveraging deep learning models for the prediction of clinical outcomes using clinical notes. However, these models fail to provide clinically relevant and interpretable information that clinicians can utilize for informed clinical care. In this work, we augment a popular convolutional model with an attention mechanism and apply it to unstructured clinical notes for the prediction of ICU readmission and mortality. We find that the addition of the attention mechanism leads to competitive performance while allowing for the straightforward interpretation of predictions. We develop clear visualizations to present important spans of text for both individual predictions and high-risk cohorts. We then conduct a qualitative analysis and demonstrate that our model is consistently attending to clinically meaningful portions of the narrative for all of the outcomes that we explore.
Submitted 12 November, 2019; v1 submitted 30 October, 2019;
originally announced October 2019.