-
Improving Content Recommendation: Knowledge Graph-Based Semantic Contrastive Learning for Diversity and Cold-Start Users
Authors:
Yejin Kim,
Scott Rome,
Kevin Foley,
Mayur Nankani,
Rimon Melamed,
Javier Morales,
Abhay Yadav,
Maria Peifer,
Sardar Hamidian,
H. Howie Huang
Abstract:
Addressing data sparsity, cold-start problems, and diversity in recommendation systems is both crucial and demanding. Many current solutions leverage knowledge graphs to tackle these issues by combining item-based and user-item collaborative signals. A common trend in these approaches is to improve ranking performance at the cost of escalating model complexity, reduced diversity, and a more complicated task. It is essential to provide recommendations that are both personalized and diverse, rather than relying solely on rank-based metrics such as click-through rate and recall. In this paper, we propose a hybrid multi-task learning approach that trains on user-item and item-item interactions. We apply item-based contrastive learning on descriptive text, sampling positive and negative pairs based on item metadata. Our approach allows the model to better understand the relationships between entities within the knowledge graph by utilizing semantic information from text. This leads to more accurate, relevant, and diverse recommendations, a benefit that extends even to cold-start users who have few interactions with items. We perform extensive experiments on two widely used datasets to validate the effectiveness of our approach. Our findings demonstrate that jointly training on user-item interactions and item-based signals using synopsis text is highly effective. Furthermore, our results provide evidence that item-based contrastive learning enhances the quality of entity embeddings, as indicated by metrics such as uniformity and alignment.
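The abstract names alignment and uniformity as embedding-quality metrics but does not define them. The sketch below uses the definitions commonly used in the contrastive-learning literature: alignment is the mean squared distance between unit-normalized embeddings of positive pairs, and uniformity is the log of the mean Gaussian potential over all distinct pairs. Whether the authors use exactly these forms is an assumption, and the function names, the temperature t=2.0, and the toy vectors are illustrative only.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length so distances live on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(positive_pairs):
    """Mean squared distance between normalized positive pairs (lower is better)."""
    dists = [sq_dist(l2_normalize(a), l2_normalize(b)) for a, b in positive_pairs]
    return sum(dists) / len(dists)


def uniformity(embeddings, t=2.0):
    """Log mean Gaussian potential over all distinct pairs (lower is more uniform)."""
    normed = [l2_normalize(e) for e in embeddings]
    potentials = [math.exp(-t * sq_dist(a, b)) for a, b in combinations(normed, 2)]
    return math.log(sum(potentials) / len(potentials))


# Hypothetical item embeddings: two items sharing metadata form a positive pair.
item_a = [1.0, 0.1]
item_b = [0.9, 0.2]
item_c = [-1.0, 0.0]

print("alignment:", alignment([(item_a, item_b)]))
print("uniformity:", uniformity([item_a, item_b, item_c]))
```

Under these definitions, lower values are better for both metrics: positive pairs sit close together, while the embeddings as a whole spread out over the unit sphere.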
Submitted 27 March, 2024;
originally announced March 2024.
-
Prompts have evil twins
Authors:
Rimon Melamed,
Lucas H. McCabe,
Tanay Wakhare,
Yejin Kim,
H. Howie Huang,
Enric Boix-Adsera
Abstract:
We discover that many natural-language prompts can be replaced by corresponding prompts that are unintelligible to humans but that provably elicit similar behavior in language models. We call these prompts "evil twins" because they are obfuscated and uninterpretable (evil), but at the same time mimic the functionality of the original natural-language prompts (twins). Remarkably, evil twins transfer between models. We find these prompts by solving a maximum-likelihood problem which has applications of independent interest.
Submitted 29 April, 2024; v1 submitted 12 November, 2023;
originally announced November 2023.
-
Associations Between Natural Language Processing (NLP) Enriched Social Determinants of Health and Suicide Death among US Veterans
Authors:
Avijit Mitra,
Richeek Pradhan,
Rachel D Melamed,
Kun Chen,
David C Hoaglin,
Katherine L Tucker,
Joel I Reisman,
Zhichao Yang,
Weisong Liu,
Jack Tsai,
Hong Yu
Abstract:
Importance: Social determinants of health (SDOH) are known to be associated with increased risk of suicidal behaviors, but few studies utilized SDOH from unstructured electronic health record (EHR) notes.
Objective: To investigate associations between suicide and recent SDOH, identified using structured and unstructured data.
Design: Nested case-control study.
Setting: EHR data from the US Veterans Health Administration (VHA).
Participants: 6,122,785 Veterans who received care in the US VHA between October 1, 2010, and September 30, 2015.
Exposures: Occurrence of SDOH over a maximum span of two years compared with no occurrence of SDOH.
Main Outcomes and Measures: Cases of suicide deaths were matched with 4 controls on birth year, cohort entry date, sex, and duration of follow-up. We developed an NLP system to extract SDOH from unstructured notes. Structured data, NLP on unstructured data, and combining them yielded six, eight and nine SDOH respectively. Adjusted odds ratios (aORs) and 95% confidence intervals (CIs) were estimated using conditional logistic regression.
Results: In our cohort, 8,821 Veterans committed suicide during 23,725,382 person-years of follow-up (incidence rate 37.18/100,000 person-years). Our cohort was mostly male (92.23%) and white (76.99%). Across the five common SDOH as covariates, NLP-extracted SDOH, on average, covered 80.03% of all SDOH occurrences. All SDOH, measured by structured data and NLP, were significantly associated with increased risk of suicide. The SDOH with the largest effects was legal problems (aOR=2.66, 95% CI=2.46-2.89), followed by violence (aOR=2.12, 95% CI=1.98-2.27). NLP-extracted and structured SDOH were also associated with suicide.
Conclusions and Relevance: NLP-extracted SDOH were always significantly associated with increased risk of suicide among Veterans, suggesting the potential of NLP in public health studies.
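The reported figures can be sanity-checked with simple arithmetic. The sketch below recomputes the incidence rate from the stated deaths and person-years, and checks that the violence odds ratio sits near the geometric mean of its confidence bounds. The second check assumes a Wald-type interval that is symmetric on the log scale, which the abstract does not state; the variable and function names are illustrative.

```python
import math

# Figures quoted in the abstract.
suicide_deaths = 8_821
person_years = 23_725_382


def incidence_per_100k(events, person_years):
    """Crude incidence rate per 100,000 person-years."""
    return events / person_years * 100_000


rate = incidence_per_100k(suicide_deaths, person_years)
print(f"incidence rate: {rate:.2f} per 100,000 person-years")

# Assumption: the 95% CI is symmetric around log(aOR), so the point
# estimate should be close to the geometric mean of the two bounds.
violence_aor, ci_low, ci_high = 2.12, 1.98, 2.27
geometric_mean = math.sqrt(ci_low * ci_high)
print(f"geometric mean of CI bounds: {geometric_mean:.2f} (reported aOR {violence_aor})")
```

A point estimate far from the geometric mean of its bounds would suggest a transcription error in one of the reported numbers.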
Submitted 28 December, 2022; v1 submitted 11 December, 2022;
originally announced December 2022.