-
Enhancing Travel Decision-Making: A Contrastive Learning Approach for Personalized Review Rankings in Accommodations
Authors:
Reda Igebaria,
Eran Fainman,
Sarai Mizrachi,
Moran Beladev,
Fengjun Wang
Abstract:
User-generated reviews significantly influence consumer decisions, particularly in the travel domain when selecting accommodations. This paper's contribution comprises two main elements. First, we present a novel dataset of authentic guest reviews sourced from a prominent online travel platform, totaling over two million reviews from 50,000 distinct accommodations. Second, we propose an innovative approach for personalized review ranking. Our method employs contrastive learning to capture the relationship between a review and the contextual information of its respective reviewer. Through a comprehensive experimental study, we demonstrate that our approach surpasses several baselines across all reported metrics. Augmented by a comparative analysis, we showcase the efficacy of our method in improving personalized review ranking. The implications of our research extend beyond the travel domain, with potential applications in other sectors where personalized review ranking is paramount, such as online e-commerce platforms.
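The abstract does not specify the contrastive objective, so the following is only a minimal sketch under assumptions: it uses an InfoNCE-style loss with in-batch negatives over cosine similarity, and the names `info_nce_loss`, `rank_reviews`, and the `temperature` parameter are hypothetical, not taken from the paper. It illustrates how matching (reviewer context, review) embedding pairs could be pulled together and how reviews could then be ranked for a given context.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(context_vecs, review_vecs, temperature=0.1):
    """Mean InfoNCE-style loss (an assumed objective, not the paper's stated one).

    Entry i of each list is a matching (reviewer context, review) pair;
    every other review in the batch acts as a negative for context i.
    """
    total = 0.0
    for i, ctx in enumerate(context_vecs):
        logits = [cosine(ctx, rev) / temperature for rev in review_vecs]
        # Numerically stable log-sum-exp over all candidate reviews.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(context_vecs)


def rank_reviews(context_vec, review_vecs):
    """Return review indices sorted by descending similarity to the context."""
    return sorted(
        range(len(review_vecs)),
        key=lambda j: cosine(context_vec, review_vecs[j]),
        reverse=True,
    )
```

With trained encoders, `rank_reviews` would be called once per traveler context to order the reviews of a single accommodation; here the embeddings are simply passed in as lists of floats.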
Submitted 30 June, 2024;
originally announced July 2024.
-
Text2Topic: Multi-Label Text Classification System for Efficient Topic Detection in User Generated Content with Zero-Shot Capabilities
Authors:
Fengjun Wang,
Moran Beladev,
Ofri Kleinfeld,
Elina Frayerman,
Tal Shachar,
Eran Fainman,
Karen Lastmann Assaraf,
Sarai Mizrachi,
Benjamin Wang
Abstract:
Multi-label text classification is a critical task in the industry. It helps to extract structured information from large amounts of textual data. We propose Text to Topic (Text2Topic), which achieves high multi-label classification performance by employing a Bi-Encoder Transformer architecture that utilizes concatenation, subtraction, and multiplication of embeddings on both text and topic. Text2Topic also supports zero-shot predictions, produces domain-specific text embeddings, and enables production-scale batch inference with high throughput. The final model achieves accurate and comprehensive results compared to state-of-the-art baselines, including large language models (LLMs).
In this study, a total of 239 topics are defined, and around 1.6 million text-topic pair annotations (of which 200K are positive) are collected on approximately 120K texts from 3 main data sources on Booking.com. The data is collected with optimized smart sampling and partial labeling. The final Text2Topic model is deployed on a real-world stream processing platform, and it outperforms other models with a 92.9% micro mAP score and a 75.8% macro mAP score. We summarize the modeling choices, which are extensively tested through ablation studies, and share detailed in-production decision-making steps.
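The combination of text and topic embeddings mentioned in the abstract can be illustrated with a short sketch. This is an assumption-laden illustration, not the Text2Topic implementation: the feature order, the use of a signed difference rather than an absolute one, the single linear layer with a sigmoid, and the names `combine`, `score`, and `predict_topics` are all hypothetical.

```python
import math


def combine(text_emb, topic_emb):
    """Build pair features: [text, topic, text - topic, text * topic].

    The ordering and the signed difference are assumptions of this sketch.
    """
    diff = [t - p for t, p in zip(text_emb, topic_emb)]
    prod = [t * p for t, p in zip(text_emb, topic_emb)]
    return list(text_emb) + list(topic_emb) + diff + prod


def score(text_emb, topic_emb, weights, bias=0.0):
    """Probability that the topic applies, from a linear layer plus sigmoid."""
    feats = combine(text_emb, topic_emb)
    z = bias + sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-z))


def predict_topics(text_emb, topic_embs, weights, bias=0.0, threshold=0.5):
    """Multi-label prediction: score each topic independently against the text.

    topic_embs maps a topic name to its embedding; a topic unseen in training
    can be scored the same way, which is what makes zero-shot use possible.
    """
    return [
        name
        for name, emb in topic_embs.items()
        if score(text_emb, emb, weights, bias) >= threshold
    ]
```

Because each topic is scored on its own, topic embeddings can be computed once and reused across all texts in a batch.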
Submitted 23 October, 2023;
originally announced October 2023.
-
A Data Analysis Study on Human Liver Blood Circulation
Authors:
Ting Wu,
Keqin Liu,
Emily Fainman,
Ying Wang
Abstract:
The liver has a unique blood supply system and plays an important role in the human blood circulatory system. Thus, hemodynamic problems related to the liver are an important part of clinical diagnosis and treatment. Estimating parameters in these hemodynamic models is essential to the study of liver models, but due to the limitations of medical measurement methods and ethical constraints on clinical studies, it is impossible to directly measure the parameters of blood vessels in livers. Furthermore, as an important part of the systemic blood circulation, the liver should be studied in conjunction with other blood vessels. In this article, we present an innovative method to determine the parameters of an individual liver in a human blood circulation using non-invasive clinical measurements. The method consists of a 1-D blood flow model of human arteries and veins, a 0-D model reflecting the peripheral resistance of capillaries, and a lumped parameter circuit model for human livers. We apply the finite element method in fluid mechanics to a numerical study of these models, based on non-invasive blood-related measurements of 33 individuals. The estimated human blood vessel characteristics and liver model parameters are verified from the perspective of Stroke Value Variation, which shows the effectiveness of our estimation method.
Submitted 14 October, 2021;
originally announced October 2021.
-
Online Budgeted Learning for Classifier Induction
Authors:
Eran Fainman,
Bracha Shapira,
Lior Rokach,
Yisroel Mirsky
Abstract:
In real-world machine learning applications, there is a cost associated with sampling of different features. Budgeted learning can be used to select which feature-values to acquire from each instance in a dataset, such that the best model is induced under a given constraint. However, this approach is not possible in the domain of online learning since one may not retroactively acquire feature-values from past instances. In online learning, the challenge is to find the optimum set of features to be acquired from each instance upon arrival from a data stream. In this paper we introduce the issue of online budgeted learning and describe a general framework for addressing this challenge. We propose two types of feature value acquisition policies based on the multi-armed bandit problem: random and adaptive. Adaptive policies perform online adjustments according to new information coming from a data stream, while random policies are not sensitive to the information that arrives from the data stream. Our comparative study on five real-world datasets indicates that adaptive policies outperform random policies for most budget limitations and datasets. Furthermore, we found that in some cases adaptive policies achieve near-optimal results.
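The difference between random and adaptive acquisition policies can be sketched with an epsilon-greedy bandit, where each feature is an arm. This is an illustrative assumption, not the framework from the paper: the class name `EpsilonGreedyAcquirer`, the reward signal, and the simplification that the budget is a fixed number `k` of equally priced features per instance are all hypothetical.

```python
import random


class EpsilonGreedyAcquirer:
    """Chooses which features to acquire from each arriving instance.

    epsilon=1.0 behaves as a purely random policy; epsilon<1.0 adapts to
    the rewards observed so far on the stream.
    """

    def __init__(self, n_features, epsilon=0.1, seed=0):
        self.counts = [0] * n_features    # times each feature was rewarded
        self.values = [0.0] * n_features  # running mean reward per feature
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select(self, k):
        """Pick k distinct feature indices for the current instance."""
        n = len(self.counts)
        if self.rng.random() < self.epsilon:
            # Explore: ignore past information and sample uniformly.
            return self.rng.sample(range(n), k)
        # Exploit: take the k features with the highest estimated reward.
        ranked = sorted(range(n), key=lambda j: self.values[j], reverse=True)
        return ranked[:k]

    def update(self, feature, reward):
        """Incrementally update the mean reward of an acquired feature."""
        self.counts[feature] += 1
        self.values[feature] += (reward - self.values[feature]) / self.counts[feature]
```

In a stream setting, `select` would be called when an instance arrives, the chosen feature values would be purchased, and `update` would be called with whatever usefulness signal the learner defines, for example the change in model error attributed to that feature.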
Submitted 13 March, 2019;
originally announced March 2019.