Identifying Substitute and Complementary Products for Assortment Optimization with Cleora Embeddings
Authors:
Sergiy Tkachuk,
Anna Wróblewska,
Jacek Dąbrowski,
Szymon Łukasik
Abstract:
Recent years have brought increasing interest in applying machine learning algorithms in e-commerce, omnichannel marketing, and the sales industry. This is due not only to algorithmic advances but also to the availability of data representing transactions, users, and background product information. Finding products related in different ways, i.e., substitutes and complements, is essential both for user recommendations on the vendor's site and for the vendor, who needs it to perform efficient assortment optimization.
The paper introduces a novel method for finding product substitutes and complements based on the Cleora graph embedding algorithm. We also evaluate it experimentally against the state-of-the-art Shopper algorithm, assessing the relevance of recommendations through surveys of industry experts. We conclude that the new approach offers suitable choices of recommended products while requiring a minimal amount of additional information. The algorithm can be used in various enterprises to identify substitute and complementary product options effectively.
Submitted 10 August, 2022;
originally announced August 2022.
Multilingual Transformers for Product Matching -- Experiments and a New Benchmark in Polish
Authors:
Michał Możdżonek,
Anna Wróblewska,
Sergiy Tkachuk,
Szymon Łukasik
Abstract:
Product matching is the task of matching identical products across different data sources. It typically relies on available product features which, apart from being multimodal, i.e., composed of various data types, might be non-homogeneous and incomplete. The paper shows that pre-trained multilingual Transformer models, after fine-tuning, are suitable for solving the product matching problem using textual features in both English and Polish. We tested the multilingual mBERT and XLM-RoBERTa models in English on the Web Data Commons training dataset and gold standard for large-scale product matching. The results show that these models perform similarly to the latest solutions tested on this set, and in some cases even better.
Additionally, for research purposes, we prepared a new dataset entirely in Polish, based on offers in selected categories obtained from several online stores. It is the first open dataset for product matching tasks in Polish, and it allows the effectiveness of pre-trained models to be compared. We therefore also report baseline results obtained by the fine-tuned mBERT and XLM-RoBERTa models on the Polish datasets.
Submitted 1 June, 2022; v1 submitted 31 May, 2022;
originally announced May 2022.