doi 10.1016/j.asoc.2017.02.004

Clustering Retail Products Based on Customer Behaviour

Authors: Vladimír Holý, Ondřej Sokol, Michal Černý

Abstract: The categorization of retail products is essential for the business decision-making process. It is a common practice to classify products based on their quantitative and qualitative characteristics. In this paper we use a purely data-driven approach. Our clustering of products is based exclusively on the customer behaviour. We propose a method for clustering retail products using market basket data. Our model is formulated as an optimization problem which is solved by a genetic algorithm. It is demonstrated on simulated data how our method behaves in different settings. The application using real data from a Czech drugstore company shows that our method leads to similar results in comparison with the classification by experts. The number of clusters is a parameter of our algorithm. We demonstrate that if more clusters are allowed than the original number of categories is, the method yields additional information about the structure of the product categorization.

Submitted 8 May, 2024; originally announced May 2024.

Journal ref: (2017) Applied Soft Computing, 60, 752-762

arXiv:2201.12140 [pdf, other]

A Simple Measure of Product Substitutability Based on Common Purchases

Authors: Ondřej Sokol, Vladimír Holý

Abstract: We propose a measure of product substitutability based on correlation of common purchases, which is fast to compute and easy to interpret. In an empirical study of a drugstore retail chain, we demonstrate its properties, compare it to a similarly simple measure of product complementarity, and use it to find small clusters of substitutes.

Submitted 28 January, 2022; originally announced January 2022.

arXiv:2102.01424 [pdf, other]

Clustering with Penalty for Joint Occurrence of Objects: Computational Aspects

Authors: Ondřej Sokol, Vladimír Holý

Abstract: The method of Holý, Sokol and Černý (Applied Soft Computing, 2017, Vol. 60, p. 752-762) clusters objects based on their incidence in a large number of given sets. The idea is to minimize the occurrence of multiple objects from the same cluster in the same set. In the current paper, we study computational aspects of the method. First, we prove that the problem of finding the optimal clustering is NP-hard. Second, to numerically find a suitable clustering, we propose to use the genetic algorithm augmented by a renumbering procedure, a fast task-specific local search heuristic and an initial solution based on a simplified model. Third, in a simulation study, we demonstrate that our improvements of the standard genetic algorithm significantly enhance its computational performance.

Submitted 2 February, 2021; originally announced February 2021.

arXiv:1909.02996 [pdf, other]

doi 10.1177/1470785320921011

The Role of Shopping Mission in Retail Customer Segmentation

Authors: Ondřej Sokol, Vladimír Holý

Abstract: In retailing, it is important to understand customer behavior and determine customer value. A useful tool to achieve such goals is the cluster analysis of transaction data. Typically, a customer segmentation is based on the recency, frequency and monetary value of shopping or the structure of purchased products. We take a different approach and base our segmentation on the shopping mission - a reason why a customer visits the shop. Shopping missions include focused purchases of specific product categories and general purchases of various sizes. In an application to a Czech drugstore chain, we show that the proposed segmentation brings unique information about customers and should be used alongside the traditional methods.

Submitted 19 March, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

Journal ref: International Journal of Market Research, 63(4), 454-470 (2021)

arXiv:1904.10199 [pdf, other]

How Many Customers Does a Retail Store Have?

Authors: Ondřej Sokol, Vladimír Holý

Abstract: The knowledge of the number of customers is the pillar of retail business analytics. In our setting, we assume that a portion of customers is monitored and easily counted due to the loyalty program while the rest is not monitored. The behavior of customers in both groups may significantly differ making the estimation of the number of unmonitored customers a non-trivial task. We identify shopping patterns of several customer segments which allows us to estimate the distribution of customers without the loyalty card using the maximum likelihood method. In a simulation study, we find that the proposed approach is quite precise even when the data sample is very small and its assumptions are violated to a certain degree. In an empirical study of a drugstore chain, we validate and illustrate the proposed approach in practice. The actual number of customers estimated by the proposed method is much higher than the number suggested by the naive estimate assuming the constant customer distribution. The proposed method can also be utilized to determine penetration of the loyalty program in the individual customer segments.

Submitted 5 April, 2020; v1 submitted 23 April, 2019; originally announced April 2019.

Showing 1–5 of 5 results for author: Sokol, O