Continuous-time convolutions model of event sequences
Authors:
Vladislav Zhuzhel,
Vsevolod Grabar,
Galina Boeva,
Artem Zabolotnyi,
Alexander Stepikin,
Vladimir Zholobov,
Maria Ivanova,
Mikhail Orlov,
Ivan Kireev,
Evgeny Burnaev,
Rodrigo Rivera-Castro,
Alexey Zaytsev
Abstract:
Massive samples of event sequence data occur in various domains, including e-commerce, healthcare, and finance. Inference on such data poses two main challenges: computational and methodological. The amount of available data and the length of event sequences per client are typically large, so long-term modelling is required. Moreover, this data is often sparse and non-uniform, making classic approaches for time series processing inapplicable. Existing solutions for such cases include recurrent and transformer architectures. To allow continuous time, their authors introduce specific parametric intensity functions defined at each moment on top of existing models. Due to their parametric nature, these intensities represent only a limited class of event sequences.
We propose the COTIC method, based on a continuous convolutional neural network suitable for the non-uniform occurrence of events in time. In COTIC, dilations and a multi-layer architecture efficiently handle dependencies between events. Furthermore, the model provides general intensity dynamics in continuous time, including the self-excitement encountered in practice.
The COTIC model outperforms existing approaches on the majority of the considered datasets and produces embeddings for an event sequence that can be used to solve downstream tasks, e.g. predicting the next event type and return time. The code of the proposed method can be found in the GitHub repository (https://github.com/VladislavZh/COTIC).
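To make the idea of a convolution over non-uniformly spaced events concrete, here is a minimal sketch in plain Python. It is not the COTIC implementation: the fixed exponential kernel, the function names, and the `window` and `dilation` parameters are assumptions chosen for illustration, and the abstract does not describe how the actual kernel is parameterized. The point it shows is that the kernel is evaluated on real-valued time gaps rather than on integer lags, and that a dilation skips over intermediate events to reach further into the past.

```python
import math

def decay_kernel(dt, scale=1.0):
    # Hypothetical kernel: weight decays exponentially with the time gap dt >= 0.
    return math.exp(-dt / scale)

def continuous_causal_conv(times, values, kernel_fn, window=3, dilation=1):
    # For each event i, sum kernel_fn(t_i - t_j) * x_j over up to `window`
    # events j = i, i - dilation, i - 2 * dilation, ... (causal: j <= i).
    out = []
    for i, t_i in enumerate(times):
        total = 0.0
        for k in range(window):
            j = i - k * dilation
            if j < 0:
                break
            total += kernel_fn(t_i - times[j]) * values[j]
        out.append(total)
    return out

# Irregularly spaced event times with a unit feature attached to each event.
times = [0.0, 0.4, 2.5, 2.6, 7.0]
values = [1.0, 1.0, 1.0, 1.0, 1.0]

dense = continuous_causal_conv(times, values, decay_kernel, window=2, dilation=1)
dilated = continuous_causal_conv(times, values, decay_kernel, window=2, dilation=2)
```

With `dilation=1` each output mixes an event with its immediate predecessor; with `dilation=2` it mixes the event with the one two steps back, so stacking layers with growing dilations would widen the history each output sees without enlarging the window.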
Submitted 13 February, 2023;
originally announced February 2023.
COHORTNEY: Non-Parametric Clustering of Event Sequences
Authors:
Vladislav Zhuzhel,
Rodrigo Rivera-Castro,
Nina Kaploukhaya,
Liliya Mironova,
Alexey Zaytsev,
Evgeny Burnaev
Abstract:
Cohort analysis is a pervasive activity in web analytics. One divides users into groups according to specific criteria and tracks their behavior over time. Despite its extensive use, academic circles do not discuss cohort analysis as a way to evaluate user behavior online. This work introduces an unsupervised non-parametric approach to group Internet users based on their activities. In comparison, canonical methods in marketing and engineering-based techniques underperform. COHORTNEY is the first machine learning-based cohort analysis algorithm with a robust theoretical explanation.
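For readers unfamiliar with the classic, criterion-based form of cohort analysis mentioned in the abstract's opening sentences, the following is a minimal sketch in plain Python. It is not COHORTNEY, which the abstract describes as an unsupervised non-parametric method. The activity log, the weekly granularity, and the function name `cohort_retention` are all hypothetical. Users are grouped by the week of their first activity, and the share of each group that is active in later weeks is tracked.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, week index in which the user was active).
events = [
    ("u1", 0), ("u1", 1), ("u1", 2),
    ("u2", 0), ("u2", 2),
    ("u3", 1), ("u3", 2),
]

def cohort_retention(events):
    # A user's cohort is the week of their first recorded activity.
    first_week = {}
    for user, week in events:
        first_week[user] = min(week, first_week.get(user, week))

    # Collect the users active at each offset (weeks since joining) per cohort.
    active = defaultdict(set)
    for user, week in events:
        cohort = first_week[user]
        active[(cohort, week - cohort)].add(user)

    # Cohort size is the number of users who joined in that week.
    sizes = defaultdict(int)
    for cohort in first_week.values():
        sizes[cohort] += 1

    # Retention is the fraction of the cohort active at each offset.
    return {key: len(users) / sizes[key[0]] for key, users in active.items()}

retention = cohort_retention(events)
```

Here the grouping criterion is fixed in advance (first active week); an unsupervised approach would instead infer the groups from the activity sequences themselves.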
Submitted 12 June, 2021; v1 submitted 3 April, 2021;
originally announced April 2021.