-
Boosting the performance of anomalous diffusion classifiers with the proper choice of features
Authors:
Patrycja Kowalek,
Hanna Loch-Olszewska,
Łukasz Łaszczuk,
Jarosław Opała,
Janusz Szwabiński
Abstract:
Understanding and identifying the different types of single-molecule diffusion that occur in a broad range of systems (including living matter) is extremely important, as it can provide information on the physical and chemical characteristics of the particles' surroundings. In recent years, an ever-growing number of methods have been proposed to overcome some of the limitations of the mean-squared displacement approach to tracer diffusion. In March 2020, the Anomalous Diffusion (AnDi) Challenge was launched by a community of international scientists to provide a framework for an objective comparison of the available methods for anomalous diffusion. In this paper, we introduce a feature-based machine learning method developed in response to Task 2 of the challenge, i.e. the classification of different types of diffusion. We discuss two sets of attributes that may be used for the classification of single-particle tracking data. The first was proposed as our contribution to the AnDi Challenge. The second is the result of our attempt to improve the performance of the classifier after the competition deadline. Extreme gradient boosting was used as the classification model. Although deep learning constitutes the state of the art for data classification in many domains, we deliberately chose this traditional machine learning algorithm for its superior interpretability. After the extension of the feature set, our classifier achieved an accuracy of 0.83, which is comparable with the top methods based on neural networks.
Submitted 5 March, 2023; v1 submitted 30 December, 2021;
originally announced December 2021.
-
Objective comparison of methods to decode anomalous diffusion
Authors:
Gorka Muñoz-Gil,
Giovanni Volpe,
Miguel Angel Garcia-March,
Erez Aghion,
Aykut Argun,
Chang Beom Hong,
Tom Bland,
Stefano Bo,
J. Alberto Conejero,
Nicolás Firbas,
Òscar Garibo i Orts,
Alessia Gentili,
Zihan Huang,
Jae-Hyung Jeon,
Hélène Kabbech,
Yeongjae Kim,
Patrycja Kowalek,
Diego Krapf,
Hanna Loch-Olszewska,
Michael A. Lomholt,
Jean-Baptiste Masson,
Philipp G. Meyer,
Seongyu Park,
Borja Requena,
Ihor Smal, et al. (9 additional authors not shown)
Abstract:
Deviations from Brownian motion leading to anomalous diffusion are ubiquitously found in transport dynamics, playing a crucial role in phenomena from quantum physics to life sciences. The detection and characterization of anomalous diffusion from the measurement of an individual trajectory are challenging tasks, which traditionally rely on calculating the mean squared displacement of the trajectory. However, this approach breaks down for cases of important practical interest, e.g., short or noisy trajectories, ensembles of heterogeneous trajectories, or non-ergodic processes. Recently, several new approaches have been proposed, mostly building on the ongoing machine-learning revolution. Aiming to perform an objective comparison of methods, we gathered the community and organized an open competition, the Anomalous Diffusion challenge (AnDi). Participating teams independently applied their own algorithms to a commonly-defined dataset including diverse conditions. Although no single method performed best across all scenarios, the results revealed clear differences between the various approaches, providing practical advice for users and a benchmark for developers.
Submitted 14 May, 2021;
originally announced May 2021.
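The abstract above refers to the traditional approach of calculating the mean squared displacement (MSD) of a trajectory. The following is a minimal sketch of that baseline, not code from the paper: it computes the time-averaged MSD of a single 2D trajectory and estimates the anomalous exponent as the slope of log(MSD) against log(lag). The function names `time_averaged_msd` and `anomalous_exponent`, the 2D setting, and the unit time step are assumptions made for illustration.

```python
import math


def time_averaged_msd(xs, ys, max_lag):
    """Time-averaged MSD of one 2D trajectory for lags 1..max_lag.

    Hypothetical helper: assumes equally spaced observations and
    max_lag < len(xs).
    """
    n = len(xs)
    msd = []
    for lag in range(1, max_lag + 1):
        total = 0.0
        for i in range(n - lag):
            dx = xs[i + lag] - xs[i]
            dy = ys[i + lag] - ys[i]
            total += dx * dx + dy * dy
        msd.append(total / (n - lag))
    return msd


def anomalous_exponent(msd):
    """Least-squares slope of log(MSD) versus log(lag).

    Requires at least two lags and strictly positive MSD values.
    """
    log_t = [math.log(lag) for lag in range(1, len(msd) + 1)]
    log_m = [math.log(m) for m in msd]
    mean_t = sum(log_t) / len(log_t)
    mean_m = sum(log_m) / len(log_m)
    cov = sum((t - mean_t) * (m - mean_m) for t, m in zip(log_t, log_m))
    var = sum((t - mean_t) ** 2 for t in log_t)
    return cov / var


# Ballistic motion at constant velocity: MSD grows as lag**2.
xs = [0.5 * i for i in range(50)]
ys = [0.0] * 50
alpha = anomalous_exponent(time_averaged_msd(xs, ys, max_lag=5))
```

For this noise-free ballistic track the fitted exponent is 2 up to floating-point error. On short or noisy trajectories the same fit becomes unreliable, which is the limitation the abstract points to.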
-
Classification of particle trajectories in living cells: machine learning versus statistical testing hypothesis for fractional anomalous diffusion
Authors:
Joanna Janczura,
Patrycja Kowalek,
Hanna Loch-Olszewska,
Janusz Szwabiński,
Aleksander Weron
Abstract:
Single-particle tracking (SPT) has become a popular tool to study the intracellular transport of molecules in living cells. Inferring the character of their dynamics is important, because it determines the organization and functions of the cells. For this reason, one of the first steps in the analysis of SPT data is the identification of the diffusion type of the observed particles. The most popular method to identify the class of a trajectory is based on the mean square displacement (MSD). However, due to its known limitations, several other approaches have already been proposed. With the recent advances in algorithms and the development of modern hardware, the classification attempts rooted in machine learning (ML) are of particular interest. In this work, we adapt two ML ensemble algorithms, i.e. random forest and gradient boosting, to the problem of trajectory classification. We present a new set of features used to transform the raw trajectory data into input vectors required by the classifiers. The resulting models are then applied to real data for G protein-coupled receptors and G proteins. The classification results are compared to recent statistical methods going beyond MSD.
Submitted 10 July, 2020; v1 submitted 13 May, 2020;
originally announced May 2020.
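The abstract above describes transforming raw trajectories into input vectors for the classifiers. As a rough illustration of that idea only, the sketch below maps one 2D trajectory to a fixed-length feature vector. The three features (straightness, efficiency, mean squared step) and the name `trajectory_features` are assumptions chosen for the example; the abstract does not list the paper's actual feature set.

```python
import math


def trajectory_features(xs, ys):
    """Map one 2D trajectory to [straightness, efficiency, mean squared step].

    Illustrative features only (not taken from the abstract):
    - straightness: net displacement divided by total path length
    - efficiency: squared net displacement divided by
      (number of steps * sum of squared step lengths)
    - mean squared step: average squared step length
    Requires at least two points.
    """
    steps = [(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) for i in range(len(xs) - 1)]
    lengths = [math.hypot(dx, dy) for dx, dy in steps]
    path_length = sum(lengths)
    squared_sum = sum(length * length for length in lengths)
    net = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])

    # Guard against a trajectory that never moves.
    straightness = net / path_length if path_length > 0 else 0.0
    efficiency = net ** 2 / (len(steps) * squared_sum) if squared_sum > 0 else 0.0
    mean_squared_step = squared_sum / len(steps)
    return [straightness, efficiency, mean_squared_step]


# A straight track and a closed loop give clearly different vectors.
straight = trajectory_features([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0])
loop = trajectory_features([0.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0, 0.0])
```

Vectors of this kind, one per trajectory, could then be stacked into a table and passed to an ensemble classifier such as a random forest or gradient boosting model.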