A novel RNA pseudouridine site prediction model using Utility Kernel and data-driven parameters
Authors:
Sourabh Patil,
Archana Mathur,
Raviprasad Aduri,
Snehanshu Saha
Abstract:
RNA-protein interactions (RPIs) play an important role in biological systems. Recently, we enumerated RPIs at the residue level and showed that the minimum structural unit (MSU) in these interactions is a stretch of five residues (nucleotides or amino acids). Pseudouridine is the most frequent modification in RNA. The conversion of uridine to pseudouridine involves interactions between pseudouridine synthase and RNA. Existing models for predicting pseudouridine sites in a given RNA sequence depend mainly on user-defined features such as the mono- and dinucleotide composition/propensities of RNA sequences. Predicting pseudouridine sites is a non-linear classification problem with limited data points. Deep learning models are efficient discriminators when the data set is reasonably large but fail when data are scarce ($<1000$ samples). To mitigate this problem, we propose a Support Vector Machine (SVM) kernel based on utility theory from economics, together with data-driven parameters (i.e., the MSU) as features. For this purpose, we use position-specific tri/quad/pentanucleotide composition/propensity (PSPC/PSPP), in addition to nucleotide and dinucleotide composition, as features. SVMs are known to work well in small-data regimes, and SVM kernels are designed to classify non-linear data. The proposed model significantly outperforms the existing state-of-the-art models (by 10%-15% on average).
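To make the feature idea concrete, the following is a minimal sketch of k-mer composition and a position-specific k-mer encoding for a short RNA window. The abstract does not define PSPC/PSPP precisely, so the function names, the one-hot layout, and the `ACGU` alphabet ordering are illustrative assumptions and not the authors' implementation. The utility kernel itself is not specified in the abstract and is therefore not sketched here.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet and ordering


def kmer_index(k):
    """Map every possible k-mer over ALPHABET to a column index."""
    return {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_composition(seq, k):
    """Fraction of each k-mer in seq, ignoring position (hypothetical helper)."""
    index = kmer_index(k)
    counts = [0] * len(index)
    total = 0
    for pos in range(len(seq) - k + 1):
        kmer = seq[pos:pos + k]
        if kmer in index:  # skip k-mers with symbols outside ACGU
            counts[index[kmer]] += 1
            total += 1
    return [c / total if total else 0.0 for c in counts]


def position_specific_kmers(seq, k):
    """One 0/1 slot per (position, k-mer) pair (assumed encoding, not PSPC itself)."""
    index = kmer_index(k)
    n_positions = len(seq) - k + 1
    vec = [0] * (n_positions * len(index))
    for pos in range(n_positions):
        kmer = seq[pos:pos + k]
        if kmer in index:
            vec[pos * len(index) + index[kmer]] = 1
    return vec


if __name__ == "__main__":
    window = "ACGUA"  # toy sequence, not real data
    print(kmer_composition(window, 1))
    print(sum(position_specific_kmers(window, 3)))
```

With k set to 3, 4, or 5, vectors of this kind could be concatenated with mono- and dinucleotide composition and passed to an SVM. The position-specific vector grows as 4^k per position, which is one reason a kernel method may suit small data sets better than a large network.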
Submitted 2 November, 2023;
originally announced November 2023.
ChronoPscychosis: Temporal Segmentation and Its Impact on Schizophrenia Classification Using Motor Activity Data
Authors:
Pradnya Rajendra Jadhav,
Raviprasad Aduri
Abstract:
Schizophrenia is a complicated mental illness characterized by a broad spectrum of symptoms affecting cognition, behavior, and emotion. Identifying reliable biomarkers to classify Schizophrenia accurately remains a challenge in psychiatry. We investigate temporal patterns in motor activity data as a potential key to improving the classification of individuals with Schizophrenia, using a dataset of motor activity recordings from 22 Schizophrenia patients and 32 control subjects. The dataset contains per-minute motor activity measurements collected over an average of 12.7 consecutive days for each participant. We divide each day into segments (twelve, eight, six, four, three, and two parts) and evaluate the impact of each division on classification. We compute sixteen statistical features within these temporal segments and train seven machine learning models on them to gain deeper insights. The LightGBM model outperforms the other six models. Our results indicate that temporal segmentation significantly improves classification, from AUC-ROC = 0.93 and F1 score = 0.84 (LightGBM without any segmentation) to AUC-ROC = 0.98 and F1 score = 0.93 (LightGBM with segmentation). Distinguishing between diurnal and nocturnal segments amplifies the differences between Schizophrenia patients and controls. However, further subdivision into smaller time segments does not affect the AUC-ROC significantly. Partitioning into morning, afternoon, evening, and night gives classification performance similar to day-night partitioning. These findings are valuable because they indicate that extensive temporal segmentation beyond distinguishing between day and night does not yield substantial improvement, offering an efficient approach for further classification, early diagnosis, and monitoring of Schizophrenia.
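The segmentation step described above can be sketched as follows. This is a minimal illustration only: it assumes each day is a list of 1440 per-minute activity counts starting at midnight, and the four statistics shown are placeholders, since the abstract does not list the sixteen features used. The names `segment_day` and `segment_features` are hypothetical.

```python
from statistics import mean, pstdev

MINUTES_PER_DAY = 1440  # per-minute recordings, one full day


def segment_day(day_counts, n_segments):
    """Split one day of per-minute counts into equal, contiguous segments."""
    if len(day_counts) != MINUTES_PER_DAY:
        raise ValueError("expected 1440 per-minute values")
    if MINUTES_PER_DAY % n_segments != 0:
        raise ValueError("n_segments must divide 1440")
    size = MINUTES_PER_DAY // n_segments
    return [day_counts[i * size:(i + 1) * size] for i in range(n_segments)]


def segment_features(segment):
    """A few illustrative statistics for one segment (assumed, not the paper's set)."""
    return {
        "mean": mean(segment),
        "std": pstdev(segment),
        "max": max(segment),
        "zero_frac": sum(1 for v in segment if v == 0) / len(segment),
    }


if __name__ == "__main__":
    # Toy day: no activity in the first half, constant activity in the second.
    day = [0] * 720 + [10] * 720
    for n in (2, 3, 4, 6, 8, 12):  # segment counts named in the abstract
        rows = [segment_features(s) for s in segment_day(day, n)]
        print(n, [r["mean"] for r in rows])
```

Each participant-day then yields `n_segments` feature dictionaries, which could be flattened into one row for a classifier such as LightGBM. Equal splits from midnight are a simplification; a true day-night split would need explicit boundary times.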
Submitted 21 November, 2023;
originally announced November 2023.