Showing 1–2 of 2 results for author: Suarez-Farinas, M

Search v0.5.6 released 2020-02-24

arXiv:2008.05040 [pdf]

stat.AP

GEE-TGDR: A longitudinal feature selection algorithm and its application to lncRNA expression profiles for psoriasis patients treated with immune therapies

Authors: Suyan Tian, Chi Wang, Mayte Suarez-Farinas

Abstract: With the fast evolution of high-throughput technology, longitudinal gene expression experiments have become affordable and increasingly common in biomedical fields. The generalized estimating equation (GEE) approach is a widely used statistical method for the analysis of longitudinal data. Feature selection is imperative in longitudinal omics data analysis. Among the variety of existing feature selection methods, an embedded method, namely threshold gradient descent regularization (TGDR), stands out due to its excellent characteristics. Combining GEE with TGDR is a promising approach for identifying relevant markers that can explain the dynamic changes of outcomes across time. In this study, we propose a novel feature selection algorithm for longitudinal outcomes: GEE-TGDR. In the GEE-TGDR method, the quasi-likelihood function of a GEE model is the objective function to be optimized, and both the optimization and the feature selection are accomplished by the TGDR method. We applied the GEE-TGDR method to a longitudinal lncRNA gene expression dataset that examined the treatment response of psoriasis patients to immune therapy. Under different working correlation structures, a list of 10 relevant lncRNAs was identified, with a predictive accuracy of 80% and a meaningful biological interpretation. To conclude, a widespread application of the proposed GEE-TGDR method in omics data analysis is anticipated.

Submitted 11 August, 2020; originally announced August 2020.

arXiv:1307.5576 [pdf]

stat.ME

doi 10.1371/journal.pone.0078302

Multi-TGDR: a regularization method for multi-class classification in microarray experiments

Authors: Suyan Tian, Mayte Suárez-Fariñas

Abstract: Background: With microarray technology becoming mature and popular, the selection and use of a small number of relevant genes for accurate classification of samples is a hot topic in biostatistics and bioinformatics. However, most of the developed algorithms lack the ability to handle multiple classes, which is arguably a common application. Here, we propose an extension to an existing regularization algorithm called Threshold Gradient Descent Regularization (TGDR) to specifically tackle multi-class classification of microarray data. When there are several microarray experiments addressing the same or similar objectives, one option is to use the meta-analysis version of TGDR (Meta-TGDR), which treats the classification task as a combination of classifiers with the same structure/model while allowing the parameters to vary across studies. However, the original Meta-TGDR extension did not offer a solution for prediction on independent samples. Here, we propose an explicit method to estimate the overall coefficients of the biomarkers selected by Meta-TGDR. This extension permits broader applicability and allows a comparison between the predictive performance of Meta-TGDR and TGDR using an independent testing set. Results: Using real-world applications, we demonstrated that the proposed multi-TGDR framework works well and that the number of selected genes is smaller than the sum over all individual binary TGDRs. Additionally, Meta-TGDR and TGDR on the batch-effect-adjusted pooled data provided approximately the same results. Adding a bagging procedure in each application ensures stability and good predictive performance. Conclusions: Compared with Meta-TGDR, TGDR is less computationally intensive and does not require samples of all classes in each study. On the adjusted data, it has approximately the same predictive performance as Meta-TGDR. Thus, it is highly recommended.

Submitted 21 July, 2013; originally announced July 2013.
