Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction
Authors:
Jian Zhou,
Olga G. Troyanskaya
Abstract:
Predicting protein secondary structure is a fundamental problem in protein structure prediction. Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical representations. GSN is a recently proposed deep learning technique (Bengio & Thibodeau-Laufer, 2013) to globally train deep generative model. We present the sup…
▽ More
Predicting protein secondary structure is a fundamental problem in protein structure prediction. Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical representations. GSN is a recently proposed deep learning technique (Bengio & Thibodeau-Laufer, 2013) to globally train deep generative model. We present the supervised extension of GSN, which learns a Markov chain to sample from a conditional distribution, and applied it to protein structure prediction. To scale the model to full-sized, high-dimensional data, like protein sequences with hundreds of amino acids, we introduce a convolutional architecture, which allows efficient learning across multiple layers of hierarchical representations. Our architecture uniquely focuses on predicting structured low-level labels informed with both low and high-level representations learned by the model. In our application this corresponds to labeling the secondary structure state of each amino-acid residue. We trained and tested the model on separate sets of non-homologous proteins sharing less than 30% sequence identity. Our model achieves 66.4% Q8 accuracy on the CB513 dataset, better than the previously reported best performance 64.9% (Wang et al., 2011) for this challenging secondary structure prediction problem.
△ Less
Submitted 6 March, 2014;
originally announced March 2014.
Map** Dynamic Histone Acetylation Patterns to Gene Expression in Nanog-depleted Murine Embryonic Stem Cells
Authors:
Florian Markowetz,
Klaas W Mulder,
Edoardo M Airoldi,
Ihor R Lemischka,
Olga G Troyanskaya
Abstract:
Embryonic stem cells (ESC) have the potential to self-renew indefinitely and to differentiate into any of the three germ layers. The molecular mechanisms for self-renewal, maintenance of pluripotency and lineage specification are poorly understood, but recent results point to a key role for epigenetic mechanisms. In this study, we focus on quantifying the impact of histone 3 acetylation (H3K9,14ac…
▽ More
Embryonic stem cells (ESC) have the potential to self-renew indefinitely and to differentiate into any of the three germ layers. The molecular mechanisms for self-renewal, maintenance of pluripotency and lineage specification are poorly understood, but recent results point to a key role for epigenetic mechanisms. In this study, we focus on quantifying the impact of histone 3 acetylation (H3K9,14ac) on gene expression in murine embryonic stem cells. We analyze genome-wide histone acetylation patterns and gene expression profiles measured over the first five days of cell differentiation triggered by silencing Nanog, a key transcription factor in ESC regulation. We explore the temporal and spatial dynamics of histone acetylation data and its correlation with gene expression using supervised and unsupervised statistical models. On a genome-wide scale, changes in acetylation are significantly correlated to changes in mRNA expression and, surprisingly, this coherence increases over time. We quantify the predictive power of histone acetylation for gene expression changes in a balanced cross-validation procedure. In an in-depth study we focus on genes central to the regulatory network of Mouse ESC, including those identified in a recent genome-wide RNAi screen and in the PluriNet, a computationally derived stem cell signature. We find that compared to the rest of the genome, ESC-specific genes show significantly more acetylation signal and a much stronger decrease in acetylation over time, which is often not reflected in an concordant expression change. These results shed light on the complexity of the relationship between histone acetylation and gene expression and are a step forward to dissect the multilayer regulatory mechanisms that determine stem cell fate.
△ Less
Submitted 15 October, 2010;
originally announced October 2010.