
Showing 1–6 of 6 results for author: Yang, K K

Searching in archive cs.
  1. arXiv:2209.15611

    q-bio.BM cs.AI

    Protein structure generation via folding diffusion

    Authors: Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. Zou, Alex X. Lu, Ava P. Amini

    Abstract: The ability to computationally generate novel yet physically foldable protein structures could lead to new biological discoveries and new treatments targeting yet incurable diseases. Despite recent advances in protein structure prediction, directly generating diverse, novel protein structures from neural networks remains difficult. In this work, we present a new diffusion-based generative model th…

    Submitted 23 November, 2022; v1 submitted 30 September, 2022; originally announced September 2022.

    ACM Class: I.2.0; J.3

  2. arXiv:2206.06583

    q-bio.QM cs.AI

    Exploring evolution-aware & -free protein language models as protein function predictors

    Authors: Mingyang Hu, Fajie Yuan, Kevin K. Yang, Fusong Ju, ** Su, Hui Wang, Fei Yang, Qiuyang Ding

    Abstract: Large-scale Protein Language Models (PLMs) have improved performance in protein prediction tasks, ranging from 3D structure prediction to various function predictions. In particular, AlphaFold, a ground-breaking AI system, could potentially reshape structural biology. However, the utility of the PLM module in AlphaFold, Evoformer, has not been explored beyond structure prediction. In this paper, w…

    Submitted 16 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  3. Machine learning modeling of family wide enzyme-substrate specificity screens

    Authors: Samuel Goldman, Ria Das, Kevin K. Yang, Connor W. Coley

    Abstract: Biocatalysis is a promising approach to sustainably synthesize pharmaceuticals, complex natural products, and commodity chemicals at scale. However, the adoption of biocatalysis is limited by our ability to select enzymes that will catalyze their natural chemical transformation on non-natural substrates. While machine learning and in silico directed evolution are well-posed for this predictive mod…

    Submitted 8 September, 2021; originally announced September 2021.

  4. arXiv:2106.05466

    q-bio.QM cs.LG q-bio.BM

    Adaptive machine learning for protein engineering

    Authors: Brian L. Hie, Kevin K. Yang

    Abstract: Machine-learning models that learn from data to predict how protein sequence encodes function are emerging as a useful protein engineering tool. However, when using these models to suggest new protein designs, one must deal with the vast combinatorial complexity of protein sequences. Here, we review how to use a sequence-to-function machine-learning surrogate model to select sequences for experime…

    Submitted 6 July, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 9 pages, 2 figures

  5. arXiv:2104.04457

    q-bio.QM cs.LG q-bio.BM stat.ML

    Protein sequence design with deep generative models

    Authors: Zachary Wu, Kadina E. Johnston, Frances H. Arnold, Kevin K. Yang

    Abstract: Protein engineering seeks to identify protein sequences with optimized properties. When guided by machine learning, protein sequence generation methods can draw on prior knowledge and experimental efforts to improve this process. In this review, we highlight recent applications of machine learning to generate protein sequences, focusing on the emerging field of deep generative methods.

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: 11 pages, 2 figures

  6. arXiv:1904.08102

    cs.LG q-bio.QM stat.ML

    Batched Stochastic Bayesian Optimization via Combinatorial Constraints Design

    Authors: Kevin K. Yang, Yuxin Chen, Alycia Lee, Yisong Yue

    Abstract: In many high-throughput experimental design settings, such as those common in biochemical engineering, batched queries are more cost effective than one-by-one sequential queries. Furthermore, it is often not possible to directly choose items to query. Instead, the experimenter specifies a set of constraints that generates a library of possible items, which are then selected stochastically. Motivat…

    Submitted 17 April, 2019; originally announced April 2019.