Search | arXiv e-print repository

doi 10.1016/j.compbiomed.2022.106499

Segmentation based tracking of cells in 2D+time microscopy images of macrophages

Authors: Seol Ah Park, Tamara Sipka, Zuzana Kriva, George Lutfalla, Mai Nguyen-Chi, Karol Mikula

Abstract: The automated segmentation and tracking of macrophages during their migration are challenging tasks due to their dynamically changing shapes and motions. This paper proposes a new algorithm to achieve automatic cell tracking in time-lapse microscopy macrophage data. First, we design a segmentation method employing space-time filtering, local Otsu's thresholding, and the SUBSURF (subjective surface segmentation) method. Next, the partial trajectories for cells overlapping in the temporal direction are extracted in the segmented images. Finally, the extracted trajectories are linked by considering their direction of movement. The segmented images and the obtained trajectories from the proposed method are compared with those of the semi-automatic segmentation and manual tracking. The proposed tracking achieved 97.4% accuracy for macrophage data under challenging situations, such as feeble fluorescent intensity, irregular shapes, and motion of macrophages. We expect that the automatically extracted trajectories of macrophages can provide evidence of how macrophages migrate depending on their polarization modes in situations such as wound healing.

Submitted 2 January, 2023; originally announced January 2023.

Comments: Computers in Biology and Medicine, Volume 153, 106499,(2023)
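
The abstract above names Otsu's thresholding as one segmentation step. As a rough illustration only, the sketch below implements plain global Otsu thresholding on a flat list of integer gray levels. The paper's local variant, the space-time filtering, and SUBSURF are not reproduced here, and the function name `otsu_threshold` and the `levels` parameter are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form the lower class, the rest the upper class.
    Illustrative global Otsu only; not the local variant used in the paper.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2  # unnormalized between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

A local version would typically apply the same criterion inside a sliding window around each pixel rather than over the whole image.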

arXiv:2212.10567 [pdf, other]

Anticancer Peptides Classification using Kernel Sparse Representation Classifier

Authors: Ehtisham Fazal, Muhammad Sohail Ibrahim, Seongyong Park, Imran Naseem, Abdul Wahab

Abstract: Cancer is one of the most challenging diseases because of its complexity, variability, and diversity of causes. It has been one of the major research topics over the past decades, yet it is still poorly understood. To this end, multifaceted therapeutic frameworks are indispensable. Anticancer peptides (ACPs) are the most promising treatment option, but their large-scale identification and synthesis require reliable prediction methods, which is still a problem. In this paper, we present an intuitive classification strategy that differs from the traditional black box method and is based on the well-known statistical theory of sparse-representation classification (SRC). Specifically, we create over-complete dictionary matrices by embedding the composition of the K-spaced amino acid pairs (CKSAAP). Unlike the traditional SRC frameworks, we use an efficient matching pursuit solver instead of the computationally expensive basis pursuit solver in this strategy. Furthermore, the kernel principal component analysis (KPCA) is employed to cope with non-linearity and dimension reduction of the feature space whereas the synthetic minority oversampling technique (SMOTE) is used to balance the dictionary. The proposed method is evaluated on two benchmark datasets for well-known statistical parameters and is found to outperform the existing methods. The results show the highest sensitivity with the most balanced accuracy, which might be beneficial in understanding structural and chemical aspects and developing new ACPs. The Google-Colab implementation of the proposed method is available at the author's GitHub page (https://github.com/ehtisham-Fazal/ACP-Kernel-SRC).

Submitted 19 December, 2022; originally announced December 2022.
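
The entry above builds its dictionaries from the composition of k-spaced amino acid pairs (CKSAAP). The following is a minimal sketch of how such a feature vector is commonly computed for one gap size. It is not taken from the authors' repository: the function name `cksaap`, the choice to skip pairs containing non-standard symbols, and the normalization by the number of counted pairs are assumptions of this example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def cksaap(sequence: str, k: int) -> dict[str, float]:
    """Normalized frequency of each residue pair separated by k positions.

    Hypothetical helper for illustration; returns 400 features for one gap k.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # assumption: ignore pairs with non-standard symbols
            counts[pair] += 1
            total += 1
    if total == 0:
        return {p: 0.0 for p in pairs}
    return {p: c / total for p, c in counts.items()}
```

Concatenating the outputs for several gap sizes (for example k = 0, 1, 2) gives the longer feature vectors typically fed to a classifier.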

arXiv:2202.04773 [pdf, other]

A Neural Network Model of Continual Learning with Cognitive Control

Authors: Jacob Russin, Maryam Zolfaghar, Seongmin A. Park, Erie Boorman, Randall C. O'Reilly

Abstract: Neural networks struggle in continual learning settings from catastrophic forgetting: when trials are blocked, new learning can overwrite the learning from previous blocks. Humans learn effectively in these settings, in some cases even showing an advantage of blocking, suggesting the brain contains mechanisms to overcome this problem. Here, we build on previous work and show that neural networks equipped with a mechanism for cognitive control do not exhibit catastrophic forgetting when trials are blocked. We further show an advantage of blocking over interleaving when there is a bias for active maintenance in the control signal, implying a tradeoff between maintenance and the strength of control. Analyses of map-like representations learned by the networks provided additional insights into these mechanisms. Our work highlights the potential of cognitive control to aid continual learning in neural networks, and offers an explanation for the advantage of blocking that has been observed in humans.

Submitted 3 November, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: 7 pages, 5 figures, paper accepted as a talk to CogSci 2022 (https://escholarship.org/uc/item/3gn3w58z)

Journal ref: CogSci 2022, 44

arXiv:2202.03886 [pdf]

Integration of Clinical, Biological, and Computational Perspectives to Support Cerebral Autoregulatory Informed Clinical Decision Making: Decomposing Cerebral Autoregulation using Mechanistic Timescales to Support Clinical Decision-Making

Authors: J. K. Briggs, J. N. Stroh, T. D. Bennett, S. Park, D. J. Albers

Abstract: Adequate brain perfusion is required for proper brain function and life. Maintaining optimal brain perfusion to avoid secondary brain injury is one of the main concerns of neurocritical care. Cerebral autoregulation is responsible for maintaining optimal brain perfusion despite pressure derangements. Knowledge of cerebral autoregulatory function should be a key factor in clinical decision-making, yet it is often insufficiently and incorrectly applied. Multiple physiologic mechanisms impact cerebral autoregulation, each of which operates on potentially different and incompletely understood timescales, confounding conclusions drawn from observations. Because of such complexities, clinical conceptualization of cerebral autoregulation has been distilled into practical indices defined by multimodal neuromonitoring, which removes mechanistic information and limits decision options. The next step towards cerebral autoregulatory-informed clinical decision-making is to quantify cerebral autoregulation mechanistically, which requires decomposing cerebral autoregulation into its fundamental processes and partitioning those processes into the timescales at which each operates. In this review, we scrutinize biologically, clinically, and computationally focused literature to build a timescales-based framework around cerebral autoregulation. This new framework will allow us to quantify mechanistic interactions and directly infer which mechanism(s) are functioning based only on current monitoring equipment, paving the way for a new frontier in cerebral autoregulatory-informed clinical decision-making.

Submitted 7 February, 2022; originally announced February 2022.

Comments: 29 pages total, Main document is 14 pages, 2 figures, 2 tables, Review Article

arXiv:2201.00654 [pdf, other]

doi 10.1088/1751-8121/ac60e7

Bayesian inference of scaled versus fractional Brownian motion

Authors: Samudrajit Thapa, Seongyu Park, Yeongjin Kim, Jae-Hyung Jeon, Ralf Metzler, Michael A. Lomholt

Abstract: We present a Bayesian inference scheme for scaled Brownian motion, and investigate its performance on synthetic data for parameter estimation and model selection in a combined inference with fractional Brownian motion. We include the possibility of measurement noise in both models. We find that for trajectories of a few hundred time points the procedure is able to resolve well the true model and parameters. Using the prior of the synthetic data generation process also for the inference, the approach is optimal based on decision theory. We include a comparison with inference using a prior different from the data generating one.

Submitted 12 May, 2022; v1 submitted 30 December, 2021; originally announced January 2022.

Comments: 22 pages, 12 figures, IOP LaTeX, minor revisions

Journal ref: J. Phys. A: Math. Theor. 55, 194003 (2022)

arXiv:2112.12582 [pdf]

Beyond Low Earth Orbit: Biological Research, Artificial Intelligence, and Self-Driving Labs

Authors: Lauren M. Sanders, Jason H. Yang, Ryan T. Scott, Amina Ann Qutub, Hector Garcia Martin, Daniel C. Berrios, Jaden J. A. Hastings, Jon Rask, Graham Mackintosh, Adrienne L. Hoarfrost, Stuart Chalk, John Kalantari, Kia Khezeli, Erik L. Antonsen, Joel Babdor, Richard Barker, Sergio E. Baranzini, Afshin Beheshti, Guillermo M. Delgado-Aparicio, Benjamin S. Glicksberg, Casey S. Greene, Melissa Haendel, Arif A. Hamid, Philip Heller, Daniel Jamieson, et al. (31 additional authors not shown)

Abstract: Space biology research aims to understand fundamental effects of spaceflight on organisms, develop foundational knowledge to support deep space exploration, and ultimately bioengineer spacecraft and habitats to stabilize the ecosystem of plants, crops, microbes, animals, and humans for sustained multi-planetary life. To advance these aims, the field leverages experiments, platforms, data, and model organisms from both spaceborne and ground-analog studies. As research is extended beyond low Earth orbit, experiments and platforms must be maximally autonomous, light, agile, and intelligent to expedite knowledge discovery. Here we present a summary of recommendations from a workshop organized by the National Aeronautics and Space Administration on artificial intelligence, machine learning, and modeling applications which offer key solutions toward these space biology challenges. In the next decade, the synthesis of artificial intelligence into the field of space biology will deepen the biological understanding of spaceflight effects, facilitate predictive modeling and analytics, support maximally autonomous and reproducible experiments, and efficiently manage spaceborne data and metadata, all with the goal to enable life to thrive in deep space.

Submitted 22 December, 2021; originally announced December 2021.

Comments: 28 pages, 4 figures

arXiv:2112.12554 [pdf]

Beyond Low Earth Orbit: Biomonitoring, Artificial Intelligence, and Precision Space Health

Authors: Ryan T. Scott, Erik L. Antonsen, Lauren M. Sanders, Jaden J. A. Hastings, Seung-min Park, Graham Mackintosh, Robert J. Reynolds, Adrienne L. Hoarfrost, Aenor Sawyer, Casey S. Greene, Benjamin S. Glicksberg, Corey A. Theriot, Daniel C. Berrios, Jack Miller, Joel Babdor, Richard Barker, Sergio E. Baranzini, Afshin Beheshti, Stuart Chalk, Guillermo M. Delgado-Aparicio, Melissa Haendel, Arif A. Hamid, Philip Heller, Daniel Jamieson, Katelyn J. Jarvis, et al. (31 additional authors not shown)

Abstract: Human space exploration beyond low Earth orbit will involve missions of significant distance and duration. To effectively mitigate myriad space health hazards, paradigm shifts in data and space health systems are necessary to enable Earth-independence, rather than Earth-reliance. Promising developments in the fields of artificial intelligence and machine learning for biology and health can address these needs. We propose an appropriately autonomous and intelligent Precision Space Health system that will monitor, aggregate, and assess biomedical statuses; analyze and predict personalized adverse health outcomes; adapt and respond to newly accumulated data; and provide preventive, actionable, and timely insights to individual deep space crew members and iterative decision support to their crew medical officer. Here we present a summary of recommendations from a workshop organized by the National Aeronautics and Space Administration, on future applications of artificial intelligence in space biology and health. In the next decade, biomonitoring technology, biomarker science, spacecraft hardware, intelligent software, and streamlined data management must mature and be woven together into a Precision Space Health system to enable humanity to thrive in deep space.

Submitted 22 December, 2021; originally announced December 2021.

Comments: 31 pages, 4 figures

arXiv:2108.13486 [pdf]

Automated Tracking of Primate Behavior

Authors: Benjamin Hayden, Hyun Soo Park, Jan Zimmermann

Abstract: Understanding primate behavior is a mission-critical goal of both biology and biomedicine. Despite the importance of behavior, our ability to rigorously quantify it has heretofore been limited to low-information measures like preference, looking time, and reaction time, or to non-scalable measures like ethograms. However, recent technological advances have led to a major revolution in behavioral measurement. Specifically, digital video cameras and automated pose tracking software can provide detailed measures of full body position (i.e., pose) of multiple primates over time (i.e., behavior) with high spatial and temporal resolution. Pose-tracking technology in turn can be used to detect behavioral states, such as eating, sleeping, and mating. The availability of such data has in turn spurred developments in data analysis techniques. Together, these changes are poised to lead to major advances in scientific fields that rely on behavior as a dependent variable. In this review, we situate the tracking revolution in the history of the study of behavior, argue for investment in and development of analytical and research techniques that can profit from the advent of the era of big behavior, and propose that zoos will have a central role to play in this era.

Submitted 30 August, 2021; originally announced August 2021.

Comments: Invited manuscript to AJP

arXiv:2107.05390 [pdf, other]

Bayesian inference of Lévy walks via hidden Markov models

Authors: Seongyu Park, Samudrajit Thapa, Yeongjin Kim, Michael A. Lomholt, Jae-Hyung Jeon

Abstract: The Lévy walk is a non-Brownian random walk model that has been found to describe anomalous dynamic phenomena in diverse fields ranging from biology over quantum physics to ecology. Recurrently occurring problems are to examine whether observed data are successfully quantified by a model classified as Lévy walks or not and to extract the best model parameters in accordance with the data. Motivated by such needs, we propose a hidden Markov model for Lévy walks and computationally realize and test the corresponding Bayesian inference method. We introduce a Markovian decomposition scheme to approximate a renewal process governed by a power-law waiting time distribution. Using this, we construct the likelihood function of Lévy walks based on a hidden Markov model and the forward algorithm. With the Lévy walk trajectories simulated at various conditions, we perform the Bayesian inference for parameter estimation and model classification. We show that the power-law exponent of the flight-time distribution can be successfully extracted even when the mean-squared displacement does not display the expected scaling exponent due to noise or insufficient trajectory length. It is also demonstrated that the Bayesian method performs remarkably well in inferring the Lévy walk trajectories from a given unclassified trajectory data set if the noise level is moderate.

Submitted 8 July, 2021; originally announced July 2021.
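
The abstract above builds its likelihood from a hidden Markov model and the forward algorithm. For readers unfamiliar with that step, here is a generic forward-algorithm sketch for a small discrete HMM. It does not implement the paper's Markovian decomposition of power-law waiting times; the function name `forward_likelihood` and the toy parameter layout are assumptions of this example.

```python
def forward_likelihood(init, trans, emit, observations):
    """Likelihood P(observations) of a discrete HMM via the forward algorithm.

    init[i]     : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    Generic textbook recursion, shown for illustration only.
    """
    n = len(init)
    # alpha[i] = P(observations so far, current hidden state = i)
    alpha = [init[i] * emit[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return sum(alpha)
```

For long trajectories the recursion is usually run with rescaling or in log space to avoid numerical underflow; that refinement is omitted here for brevity.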

arXiv:2106.16219 [pdf]

Do grid codes afford generalization and flexible decision-making?

Authors: Linda Q. Yu, Seongmin A. Park, Sarah C. Sweigart, Erie D. Boorman, Matthew R. Nassar

Abstract: Behavioral flexibility is learning from previous experiences and planning appropriate actions in a changing or novel environment. Successful behavioral adaptation depends on internal models the brain builds to represent the relational structure of an abstract task. Emerging evidence suggests that the well-known roles of the hippocampus and entorhinal cortex (HC-EC) in integrating spatial relationships into cognitive maps can be extended to map the transition structure between states in non-spatial abstract tasks. However, what the EC grid-codes actually compute to afford generalization remains elusive. We introduce two non-exclusive ideas regarding what grid-codes may represent to afford higher-level cognition. One idea is that grid-codes are eigenvectors of the successor representation (SR) learned online during a task. This view assumes that the grid codes serve as an efficient basis function for learning and representing experienced relationships between entities. Subsequently, the grid codes facilitate generalization in novel contexts such as when the goal changes. The second idea is that the grid-codes reflect the inferred global task structure. This view assumes that the grid-code represents a structural code that is factorized from specific sensory content, enabling structural information to be transferred across tasks. Subsequently, the brain could afford one-shot inferences without requiring experience. The ability to generalize experiences and make appropriate decisions in novel situations is critical for both animals and machines. Here we review proposed computations of the grid-code in the brain, which is potentially critical to behavioral flexibility.

Submitted 30 June, 2021; originally announced June 2021.

arXiv:2105.08944 [pdf, other]

Complementary Structure-Learning Neural Networks for Relational Reasoning

Authors: Jacob Russin, Maryam Zolfaghar, Seongmin A. Park, Erie Boorman, Randall C. O'Reilly

Abstract: The neural mechanisms supporting flexible relational inferences, especially in novel situations, are a major focus of current research. In the complementary learning systems framework, pattern separation in the hippocampus allows rapid learning in novel environments, while slower learning in neocortex accumulates small weight changes to extract systematic structure from well-learned environments. In this work, we adapt this framework to a task from a recent fMRI experiment where novel transitive inferences must be made according to implicit relational structure. We show that computational models capturing the basic cognitive properties of these two systems can explain relational transitive inferences in both familiar and novel environments, and reproduce key phenomena observed in the fMRI experiment.

Submitted 19 May, 2021; originally announced May 2021.

Comments: 7 pages, 4 figures, Accepted to CogSci 2021 for poster presentation

arXiv:2105.06766 [pdf, other]

doi 10.1038/s41467-021-26320-w

Objective comparison of methods to decode anomalous diffusion

Authors: Gorka Muñoz-Gil, Giovanni Volpe, Miguel Angel Garcia-March, Erez Aghion, Aykut Argun, Chang Beom Hong, Tom Bland, Stefano Bo, J. Alberto Conejero, Nicolás Firbas, Òscar Garibo i Orts, Alessia Gentili, Zihan Huang, Jae-Hyung Jeon, Hélène Kabbech, Yeongjin Kim, Patrycja Kowalek, Diego Krapf, Hanna Loch-Olszewska, Michael A. Lomholt, Jean-Baptiste Masson, Philipp G. Meyer, Seongyu Park, Borja Requena, Ihor Smal, et al. (9 additional authors not shown)

Abstract: Deviations from Brownian motion leading to anomalous diffusion are ubiquitously found in transport dynamics, playing a crucial role in phenomena from quantum physics to life sciences. The detection and characterization of anomalous diffusion from the measurement of an individual trajectory are challenging tasks, which traditionally rely on calculating the mean squared displacement of the trajectory. However, this approach breaks down for cases of important practical interest, e.g., short or noisy trajectories, ensembles of heterogeneous trajectories, or non-ergodic processes. Recently, several new approaches have been proposed, mostly building on the ongoing machine-learning revolution. Aiming to perform an objective comparison of methods, we gathered the community and organized an open competition, the Anomalous Diffusion challenge (AnDi). Participating teams independently applied their own algorithms to a commonly-defined dataset including diverse conditions. Although no single method performed best across all scenarios, the results revealed clear differences between the various approaches, providing practical advice for users and a benchmark for developers.

Submitted 14 May, 2021; originally announced May 2021.

Comments: 63 pages, 5 main figures, 1 table, 28 supplementary figures. Website: http://www.andi-challenge.org
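
The entry above refers to the traditional approach of estimating anomalous diffusion from the mean squared displacement of a single trajectory. A minimal sketch of that baseline, assuming a one-dimensional trajectory sampled at unit time steps, is given below. The function names `tamsd` and `anomalous_exponent` are illustrative and are not part of the AnDi challenge code.

```python
import math


def tamsd(x, lag):
    """Time-averaged mean squared displacement of a 1D trajectory at one lag."""
    n = len(x) - lag
    return sum((x[i + lag] - x[i]) ** 2 for i in range(n)) / n


def anomalous_exponent(x, max_lag):
    """Least-squares slope of log TAMSD versus log lag, an estimate of alpha.

    Requires max_lag >= 2 and a trajectory that actually moves,
    otherwise the logarithms or the fit are undefined.
    """
    lags = range(1, max_lag + 1)
    log_lag = [math.log(lag) for lag in lags]
    log_msd = [math.log(tamsd(x, lag)) for lag in lags]
    mean_l = sum(log_lag) / len(log_lag)
    mean_m = sum(log_msd) / len(log_msd)
    num = sum((a - mean_l) * (b - mean_m) for a, b in zip(log_lag, log_msd))
    den = sum((a - mean_l) ** 2 for a in log_lag)
    return num / den
```

For a purely ballistic trajectory the fitted exponent is 2, while ordinary Brownian motion gives a value near 1; as the abstract notes, short or noisy trajectories make this estimate unreliable.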

arXiv:2011.08719 [pdf, other]

GSSMD: A new standardized effect size measure to improve robustness and interpretability in biological applications

Authors: Seongyong Park, Shujaat Khan, Muhammad Moinuddin, Ubaid M. Al-Saggaf

Abstract: In many biological applications, the primary objective of a study is to quantify the magnitude of the treatment effect between two groups. Cohen's d or the strictly standardized mean difference (SSMD) can be used to measure effect size; however, these measures are sensitive to violations of the normality assumption. Here, we propose an alternative standardized effect size measure to improve robustness and interpretability, based on the overlap between two sample distributions. The proposed method is a non-parametric generalized variant of SSMD. We characterized the proposed measure in various simulation settings to illustrate its behavior. We also investigated finite sample properties of the effect size estimate and drew some guidelines. As a case study, we applied our measure to the hit selection problem in an RNAi experiment and showed the superiority of the proposed method.

Submitted 14 November, 2020; originally announced November 2020.

Comments: Accepted in International Conference on Bioinformatics and Biomedicine (BIBM) 2020. arXiv admin note: text overlap with arXiv:2001.06384
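
The abstract above contrasts its proposal with Cohen's d and SSMD. The sketch below shows the conventional sample estimates of those two baseline measures for independent groups, assuming the usual textbook definitions (pooled standard deviation for Cohen's d, square root of the summed variances for SSMD). It does not implement the proposed GSSMD, and the function names are chosen for this example.

```python
import math
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def ssmd(a, b):
    """SSMD for independent groups: mean difference over sqrt(var_a + var_b)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) + variance(b))
```

Both estimates rely on means and variances, which is why heavy tails or outliers can distort them, the sensitivity that motivates an overlap-based alternative.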

arXiv:2011.00034 [pdf, other]

Adaptive Semi-Supervised Intent Inferral to Control a Powered Hand Orthosis for Stroke

Authors: Jingxi Xu, Cassie Meeker, Ava Chen, Lauren Winterbottom, Michaela Fraser, Sangwoo Park, Lynne M. Weber, Mitchell Miya, Dawn Nilsen, Joel Stein, Matei Ciocarlie

Abstract: In order to provide therapy in a functional context, controls for wearable robotic orthoses need to be robust and intuitive. We have previously introduced an intuitive, user-driven, EMG-based method to operate a robotic hand orthosis, but the process of training a control that is robust to concept drift (changes in the input signal) places a substantial burden on the user. In this paper, we explore semi-supervised learning as a paradigm for controlling a powered hand orthosis for stroke subjects. To the best of our knowledge, this is the first use of semi-supervised learning for an orthotic application. Specifically, we propose a disagreement-based semi-supervision algorithm for handling intrasession concept drift based on multimodal ipsilateral sensing. We evaluate the performance of our algorithm on data collected from five stroke subjects. Our results show that the proposed algorithm helps the device adapt to intrasession drift using unlabeled data and reduces the training burden placed on the user. We also validate the feasibility of our proposed algorithm with a functional task; in these experiments, two subjects successfully completed multiple instances of a pick-and-handover task.

Submitted 1 March, 2022; v1 submitted 30 October, 2020; originally announced November 2020.

Comments: 7 pages; Accepted to International Conference on Robotics and Automation (ICRA) 2022

arXiv:2007.12073 [pdf, other]

E3-targetPred: Prediction of E3-Target Proteins Using Deep Latent Space Encoding

Authors: Seongyong Park, Shujaat Khan, Abdul Wahab

Abstract: Understanding E3 ligase and target substrate interactions is important for cell biology and therapeutic development. However, experimental identification of E3 target relationships is not an easy task due to the labor-intensive nature of the experiments. In this article, a sequence-based E3-target prediction model is proposed for the first time. The proposed framework utilizes the composition of k-spaced amino acid pairs (CKSAAP) to learn the relationship between E3 ligases and their target proteins. A class separable latent space encoding scheme is also devised that provides a compressed representation of the feature space. A thorough ablation study is performed to identify an optimal gap size for CKSAAP and the number of latent variables that can represent the E3-target relationship successfully. The proposed scheme is evaluated on an independent dataset for a variety of standard quantitative measures. In particular, it achieves an average accuracy of 70.63% on an independent dataset. The source code and datasets used in the study are available at the author's GitHub page (https://github.com/psychemistz/E3targetPred).

Submitted 26 June, 2020; originally announced July 2020.

Comments: Submitted to IEEE/ACM transactions on computational biology and bioinformatics

arXiv:2005.13112 [pdf]

Organ size increases with obesity and correlates with cancer risk

Authors: Haley Grant, Yifan Zhang, Lu Li, Yan Wang, Satomi Kawamoto, Sophie Pénisson, Daniel F. Fouladi, Shahab Shayesteh, Alejandra Blanco, Saeed Ghandili, Eva Zinreich, Jefferson S. Graves, Seyoun Park, Scott Kern, Jody Hooper, Alan L. Yuille, Elliot K Fishman, Linda Chu, Cristian Tomasetti

Abstract: Obesity significantly increases cancer risk in various organs. Although this has been recognized for decades, the mechanism through which this happens has never been explained. Here, we show that the volumes of kidneys, pancreas, and liver are strongly correlated (median correlation = 0.625; P-value < 10^-47) with the body mass index (BMI) of an individual. We also find a significant relationship between the increase in organ volume and the increase in cancer risk (P-value < 10^-12). These results provide a mechanism explaining why obese individuals have higher cancer risk in several organs: the larger the organ volume, the more cells at risk of becoming cancerous. These findings are important for a better understanding of the effects obesity has on cancer risk and, more generally, for the development of better preventive strategies to limit the mortality caused by obesity.

Submitted 26 May, 2020; originally announced May 2020.

arXiv:2005.12425 [pdf]

doi 10.1242/jeb.224121

Absolute ethanol intake drives ethanol preference in Drosophila

Authors: Scarlet J. Park, William W. Ja

Abstract: Factors that mediate ethanol preference in Drosophila melanogaster are not well understood. A major confound has been the use of diverse methods to estimate ethanol consumption. We measured fly consumptive ethanol preference on base diets varying in nutrients, taste, and ethanol concentration. Both sexes showed ethanol preference that was abolished on high nutrient concentration diets. Additionally, manipulating total food intake without altering the nutritive value of the base diet or the ethanol concentration was sufficient to evoke or eliminate ethanol preference. Absolute ethanol intake and food volume consumed were stronger predictors of ethanol preference than caloric intake or the dietary caloric content. Our findings suggest that the effect of the base diet on ethanol preference is largely mediated by total consumption associated with the delivery medium, which ultimately determines the level of ethanol intake. We speculate that a physiologically relevant threshold for ethanol intake is essential for preferential ethanol consumption.

Submitted 25 May, 2020; originally announced May 2020.

Comments: 11 pages, 2 figures, 1 table. Complete raw data are accessible online (https://github.com/HungryFly/JaLab/raw/master/publications/ethanol_JEB/SI_dataset.xlsx). This version of the manuscript is the original submission, before peer review. The final accepted and published version of this manuscript is available from https://doi.org/10.1242/jeb.224121 J Exp Biol (2020)

arXiv:2002.10849 [pdf, ps, other]

doi 10.1088/1751-8121/ab9780

Distribution of the number of fitness maxima in Fisher's Geometric Model

Authors: Su-Chan Park, Sungmin Hwang, Joachim Krug

Abstract: Fisher's geometric model describes biological fitness landscapes by combining a linear map from the discrete space of genotypes to an $n$-dimensional Euclidean phenotype space with a nonlinear, single-peaked phenotype-fitness map. Genotypes are represented by binary sequences of length $L$, and the phenotypic effects of mutations at different sites are represented by $L$ random vectors drawn from an isotropic Gaussian distribution. Recent work has shown that the interplay between the genotypic and phenotypic levels gives rise to a range of different landscape topographies that can be characterised by the number of local fitness maxima. Extending our previous study of the mean number of local maxima, here we focus on the distribution of the number of maxima when the limit $L \to \infty$ is taken at finite $n$. We identify the typical scale of the number of maxima for general $n$, and determine the full scaled probability density and two point correlation function of maxima for the one-dimensional case. We also elaborate on the close relation of the model to the anti-ferromagnetic Hopfield model with $n$ random continuous pattern vectors, and show that many of our results carry over to this setting. More generally, we expect that our analysis can help to elucidate the fluctuation structure of metastable states in various spin glass problems.

Submitted 26 August, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: 36 pages, 5 figures. Minor corrections

Journal ref: J. Phys. A: Math. Theor. 53, 385601 (2020)
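
The model described in the abstract above can be explored by brute force for small $L$. The sketch below is a minimal illustration under stated assumptions: the quadratic fitness $-|z|^2$ stands in for the unspecified single-peaked map, the wild-type phenotype `offset` is a hypothetical parameter, and the helper names are invented for this example. It is not the authors' analysis, which treats the limit $L \to \infty$ analytically.

```python
import random
from itertools import product


def count_local_maxima(vectors, offset):
    """Count genotypes whose fitness is not exceeded by any single-site mutant."""
    L, n = len(vectors), len(offset)

    def fitness(genotype):
        # Linear genotype-phenotype map, then a single-peaked fitness
        # (assumed here to be minus the squared distance to the origin).
        z = list(offset)
        for bit, vec in zip(genotype, vectors):
            if bit:
                for d in range(n):
                    z[d] += vec[d]
        return -sum(c * c for c in z)

    maxima = 0
    for g in product((0, 1), repeat=L):
        f = fitness(g)
        # All genotypes that differ from g at exactly one site.
        neighbours = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(L))
        if all(fitness(h) <= f for h in neighbours):
            maxima += 1
    return maxima


def random_landscape(L, n, sigma=1.0, seed=0):
    """Draw L mutational effect vectors from an isotropic Gaussian in n dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(L)]
```

The enumeration visits all $2^L$ genotypes, so it is only practical for small $L$; repeating it over many seeds gives an empirical distribution of the number of maxima.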

arXiv:2001.06384 [pdf, other]

GSSMD: New metric for robust and interpretable assay quality assessment and hit selection

Authors: Seongyong Park, Shujaat Khan

Abstract: In high-throughput screening (HTS) campaigns, the Z'-factor and strictly standardized mean difference (SSMD) are commonly used to assess the quality of assays and to select hits. However, these measures are vulnerable to outliers and their performances are highly sensitive to background distributions. Here, we propose an alternative measure for assay quality assessment and hit selection. The proposed method is a non-parametric generalized variant of SSMD (GSSMD). In this paper, we have shown that the proposed method provides a more robust and intuitive way of assay quality assessment and hit selection.

Submitted 20 January, 2020; v1 submitted 17 January, 2020; originally announced January 2020.
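
For reference, the two conventional measures named in the abstract above can be computed as in the sketch below. It uses the standard textbook definitions with sample standard deviations and assumes independent positive and negative control groups; the function names are illustrative. The proposed GSSMD itself is not shown, since the abstract does not define it.

```python
from math import sqrt
from statistics import mean, stdev


def z_prime_factor(positive, negative):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = 3.0 * (stdev(positive) + stdev(negative))
    return 1.0 - spread / abs(mean(positive) - mean(negative))


def ssmd(positive, negative):
    """SSMD for independent groups: mean difference over the pooled spread."""
    return (mean(positive) - mean(negative)) / sqrt(
        stdev(positive) ** 2 + stdev(negative) ** 2
    )
```

Both formulas depend on means and standard deviations, so a single extreme well in either control group can shift them substantially, which is the sensitivity to outliers that the abstract refers to.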

arXiv:1912.05625 [pdf, other]

Pre-Training of Deep Bidirectional Protein Sequence Representations with Structural Information

Authors: Seonwoo Min, Seunghyun Park, Siwon Kim, Hyun-Soo Choi, Byunghan Lee, Sungroh Yoon

Abstract: Bridging the exponentially growing gap between the numbers of unlabeled and labeled protein sequences, several studies adopted semi-supervised learning for protein sequence modeling. In these studies, models were pre-trained with a substantial amount of unlabeled data, and the representations were transferred to various downstream tasks. Most pre-training methods solely rely on language modeling and often exhibit limited performance. In this paper, we introduce a novel pre-training scheme called PLUS, which stands for Protein sequence representations Learned Using Structural information. PLUS consists of masked language modeling and a complementary protein-specific pre-training task, namely same-family prediction. PLUS can be used to pre-train various model architectures. In this work, we use PLUS to pre-train a bidirectional recurrent neural network and refer to the resulting model as PLUS-RNN. Our experimental results demonstrate that PLUS-RNN outperforms other models of similar size pre-trained solely with language modeling in six out of seven widely used protein biology tasks. Furthermore, we present the results from our qualitative interpretation analyses to illustrate the strengths of PLUS-RNN. PLUS provides a novel way to exploit evolutionary relationships among unlabeled proteins and is broadly applicable across a variety of protein biology tasks. We expect that the gap between the numbers of unlabeled and labeled proteins will continue to grow exponentially, and the proposed pre-training method will play a larger role.

Submitted 16 September, 2021; v1 submitted 25 November, 2019; originally announced December 2019.

Comments: Published in IEEE Access 2021 (https://ieeexplore.ieee.org/document/9529198)

arXiv:1911.11948 [pdf, other]

A note on observation processes in epidemic models

Authors: Sang Woo Park, Benjamin M. Bolker

Abstract: Many disease models focus on characterizing the underlying transmission mechanism but make simple, possibly naive assumptions about how infections are reported. In this note, we use a simple deterministic Susceptible-Infected-Removed (SIR) model to compare two common assumptions about disease incidence reports: individuals can report their infection as soon as they become infected or as soon as they recover. We show that incorrect assumptions about the underlying observation processes can bias estimates of the basic reproduction number and lead to overly narrow confidence intervals.

Submitted 26 November, 2019; originally announced November 2019.

Comments: 9 pages, 2 figures
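
The two reporting assumptions compared in the abstract above can be illustrated with a small deterministic SIR simulation. This is a minimal sketch, not the authors' code: the forward-Euler scheme, the parameter values, and the name `simulate_sir` are all assumptions chosen for illustration, with the population normalized to 1.

```python
def simulate_sir(beta=0.5, gamma=0.2, i0=1e-3, dt=0.1, steps=2000):
    """Integrate a deterministic SIR model and return two report series.

    at_infection[k]: cases reported in step k if people report when infected.
    at_recovery[k]:  cases reported in step k if people report when they recover.
    totals[k]:       S + I + R after step k (should stay at 1).
    """
    s, i, r = 1.0 - i0, i0, 0.0
    at_infection, at_recovery, totals = [], [], []
    for _ in range(steps):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        at_infection.append(new_infections)
        at_recovery.append(new_recoveries)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        totals.append(s + i + r)
    return at_infection, at_recovery, totals
```

In this sketch the recovery-based series peaks later than the infection-based one, so fitting a model that assumes the wrong reporting point would misplace the epidemic curve in time.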

arXiv:1907.09738 [pdf]

doi 10.1109/ACCESS.2019.2952098

Robust Nucleus Detection with Partially Labeled Exemplars

Authors: Linqing Feng, Jun Ho Song, Jiwon Kim, Soomin Jeong, Jin Sung Park, Jinhyun Kim

Abstract: Quantitative analysis of cell nuclei in microscopic images is an essential yet challenging source of biological and pathological information. The major challenge is accurate detection and segmentation of densely packed nuclei in images acquired under a variety of conditions. Mask R-CNN-based methods have achieved state-of-the-art nucleus segmentation. However, the current pipeline requires fully annotated training images, which are time consuming to create and sometimes noisy. Importantly, nuclei often appear similar within the same image. This similarity could be utilized to segment nuclei with only partially labeled training examples. We propose a simple yet effective region-proposal module for the current Mask R-CNN pipeline to perform few-exemplar learning. To capture the similarities between unlabeled regions and labeled nuclei, we apply decomposed self-attention to learned features. On the self-attention map, we observe strong activation at the centers and edges of all nuclei, including unlabeled nuclei. On this basis, our region-proposal module propagates partial annotations to the whole image and proposes effective bounding boxes for the bounding box-regression and binary mask-generation modules. Our method effectively learns from unlabeled regions thereby improving detection performance. We test our method with various nuclear images. When trained with only 1/4 of the nuclei annotated, our approach retains a detection accuracy comparable to that from training with fully annotated data. Moreover, our method can serve as a bootstrapping step to create full annotations of datasets, iteratively generating and correcting annotations until a predetermined coverage and accuracy are reached. The source code is available at https://github.com/feng-lab/nuclei.

Submitted 13 November, 2019; v1 submitted 23 July, 2019; originally announced July 2019.

Journal ref: IEEE Access, vol. 7, pp. 162169-162178, 2019

arXiv:1903.06375 [pdf]

doi 10.1063/5.0091597

Exploiting product molecule number to consider reaction rate fluctuation in elementary reactions

Authors: Seong Jun Park

Abstract: In many chemical reactions, reaction rate fluctuation is inevitable. Reaction rates are different whenever a chemical reaction occurs due to their dependence on the number of reaction events or the product number. As such, understanding the impact of rate fluctuation on product number counting statistics is of the utmost importance when developing a quantitative explanation of chemical reactions. In this work, we present a master equation that describes reaction rates as a function of product number and time. Our equation reveals the relationship between the reaction rate and product number fluctuation. Product number counting statistics uncovers a stochastic property of the product number; product number directly manipulates the reaction rate. Specifically, we find that the product number shows super-Poisson characteristics when an increase in the product number induces an increase in the reaction rate. Conversely, the product number shows sub-Poisson characteristics when an increase in the product number induces a decrease in the reaction rate. Furthermore, our analysis exploits reaction rate fluctuation, enabling the quantification of the deviation of an elementary reaction process from a renewal process.

Submitted 18 March, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

arXiv:1903.06370 [pdf]

Product number counting statistics from stochastic bursting birth-death processes

Authors: Seong Jun Park, Jaeyoung Sung

Abstract: Bursting and non-renewal processes are common phenomena in birth-death processes, yet no theory can quantitatively describe a non-renewal birth process with bursting. Here, we present a theoretical model that yields the product number counting statistics of product creation occurring in bursts and of a non-renewal creation process. When product creation is a stationary process, our model confirms that product number fluctuation decreases with an increase in the product lifetime fluctuation, originating from the non-Poisson degradation dynamics, a result obtained in previous work. Our model additionally demonstrates that the dependence of product number fluctuation on product lifetime fluctuation varies with time, when product creation is a non-stationary process. We find that bursting increases product number fluctuation, compared to birth-processes without bursting. At time zero, in a burst-less birth process, product number fluctuation is unsurprisingly found to be zero, but we discover that, in a bulk creation process characterized by bursting, product number fluctuation is a finite value at time zero. The analytic expressions we obtain are applicable to many fields related to the study of system populations, such as queueing models and gene expression.

Submitted 27 August, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

arXiv:1902.07303 [pdf, other]

doi 10.1371/journal.pcbi.1006884

Recombination and mutational robustness in neutral fitness landscapes

Authors: Alexander Klug, Su-Chan Park, Joachim Krug

Abstract: Mutational robustness quantifies the effect of random mutations on fitness. When mutational robustness is high, most mutations do not change fitness or have only a minor effect on it. From the point of view of fitness landscapes, robust genotypes form neutral networks of almost equal fitness. Using deterministic population models it has been shown that selection favors genotypes inside such networks, which results in increased mutational robustness. Here we demonstrate that this effect is massively enhanced by recombination. Our results are based on a detailed analysis of mesa-shaped fitness landscapes, where we derive precise expressions for the dependence of the robustness on the landscape parameters for recombining and non-recombining populations. In addition, we carry out numerical simulations on different types of random holey landscapes as well as on an empirical fitness landscape. We show that the mutational robustness of a genotype generally correlates with its recombination weight, a new measure that quantifies the likelihood for the genotype to arise from recombination. We argue that the favorable effect of recombination on mutational robustness is a highly universal feature that may have played an important role in the emergence and maintenance of mechanisms of genetic exchange.

Submitted 20 October, 2019; v1 submitted 19 February, 2019; originally announced February 2019.

Comments: 15 figures, Supplementary appendix, supplementary figures

Journal ref: PLoS Comput Biol 15(8): e1006884 (2019)

arXiv:1809.03142 [pdf, other]

Fast and Efficient Information Transmission with Burst Spikes in Deep Spiking Neural Networks

Authors: Seongsik Park, Seijoon Kim, Hyeokjun Choe, Sungroh Yoon

Abstract: Spiking neural networks (SNNs) are considered one of the most promising artificial neural networks due to their energy-efficient computing capability. Recently, conversion of a trained deep neural network to an SNN has improved the accuracy of deep SNNs. However, most of the previous studies have not achieved satisfactory results in terms of inference speed and energy efficiency. In this paper, we propose a fast and energy-efficient information transmission method with burst spikes and a hybrid neural coding scheme in deep SNNs. Our experimental results showed that the proposed methods can improve inference energy efficiency and shorten the latency.

Submitted 10 February, 2019; v1 submitted 10 September, 2018; originally announced September 2018.

Comments: Accepted to DAC 2019

arXiv:1808.06732 [pdf, other]

doi 10.1021/acsnano.8b04639

Kinetic Trans-assembly of DNA Nanostructures

Authors: Jihoon Shin, Junghoon Kim, Sung Ha Park, Tai Hwan Ha

Abstract: The central dogma of molecular biology is the principal framework for understanding how nucleic acid information is propagated and used by living systems to create complex biomolecules. Here, by integrating the structural and dynamic paradigms of DNA nanotechnology, we present a rationally designed synthetic platform which functions in an analogous manner to create complex DNA nanostructures. Starting from one type of DNA nanostructure, DNA strand displacement circuits were designed to interact and pass along the information encoded in the initial structure to mediate the self-assembly of a different type of structure, the final output structure depending on the type of circuit triggered. Using this concept of a DNA structure "trans-assembling" a different DNA structure through non-local strand displacement circuitry, four different schemes were implemented. Specifically, 1D ladder and 2D double-crossover (DX) lattices were designed to kinetically trigger DNA circuits to activate polymerization of either ring structures or another type of DX lattice under enzyme-free, isothermal conditions. In each scheme, the desired multilayer reaction pathway was activated, among multiple possible pathways, ultimately leading to the downstream self-assembly of the correct output structure.

Submitted 3 October, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

Comments: 17 pages including 5 figures, supplementary material can be found via https://pubs.acs.org/doi/pdf/10.1021/acsnano.8b04639

Journal ref: ACS Nano, 2018, 12 (9), pp 9423-9432

arXiv:1808.06047 [pdf]

Frequency spectrum of biological noise: a probe of reaction dynamics in living cells

Authors: Sanggeun Song, Gil-Suk Yang, Seong Jun Park, Ji-Hyun Kim, Jaeyoung Sung

Abstract: Even in the steady-state, the number of biomolecules in living cells fluctuates dynamically; and the frequency spectrum of this chemical fluctuation carries valuable information about the mechanism and the dynamics of the intracellular reactions creating these biomolecules. Although recent advances in single-cell experimental techniques enable the direct monitoring of the time-traces of the biological noise in each cell, the development of the theoretical tools needed to extract the information encoded in the stochastic dynamics of intracellular chemical fluctuation is still in its adolescence. Here, we present a simple and general equation that relates the power-spectrum of the product number fluctuation to the product lifetime and the reaction dynamics of the product creation process. By analyzing the time traces of the protein copy number using this theory, we can extract the power spectrum of the mRNA number, which cannot be directly measured by currently available experimental techniques. From the power spectrum of the mRNA number, we can further extract quantitative information about the transcriptional regulation dynamics. Our power spectrum analysis of gene expression noise is demonstrated for the gene network model of luciferase expression under the control of the Bmal 1a promoter in mouse fibroblast cells. Additionally, we investigate how the non-Poisson reaction dynamics and the cell-to-cell heterogeneity in transcription and translation affect the power-spectra of the mRNA and protein number.

Submitted 18 August, 2018; originally announced August 2018.

Comments: Main text: 29 pages, 4 figures. Supporting Information: 42 pages, 4 supplementary figures

arXiv:1806.08613 [pdf, ps, other]

doi 10.1209/0295-5075/123/48001

Rare beneficial mutations cannot halt Muller's ratchet in spatial populations

Authors: Su-Chan Park, Philipp Klatt, Joachim Krug

Abstract: Muller's ratchet describes the irreversible accumulation of deleterious mutations in asexual populations. In well-mixed populations the speed of fitness decline is exponentially small in the population size, and any positive rate of beneficial mutations is sufficient to reverse the ratchet in large populations. The behavior is fundamentally different in populations with spatial structure, because the speed of the ratchet remains nonzero in the infinite size limit when the deleterious mutation rate exceeds a critical value. Based on the relation between the spatial ratchet and directed percolation, we develop a scaling theory incorporating both deleterious and beneficial mutations. The theory is verified by extensive simulations in one and two dimensions.

Submitted 22 June, 2018; originally announced June 2018.

Comments: 7 pages, 6 figures

arXiv:1704.02483 [pdf, other]

doi 10.1103/PhysRevFluids.3.123102

Evaporation-driven convective flows in suspensions of non-motile bacteria

Authors: Jocelyn Dunstan, Kyoung J. Lee, Simon F. Park, Yongyun Hwang, Raymond E. Goldstein

Abstract: We report a novel form of convection in suspensions of the bioluminescent marine bacterium $Photobacterium~phosphoreum$. Suspensions of these bacteria placed in a chamber open to the air create persistent luminescent plumes most easily visible when observed in the dark. These flows are strikingly similar to the classical bioconvection pattern of aerotactic swimming bacteria, which create an unstable stratification by swimming upwards to an air-water interface, but they are a puzzle since the strain of $P.~phosphoreum$ used does not express flagella and therefore cannot swim. Systematic experimentation with suspensions of microspheres reveals that these flow patterns are driven not by the bacteria but by the accumulation of salt at the air-water interface due to evaporation of the culture medium; even at room temperature and humidity, and physiologically relevant salt concentrations, the rate of water evaporation is sufficient to drive convection patterns. A mathematical model is developed to understand the mechanism of plume formation, and linear stability analysis as well as numerical simulations were carried out to support the conclusions. While evaporation-driven convection has not been discussed extensively in the context of biological systems, these results suggest that the phenomenon may be relevant in other systems, particularly those using microorganisms of limited motility.

Submitted 8 April, 2017; originally announced April 2017.

Comments: 16 pages, 14 figures, supplementary videos available on request (REG)

Journal ref: Phys. Rev. Fluids 3, 123102 (2018)

arXiv:1612.08790 [pdf, ps, other]

doi 10.1534/genetics.116.199497

Genotypic complexity of Fisher's geometric model

Authors: Sungmin Hwang, Su-Chan Park, Joachim Krug

Abstract: Fisher's geometric model was originally introduced to argue that complex adaptations must occur in small steps because of pleiotropic constraints. When supplemented with the assumption of additivity of mutational effects on phenotypic traits, it provides a simple mechanism for the emergence of genotypic epistasis from the nonlinear mapping of phenotypes to fitness. Of particular interest is the occurrence of reciprocal sign epistasis, which is a necessary condition for multipeaked genotypic fitness landscapes. Here we compute the probability that a pair of randomly chosen mutations interacts sign epistatically, which is found to decrease with increasing phenotypic dimension $n$, and varies nonmonotonically with the distance from the phenotypic optimum. We then derive expressions for the mean number of fitness maxima in genotypic landscapes comprised of all combinations of $L$ random mutations. This number increases exponentially with $L$, and the corresponding growth rate is used as a measure of the complexity of the landscape. The dependence of the complexity on the model parameters is found to be surprisingly rich, and three distinct phases characterized by different landscape structures are identified. Our analysis shows that the phenotypic dimension, which is often referred to as phenotypic complexity, does not generally correlate with the complexity of fitness landscapes and that even organisms with a single phenotypic trait can have complex landscapes. Our results further inform the interpretation of experiments where the parameters of Fisher's model have been inferred from data, and help to elucidate which features of empirical fitness landscapes can be described by this model.

Submitted 9 June, 2017; v1 submitted 27 December, 2016; originally announced December 2016.

Comments: 27 pages, 14 figures, 2 tables, minor changes (published version)

Journal ref: Genetics 206, 1049 (2017)

arXiv:1605.00017 [pdf, other]

deepMiRGene: Deep Neural Network based Precursor microRNA Prediction

Authors: Seunghyun Park, Seonwoo Min, Hyunsoo Choi, Sungroh Yoon

Abstract: Since microRNAs (miRNAs) play a crucial role in post-transcriptional gene regulation, miRNA identification is one of the most essential problems in computational biology. miRNAs are usually short in length ranging between 20 and 23 base pairs. It is thus often difficult to distinguish miRNA-encoding sequences from other non-coding RNAs and pseudo miRNAs that have a similar length, and most previous studies have recommended using precursor miRNAs instead of mature miRNAs for robust detection. A great number of conventional machine-learning-based classification methods have been proposed, but they often have the serious disadvantage of requiring manual feature engineering, and their performance is limited as well. In this paper, we propose a novel miRNA precursor prediction algorithm, deepMiRGene, based on recurrent neural networks, specifically long short-term memory networks. deepMiRGene automatically learns suitable features from the data themselves without manual feature engineering and constructs a model that can successfully reflect structural characteristics of precursor miRNAs. For the performance evaluation of our approach, we have employed several widely used evaluation metrics on three recent benchmark datasets and verified that deepMiRGene delivered comparable performance among the current state-of-the-art tools.

Submitted 29 April, 2016; originally announced May 2016.

arXiv:1603.09123 [pdf, other]

deepTarget: End-to-end Learning Framework for microRNA Target Prediction using Deep Recurrent Neural Networks

Authors: Byunghan Lee, Junghwan Baek, Seunghyun Park, Sungroh Yoon

Abstract: MicroRNAs (miRNAs) are short sequences of ribonucleic acids that control the expression of target messenger RNAs (mRNAs) by binding them. Robust prediction of miRNA-mRNA pairs is of utmost importance in deciphering gene regulations but has been challenging because of high false positive rates, despite a deluge of computational tools that normally require laborious manual feature extraction. This paper presents an end-to-end machine learning framework for miRNA target prediction. Leveraged by deep recurrent neural networks-based auto-encoding and sequence-sequence interaction learning, our approach not only delivers an unprecedented level of accuracy but also eliminates the need for manual feature extraction. The performance gap between the proposed method and existing alternatives is substantial (over 25% increase in F-measure), and deepTarget delivers a quantum leap in the long-standing challenge of robust miRNA target prediction.

Submitted 19 August, 2016; v1 submitted 30 March, 2016; originally announced March 2016.

arXiv:1603.05102 [pdf, ps, other]

doi 10.1088/1751-8113/49/31/315601

$δ$-exceedance records and random adaptive walks

Authors: Su-Chan Park, Joachim Krug

Abstract: We study a modified record process where the $k$'th record in a series of independent and identically distributed random variables is defined recursively through the condition $Y_k > Y_{k-1} - δ_{k-1}$ with a deterministic sequence $δ_k > 0$ called the handicap. For constant $δ_k \equiv δ$ and exponentially distributed random variables it has been shown in previous work that the process displays a phase transition as a function of $δ$ between a normal phase where the mean record value increases indefinitely and a stationary phase where the mean record value remains bounded and a finite fraction of all entries are records (Park et al 2015 Phys. Rev. E 91 042707). Here we explore the behavior for general probability distributions and decreasing and increasing sequences $δ_k$, focusing in particular on the case when $δ_k$ matches the typical spacing between subsequent records in the underlying simple record process without handicap. We find that a continuous phase transition occurs only in the exponential case, but a novel kind of first order transition emerges when $δ_k$ is increasing. The problem is partly motivated by the dynamics of evolutionary adaptation in biological fitness landscapes, where $δ_k$ corresponds to the change of the deterministic fitness component after $k$ mutational steps. The results for the record process are used to compute the mean number of steps that a population performs in such a landscape before being trapped at a local fitness maximum.

Submitted 10 August, 2016; v1 submitted 16 March, 2016; originally announced March 2016.

Comments: minor changes. Published

Journal ref: J. Phys. A: Math. Theor. 49, 315601 (2016)

arXiv:1507.03511 [pdf, ps, other]

doi 10.1016/j.jtbi.2016.02.035

Greedy adaptive walks on a correlated fitness landscape

Authors: Su-Chan Park, Johannes Neidhart, Joachim Krug

Abstract: We study adaptation of a haploid asexual population on a fitness landscape defined over binary genotype sequences of length $L$. We consider greedy adaptive walks in which the population moves to the fittest among all single mutant neighbors of the current genotype until a local fitness maximum is reached. The landscape is of the rough Mount Fuji type, which means that the fitness value assigned to a sequence is the sum of a random and a deterministic component. The random components are independent and identically distributed random variables, and the deterministic component varies linearly with the distance to a reference sequence. The deterministic fitness gradient $c$ is a parameter that interpolates between the limits of an uncorrelated random landscape ($c = 0$) and an effectively additive landscape ($c \to \infty$). When the random fitness component is chosen from the Gumbel distribution, explicit expressions for the distribution of the number of steps taken by the greedy walk are obtained, and it is shown that the walk length varies non-monotonically with the strength of the fitness gradient when the starting point is sufficiently close to the reference sequence. Asymptotic results for general distributions of the random fitness component are obtained using extreme value theory, and it is found that the walk length attains a non-trivial limit for $L \to \infty$, different from its values for $c=0$ and $c = \infty$, if $c$ is scaled with $L$ in an appropriate combination.

Submitted 18 March, 2016; v1 submitted 13 July, 2015; originally announced July 2015.

Comments: minor changes

Journal ref: J. Theor. Biol. 397, 89 (2016)

arXiv:1408.4856 [pdf, ps, other]

doi 10.1103/PhysRevE.91.042707

Phase transition in random adaptive walks on correlated fitness landscapes

Authors: Su-Chan Park, Ivan G. Szendro, Johannes Neidhart, Joachim Krug

Abstract: We study biological evolution on a random fitness landscape where correlations are introduced through a linear fitness gradient of strength $c$. When selection is strong and mutations rare the dynamics is a directed uphill walk that terminates at a local fitness maximum. We analytically calculate the dependence of the walk length on the genome size $L$. When the distribution of the random fitness component has an exponential tail we find a phase transition of the walk length $D$ between a phase at small $c$ where walks are short $(D \sim \ln L)$ and a phase at large $c$ where walks are long $(D \sim L)$. For all other distributions only a single phase exists for any $c > 0$. The considered process is equivalent to a zero temperature Metropolis dynamics for the random energy model in an external magnetic field, thus also providing insight into the aging dynamics of spin glasses.

Submitted 15 April, 2015; v1 submitted 20 August, 2014; originally announced August 2014.

Journal ref: Phys. Rev. E 91, 042707 (2015)

arXiv:1408.3878 [pdf]

Unresolvable human mental states based on a parallel universe theory

Authors: Changsoo Shin, Wansoo Ha, Wookeen Chung, Sunyoung Park

Abstract: We show that human mental states are unresolvable by suggesting a mathematical function that describes human mental states in relation to parallel universe theory. The function is a solution to a multi-dimensional advection equation; representing a situation a person is faced with, and its time-derivative showing the mental state in that situation. This function has interesting characteristics that explain why each person has different thoughts in a particular situation. Because the multi-dimensional advection equation has an infinite number of solutions, we can use them to represent an infinite number of mental states. We focus on the basic concepts of the model and explain the function using extremely simple cases. We also use the functions to explain remembering and forgetting.

Submitted 17 August, 2014; originally announced August 2014.

Comments: 17 pages, 6 figures, 3 tables

Journal ref: Advancement and Developments in Applied Mathematics 1 (2012) 18-29

arXiv:1306.0025 [pdf]

Genetic Complexity in a Drosophila Model of Diabetes-Associated Misfolded Human Proinsulin

Authors: Soo-Young Park, Michael Z. Ludwig, Natalia A. Tamarina, Bin Z. He, Sarah H. Carl, Desiree A. Dickerson, Levi Barse, Bharath Arun, Calvin Williams, Cecelia M. Miles, Louis H. Philipson, Donald F. Steiner, Graeme I. Bell, Martin Kreitman

Abstract: Here we use Drosophila melanogaster to create a genetic model of human permanent neonatal diabetes mellitus and present experimental results describing dimensions of this complexity. The approach involves the transgenic expression of a misfolded mutant of human preproinsulin, hINSC96Y, which is a cause of the disease. When expressed in fly imaginal discs, hINSC96Y causes a reduction of adult structures, including the eye, wing and notum. Eye imaginal discs exhibit defects in both the structure and arrangement of ommatidia. In the wing, expression of hINSC96Y leads to ectopic expression of veins and mechano-sensory organs, indicating disruption of wild type signaling processes regulating cell fates. These readily measurable disease phenotypes are sensitive to temperature, gene dose and sex. Mutant (but not wild type) proinsulin expression in the eye imaginal disc induces IRE1-mediated Xbp1 alternative splicing, a signal for endoplasmic reticulum stress response activation, and produces global change in gene expression. Mutant hINS transgene tester strains, when crossed to stocks from the Drosophila Genetic Reference Panel, produce F1 adults with a continuous range of disease phenotypes and large broad-sense heritability. Surprisingly, the severity of mutant hINS-induced disease in the eye is not correlated with that in the notum in these crosses, nor with eye reduction phenotypes caused by the expression of two dominant eye mutants acting in two different eye development pathways, Drop (Dr) or Lobe (L), when crossed into the same genetic backgrounds. The tissue specificity of genetic variability for mutant hINS-induced disease thus has its own distinct signature. The genetic dominance of disease-specific phenotypic variability makes this approach amenable to genome-wide association study (GWAS) in a simple F1 screen of natural variation.

Submitted 31 May, 2013; originally announced June 2013.

Comments: 60 pages; 6 figures; 8 supporting figures; 11 supporting tables

arXiv:1305.5319 [pdf]

doi 10.1534/genetics.113.157800

Effect of Genetic Variation in a Drosophila Model of Diabetes-Associated Misfolded Human Proinsulin

Authors: Bin Z. He, Michael Z. Ludwig, Desiree A. Dickerson, Levi Barse, Bharath Arun, Soo Young Park, Natalia A. Tamarina, Scott B. Selleck, Patricia Wittkopp, Graeme I. Bell, Martin Kreitman

Abstract: The identification and validation of gene-gene interactions is a major challenge in human studies. Here, we explore an approach for studying epistasis in humans using a Drosophila melanogaster model of neonatal diabetes mellitus. Expression of mutant preproinsulin, hINSC96Y, in the eye imaginal disc mimics the human disease, activating conserved cell stress response pathways leading to cell death and reduction in eye area. Dominant-acting variants in wild-derived inbred lines from the Drosophila Genetics Reference Panel produce a continuous, highly heritable distribution of eye degeneration phenotypes. A genome-wide association study (GWAS) in 154 sequenced lines identified 29 candidate SNPs in 16 loci with $P < 10^{-5}$, including one SNP in an intron of the gene sulfateless (sfl) which exceeded a conservative genome-wide significance threshold of $P = 0.05$ ($-\log_{10} P > 7.62$). RNAi knock-downs of sfl enhanced the eye degeneration phenotype in a mutant-hINS-dependent manner. sfl encodes a protein required for sulfation of the glycosaminoglycan, heparan sulfate. Two additional genes in the heparan sulfate (HS) biosynthetic pathway (tout velu, ttv and brother of tout velu, botv) also modified the eye phenotype, suggesting a link between HS-modified proteins and cellular responses to misfolded proteins. Finally, intronic variants marking the QTL were associated with decreased sfl expression, a result consistent with that predicted by RNAi studies. The ability to create a model of human genetic disease in the fly, map a QTL by GWAS to a specific gene (and noncoding variant), validate its contribution to disease with available genetic resources, and experimentally link the variant to a molecular mechanism demonstrates the many advantages Drosophila holds in determining the genetic underpinnings of human disease.

Submitted 27 May, 2013; v1 submitted 23 May, 2013; originally announced May 2013.

Journal ref: Genetics 196 (2014) 557-567

arXiv:1302.6771 [pdf, ps, other]

Rate of adaptation in sexuals and asexuals: A solvable model of the Fisher-Muller effect

Authors: Su-Chan Park, Joachim Krug

Abstract: The adaptation of large asexual populations is hampered by the competition between independently arising beneficial mutations in different individuals, which is known as clonal interference. Fisher and Muller proposed that recombination provides an evolutionary advantage in large populations by alleviating this competition. Based on recent progress in quantifying the speed of adaptation in asexual populations undergoing clonal interference, we present a detailed analysis of the Fisher-Muller mechanism for a model genome consisting of two loci with an infinite number of beneficial alleles each and multiplicative fitness effects. We solve the infinite population dynamics exactly and show that, for a particular, natural mutation scheme, the speed of adaptation in sexuals is twice as large as in asexuals. Guided by the infinite population result and by previous work on asexual adaptation, we postulate an expression for the speed of adaptation in finite sexual populations that agrees with numerical simulations over a wide range of population sizes and recombination rates. The ratio of the sexual to asexual adaptation speed is a function of population size that increases in the clonal interference regime and approaches 2 for extremely large populations. The simulations also show that the imbalance between the numbers of accumulated mutations at the two loci is strongly suppressed even by a small amount of recombination. The generalization of the model to an arbitrary number $L$ of loci is briefly discussed. If each offspring samples the alleles at each locus from the gene pool of the whole population rather than from two parents, the ratio of the sexual to asexual adaptation speed is approximately equal to $L$ in large populations. A possible realization of this scenario is the reassortment of genetic material in RNA viruses with $L$ genomic segments.

Submitted 15 August, 2013; v1 submitted 27 February, 2013; originally announced February 2013.

Comments: Title has been changed. Supporting Information (animation) can be found in the source file. 53 pages. 10 figures. To appear in Genetics

arXiv:1104.4337 [pdf, other]

doi 10.1093/bioinformatics/btr277

HiTRACE: High-throughput robust analysis for capillary electrophoresis

Authors: Sungroh Yoon, Jinkyu Kim, Justine Hum, Hanjoo Kim, Seunghyun Park, Wipapat Kladwang, Rhiju Das

Abstract: Motivation: Capillary electrophoresis (CE) of nucleic acids is a workhorse technology underlying high-throughput genome analysis and large-scale chemical mapping for nucleic acid structural inference. Despite the wide availability of CE-based instruments, there remain challenges in leveraging their full power for quantitative analysis of RNA and DNA structure, thermodynamics, and kinetics. In particular, the slow rate and poor automation of available analysis tools have bottlenecked a new generation of studies involving hundreds of CE profiles per experiment. Results: We propose a computational method called high-throughput robust analysis for capillary electrophoresis (HiTRACE) to automate the key tasks in large-scale nucleic acid CE analysis, including the profile alignment that has heretofore been a rate-limiting step in the highest throughput experiments. We illustrate the application of HiTRACE on thirteen data sets representing 4 different RNAs, three chemical modification strategies, and up to 480 single mutant variants; the largest data sets each include 87,360 bands. By applying a series of robust dynamic programming algorithms, HiTRACE outperforms prior tools in terms of alignment and fitting quality, as assessed by measures including the correlation between quantified band intensities between replicate data sets. Furthermore, while the smallest of these data sets required 7 to 10 hours of manual intervention using prior approaches, HiTRACE quantitation of even the largest data sets herein was achieved in 3 to 12 minutes. The HiTRACE method therefore resolves a critical barrier to the efficient and accurate analysis of nucleic acid structure in experiments involving tens of thousands of electrophoretic bands.

Submitted 12 May, 2011; v1 submitted 21 April, 2011; originally announced April 2011.

Comments: Revised to include Supplement. Availability: HiTRACE is freely available for download at http://hitrace.stanford.edu

arXiv:1011.2013 [pdf, other]

Emergence of cooperation with self-organized criticality

Authors: Sangmin Park, Hyeong-Chai Jeong

Abstract: Cooperation and self-organized criticality are two main keywords in current studies of evolution. We propose a generalized Bak-Sneppen model and provide a natural mechanism which accounts for both phenomena simultaneously. We use the prisoner's dilemma games to mimic the interactions among the members of the population. Each member is identified by its cooperation probability, and its fitness is given by the payoffs from neighbors. The least fit member with the minimum payoff is replaced by a new member with a random cooperation probability. When the neighbors of the least fit member are also replaced with a non-zero probability, a strong cooperation emerges. The Bak-Sneppen process builds a self-organized structure so that the cooperation can emerge even in the parameter region where a uniform or random population decreases the number of cooperators. The emergence of cooperation is due to the same dynamical correlation which leads to self-organized criticality in replacement activities.

Submitted 19 March, 2012; v1 submitted 9 November, 2010; originally announced November 2010.

Comments: 6 pages, 5 figures

arXiv:1003.5380 [pdf, ps, other]

Evolutionary advantage of small populations on complex fitness landscapes

Authors: Kavita Jain, Joachim Krug, Su-Chan Park

Abstract: Background: Recent experimental and theoretical studies have shown that small asexual populations evolving on complex fitness landscapes may achieve a higher fitness than large ones due to the increased heterogeneity of adaptive trajectories. Here we introduce a class of haploid three-locus fitness landscapes that allows this scenario to be investigated in a precise and quantitative way. Results: Our main result derived analytically shows how the probability of choosing the path of largest initial fitness increase grows with the population size. This makes large populations more likely to get trapped at local fitness peaks and implies an advantage of small populations at intermediate time scales. The range of population sizes where this effect is operative coincides with the onset of clonal interference. Additional studies using ensembles of random fitness landscapes show that the results achieved for a particular choice of three-locus landscape parameters are robust and also persist as the number of loci increases. Conclusions: Our study indicates that an advantage for small populations is likely whenever the fitness landscape contains local maxima. The advantage appears at intermediate time scales, which are long enough for trapping at local fitness maxima to have occurred but too short for peak escape by the creation of multiple mutants.

Submitted 22 February, 2011; v1 submitted 28 March, 2010; originally announced March 2010.

Comments: Version to appear in Evolution

Journal ref: Evolution 65-7, 1945-1955

arXiv:1001.1348 [pdf, ps, other]

doi 10.1007/s00285-010-0352-x

Bistability in two-locus models with selection, mutation, and recombination

Authors: Su-Chan Park, Joachim Krug

Abstract: The evolutionary effect of recombination depends crucially on the epistatic interactions between linked loci. A paradigmatic case where recombination is known to be strongly disadvantageous is a two-locus fitness landscape displaying reciprocal sign epistasis with two fitness peaks of unequal height. Focusing on the occurrence of bistability in the equilibrium solutions, we consider here the deterministic, haploid two-locus model with reversible mutations, selection and recombination. We find analytic formulae for the critical recombination probability $r_c$ above which two stable stationary solutions appear which are localized on each of the two fitness peaks. We also derive the stationary genotype frequencies in various parameter regimes. When the recombination rate is close to $r_c$ and the fitness difference between the two peaks is small, we obtain a compact description in terms of a cubic polynomial which is analogous to the Landau theory of physical phase transitions.

Submitted 18 June, 2010; v1 submitted 8 January, 2010; originally announced January 2010.

Comments: Sections 3.7 and 3.8 are added. 22 pages, 6 figures. To appear in J. Math. Biol

Journal ref: J. Math. Biol. (2011) 62:763-788

arXiv:0910.0219 [pdf, ps, other]

doi 10.1007/s10955-009-9915-x

The speed of evolution in large asexual populations

Authors: Su-Chan Park, Damien Simon, Joachim Krug

Abstract: We consider an asexual biological population of constant size $N$ evolving in discrete time under the influence of selection and mutation. Beneficial mutations appear at rate $U$ and their selective effects $s$ are drawn from a distribution $g(s)$. After introducing the required models and concepts of mathematical population genetics, we review different approaches to computing the speed of logarithmic fitness increase as a function of $N$, $U$ and $g(s)$. We present an exact solution of the infinite population size limit and provide an estimate of the population size beyond which it is valid. We then discuss approximate approaches to the finite population problem, distinguishing between the case of a single selection coefficient, $g(s) = δ(s - s_b)$, and a continuous distribution of selection coefficients. Analytic estimates for the speed are compared to numerical simulations up to population sizes of order $10^{300}$.

Submitted 1 October, 2009; originally announced October 2009.

Comments: 33 pages, 10 figures

Journal ref: J. Stat. Phys. 138, 381 (2010)

arXiv:0807.3002 [pdf, ps, other]

Exploring the effect of sex on an empirical fitness landscape

Authors: J. Arjan G. M. de Visser, Su-Chan Park, Joachim Krug

Abstract: The nature of epistasis has important consequences for the evolutionary significance of sex and recombination. Recent efforts to find negative epistasis as a source of negative linkage disequilibrium and associated long-term sex advantage have yielded little support. Sign epistasis, where the sign of the fitness effects of alleles varies across genetic backgrounds, is responsible for ruggedness of the fitness landscape with implications for the evolution of sex that have been largely unexplored. Here, we describe fitness landscapes for two sets of strains of the asexual fungus Aspergillus niger involving all combinations of five mutations. We find that $\sim 30$% of the single-mutation fitness effects are positive despite their negative effect in the wild-type strain, and that several local fitness maxima and minima are present. We then compare adaptation of sexual and asexual populations on these empirical fitness landscapes using simulations. The results show a general disadvantage of sex on these rugged landscapes, caused by the breakdown by recombination of genotypes escaping from local peaks. Sex facilitates escape from a local peak only for some parameter values on one landscape, indicating its dependence on the landscape's topography. We discuss possible reasons for the discrepancy between our results and the reports of faster adaptation of sexual populations.

Submitted 18 July, 2008; originally announced July 2008.

Journal ref: American Naturalist 174 (2009) S15-S30 (with substantial revisions)

arXiv:0807.1764 [pdf, ps, other]

doi 10.1103/PhysRevE.79.066114

Reentrant phase transition in a predator-prey model

Authors: Sung-Guk Han, Su-Chan Park, Beom Jun Kim

Abstract: We numerically investigate the six-species predator-prey game in complex networks as well as in $d$-dimensional hypercubic lattices with $d=1,2,..., 6$. The interaction topology of the six species contains two loops, each of which is composed of three cyclically predating species. As the mutation rate $P$ is lowered below the well-defined phase transition point, the $Z_2$ symmetry related to the interchange of the two loops is spontaneously broken, and it has been known that the system develops the defensive alliance in which three cyclically predating species defend each other against the invasion of other species. In the small-world network structure characterized by the rewiring probability $α$, the phase diagram shows the reentrant behavior as $α$ is varied, indicating a twofold role of the shortcuts. In $d$-dimensional regular hypercubic lattices, the system also exhibits the reentrant phase transition as $d$ is increased. We identify the universality class of the phase transition and discuss the proper mean-field limit of the system.

Submitted 30 May, 2009; v1 submitted 10 July, 2008; originally announced July 2008.

Comments: 8 pages, 7 figures, Phys. Rev. E (in press)

arXiv:0711.1989 [pdf, ps, other]

doi 10.1088/1742-5468/2008/04/P04014

Evolution in random fitness landscapes: the infinite sites model

Authors: Su-Chan Park, Joachim Krug

Abstract: We consider the evolution of an asexually reproducing population in an uncorrelated random fitness landscape in the limit of infinite genome size, which implies that each mutation generates a new fitness value drawn from a probability distribution $g(w)$. This is the finite population version of Kingman's house of cards model [J.F.C. Kingman, J. Appl. Probab. 15, 1 (1978)]. In contrast to Kingman's work, the focus here is on unbounded distributions $g(w)$ which lead to an indefinite growth of the population fitness. The model is solved analytically in the limit of infinite population size $N \to \infty$ and simulated numerically for finite $N$. When the genome-wide mutation probability $U$ is small, the long time behavior of the model reduces to a point process of fixation events, which is referred to as a diluted record process (DRP). The DRP is similar to the standard record process except that a new record candidate (a number that exceeds all previous entries in the sequence) is accepted only with a certain probability that depends on the values of the current record and the candidate. We develop a systematic analytic approximation scheme for the DRP. At finite $U$ the fitness frequency distribution of the population decomposes into a stationary part due to mutations and a traveling wave component due to selection, which is shown to imply a reduction of the mean fitness by a factor of $1-U$ compared to the $U \to 0$ limit.

Submitted 18 December, 2007; v1 submitted 13 November, 2007; originally announced November 2007.

Comments: Dedicated to Thomas Nattermann on the occasion of his 60th birthday. Submitted to JSTAT. Error in Section 3.2 was corrected

Journal ref: J. Stat. Mech. (2008) P04014

arXiv:0708.1865 [pdf]

doi 10.1073/pnas.0703262104

Metabolite essentiality elucidates robustness of Escherichia coli metabolism

Authors: Pan-Jun Kim, Dong-Yup Lee, Tae Yong Kim, Kwang Ho Lee, Hawoong Jeong, Sang Yup Lee, Sunwon Park

Abstract: Complex biological systems are very robust to genetic and environmental changes at all levels of organization. Many biological functions of Escherichia coli metabolism can be sustained against single-gene or even multiple-gene mutations by using redundant or alternative pathways. Thus, only a limited number of genes have been identified to be lethal to the cell. In this regard, the reaction-centric gene deletion study has a limitation in understanding the metabolic robustness. Here, we report the use of flux-sum, which is the summation of all incoming or outgoing fluxes around a particular metabolite under pseudo-steady state conditions, as a good conserved property for elucidating such robustness of E. coli from the metabolite point of view. The functional behavior, as well as the structural and evolutionary properties of metabolites essential to the cell survival, was investigated by means of a constraints-based flux analysis under perturbed conditions. The essential metabolites are capable of maintaining a steady flux-sum even against severe perturbation by actively redistributing the relevant fluxes. Disrupting the flux-sum maintenance was found to suppress cell growth. This approach of analyzing metabolite essentiality provides insight into cellular robustness and concomitant fragility, which can be used for several applications, including the development of new drugs for treating pathogens.

Submitted 14 August, 2007; originally announced August 2007.

Comments: Supplements available at http://stat.kaist.ac.kr/publication/2007/PJKim_pnas_supplement.pdf

Journal ref: Proc. Natl. Acad. Sci. USA. 104 13638 (2007)

arXiv:q-bio/0609016 [pdf, ps, other]

doi 10.1103/PhysRevE.74.026114

Dynamic behaviors in directed networks

Authors: Sung Min Park, Beom Jun Kim

Abstract: Motivated by the abundance of directed synaptic couplings in a real biological neuronal network, we investigate the synchronization behavior of the Hodgkin-Huxley model in a directed network. We start from the standard model of the Watts-Strogatz undirected network and then change undirected edges to directed arcs with a given probability, still preserving the connectivity of the network. A generalized clustering coefficient for directed networks is defined and used to investigate the interplay between the synchronization behavior and underlying structural properties of directed networks. We observe that the directedness of complex networks plays an important role in emerging dynamical behaviors, which is also confirmed by a numerical study of the sociological game theoretic voter model on directed networks.

Submitted 11 September, 2006; originally announced September 2006.

Journal ref: Phys. Rev. E 74, 026114 (2006)

Showing 1–50 of 54 results for author: Park, S