-
Effectiveness of Self-Assessment Software to Evaluate Preclinical Operative Procedures
Authors:
Qi Dai,
Ryan Davis,
Houlin Hong,
Ying Gu
Abstract:
Objectives: To assess the effectiveness of digital scanning techniques for self-assessment of preparations and restorations in preclinical dental education when compared to traditional faculty grading. Methods: Forty-four separate Class I (#30-O) preparations, Class II (#30-MO) preparations, and Class II amalgam restorations (#31-MO) were generated in a preclinical assessment setting. Calibrated faculty evaluated the preparations and restorations using a standard rubric from the preclinical operative class. The same teeth were scanned using a Planmeca PlanScan intraoral scanner and graded using the Romexis E4D Compare Software. Each tooth was compared against a corresponding gold standard tooth with tolerance intervals ranging from 100μm to 500μm. These scores were compared to traditional faculty grades using a linear mixed model to estimate the mean differences, with 95% confidence intervals, for each tolerance level. Results: For the Class I preparation, the average Compare Software grade at 300μm tolerance had the smallest mean difference from the average faculty grade: 1.64 points on a 100-point scale. The Class II preparation at 400μm tolerance had the smallest mean difference of 0.41 points. Finally, the Class II restoration at 300μm tolerance had the smallest mean difference of 0.20 points. Conclusion: In this study, the tolerance levels that best correlated the Compare Software grades with the faculty grades were determined for three operative procedures: Class I preparation, Class II preparation, and Class II restoration. The Compare Software can serve as an adjunct method for more objective grading, and students can also use it as a self-assessment tool.
Submitted 8 April, 2024;
originally announced April 2024.
-
Diffusion-Driven Domain Adaptation for Generating 3D Molecules
Authors:
Haokai Hong,
Wanyu Lin,
Kay Chen Tan
Abstract:
Can we train a molecule generator that can generate 3D molecules from a new domain, circumventing the need to collect data? This problem can be cast as the problem of domain adaptive molecule generation. This work presents a novel and principled diffusion-based approach, called GADM, that allows shifting a generative model to desired new domains without the need to collect even a single molecule. As the domain shift is typically caused by the structure variations of molecules, e.g., scaffold variations, we leverage a designated equivariant masked autoencoder (MAE) along with various masking strategies to capture the structural-grained representations of the in-domain varieties. In particular, with an asymmetric encoder-decoder module, the MAE can generalize to unseen structure variations from the target domains. These structure variations are encoded with an equivariant encoder and treated as domain supervisors to control denoising. We show that, with these encoded structural-grained domain supervisors, GADM can generate effective molecules within the desired new domains. We conduct extensive experiments across various domain adaptation tasks over benchmarking datasets. We show that our approach improves the success rate, defined in terms of molecular validity, uniqueness, and novelty, by up to 65.6% compared to alternative baselines.
Submitted 1 April, 2024;
originally announced April 2024.
-
Robust Parameter Estimation for Rational Ordinary Differential Equations
Authors:
Oren Bassik,
Yosef Berman,
Soo Go,
Hoon Hong,
Ilia Ilmer,
Alexey Ovchinnikov,
Chris Rackauckas,
Pedro Soto,
Chee Yap
Abstract:
We present a new approach for estimating parameters in rational ODE models from given (measured) time series data.
In typical existing approaches, an initial guess for the parameter values is made from a given search interval. Then, in a loop, the corresponding outputs are computed by solving the ODE numerically, followed by computing the error from the given time series data. If the error is small, the loop terminates and the parameter values are returned. Otherwise, heuristics/theories are used to possibly improve the guess and continue the loop.
These approaches tend to be non-robust in the sense that their accuracy depends on the search interval and the true parameter values; furthermore, they cannot handle the case where the parameters are locally identifiable.
In this paper, we propose a new approach, which does not suffer from the above non-robustness. In particular, it does not require making good initial guesses for the parameter values or specifying search intervals. Instead, it uses differential algebra, interpolation of the data using rational functions, and multivariate polynomial system solving. We also compare the performance of the resulting software with several other estimation software packages.
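The guess, simulate, compare loop that the abstract attributes to existing approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method or any package they compare against: the model x' = -k*x, the synthetic noise-free data, the step-halving search, and the names `simulate`, `sse`, and `fit_by_search` are all hypothetical.

```python
import math


def simulate(k, x0, times, substeps=50):
    """Integrate x' = -k * x from t = 0 with classical RK4; return x at each time."""
    x, t, out = x0, 0.0, []
    for target in times:
        h = (target - t) / substeps
        for _ in range(substeps):
            k1 = -k * x
            k2 = -k * (x + 0.5 * h * k1)
            k3 = -k * (x + 0.5 * h * k2)
            k4 = -k * (x + h * k3)
            x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t = target
        out.append(x)
    return out


def sse(k, x0, times, data):
    """Sum of squared errors between simulated outputs and the time series."""
    return sum((m - d) ** 2 for m, d in zip(simulate(k, x0, times), data))


def fit_by_search(x0, times, data, lo, hi, tol=1e-10, max_iter=500):
    """Loop: start from a guess in [lo, hi], simulate, compare, refine the guess."""
    guess, step = 0.5 * (lo + hi), 0.25 * (hi - lo)
    err = sse(guess, x0, times, data)
    for _ in range(max_iter):
        if err < tol or step < 1e-12:
            break  # error small enough, or the search has stalled
        candidates = [c for c in (guess - step, guess + step) if lo <= c <= hi]
        scored = [(sse(c, x0, times, data), c) for c in candidates]
        best_err, best = min(scored) if scored else (err, guess)
        if best_err < err:
            guess, err = best, best_err  # accept the improved guess
        else:
            step *= 0.5  # no improvement: shrink the step and retry
    return guess, err


# Synthetic data from a hypothetical true value k = 0.7 with x(0) = 1.
times = [0.5 * i for i in range(1, 7)]
data = [math.exp(-0.7 * t) for t in times]

k_good, _ = fit_by_search(1.0, times, data, lo=0.0, hi=5.0)
k_bad, _ = fit_by_search(1.0, times, data, lo=2.0, hi=5.0)
```

With the search interval [0, 5] the loop recovers a value close to 0.7, while with [2, 5] it stalls at the lower bound because the true value lies outside the interval. That dependence on the search interval is the kind of non-robustness the abstract describes.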
Submitted 17 December, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Robust Perfect Adaptation of Reaction Fluxes Ensured by Network Topology
Authors:
Yuji Hirono,
Hyukpyo Hong,
Jae Kyoung Kim
Abstract:
Maintaining stability in an uncertain environment is essential for proper functioning of living systems. Robust perfect adaptation (RPA) is a property of a system that generates an output at a fixed level even after fluctuations in input stimulus without fine-tuning parameters, and it is important to understand how this feature is implemented through biochemical networks. The existing literature has mainly focused on RPA of the concentration of a chosen chemical species, and no generic analysis has been made of RPA of reaction fluxes, which play an equally important role. Here, we identify structural conditions on reaction networks under which all the reaction fluxes exhibit RPA against the perturbation of the parameters inside a subnetwork. Based on this understanding, we give a recipe for obtaining a simpler reaction network, from which we can fully recover the steady-state reaction fluxes of the original system. This helps us identify key parameters that determine the fluxes and study the properties of complex reaction networks using a smaller one without losing any information about steady-state reaction fluxes.
Submitted 2 February, 2023;
originally announced February 2023.
-
Computational translation framework identifies biochemical reaction networks with special topologies and their long-term dynamics
Authors:
Hyukpyo Hong,
Bryan S. Hernandez,
Jinsu Kim,
Jae Kyoung Kim
Abstract:
Long-term behaviors of biochemical systems are described by steady states in deterministic models and stationary distributions in stochastic models. Obtaining their analytic solutions can be done for limited cases, such as linear or finite-state systems, as it generally requires solving many coupled equations. Interestingly, analytic solutions can be easily obtained when underlying networks have special topologies, called weak reversibility (WR) and zero deficiency (ZD), and the kinetic law follows a generalized form of mass-action kinetics. However, such desired topological conditions do not hold for the majority of cases. Thus, translating networks to have WR and ZD while preserving the original dynamics was proposed. Yet, this approach is limited because manually obtaining the desired network translation among the large number of candidates is challenging. Here, we prove necessary conditions for having WR and ZD after translation, and based on these conditions, we develop a user-friendly computational package, TOWARDZ, that automatically and efficiently identifies translated networks with WR and ZD. This allows us to quantitatively examine how likely it is to obtain WR and ZD after translation depending on the number of species and reactions. Importantly, we also describe how our package can be used to analytically derive steady states of deterministic models and stationary distributions of stochastic models. TOWARDZ provides an effective tool to analyze biochemical systems.
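The two topological conditions named in the abstract can be checked directly from their textbook definitions. The sketch below assumes the standard definition of deficiency (number of complexes, minus number of linkage classes, minus rank of the stoichiometric matrix) and calls a network weakly reversible when every reaction lies on a directed cycle. It is not the TOWARDZ algorithm and performs no network translation; the function names and the two example networks are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of an integer matrix via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def reachable(start, edges):
    """Set of complexes reachable from `start` along directed edges."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen


def analyze(reactions):
    """reactions: list of (source, product); each complex is a tuple of species counts.

    Returns (deficiency, weakly_reversible).
    """
    complexes = {c for rxn in reactions for c in rxn}
    # Linkage classes are connected components of the undirected reaction graph.
    undirected = reactions + [(b, a) for a, b in reactions]
    classes = {frozenset(reachable(c, undirected)) for c in complexes}
    # Rank of the reaction vectors (product minus source).
    s = rank([[p - q for p, q in zip(prod, src)] for src, prod in reactions])
    deficiency = len(complexes) - len(classes) - s
    # Weakly reversible: each reaction's product can reach its source again.
    weakly_reversible = all(src in reachable(prod, reactions) for src, prod in reactions)
    return deficiency, weakly_reversible


# Cycle A -> B -> C -> A over species (A, B, C).
A, B, C = (1, 0, 0), (0, 1, 0), (0, 0, 1)
cycle_result = analyze([(A, B), (B, C), (C, A)])

# A + B -> 2B and B -> A over species (A, B).
other_result = analyze([((1, 1), (0, 2)), ((0, 1), (1, 0))])
```

The cycle has three complexes, one linkage class, and rank 2, so its deficiency is zero and it is weakly reversible. The second network has four complexes, two linkage classes, and rank 1, giving deficiency one with no directed cycles.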
Submitted 2 December, 2022;
originally announced December 2022.
-
Free energy landscape of two-state protein Acylphosphatase with large contact order revealed by force-dependent folding and unfolding dynamics
Authors:
Xuening Ma,
Hao Sun,
Haiyan Hong,
Zilong Guo,
Huanhuan Su,
Hu Chen
Abstract:
Acylphosphatase (AcP) is a small protein with 98 amino acid residues that catalyzes the hydrolysis of carboxyl-phosphate bonds. AcP is a typical two-state protein with a slow folding rate due to the relatively large contact order of its native structure. The mechanical properties and unfolding behavior of AcP have been studied by atomic force microscopy, but its folding and unfolding dynamics at low forces have not been reported. Here, using stable magnetic tweezers, we measured the force-dependent folding rates within a force range from 1 pN to 3 pN, and unfolding rates from 15 pN to 40 pN. The obtained unfolding rates show different force sensitivities at forces below and above ~27 pN, which determines a free energy landscape with two energy barriers. Our results indicate that the free energy landscape of small globular proteins has a general Bactrian camel shape, and that a large contact order of the native state produces a high barrier that dominates at low forces.
Submitted 11 March, 2022;
originally announced March 2022.
-
OntoProtein: Protein Pretraining With Gene Ontology Embedding
Authors:
Ningyu Zhang,
Zhen Bi,
Xiaozhuan Liang,
Siyuan Cheng,
Haosen Hong,
Shumin Deng,
Jiazhang Lian,
Qiang Zhang,
Huajun Chen
Abstract:
Self-supervised protein language models have proved their effectiveness in learning protein representations. With increasing computational power, current protein language models pre-trained on millions of diverse sequences can scale from millions to billions of parameters and achieve remarkable improvements. However, these prevailing approaches rarely consider incorporating knowledge graphs (KGs), which can provide rich structured knowledge facts for better protein representations. We argue that informative biological knowledge in KGs can enhance protein representation with external knowledge. In this work, we propose OntoProtein, the first general framework that incorporates the structure of GO (Gene Ontology) into protein pre-training models. We construct a novel large-scale knowledge graph that consists of GO and its related proteins, in which gene annotation texts or protein sequences describe all nodes. We propose novel contrastive learning with knowledge-aware negative sampling to jointly optimize the knowledge graph and protein embedding during pre-training. Experimental results show that OntoProtein can surpass state-of-the-art methods with pre-trained protein language models on the TAPE benchmark and yield better performance than baselines in protein-protein interaction and protein function prediction. Code and datasets are available at https://github.com/zjunlp/OntoProtein.
Submitted 3 June, 2022; v1 submitted 23 January, 2022;
originally announced January 2022.
-
Estimation of time-varying reproduction numbers underlying epidemiological processes: a new statistical tool for the COVID-19 pandemic
Authors:
Hyokyoung G. Hong,
Yi Li
Abstract:
The coronavirus pandemic has rapidly evolved into an unprecedented crisis. The susceptible-infectious-removed (SIR) model and its variants have been used for modeling the pandemic. However, time-independent parameters in the classical models may not capture the dynamic transmission and removal processes, governed by virus containment strategies taken at various phases of the epidemic. Moreover, very few models account for possible inaccuracies of the reported cases. We propose a Poisson model with time-dependent transmission and removal rates to account for possible random errors in reporting and estimate a time-dependent disease reproduction number, which may be used to assess the effectiveness of virus control strategies. We apply our method to study the pandemic in several severely impacted countries, and analyze and forecast the evolving spread of the coronavirus. We have developed an interactive web application to facilitate readers' use of our method.
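A time-dependent reproduction number can be illustrated with a deterministic discrete-time SIR recursion. This is only a sketch under stated assumptions: it takes R(t) = beta(t) / gamma(t), the usual SIR convention, since the abstract does not give the authors' definition; it omits their Poisson reporting model; and the rate functions `beta` and `gamma` are hypothetical shapes chosen for illustration, not estimates from data.

```python
import math


def beta(t):
    """Hypothetical transmission rate that decays as containment measures take hold."""
    return 0.1 + 0.4 * math.exp(-0.05 * t)


def gamma(t):
    """Hypothetical removal rate, held constant here for simplicity."""
    return 0.1


def simulate_sir(days, s0=0.999, i0=0.001, r0=0.0):
    """Daily SIR recursion on population fractions with time-dependent rates.

    Returns a list of (day, S, I, R, reproduction_number) tuples, where each
    tuple records the state at the start of that day.
    """
    s, i, r = s0, i0, r0
    path = []
    for t in range(days):
        path.append((t, s, i, r, beta(t) / gamma(t)))
        new_infections = beta(t) * s * i
        new_removals = gamma(t) * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
    return path


path = simulate_sir(120)
```

In this toy setting the reproduction number starts at 5 and falls toward 1 as the assumed transmission rate decays, which is the kind of trajectory one would inspect when judging whether control measures are working.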
Submitted 13 July, 2020; v1 submitted 12 April, 2020;
originally announced April 2020.
-
Brain-Like Object Recognition with High-Performing Shallow Recurrent ANNs
Authors:
Jonas Kubilius,
Martin Schrimpf,
Kohitij Kar,
Ha Hong,
Najib J. Majaj,
Rishi Rajalingham,
Elias B. Issa,
Pouya Bashivan,
Jonathan Prescott-Roy,
Kailyn Schmidt,
Aran Nayebi,
Daniel Bear,
Daniel L. K. Yamins,
James J. DiCarlo
Abstract:
Deep convolutional artificial neural networks (ANNs) are the leading class of candidate models of the mechanisms of visual processing in the primate ventral stream. While initially inspired by brain anatomy, over the past years, these ANNs have evolved from a simple eight-layer architecture in AlexNet to extremely deep and branching architectures, demonstrating increasingly better object categorization performance, yet bringing into question how brain-like they still are. In particular, typical deep models from the machine learning community are often hard to map onto the brain's anatomy due to their vast number of layers and missing biologically-important connections, such as recurrence. Here we demonstrate that better anatomical alignment to the brain and high performance on machine learning as well as neuroscience measures do not have to be in contradiction. We developed CORnet-S, a shallow ANN with four anatomically mapped areas and recurrent connectivity, guided by Brain-Score, a new large-scale composite of neural and behavioral benchmarks for quantifying the functional fidelity of models of the primate ventral visual stream. Despite being significantly shallower than most models, CORnet-S is the top model on Brain-Score and outperforms similarly compact models on ImageNet. Moreover, our extensive analyses of CORnet-S circuitry variants reveal that recurrence is the main predictive factor of both Brain-Score and ImageNet top-1 performance. Finally, we report that the temporal evolution of the CORnet-S "IT" neural population resembles the actual monkey IT population dynamics. Taken together, these results establish CORnet-S, a compact, recurrent ANN, as the current best model of the primate ventral visual stream.
Submitted 28 October, 2019; v1 submitted 13 September, 2019;
originally announced September 2019.
-
SIAN: software for structural identifiability analysis of ODE models
Authors:
Hoon Hong,
Alexey Ovchinnikov,
Gleb Pogudin,
Chee Yap
Abstract:
Biological processes are often modeled by ordinary differential equations with unknown parameters. The unknown parameters are usually estimated from experimental data. In some cases, due to the structure of the model, this estimation problem does not have a unique solution even in the case of continuous noise-free data. It is therefore desirable to check the uniqueness a priori before carrying out actual experiments. We present a new software package, SIAN (Structural Identifiability ANalyser), that does this. Our software can tackle problems that could not be tackled by previously developed packages.
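A toy case shows what a non-unique estimation problem looks like. The model below is a hypothetical example, not one from the paper, and the snippet does not use SIAN: for x' = -(a + b) * x with output y = x, only the sum a + b affects the output, so a and b cannot be recovered individually even from noise-free data.

```python
import math


def output(a, b, t, x0=1.0):
    """Observed output y(t) = x(t) for x' = -(a + b) * x with x(0) = x0."""
    return x0 * math.exp(-(a + b) * t)


times = [0.0, 0.5, 1.0, 2.0]

# Two different parameter pairs with the same sum a + b = 3.
y_first = [output(1.0, 2.0, t) for t in times]
y_second = [output(2.0, 1.0, t) for t in times]

# A pair with a different sum, for contrast.
y_third = [output(1.0, 1.0, t) for t in times]
```

The first two output series coincide at every time point, so no amount of data distinguishes the two parameter pairs, whereas the third differs because the sum a + b changed.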
Submitted 25 December, 2018;
originally announced December 2018.
-
Assessing Technical Performance in Differential Gene Expression Experiments with External Spike-in RNA Control Ratio Mixtures
Authors:
Sarah A. Munro,
Steve P. Lund,
P. Scott Pine,
Hans Binder,
Djork-Arné Clevert,
Ana Conesa,
Joaquin Dopazo,
Mario Fasold,
Sepp Hochreiter,
Huixiao Hong,
Nederah Jafari,
David P. Kreil,
Paweł P. Łabaj,
Sheng Li,
Yang Liao,
Simon Lin,
Joseph Meehan,
Christopher E. Mason,
Javier Santoyo,
Robert A. Setterquist,
Leming Shi,
Wei Shi,
Gordon K. Smyth,
Nancy Stralis-Pavese,
Zhenqiang Su
, et al. (8 additional authors not shown)
Abstract:
There is a critical need for standard approaches to assess, report, and compare the technical performance of genome-scale differential gene expression experiments. We assess technical performance with a proposed "standard" dashboard of metrics derived from analysis of external spike-in RNA control ratio mixtures. These control ratio mixtures with defined abundance ratios enable assessment of diagnostic performance of differentially expressed transcript lists, limit of detection of ratio (LODR) estimates, and expression ratio variability and measurement bias. The performance metrics suite is applicable to analysis of a typical experiment, and here we also apply these metrics to evaluate technical performance among laboratories. An interlaboratory study using identical samples shared amongst 12 laboratories with three different measurement processes demonstrated generally consistent diagnostic power across 11 laboratories. Ratio measurement variability and bias were also comparable amongst laboratories for the same measurement process. Different biases were observed for measurement processes using different mRNA enrichment protocols.
Submitted 18 June, 2014;
originally announced June 2014.
-
Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
Authors:
Charles F. Cadieu,
Ha Hong,
Daniel L. K. Yamins,
Nicolas Pinto,
Diego Ardila,
Ethan A. Solomon,
Najib J. Majaj,
James J. DiCarlo
Abstract:
The primate visual system achieves remarkable visual object recognition performance even in brief presentations and under changes to object exemplar, geometric transformations, and background variation (a.k.a. core visual object recognition). This remarkable performance is mediated by the representation formed in inferior temporal (IT) cortex. In parallel, recent advances in machine learning have led to ever higher performing models of object recognition using artificial deep neural networks (DNNs). It remains unclear, however, whether the representational performance of DNNs rivals that of the brain. To accurately produce such a comparison, a major difficulty has been a unifying metric that accounts for experimental limitations such as the amount of noise, the number of neural recording sites, and the number of trials, and computational limitations such as the complexity of the decoding classifier and the number of classifier training examples. In this work we perform a direct comparison that corrects for these experimental limitations and computational considerations. As part of our methodology, we propose an extension of "kernel analysis" that measures the generalization accuracy as a function of representational complexity. Our evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task. Furthermore, we show that models that perform well on measures of representational performance also perform well on measures of representational similarity to IT and on measures of predicting individual IT multi-unit responses. Whether these DNNs rely on computational mechanisms similar to the primate visual system is yet to be determined, but, unlike all previous bio-inspired models, that possibility cannot be ruled out merely on representational performance grounds.
Submitted 12 June, 2014;
originally announced June 2014.
-
Stable and flexible system for glucose homeostasis
Authors:
Hyunsuk Hong,
Junghyo Jo,
Sang-Jin Sin
Abstract:
Pancreatic islets, which control glucose homeostasis, consist of α, β, and δ cells. It has been observed that α and β cells generate out-of-phase synchronization in the release of glucagon and insulin, counter-regulatory hormones for increasing and decreasing glucose levels, while β and δ cells produce in-phase synchronization in the release of insulin and somatostatin. Individual interactions between the islet cells have been observed for a long time, although their physiological role as a whole has not yet been explored. We model the synchronized hormone pulses of islets with coupled phase oscillators that incorporate the observed cellular interactions. The integrated model shows that the interaction from β to δ cells, whose sign is the subject of conflicting reports, should be positive to reproduce the in-phase synchronization between β and δ cells. The model also suggests that δ cells help the islet system respond flexibly to changes in the glucose environment.
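How the sign of a coupling selects in-phase or out-of-phase locking can be seen in a generic pair of Kuramoto-type phase oscillators. This is a minimal sketch, not the authors' three-cell islet model: the symmetric coupling, identical natural frequencies, forward-Euler integration, and the name `final_phase_lag` are all assumptions made for illustration.

```python
import math


def final_phase_lag(coupling, phi0=1.0, omega=1.0, dt=0.01, steps=5000):
    """Integrate two symmetrically coupled phase oscillators with forward Euler.

    d(theta_a)/dt = omega + coupling * sin(theta_b - theta_a)
    d(theta_b)/dt = omega + coupling * sin(theta_a - theta_b)

    Returns the final phase lag wrapped into [0, pi].
    """
    theta_a, theta_b = 0.0, phi0
    for _ in range(steps):
        rate_a = omega + coupling * math.sin(theta_b - theta_a)
        rate_b = omega + coupling * math.sin(theta_a - theta_b)
        theta_a += dt * rate_a
        theta_b += dt * rate_b
    diff = theta_b - theta_a
    # atan2 wraps the difference into (-pi, pi]; abs folds it into [0, pi].
    return abs(math.atan2(math.sin(diff), math.cos(diff)))


lag_positive = final_phase_lag(coupling=1.0)   # attractive coupling
lag_negative = final_phase_lag(coupling=-1.0)  # repulsive coupling
```

Subtracting the two equations gives d(phi)/dt = -2 * coupling * sin(phi) for the phase difference phi, so positive coupling drives the pair toward a lag of 0 (in-phase) and negative coupling toward a lag of pi (out-of-phase). In this simplified setting that matches the abstract's observation that a positive interaction is needed for in-phase synchronization.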
Submitted 5 September, 2013;
originally announced October 2013.
-
The Neural Representation Benchmark and its Evaluation on Brain and Machine
Authors:
Charles F. Cadieu,
Ha Hong,
Dan Yamins,
Nicolas Pinto,
Najib J. Majaj,
James J. DiCarlo
Abstract:
A key requirement for the development of effective learning representations is their evaluation and comparison to representations we know to be effective. In natural sensory domains, the community has viewed the brain as a source of inspiration and as an implicit benchmark for success. However, it has not been possible to test representational learning algorithms directly against the representations contained in neural systems. Here, we propose a new benchmark for visual representations on which we have directly tested the neural representation in multiple visual cortical areas in macaque (utilizing data from [Majaj et al., 2012]), and on which any computer vision algorithm that produces a feature space can be tested. The benchmark measures the effectiveness of the neural or machine representation by computing the classification loss on the ordered eigendecomposition of a kernel matrix [Montavon et al., 2011]. In our analysis we find that the neural representation in visual area IT is superior to visual area V4. In our analysis of representational learning algorithms, we find that three-layer models approach the representational performance of V4 and the algorithm in [Le et al., 2012] surpasses the performance of V4. Impressively, we find that a recent supervised algorithm [Krizhevsky et al., 2012] achieves performance comparable to that of IT for an intermediate level of image variation difficulty, and surpasses IT at a higher difficulty level. We believe this result represents a major milestone: it is the first learning algorithm we have found that exceeds our current estimate of IT representation performance. We hope that this benchmark will assist the community in matching the representational performance of visual cortex and will serve as an initial rallying point for further correspondence between representations derived in brains and machines.
Submitted 25 January, 2013; v1 submitted 15 January, 2013;
originally announced January 2013.