-
Postharvest litchi (Litchi chinensis Sonn.) quality preservation by alginate oligosaccharides
Authors:
Jianlie Shen,
Shulin Wan,
Haidong Tan
Abstract:
This study investigates the efficacy of alginate oligosaccharides, derived from a novel alginate lyase expressed in E. coli (Pet21a-alginate lyase), in preserving the postharvest quality of litchi (Litchi chinensis Sonn.) fruits. The alginate lyase, characterized by Huang et al. (2013), was employed to produce AOS through enzymatic degradation of alginate. The resulting oligosaccharides were applied to litchi fruits harvested from Guangzhou Zengcheng to evaluate their impact on various quality parameters under controlled storage conditions. The study focused on measuring the effects of alginate oligosaccharide treatment on the fruits' color retention, water loss rate, hardness, and susceptibility to mold infection, under a set relative humidity and temperature. Results demonstrated significant improvements in the treated fruits, with enhanced color retention, reduced water loss, maintained hardness, and lower rates of mold infection compared to untreated controls. These findings suggest that AOS offer a promising natural alternative for extending the shelf life and maintaining the quality of litchi fruits postharvest.
Submitted 2 March, 2024;
originally announced March 2024.
-
Machine Learning Algorithms for Predicting in-Hospital Mortality in Patients with ST-Segment Elevation Myocardial Infar
Authors:
Ding Tao,
Chen Liu,
Shihan Wan
Abstract:
Acute myocardial infarction (AMI) is one of the most severe manifestations of coronary artery disease. ST-segment elevation myocardial infarction (STEMI) is the most serious type of AMI. We proposed to develop a machine learning algorithm based on the home page of the electronic medical record (HPEMR) for predicting in-hospital mortality of patients with STEMI in the early stage. Methods: This observational study used clinical information collected between 2013 and 2017 from 7 tertiary hospitals in Shenzhen, China. The patients' STEMI data were used to train 4 different machine learning algorithms to predict in-hospital mortality among patients with STEMI: Logistic Regression, Support Vector Machine, Gradient Boosting Decision Tree, and Artificial Neural Network. Results: A total of 5865 patients with STEMI were enrolled in our study. The model was developed by considering 3 types of variables based on the HPEMR: demographic data, diagnosis and comorbidities, and hospitalization information. The association of selected features with in-hospital mortality was assessed using univariate logistic regression. Specifically, among the comorbidities, atrial fibrillation (OR: 11.0; 95% CI: 5.64 - 20.2), acute renal failure (OR: 9.75; 95% CI: 3.81 - 25.0), type 2 diabetic nephropathy (OR: 5.45; 95% CI: 1.57 - 19.0), acute heart failure (OR: 6.05; 95% CI: 1.99 - 14.9), and cardiac function grade IV (OR: 28.6; 95% CI: 20.6 - 39.6) were found to be associated with high odds of death. Within the test dataset, our model showed good discrimination ability as measured by the area under the receiver operating characteristic curve (AUC: 0.879; 95% CI: 0.825 - 0.933).
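For readers unfamiliar with how an odds ratio and its 95% CI arise from a univariate analysis of a binary comorbidity, the following is a minimal sketch. For a single yes/no predictor, the odds ratio from univariate logistic regression equals the cross-product ratio of the 2x2 table, and a Wald interval can be formed on the log scale. The function name `odds_ratio_ci` and the counts in the usage line are hypothetical and purely illustrative; they are not taken from the study, and this is not the authors' code.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: comorbidity present, died      b: comorbidity present, survived
    c: comorbidity absent, died       d: comorbidity absent, survived
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from the study)
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR: {or_:.2f}; 95% CI: {lo:.2f} - {hi:.2f}")
```

An interval whose lower bound exceeds 1, as in every comorbidity reported above, indicates an association with higher odds of death at the 5% level.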
Submitted 25 November, 2022;
originally announced November 2022.
-
Comparing regional and provincial-wide COVID-19 models with physical distancing in British Columbia
Authors:
Geoffrey McGregor,
Jennifer Tippett,
Andy T. S. Wan,
Mengxiao Wang,
Samuel W. K. Wong
Abstract:
We study the effects of physical distancing measures on the spread of COVID-19 in regional areas within British Columbia, using the reported cases of the five provincial Health Authorities. Building on the Bayesian epidemiological model of Anderson et al. (2020), we propose a hierarchical regional Bayesian model with time-varying regional parameters from March to December 2020. In the absence of COVID-19 variants and vaccinations during this period, we examine the regionalized basic reproduction number, modelled prevalence, relative reduction in contact due to physical distancing, and proportion of anticipated cases that have been tested and reported. We observe significant differences between the regional and provincial-wide models and demonstrate that the hierarchical regional model can better estimate regional prevalence, especially in rural regions. These results indicate that it can be useful to apply similar regional models to other parts of Canada or other countries.
Submitted 13 November, 2021; v1 submitted 22 April, 2021;
originally announced April 2021.
-
Pandemic Drugs at Pandemic Speed: Infrastructure for Accelerating COVID-19 Drug Discovery with Hybrid Machine Learning- and Physics-based Simulations on High Performance Computers
Authors:
Agastya P. Bhati,
Shunzhou Wan,
Dario Alfè,
Austin R. Clyde,
Mathis Bode,
Li Tan,
Mikhail Titov,
Andre Merzky,
Matteo Turilli,
Shantenu Jha,
Roger R. Highfield,
Walter Rocchia,
Nicola Scafuri,
Sauro Succi,
Dieter Kranzlmüller,
Gerald Mathias,
David Wifling,
Yann Donon,
Alberto Di Meglio,
Sofia Vallecorsa,
Heng Ma,
Anda Trifan,
Arvind Ramanathan,
Tom Brettin,
Alexander Partin
, et al. (4 additional authors not shown)
Abstract:
The race to meet the challenges of the global pandemic has served as a reminder that the existing drug discovery process is expensive, inefficient and slow. There is a major bottleneck in screening the vast number of potential small molecules to shortlist lead compounds for antiviral drug development. New opportunities to accelerate drug discovery lie at the interface between machine learning methods, in this case developed for linear accelerators, and physics-based methods. The two in silico methods each have their own advantages and limitations which, interestingly, complement each other. Here, we present an innovative infrastructural development that combines both approaches to accelerate drug discovery. The scale of the potential resulting workflow is such that it is dependent on supercomputing to achieve extremely high throughput. We have demonstrated the viability of this workflow for the study of inhibitors for four COVID-19 target proteins and our ability to perform the required large-scale calculations to identify lead antiviral compounds through repurposing on a variety of supercomputers.
Submitted 4 September, 2021; v1 submitted 4 March, 2021;
originally announced March 2021.
-
IMPECCABLE: Integrated Modeling PipelinE for COVID Cure by Assessing Better LEads
Authors:
Aymen Al Saadi,
Dario Alfe,
Yadu Babuji,
Agastya Bhati,
Ben Blaiszik,
Thomas Brettin,
Kyle Chard,
Ryan Chard,
Peter Coveney,
Anda Trifan,
Alex Brace,
Austin Clyde,
Ian Foster,
Tom Gibbs,
Shantenu Jha,
Kristopher Keipert,
Thorsten Kurth,
Dieter Kranzlmüller,
Hyungro Lee,
Zhuozhao Li,
Heng Ma,
Andre Merzky,
Gerald Mathias,
Alexander Partin,
Junqi Yin
, et al. (11 additional authors not shown)
Abstract:
The drug discovery process currently employed in the pharmaceutical industry typically requires about 10 years and $2-3 billion to deliver one new drug. This is both too expensive and too slow, especially in emergencies like the COVID-19 pandemic. In silico methodologies need to be improved to better select lead compounds that can proceed to later stages of the drug discovery protocol, accelerating the entire process. No single methodological approach can achieve the necessary accuracy with the required efficiency. Here we describe multiple algorithmic innovations to overcome this fundamental limitation, along with the development and deployment of computational infrastructure at scale that integrates multiple artificial intelligence and simulation-based approaches. Three measures of performance are: (i) throughput, the number of ligands per unit time; (ii) scientific performance, the number of effective ligands sampled per unit time; and (iii) peak performance, in flop/s. The capabilities outlined here have been used in production for several months as the workhorse of the computational infrastructure to support the capabilities of the US-DOE National Virtual Biotechnology Laboratory in combination with resources from the EU Centre of Excellence in Computational Biomedicine.
Submitted 13 October, 2020;
originally announced October 2020.
-
HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing
Authors:
Shixiang Wan,
Quan Zou
Abstract:
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing data has resulted in a shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. After comparing with most available state-of-the-art methods, our experimental results indicate the following: 1) HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large biological sequences; 2) HAlign-II shows extremely high memory efficiency and scales well with increases in computing resources; 3) HAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II, with open-source code and datasets, is available at http://lab.malab.cn/soft/halign.
Submitted 4 April, 2017;
originally announced April 2017.
-
Pretata: predicting TATA binding proteins with novel features and dimensionality reduction strategy
Authors:
Quan Zou,
Shixiang Wan,
Ying Ju,
Jijun Tang,
Xiangxiang Zeng
Abstract:
Background: It is essential to discover protein function from novel primary sequences. Wet lab experimental procedures are not only time-consuming but also costly, so predicting protein structure and function reliably based only on amino acid sequence has significant value. TATA-binding protein (TBP) is a kind of DNA-binding protein that plays a key role in transcription regulation. Our study proposed an automatic approach for identifying TATA-binding proteins efficiently, accurately, and conveniently. This method could guide the identification of other special proteins with computational intelligence strategies. Results: First, we proposed novel fingerprint features for TBP based on pseudo amino acid composition, physicochemical properties, and secondary structure. Second, hierarchical feature dimensionality reduction strategies were employed to further improve performance. Currently, Pretata achieves 92.92% TATA-binding protein prediction accuracy, which is better than all other existing methods. Conclusions: The experiments demonstrate that our method could greatly improve prediction accuracy and speed, thus making large-scale NGS data prediction practical. A web server has been developed for other researchers and can be accessed at http://server.malab.cn/preTata/.
Submitted 7 March, 2017;
originally announced March 2017.