Search | arXiv e-print repository

On the Robustness of AlphaFold: A COVID-19 Case Study

Authors: Ismail Alkhouri, Sumit Jha, Andre Beckus, George Atia, Alvaro Velasquez, Rickard Ewetz, Arvind Ramanathan, Susmit Jha

Abstract: Protein folding neural networks (PFNNs) such as AlphaFold predict remarkably accurate structures of proteins compared to other approaches. However, the robustness of such networks has heretofore not been explored. This is particularly relevant given the broad social implications of such technologies and the fact that biologically small perturbations in the protein sequence do not generally lead to drastic changes in the protein structure. In this paper, we demonstrate that AlphaFold does not exhibit such robustness despite its high accuracy. This raises the challenge of detecting and quantifying the extent to which these predicted protein structures can be trusted. To measure the robustness of the predicted structures, we utilize (i) the root-mean-square deviation (RMSD) and (ii) the Global Distance Test (GDT) similarity measure between the predicted structure of the original sequence and the structure of its adversarially perturbed version. We prove that the problem of minimally perturbing protein sequences to fool protein folding neural networks is NP-complete. Based on the well-established BLOSUM62 sequence alignment scoring matrix, we generate adversarial protein sequences and show that the RMSD between the predicted protein structure and the structure of the original sequence is very large when the adversarial changes are bounded by (i) 20 units in the BLOSUM62 distance, and (ii) five residues (out of hundreds or thousands of residues) in the given protein sequence. In our experimental evaluation, we consider 111 COVID-19 proteins in the Universal Protein Resource (UniProt), a central resource for protein data managed by the European Bioinformatics Institute, Swiss Institute of Bioinformatics, and the US Protein Information Resource. These result in an overall GDT similarity test score average of around 34%, demonstrating a substantial drop in the performance of AlphaFold.

Submitted 12 January, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

Comments: arXiv admin note: text overlap with arXiv:2109.04460
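
The two robustness measures named in the abstract above, RMSD and GDT, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes both structures are given as equal-length lists of (x, y, z) coordinates in ångströms that are already superimposed, and it uses the common 1, 2, 4, and 8 Å cutoffs for a simplified GDT_TS-style score. The full GDT procedure additionally searches over superpositions, which is omitted here. The function names `rmsd` and `gdt_ts` and the sample coordinates are hypothetical.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two pre-aligned coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))

def gdt_ts(a, b, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score in percent.

    Averages, over the distance cutoffs, the fraction of residues whose
    positions differ by at most that cutoff. Assumes a fixed superposition
    (an assumption of this sketch, not part of the full GDT procedure).
    """
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    dists = [math.dist(p, q) for p, q in zip(a, b)]
    fractions = [sum(d <= c for d in dists) / len(dists) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# Hypothetical toy coordinates: two residues unchanged, two displaced.
original = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
perturbed = [(0, 0, 0), (1, 0, 0), (2, 3, 0), (3, 0, 10)]

print(rmsd(original, perturbed))
print(gdt_ts(original, perturbed))
```

A larger RMSD and a lower GDT score both indicate that the predicted structure of the perturbed sequence has moved further from that of the original sequence.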

arXiv:1608.03464

doi 10.1109/TBME.2016.2606595

A New Statistical Model of Electroencephalogram Noise Spectra for Real-time Brain-Computer Interfaces

Authors: Alan Paris, George Atia, Azadeh Vosoughi, Stephen Berman

Abstract: $Objective$: A characteristic of neurological signal processing is high levels of noise from sub-cellular ion channels up to whole-brain processes. In this paper, we propose a new model of electroencephalogram (EEG) background periodograms, based on a family of functions which we call generalized van der Ziel--McWhorter (GVZM) power spectral densities (PSDs). To the best of our knowledge, the GVZM PSD function is the only EEG noise model which has relatively few parameters, matches recorded EEG PSDs with high accuracy from 0 Hz to over 30 Hz, and has approximately $1/f^\theta$ behavior in the mid-frequencies without infinities. $Methods$: We validate this model using three approaches. First, we show how GVZM PSDs can arise in populations of ion channels in maximum entropy equilibrium. Second, we present a class of mixed autoregressive models, which simulate brain background noise and whose periodograms are asymptotic to the GVZM PSD. Third, we present two real-time estimation algorithms for steady-state visual evoked potential (SSVEP) frequencies, and analyze their performance statistically. $Results$: In pairwise comparisons, the GVZM-based algorithms showed statistically significant accuracy improvement over two well-known and widely used SSVEP estimators. $Conclusion$: The GVZM noise model can be a useful and reliable technique for EEG signal processing. $Significance$: Understanding EEG noise is essential for EEG-based neurology and applications such as real-time brain-computer interfaces (BCIs), which must make accurate control decisions from very short data epochs. The GVZM approach represents a successful new paradigm for understanding and managing this neurological noise.

Submitted 24 July, 2016; originally announced August 2016.

Comments: Revised submission to IEEE EMBS Trans. Biomed. Eng. 12 pages, 9 figures

arXiv:1511.00057

Formalized Quantum Stochastic Processes and Hidden Quantum Models with Applications to Neuron Ion Channel Kinetics

Authors: Alan Paris, George Atia, Azadeh Vosoughi, Stephen Berman

Abstract: A new class of formal latent-variable stochastic processes called hidden quantum models (HQM's) is defined in order to clarify the theoretical foundations of ion channel signal processing. HQM's are based on quantum stochastic processes which formalize time-dependent observation. They allow the calculation of autocovariance functions which are essential for frequency-domain signal processing. HQM's based on a particular type of observation protocol called independent activated measurements are shown to be distributionally equivalent to hidden Markov models yet without an underlying physical Markov process. Since the formal Markov processes are non-physical, the theory of activated measurement allows merging energy-based Eyring rate theories of ion channel behavior with the more common phenomenological Markov kinetic schemes to form energy-modulated quantum channels. Using the simplest quantum channel model consistent with neuronal membrane voltage-clamp experiments, activation eigenenergies are calculated for the Hodgkin-Huxley K+ and Na+ ion channels. It is also shown that maximizing entropy under constrained activation energy yields noise spectral densities approximating $S(f) \sim 1/f^\alpha$, thus offering a biophysical explanation for the ubiquitous $1/f$-type noise in neurological signals.

Submitted 26 July, 2018; v1 submitted 30 October, 2015; originally announced November 2015.

Comments: Several proofs were found to be incomplete or in error including the proof that quantum rotations can induce arbitrary noise weights. A fully corrected version of this paper is published as: A. Paris, G. Atia, A. Vosoughi, and S. Berman, "Hidden quantum processes, quantum ion channels, and 1/f-type noise", Neural Computation, vol. 30, num. 7, pp. 1830-1929 (2018), doi:10.1162/neco_a_01067

Showing 1–3 of 3 results for author: Atia, G