Search | arXiv e-print repository

Spectroscopy-Guided Discovery of Three-Dimensional Structures of Disordered Materials with Diffusion Models

Authors: Hyuna Kwon, Tim Hsu, Wenyu Sun, Wonseok Jeong, Fikret Aydin, James Chapman, Xiao Chen, Matthew R. Carbone, Deyu Lu, Fei Zhou, Tuan Anh Pham

Abstract: The ability to rapidly develop materials with desired properties has a transformative impact on a broad range of emerging technologies. In this work, we introduce a new framework based on the diffusion model, a recent generative machine learning method to predict 3D structures of disordered materials from a target property. For demonstration, we apply the model to identify the atomic structures of amorphous carbons ($a$-C) as a representative material system from the target X-ray absorption near edge structure (XANES) spectra--a common experimental technique to probe atomic structures of materials. We show that conditional generation guided by XANES spectra reproduces key features of the target structures. Furthermore, we show that our model can steer the generative process to tailor atomic arrangements for a specific XANES spectrum. Finally, our generative model exhibits a remarkable scale-agnostic property, thereby enabling generation of realistic, large-scale structures through learning from a small-scale dataset (i.e., with small unit cells). Our work represents a significant stride in bridging the gap between materials characterization and atomic structure determination; in addition, it can be leveraged for materials discovery in exploring various material properties as targeted.

Submitted 9 December, 2023; originally announced December 2023.

arXiv:2310.01336 [pdf, other]

JugglePAC: A Pipelined Accumulation Circuit

Authors: Ahmad Houraniah, H. Fatih Ugurdag, Furkan Aydin

Abstract: Summing a set of numbers, namely, "Accumulation," is a subtask within many computational tasks. If the numbers to sum arrive non-stop in back-to-back clock cycles at high clock frequencies, summing them without allowing them to pile up can be quite a challenge, that is, when the latency of addition (i.e., summing two numbers) is longer than one clock cycle, which is always the case for floating-point numbers. This could also be the case for integer summations with high clock frequencies. In the case of floating-point numbers, this is handled by pipelining the adder, but that does not solve all problems. The challenges include optimization of speed, area, and latency, as well as the adaptability of the design to different application requirements, such as the ability to handle variable-size subsequent data sets with no time gap in between and with results produced in input order. All these factors make designing an efficient floating-point accumulator a non-trivial problem. Integer accumulation is a relatively simpler problem, where high frequencies can be achieved by using carry-save tree adders. This can then be further improved by efficient resource-sharing. In this paper, we present two fast and area-efficient accumulation circuits, JugglePAC and INTAC. JugglePAC is tailored for floating-point reduction operations (such as accumulation) and offers significant advantages with respect to the literature in terms of speed, area, and adaptability to various application requirements. INTAC is designed for fast integer accumulation. Using carry-save adders and resource-sharing, it can achieve very high clock frequencies while maintaining a low area complexity.

Submitted 2 October, 2023; originally announced October 2023.

Comments: 9 pages, 6 figures

arXiv:2207.06630 [pdf, other]

Identifying Orientation-specific Lipid-protein Fingerprints using Deep Learning

Authors: Fikret Aydin, Konstantia Georgouli, Gautham Dharuman, James N. Glosli, Felice C. Lightstone, Helgi I. Ingólfsson, Peer-Timo Bremer, Harsh Bhatia

Abstract: Improved understanding of the relation between the behavior of RAS and RAF proteins and the local lipid environment in the cell membrane is critical for getting insights into the mechanisms underlying cancer formation. In this work, we employ deep learning (DL) to learn this relationship by predicting protein orientational states of RAS and RAS-RAF protein complexes with respect to the lipid membrane based on the lipid densities around the protein domains from coarse-grained (CG) molecular dynamics (MD) simulations. Our DL model can predict six protein states with an overall accuracy of over 80%. The findings of this work offer new insights into how the proteins modulate the lipid environment, which in turn may assist in designing novel therapies to regulate such interactions in the mechanisms associated with cancer development.

Submitted 13 July, 2022; originally announced July 2022.

arXiv:2207.04333 [pdf, other]

Emerging Patterns in the Continuum Representation of Protein-Lipid Fingerprints

Authors: Konstantia Georgouli, Helgi I Ingólfsson, Fikret Aydin, Mark Heimann, Felice C Lightstone, Peer-Timo Bremer, Harsh Bhatia

Abstract: Capturing intricate biological phenomena often requires multiscale modeling where coarse and inexpensive models are developed using limited components of expensive and high-fidelity models. Here, we consider such a multiscale framework in the context of cancer biology and address the challenge of evaluating the descriptive capabilities of a continuum model developed using 1-dimensional statistics from a molecular dynamics model. Using deep learning, we develop a highly predictive classification model that identifies complex and emergent behavior from the continuum model. With over 99.9% accuracy demonstrated for two simulations, our approach confirms the existence of protein-specific "lipid fingerprints", i.e. spatial rearrangements of lipids in response to proteins of interest. Through this demonstration, our model also provides external validation of the continuum model, affirms the value of such multiscale modeling, and can foster new insights through further analysis of these fingerprints.

Submitted 9 July, 2022; originally announced July 2022.

arXiv:1902.08888 [pdf, other]

Medical Multimodal Classifiers Under Scarce Data Condition

Authors: Faik Aydin, Maggie Zhang, Michelle Ananda-Rajah, Gholamreza Haffari

Abstract: Data is one of the essential ingredients to power deep learning research. Small datasets, especially specific to medical institutes, bring challenges to the deep learning training stage. This work aims to develop a practical deep multimodal model that can classify patients into abnormal and normal categories accurately as well as assist radiologists to detect visual and textual anomalies by locating areas of interest. The detection of the anomalies is achieved through a novel technique which extends the integrated gradients methodology with an unsupervised clustering algorithm. This technique also introduces a tuning parameter which trades off true positive signals to denoise false positive signals in the detection process. To overcome the challenges of the small training dataset, which only has 3K frontal X-ray images and medical reports in pairs, we have adopted transfer learning for the multimodal model, which concatenates the layers of image and text submodels. The image submodel was trained on the vast ChestX-ray14 dataset, while the text submodel transferred a pretrained word embedding layer from a hospital-specific corpus. Experimental results show that our multimodal model improves the accuracy of the classification by 4% and 7% on average over 50 epochs, compared to the individual text and image model, respectively.

Submitted 23 February, 2019; originally announced February 2019.

Showing 1–5 of 5 results for author: Aydin, F