Enantiospecificity in NMR Enabled by Chirality-Induced Spin Selectivity
Authors:
T. Georgiou,
J. L. Palma,
V. Mujica,
S. Varela,
M. Galante,
V. Santamaría García,
L. Mboning,
R. N. Schwartz,
G. Cuniberti,
L.-S. Bouchard
Abstract:
Spin polarization in chiral molecules is a magnetic molecular response associated with electron transport and enantioselective bond polarization that occurs even in the absence of an external magnetic field. Santos and co-workers unexpectedly reported enantiospecific NMR responses in solid-state cross-polarization (CP) experiments, suggesting a possible additional contribution to the indirect nuclear spin-spin coupling in chiral molecules, induced by bond polarization in the presence of spin-orbit coupling. Herein we provide a theoretical treatment of this phenomenon, presenting an effective spin Hamiltonian for helical molecules such as DNA and density functional theory (DFT) results on amino acids that confirm the dependence of J-couplings on the choice of enantiomer. The connection between nuclear spin dynamics and chirality could offer insights for molecular sensing and quantum information sciences. These results establish NMR as a potential tool for chiral discrimination without external agents.
Submitted 2 July, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
Embed-Search-Align: DNA Sequence Alignment using Transformer Models
Authors:
Pavan Holur,
K. C. Enevoldsen,
Shreyas Rajesh,
Lajoyce Mboning,
Thalia Georgiou,
Louis-S. Bouchard,
Matteo Pellegrini,
Vwani Roychowdhury
Abstract:
DNA sequence alignment involves assigning short DNA reads to the most probable locations on an extensive reference genome. This process is crucial for various genomic analyses, including variant calling, transcriptomics, and epigenomics. Conventional methods, refined over decades, tackle this challenge in two steps: genome indexing followed by efficient search to locate likely positions for given reads. Building on the success of Large Language Models (LLMs) in encoding text into embeddings, where the distance metric captures semantic similarity, recent efforts have explored whether the same Transformer architecture can produce numerical representations for DNA sequences. Such models have shown early promise in tasks involving classification of short DNA sequences, such as the detection of coding vs non-coding regions, as well as the identification of enhancer and promoter sequences. Performance on sequence classification tasks does not, however, translate to sequence alignment, where it is necessary to conduct a genome-wide search to successfully align every read. We address this open problem by framing it as an Embed-Search-Align task. In this framework, a novel encoder model, DNA-ESA, generates representations of reads and fragments of the reference, which are projected into a shared vector space where the read-fragment distance is used as a surrogate for alignment. In particular, DNA-ESA introduces: (1) a contrastive loss for self-supervised training of DNA sequence representations, facilitating rich sequence-level embeddings, and (2) a DNA vector store to enable search across fragments on a global scale. DNA-ESA is >97% accurate when aligning 250-length reads onto a human reference genome of 3 gigabases (single-haploid), far exceeds the performance of 6 recent DNA-Transformer model baselines, and shows task transfer across chromosomes and species.
Submitted 23 April, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
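The Embed-Search-Align framing in the abstract above can be illustrated with a toy sketch. This is not DNA-ESA: the learned Transformer encoder is replaced here by a normalised k-mer count vector, the vector store by a plain Python list, and the final alignment by a brute-force mismatch count. The names `embed`, `build_store`, `search`, and `align`, and the parameters `K`, `frag_len`, and `stride`, are all hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter

K = 6  # k-mer size of the toy embedding (assumption, not from the paper)


def embed(seq):
    """Toy stand-in for an encoder: sparse, L2-normalised k-mer counts."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {kmer: c / norm for kmer, c in counts.items()}


def cosine(u, v):
    """Dot product of two sparse unit vectors, i.e. their cosine similarity."""
    if len(u) > len(v):
        u, v = v, u
    return sum(w * v.get(kmer, 0.0) for kmer, w in u.items())


def build_store(reference, frag_len, stride):
    """Embed overlapping reference fragments as (start, vector) pairs."""
    starts = range(0, len(reference) - frag_len + 1, stride)
    return [(s, embed(reference[s:s + frag_len])) for s in starts]


def search(store, read, top_k=3):
    """Return start positions of the top_k fragments most similar to the read."""
    query = embed(read)
    scored = sorted(((cosine(query, vec), s) for s, vec in store), reverse=True)
    return [s for _, s in scored[:top_k]]


def align(reference, read, candidates, frag_len):
    """Slide the read inside each candidate fragment; keep the fewest mismatches."""
    best = None
    for s in candidates:
        end = min(s + frag_len, len(reference)) - len(read) + 1
        for off in range(s, end):
            window = reference[off:off + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if best is None or mismatches < best[1]:
                best = (off, mismatches)
    return best  # (offset in reference, mismatch count)


# Demo on a synthetic reference with one substituted base in the read.
random.seed(0)
reference = "".join(random.choice("ACGT") for _ in range(2000))
read = list(reference[500:550])
read[25] = "A" if read[25] != "A" else "C"  # guaranteed substitution
read = "".join(read)

store = build_store(reference, frag_len=100, stride=50)
candidates = search(store, read)
result = align(reference, read, candidates, frag_len=100)
print(candidates, result)
```

The fragment stride is set to half the fragment length so that any read shorter than the stride falls entirely inside at least one fragment. A real system would replace the list scan in `search` with an approximate nearest-neighbour index, which is presumably the role of the vector store mentioned in the abstract.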