Search | arXiv e-print repository

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning

Authors: Guanrou Yang, Ziyang Ma, Zhisheng Zheng, Yakun Song, Zhikang Niu, Xie Chen

Abstract: Recent years have witnessed significant advancements in self-supervised learning (SSL) methods for speech-processing tasks. Various speech-based SSL models have been developed and present promising performance on a range of downstream tasks including speech recognition. However, existing speech-based SSL models face a common dilemma in terms of computational cost, which might hinder their potential application and in-depth academic research. To address this issue, we first analyze the computational cost of different modules during HuBERT pre-training and then introduce a stack of efficiency optimizations, which is named Fast-HuBERT in this paper. The proposed Fast-HuBERT can be trained in 1.1 days with 8 V100 GPUs on the Librispeech 960h benchmark, without performance degradation, resulting in a 5.2x speedup, compared to the original implementation. Moreover, we explore two well-studied techniques in the Fast-HuBERT and demonstrate consistent improvements as reported in previous work.

Submitted 29 September, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.13517 [pdf, ps, other]

Lifting Theorems Meet Information Complexity: Known and New Lower Bounds of Set-disjointness

Authors: Guangxu Yang, Jiapeng Zhang

Abstract: Set-disjointness problems are one of the most fundamental problems in communication complexity and have been extensively studied in past decades. Given its importance, many lower bound techniques were introduced to prove communication lower bounds of set-disjointness. Combining ideas from information complexity and query-to-communication lifting theorems, we introduce a density increment argument to prove communication lower bounds for set-disjointness: We give a simple proof showing that a large rectangle cannot be $0$-monochromatic for multi-party unique-disjointness. We interpret the direct-sum argument as a density increment process and give an alternative proof of randomized communication lower bounds for multi-party unique-disjointness. Avoiding full simulations in lifting theorems, we simplify and improve communication lower bounds for sparse unique-disjointness. Potential applications to be unified and improved by our density increment argument are also discussed.

Submitted 23 September, 2023; originally announced September 2023.

Comments: Working Paper

arXiv:2309.10836 [pdf, other]

CMRxRecon: An open cardiac MRI dataset for the competition of accelerated image reconstruction

Authors: Chengyan Wang, Jun Lyu, Shuo Wang, Chen Qin, Kunyuan Guo, Xinyu Zhang, Xiaotong Yu, Yan Li, Fanwen Wang, Jianhua Jin, Zhang Shi, Ziqiang Xu, Yapeng Tian, Sha Hua, Zhensen Chen, Meng Liu, Mengting Sun, Xutong Kuang, Kang Wang, Haoran Wang, Hao Li, Yinghua Chu, Guang Yang, Wenjia Bai, Xiahai Zhuang, et al. (3 additional authors not shown)

Abstract: Cardiac magnetic resonance imaging (CMR) has emerged as a valuable diagnostic tool for cardiac diseases. However, a limitation of CMR is its slow imaging speed, which causes patient discomfort and introduces artifacts in the images. There has been growing interest in deep learning-based CMR imaging algorithms that can reconstruct high-quality images from highly under-sampled k-space data. However, the development of deep learning methods requires large training datasets, which have not been publicly available for CMR. To address this gap, we released a dataset that includes multi-contrast, multi-view, multi-slice and multi-coil CMR imaging data from 300 subjects. Imaging studies include cardiac cine and mapping sequences. Manual segmentations of the myocardium and chambers of all the subjects are also provided within the dataset. Scripts of state-of-the-art reconstruction algorithms were also provided as a point of reference. Our aim is to facilitate the advancement of state-of-the-art CMR image reconstruction by introducing standardized evaluation criteria and making the dataset freely accessible to the research community. Researchers can access the dataset at https://www.synapse.org/#!Synapse:syn51471091/wiki/.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 14 pages, 8 figures

arXiv:2309.10373 [pdf, other]

A moving least square immersed boundary method for SPH with thin-walled structures

Authors: ZhuoLin Wang, Zichao Jiang, Yi Zhang, Gengchao Yang, Trevor Hocksun Kwan, Yuhui Chen, Qinghe Yao

Abstract: This paper presents a novel method for smoothed particle hydrodynamics (SPH) with thin-walled structures. Inspired by the direct forcing immersed boundary method, this method employs a moving least square method to guarantee the smoothness of velocity near the structure surface. It simplifies thin-walled structure simulations by eliminating the need for multiple layers of boundary particles, and improves computational accuracy and stability in three-dimensional scenarios. Supportive three-dimensional numerical results are provided, including the impulsively started plate and the flow past a cylinder. Results of the impulsively started test demonstrate that the proposed method obtains smooth velocity and pressure in the, as well as a good match to the reference results of the vortex wake development. In addition, results of the flow past cylinder test show that the proposed method avoids mutual interference on both sides of the boundary and remains stable for three-dimensional simulations while accurately calculating the forces acting on the structure.

Submitted 8 October, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: 15 pages, 11 figures

arXiv:2309.09180 [pdf, other]

Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture

Authors: Gaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee

Abstract: We propose a novel neural speaker diarization system using memory-aware multi-speaker embedding with sequence-to-sequence architecture (NSD-MS2S), which integrates the strengths of memory-aware multi-speaker embedding (MA-MSE) and sequence-to-sequence (Seq2Seq) architecture, leading to improvement in both efficiency and performance. Next, we further decrease the memory occupation of decoding by incorporating input features fusion and then employ a multi-head attention mechanism to capture features at different levels. NSD-MS2S achieved a macro diarization error rate (DER) of 15.9% on the CHiME-7 EVAL set, which signifies a relative improvement of 49% over the official baseline system, and is the key technique for us to achieve the best performance for the main track of CHiME-7 DASR Challenge. Additionally, we introduce a deep interactive module (DIM) in MA-MSE module to better retrieve a cleaner and more discriminative multi-speaker embedding, enabling the current model to outperform the system we used in the CHiME-7 DASR Challenge. Our code will be available at https://github.com/liyunlongaaa/NSD-MS2S.

Submitted 26 December, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

Comments: Accepted by ICASSP 2024

arXiv:2309.06732 [pdf, other]

Fermi Surface Nesting with Heavy Quasiparticles in the Locally Noncentrosymmetric Superconductor CeRh$_2$As$_2$

Authors: Yi Wu, Yongjun Zhang, Sailong Ju, Yong Hu, Yanen Huang, Yanan Zhang, Huali Zhang, Hao Zheng, Guowei Yang, Evrard-Ouicem Eljaouhari, Baopeng Song, Nicholas C. Plumb, Frank Steglich, Ming Shi, Gertrud Zwicknag, Chao Cao, Huiqiu Yuan, Yang Liu

Abstract: The locally noncentrosymmetric heavy fermion superconductor CeRh$_2$As$_2$ has attracted considerable interest due to its rich superconducting phases, accompanied by a quadrupole density wave and pronounced antiferromagnetic excitations. To understand the underlying physics, we here report measurements from high-resolution angle-resolved photoemission. Our results reveal fine splittings of the conduction bands related to the locally noncentrosymmetric structure, as well as a quasi-two-dimensional Fermi surface (FS) with strong $4f$ contributions. The FS exhibits nesting with an in-plane vector $(π/a, π/a)$, which is facilitated by the van Hove singularity near $\bar X$ that arises from the characteristic conduction-$f$ hybridization. The FS nesting provides a natural explanation for the observed antiferromagnetic excitations at $(π/a, π/a)$, which could be intimately connected to its unconventional superconductivity. Our experimental results are well supported by density functional theory plus dynamical mean field theory calculations, which can capture the strong correlation effects. Our study not only provides spectroscopic proof of the key factors underlying the field-induced superconducting transition, but also uncovers the critical role of FS nesting and lattice Kondo effect in the intertwined spin and charge fluctuations.

Submitted 1 June, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: v1 submitted on Sep 13th 2023

arXiv:2309.06598 [pdf]

Efficient Post-processing of Diffusion Tensor Cardiac Magnetic Imaging Using Texture-conserving Deformable Registration

Authors: Fanwen Wang, Pedro F. Ferreira, Yinzhe Wu, Camila Munoz, Ke Wen, Yaqing Luo, Jiahao Huang, Dudley J. Pennell, Andrew D. Scott, Sonia Nielles-Vallespin, Guang Yang

Abstract: Diffusion tensor cardiac magnetic resonance (DT-CMR) is a method capable of providing non-invasive measurements of myocardial microstructure. Image registration is essential to correct image shifts due to intra and inter breath-hold motion and imperfect cardiac triggering. Registration is challenging in DT-CMR due to the low signal-to-noise and various contrasts induced by the diffusion encoding in the myocardium and surrounding organs. Traditional deformable registration corrects through-plane motion but at the risk of destroying the texture information while rigid registration inefficiently discards frames with local deformation. In this study, we explored the possibility of deep learning-based deformable registration on DT-CMR. Based on the noise suppression using low-rank features and diffusion encoding suppression using variational auto encoder-decoder, a B-spline based registration network extracted the displacement fields and maintained the texture features of DT-CMR. In this way, our method improved the efficiency of frame utilization, manual cropping, and computational speed.

Submitted 16 May, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

Comments: 7 pages, 4 figures, conference

arXiv:2309.04945 [pdf, other]

O2ATH: An OpenMP Offloading Toolkit for the Sunway Heterogeneous Manycore Platform

Authors: Haoran Lin, Lifeng Yan, Qixin Chang, Haitian Lu, Chenlin Li, Quanjie He, Zeyu Song, Xiaohui Duan, Zekun Yin, Yuxuan Li, Zhao Liu, Wei Xue, Haohuan Fu, Lin Gan, Guangwen Yang, Weiguo Liu

Abstract: The next generation Sunway supercomputer employs the SW26010pro processor, which features a specialized on-chip heterogeneous architecture. Applications with significant hotspots can benefit from the great computation capacity improvement of Sunway many-core architectures by carefully making intensive manual many-core parallelization efforts. However, some legacy projects with large codebases, such as CESM, ROMS and WRF, contain numerous lines of code and do not have significant hotspots. The cost of manually porting such applications to the Sunway architecture is almost unaffordable. To overcome such a challenge, we have developed a toolkit named O2ATH. O2ATH forwards GNU OpenMP runtime library calls to Sunway's Athread library, which greatly simplifies the parallelization work on the Sunway architecture. O2ATH enables users to write both MPE and CPE code in a single file, and parallelization can be achieved by utilizing OpenMP directives and attributes. In practice, O2ATH has helped us to port two large projects, CESM and ROMS, to the CPEs of the next generation Sunway supercomputers via the OpenMP offload method. In the experiments, kernel speedups range from 3 to 15 times, resulting in 3 to 6 times whole application speedups. Furthermore, O2ATH requires significantly fewer code modifications compared to manually crafting CPE functions. This indicates that O2ATH can greatly enhance development efficiency when porting or optimizing large software projects on Sunway supercomputers.

Submitted 10 September, 2023; originally announced September 2023.

Comments: 15 pages, 6 figures, 5 tables

arXiv:2309.04710 [pdf, other]

Jade: A Differentiable Physics Engine for Articulated Rigid Bodies with Intersection-Free Frictional Contact

Authors: Gang Yang, Siyuan Luo, Lin Shao

Abstract: We present Jade, a differentiable physics engine for articulated rigid bodies. Jade models contacts as the Linear Complementarity Problem (LCP). Compared to existing differentiable simulations, Jade offers features including intersection-free collision simulation and stable LCP solutions for multiple frictional contacts. We use continuous collision detection to detect the time of impact and adopt the backtracking strategy to prevent intersection between bodies with complex geometry shapes. We derive the gradient calculation to ensure the whole simulation process is differentiable under the backtracking mechanism. We modify the popular Dantzig algorithm to get valid solutions under multiple frictional contacts. We conduct extensive experiments to demonstrate the effectiveness of our differentiable physics simulation over a variety of contact-rich tasks.

Submitted 9 September, 2023; originally announced September 2023.

arXiv:2309.04395 [pdf, ps, other]

doi 10.1007/JHEP02(2024)201

A high-precision result for a full-color three-loop three-point form factor in ${\cal N}=4$ SYM

Authors: Xin Guan, Guanda Lin, Xiao Liu, Yan-Qing Ma, Gang Yang

Abstract: We perform a high-precision computation of the three-loop three-point form factor of the stress-tensor supermultiplet in ${\cal N}=4$ SYM. Both the leading-color and non-leading-color form factors are expanded in terms of simple integrals. We compute the complete set of integrals at a special kinematic point with very high precision using $\mathtt{AMFlow}$. The high-precision leading-color result enables us to obtain the analytic form of a numerical constant in the three-loop BDS ansatz, which was previously known only numerically. The high-precision values of the non-leading-color finite remainder as well as all integrals are also presented, which can be valuable for future use.

Submitted 29 February, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: 6 pages, 25 pages of appendix; v2: appendix modified, published version

arXiv:2309.04190 [pdf, other]

SegmentAnything helps microscopy images based automatic and quantitative organoid detection and analysis

Authors: Xiaodan Xing, Chunling Tang, Yunzhe Guo, Nicholas Kurniawan, Guang Yang

Abstract: Organoids are self-organized 3D cell clusters that closely mimic the architecture and function of in vivo tissues and organs. Quantification of organoid morphology helps in studying organ development, drug discovery, and toxicity assessment. Recent microscopy techniques provide a potent tool to acquire organoid morphology features, but manual image analysis remains a labor and time-intensive process. Thus, this paper proposes a comprehensive pipeline for microscopy analysis that leverages the SegmentAnything to precisely demarcate individual organoids. Additionally, we introduce a set of morphological properties, including perimeter, area, radius, non-smoothness, and non-circularity, allowing researchers to analyze the organoid structures quantitatively and automatically. To validate the effectiveness of our approach, we conducted tests on bright-field images of human induced pluripotent stem cells (iPSCs) derived neural-epithelial (NE) organoids. The results obtained from our automatic pipeline closely align with manual organoid detection and measurement, showcasing the capability of our proposed method in accelerating organoids morphology analysis.

Submitted 8 April, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: Replace Figure 4 with the correct version. The original version is wrong due to a column name mismatch

arXiv:2309.03578 [pdf, other]

doi 10.1051/0004-6361/202346857

X-ray luminosity-star formation rate scaling relation: Constraints from the eROSITA Final Equatorial Depth Survey (eFEDS)

Authors: G. Riccio, G. Yang, Małek, M. Boquien, Junais, F. Pistis, M. Hamed, M. Grespan, M. Paolillo, O. Torbaniuk

Abstract: We present measurements of the relation between X-ray luminosity and star formation activity for a sample of normal galaxies spanning the redshift range between 0 and 0.25. We use data acquired by SRG/eROSITA for the performance and verification phase program called eROSITA Final Equatorial Depth Survey (eFEDS). The eFEDS galaxies are observed in the 0.2-2.3 keV band. Making use of a wide range of ancillary data, spanning from the ultraviolet (UV) to mid-infrared wavelengths (MIR), we estimated the star formation rate (SFR) and stellar mass ($M_{star}$) of 888 galaxies, using Code Investigating GALaxy Emission (CIGALE). We divided our sample of normal galaxies into star-forming (SFGs) and quiescent galaxies according to their position on the main sequence. We confirm a linear correlation between the X-ray luminosity and the SFR for our sample of SFGs, as shown previously in the literature. However, we find this relation to be strongly biased by the completeness limit of the eFEDS survey. Correcting for completeness, we find the fitted relation to be consistent with the literature. We also investigated the relation between X-ray emission from both LMXBs and HMXBs populations with $M_{star}$ and SFR, respectively. Correcting for completeness, we find our fitted relation to considerably scatter from the literature relation at high specific SFR ($SFR/M_{star}$). We conclude that without accounting for X-ray non-detections, it is not possible to employ eFEDS data to study the redshift evolution of the LMXBs and HMXBs contributions due to completeness issues. Furthermore, we find our sources to largely scatter from the expected Lx/SFR vs specific SFR relation at high redshift. We discuss the dependence of the scatter on the stellar mass, metallicity, or the globular cluster content of the galaxy.

Submitted 7 September, 2023; originally announced September 2023.

Comments: Accepted for publication in A&A

Journal ref: A&A 678, A164 (2023)

arXiv:2309.03147 [pdf]

doi 10.1109/JBHI.2024.3370502

Real-Time Non-Invasive Imaging and Detection of Spreading Depolarizations through EEG: An Ultra-Light Explainable Deep Learning Approach

Authors: Yinzhe Wu, Sharon Jewell, Xiaodan Xing, Yang Nan, Anthony J. Strong, Guang Yang, Martyn G. Boutelle

Abstract: A core aim of neurocritical care is to prevent secondary brain injury. Spreading depolarizations (SDs) have been identified as an important independent cause of secondary brain injury. SDs are usually detected using invasive electrocorticography recorded at high sampling frequency. Recent pilot studies suggest a possible utility of scalp electrodes generated electroencephalogram (EEG) for non-invasive SD detection. However, noise and attenuation of EEG signals make this detection task extremely challenging. Previous methods focus on detecting temporal power change of EEG over a fixed high-density map of scalp electrodes, which is not always clinically feasible. Having a specialized spectrogram as an input to the automatic SD detection model, this study is the first to transform SD identification problem from a detection task on a 1-D time-series wave to a task on a sequential 2-D rendered imaging. This study presented a novel ultra-light-weight multi-modal deep-learning network to fuse EEG spectrogram imaging and temporal power vectors to enhance SD identification accuracy over each single electrode, allowing flexible EEG map and paving the way for SD detection on ultra-low-density EEG with variable electrode positioning. Our proposed model has an ultra-fast processing speed (<0.3 sec). Compared to the conventional methods (2 hours), this is a huge advancement towards early SD detection and to facilitate instant brain injury prognosis. Seeing SDs with a new dimension - frequency on spectrograms, we demonstrated that such additional dimension could improve SD detection accuracy, providing preliminary evidence to support the hypothesis that SDs may show implicit features over the frequency profile.

Submitted 28 February, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

arXiv:2309.02719 [pdf, other]

DMKD: Improving Feature-based Knowledge Distillation for Object Detection Via Dual Masking Augmentation

Authors: Guang Yang, Yin Tang, Zhijian Wu, Jun Li, Jianhua Xu, Xili Wan

Abstract: Recent mainstream masked distillation methods function by reconstructing selectively masked areas of a student network from the feature map of its teacher counterpart. In these methods, the masked regions need to be properly selected, such that reconstructed features encode sufficient discrimination and representation capability like the teacher feature. However, previous masked distillation methods only focus on spatial masking, making the resulting masked areas biased towards spatial importance without encoding informative channel clues. In this study, we devise a Dual Masked Knowledge Distillation (DMKD) framework which can capture both spatially important and channel-wise informative clues for comprehensive masked feature reconstruction. More specifically, we employ dual attention mechanism for guiding the respective masking branches, leading to reconstructed feature encoding dual significance. Furthermore, fusing the reconstructed features is achieved by self-adjustable weighting strategy for effective feature distillation. Our experiments on object detection task demonstrate that the student networks achieve performance gains of 4.1% and 4.3% with the help of our method when RetinaNet and Cascade Mask R-CNN are respectively used as the teacher networks, while outperforming the other state-of-the-art distillation methods.

Submitted 6 September, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

arXiv:2309.02686 [pdf, other]

Dynamical relaxation behavior of extended XY chain with gapless phase following a quantum quench

Authors: Kaiyuan Cao, Yayun Hu, Peiqing Tong, Guangwen Yang

Abstract: We investigate the dynamical relaxation behavior of the two-point correlation in extended XY models with a gapless phase after quenches from various initial states. Specifically, we study the XY chain with gapless phase induced by the additional interactions: Dzyaloshinskii-Moriya interaction and XZY-YZX type of three-site interaction. When quenching from the gapped phase, we observe that the additional interactions have no effect on the relaxation behavior. The relaxation behavior is $δC_{mn}(t)\sim t^{-3/2}$ and $\sim t^{-1/2}$ for the quench to the commensurate phase and the incommensurate phase, respectively. However, when quenching from the gapless phase, we demonstrate that the scaling behavior of $δC_{mn}(t)$ is changed to $\sim t^{-1}$ for the quench to the commensurate phase, and the decay of $δC_{mn}(t)$ follows $\sim t^{-1}$ or $\sim t^{-1/2}$ for the quench to the incommensurate phase depending on the parameters of pre-quench Hamiltonian. We also establish the dynamical phase diagrams based on the dynamical relaxation behavior of $δC_{mn}(t)$ in the extended XY models.

Submitted 5 September, 2023; originally announced September 2023.

Comments: 12 pages, 10 figures

arXiv:2309.01310 [pdf, other]

ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer

Authors: Gyeongdong Yang, Yungwook Kwon, Hyunjin Kim

Abstract: The paper proposes an efficient structure for enhancing the performance of mobile-friendly vision transformer with small computational overhead. The vision transformer (ViT) is very attractive in that it reaches outperforming results in image classification, compared to conventional convolutional neural networks (CNNs). Due to its need of high computational resources, MobileNet-based ViT models such as MobileViT-S have been developed. However, their performance cannot reach the original ViT model. The proposed structure relieves the above weakness by storing the information from early attention stages and reusing it in the final classifier. This paper is motivated by the idea that the data itself from early attention stages can have important meaning for the final classification. In order to reuse the early information in attention stages, the average pooling results of various scaled features from early attention stages are used to expand channels in the fully-connected layer of the final classifier. It is expected that the inductive bias introduced by the averaged features can enhance the final performance. Because the proposed structure only needs the average pooling of features from the attention stages and channel expansions in the final classifier, its computational and storage overheads are very small, keeping the benefits of low-cost MobileNet-based ViT (MobileViT). Compared with the original MobileViTs on the ImageNet dataset, the proposed ExMobileViT has noticeable accuracy enhancements, having only about 5% additional parameters.

Submitted 3 September, 2023; originally announced September 2023.

Comments: Under Review

arXiv:2309.00831 [pdf, other]

Multi-scale, Data-driven and Anatomically Constrained Deep Learning Image Registration for Adult and Fetal Echocardiography

Authors: Md. Kamrul Hasan, Haobo Zhu, Guang Yang, Choon Hwai Yap

Abstract: Temporal echocardiography image registration is a basis for clinical quantifications such as cardiac motion estimation, myocardial strain assessments, and stroke volume quantifications. In past studies, deep learning image registration (DLIR) has shown promising results and is consistently accurate and precise, requiring less computational time. We propose that a greater focus on the warped moving image's anatomic plausibility and image quality can support robust DLIR performance. Further, past implementations have focused on adult echocardiography, and there is an absence of DLIR implementations for fetal echocardiography. We propose a framework that combines three strategies for DLIR in both fetal and adult echo: (1) an anatomic shape-encoded loss to preserve physiological myocardial and left ventricular anatomical topologies in warped images; (2) a data-driven loss that is trained adversarially to preserve good image texture features in warped images; and (3) a multi-scale training scheme of a data-driven and anatomically constrained algorithm to improve accuracy. Our tests show that good anatomical topology and image textures are strongly linked to shape-encoded and data-driven adversarial losses. They improve different aspects of registration performance in a non-overlapping way, justifying their combination. Despite fundamental distinctions between adult and fetal echo images, we show that these strategies can provide excellent registration results in both adult and fetal echocardiography using the publicly available CAMUS adult echo dataset and our private multi-demographic fetal echo dataset. Our approach outperforms traditional non-DL gold standard registration approaches, including Optical Flow and Elastix. Registration improvements could be translated to more accurate and precise clinical quantification of cardiac ejection fraction, demonstrating a potential for translation.

Submitted 11 September, 2023; v1 submitted 2 September, 2023; originally announced September 2023.

Comments: Our data-driven and anatomically constrained DLIR method's source code will be publicly available at https://github.com/kamruleee51/DdC-AC-DLIR

arXiv:2309.00595 [pdf, other]

Optimization towards Efficiency and Stateful of dispel4py

Authors: Liang Liang, Heting Zhang, Guang Yang, Thomas Heinis, Rosa Filgueira

Abstract: Scientific workflows bridge scientific challenges with computational resources. While dispel4py, a stream-based workflow system, offers mappings to parallel enactment engines like MPI or Multiprocessing, its optimization primarily focuses on dynamic process-to-task allocation for improved performance. An efficiency gap persists, particularly with the growing emphasis on conserving computing resources. Moreover, the existing dynamic optimization lacks support for stateful applications and grouping operations. To address these issues, our work introduces a novel hybrid approach for handling stateful operations and groupings within workflows, leveraging a new Redis mapping. We also propose an auto-scaling mechanism integrated into dispel4py's dynamic optimization. Our experiments showcase the effectiveness of auto-scaling optimization, achieving efficiency while upholding performance. In the best case, auto-scaling reduces dispel4py's runtime to 87% compared to the baseline, using only 76% of process resources. Importantly, our optimized stateful dispel4py demonstrates a remarkable speedup, utilizing just 32% of the runtime compared to the contender.

Submitted 1 September, 2023; originally announced September 2023.

Comments: 13 pages, 13 figures

arXiv:2308.16606 [pdf, ps, other]

doi 10.1103/PhysRevD.108.092009

Measurements of the $ν_μ$ and $\barν_μ$-induced Coherent Charged Pion Production Cross Sections on $^{12}C$ by the T2K experiment

Authors: K. Abe, N. Akhlaq, R. Akutsu, A. Ali, S. Alonso Monsalve, C. Alt, C. Andreopoulos, M. Antonova, S. Aoki, T. Arihara, Y. Asada, Y. Ashida, E. T. Atkin, M. Barbi, G. J. Barker, G. Barr, D. Barrow, M. Batkiewicz-Kwasniak, V. Berardi, L. Berns, S. Bhadra, A. Blanchet, A. Blondel, S. Bolognesi, T. Bonus , et al. (359 additional authors not shown)

Abstract: We report an updated measurement of the $ν_μ$-induced, and the first measurement of the $\barν_μ$-induced coherent charged pion production cross section on $^{12}C$ nuclei in the T2K experiment. This is measured in a restricted region of the final-state phase space for which $p_{μ,π} > 0.2$ GeV, $\cos(θ_μ) > 0.8$ and $\cos(θ_π) > 0.6$, and at a mean (anti)neutrino energy of 0.85 GeV using the T2K near detector. The measured $ν_μ$ CC coherent pion production flux-averaged cross section on $^{12}C$ is $(2.98 \pm 0.37 (stat.) \pm 0.31 (syst.) \substack{ +0.49 \\ -0.00 } \mathrm{ (Q^2\,model)}) \times 10^{-40}~\mathrm{cm}^{2}$. The new measurement of the $\barν_μ$-induced cross section on $^{12}{C}$ is $(3.05 \pm 0.71 (stat.) \pm 0.39 (syst.) \substack{ +0.74 \\ -0.00 } \mathrm{(Q^2\,model)}) \times 10^{-40}~\mathrm{cm}^{2}$. The results are compatible with both the NEUT 5.4.0 Berger-Sehgal (2009) and GENIE 2.8.0 Rein-Sehgal (2007) model predictions.

Submitted 14 October, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

Journal ref: Phys.Rev.D 108 (2023) 9, 092009

arXiv:2308.15280 [pdf, other]

ADFA: Attention-augmented Differentiable top-k Feature Adaptation for Unsupervised Medical Anomaly Detection

Authors: Yiming Huang, Guole Liu, Yaoru Luo, Ge Yang

Abstract: The scarcity of annotated data, particularly for rare diseases, limits the variability of training data and the range of detectable lesions, presenting a significant challenge for supervised anomaly detection in medical imaging. To solve this problem, we propose a novel unsupervised method for medical image anomaly detection: Attention-Augmented Differentiable top-k Feature Adaptation (ADFA). The method utilizes Wide-ResNet50-2 (WR50) network pre-trained on ImageNet to extract initial feature representations. To reduce the channel dimensionality while preserving relevant channel information, we employ an attention-augmented patch descriptor on the extracted features. We then apply differentiable top-k feature adaptation to train the patch descriptor, mapping the extracted feature representations to a new vector space, enabling effective detection of anomalies. Experiments show that ADFA outperforms state-of-the-art (SOTA) methods on multiple challenging medical image datasets, confirming its effectiveness in medical anomaly detection.

Submitted 29 August, 2023; originally announced August 2023.

arXiv:2308.14638 [pdf, other]

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge

Authors: Ruoyu Wang, Maokui He, Jun Du, Hengshun Zhou, Shutong Niu, Hang Chen, Yanyan Yue, Gaobin Yang, Shilong Wu, Lei Sun, Yanhui Tu, Haitao Tang, Shuangqing Qian, Tian Gao, Mengzhi Wang, Genshun Wan, Jia Pan, Jianqing Gao, Chin-Hui Lee

Abstract: This technical report details our submission system to the CHiME-7 DASR Challenge, which focuses on speaker diarization and speech recognition under complex multi-speaker scenarios. Additionally, it also evaluates the efficiency of systems in handling diverse array devices. To address these issues, we implemented an end-to-end speaker diarization system and introduced a rectification strategy based on multi-channel spatial information. This approach significantly diminished the word error rates (WER). In terms of recognition, we utilized publicly available pre-trained models as the foundational models to train our end-to-end speech recognition models. Our system attained a Macro-averaged diarization-attributed WER (DA-WER) of 21.01% on the CHiME-7 evaluation set, which signifies a relative improvement of 62.04% over the official baseline system.

Submitted 10 October, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Comments: Accepted by 2023 CHiME Workshop, Oral

arXiv:2308.12991 [pdf, other]

SN 2022oqm: A Bright and Multi-peaked Calcium-rich Transient

Authors: S. Karthik Yadavalli, V. Ashley Villar, Luca Izzo, Yossef Zenati, Ryan J. Foley, J. Craig Wheeler, Charlotte R. Angus, Dominik Bánhidi, Katie Auchettl, Barna Imre Bíró, Attila Bódi, Zsófia Bodola, Thomas de Boer, Kenneth C. Chambers, Ryan Chornock, David A. Coulter, István Csányi, Borbála Cseh, Srujan Dandu, Kyle W. Davis, Connor Braden Dickinson, Diego Farias, Joseph Farah, Christa Gall, Hua Gao , et al. (38 additional authors not shown)

Abstract: We present the photometric and spectroscopic evolution of SN 2022oqm, a nearby multi-peaked hydrogen- and helium-weak calcium-rich transient (CaRT). SN 2022oqm was detected 13.1 kpc from its host galaxy, the face-on spiral galaxy NGC 5875. Extensive spectroscopic coverage reveals an early hot (T >= 40,000 K) continuum and carbon features observed $\sim$1~day after discovery, SN Ic-like photospheric-phase spectra, and strong forbidden calcium emission starting 38 days after discovery. SN 2022oqm has a relatively high peak luminosity (MB = -17 mag) for (CaRTs), making it an outlier in the population. We determine that three power sources are necessary to explain the light curve (LC), with each corresponding to a distinct peak. The first peak is powered by an expanding blackbody with a power law luminosity, suggesting shock cooling by circumstellar material (CSM). Subsequent LC evolution is powered by a double radioactive decay model, consistent with two sources of photons diffusing through optically thick ejecta. From the LC, we derive an ejecta mass and 56Ni mass of ~0.6 solar masses and ~0.09 solar masses. Spectroscopic modeling suggests 0.6 solar masses of ejecta, and with well-mixed Fe-peak elements throughout. We discuss several physical origins for SN 2022oqm and find either a surprisingly massive white dwarf progenitor or a peculiar stripped envelope model could explain SN 2022oqm. A stripped envelope explosion inside a dense, hydrogen- and helium-poor CSM, akin to SNe Icn, but with a large 56Ni mass and small CSM mass could explain SN 2022oqm. Alternatively, helium detonation on an unexpectedly massive white dwarf could also explain SN 2022oqm.

Submitted 4 April, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: 35 pages, 17 figures, 7 tables, Accepted for Publication in ApJ

arXiv:2308.10421 [pdf, other]

UniM$^2$AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving

Authors: Jian Zou, Tianyu Huang, Guanglei Yang, Zhenhua Guo, Wangmeng Zuo

Abstract: Masked Autoencoders (MAE) play a pivotal role in learning potent representations, delivering outstanding results across various 3D perception tasks essential for autonomous driving. In real-world driving scenarios, it's commonplace to deploy multiple sensors for comprehensive environment perception. While integrating multi-modal features from these sensors can produce rich and powerful features, there is a noticeable gap in MAE methods addressing this integration. This research delves into multi-modal Masked Autoencoders tailored for a unified representation space in autonomous driving, aiming to pioneer a more efficient fusion of two distinct modalities. To intricately marry the semantics inherent in images with the geometric intricacies of LiDAR point clouds, the UniM$^2$AE is proposed. This model stands as a potent yet straightforward, multi-modal self-supervised pre-training framework, mainly consisting of two designs. First, it projects the features from both modalities into a cohesive 3D volume space, ingeniously expanded from the bird's eye view (BEV) to include the height dimension. The extension makes it possible to back-project the informative features, obtained by fusing features from both modalities, into their native modalities to reconstruct the multiple masked inputs. Second, the Multi-modal 3D Interactive Module (MMIM) is invoked to facilitate the efficient inter-modal interaction during the interaction process. Extensive experiments conducted on the nuScenes Dataset attest to the efficacy of UniM$^2$AE, indicating enhancements in 3D object detection and BEV map segmentation by 1.2\% (NDS) and 6.5\% (mIoU), respectively. Code is available at https://github.com/hollow-503/UniM2AE.

Submitted 29 August, 2023; v1 submitted 20 August, 2023; originally announced August 2023.

Comments: Code available at https://github.com/hollow-503/UniM2AE

arXiv:2308.09750 [pdf, other]

CEERS Key Paper VII: JWST/MIRI Reveals a Faint Population of Galaxies at Cosmic Noon Unseen by Spitzer

Authors: Allison Kirkpatrick, Guang Yang, Aurelien Le Bail, Greg Troiani, Eric F. Bell, Nikko J. Cleri, David Elbaz, Steven L. Finkelstein, Nimish P. Hathi, Michaela Hirschmann, Benne W. Holwerda, Dale D. Kocevski, Ray A. Lucas, Jed McKinney, Casey Papovich, Pablo G. Perez-Gonzalez, Alexander de la Vega, Micaela B. Bagley, Emanuele Daddi, Mark Dickinson, Henry C. Ferguson, Adriano Fontana, Andrea Grazian, Norman A. Grogin, Pablo Arrabal Haro , et al. (11 additional authors not shown)

Abstract: The Cosmic Evolution Early Release Science (CEERS) program observed the Extended Groth Strip with the Mid-Infrared Instrument (MIRI) on the James Webb Space Telescope (JWST) in 2022. In this paper, we discuss the four MIRI pointings that observed with longer wavelength filters, including F770W, F1000W, F1280W, F1500W, F1800W, and F2100W. We compare the MIRI galaxies with the Spitzer/MIPS 24$μ$m population in the EGS field. We find that MIRI can observe an order of magnitude deeper than MIPS in significantly shorter integration times, attributable to JWST's much larger aperture and MIRI's improved sensitivity. MIRI is exceptionally good at finding faint ($L_{\rm IR}<10^{10} L_\odot$) galaxies at $z\sim1-2$. We find that a significant portion of MIRI galaxies are "mid-IR weak"--they have strong near-IR emission and relatively weaker mid-IR emission, and most of the star formation is unobscured. We present new IR templates that capture how the mid-IR to near-IR emission changes with increasing infrared luminosity. We present two color-color diagrams to separate mid-IR weak galaxies and active galactic nuclei (AGN) from dusty star-forming galaxies and find that these color diagrams are most effective when used in conjunction with each other. We present the first number counts of 10$μ$m sources and find that there are $\lesssim10$ IR AGN per MIRI pointing, possibly due to the difficulty of distinguishing AGN from intrinsically mid-IR weak galaxies (due to low metallicities or low dust content). We conclude that MIRI is most effective at observing moderate luminosity ($L_{\rm IR}=10^9-10^{10}L_\odot$) galaxies at $z=1-2$, and that photometry alone is not effective at identifying AGN within this faint population.

Submitted 18 August, 2023; originally announced August 2023.

Comments: 21 pages, 10 figures. Resubmitted to ApJS after revision

arXiv:2308.09475 [pdf, other]

Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery

Authors: Hongqiu Wang, Lei Zhu, Guang Yang, Yike Guo, Shichen Zhang, Bo Xu, Yueming **

Abstract: Robot-assisted surgery has made significant progress, with instrument segmentation being a critical factor in surgical intervention quality. It serves as the building block to facilitate surgical robot navigation and surgical education for the next generation of operating intelligence. Although existing methods have achieved accurate instrument segmentation results, they simultaneously generate segmentation masks for all instruments, without the capability to specify a target object and allow an interactive experience. This work explores a new task of Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the corresponding surgical instruments based on the given language expression. To achieve this, we devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance, while previous work only used video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., textual description and video frame) to facilitate the extraction of instrument-level information. We are also the first to produce two RSVIS datasets to promote related research. Our method is verified on these datasets, and experimental results exhibit that the VIS-Net can significantly outperform existing state-of-the-art referring segmentation methods. Our code and our datasets will be released upon the publication of this work.

Submitted 18 August, 2023; originally announced August 2023.

arXiv:2308.07931 [pdf, other]

Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation

Authors: William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling, Phillip Isola

Abstract: Self-supervised and language-supervised image models contain rich knowledge of the world that is important for generalization. Many robotic tasks, however, require a detailed understanding of 3D geometry, which is often lacking in 2D image features. This work bridges this 2D-to-3D gap for robotic manipulation by leveraging distilled feature fields to combine accurate 3D geometry with rich semantics from 2D foundation models. We present a few-shot learning method for 6-DOF grasping and placing that harnesses these strong spatial and semantic priors to achieve in-the-wild generalization to unseen objects. Using features distilled from a vision-language model, CLIP, we present a way to designate novel objects for manipulation via free-text natural language, and demonstrate its ability to generalize to unseen expressions and novel categories of objects.

Submitted 29 December, 2023; v1 submitted 27 July, 2023; originally announced August 2023.

Comments: Project website at https://f3rm.csail.mit.edu, Accepted at the 7th Annual Conference on Robot Learning (CoRL), 2023 in Atlanta, US

arXiv:2308.06605 [pdf, other]

Towards Exascale Computation for Turbomachinery Flows

Authors: Yuhang Fu, Weiqi Shen, Jiahuan Cui, Yao Zheng, Guangwen Yang, Zhao Liu, Jifa Zhang, Tingwei Ji, Fangfang Xie, Xiao**g Lv, Hanyue Liu, Xu Liu, Xiyang Liu, Xiaoyu Song, Guocheng Tao, Yan Yan, Paul Tucker, Steven A. E. Miller, Shirui Luo, Seid Koric, Weimin Zheng

Abstract: A state-of-the-art large eddy simulation code has been developed to solve compressible flows in turbomachinery. The code has been engineered with a high degree of scalability, enabling it to effectively leverage the many-core architecture of the new Sunway system. A consistent performance of 115.8 DP-PFLOPs has been achieved on a high-pressure turbine cascade consisting of over 1.69 billion mesh elements and 865 billion Degree of Freedoms (DOFs). By leveraging a high-order unstructured solver and its portability to large heterogeneous parallel systems, we have progressed towards solving the grand challenge problem outlined by NASA, which involves a time-dependent simulation of a complete engine, incorporating all the aerodynamic and heat transfer components.

Submitted 29 December, 2023; v1 submitted 12 August, 2023; originally announced August 2023.

Comments: SC23, November, 2023, Denver, CO., USA

arXiv:2308.06578 [pdf]

To reverse engineer an entire nervous system

Authors: Gal Haspel, Edward S Boyden, Jeffrey Brown, George Church, Netta Cohen, Christopher Fang-Yen, Steven Flavell, Miriam B Goodman, Anne C Hart, Oliver Hobert, Eduardo J Izquierdo, Konstantinos Kagias, Shawn Lockery, Yangning Lu, Adam Marblestone, Jordan Matelsky, Hanspeter Pfister, Horacio G Rotstein, Monika Scholz, Eli Shlizerman, Quilee Simeon, Michael A Skuhersky, Vineet Tiruvadi, Vivek Venkatachalam, Guangyu Robert Yang , et al. (3 additional authors not shown)

Abstract: A primary goal of neuroscience is to understand how nervous systems, or assemblies of neural circuits, generate and control behavior. Testing and refining our theories of neural control would be greatly facilitated if we could reliably simulate an entire nervous system so we could replicate the brain dynamics in response to any stimuli and different contexts. More fundamentally, reconstructing or modeling a system is an important milestone in understanding it, and so, simulating an entire nervous system is in itself one of the goals, indeed dreams, of systems neuroscience. To do so requires us to identify how each neuron's output depends on its inputs, within some nervous system. This deconstruction, understanding function from input-output pairs, falls into the realm of reverse engineering. Current efforts at reverse engineering the brain focus on the mammalian nervous system, but these brains are complex, allowing only recordings of tiny subsystems. Here we argue that the time is ripe to embark on a concerted effort to reverse engineer a smaller system and that the nematode C. elegans is the ideal candidate system. In particular, the established and growing toolkit of optophysiology techniques can non-invasively capture and control each neuron's activity and scale to hundreds of thousands of experiments, across a large population of animals. Data across populations and behaviors can be combined because across individuals neuronal identities are largely conserved in form and function. Modern machine-learning-based model training should then enable a simulation of C. elegans' impressive breadth of brain states and behaviors. The ability to reverse engineer an entire nervous system will benefit systems neuroscience as well as the design of artificial intelligence systems, enabling fundamental insights as well as new approaches for investigations of progressively larger nervous systems.

Submitted 9 December, 2023; v1 submitted 12 August, 2023; originally announced August 2023.

Comments: 23 pages, 2 figures, opinion paper

arXiv:2308.06334 [pdf, other]

SN 2022joj: A Potential Double Detonation with a Thin Helium shell

Authors: E. Padilla Gonzalez, D. A. Howell, G. Terreran, C. McCully, M. Newsome, J. Burke, J. Farah, C. Pellegrino, K. A. Bostroem, G. Hosseinzadeh, J. Pearson, D. J. Sand, M. Shrestha, N. Smith, Y. Dong, N. Meza Retamal, S. Valenti, S. Boos, K. J. Shen, D. Townsley, L. Galbany, L. Piscarreta, R. J. Foley, M. J. Bustamante-Rosell, D. A. Coulter , et al. (12 additional authors not shown)

Abstract: We present photometric and spectroscopic data for SN 2022joj, a nearby peculiar Type Ia supernova (SN Ia) with a fast decline rate ($\rm{Δm_{15,B}=1.4}$ mag). SN 2022joj shows exceedingly red colors, with a value of approximately ${B-V \approx 1.1}$ mag during its initial stages, beginning from $11$ days before maximum brightness. As it evolves the flux shifts towards the blue end of the spectrum, approaching ${B-V \approx 0}$ mag around maximum light. Furthermore, at maximum light and beyond, the photometry is consistent with that of typical SNe Ia. This unusual behavior extends to its spectral characteristics, which initially displayed a red spectrum and later evolved to exhibit greater consistency with typical SNe Ia. We consider two potential explanations for this behavior: double detonation from a helium shell on a sub-Chandrasekhar-mass white dwarf and Chandrasekhar-mass models with a shallow distribution of $\rm{^{56}Ni}$. The shallow nickel models could not reproduce the red colors in the early light curves. Spectroscopically, we find strong agreement between SN 2022joj and double-detonation models with white dwarf masses around 1 $\rm{M_{\odot}}$ and thin He-shell between 0.01 and 0.02 $\rm{M_{\odot}}$. Moreover, the early red colors are explained by line-blanketing absorption from iron-peak elements created by the double detonation scenario in similar mass ranges. However, the nebular spectra composition in SN 2022joj deviates from expectations for double detonation, as we observe strong [Fe III] emission instead of [Ca II] lines as anticipated from double detonation models. More detailed modeling, e.g., including viewing angle effects, is required to test if double detonation models can explain the nebular spectra.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2308.05681 [pdf, other]

Hard No-Box Adversarial Attack on Skeleton-Based Human Action Recognition with Skeleton-Motion-Informed Gradient

Authors: Zhengzhi Lu, He Wang, Ziyi Chang, Guoan Yang, Hubert P. H. Shum

Abstract: Recently, methods for skeleton-based human activity recognition have been shown to be vulnerable to adversarial attacks. However, these attack methods require either the full knowledge of the victim (i.e. white-box attacks), access to training data (i.e. transfer-based attacks) or frequent model queries (i.e. black-box attacks). All their requirements are highly restrictive, raising the question of how detrimental the vulnerability is. In this paper, we show that the vulnerability indeed exists. To this end, we consider a new attack task: the attacker has no access to the victim model or the training data or labels, where we coin the term hard no-box attack. Specifically, we first learn a motion manifold where we define an adversarial loss to compute a new gradient for the attack, named skeleton-motion-informed (SMI) gradient. Our gradient contains information of the motion dynamics, which is different from existing gradient-based attack methods that compute the loss gradient assuming each dimension in the data is independent. The SMI gradient can augment many gradient-based attack methods, leading to a new family of no-box attack methods. Extensive evaluation and comparison show that our method imposes a real threat to existing classifiers. They also show that the SMI gradient improves the transferability and imperceptibility of adversarial samples in both no-box and transfer-based black-box settings.

Submitted 18 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: Camera-ready version for ICCV 2023

arXiv:2308.05309 [pdf, other]

Homophily-enhanced Structure Learning for Graph Clustering

Authors: Ming Gu, Gaoming Yang, Sheng Zhou, Ning Ma, Jiawei Chen, Qiaoyu Tan, Meihan Liu, Jiajun Bu

Abstract: Graph clustering is a fundamental task in graph analysis, and recent advances in utilizing graph neural networks (GNNs) have shown impressive results. Despite the success of existing GNN-based graph clustering methods, they often overlook the quality of graph structure, which is inherent in real-world graphs due to their sparse and multifarious nature, leading to subpar performance. Graph structure learning allows refining the input graph by adding missing links and removing spurious connections. However, previous endeavors in graph structure learning have predominantly centered around supervised settings, and cannot be directly applied to our specific clustering tasks due to the absence of ground-truth labels. To bridge the gap, we propose a novel method called homophily-enhanced structure learning for graph clustering (HoLe). Our motivation stems from the observation that subtly enhancing the degree of homophily within the graph structure can significantly improve GNNs and clustering outcomes. To realize this objective, we develop two clustering-oriented structure learning modules, i.e., hierarchical correlation estimation and cluster-aware sparsification. The former module enables a more accurate estimation of pairwise node relationships by leveraging guidance from latent and clustering spaces, while the latter one generates a sparsified structure based on the similarity matrix and clustering assignments. Additionally, we devise a joint optimization approach alternating between training the homophily-enhanced structure learning and GNN-based clustering, thereby enforcing their reciprocal effects. Extensive experiments on seven benchmark datasets of various types and scales, across a range of clustering metrics, demonstrate the superiority of HoLe against state-of-the-art baselines.

Submitted 30 October, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

Comments: 11 pages with 7 figures. Accepted by CIKM'23

arXiv:2308.03421 [pdf, other]

RecycleGPT: An Autoregressive Language Model with Recyclable Module

Authors: Yufan Jiang, Qiaozhi He, Xiaomin Zhuang, Zhihua Wu, Kunpeng Wang, Wenlai Zhao, Guangwen Yang

Abstract: Existing large language models have to run K times to generate a sequence of K tokens. In this paper, we present RecycleGPT, a generative language model with fast decoding speed by recycling pre-generated model states without running the whole model in multiple steps. Our approach relies on the observation that adjacent tokens in a sequence usually have strong correlations and the next token in a sequence can be reasonably guessed or inferred based on the preceding ones. Experiments and analysis demonstrate the effectiveness of our approach in lowering inference latency, achieving up to 1.4x speedup while preserving high performance.

Submitted 23 May, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: Technical Report

arXiv:2308.02533 [pdf, other]

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Authors: Kaijie Zhu, Jindong Wang, Xixu Hu, Xing Xie, Ge Yang

Abstract: Deep neural networks are susceptible to adversarial examples, posing a significant security risk in critical applications. Adversarial Training (AT) is a well-established technique to enhance adversarial robustness, but it often comes at the cost of decreased generalization ability. This paper proposes Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance generalization without compromising adversarial robustness. The core idea of RiFT is to exploit the redundant capacity for robustness by fine-tuning the adversarially trained model on its non-robust-critical module. To do so, we introduce module robust criticality (MRC), a measure that evaluates the significance of a given module to model robustness under worst-case weight perturbations. Using this measure, we identify the module with the lowest MRC value as the non-robust-critical module and fine-tune its weights to obtain fine-tuned weights. Subsequently, we linearly interpolate between the adversarially trained weights and fine-tuned weights to derive the optimal fine-tuned model weights. We demonstrate the efficacy of RiFT on ResNet18, ResNet34, and WideResNet34-10 models trained on CIFAR10, CIFAR100, and Tiny-ImageNet datasets. Our experiments show that RiFT can significantly improve both generalization and out-of-distribution robustness by around 1.5% while maintaining or even slightly enhancing adversarial robustness. Code is available at https://github.com/microsoft/robustlearn.
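The linear interpolation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that weights are held as plain dictionaries mapping parameter names to lists of floats and that the caller supplies the interpolation coefficient `alpha`; the abstract does not say how that coefficient is chosen. The function name `interpolate_weights` and the example values are hypothetical, and this is not the authors' implementation, which is available in the linked repository.

```python
def interpolate_weights(at_weights, ft_weights, alpha):
    """Return (1 - alpha) * adversarially trained + alpha * fine-tuned weights.

    Both arguments are hypothetical dicts: parameter name -> list of floats.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if at_weights.keys() != ft_weights.keys():
        raise ValueError("both weight sets must have the same parameter names")
    return {
        name: [
            (1.0 - alpha) * a + alpha * f
            for a, f in zip(at_weights[name], ft_weights[name])
        ]
        for name in at_weights
    }


# Toy example: two parameters, mixed halfway between the two weight sets.
at = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
ft = {"layer.weight": [3.0, 2.0], "layer.bias": [1.0]}
mixed = interpolate_weights(at, ft, alpha=0.5)
```

With `alpha=0.0` the sketch returns the adversarially trained weights unchanged, and with `alpha=1.0` it returns the fine-tuned weights; intermediate values trade off between the two.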

Submitted 1 August, 2023; originally announced August 2023.

Comments: Accepted by International Conference on Computer Vision (ICCV) 2023; code is at https://github.com/microsoft/robustlearn

arXiv:2308.02140 [pdf, ps, other]

Deep Reinforcement Learning Empowered Rate Selection of XP-HARQ

Authors: Da Wu, Jiahui Feng, Zheng Shi, Hongjiang Lei, Guanghua Yang, Shaodan Ma

Abstract: The complex transmission mechanism of cross-packet hybrid automatic repeat request (XP-HARQ) hinders its optimal system design. To overcome this difficulty, this letter uses deep reinforcement learning (DRL) to solve the rate selection problem of XP-HARQ over correlated fading channels. In particular, the long term average throughput (LTAT) is maximized by properly choosing the incremental information rate for each HARQ round on the basis of the outdated channel state information (CSI) available at the transmitter. The rate selection problem is first converted into a Markov decision process (MDP), which is then solved by capitalizing on the algorithm of deep deterministic policy gradient (DDPG) with prioritized experience replay. The simulation results finally corroborate the superiority of the proposed XP-HARQ scheme over the conventional HARQ with incremental redundancy (HARQ-IR) and the XP-HARQ with only statistical CSI.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2308.02131 [pdf, other]

Graph Convolutional Network Enabled Power-Constrained HARQ Strategy for URLLC

Authors: Yi Chen, Zheng Shi, Hong Wang, Yaru Fu, Guanghua Yang, Shaodan Ma, Haichuan Ding

Abstract: In this paper, a power-constrained hybrid automatic repeat request (HARQ) transmission strategy is developed to support ultra-reliable low-latency communications (URLLC). In particular, we aim to minimize the delivery latency of HARQ schemes over time-correlated fading channels while ensuring high reliability and limited power consumption. To ease the optimization, the simple asymptotic outage expressions of HARQ schemes are adopted. Furthermore, by noticing the non-convexity of the latency minimization problem and the intricate connection between different HARQ rounds, the graph convolutional network (GCN) is invoked for the optimal power solution owing to its powerful ability to handle graph data. The primal-dual learning method is then leveraged to train the GCN weights. Consequently, the numerical results are presented for verification together with the comparisons among three HARQ schemes in terms of the latency and the reliability, where the three HARQ schemes include Type-I HARQ, HARQ with chase combining (HARQ-CC), and HARQ with incremental redundancy (HARQ-IR). To recapitulate, it is revealed that HARQ-IR offers the lowest latency while guaranteeing the demanded reliability target under a stringent power constraint, albeit at the price of high coding complexity.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2308.01814 [pdf, other]

Tensor Programs IVb: Adaptive Optimization in the Infinite-Width Limit

Authors: Greg Yang, Etai Littwin

Abstract: Going beyond stochastic gradient descent (SGD), what new phenomena emerge in wide neural networks trained by adaptive optimizers like Adam? Here we show: The same dichotomy between feature learning and kernel behaviors (as in SGD) holds for general optimizers as well, including Adam -- albeit with a nonlinear notion of "kernel." We derive the corresponding "neural tangent" and "maximal update" limits for any architecture. Two foundational advances underlie the above results: 1) A new Tensor Program language, NEXORT, that can express how adaptive optimizers process gradients into updates. 2) The introduction of bra-ket notation to drastically simplify expressions and calculations in Tensor Programs. This work summarizes and generalizes all previous results in the Tensor Programs series of papers.

Submitted 7 August, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

Comments: This is the complete version of "Adaptive Optimization in the Infinite-Width Limit" in ICLR 2023, https://openreview.net/forum?id=zgVDqw9ZUES

arXiv:2307.14509 [pdf, other]

CEERS MIRI Imaging: Data Reduction and Quality Assessment

Authors: Guang Yang, Casey Papovich, Micaela Bagley, Henry Ferguson, Steven Finkelstein, Anton Koekemoer, Pablo Pérez-González, Pablo Arrabal Haro, Laura Bisigello, Karina Caputi, Yingjie Cheng, Luca Costantin, Mark Dickinson, Adriano Fontana, Jonathan Gardner, Andrea Grazian, Norman Grogin, Santosh Harish, Benne Holwerda, Edoardo Iani, Jeyhan Kartaltepe, Lisa Kewley, Allison Kirkpatrick, Dale Kocevski, Vasily Kokorev, et al. (13 additional authors not shown)

Abstract: The Cosmic Evolution Early Release Science Survey (CEERS), targeting the Extended Groth Strip extragalactic field, is one of the JWST Director's Discretionary Early Release Science programs. To date, all observations have been executed and include NIRCam/MIRI imaging and NIRSpec/NIRCam spectroscopic exposures. Here, we discuss the MIRI imaging, which includes eight pointings, four of which provide deep imaging with the bluer bands (F560W, F770W) and four with contiguous wavelength coverage in F1000W, F1280W, F1500W, and F1800W, where two of these also include coverage in F770W and F2100W. We present a summary of the data, the data quality, and data reduction. The data reduction is based on the JWST Calibration Pipeline combined with custom modifications and additional steps designed to enhance the output quality, including improvements in astrometry and the removal of detector artifacts. We estimate the image depth of the reduced mosaics, and show that these generally agree with expectations from the Exposure Time Calculator. We compare the MIRI F560W and F770W flux densities for bright sources to measurements from Spitzer/IRAC Ch3 (5.8 $μ$m) and Ch4 (8.0 $μ$m), and we find that they agree with systematic differences of $<0.1$ mag. For the redder MIRI bands, we assess their quality by studying the spectral energy distributions (SEDs) of Galactic stars. The SEDs are consistent with the expected Rayleigh-Jeans law with a deviation $\sim 0.03$ mag, indicating that the MIRI colors are reliable. We also discuss all publicly released data products (images and source catalogs), which are available on the CEERS website (https://ceers.github.io/).

Submitted 15 September, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

Comments: 17 pages, 11 figures, and 4 tables. ApJL in press

arXiv:2307.13395 [pdf]

doi 10.1016/j.jallcom.2022.165550

Topological Insulator VxBi1.08-xSn0.02Sb0.9Te2S as a Promising n-type Thermoelectric Material

Authors: Lei Chen, Weiyao Zhao, Meng Li, Guangsai Yang, Lei Guo, Abudulhakim Bake, Peng Liu, David Cortie, Ren-Kui Zheng, Zhenxiang Cheng, Xiaolin Wang

Abstract: As one of the most important n-type thermoelectric (TE) materials, Bi2Te3 has been studied for decades, with efforts to enhance the thermoelectric performance based on element doping, band engineering, etc. In this study, we report a novel bulk-insulating topological material system as a replacement for n-type Bi2Te3 materials: V doped Bi1.08Sn0.02Sb0.9Te2S (V:BSSTS). The V:BSSTS is a bulk insulator with robust metallic topological surface states. Furthermore, the bulk band gap can be tuned by the doping level of V, which is verified by magnetotransport measurements. Large linear magnetoresistance is observed in all samples. Excellent thermoelectric performance is obtained in the V:BSSTS samples, e.g., the highest figure of merit ZT of ~ 0.8 is achieved in the 2% V doped sample (denoted as V0.02) at 550 K. The high thermoelectric performance of V:BSSTS can be attributed to two synergistic effects: (1) the low conductive secondary phases Sb2S3 and V2S3 are believed to be important scattering centers for phonons, leading to lower lattice thermal conductivity; and (2) the electrical conductivity is increased due to the high-mobility topological surface states at the boundaries. In addition, by replacing one third of costly tellurium with abundant, low-cost, and less-toxic sulfur element, the newly produced BSSTS material is inexpensive but still has comparable TE performance to the traditional Bi2Te3-based materials, which offers a cheaper plan for the electronics and thermoelectric industries. Our results demonstrate that topological materials with unique band structures can provide a new platform in the search for new high performance TE materials.

Submitted 25 July, 2023; originally announced July 2023.

Journal ref: Journal of Alloys and Compounds 918 (2022): 165550

arXiv:2307.13220 [pdf]

One for Multiple: Physics-informed Synthetic Data Boosts Generalizable Deep Learning for Fast MRI Reconstruction

Authors: Zi Wang, Xiaotong Yu, Chengyan Wang, Weibo Chen, Jiazheng Wang, Ying-Hua Chu, Hongwei Sun, Rushuai Li, Peiyong Li, Fan Yang, Haiwei Han, Taishan Kang, Jianzhong Lin, Chen Yang, Shufu Chang, Zhang Shi, Sha Hua, Yan Li, Juan Hu, Liuhong Zhu, Jianjun Zhou, Meijing Lin, Jiefeng Guo, Congbo Cai, Zhong Chen, et al. (3 additional authors not shown)

Abstract: Magnetic resonance imaging (MRI) is a widely used radiological modality renowned for its radiation-free, comprehensive insights into the human body, facilitating medical diagnoses. However, the drawback of prolonged scan times hinders its accessibility. The k-space undersampling offers a solution, yet the resultant artifacts necessitate meticulous removal during image reconstruction. Although Deep Learning (DL) has proven effective for fast MRI image reconstruction, its broader applicability across various imaging scenarios has been constrained. Challenges include the high cost and privacy restrictions associated with acquiring large-scale, diverse training data, coupled with the inherent difficulty of addressing mismatches between training and target data in existing DL methodologies. Here, we present a novel Physics-Informed Synthetic data learning framework for Fast MRI, called PISF. PISF marks a breakthrough by enabling generalized DL for multi-scenario MRI reconstruction through a single trained model. Our approach separates the reconstruction of a 2D image into many 1D basic problems, commencing with 1D data synthesis to facilitate generalization. We demonstrate that training DL models on synthetic data, coupled with enhanced learning techniques, yields in vivo MRI reconstructions comparable to or surpassing those of models trained on matched realistic datasets, reducing the reliance on real-world MRI data by up to 96%. Additionally, PISF exhibits remarkable generalizability across multiple vendors and imaging centers. Its adaptability to diverse patient populations has been validated through evaluations by ten experienced medical professionals. PISF presents a feasible and cost-effective way to significantly boost the widespread adoption of DL in various fast MRI applications.

Submitted 28 February, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: 38 pages, 19 figures, 5 tables

arXiv:2307.12812 [pdf, other]

Subcycle tomography of quantum light

Authors: Geehyun Yang, Matthias Kizmann, Alfred Leitenstorfer, Andrey S. Moskalenko

Abstract: Quantum light is considered to be one of the key resources of the coming second quantum revolution expected to give rise to groundbreaking technologies and applications. If the spatio-temporal and polarization structure of modes is known, the properties of quantum light are well understood. This information provides the basis for contemporary quantum optics and its applications in quantum communication and metrology. However, thinking about quantum light at the most fundamental timescale, namely the oscillation cycle of a mode or the inverse frequency of an involved photon, we realize that the corresponding picture has been missing until now. For instance, how to comprehend and characterize a single photon at this timescale? To fill this gap, we demonstrate theoretically how local quantum measurements allow one to reconstruct and visualize a quantum field under study at subcycle scales, even when its temporal mode structure is a priori unknown. In particular, generation and tomography of ultrabroadband squeezed states as well as photon-subtracted states derived from them are described, incorporating also single-photon states. Our results set a cornerstone in the emerging chapter of quantum physics termed time-domain quantum optics. We expect this development to elicit new spectroscopic concepts for approaching e.g. fundamental correlations and entanglement in the dynamics of quantum matter, overcoming the temporal limitation set by the oscillation cycles of both light and elementary excitations.

Submitted 24 July, 2023; originally announced July 2023.

Comments: 43 pages, 15 figures

arXiv:2307.12775 [pdf]

doi 10.1109/JBHI.2023.3348436

Is attention all you need in medical image analysis? A review

Authors: Giorgos Papanastasiou, Nikolaos Dikaios, Jiahao Huang, Chengjia Wang, Guang Yang

Abstract: Medical imaging is a key component in clinical diagnosis, treatment planning and clinical trial design, accounting for almost 90% of all healthcare data. CNNs achieved performance gains in medical image analysis (MIA) over the last years. CNNs can efficiently model local pixel interactions and be trained on small-scale MI data. The main disadvantage of typical CNN models is that they ignore global pixel relationships within images, which limits their generalisation ability to understand out-of-distribution data with different 'global' information. The recent progress of Artificial Intelligence gave rise to Transformers, which can learn global relationships from data. However, full Transformer models need to be trained on large-scale data and involve tremendous computational complexity. Attention and Transformer compartments (Transf/Attention) which can well maintain properties for modelling global relationships, have been proposed as lighter alternatives of full Transformers. Recently, there is an increasing trend to co-pollinate complementary local-global properties from CNN and Transf/Attention architectures, which led to a new era of hybrid models. The past years have witnessed substantial growth in hybrid CNN-Transf/Attention models across diverse MIA problems. In this systematic review, we survey existing hybrid CNN-Transf/Attention models, review and unravel key architectural designs, analyse breakthroughs, and evaluate current and future opportunities as well as challenges. We also introduced a comprehensive analysis framework on generalisation opportunities of scientific and clinical impact, based on which new data-driven domain generalisation and adaptation methods can be stimulated.

Submitted 24 July, 2023; originally announced July 2023.

arXiv:2307.12418 [pdf, other]

HateModerate: Testing Hate Speech Detectors against Content Moderation Policies

Authors: Jiangrui Zheng, Xueqing Liu, Guanqun Yang, Mirazul Haque, Xing Qian, Ravishka Rathnasuriya, Wei Yang, Girish Budhrani

Abstract: To protect users from massive hateful content, existing works studied automated hate speech detection. Despite the existing efforts, one question remains: do automated hate speech detectors conform to social media content policies? A platform's content policies are a checklist of content moderated by the social media platform. Because content moderation rules are often uniquely defined, existing hate speech datasets cannot directly answer this question. This work seeks to answer this question by creating HateModerate, a dataset for testing the behaviors of automated content moderators against content policies. First, we engage 28 annotators and GPT in a six-step annotation process, resulting in a list of hateful and non-hateful test suites matching each of Facebook's 41 hate speech policies. Second, we test the performance of state-of-the-art hate speech detectors against HateModerate, revealing substantial failures these models have in their conformity to the policies. Third, using HateModerate, we augment the training data of a top-downloaded hate detector on HuggingFace. We observe significant improvement in the models' conformity to content policies while having comparable scores on the original test data. Our dataset and code can be found in the attachment.

Submitted 18 March, 2024; v1 submitted 23 July, 2023; originally announced July 2023.

Comments: NAACL 2024 Findings

arXiv:2307.12376 [pdf, other]

Stark many-body localization with long-range interactions

Authors: Xiang-Ping Jiang, Rui Qi, Sheng Yang, Yayun Hu, Guangwen Yang

Abstract: In one-dimensional (1D) disorder-free interacting systems, a sufficiently strong linear potential can induce localization of the many-body eigenstates, a phenomenon dubbed Stark many-body localization (MBL). In this paper, we investigate the fate of Stark MBL in 1D spinless fermion systems with long-range interactions, specifically focusing on the role of interaction strength. We obtain the Stark MBL phase diagrams by computing the mean gap ratio and many-body inverse participation ratio at half-filling. We show that, for short-range interactions, there is a qualitative symmetry between the limits of weak and strong interactions. However, this symmetry is absent in the case of long-range interactions, where the system is always Stark many-body localized at strong interactions, regardless of the linear potential strength. Furthermore, we study the dynamics of imbalance and entanglement with various initial states using time-dependent variational principle (TDVP) numerical methods. We reveal that the dynamical quantities display a strong dependence on the initial conditions, which suggests that the Hilbert-space fragmentation precludes thermalization. Our results demonstrate the robustness of Stark MBL even in the presence of long-range interactions and offer an avenue to explore MBL in disorder-free systems with long-range interactions.
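For readers unfamiliar with the mean gap ratio used as a diagnostic above, the following is a minimal sketch of the standard adjacent-gap-ratio statistic: for sorted energy levels with consecutive gaps δ_n, each ratio is min(δ_n, δ_{n+1}) / max(δ_n, δ_{n+1}), and the ratios are averaged over the spectrum. The abstract does not spell out the definition, so this is assumed to be the one meant; the function name `mean_gap_ratio` and the toy spectrum are hypothetical.

```python
def mean_gap_ratio(energies):
    """Average of min(gap_n, gap_{n+1}) / max(gap_n, gap_{n+1}) over a spectrum."""
    levels = sorted(energies)
    # Consecutive level spacings of the sorted spectrum.
    gaps = [upper - lower for lower, upper in zip(levels, levels[1:])]
    ratios = []
    for g1, g2 in zip(gaps, gaps[1:]):
        largest = max(g1, g2)
        if largest > 0:  # skip fully degenerate triples to avoid dividing by zero
            ratios.append(min(g1, g2) / largest)
    if not ratios:
        raise ValueError("need at least three levels with a nonzero gap")
    return sum(ratios) / len(ratios)


# Toy spectrum: gaps are 1, 2, 3, so the ratios are 1/2 and 2/3.
toy_value = mean_gap_ratio([0.0, 1.0, 3.0, 6.0])
```

In practice the energies would come from diagonalizing the many-body Hamiltonian in a fixed symmetry sector; the toy list here only exercises the arithmetic.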

Submitted 23 July, 2023; originally announced July 2023.

arXiv:2307.11445 [pdf, other]

An Extended Nonlinear Stability Assessment Methodology For Type-4 Wind Turbines via Time Reversal Trajectory

Authors: Sujay Ghosh, Mohammad Kazem Bakhshizadeh, Guangya Yang, Łukasz Kocewiak

Abstract: As the integration of renewable energy generation increases and as conventional generation is phased out, there is a gradual decline in the grid's strength and resilience at the connection point of wind turbines (WTs). Previous studies have shown that traditional grid-following controlled converters exhibit deteriorating dynamic characteristics and may result in an unstable system when connected to a weak grid. Due to the limitations of linear analysis, transient stability investigations are necessary. However, existing methods, such as standalone time-domain simulations or analytical Lyapunov stability criteria, have drawbacks, including computational intensity or excessive conservatism. Our prior research proposed an innovative approach to estimate the system boundary - a time-limited region of attraction (TLRoA), using a hybrid linearised Lyapunov function-based method and the time-reversal technique to compensate for the known limitations. However, in that work, the accuracy of the estimated TLRoA was not investigated, i.e. the TLRoA was not compared against a forward simulated region of attraction, and the sensitivity of the TLRoA to the system parameters was not explored. Moreover, the framework did not consider nonlinear control elements such as PLL saturation. In this paper, we not only build upon our previous work and propose directions that address these gaps but also enhance its effectiveness by introducing optimal sampling to further improve the speed of estimating the TLRoA. Furthermore, the stability boundary is verified using time-domain simulation studies in PSCAD.

Submitted 27 November, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

arXiv:2307.10182 [pdf, other]

Enhancing Super-Resolution Networks through Realistic Thick-Slice CT Simulation

Authors: Zeyu Tang, Xiaodan Xing, Guang Yang

Abstract: Deep learning-based Generative Models have the potential to convert low-resolution CT images into high-resolution counterparts without long acquisition times and increased radiation exposure in thin-slice CT imaging. However, procuring appropriate training data for these Super-Resolution (SR) models is challenging. Previous SR research has simulated thick-slice CT images from thin-slice CT images to create training pairs. However, these methods either rely on simplistic interpolation techniques that lack realism or sinogram reconstruction, which require the release of raw data and complex reconstruction algorithms. Thus, we introduce a simple yet realistic method to generate thick CT images from thin-slice CT images, facilitating the creation of training pairs for SR algorithms. The training pairs produced by our method closely resemble real data distributions (PSNR=49.74 vs. 40.66, $p<0.05$). A multivariate Cox regression analysis involving thick slice CT images with lung fibrosis revealed that only the radiomics features extracted using our method demonstrated a significant correlation with mortality (HR=1.19 and HR=1.14, $p<0.005$). This paper represents the first to identify and address the challenge of generating appropriate paired training data for Deep Learning-based CT SR models, which enhances the efficacy and applicability of SR models in real-world scenarios.

Submitted 2 June, 2024; v1 submitted 2 July, 2023; originally announced July 2023.

Comments: 11 pages, 4 figures

arXiv:2307.09802 [pdf]

Magneto-transport and electronic structures in MoSi$_2$ bulks and thin films with different orientations

Authors: W. Afzal, F. Yun, Z. Li, Z. Yue, W. Zhao, L. Sang, G. Yang, Y. He, G. Peleckis, M. Fuhrer, X. Wang

Abstract: We report a comprehensive study of magneto-transport properties in MoSi$_2$ bulk and thin films. Textured MoSi$_2$ thin films of around 70 nm were deposited on silicon substrates with different orientations. Giant magnetoresistance of 1000% was observed in sintered bulk samples, while MoSi$_2$ single crystals exhibit a magnetoresistance (MR) value of 800% at low temperatures. At low temperatures, the MR of the textured thin films shows weak anti-localization behaviour owing to spin-orbit coupling effects. Our first-principles calculations show the presence of surface states in this material. The resistivity of all the MoSi$_2$ thin films is significantly low and nearly independent of temperature, which is important for electronic devices.

Submitted 19 July, 2023; originally announced July 2023.

Journal ref: Journal of Alloys and Compounds Volume 858, 157670, 25 March 2021

arXiv:2307.09587 [pdf, other]

doi 10.1103/PhysRevB.109.085412

Moiré pattern assisted geometric resonant tunneling in disordered twisted bilayer graphene

Authors: Zhe Hou, Ya-Yun Hu, Guang-Wen Yang

Abstract: We investigate the mesoscopic transport through a twisted bilayer graphene (TBG) consisting of a clean graphene nanoribbon on the bottom and a disordered graphene disc on the top. We show that, with strong top-layer disorder, the transmission through such a device shows a sequence of resonant peaks with respect to the rotation angle $θ$, where at the resonance angles $θ_c$ the disc region contains one giant hexagonal moiré supercell. A further investigation shows that the value of $θ_c$ shows negligible dependence on the disorder strength, the Fermi energy, and the shape distortion, indicating the resonance is a robust geometric feature of the moiré supercell. We explain this geometric resonance based on the bound states formed inside the moiré supercell, with their averaged local density of states dominating at the AA stacking region while minimizing at the AB stacking region. By increasing the interlayer distance, the peak becomes less pronounced, which further confirms the role of interlayer coupling. The results presented here suggest a new mechanism to tune the quantum transport signal through the twist angle in disordered moiré systems.

Submitted 18 July, 2023; originally announced July 2023.

Journal ref: Physical Review B 109, 085412 (2024)

arXiv:2307.09503 [pdf, other]

CEERS Key Paper VIII: Emission Line Ratios from NIRSpec and NIRCam Wide-Field Slitless Spectroscopy at z>2

Authors: Bren E. Backhaus, Jonathan R. Trump, Nor Pirzkal, Guillermo Barro, Steven L. Finkelstein, Pablo Arrabal Haro, Raymond C. Simons, Jessica Wessner, Nikko J. Cleri, Michaela Hirschmann, Micaela B. Bagley, David C. Nicholls, Mark Dickinson, Jeyhan S. Kartaltepe, Casey Papovich, Dale D. Kocevski, Anton M. Koekemoer, Laura Bisigello, Anne E. Jaskot, Ray A. Lucas, Intae Jung, Stephen M. Wilkins, L. Y. Aaron Yung, Henry C. Ferguson, Adriano Fontana, et al. (15 additional authors not shown)

Abstract: We use James Webb Space Telescope Near-Infrared Camera Wide Field Slitless Spectroscopy (NIRCam WFSS) and Near-Infrared Spectrograph (NIRSpec) in the Cosmic Evolution Early Release survey (CEERS) to measure rest-frame optical emission lines of 155 galaxies at z>2. The blind NIRCam grism observations include a sample of galaxies with bright emission lines that were not observed on the NIRSpec masks. We study the changes of the Ha, [OIII]/Hb, and [NeIII]/[OII] emission lines in terms of redshift by comparing to lower redshift SDSS and CLEAR samples. We find a significant (>3$σ$) correlation between [OIII]/Hb and redshift, while [NeIII]/[OII] has a marginal (2$σ$) correlation with redshift. We compare [OIII]/Hb and [NeIII]/[OII] to stellar mass and Hb SFR. We find that both emission-line ratios have a correlation with Hb SFR and an anti-correlation with stellar mass across the redshifts 0<z<9. Comparison with MAPPINGS V models indicates that these trends are consistent with lower metallicity and higher ionization in low-mass and high-SFR galaxies. We additionally compare to IllustrisTNG predictions and find that they effectively describe the highest [OIII]/Hb ratios observed in our sample, without the need to invoke MAPPINGS models with significant shock ionization components.

Submitted 7 September, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

Comments: 16 pages, 11 figures

arXiv:2307.08989 [pdf, other]

GraphCL-DTA: a graph contrastive learning with molecular semantics for drug-target binding affinity prediction

Authors: Xinxing Yang, Genke Yang, Jian Chu

Abstract: Drug-target binding affinity prediction plays an important role in the early stages of drug discovery, as it can infer the strength of interactions between new drugs and new targets. However, the performance of previous computational models is limited by the following drawbacks. The learning of drug representation relies only on supervised data, without taking into account the information contained in the molecular graph itself. Moreover, most previous studies tended to design complicated representation learning modules, while uniformity, which is used to measure representation quality, is ignored. In this study, we propose GraphCL-DTA, a graph contrastive learning method with molecular semantics for drug-target binding affinity prediction. In GraphCL-DTA, we design a graph contrastive learning framework for molecular graphs to learn drug representations, so that the semantics of molecular graphs are preserved. Through this graph contrastive framework, a more essential and effective drug representation can be learned without additional supervised data. Next, we design a new loss function that can be directly used to smoothly adjust the uniformity of drug and target representations. By directly optimizing the uniformity of representations, the representation quality of drugs and targets can be improved. The effectiveness of the above innovative elements is verified on two real datasets, KIBA and Davis. The excellent performance of GraphCL-DTA on the above datasets suggests its superiority to the state-of-the-art model.

Submitted 18 July, 2023; originally announced July 2023.

Comments: 13 pages, 4 figures, 5 tables

arXiv:2307.08640 [pdf, other]

doi 10.22331/q-2024-01-24-1232

A new quantum machine learning algorithm: split hidden quantum Markov model inspired by quantum conditional master equation

Authors: Xiao-Yu Li, Qin-Sheng Zhu, Yong Hu, Hao Wu, Guo-Wu Yang, Lian-Hui Yu, Geng Chen

Abstract: The Hidden Quantum Markov Model (HQMM) has significant potential for analyzing time-series data and studying stochastic processes in the quantum domain as an upgrading option with potential advantages over classical Markov models. In this paper, we introduce the split HQMM (SHQMM) for implementing the hidden quantum Markov process, utilizing the conditional master equation with a fine balance condition to demonstrate the interconnections among the internal states of the quantum system. The experimental results suggest that our model outperforms previous models in terms of scope of applications and robustness. Additionally, we establish a new learning algorithm to solve parameters in HQMM by relating the quantum conditional master equation to the HQMM. Finally, our study provides clear evidence that the quantum transport system can be considered a physical representation of HQMM. The SHQMM with accompanying algorithms presents a novel method to analyze quantum systems and time series grounded in physical implementation.

Submitted 18 January, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

Journal ref: Quantum 8, 1232 (2024)

Showing 201–250 of 1,451 results for author: Yang, G