Showing 1–48 of 48 results for author: Yao, X

Searching in archive eess.

  1. arXiv:2406.14977  [pdf, other]

    cs.AI eess.IV

    Trustworthy Enhanced Multi-view Multi-modal Alzheimer's Disease Prediction with Brain-wide Imaging Transcriptomics Data

    Authors: Shan Cong, Zhoujie Fan, Hongwei Liu, Yinghan Zhang, Xin Wang, Haoran Luo, Xiaohui Yao

    Abstract: Brain transcriptomics provides insights into the molecular mechanisms by which the brain coordinates its functions and processes. However, existing multimodal methods for predicting Alzheimer's disease (AD) primarily rely on imaging and sometimes genetic data, often neglecting the transcriptomic basis of the brain. Furthermore, while striving to integrate complementary information between modalities,…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2406.09317  [pdf, other]

    eess.IV cs.CV

    Common and Rare Fundus Diseases Identification Using Vision-Language Foundation Model with Knowledge of Over 400 Diseases

    Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, **ming Guo, Xiaolin Chen, **gcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

    Abstract: Previous foundation models for retinal images were pre-trained with limited disease categories and knowledge base. Here we introduce RetiZero, a vision-language foundation model that leverages knowledge from over 400 fundus diseases. For RetiZero's pre-training, we compiled 341,896 fundus images paired with text descriptions, sourced from public datasets, ophthalmic literature, and online resources…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2405.07739  [pdf, ps, other]

    eess.SP

    A Low-rank Projected Proximal Gradient Method for Spectral Compressed Sensing

    Authors: Xi Yao, Wei Dai

    Abstract: This paper presents a new approach to the recovery of a spectrally sparse signal (SSS) from partially observed entries, focusing on challenges posed by large-scale data and heavy noise environments. The SSS reconstruction can be formulated as a non-convex low-rank Hankel recovery problem. Traditional formulations for SSS recovery often suffer from reconstruction inaccuracies due to unequally weigh…

    Submitted 13 May, 2024; originally announced May 2024.

  4. arXiv:2405.02801  [pdf, other]

    cs.SD cs.AI eess.AS

    Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models

    Authors: Tianze Xu, Jiajun Li, Xuesong Chen, Xinrui Yao, Shuchang Liu

    Abstract: In recent years, AI-Generated Content (AIGC) has witnessed rapid advancements, facilitating the generation of music, images, and other forms of artistic expression across various industries. However, research on general multi-modal music generation models remains scarce. To fill this gap, we propose a multi-modal music generation framework, Mozart's Touch. It could generate aligned music with the c…

    Submitted 7 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: 7 pages, 2 figures, submitted to ACM MM 2024

  5. In-situ process monitoring and adaptive quality enhancement in laser additive manufacturing: a critical review

    Authors: Lequn Chen, Guijun Bi, Xiling Yao, **long Su, Chaolin Tan, Wenhe Feng, Michalis Benakis, Youxiang Chew, Seung Ki Moon

    Abstract: Laser Additive Manufacturing (LAM) presents unparalleled opportunities for fabricating complex, high-performance structures and components with unique material properties. Despite these advancements, achieving consistent part quality and process repeatability remains challenging. This paper provides a comprehensive review of various state-of-the-art in-situ process monitoring techniques, including…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 107 Pages, 29 Figures. Paper Accepted At Journal of Manufacturing Systems

  6. Channel Estimation for Stacked Intelligent Metasurface-Assisted Wireless Networks

    Authors: Xianghao Yao, Jiancheng An, Lu Gan, Marco Di Renzo, Chau Yuen

    Abstract: Emerging technologies, such as holographic multiple-input multiple-output (HMIMO) and stacked intelligent metasurface (SIM), are driving the development of wireless communication systems. Specifically, the SIM is physically constructed by stacking multiple layers of metasurfaces and has an architecture similar to an artificial neural network (ANN), which can flexibly manipulate the electromagnetic…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: 13 pages, 3 figures, accepted by IEEE WCL

  7. arXiv:2402.18946  [pdf, other]

    cs.LG eess.SY

    Real-Time Adaptive Safety-Critical Control with Gaussian Processes in High-Order Uncertain Models

    Authors: Yu Zhang, Long Wen, Xiangtong Yao, Zhenshan Bing, Linghuan Kong, Wei He, Alois Knoll

    Abstract: This paper presents an adaptive online learning framework for systems with uncertain parameters to ensure safety-critical control in non-stationary environments. Our approach consists of two phases. The initial phase is centered on a novel sparse Gaussian process (GP) framework. We first integrate a forgetting factor to refine a variational sparse GP algorithm, thus enhancing its adaptability. Sub…

    Submitted 5 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  8. arXiv:2311.13028  [pdf, other]

    cs.LG cs.AI cs.DC eess.SP

    DMLR: Data-centric Machine Learning Research -- Past, Present and Future

    Authors: Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš, et al. (13 additional authors not shown)

    Abstract: Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods tow…

    Submitted 1 June, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Published in the Journal of Data-centric Machine Learning Research (DMLR) at https://data.mlr.press/assets/pdf/v01-5.pdf

  9. arXiv:2310.18732  [pdf, ps, other]

    physics.optics eess.IV

    Tracking and fast imaging of a translational object via Fourier modulation

    Authors: Shijian Li, Xu-ri Yao, Wei Zhang, Yeliang Wang, Qing Zhao

    Abstract: The tracking and imaging of high-speed moving objects hold significant promise for application in various fields. Single-pixel imaging enables the progressive capture of a fast-moving translational object through motion compensation. However, achieving a balance between a short reconstruction time and a good image quality is challenging. In this study, we present an approach that simultaneously inc…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: 6 figures

  10. arXiv:2310.14965  [pdf, ps, other]

    eess.IV physics.optics

    Parallel compressive super-resolution imaging with wide field-of-view based on physics enhanced network

    Authors: Xiao-Peng **, An-Dong Xiong, Wei Zhang, Xiao-Qing Wang, Fan Liu, Chang-Heng Li, Xu-Ri Yao, Xue-Feng Liu, Qing Zhao

    Abstract: Achieving both high-performance and wide field-of-view (FOV) super-resolution imaging has been attracting increasing attention in recent years. However, such a goal suffers from long reconstruction time and huge storage space. Parallel compressive imaging (PCI) provides an efficient solution, but the super-resolution quality and imaging speed are strongly dependent on precise optical transfer functi…

    Submitted 20 October, 2023; originally announced October 2023.

  11. arXiv:2309.15697  [pdf, other]

    cs.CV eess.IV

    Physics Inspired Hybrid Attention for SAR Target Recognition

    Authors: Zhongling Huang, Chong Wu, Xiwen Yao, Zhicheng Zhao, Xiankai Huang, Junwei Han

    Abstract: There has been a recent emphasis on integrating physical models and deep neural networks (DNNs) for SAR target recognition, to improve performance and achieve a higher level of physical interpretability. The attributed scattering center (ASC) parameters garnered the most interest, being considered as additional input data or features for fusion in most methods. However, the performance greatly dep…

    Submitted 27 September, 2023; originally announced September 2023.

  12. arXiv:2308.13906  [pdf, other]

    eess.SP cs.LG

    A Two-Dimensional Deep Network for RF-based Drone Detection and Identification Towards Secure Coverage Extension

    Authors: Zixiao Zhao, Qinghe Du, Xiang Yao, Lei Lu, Shijiao Zhang

    Abstract: As drones become increasingly prevalent in human life, they also raise security concerns such as unauthorized access and control, as well as collisions and interference with manned aircraft. Therefore, ensuring the ability to accurately detect and identify between different drones holds significant implications for coverage extension. Assisted by machine learning, radio frequency (RF) detection c…

    Submitted 26 August, 2023; originally announced August 2023.

  13. arXiv:2308.06377  [pdf, other]

    eess.IV cs.CV

    CATS v2: Hybrid encoders for robust medical segmentation

    Authors: Hao Li, Han Liu, Dewei Hu, Xing Yao, Jiacheng Wang, Ipek Oguz

    Abstract: Convolutional Neural Networks (CNNs) have exhibited strong performance in medical image segmentation tasks by capturing high-level (local) information, such as edges and textures. However, due to the limited field of view of the convolution kernel, it is hard for CNNs to fully represent global information. Recently, transformers have shown good performance for medical image segmentation due to their a…

    Submitted 31 January, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

  14. arXiv:2307.12377  [pdf, other]

    eess.IV

    4D Feet: Registering Walking Foot Shapes Using Attention Enhanced Dynamic-Synchronized Graph Convolutional LSTM Network

    Authors: Farzam Tajdari, Toon Huysmans, Xinhe Yao, Jun Xu, Yu Song

    Abstract: 4D scans of dynamic deformable human body parts help researchers have a better understanding of spatiotemporal features. However, reconstructing 4D scans based on multiple asynchronous cameras encounters two main challenges: 1) finding the dynamic correspondences among different frames captured by each camera at the timestamps of the camera in terms of dynamic feature recognition, and 2) reconstru…

    Submitted 23 July, 2023; originally announced July 2023.

  15. arXiv:2307.00245  [pdf, other]

    eess.IV cs.CV

    Deep Angiogram: Trivializing Retinal Vessel Segmentation

    Authors: Dewei Hu, Xing Yao, Jiacheng Wang, Yuankai K. Tao, Ipek Oguz

    Abstract: Among the research efforts to segment the retinal vasculature from fundus images, deep learning models consistently achieve superior performance. However, this data-driven approach is very sensitive to domain shifts. For fundus images, such data distribution changes can easily be caused by variations in illumination conditions as well as the presence of disease-related features such as hemorrhages…

    Submitted 1 July, 2023; originally announced July 2023.

    Comments: 5 pages, 4 figures, SPIE 2023

    Journal ref: In Medical Imaging 2023: Image Processing, vol. 12464, pp. 656-660. SPIE, 2023

  16. arXiv:2305.13596  [pdf]

    eess.IV eess.AS eess.SP

    Multimodal sensor fusion for real-time location-dependent defect detection in laser-directed energy deposition

    Authors: Lequn Chen, Xiling Yao, Wenhe Feng, Youxiang Chew, Seung Ki Moon

    Abstract: Real-time defect detection is crucial in laser-directed energy deposition (L-DED) additive manufacturing (AM). The traditional in-situ monitoring approach utilizes a single sensor (i.e., acoustic, visual, or thermal sensor) to capture the complex process dynamic behaviors, which is insufficient for defect detection with high accuracy and robustness. This paper proposes a novel multimodal sensor fusion…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 8 pages, 10 figures. This paper has been accepted to be published in the proceedings of IDETC-CIE 2023

  17. arXiv:2304.05685  [pdf]

    eess.IV eess.SP eess.SY

    Multisensor fusion-based digital twin in additive manufacturing for in-situ quality monitoring and defect correction

    Authors: Lequn Chen, Xiling Yao, Kui Liu, Chaolin Tan, Seung Ki Moon

    Abstract: Early detection and correction of defects are critical in additive manufacturing (AM) to avoid build failures. In this paper, we present a multisensor fusion-based digital twin for in-situ quality monitoring and defect correction in a robotic laser direct energy deposition process. Multisensor fusion sources consist of an acoustic sensor, an infrared thermal camera, a coaxial vision camera, and a…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: 11 pages, 9 figures. Accepted at 24th International Conference on Engineering Design (ICED23)

  18. arXiv:2304.04598  [pdf]

    cs.SD eess.AS eess.SP

    In-situ crack and keyhole pore detection in laser directed energy deposition through acoustic signal and deep learning

    Authors: Lequn Chen, Xiling Yao, Chaolin Tan, Weiyang He, **long Su, Fei Weng, Youxiang Chew, Nicholas Poh Huat Ng, Seung Ki Moon

    Abstract: Cracks and keyhole pores are detrimental defects in alloys produced by laser directed energy deposition (LDED). Laser-material interaction sound may hold information about underlying complex physical events such as crack propagation and pore formation. However, due to the noisy environment and intricate signal content, acoustic-based monitoring in LDED has received little attention. This paper pr…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: 36 Pages, 16 Figures, accepted at journal Additive Manufacturing

  19. arXiv:2212.14840  [pdf]

    physics.med-ph eess.IV physics.bio-ph

    Normalized Blood Flow Index in Optical Coherence Tomography Angiography Provides a Sensitive Biomarker of Early Diabetic Retinopathy

    Authors: Albert K. Dadzie, David Le, Mansour Abtahi, Behrouz Ebrahimi, Taeyoon Son, Jennifer I. Lim, Xincheng Yao

    Abstract: Purpose: To evaluate the sensitivity of normalized blood flow index (NBFI) for detecting early diabetic retinopathy (DR). Methods: Optical coherence tomography angiography (OCTA) images of 30 eyes from 20 healthy controls, 21 eyes of diabetic patients with no DR (NoDR) and 26 eyes from 22 patients with mild non-proliferative DR (NPDR) were analyzed in this study. The OCTA images were centered on t…

    Submitted 22 December, 2022; originally announced December 2022.

  20. arXiv:2212.13257  [pdf]

    physics.med-ph cs.CV eess.IV eess.SY physics.optics

    A portable widefield fundus camera with high dynamic range imaging capability

    Authors: Alfa Rossi, Mojtaba Rahimi, David Le, Taeyoon Son, Michael J. Heiferman, R. V. Paul Chan, Xincheng Yao

    Abstract: Fundus photography is indispensable for clinical detection and management of eye diseases. Limited image contrast and field of view (FOV) are common limitations of conventional fundus cameras, making it difficult to detect subtle abnormalities at the early stages of eye diseases. Further improvements of image contrast and FOV coverage are important to improve early disease detection and reliable t…

    Submitted 20 December, 2022; originally announced December 2022.

    Comments: 12 pages, 8 figures

  21. arXiv:2212.00027  [pdf, other]

    eess.IV physics.optics

    Imaging across multiple spatial scales with the multi-camera array microscope

    Authors: Mark Harfouche, Kanghyun Kim, Kevin C. Zhou, Pavan Chandra Konda, Sunanda Sharma, Eric E. Thomson, Colin Cooke, Shiqi Xu, Lucas Kreiss, Amey Chaware, Xi Yang, Xing Yao, Vinayak Pathak, Martin Bohlen, Ron Appel, Aurélien Bègue, Clare Cook, Jed Doman, John Efromson, Gregor Horstmeyer, Jaehee Park, Paul Reamey, Veton Saliu, Eva Naumann, Roarke Horstmeyer

    Abstract: This article experimentally examines different configurations of a novel multi-camera array microscope (MCAM) imaging technology. The MCAM is based upon a densely packed array of "micro-cameras" to jointly image across a large field-of-view at high resolution. Each micro-camera within the array images a unique area of a sample of interest, and then all acquired data with 54 micro-cameras are digit…

    Submitted 28 February, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

  22. arXiv:2211.14467  [pdf]

    cs.CV eess.IV

    Self-Supervised Surgical Instrument 3D Reconstruction from a Single Camera Image

    Authors: Ange Lou, Xing Yao, Ziteng Liu, **tong Han, Jack Noble

    Abstract: Surgical instrument tracking is an active research area that can provide surgeons feedback about the location of their tools relative to anatomy. Recent tracking methods are mainly divided into two parts: segmentation and object detection. However, both can only predict 2D information, which is limiting for application to real-world surgery. An accurate 3D surgical instrument model is a prerequisi…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: Accepted by SPIE Medical Imaging 2023

  23. arXiv:2209.04449  [pdf, other]

    eess.IV physics.optics

    A detail-enhanced sampling strategy in Hadamard single-pixel imaging

    Authors: Yan Cai, Shijian Li, Wei Zhang, Hao Wu, Xu-ri Yao, Qing Zhao

    Abstract: Hadamard single-pixel imaging (HSI) is an appealing imaging technique due to its features of low hardware complexity and industrial cost. To improve imaging efficiency, many studies have focused on sorting Hadamard patterns to obtain reliable reconstructed images with very few samples. In this study, we present an efficient HSI imaging method that employs an exponential probability function to sam…

    Submitted 9 September, 2022; originally announced September 2022.

    Comments: 14 pages, 12 figures, 1 table

  24. arXiv:2209.01554  [pdf, ps, other]

    physics.optics eess.IV

    Single-pixel imaging of a translational object

    Authors: Shijian Li, Yan Cai, Yeliang Wang, Xu-ri Yao, Qing Zhao

    Abstract: Image-free tracking methods based on single-pixel detectors (SPDs) can track a moving object at a very high frame rate, but they can rarely achieve simultaneous imaging of such an object. In this study, we propose a method for simultaneously obtaining the relative displacements and images of a translational object. Four binary Fourier patterns and two differential Hadamard patterns are used to mod…

    Submitted 31 January, 2023; v1 submitted 4 September, 2022; originally announced September 2022.

    Comments: 15 pages, 9 figures, 1 table

  25. arXiv:2207.04324  [pdf, other]

    eess.IV cs.CV stat.ML

    Video Coding Using Learned Latent GAN Compression

    Authors: Mustafa Shukor, Bharath Bhushan Damodaran, Xu Yao, Pierre Hellier

    Abstract: We propose in this paper a new paradigm for facial video compression. We leverage the generative capacity of GANs such as StyleGAN to represent and compress a video, including intra and inter compression. Each frame is inverted in the latent space of StyleGAN, from which the optimal compression is learned. To do so, a diffeomorphic latent representation is learned using a normalizing flows model,…

    Submitted 12 July, 2022; v1 submitted 9 July, 2022; originally announced July 2022.

    Comments: Accepted at ACM Multimedia 2022

  26. arXiv:2204.11769  [pdf, ps, other]

    eess.IV cs.AI

    Multi-scale reconstruction of undersampled spectral-spatial OCT data for coronary imaging using deep learning

    Authors: Xueshen Li, Shengting Cao, Hongshan Liu, Xinwen Yao, Brigitta C. Brott, Silvio H. Litovsky, Xiaoyu Song, Yuye Ling, Yu Gan

    Abstract: Coronary artery disease (CAD) is a cardiovascular condition with high morbidity and mortality. Intravascular optical coherence tomography (IVOCT) has been considered as an optimal imaging system for the diagnosis and treatment of CAD. Constrained by the Nyquist theorem, dense sampling in IVOCT attains high resolving power to delineate cellular structures/features. There is a trade-off between high…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: 11 pages, 8 figures, reviewed by IEEE trans BME

  27. arXiv:2203.15177  [pdf]

    cs.CV eess.IV

    Min-Max Similarity: A Contrastive Semi-Supervised Deep Learning Network for Surgical Tools Segmentation

    Authors: Ange Lou, Kareem Tawfik, Xing Yao, Ziteng Liu, Jack Noble

    Abstract: A common problem with segmentation of medical images using neural networks is the difficulty of obtaining a significant amount of pixel-level annotated data for training. To address this issue, we proposed a semi-supervised segmentation network based on contrastive learning. In contrast to the previous state-of-the-art, we introduce Min-Max Similarity (MMS), a contrastive learning form of dual-view t…

    Submitted 22 February, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

  28. arXiv:2202.09414  [pdf]

    physics.med-ph eess.IV physics.bio-ph physics.optics

    Functional Optical Coherence Tomography for Intrinsic Signal Optoretinography: Recent Developments and Deployment Challenges

    Authors: Tae-Hoon Kim, Guangying Ma, Taeyoon Son, Xincheng Yao

    Abstract: Intrinsic optical signal (IOS) imaging of the retina, also termed optoretinography (ORG), promises a noninvasive method for objective assessment of retinal function. By providing unparalleled capability to differentiate individual layers of the retina, functional optical coherence tomography (OCT) has been actively investigated for intrinsic signal ORG measurements. However, clinical deployment…

    Submitted 18 February, 2022; originally announced February 2022.

  29. arXiv:2201.12625  [pdf]

    eess.IV cs.CV q-bio.TO

    ADC-Net: An Open-Source Deep Learning Network for Automated Dispersion Compensation in Optical Coherence Tomography

    Authors: Shaiban Ahmed, David Le, Taeyoon Son, Tobiloba Adejumo, Xincheng Yao (affiliations as listed: Department of Biomedical Engineering, University of Illinois at Chicago; Department of Ophthalmology, Visual Science, University of Illinois at Chicago)

    Abstract: Chromatic dispersion is a common problem that degrades the system resolution in optical coherence tomography (OCT). This study aims to develop a deep learning network for automated dispersion compensation (ADC-Net) in OCT. The ADC-Net is based on a redesigned UNet architecture which employs an encoder-decoder pipeline. The input section encompasses partially compensated OCT B-scans with individual reti…

    Submitted 29 January, 2022; originally announced January 2022.

    Comments: 18 pages, 5 figures

  30. arXiv:2201.00466  [pdf, other]

    eess.IV cs.CV

    RFormer: Transformer-based Generative Adversarial Network for Real Fundus Image Restoration on A New Clinical Benchmark

    Authors: Zhuo Deng, Yuanhao Cai, Lu Chen, Zheng Gong, Qiqi Bao, Xue Yao, Dong Fang, Shaochong Zhang, Lan Ma

    Abstract: Ophthalmologists have used fundus images to screen and diagnose eye diseases. However, different equipment and ophthalmologists introduce large variations in the quality of fundus images. Low-quality (LQ) degraded fundus images easily lead to uncertainty in clinical screening and generally increase the risk of misdiagnosis. Thus, real fundus image restoration is worth studying. Unfortunately, real cli…

    Submitted 3 August, 2022; v1 submitted 2 January, 2022; originally announced January 2022.

    Comments: IEEE J-BHI 2022; The First Benchmark and First Transformer-based Method for Real Clinical Fundus Image Restoration

  31. arXiv:2112.07775  [pdf]

    q-bio.TO eess.IV

    Depth-resolved vascular profile features for artery-vein classification in OCT and OCT angiography of human retina

    Authors: Tobiloba Adejumo, Tae-Hoon Kim, David Le, Taeyoon Son, Guangying Ma, Xincheng Yao

    Abstract: This study aims to characterize reflectance profiles of retinal blood vessels in optical coherence tomography (OCT), and to validate these vascular features to guide artery-vein classification in OCT angiography (OCTA) of human retina. Depth-resolved OCT reveals unique features of retinal arteries and veins. Retinal arteries show hyper-reflective boundaries at both upper (inner side towards the vitr…

    Submitted 6 February, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: 11 pages, 4 figures

  32. arXiv:2111.10635  [pdf, other]

    cs.DC cs.AI cs.LG eess.SY

    HeterPS: Distributed Deep Learning With Reinforcement Learning Based Scheduling in Heterogeneous Environments

    Authors: Ji Liu, Zhihua Wu, Dianhai Yu, Yanjun Ma, Danlei Feng, Minxu Zhang, Xinxuan Wu, Xuefeng Yao, De**g Dou

    Abstract: Deep neural networks (DNNs) exploit many layers and a large number of parameters to achieve excellent performance. The training process of DNN models generally handles large-scale input data with many sparse features, which incurs high Input/Output (IO) cost, while some layers are compute-intensive. The training process generally exploits distributed computing resources to reduce training time. In…

    Submitted 7 June, 2023; v1 submitted 20 November, 2021; originally announced November 2021.

    Comments: 14 pages, 11 figures, 2 tables; To appear in Future Generation Computer Systems (FGCS)

  33. Physically Explainable CNN for SAR Image Classification

    Authors: Zhongling Huang, Xiwen Yao, Ying Liu, Corneliu Octavian Dumitru, Mihai Datcu, Junwei Han

    Abstract: Integrating the special electromagnetic characteristics of Synthetic Aperture Radar (SAR) in deep neural networks is essential in order to enhance the explainability and physics awareness of deep learning. In this paper, we first propose a novel physically explainable convolutional neural network for SAR image classification, namely physics guided and injected learning (PGIL). It comprises three p…

    Submitted 2 June, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

    Journal ref: ISPRS Journal of Photogrammetry and Remote Sensing Volume 190, August 2022, Pages 25-37

  34. arXiv:2110.04921  [pdf, other]

    cs.CV eess.IV physics.optics q-bio.CB

    Increasing a microscope's effective field of view via overlapped imaging and machine learning

    Authors: Xing Yao, Vinayak Pathak, Haoran Xi, Amey Chaware, Colin Cooke, Kanghyun Kim, Shiqi Xu, Yuting Li, Timothy Dunn, Pavan Chandra Konda, Kevin C. Zhou, Roarke Horstmeyer

    Abstract: This work demonstrates a multi-lens microscopic imaging system that overlaps multiple independent fields of view on a single sensor for high-efficiency automated specimen analysis. Automatic detection, classification and counting of various morphological features of interest is now a crucial component of both biomedical research and disease diagnosis. While convolutional neural networks (CNNs) hav…

    Submitted 10 October, 2021; originally announced October 2021.

  35. arXiv:2106.07564  [pdf]

    cs.CV cs.LG eess.IV

    An optimized Capsule-LSTM model for facial expression recognition with video sequences

    Authors: Siwei Liu, Yuanpeng Long, Gao Xu, Lijia Yang, Shimei Xu, Xiaoming Yao, Kunxian Shu

    Abstract: To overcome the limitations of convolutional neural networks in the process of facial expression recognition, a facial expression recognition model Capsule-LSTM based on video frame sequence is proposed. This model is composed of three networks including capsule encoders, capsule decoders and LSTM network. The capsule encoder extracts the spatial information of facial expressions in video frames. Ca…

    Submitted 27 May, 2021; originally announced June 2021.

    Comments: 14 pages, 4 figures

  36. arXiv:2106.07563  [pdf]

    cs.CV cs.LG eess.IV

    BPLF: A Bi-Parallel Linear Flow Model for Facial Expression Generation from Emotion Set Images

    Authors: Gao Xu, Yuanpeng Long, Siwei Liu, Lijia Yang, Shimei Xu, Xiaoming Yao, Kunxian Shu

    Abstract: The flow-based generative model is a deep learning generative model, which obtains the ability to generate data by explicitly learning the data distribution. Theoretically its ability to restore data is stronger than other generative models. However, its implementation has many limitations, including limited model design, too many model parameters and tedious calculation. In this paper, a bi-paral…

    Submitted 27 May, 2021; originally announced June 2021.

    Comments: 20 pages, 10 figures

  37. arXiv:2106.06909  [pdf, other]

    cs.SD cs.CL eess.AS

    GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

    Authors: Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie **, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan

    Abstract: This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training. Around 40,000 hours of transcribed audio is first collected from audiobooks, podcasts and YouTube, covering both read and spontaneous sp…

    Submitted 13 June, 2021; originally announced June 2021.

  38. arXiv:2105.04340  [pdf]

    eess.SY

    Interaction Theory of Hazard-Target System

    Authors: Ji Ge, Yu-Yuan Zhang, Kai-Li Xu, Ji-Shuo Li, Xi-Wen Yao, Chun-Ying Wu, Shuang-Yuan Li, Fang Yan, **-Jia Zhang, Qing-Wei Xu

    Abstract: Major accidents (e.g., the Space Shuttle Challenger disaster in the USA, the Bhopal Disaster in India, Fukushima nuclear accident in Japan, Tianjin Port fire and explosion accident in China) have occurred all over the world. Safety scientists are always trying to understand why these accidents happened and how to prevent these accidents. Accident models and theories form the basis for many safety…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: 28 pages, 9 figures, 3 tables

  39. arXiv:2009.05103  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space

    Authors: Sicheng Zhao, Yaxian Li, Xingxu Yao, Weizhi Nie, Pengfei Xu, Jufeng Yang, Kurt Keutzer

    Abstract: Both images and music can convey rich semantics and are widely used to induce specific emotions. Matching images and music with similar emotions might help to make emotion perceptions more vivid and stronger. Existing emotion-based image and music matching methods either employ limited categorical emotion states which cannot well reflect the complexity and subtlety of emotions, or train the matchi…

    Submitted 22 August, 2020; originally announced September 2020.

    Comments: Accepted by ACM Multimedia 2020

  40. arXiv:2006.03742  [pdf

    eess.IV q-bio.QM

    AV-Net: Deep learning for fully automated artery-vein classification in optical coherence tomography angiography

    Authors: Minhaj Alam, David Le, Taeyoon Son, Jennifer I. Lim, Xincheng Yao

    Abstract: This study is to demonstrate deep learning for automated artery-vein (AV) classification in optical coherence tomography angiography (OCTA). The AV-Net, a fully convolutional network (FCN) based on modified U-shaped CNN architecture, incorporates enface OCT and OCTA to differentiate arteries and veins. For the multi-modal training process, the enface OCT works as a near infrared fundus image to pr…

    Submitted 5 June, 2020; originally announced June 2020.

  41. arXiv:2005.07036  [pdf, ps, other

    eess.AS cs.LG cs.SD stat.ML

    Infant Crying Detection in Real-World Environments

    Authors: Xuewen Yao, Megan Micheletti, Mckensey Johnson, Edison Thomaz, Kaya de Barbaro

    Abstract: Most existing cry detection models have been tested with data collected in controlled settings. Thus, the extent to which they generalize to noisy and lived environments is unclear. In this paper, we evaluate several established machine learning approaches including a model leveraging both deep spectrum and acoustic features. This model was able to recognize crying events with F1 score 0.613 (Prec…

    Submitted 16 February, 2022; v1 submitted 12 May, 2020; originally announced May 2020.

  42. arXiv:2002.07699  [pdf, other

    q-bio.QM cs.LG eess.IV q-bio.NC

    Cognitive Biomarker Prioritization in Alzheimer's Disease using Brain Morphometric Data

    Authors: Bo Peng, Xiaohui Yao, Shannon L. Risacher, Andrew J. Saykin, Li Shen, Xia Ning

    Abstract: Background: Cognitive assessments represent the most common clinical routine for the diagnosis of Alzheimer's Disease (AD). Given a large number of cognitive assessment tools and time-limited office visits, it is important to determine a proper set of cognitive tests for different subjects. Most current studies create guidelines of cognitive test selection for a targeted population, but they are no…

    Submitted 12 November, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: This paper has been accepted by BMC MIDM

  43. arXiv:2001.03129  [pdf, other

    eess.IV physics.optics

    Beyond Fourier transform: super-resolving optical coherence tomography

    Authors: Yuye Ling, Mengyuan Wang, Yu Gan, Xinwen Yao, Leopold Schmetterer, Chuanqing Zhou, Yikai Su

    Abstract: Optical coherence tomography (OCT) is a volumetric imaging modality that empowers clinicians and scientists to noninvasively visualize the cross-sections of biological samples. As the latest generation of its kind, Fourier-domain OCT (FD-OCT) offers a micrometer-scale axial resolution by taking advantage of coherence gating. Based on the current theory, it is believed the only way to obtain a high…

    Submitted 19 May, 2020; v1 submitted 9 January, 2020; originally announced January 2020.

    Comments: 30 pages, 9 figures

  44. arXiv:1912.05920  [pdf, other

    eess.AS cs.LG cs.SD stat.ML

    Measuring Mother-Infant Emotions By Audio Sensing

    Authors: Xuewen Yao, Dong He, Tiancheng Jing, Kaya de Barbaro

    Abstract: It has been suggested in developmental psychology literature that the communication of affect between mothers and their infants correlates with the socioemotional and cognitive development of infants. In this study, we obtained day-long audio recordings of 10 mother-infant pairs in order to study their affect communication in speech with a focus on mother's speech. In order to build a model for sp…

    Submitted 10 December, 2019; originally announced December 2019.

  45. arXiv:1910.07002  [pdf

    physics.med-ph eess.IV

    Trans-pars-planar illumination enables 200° ultra-wide field pediatric fundus photography for easy examination of the retina

    Authors: Devrim Toslak, Felix Chau, Muhammet Kazim Erol, Changgeng Liu, R. V. Paul Chan, Taeyoon Son, Xincheng Yao

    Abstract: This study is to test the feasibility of using trans-pars-planar illumination for ultrawide field pediatric fundus photography. Fundus examination of the peripheral retina is essential for clinical management of pediatric eye diseases. However, current pediatric fundus cameras with traditional trans-pupillary illumination provide a limited field of view (FOV), making it difficult to access the per…

    Submitted 15 October, 2019; originally announced October 2019.

    Comments: 9 pages, and 3 figures

  46. arXiv:1910.01796  [pdf

    q-bio.QM eess.IV

    Transfer Learning for Automated OCTA Detection of Diabetic Retinopathy

    Authors: David Le, Minhaj Alam, Cham Yao, Jennifer I. Lim, R. V. P. Chan, Devrim Toslak, Xincheng Yao

    Abstract: Purpose: To test the feasibility of using deep learning for optical coherence tomography angiography (OCTA) detection of diabetic retinopathy (DR). Methods: A deep learning convolutional neural network (CNN) architecture VGG16 was employed for this study. A transfer learning process was implemented to re-train the CNN for robust OCTA classification. In order to demonstrate the feasibility of using…

    Submitted 4 October, 2019; originally announced October 2019.

    Comments: 20 pages, 4 figures, 6 tables

  47. arXiv:1906.02745  [pdf, other

    eess.SP cs.LG stat.ML

    Automated Classification of Seizures against Nonseizures: A Deep Learning Approach

    Authors: Xinghua Yao, Qiang Cheng, Guo-Qiang Zhang

    Abstract: In current clinical practice, electroencephalograms (EEG) are reviewed and analyzed by well-trained neurologists to provide supports for therapeutic decisions. The way of manual reviewing is labor-intensive and error prone. Automatic and accurate seizure/nonseizure classification methods are needed. One major problem is that the EEG signals for seizure state and nonseizure state exhibit considerab…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: 12 pages, 8 figures. arXiv admin note: text overlap with arXiv:1903.09326

  48. arXiv:1905.04224  [pdf

    q-bio.QM eess.IV q-bio.TO

    Supervised machine learning based multi-task artificial intelligence classification of retinopathies

    Authors: Minhaj Alam, David Le, Jennifer I. Lim, R. V. P. Chan, Xincheng Yao

    Abstract: Artificial intelligence (AI) classification holds promise as a novel and affordable screening tool for clinical management of ocular diseases. Rural and underserved areas, which suffer from lack of access to experienced ophthalmologists may particularly benefit from this technology. Quantitative optical coherence tomography angiography (OCTA) imaging provides excellent capability to identify subtl…

    Submitted 10 May, 2019; originally announced May 2019.

    Comments: Supplemental material attached at the end

    Journal ref: https://www.mdpi.com/2077-0383/8/6/872