Showing 1–42 of 42 results for author: Qi, X

Searching in archive eess.
  1. arXiv:2406.17801  [pdf, other]

    cs.SD cs.CL eess.AS

    A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge

    Authors: Xiaopeng Wang, Yi Lu, Xin Qi, Zhiyong Wang, Yuankun Xie, Shuchen Shi, Ruibo Fu

    Abstract: This paper presents the development of a speech synthesis system for the LIMMITS'24 Challenge, focusing primarily on Track 2. The objective of the challenge is to establish a multi-speaker, multi-lingual Indic Text-to-Speech system with voice cloning capabilities, covering seven Indian languages with both male and female speakers. The system was trained using challenge data and fine-tuned for few-…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.10591  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    MINT: a Multi-modal Image and Narrative Text Dubbing Dataset for Foley Audio Content Planning and Generation

    Authors: Ruibo Fu, Shuchen Shi, Hongming Guo, Tao Wang, Chunyu Qiang, Zhengqi Wen, Jianhua Tao, Xin Qi, Yi Lu, Xiaopeng Wang, Zhiyong Wang, Yukun Liu, Xuefei Liu, Shuai Zhang, Guanjun Li

    Abstract: Foley audio, critical for enhancing the immersive experience in multimedia content, faces significant challenges in the AI-generated content (AIGC) landscape. Despite advancements in AIGC technologies for text and image generation, the foley audio dubbing remains rudimentary due to difficulties in cross-modal scene matching and content correlation. Current text-to-audio technology, which relies on…

    Submitted 15 June, 2024; originally announced June 2024.

  3. arXiv:2406.08112  [pdf, other]

    cs.SD cs.AI eess.AS

    Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio

    Authors: Yi Lu, Yuankun Xie, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Zhiyong Wang, Xin Qi, Xuefei Liu, Yongwei Li, Yukun Liu, Xiaopeng Wang, Shuchen Shi

    Abstract: With the proliferation of Large Language Model (LLM) based deepfake audio, there is an urgent need for effective detection methods. Previous deepfake audio generation methods typically involve a multi-step generation process, with the final step using a vocoder to predict the waveform from handcrafted features. However, LLM-based audio is directly generated from discrete neural codecs in an end-to…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024. arXiv admin note: substantial text overlap with arXiv:2405.04880

  4. arXiv:2406.04683  [pdf, other]

    cs.SD eess.AS

    PPPR: Portable Plug-in Prompt Refiner for Text to Audio Generation

    Authors: Shuchen Shi, Ruibo Fu, Zhengqi Wen, Jianhua Tao, Tao Wang, Chunyu Qiang, Yi Lu, Xin Qi, Xuefei Liu, Yukun Liu, Yongwei Li, Zhiyong Wang, Xiaopeng Wang

    Abstract: Text-to-Audio (TTA) aims to generate audio that corresponds to the given text description, playing a crucial role in media production. The text descriptions in TTA datasets lack rich variations and diversity, resulting in a drop in TTA model performance when faced with complex text. To address this issue, we propose a method called Portable Plug-in Prompt Refiner, which utilizes rich knowledge abo…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: accepted by INTERSPEECH2024

  5. arXiv:2406.03247  [pdf, other]

    cs.SD eess.AS

    Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection

    Authors: Xiaopeng Wang, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Yuankun Xie, Yukun Liu, Jianhua Tao, Xuefei Liu, Yongwei Li, Xin Qi, Yi Lu, Shuchen Shi

    Abstract: The generalization of Fake Audio Detection (FAD) is critical due to the emergence of new spoofing techniques. Traditional FAD methods often focus solely on distinguishing between genuine and known spoofed audio. We propose a Genuine-Focused Learning (GFL) framework, called GFL-FAD, aiming for highly generalized FAD. This method incorporates a Counterfactual Reasoning Enhanced Representation…

    Submitted 9 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted by INTERSPEECH 2024

  6. arXiv:2406.03237  [pdf, other]

    cs.SD eess.AS

    Generalized Fake Audio Detection via Deep Stable Learning

    Authors: Zhiyong Wang, Ruibo Fu, Zhengqi Wen, Yuankun Xie, Yukun Liu, Xiaopeng Wang, Xuefei Liu, Yongwei Li, Jianhua Tao, Yi Lu, Xin Qi, Shuchen Shi

    Abstract: Although current fake audio detection approaches have achieved remarkable success on specific datasets, they often fail when evaluated with datasets from different distributions. Previous studies typically address distribution shift by focusing on using extra data or applying extra loss restrictions during training. However, these methods either require a substantial amount of data or complicate t…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: accepted by INTERSPEECH2024

  7. arXiv:2405.04880  [pdf, other]

    cs.SD cs.AI eess.AS

    The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio

    Authors: Yuankun Xie, Yi Lu, Ruibo Fu, Zhengqi Wen, Zhiyong Wang, Jianhua Tao, Xin Qi, Xiaopeng Wang, Yukun Liu, Haonan Cheng, Long Ye, Yi Sun

    Abstract: With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for generalized detection methods. ALM-based deepfake audio currently exhibits widespread, high deception, and type versatility, posing a significant challenge to current audio deepfake detection (ADD) models trained solely on vocoded data. To effectively detect ALM-based deepfake audio, we focus on…

    Submitted 15 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  8. arXiv:2404.11525  [pdf, other]

    cs.CV eess.IV

    JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA

    Authors: Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, Jamie Craig, Minh-Son To

    Abstract: The oxygen saturation level in the blood (SaO2) is crucial for health, particularly in relation to sleep-related breathing disorders. However, continuous monitoring of SaO2 is time-consuming and highly variable depending on patients' conditions. Recently, optical coherence tomography angiography (OCTA) has shown promising development in rapidly and effectively screening eye-related lesions, offeri…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  9. arXiv:2312.12824  [pdf, other]

    eess.IV cs.CV

    FedSODA: Federated Cross-assessment and Dynamic Aggregation for Histopathology Segmentation

    Authors: Yuan Zhang, Yaolei Qi, Xiaoming Qi, Lotfi Senhadji, Yongyue Wei, Feng Chen, Guanyu Yang

    Abstract: Federated learning (FL) for histopathology image segmentation involving multiple medical sites plays a crucial role in advancing the field of accurate disease diagnosis and treatment. However, it is still a task of great challenges due to the sample imbalance across clients and large data heterogeneity from disparate organs, variable segmentation tasks, and diverse distribution. Thus, we propose a…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP2024

  10. arXiv:2309.00223  [pdf, other]

    eess.AS cs.CL cs.SD

    The FruitShell French synthesis system at the Blizzard 2023 Challenge

    Authors: Xin Qi, Xiaopeng Wang, Zhiyong Wang, Wang Liu, Mingming Ding, Shuchen Shi

    Abstract: This paper presents a French text-to-speech synthesis system for the Blizzard Challenge 2023. The challenge consists of two tasks: generating high-quality speech from female speakers and generating speech that closely resembles specific individuals. Regarding the competition data, we conducted a screening process to remove missing or erroneous text data. We organized all symbols except for phoneme…

    Submitted 31 August, 2023; originally announced September 2023.

  11. arXiv:2307.16620  [pdf, other]

    cs.SD cs.CV eess.AS

    Audio-Visual Segmentation by Exploring Cross-Modal Mutual Semantics

    Authors: Chen Liu, Peike Li, Xingqun Qi, Hu Zhang, Lincheng Li, Dadong Wang, Xin Yu

    Abstract: The audio-visual segmentation (AVS) task aims to segment sounding objects from a given video. Existing works mainly focus on fusing audio and visual features of a given video to achieve sounding object masks. However, we observed that prior arts are prone to segment a certain salient object in a video regardless of the audio information. This is because sounding objects are often the most salient…

    Submitted 31 July, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: This paper has been received by ACM MM 23

  12. arXiv:2307.08388  [pdf, other]

    cs.CV eess.IV

    Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation

    Authors: Yaolei Qi, Yuting He, Xiaoming Qi, Yuan Zhang, Guanyu Yang

    Abstract: Accurate segmentation of topological tubular structures, such as blood vessels and roads, is crucial in various fields, ensuring accuracy and efficiency in downstream tasks. However, many factors complicate the task, including thin local structures and variable global morphologies. In this work, we note the specificity of tubular structures and use this knowledge to guide our DSCNet to simultaneou…

    Submitted 18 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  13. arXiv:2306.07505  [pdf]

    q-bio.TO eess.IV

    Deep learning radiomics for assessment of gastroesophageal varices in people with compensated advanced chronic liver disease

    Authors: Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, et al. (22 additional authors not shown)

    Abstract: Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV). Design: A prospective multicenter study was conducted in patients with…

    Submitted 12 June, 2023; originally announced June 2023.

  14. arXiv:2305.13869  [pdf, other]

    physics.acc-ph cs.AI cs.LG eess.SY

    Trend-Based SAC Beam Control Method with Zero-Shot in Superconducting Linear Accelerator

    Authors: Xiaolong Chen, Xin Qi, Chunguang Su, Yuan He, Zhijun Wang, Kunxiang Sun, Chao Jin, Weilong Chen, Shuhui Liu, Xiaoying Zhao, Duanyang Jia, Man Yi

    Abstract: The superconducting linear accelerator is a highly flexible facility for modern scientific discoveries, necessitating weekly reconfiguration and tuning. Accordingly, minimizing setup time proves essential in affording users ample experimental time. We propose a trend-based soft actor-critic (TBSAC) beam control method with strong robustness, allowing the agents to be trained in a simulated en…

    Submitted 25 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  15. arXiv:2304.14503  [pdf]

    cs.CV eess.IV physics.ins-det physics.optics

    UHRNet: A Deep Learning-Based Method for Accurate 3D Reconstruction from a Single Fringe-Pattern

    Authors: Yixiao Wang, Canlin Zhou, Xingyang Qi, Hui Li

    Abstract: The quick and accurate retrieval of an object height from a single fringe pattern in Fringe Projection Profilometry has been a topic of ongoing research. While a single shot fringe to depth CNN based method can restore height map directly from a single pattern, its accuracy is currently inferior to the traditional phase shifting technique. To improve this method's accuracy, we propose using a U sh…

    Submitted 23 April, 2023; originally announced April 2023.

  16. arXiv:2304.12988  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-Scale Feature Fusion using Parallel-Attention Block for COVID-19 Chest X-ray Diagnosis

    Authors: Xiao Qi, David J. Foran, John L. Nosher, Ilker Hacihaliloglu

    Abstract: Under the global COVID-19 crisis, accurate diagnosis of COVID-19 from Chest X-ray (CXR) images is critical. To reduce intra- and inter-observer variability during the radiological assessment, computer-aided diagnostic tools have been utilized to supplement medical decision-making and subsequent disease management. Computational methods with high accuracy and robustness are required for rapid tria…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:008

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2023)

  17. arXiv:2303.00369  [pdf, other]

    cs.CV eess.IV

    Indescribable Multi-modal Spatial Evaluator

    Authors: Lingke Kong, X. Sharon Qi, Qijin Shen, Jiacheng Wang, Jingyi Zhang, Yanle Hu, Qichao Zhou

    Abstract: Multi-modal image registration spatially aligns two images with different distributions. One of its major challenges is that images acquired from different imaging machines have different imaging distributions, making it difficult to focus only on the spatial aspect of the images and ignore differences in distributions. In this study, we developed a self-supervised approach, Indescribable Multi-mo…

    Submitted 1 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR2023

  18. arXiv:2302.09621  [pdf]

    eess.IV

    Augmenting endometriosis analysis from ultrasound data with deep learning

    Authors: Adrian Balica, Jennifer Dai, Kayla Piiwaa, Xiao Qi, Ashlee N. Green, Nancy Phillips, Susan Egan, Ilker Hacihaliloglu

    Abstract: Endometriosis is a non-malignant disorder that affects 176 million women globally. Diagnostic delays result in severe dysmenorrhea, dyspareunia, chronic pelvic pain, and infertility. Therefore, there is a significant need to diagnose patients at an early stage. Our objective in this work is to investigate the potential of deep learning methods to classify endometriosis from ultrasound data. Retros…

    Submitted 19 February, 2023; originally announced February 2023.

    Comments: Accepted to 2023 SPIE Medical Imaging Conference

  19. arXiv:2211.00899  [pdf, other]

    eess.IV cs.CV

    LightVessel: Exploring Lightweight Coronary Artery Vessel Segmentation via Similarity Knowledge Distillation

    Authors: Hao Dang, Yuekai Zhang, Xingqun Qi, Wanting Zhou, Muyi Sun

    Abstract: In recent years, deep convolution neural networks (DCNNs) have achieved great prospects in coronary artery vessel segmentation. However, it is difficult to deploy complicated models in clinical scenarios since high-performance approaches have excessive parameters and high computation costs. To tackle this problem, we propose LightVessel, a Similarity Knowledge Distillation Framework, for…

    Submitted 25 February, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: 5 pages, 7 figures, conference

  20. arXiv:2208.01843  [pdf, other]

    eess.IV cs.CV cs.LG

    Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis

    Authors: Xiao Qi, David J. Foran, John L. Nosher, Ilker Hacihaliloglu

    Abstract: The role of chest X-ray (CXR) imaging, due to being more cost-effective, widely available, and having a faster acquisition time compared to CT, has evolved during the COVID-19 pandemic. To improve the diagnostic performance of CXR imaging, a growing number of studies have investigated whether supervised deep learning methods can provide additional support. However, supervised methods rely on a larg…

    Submitted 3 August, 2022; originally announced August 2022.

    Comments: Accepted to the 2022 MICCAI Workshop on Medical Image Learning with Limited and Noisy Data

  21. arXiv:2205.14411  [pdf, other]

    cs.SD cs.HC eess.AS

    Feature Pyramid Attention based Residual Neural Network for Environmental Sound Classification

    Authors: Liguang Zhou, Yuhongze Zhou, Xiaonan Qi, Junjie Hu, Tin Lun Lam, Yangsheng Xu

    Abstract: Environmental sound classification (ESC) is a challenging problem due to the unstructured spatial-temporal relations that exist in the sound signals. Recently, many studies have focused on abstracting features from convolutional neural networks while the learning of semantically relevant frames of sound signals has been overlooked. To this end, we present an end-to-end framework, namely feature py…

    Submitted 28 May, 2022; originally announced May 2022.

  22. arXiv:2205.04846  [pdf, other]

    eess.IV cs.CV

    MNet: Rethinking 2D/3D Networks for Anisotropic Medical Image Segmentation

    Authors: Zhangfu Dong, Yuting He, Xiaoming Qi, Yang Chen, Huazhong Shu, Jean-Louis Coatrieux, Guanyu Yang, Shuo Li

    Abstract: The nature of thick-slice scanning causes severe inter-slice discontinuities of 3D medical images, and the vanilla 2D/3D convolutional neural networks (CNNs) fail to represent sparse inter-slice information and dense intra-slice information in a balanced way, leading to severe underfitting to inter-slice features (for vanilla 2D CNNs) and overfitting to noise from long-range slices (for vanilla 3D…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Accepted by IJCAI 2022

  23. arXiv:2205.00698  [pdf]

    eess.IV cs.CV cs.LG

    Unsupervised Denoising of Optical Coherence Tomography Images with Dual_Merged CycleWGAN

    Authors: Jie Du, Xujian Yang, Kecheng Jin, Xuanzheng Qi, Hu Chen

    Abstract: Noise is an important cause of low-quality optical coherence tomography (OCT) images. Neural network models based on convolutional neural networks (CNNs) have demonstrated excellent performance in image denoising. However, OCT image denoising still faces great challenges because many previous neural network algorithms required a large number of labeled data, which might cost much time or is ex…

    Submitted 2 May, 2022; originally announced May 2022.

    Comments: Mr. Hu Chen is our corresponding author

  24. arXiv:2204.06260  [pdf, other]

    cs.CL cs.SD eess.AS

    Self-critical Sequence Training for Automatic Speech Recognition

    Authors: Chen Chen, Yuchen Hu, Nana Hou, Xiaofeng Qi, Heqing Zou, Eng Siong Chng

    Abstract: Although the automatic speech recognition (ASR) task has achieved remarkable success with sequence-to-sequence models, there are two main mismatches between its training and testing that might lead to performance degradation: 1) The typically used cross-entropy criterion aims to maximize log-likelihood of the training data, while the performance is evaluated by word error rate (WER), not log-likelihood; 2…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: Accepted by ICASSP 2022

  25. arXiv:2203.15526  [pdf, other]

    cs.SD cs.CL eess.AS

    Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning

    Authors: Chen Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi, Eng Siong Chng

    Abstract: Automated Audio captioning (AAC) is a cross-modal task that generates natural language to describe the content of input audio. Most prior works usually extract single-modality acoustic features and are therefore sub-optimal for the cross-modal decoding task. In this work, we propose a novel AAC system called CLIP-AAC to learn interactive cross-modality representation with both acoustic and textual…

    Submitted 12 April, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech 2022

  26. arXiv:2201.11871  [pdf, other]

    cs.CV eess.SP

    Infrastructure-Based Object Detection and Tracking for Cooperative Driving Automation: A Survey

    Authors: Zhengwei Bai, Guoyuan Wu, Xuewei Qi, Yongkang Liu, Kentaro Oguchi, Matthew J. Barth

    Abstract: Object detection plays a fundamental role in enabling Cooperative Driving Automation (CDA), which is regarded as the revolutionary solution to addressing safety, mobility, and sustainability issues of contemporary transportation systems. Although current computer vision technologies could provide satisfactory object detection results in occlusion-free scenarios, the perception performance of onboa…

    Submitted 19 March, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

  27. arXiv:2201.03313  [pdf, other]

    eess.AS cs.AI cs.SD

    Cross-Modal ASR Post-Processing System for Error Correction and Utterance Rejection

    Authors: Jing Du, Shiliang Pu, Qinbo Dong, Chao Jin, Xin Qi, Dian Gu, Ru Wu, Hongwei Zhou

    Abstract: Although modern automatic speech recognition (ASR) systems can achieve high performance, they may produce errors that weaken readers' experience and do harm to downstream tasks. To improve the accuracy and reliability of ASR hypotheses, we propose a cross-modal post-processing system for speech recognizers, which 1) fuses acoustic features and textual features from different modalities, 2) joints…

    Submitted 10 January, 2022; originally announced January 2022.

    Comments: submit to ICASSP2022, 5 pages, 3 figures

  28. arXiv:2106.04130  [pdf, other]

    eess.IV cs.CV

    EnMcGAN: Adversarial Ensemble Learning for 3D Complete Renal Structures Segmentation

    Authors: Yuting He, Rongjun Ge, Xiaoming Qi, Guanyu Yang, Yang Chen, Youyong Kong, Huazhong Shu, Jean-Louis Coatrieux, Shuo Li

    Abstract: 3D complete renal structures (CRS) segmentation aims to segment the kidneys, tumors, renal arteries and veins in one inference. Once successful, it will provide preoperative plans and intraoperative guidance for laparoscopic partial nephrectomy (LPN), playing a key role in the renal cancer treatment. However, no success has been reported in 3D CRS segmentation due to the complex shapes of rena…

    Submitted 8 June, 2021; originally announced June 2021.

    Journal ref: Information Processing in Medical Imaging (IPMI) 2021

  29. Multi-Feature Semi-Supervised Learning for COVID-19 Diagnosis from Chest X-ray Images

    Authors: Xiao Qi, John L. Nosher, David J. Foran, Ilker Hacihaliloglu

    Abstract: Computed tomography (CT) and chest X-ray (CXR) have been the two dominant imaging modalities deployed for improved management of Coronavirus disease 2019 (COVID-19). Due to faster imaging, less radiation exposure, and being cost-effective, CXR is preferred over CT. However, the interpretation of CXR images, compared to CT, is more challenging due to low image resolution and COVID-19 image features…

    Submitted 14 April, 2021; v1 submitted 4 April, 2021; originally announced April 2021.

  30. arXiv:2011.03585  [pdf, other]

    eess.IV cs.CV

    Chest X-ray Image Phase Features for Improved Diagnosis of COVID-19 Using Convolutional Neural Network

    Authors: Xiao Qi, Lloyd Brown, David J. Foran, Ilker Hacihaliloglu

    Abstract: Recently, the outbreak of the novel Coronavirus disease 2019 (COVID-19) pandemic has seriously endangered human health and life. Due to limited availability of test kits, the need for auxiliary diagnostic approaches has increased. Recent research has shown that radiography of COVID-19 patients, such as CT and X-ray, contains salient information about the COVID-19 virus and could be used as an alternative…

    Submitted 14 April, 2021; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: 16 pages, 9 figures

    Journal ref: International Journal of Computer Assisted Radiology and Surgery, 2021

  31. arXiv:2010.07408  [pdf, other]

    eess.SP cs.IT cs.NI

    Reconfigurable Intelligent Surface: Design the Channel -- a New Opportunity for Future Wireless Networks

    Authors: Miguel Dajer, Zhengxiang Ma, Leonard Piazzi, Narayan Prasad, Xiao-Feng Qi, Baoling Sheen, Jin Yang, Guosen Yue

    Abstract: In this paper, we survey state-of-the-art research outcomes in the burgeoning field of reconfigurable intelligent surface (RIS) in view of its potential for significant performance enhancement for next generation wireless communication networks by means of adapting the propagation environment. Emphasis has been placed on several aspects gating the commercial viability of a future network deploym…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 22 pages, 18 figures

  32. arXiv:2005.04901  [pdf]

    physics.med-ph eess.IV

    A novel 3D multi-path DenseNet for improving automatic segmentation of glioblastoma on pre-operative multi-modal MR images

    Authors: Jie Fu, Kamal Singhrao, X. Sharon Qi, Yingli Yang, Dan Ruan, John H. Lewis

    Abstract: Convolutional neural networks have achieved excellent results in automatic medical image segmentation. In this study, we proposed a novel 3D multi-path DenseNet for generating the accurate glioblastoma (GBM) tumor contour from four multi-modal pre-operative MR images. We hypothesized that the multi-path architecture could achieve more accurate segmentation than a single-path architecture. 258 GBM…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: 15 pages, 6 figures, review in progress

    Journal ref: 2021 Medical Physics

  33. arXiv:2003.13898  [pdf, other]

    cs.CV cs.LG eess.IV

    Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis

    Authors: Hao Tang, Xiaojuan Qi, Guolei Sun, Dan Xu, Nicu Sebe, Radu Timofte, Luc Van Gool

    Abstract: We propose a novel ECGAN for the challenging semantic image synthesis task. Although considerable improvement has been achieved, the quality of synthesized images is far from satisfactory due to three largely unresolved challenges. 1) The semantic labels do not provide detailed structural information, making it difficult to synthesize local details and structures. 2) The widely adopted CNN operati…

    Submitted 27 March, 2023; v1 submitted 30 March, 2020; originally announced March 2020.

  34. arXiv:2001.03698  [pdf, other]

    cs.CV eess.IV

    AE-OT-GAN: Training GANs from data specific latent distribution

    Authors: Dongsheng An, Yang Guo, Min Zhang, Xin Qi, Na Lei, Shing-Tung Yau, Xianfeng Gu

    Abstract: Though generative adversarial networks (GANs) are prominent models to generate realistic and crisp images, they often encounter the mode collapse problems and are hard to train, which comes from approximating the intrinsic discontinuous distribution transform map with continuous DNNs. The recently proposed AE-OT model addresses this problem by explicitly computing the discontinuous distribution transfo…

    Submitted 27 January, 2020; v1 submitted 10 January, 2020; originally announced January 2020.

  35. arXiv:1909.04012  [pdf]

    physics.med-ph cs.CV eess.IV

    Deep Learning-based Radiomic Features for Improving Neoadjuvant Chemoradiation Response Prediction in Locally Advanced Rectal Cancer

    Authors: Jie Fu, Xinran Zhong, Ning Li, Ritchell Van Dams, John Lewis, Kyunghyun Sung, Ann C. Raldow, Jing Jin, X. Sharon Qi

    Abstract: Radiomic features achieve promising results in cancer diagnosis, treatment response prediction, and survival prediction. Our goal is to compare the handcrafted (explicitly designed) and deep learning (DL)-based radiomic features extracted from pre-treatment diffusion-weighted magnetic resonance images (DWIs) for predicting neoadjuvant chemoradiation treatment (nCRT) response in patients with local…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: Review in progress

    Journal ref: 2020 Phys. Med. Biol

  36. arXiv:1907.00482  [pdf, other]

    eess.SP cs.IT

    Base Station Antenna Selection for Low-Resolution ADC Systems

    Authors: Jinseok Choi, Junmo Sung, Narayan Prasad, Xiao-Feng Qi, Brian L. Evans, Alan Gatherer

    Abstract: This paper investigates antenna selection at a base station with large antenna arrays and low-resolution analog-to-digital converters. For downlink transmit antenna selection for narrowband channels, we show (1) a selection criterion that maximizes sum rate with zero-forcing precoding equivalent to that of a perfect quantization system; (2) maximum sum rate increases with number of selected antenn…

    Submitted 30 June, 2019; originally announced July 2019.

    Comments: Submitted to IEEE Transactions on Communications

  37. arXiv:1904.09316  [pdf]

    eess.SP cs.NI

    A Low Complexity Near-Maximum Likelihood MIMO Receiver with Low Resolution Analog-to-Digital Converters

    Authors: Arkady Molev-Shteiman, Xiao-Feng Qi, Laurence Mailaender

    Abstract: Based on a new equivalent model of a quantizer with noisy input recently presented in [23], we propose a new low complexity receiver that takes into account the nonlinear distortion (NLD) generated by an Analog to Digital converter (ADC) with insufficient resolution. The strength of the new model is that it presents the NLD as a function of only the desired part of the input signal (without noise). Therefore i…

    Submitted 19 April, 2019; originally announced April 2019.

  38. arXiv:1904.09312  [pdf]

    eess.SP cs.NI

    Low Resolution Digital-to-Analog Converter with Digital Dithering for MIMO Transmitter

    Authors: Arkady Molev-Shteiman, Xiao-Feng Qi, Laurence Mailaender

    Abstract: Based on an equivalent model for quantizers with noisy inputs recently presented in [35], we propose a method of digital dithering at the transmitter that may significantly reduce the resolution requirements of MIMO downlink Digital to Analog Converters (DAC). We use this equivalent model to analyze the effect of the dither Probability Density Function (PDF), and show that the uniform PDF produces…

    Submitted 19 April, 2019; originally announced April 2019.

  39. arXiv:1904.08519  [pdf]

    cs.NI eess.SP

    New equivalent model of quantizer with noisy input and its application for ADC resolution determination in an uplink MIMO receiver

    Authors: Arkady Molev-Shteiman, Xiao-Feng Qi, Laurence Mailaender, Narayan Prasad, Bertrand Hochwald

    Abstract: When a quantizer input signal is the sum of the desired signal and input white noise, the quantization error is a function of the total input signal. Our new equivalent model splits the quantization error into two components: a non-linear distortion (NLD) that is a function of only the desired part of the input signal (without noise), and an equivalent output white noise. This separation is important bec…

    Submitted 17 April, 2019; originally announced April 2019.

  40. arXiv:1811.11102  [pdf]

    eess.SP cs.NI

    Maximal Entropy Reduction Algorithm for SAR ADC Clock Compression

    Authors: Arkady Molev-Shteiman, Xiao-Feng Qi

    Abstract: Reduction of comparison cycles leads to power savings in a successive-approximation-register (SAR) analog-to-digital converter (ADC). We establish that the lowest average number of comparison cycles of a SAR ADC approaches the entropy of the ADC output, and propose a simple adaptive algorithm that approaches this lower bound. Today's SAR ADC uses binary search, which consumes more power than nec…

    Submitted 7 November, 2018; originally announced November 2018.

  41. arXiv:1810.07522  [pdf, other]

    eess.SP cs.IT

    Optimizing Beams and Bits: A Novel Approach for Massive MIMO Base-Station Design

    Authors: Narayan Prasad, Xiao-Feng Qi, Alan Gatherer

    Abstract: We consider the problem of jointly optimizing ADC bit resolution and analog beamforming over a frequency-selective massive MIMO uplink. We build upon a popular model to incorporate the impact of low bit resolution ADCs, that hitherto has mostly been employed over flat-fading systems. We adopt weighted sum rate (WSR) as our objective and show that WSR maximization under finite buffer limits and imp…

    Submitted 26 February, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

    Comments: Tech. Report. Appeared in part in IEEE ICNC 2019. Added few more comments and corrected minor typos

  42. arXiv:1312.2632  [pdf, other]

    eess.SY

    SEED: Public Energy and Environment Dataset for Optimizing HVAC Operation in Subway Stations

    Authors: Yongcai Wang, Haoran Feng, Xiao Qi

    Abstract: For sustainability and energy saving, the problem of optimizing the control of heating, ventilating, and air-conditioning (HVAC) systems has attracted great attention, but analyzing the signatures of thermal environments and HVAC systems and evaluating optimization policies remain inefficient and inconvenient due to the lack of a public dataset. In this paper, we present…

    Submitted 9 December, 2013; originally announced December 2013.

    Comments: 5 pages, 14 figures