Emerging Advancements in 6G NTN Radio Access Technologies: An Overview

Authors: Husnain Shahid, Carla Amatetti, Riccardo Campana, Sorya Tong, Dorin Panaitopol, Alessandro Vanelli Coralli, Abdelhamed Mohamed, Chao Zhang, Ebraam Khalifa, Eduardo Medeiros, Estefania Recayte, Fatemeh Ghasemifard, Ji Lianghai, Juan Bucheli, Karthik Anantha Swamy, Marius Caus, Mehmet Gurelli, Miguel A. Vazquez, Musbah Shaat, Nathan Borios, Per-Erik Eriksson, Sebastian Euler, Zheng Li, Xiaotian Fu

Abstract: The efforts on the development, standardization and improvements to communication systems towards 5G Advanced and 6G are on track to provide benefits such as an unprecedented level of connectivity and performance, enabling a diverse range of vertical services. The full integration of non-terrestrial components into 6G plays a pivotal role in realizing this paradigm shift towards ubiquitous communication and global coverage. However, this integration into 6G brings forth a set of its own challenges, particularly in Radio Access Technologies (RATs). To this end, this paper comprehensively discusses those challenges at different levels of RATs and proposes the corresponding potential emerging advancements in the realm of 6G NTN. In particular, the focus is on advancing the prospective aspects of Radio Resource Management (RRM), spectral coexistence in terrestrial and non-terrestrial components and flexible waveform design solutions to combat the impediments. This discussion with a specific focus on emerging advancements in 6G NTN RATs is critical for shaping the next generation networks and potentially relevant in contributing to standardization in forthcoming releases.

Submitted 22 April, 2024; originally announced April 2024.

Comments: accepted in 2024 EuCNC and 6G Summit, Antwerp, Belgium, 3-6 June 2024

arXiv:2404.09567 [pdf, other]

A competitive game optimization algorithm for Unmanned Aerial Vehicle path planning

Authors: Tai-shan Lou, Guang-sheng Guan, Zhe-peng Yue, Yu Wang, Ren-long Qi, Shi-hao Tong

Abstract: To solve the Unmanned Aerial Vehicle (UAV) path planning problem, a meta-heuristic optimization algorithm called competitive game optimizer (CGO) is proposed. In the CGO model, three phases of exploration and exploitation, and candidate replacement, are established, corresponding to the player's search for supplies and combat, and the movement toward a safe zone. In the algorithm exploration phase, Levy flight is introduced to improve the global convergence of the algorithm. The encounter probability which adaptively changes with the number of iterations is also introduced in the CGO. The balance between exploration and exploitation of solution space of optimization problem is realized, and each step is described and modeled mathematically. The performance of the CGO was evaluated on a set of 41 test functions taken from CEC2017 and CEC2022. It was then compared with eight widely recognized meta-heuristic optimization algorithms. The simulation results demonstrate that the proposed algorithm successfully achieves a balanced trade-off between exploration and exploitation, showcasing remarkable advantages when compared to seven classical algorithms. In addition, in order to further verify the effectiveness of the CGO, the CGO is applied to 8 practical engineering design problems and UAV path planning, and the results show that the CGO has strong performance in dealing with these practical optimization problems, and has a good application prospect.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2309.02638 [pdf]

Review of photoacoustic imaging plus X

Authors: Daohuai Jiang, Luyao Zhu, Shangqing Tong, Yuting Shen, Feng Gao, Fei Gao

Abstract: Photoacoustic imaging (PAI) is a novel modality in biomedical imaging technology that combines the rich optical contrast with the deep penetration of ultrasound. To date, PAI technology has found applications in various biomedical fields. In this review, we present an overview of the emerging research frontiers on PAI plus other advanced technologies, named as PAI plus X, which includes but is not limited to PAI plus treatment, PAI plus new circuits design, PAI plus accurate positioning system, PAI plus fast scanning systems, PAI plus novel ultrasound sensors, PAI plus advanced laser sources, PAI plus deep learning, and PAI plus other imaging modalities. We will discuss each technology's current state, technical advantages, and prospects for application, reported mostly in the recent three years. Lastly, we discuss and summarize the challenges and potential future work in the PAI plus X area.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2308.02088 [pdf, other]

doi 10.1002/mrm.30123

Motion-robust free-running volumetric cardiovascular MRI

Authors: Syed M. Arshad, Lee C. Potter, Chong Chen, Yingmin Liu, Preethi Chandrasekaran, Christopher Crabtree, Matthew S. Tong, Orlando P. Simonetti, Yuchi Han, Rizwan Ahmad

Abstract: PURPOSE: To present and assess an outlier mitigation method that makes free-running volumetric cardiovascular MRI (CMR) more robust to motion. METHODS: The proposed method, called compressive recovery with outlier rejection (CORe), models outliers in the measured data as an additive auxiliary variable. We enforce MR physics-guided group sparsity on the auxiliary variable, and jointly estimate it along with the image using an iterative algorithm. For evaluation, CORe is first compared to traditional compressed sensing (CS), robust regression (RR), and an existing outlier rejection method using two simulation studies. Then, CORe is compared to CS using seven three-dimensional (3D) cine, 12 rest four-dimensional (4D) flow, and eight stress 4D flow imaging datasets. RESULTS: Our simulation studies show that CORe outperforms CS, RR, and the existing outlier rejection method in terms of normalized mean square error and structural similarity index across 55 different realizations. The expert reader evaluation of 3D cine images demonstrates that CORe is more effective in suppressing artifacts while maintaining or improving image sharpness. Finally, 4D flow images show that CORe yields more reliable and consistent flow measurements, especially in the presence of involuntary subject motion or exercise stress. CONCLUSION: An outlier rejection method is presented and tested using simulated and measured data. This method can help suppress motion artifacts in a wide range of free-running CMR applications.
CODE & DATA: Implementation code and datasets are available on GitHub at http://github.com/OSU-MR/motion-robust-CMR

Submitted 24 June, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

Journal ref: Magnetic Resonance in Medicine 92(3) (2024) 1248-1262

arXiv:2307.08556 [pdf, other]

Machine-Learning-based Colorectal Tissue Classification via Acoustic Resolution Photoacoustic Microscopy

Authors: Shangqing Tong, Peng Ge, Yanan Jiao, Zhaofu Ma, Ziye Li, Longhai Liu, Feng Gao, Xiaohui Du, Fei Gao

Abstract: Colorectal cancer is a deadly disease that has become increasingly prevalent in recent years. Early detection is crucial for saving lives, but traditional diagnostic methods such as colonoscopy and biopsy have limitations. Colonoscopy cannot provide detailed information within the tissues affected by cancer, while biopsy involves tissue removal, which can be painful and invasive. In order to improve diagnostic efficiency and reduce patient suffering, we studied a machine-learning-based approach for colorectal tissue classification that uses acoustic resolution photoacoustic microscopy (ARPAM). With this tool, we were able to classify benign and malignant tissue using multiple machine learning methods. Our results were analyzed both quantitatively and qualitatively to evaluate the effectiveness of our approach.

Submitted 17 July, 2023; originally announced July 2023.

arXiv:2306.13843 [pdf, other]

Score-based Generative Models for Photoacoustic Image Reconstruction with Rotation Consistency Constraints

Authors: Shangqing Tong, Hengrong Lan, Liming Nie, Jianwen Luo, Fei Gao

Abstract: Photoacoustic tomography (PAT) is a newly emerged imaging modality which enables both high optical contrast and acoustic depth of penetration. Reconstructing images of photoacoustic tomography from a limited amount of sensor data is one of the major challenges in photoacoustic imaging. Previous works based on deep learning were trained in supervised fashion, which directly map the input partially known sensor data to the ground truth reconstructed from full field of view. Recently, score-based generative models played an increasingly significant role in generative modeling. Leveraging this probabilistic model, we proposed Rotation Consistency Constrained Score-based Generative Model (RCC-SGM), which recovers the PAT images by iterative sampling between Langevin dynamics and a constraint term utilizing the rotation consistency between the images and the measurements. Our proposed method can generalize to different measurement processes (32.29 PSNR with 16 measurements under random sampling, whereas 28.50 for supervised counterpart), while supervised methods need to train on specific inverse mappings.

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2010.03466 [pdf, ps, other]

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models

Authors: Srikanth Madikeri, Sibo Tong, Juan Zuluaga-Gomez, Apoorv Vyas, Petr Motlicek, Hervé Bourlard

Abstract: We present a simple wrapper that is useful to train acoustic models in PyTorch using Kaldi's LF-MMI training framework. The wrapper, called pkwrap (short form of PyTorch kaldi wrapper), enables the user to utilize the flexibility provided by PyTorch in designing model architectures. It exposes the LF-MMI cost function as an autograd function. Other capabilities of Kaldi have also been ported to PyTorch. This includes the parallel training ability when multi-GPU environments are unavailable and decode with graphs created in Kaldi. The package is available on Github at https://github.com/idiap/pkwrap.

Submitted 7 October, 2020; originally announced October 2020.

arXiv:1902.09330 [pdf, other]

PReS: Power Peak Reduction by Real-time Scheduling for Urban Railway Transit

Authors: Zekun Yang, Yu Chen, Ning Zhou, Shiqiong Tong

Abstract: Railway transportation is one of the most popular options for Urban Massive Transportation Systems (UMTS) because of many attractive features. A robust electric power supply is essential to enable normal operation. However, the power peaks appearing at the start time of the vehicles put heavy pressure on the power grid. Reduction of the power peak is a key issue in improving urban railway transit's power efficiency. Researchers have tried to address this problem by making a delicate timetable, but this method often failed to serve the purpose because of the punctuality problem. In this work, taking advantage of real-time estimation of the single train's power consumption, an online Power peak Reduction by real-time Scheduling (PReS) solution for trains' departure is proposed. Particularly, a Binary Integer Programming (BIP) model is introduced that is able to avoid the power consumption peak caused by multiple trains departing simultaneously. The simulation result verified that the proposed real-time scheduling approach can effectively reduce the occurrences of power peak without bringing in additional train travel delay.

Submitted 21 February, 2019; originally announced February 2019.

Comments: Submitted to the 2019 IEEE Sustainability through ICT Summit (StICT)

arXiv:1811.01307 [pdf, ps, other]

Towards Unsupervised Speech-to-Text Translation

Authors: Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

Abstract: We present a framework for building speech-to-text translation (ST) systems using only monolingual speech and text corpora, in other words, speech utterances from a source language and independent text from a target language. As opposed to traditional cascaded systems and end-to-end architectures, our system does not require any labeled data (i.e., transcribed source audio or parallel source and target text corpora) during training, making it especially applicable to language pairs with very few or even zero bilingual resources. The framework initializes the ST system with a cross-modal bilingual dictionary inferred from the monolingual corpora, that maps every source speech segment corresponding to a spoken word to its target text translation. For unseen source speech utterances, the system first performs word-by-word translation on each speech segment in the utterance. The translation is improved by leveraging a language model and a sequence denoising autoencoder to provide prior knowledge about the target language. Experimental results show that our unsupervised system achieves comparable BLEU scores to supervised end-to-end models despite the lack of supervision. We also provide an ablation analysis to examine the utility of each component in our system.

Submitted 3 November, 2018; originally announced November 2018.

arXiv:1805.07467 [pdf, ps, other]

Unsupervised Cross-Modal Alignment of Speech and Text Embedding Spaces

Authors: Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

Abstract: Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper we target learning a cross-modal alignment between the embedding spaces of speech and text learned from corpora of their respective modalities in an unsupervised fashion. The proposed framework learns the individual speech and text embedding spaces, and attempts to align the two spaces via adversarial training, followed by a refinement procedure. We show how our framework could be used to perform spoken word classification and translation, and the results on these two tasks demonstrate that the performance of our unsupervised alignment approach is comparable to its supervised counterpart. Our framework is especially useful for developing automatic speech recognition (ASR) and speech-to-text translation systems for low- or zero-resource languages, which have little parallel audio-text data for training modern supervised ASR and speech-to-text translation models, but account for the majority of the languages spoken across the world.

Submitted 20 September, 2018; v1 submitted 18 May, 2018; originally announced May 2018.

Comments: Accepted to NIPS 2018. v2 added the majority word baseline results and other minor fixes. arXiv admin note: text overlap with arXiv:1710.04087 by other authors

arXiv:1711.10025 [pdf, other]

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model

Authors: Sibo Tong, Philip N. Garner, Hervé Bourlard

Abstract: Multilingual models for Automatic Speech Recognition (ASR) are attractive as they have been shown to benefit from more training data, and better lend themselves to adaptation to under-resourced languages. However, initialisation from monolingual context-dependent models leads to an explosion of context-dependent states. Connectionist Temporal Classification (CTC) is a potential solution to this as it performs well with monophone labels. We investigate multilingual CTC in the context of adaptation and regularisation techniques that have been shown to be beneficial in more conventional contexts. The multilingual model is trained to model a universal International Phonetic Alphabet (IPA)-based phone set using the CTC loss function. Learning Hidden Unit Contribution (LHUC) is investigated to perform language adaptive training. In addition, dropout during cross-lingual adaptation is also studied and tested in order to mitigate the overfitting problem. Experiments show that the performance of the universal phoneme-based CTC system can be improved by applying LHUC and it is extensible to new phonemes during cross-lingual adaptation. Updating all the parameters shows consistent improvement on limited data. Applying dropout during adaptation can further improve the system and achieve competitive performance with Deep Neural Network / Hidden Markov Model (DNN/HMM) systems on limited data.

Submitted 23 January, 2018; v1 submitted 27 November, 2017; originally announced November 2017.

Showing 1–11 of 11 results for author: Tong, S