-
ForzaETH Race Stack -- Scaled Autonomous Head-to-Head Racing on Fully Commercial off-the-Shelf Hardware
Authors:
Nicolas Baumann,
Edoardo Ghignone,
Jonas Kühne,
Niklas Bastuck,
Jonathan Becker,
Nadine Imholz,
Tobias Kränzlin,
Tian Yi Lim,
Michael Lötscher,
Luca Schwarzenbach,
Luca Tognoni,
Christian Vogt,
Andrea Carron,
Michele Magno
Abstract:
Autonomous racing in robotics combines high-speed dynamics with the necessity for reliability and real-time decision-making. While such racing pushes software and hardware to their limits, many existing full-system solutions necessitate complex, custom hardware and software, and usually focus on Time-Trials rather than full unrestricted Head-to-Head racing, due to financial and safety constraints. This limits their reproducibility, making advancements and replication feasible mostly for well-resourced laboratories with comprehensive expertise in mechanical, electrical, and robotics fields. Researchers interested in the autonomy domain but with only partial experience in one of these fields, need to spend significant time with familiarization and integration. The ForzaETH Race Stack addresses this gap by providing an autonomous racing software platform designed for F1TENTH, a 1:10 scaled Head-to-Head autonomous racing competition, which simplifies replication by using commercial off-the-shelf hardware. This approach enhances the competitive aspect of autonomous racing and provides an accessible platform for research and development in the field. The ForzaETH Race Stack is designed with modularity and operational ease of use in mind, allowing customization and adaptability to various environmental conditions, such as track friction and layout. Capable of handling both Time-Trials and Head-to-Head racing, the stack has demonstrated its effectiveness, robustness, and adaptability in the field by winning the official F1TENTH international competition multiple times.
Submitted 18 March, 2024;
originally announced March 2024.
-
Robustness Evaluation of Localization Techniques for Autonomous Racing
Authors:
Tian Yi Lim,
Edoardo Ghignone,
Nicolas Baumann,
Michele Magno
Abstract:
This work introduces SynPF, an MCL-based algorithm tailored for high-speed racing environments. Benchmarked against Cartographer, a state-of-the-art pose-graph SLAM algorithm, SynPF leverages synergies from previous particle-filtering methods and synthesizes them for the high-performance racing domain. Our extensive in-field evaluations reveal that while Cartographer excels under nominal conditions, it struggles when subjected to wheel-slip, a common phenomenon in a racing scenario due to varying grip levels and aggressive driving behaviour. Conversely, SynPF demonstrates robustness in these challenging conditions and a low-latency computation time of 1.25 ms on on-board computers without a GPU. Using the F1TENTH platform, a 1:10 scaled autonomous racing vehicle, this work not only highlights the vulnerabilities of existing algorithms in high-speed scenarios, tested up until 7.6 m/s, but also emphasizes the potential of SynPF as a viable alternative, especially in deteriorating odometry conditions.
Submitted 26 March, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
Integrated Path Tracking with DYC and MPC using LSTM Based Tire Force Estimator for Four-wheel Independent Steering and Driving Vehicle
Authors:
Sungjin Lim,
Bilal Sadiq,
Yongsik Jin,
Sangho Lee,
Gyeungho Choi,
Kanghyun Nam,
Yongseob Lim
Abstract:
An active collision avoidance system plays a crucial role in ensuring the lateral safety of autonomous vehicles, and it is primarily related to path planning and tracking control algorithms. In particular, the direct yaw-moment control (DYC) system can significantly improve the lateral stability of a vehicle in environments with sudden changes in road conditions. To apply the DYC algorithm, it is very important to accurately account for the complex nonlinear properties of tire forces so that the controller can ensure the lateral stability of the vehicle. In this study, longitudinal and lateral tire forces for safe path tracking were simultaneously estimated using a long short-term memory (LSTM) neural network based estimator. Furthermore, to improve path tracking performance in case of sudden changes in road conditions, a system has been developed by combining 4-wheel independent steering (4WIS) model predictive control (MPC) and 4-wheel independent drive (4WID) direct yaw-moment control (DYC). The estimation performance was compared with that of the extended Kalman filter (EKF), which is commonly used for tire force estimation. In addition, the estimated longitudinal and lateral tire forces of each wheel were applied to the proposed system, and system verification was performed through simulation using a vehicle dynamics simulator. Consequently, the proposed method, the integrated path tracking algorithm with DYC and MPC using the LSTM based estimator, was validated to significantly improve the vehicle stability in suddenly changing road conditions.
Submitted 12 December, 2023;
originally announced December 2023.
-
CARTOS: A Charging-Aware Real-Time Operating System for Intermittent Batteryless Devices
Authors:
Mohsen Karimi,
Yidi Wang,
Youngbin Kim,
Yoojin Lim,
Hyoseung Kim
Abstract:
This paper presents CARTOS, a charging-aware real-time operating system designed to enhance the functionality of intermittently-powered batteryless devices (IPDs) for various Internet of Things (IoT) applications. While IPDs offer significant advantages such as extended lifespan and operability in extreme environments, they pose unique challenges, including the need to ensure forward progress of program execution amidst variable energy availability and to maintain reliable real-time behavior during power disruptions. To address these challenges, CARTOS introduces a mixed-preemption scheduling model that classifies tasks into computational and peripheral tasks, and ensures their efficient and timely execution by adopting just-in-time checkpointing for divisible computation tasks and uninterrupted execution for indivisible peripheral tasks. CARTOS also supports processing chains of tasks with precedence constraints and adapts its scheduling in response to environmental changes to offer continuous execution under diverse conditions. CARTOS is implemented with new APIs and components added to FreeRTOS but is designed for portability to other embedded RTOSs. Through real hardware experiments and simulations, CARTOS exhibits superior performance over state-of-the-art methods, demonstrating that it can serve as a practical platform for developing resilient, real-time sensing applications on IPDs.
Submitted 13 November, 2023;
originally announced November 2023.
-
Joint unsupervised and supervised learning for context-aware language identification
Authors:
Jinseok Park,
Hyung Yong Kim,
Jihwan Park,
Byeong-Yeol Kim,
Shukjae Choi,
Yunkyu Lim
Abstract:
Language identification (LID) recognizes the language of a spoken utterance automatically. According to recent studies, LID models trained with an automatic speech recognition (ASR) task perform better than those trained with an LID task only. However, training the model to recognize speech requires additional text labels, and acquiring these labels is costly. To overcome this problem, we propose context-aware language identification using a combination of unsupervised and supervised learning without any text labels. The proposed method learns the context of speech through masked language modeling (MLM) loss and simultaneously trains to determine the language of the utterance with supervised learning loss. The proposed joint learning was found to reduce the error rate by 15.6% compared to the same structure model trained by supervised-only learning on a subset of the VoxLingua107 dataset consisting of sub-three-second utterances in 11 languages.
Submitted 14 April, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
Automatic Internal Stray Light Calibration of AMCW Coaxial Scanning LiDAR Using GMM and PSO
Authors:
Sung-Hyun Lee,
Wook-Hyeon Kwon,
Yoon-Seop Lim,
Yong-Hwa Park
Abstract:
In this paper, an automatic calibration algorithm is proposed to reduce the depth error caused by internal stray light in amplitude-modulated continuous wave (AMCW) coaxial scanning light detection and ranging (LiDAR). Assuming that the internal stray light generated in the process of emitting the laser is static, the amplitude and phase delay of the internal stray light are estimated using the Gaussian mixture model (GMM) and particle swarm optimization (PSO). Specifically, the pixel positions in a raw signal amplitude map of a calibration checkerboard are segmented by GMM into two clusters corresponding to the dark and bright image pattern. The loss function is then defined as the L1-norm of the difference between the mean depths of the two amplitude-segmented clusters. To avoid overfitting to a specific distance in the PSO process, the calibration checkerboard is measured at multiple distances and the average of the corresponding L1 loss functions is chosen as the actual loss. This loss is minimized by PSO to find the two optimal target parameters: the amplitude and phase delay of the internal stray light. According to the validation of the proposed algorithm, the original loss is reduced from tens of centimeters to 3.2 mm when the measured distances of the calibration checkerboard are between 1 m and 4 m. This accurate calibration performance is also maintained in geometrically complex measured scenes. The proposed internal stray light calibration algorithm can be used for any type of AMCW coaxial scanning LiDAR regardless of its optical characteristics.
Submitted 24 April, 2023; v1 submitted 28 February, 2023;
originally announced February 2023.
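The loss described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cluster labels are taken as given (the paper obtains them from a two-cluster GMM on the amplitude map), the stray-light compensation and the PSO search are omitted, and the names `cluster_depth_gap` and `calibration_loss` are hypothetical.

```python
def cluster_depth_gap(depths, labels):
    """L1 gap between the mean depths of the two clusters in one capture.

    depths: per-pixel depth values in metres (flat list).
    labels: per-pixel cluster index, 0 or 1 (assumed given here).
    """
    group0 = [d for d, lab in zip(depths, labels) if lab == 0]
    group1 = [d for d, lab in zip(depths, labels) if lab == 1]
    return abs(sum(group0) / len(group0) - sum(group1) / len(group1))


def calibration_loss(captures):
    """Average the per-capture gaps over captures taken at several distances."""
    gaps = [cluster_depth_gap(depths, labels) for depths, labels in captures]
    return sum(gaps) / len(gaps)


# Two toy captures of a flat target: bright and dark squares should read
# the same depth, so any gap is treated as calibration error.
captures = [
    ([1.00, 1.02, 1.00, 1.02], [0, 1, 0, 1]),  # target at about 1 m
    ([2.00, 2.04], [0, 1]),                    # target at about 2 m
]
loss = calibration_loss(captures)
print(f"average L1 loss: {loss:.3f} m")
```

An optimizer would then vary the two stray-light parameters, recompute the depths, and keep the parameter pair that gives the smallest value of this loss.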
-
Enhanced artificial intelligence-based diagnosis using CBCT with internal denoising: Clinical validation for discrimination of fungal ball, sinusitis, and normal cases in the maxillary sinus
Authors:
Kyungsu Kim,
Chae Yeon Lim,
Joong Bo Shin,
Myung Jin Chung,
Yong Gi Jung
Abstract:
The cone-beam computed tomography (CBCT) provides 3D volumetric imaging of a target with low radiation dose and cost compared with conventional computed tomography, and it is widely used in the detection of paranasal sinus disease. However, it lacks the sensitivity to detect soft tissue lesions owing to reconstruction constraints. Consequently, only physicians with expertise in CBCT reading can distinguish between inherent artifacts or noise and diseases, restricting the use of this imaging modality. The development of artificial intelligence (AI)-based computer-aided diagnosis methods for CBCT to overcome the shortage of experienced physicians has attracted substantial attention. However, advanced AI-based diagnosis addressing intrinsic noise in CBCT has not been devised, discouraging the practical use of AI solutions for CBCT. To address this issue, we propose an AI-based computer-aided diagnosis method using CBCT with a denoising module. This module is implemented before diagnosis to reconstruct the internal ground-truth full-dose scan corresponding to an input CBCT image and thereby improve the diagnostic performance. The external validation results for the unified diagnosis of sinus fungal ball, chronic rhinosinusitis, and normal cases show that the proposed method improves the micro-, macro-average AUC, and accuracy by 7.4, 5.6, and 9.6% (from 86.2, 87.0, and 73.4 to 93.6, 92.6, and 83.0%), respectively, compared with a baseline while improving human diagnosis accuracy by 11% (from 71.7 to 83.0%), demonstrating technical differentiation and clinical effectiveness. This pioneering study on AI-based diagnosis using CBCT indicates denoising can improve diagnostic performance and reader interpretability in images from the sinonasal area, thereby providing a new approach and direction to radiographic image reconstruction regarding the development of AI-based diagnostic solutions.
Submitted 29 November, 2022;
originally announced November 2022.
-
Multipath Interference Suppression of Amplitude-Modulated Continuous Wave Scanning LiDAR Based on Bayesian-Optimized XGBoost Ensemble
Authors:
Sunghyun Lee,
Yoonseop Lim,
Wookhyeon Kwon,
Yonghwa Park
Abstract:
This paper proposes a novel multipath interference (MPI) suppression algorithm based on a Bayesian-optimized extreme gradient boosting (XGBoost) ensemble to reduce MPI error in amplitude-modulated continuous wave (AMCW) scanning light detection and ranging (LiDAR). In contrast to this paper, many previous works have focused on MPI suppression in conventional AMCW time-of-flight (ToF) sensors with flash type illumination sources. However, the mitigated MPI error of these previous works remains at the cm scale due to the inherent limitation of the illumination source and the lack of MPI data. Meanwhile, since few previous works exist for coaxial type AMCW scanning LiDAR, the MPI in such LiDAR has not yet been validated. To achieve mm-scale MPI error mitigation with regard to the aforementioned issues, this paper proposes an MPI error correction algorithm based on a Bayesian-optimized XGBoost ensemble and its implementation in coaxial type AMCW scanning LiDAR. To train the XGBoost ensemble, a synthetic MPI dataset generated by a customized simulation is used. According to the validation results, the mean absolute error (MAE) of the MPI error, originally 9.8 mm, can be reduced to less than 2 mm by the Bayesian-optimized XGBoost on the simulation dataset. Such precise MPI mitigation results are also maintained in real object scenes. Specifically, the MAE of the MPI error in a measurement condition similar to that of a public dataset is reduced to 2.8 mm, which is extremely low compared to previous works.
Submitted 25 April, 2023; v1 submitted 7 November, 2022;
originally announced November 2022.
-
Metric Learning for User-defined Keyword Spotting
Authors:
Jaemin Jung,
Youkyum Kim,
Jihwan Park,
Youshin Lim,
Byeong-Yeol Kim,
Youngjoon Jang,
Joon Son Chung
Abstract:
The goal of this work is to detect new spoken terms defined by users. While most previous works address Keyword Spotting (KWS) as a closed-set classification problem, this limits their transferability to unseen terms. The ability to define custom keywords has advantages in terms of user experience.
In this paper, we propose a metric learning-based training strategy for user-defined keyword spotting. In particular, we make the following contributions: (1) we construct a large-scale keyword dataset with an existing speech corpus and propose a filtering method to remove data that degrade model training; (2) we propose a metric learning-based two-stage training strategy, and demonstrate that the proposed method improves the performance on the user-defined keyword spotting task by enriching their representations; (3) to facilitate the fair comparison in the user-defined KWS field, we propose unified evaluation protocol and metrics.
Our proposed system does not require an incremental training on the user-defined keywords, and outperforms previous works by a significant margin on the Google Speech Commands dataset using the proposed as well as the existing metrics.
Submitted 1 November, 2022;
originally announced November 2022.
-
Comparative Validation of AI and non-AI Methods in MRI Volumetry to Diagnose Parkinsonian Syndromes
Authors:
Joomee Song,
Juyoung Hahm,
Jisoo Lee,
Chae Yeon Lim,
Myung Jin Chung,
Jinyoung Youn,
Jin Whan Cho,
Jong Hyeon Ahn,
Kyung-Su Kim
Abstract:
Automated segmentation and volumetry of brain magnetic resonance imaging (MRI) scans are essential for the diagnosis of Parkinson's disease (PD) and Parkinson's plus syndromes (P-plus). To enhance the diagnostic performance, we adopted deep learning (DL) models in brain segmentation and compared their performance with the gold-standard non-DL method. We collected brain MRI scans of healthy controls (n=105) and patients with PD (n=105), multiple system atrophy (n=132), and progressive supranuclear palsy (n=69) at Samsung Medical Center from January 2017 to December 2020. Using the gold-standard non-DL model, FreeSurfer (FS), we segmented six brain structures: midbrain, pons, caudate, putamen, pallidum, and third ventricle, and used them as annotation data for the DL models, the representative V-Net and UNETR. The Dice scores and area under the curve (AUC) for differentiating normal, PD, and P-plus cases were calculated. The segmentation times of V-Net and UNETR for the six brain structures per patient were 3.48 ± 0.17 and 48.14 ± 0.97 s, respectively, being at least 300 times faster than FS (15,735 ± 1.07 s). Dice scores of both DL models were sufficiently high (>0.85), and their AUCs for disease classification were superior to that of FS. For classification of normal vs. P-plus and PD vs. multiple system atrophy (cerebellar type), the DL models and FS showed AUCs above 0.8. DL significantly reduces the analysis time without compromising the performance of brain segmentation and differential diagnosis. Our findings may contribute to the adoption of DL brain MRI segmentation in clinical settings and advance brain research.
Submitted 23 July, 2022;
originally announced July 2022.
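The "at least 300 times faster" figure in the abstract above can be checked directly from the quoted mean segmentation times. The short sketch below only redoes that arithmetic; the variable names are illustrative and the spread values quoted in the abstract are ignored.

```python
# Mean segmentation times per patient quoted in the abstract, in seconds.
fs_time = 15735.0    # FreeSurfer (non-DL baseline)
vnet_time = 3.48     # V-Net
unetr_time = 48.14   # UNETR

# Speed-up of each DL model relative to FreeSurfer.
vnet_speedup = fs_time / vnet_time
unetr_speedup = fs_time / unetr_time

print(f"V-Net speed-up: about {vnet_speedup:.0f}x")
print(f"UNETR speed-up: about {unetr_speedup:.0f}x")
```

The slower of the two models, UNETR, sets the lower bound, which is where the "at least 300 times" statement comes from.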
-
Enhancing Generative Networks for Chest Anomaly Localization through Automatic Registration-Based Unpaired-to-Pseudo-Paired Training Data Translation
Authors:
Kyungsu Kim,
Seong Je Oh,
Chae Yeon Lim,
Ju Hwan Lee,
Tae Uk Kim,
Myung Jin Chung
Abstract:
Image translation based on a generative adversarial network (GAN-IT) is a promising method for the precise localization of abnormal regions in chest X-ray images (AL-CXR) even without pixel-level annotation. However, heterogeneous unpaired datasets hinder existing methods from extracting key features and distinguishing normal from abnormal cases, resulting in inaccurate and unstable AL-CXR. To address this problem, we propose an improved two-stage GAN-IT involving registration and data augmentation. For the first stage, we introduce an advanced deep-learning-based registration technique that virtually and reasonably converts unpaired data into paired data for learning registration maps, by sequentially utilizing linear-based global and uniform coordinate transformation and AI-based non-linear coordinate fine-tuning. This approach enables independent and complex coordinate transformation of each detailed location of the lung while recognizing the entire lung structure, thereby achieving higher registration performance while resolving inherent artifacts caused by unpaired conditions. For the second stage, we apply data augmentation to diversify anomaly locations by swapping the left and right lung regions on the uniform registered frames, further improving the performance by alleviating the imbalance in the data distribution between left and right lung lesions. The proposed method is model agnostic and shows consistent AL-CXR performance improvement in representative AI models. Therefore, we believe GAN-IT for AL-CXR can be clinically implemented by using our basis framework, even if learning data are scarce or pixel-level disease annotation is difficult.
Submitted 15 June, 2024; v1 submitted 21 July, 2022;
originally announced July 2022.
-
Tackling Data Scarcity with Transfer Learning: A Case Study of Thickness Characterization from Optical Spectra of Perovskite Thin Films
Authors:
Siyu Isaac Parker Tian,
Zekun Ren,
Selvaraj Venkataraj,
Yuanhang Cheng,
Daniil Bash,
Felipe Oviedo,
J. Senthilnath,
Vijila Chellappan,
Yee-Fun Lim,
Armin G. Aberle,
Benjamin P MacLeod,
Fraser G. L. Parlane,
Curtis P. Berlinguette,
Qianxiao Li,
Tonio Buonassisi,
Zhe Liu
Abstract:
Transfer learning is increasingly becoming an important tool in handling the data scarcity often encountered in machine learning. In the application of high-throughput thickness as a downstream process of the high-throughput optimization of optoelectronic thin films with autonomous workflows, data scarcity occurs especially for new materials. To achieve high-throughput thickness characterization, we propose a machine learning model called thicknessML that predicts thickness from UV-Vis spectrophotometry input and an overarching transfer learning workflow. We demonstrate the transfer learning workflow from a generic source domain of generic band-gapped materials to a specific target domain of perovskite materials, where the target domain data only come from a limited number (18) of refractive indices from literature. The target domain can be easily extended to other material classes with a few literature data. Defining thickness prediction accuracy to be within-10% deviation, thicknessML achieves 92.2% (with a deviation of 3.6%) accuracy with transfer learning compared to 81.8% (with a deviation of 11.7%) without (lower mean and larger standard deviation). Experimental validation on six deposited perovskite films also corroborates the efficacy of the proposed workflow by yielding a 10.5% mean absolute percentage error (MAPE).
Submitted 20 December, 2022; v1 submitted 14 June, 2022;
originally announced July 2022.
-
Semantic Communications for Future Internet: Fundamentals, Applications, and Challenges
Authors:
Wanting Yang,
Hongyang Du,
Ziqin Liew,
Wei Yang Bryan Lim,
Zehui Xiong,
Dusit Niyato,
Xuefen Chi,
Xuemin Sherman Shen,
Chunyan Miao
Abstract:
With the increasing demand for intelligent services, the sixth-generation (6G) wireless networks will shift from a traditional architecture that focuses solely on high transmission rate to a new architecture that is based on the intelligent connection of everything. Semantic communication (SemCom), a revolutionary architecture that integrates user as well as application requirements and meaning of information into the data processing and transmission, is predicted to become a new core paradigm in 6G. While SemCom is expected to progress beyond the classical Shannon paradigm, several obstacles need to be overcome on the way to a SemCom-enabled smart wireless Internet. In this paper, we first highlight the motivations and compelling reasons of SemCom in 6G. Then, we outline the major 6G visions and key enabler techniques which lay the foundation of SemCom. Meanwhile, we highlight some benefits of SemCom-empowered 6G and present a SemCom-native 6G network architecture. Next, we show the evolution of SemCom from its introduction to classical SemCom related theory and modern AI-enabled SemCom. Following that, focusing on modern SemCom, we classify SemCom into three categories, i.e., semantic-oriented communication, goal-oriented communication, and semantic-aware communication, and introduce three types of semantic metrics. We then discuss the applications, the challenges and technologies related to semantics and communication. Finally, we introduce future research opportunities. In a nutshell, this paper investigates the fundamentals of SemCom, its applications in 6G networks, and the existing challenges and open issues for further direction.
Submitted 13 November, 2022; v1 submitted 10 June, 2022;
originally announced July 2022.
-
Fault Diagnosis of Inter-turn Short Circuit in Permanent Magnet Synchronous Motors with Current Signal Imaging and Unsupervised Learning
Authors:
W. Jung,
S. H. Yun,
Y. S. Lim,
S. Cheong,
J. Bae,
Y. H. Park
Abstract:
This paper proposes machine-independent feature engineering for winding inter-turn short circuit faults that uses electrical current signals. The electrical current signal collected from a permanent magnet synchronous motor (PMSM) is subject to different environmental and operational conditions. To solve these problems, a robust current signal imaging method and a deep learning-based feature extraction method are developed. The overall procedure includes the following three key steps: (1) transforming a time-series current signal into a two-dimensional image, (2) extracting features using convolutional neural networks, and (3) calculating a health indicator using the Mahalanobis distance. Transformation of the time-series signal is based on recurrence plots (RP). The proposed RP method builds on feature engineering that provides the dominant fault feature representations in a robust way. The proposed RP is designed to maximize the features of the inter-turn short fault and to minimize the effect of noise from systems with various capacities. To demonstrate the validity of the proposed method, two case studies are conducted using an artificial fault seeded testbed with two different capacities of motor. By calculating the feature using only the electrical current signal of the motor, without the parameters related to the capacity of the motor, the proposed feature can be applied to motors with different capacities while maintaining the same performance.
Submitted 9 June, 2022;
originally announced June 2022.
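Steps (1) and (3) of the procedure in the abstract above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' method: it uses the standard thresholded recurrence plot instead of the modified RP the paper proposes, it skips the CNN feature extractor of step (2) and substitutes placeholder 2-D feature vectors, and the names `recurrence_plot` and `mahalanobis_2d` are hypothetical.

```python
import math


def recurrence_plot(signal, eps):
    """Binary recurrence matrix: R[i][j] = 1 when |x_i - x_j| <= eps."""
    n = len(signal)
    return [[1 if abs(signal[i] - signal[j]) <= eps else 0 for j in range(n)]
            for i in range(n)]


def mahalanobis_2d(x, healthy):
    """Mahalanobis distance of a 2-D feature x from a set of healthy features."""
    n = len(healthy)
    m0 = sum(p[0] for p in healthy) / n
    m1 = sum(p[1] for p in healthy) / n
    # Unbiased sample covariance entries of the healthy features.
    s00 = sum((p[0] - m0) ** 2 for p in healthy) / (n - 1)
    s11 = sum((p[1] - m1) ** 2 for p in healthy) / (n - 1)
    s01 = sum((p[0] - m0) * (p[1] - m1) for p in healthy) / (n - 1)
    det = s00 * s11 - s01 * s01
    d0, d1 = x[0] - m0, x[1] - m1
    # d^T S^{-1} d, written out with the closed-form inverse of a 2x2 matrix.
    q = (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det
    return math.sqrt(q)


# Step (1): turn a short current trace into a 2-D image.
image = recurrence_plot([0.0, 1.0, 0.05], eps=0.1)

# Step (3): health indicator from placeholder features (stand-ins for CNN output).
healthy_features = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
indicator = mahalanobis_2d((3.0, 1.0), healthy_features)
print(image, indicator)
```

A larger indicator means the feature lies further from the healthy distribution, which is the sense in which the distance serves as a health indicator.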
-
Semantic Communication Meets Edge Intelligence
Authors:
Wanting Yang,
Zi Qin Liew,
Wei Yang Bryan Lim,
Zehui Xiong,
Dusit Niyato,
Xuefen Chi,
Xianbin Cao,
Khaled B. Letaief
Abstract:
The development of emerging applications, such as autonomous transportation systems, are expected to result in an explosive growth in mobile data traffic. As the available spectrum resource becomes more and more scarce, there is a growing need for a paradigm shift from Shannon's Classical Information Theory (CIT) to semantic communication (SemCom). Specifically, the former adopts a "transmit-befor…
▽ More
The development of emerging applications, such as autonomous transportation systems, are expected to result in an explosive growth in mobile data traffic. As the available spectrum resource becomes more and more scarce, there is a growing need for a paradigm shift from Shannon's Classical Information Theory (CIT) to semantic communication (SemCom). Specifically, the former adopts a "transmit-before-understanding" approach while the latter leverages artificial intelligence (AI) techniques to "understand-before-transmit", thereby alleviating bandwidth pressure by reducing the amount of data to be exchanged without negating the semantic effectiveness of the transmitted symbols. However, the semantic extraction (SE) procedure incurs costly computation and storage overheads. In this article, we introduce an edge-driven training, maintenance, and execution of SE. We further investigate how edge intelligence can be enhanced with SemCom through improving the generalization capabilities of intelligent agents at lower computation overheads and reducing the communication overhead of information exchange. Finally, we present a case study involving semantic-aware resource optimization for the wireless powered Internet of Things (IoT).
Submitted 13 February, 2022;
originally announced February 2022.
-
Highly precise AMCW time-of-flight scanning sensor based on digital-parallel demodulation
Authors:
Sung-Hyun Lee,
Wook-Hyeon Kwon,
Yoon-Seop Lim,
Yong-Hwa Park
Abstract:
In this paper, a novel amplitude-modulated continuous wave (AMCW) time-of-flight (ToF) scanning sensor based on digital-parallel demodulation is proposed and demonstrated in the aspect of distance measurement precision. Since digital-parallel demodulation utilizes a high-amplitude demodulation signal with zero-offset, the proposed sensor platform can maintain extremely high demodulation contrast. Meanwhile, as all cross correlated samples are calculated in parallel and in extremely short integration time, the proposed sensor platform can utilize a 2D laser scanning structure with a single photo detector, maintaining a moderate frame rate. This optical structure can increase the received optical SNR and remove the crosstalk of image pixel array. Based on these measurement properties, the proposed AMCW ToF scanning sensor shows highly precise 3D depth measurement performance. In this study, this precise measurement performance is explained in detail. Additionally, the actual measurement performance of the proposed sensor platform is experimentally validated under various conditions.
Submitted 15 December, 2021;
originally announced December 2021.
-
Economics of Semantic Communication System in Wireless Powered Internet of Things
Authors:
Zi Qin Liew,
Yanyu Cheng,
Wei Yang Bryan Lim,
Dusit Niyato,
Chunyan Miao,
Sumei Sun
Abstract:
The semantic communication system enables wireless devices to communicate effectively with the semantic meaning of the data. Wireless powered Internet of Things (IoT) that adopts the semantic communication system relies on harvested energy to transmit semantic information. However, the issue of energy constraint in the semantic communication system is not well studied. In this paper, we propose a semantic-based energy valuation and take an economic approach to solve the energy allocation problem as an incentive mechanism design. In our model, IoT devices (bidders) place their bids for the energy and power transmitter (auctioneer) decides the winner and payment by using deep learning based optimal auction. Results show that the revenue of wireless power transmitter is maximized while satisfying Individual Rationality (IR) and Incentive Compatibility (IC).
Submitted 4 October, 2021;
originally announced October 2021.
-
Estimation of Closest In-Path Vehicle (CIPV) by Low-Channel LiDAR and Camera Sensor Fusion for Autonomous Vehicle
Authors:
Hyun** Bae,
Gu Lee,
Jaeseung Yang,
Gwanjun Shin,
Yongseob Lim,
Gyeungho Choi
Abstract:
In autonomous driving, using a variety of sensors to recognize preceding vehicles in middle and long distance is helpful for improving driving performance and developing various functions. However, if only LiDAR or camera is used in the recognition stage, it is difficult to obtain necessary data due to the limitations of each sensor. In this paper, we proposed a method of converting the tracking data of vision into bird's eye view (BEV) coordinates using an equation that projects LiDAR points onto an image, and a method of fusion between LiDAR and vision tracked data. Thus, the newly proposed method was effective through the results of detecting closest in-path vehicle (CIPV) in various situations. In addition, even when experimenting with the EuroNCAP autonomous emergency braking (AEB) test protocol using the result of fusion, AEB performance is improved through improved cognitive performance than when using only LiDAR. In experimental results, the performance of the proposed method was proved through actual vehicle tests in various scenarios. Consequently, it is convincing that the newly proposed sensor fusion method significantly improves the ACC function in autonomous maneuvering. We expect that this improvement in perception performance will contribute to improving the overall stability of ACC.
Submitted 25 March, 2021;
originally announced March 2021.
-
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images
Authors:
Yongwan Lim,
Asterios Toutios,
Yannick Bliesener,
Ye Tian,
Sajan Goud Lingala,
Colin Vaz,
Tanner Sorensen,
Miran Oh,
Sarah Harper,
Weiyi Chen,
Yoonjeong Lee,
Johannes Töger,
Mairym Lloréns Montesserin,
Caitlin Smith,
Bianca Godinez,
Louis Goldstein,
Dani Byrd,
Krishna S. Nayak,
Shrikanth S. Narayanan
Abstract:
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is however limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to-date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically-relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 subjects performing linguistically motivated speech tasks, alongside the corresponding first-ever public domain raw RT-MRI data. The dataset also includes 3D volumetric vocal tract MRI during sustained speech sounds and high-resolution static anatomical T2-weighted upper airway MRI for each subject.
Submitted 15 February, 2021;
originally announced February 2021.
-
Attention-gated convolutional neural networks for off-resonance correction of spiral real-time MRI
Authors:
Yongwan Lim,
Shrikanth S. Narayanan,
Krishna S. Nayak
Abstract:
Spiral acquisitions are preferred in real-time MRI because of their efficiency, which has made it possible to capture vocal tract dynamics during natural speech. A fundamental limitation of spirals is blurring and signal loss due to off-resonance, which degrades image quality at air-tissue boundaries. Here, we present a new CNN-based off-resonance correction method that incorporates an attention-gate mechanism. This leverages spatial and channel relationships of filtered outputs and improves the expressiveness of the networks. We demonstrate improved performance with the attention-gate, on 1.5 Tesla spiral speech RT-MRI, compared to existing off-resonance correction methods.
Submitted 14 February, 2021;
originally announced February 2021.
-
Federated Learning in the Sky: Aerial-Ground Air Quality Sensing Framework with UAV Swarms
Authors:
Yi Liu,
Jiangtian Nie,
Xuandi Li,
Syed Hassan Ahmed,
Wei Yang Bryan Lim,
Chunyan Miao
Abstract:
Because air quality significantly affects human health, it is becoming increasingly important to predict the Air Quality Index (AQI) accurately and in a timely manner. To this end, this paper proposes a new federated learning-based aerial-ground air quality sensing framework for fine-grained 3D air quality monitoring and forecasting. Specifically, in the air, this framework leverages a light-weight Dense-MobileNet model to achieve energy-efficient end-to-end learning from haze features of haze images taken by Unmanned Aerial Vehicles (UAVs) for predicting AQI scale distribution. Furthermore, the Federated Learning Framework not only allows various organizations or institutions to collaboratively learn a well-trained global model to monitor AQI without compromising privacy, but also expands the scope of UAV swarm monitoring. For ground sensing systems, we propose a Graph Convolutional neural network-based Long Short-Term Memory (GC-LSTM) model to achieve accurate, real-time and future AQI inference. The GC-LSTM model utilizes the topological structure of the ground monitoring station to capture the spatio-temporal correlation of historical observation data, which helps the aerial-ground sensing system to achieve accurate AQI inference. Through extensive case studies on a real-world dataset, numerical results show that the proposed framework can achieve accurate and energy-efficient AQI sensing without compromising the privacy of raw data.
Submitted 23 July, 2020;
originally announced July 2020.
-
Distributed Bearing-based Formation Control and Network Localization with Exogenous Disturbances
Authors:
Yoo-Bin Bae,
Seong-Ho Kwon,
Young-Hun Lim,
Hyo-Sung Ahn
Abstract:
This paper presents a generalized robust stability analysis for bearing-based formation control and network localization systems. For an undirected network, we provide a robust stability analysis in the presence of time-varying exogenous disturbances in arbitrary dimensional space. In addition, we compute the explicit upper-bound set of the bearing formation and network localization errors, which provides valuable information for a system design.
Submitted 14 July, 2020;
originally announced July 2020.
-
Joint Auction-Coalition Formation Framework for Communication-Efficient Federated Learning in UAV-Enabled Internet of Vehicles
Authors:
Jer Shyuan Ng,
Wei Yang Bryan Lim,
Hong-Ning Dai,
Zehui Xiong,
Jianqiang Huang,
Dusit Niyato,
Xian-Sheng Hua,
Cyril Leung,
Chunyan Miao
Abstract:
Due to the advanced capabilities of the Internet of Vehicles (IoV) components such as vehicles, Roadside Units (RSUs) and smart devices as well as the increasing amount of data generated, Federated Learning (FL) becomes a promising tool given that it enables privacy-preserving machine learning that can be implemented in the IoV. However, the performance of the FL suffers from the failure of communication links and missing nodes, especially when continuous exchanges of model parameters are required. Therefore, we propose the use of Unmanned Aerial Vehicles (UAVs) as wireless relays to facilitate the communications between the IoV components and the FL server, thus improving the accuracy of the FL. However, a single UAV may not have sufficient resources to provide services for all iterations of the FL process. In this paper, we present a joint auction-coalition formation framework to solve the allocation of UAV coalitions to groups of IoV components. Specifically, the coalition formation game is formulated to maximize the sum of individual profits of the UAVs. The joint auction-coalition formation algorithm is proposed to achieve a stable partition of UAV coalitions in which an auction scheme is applied to solve the allocation of UAV coalitions. The auction scheme is designed to take into account the preferences of IoV components over heterogeneous UAVs. The simulation results show that the grand coalition, where all UAVs join a single coalition, is not always stable due to the profit-maximizing behavior of the UAVs. In addition, we show that as the cooperation cost of the UAVs increases, the UAVs prefer to support the IoV components independently and not to form any coalition.
Submitted 13 July, 2020;
originally announced July 2020.
-
Towards Federated Learning in UAV-Enabled Internet of Vehicles: A Multi-Dimensional Contract-Matching Approach
Authors:
Wei Yang Bryan Lim,
Jianqiang Huang,
Zehui Xiong,
Jiawen Kang,
Dusit Niyato,
Xian-Sheng Hua,
Cyril Leung,
Chunyan Miao
Abstract:
Coupled with the rise of Deep Learning, the wealth of data and enhanced computation capabilities of Internet of Vehicles (IoV) components enable effective Artificial Intelligence (AI) based models to be built. Beyond ground data sources, Unmanned Aerial Vehicle (UAV) based service providers for data collection and AI model training, i.e., Drones-as-a-Service (DaaS), have become increasingly popular in recent years. However, the stringent regulations governing data privacy potentially impede data sharing across independently owned UAVs. To this end, we propose the adoption of a Federated Learning (FL) based approach to enable privacy-preserving collaborative Machine Learning across a federation of independent DaaS providers for the development of IoV applications, e.g., for traffic prediction and car park occupancy management. Given the information asymmetry and incentive mismatches between the UAVs and model owners, we leverage the self-revealing properties of a multi-dimensional contract to ensure truthful reporting of the UAV types, while accounting for the multiple sources of heterogeneity, e.g., in sensing, computation, and transmission costs. Then, we adopt the Gale-Shapley algorithm to match the lowest cost UAV to each subregion. The simulation results validate the incentive compatibility of our contract design and show the efficiency of our matching, thus guaranteeing profit maximization for the model owner amid information asymmetry.
Submitted 8 April, 2020;
originally announced April 2020.
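The matching step mentioned in the abstract above relies on the Gale-Shapley algorithm. A minimal sketch of the classic one-to-one deferred-acceptance procedure is given below. The setup is an assumption for illustration only: subregions propose to UAVs in order of increasing cost, each UAV ranks subregions by some preference of its own, and the identifiers `S1`..`S3` and `U1`..`U3` are hypothetical. The paper's actual preference construction from contract items is not reproduced here.

```python
def gale_shapley(proposer_prefs, acceptor_prefs):
    """Classic deferred acceptance; returns a stable {proposer: acceptor} matching."""
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better)
    rank = {a: {p: i for i, p in enumerate(prefs)} for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next acceptor to try
    engaged = {}                                  # acceptor -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # preference list exhausted; p stays unmatched
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p                 # acceptor was free
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p                 # acceptor trades up
            free.append(current)
        else:
            free.append(p)                 # rejected; p will try its next choice
    return {p: a for a, p in engaged.items()}

# Hypothetical example: each subregion lists UAVs from lowest to highest cost.
subregion_prefs = {"S1": ["U1", "U2", "U3"], "S2": ["U1", "U3", "U2"], "S3": ["U2", "U1", "U3"]}
uav_prefs = {"U1": ["S2", "S1", "S3"], "U2": ["S1", "S3", "S2"], "U3": ["S3", "S2", "S1"]}
matching = gale_shapley(subregion_prefs, uav_prefs)
```

The result is stable in the usual sense: no subregion and UAV both prefer each other over their assigned partners.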
-
Deblurring for Spiral Real-Time MRI Using Convolutional Neural Networks
Authors:
Yongwan Lim,
Shrikanth S Narayanan,
Krishna S Nayak
Abstract:
Spiral acquisitions are preferred in real-time MRI because of their time efficiency. A fundamental limitation of spirals is image blurring due to off-resonance, which degrades image quality significantly at air-tissue boundaries. Here, we demonstrate a simple CNN-based deblurring method for spiral real-time MRI of human speech production. We show the CNN-based deblurring is capable of restoring blurred vocal tract tissue boundaries, without a need for exam-specific field maps. Deblurring performance is superior to a current auto-calibrated method, and slightly inferior to ideal reconstruction with perfect knowledge of the field maps.
Submitted 29 May, 2020; v1 submitted 26 January, 2020;
originally announced January 2020.
-
Laser scanning reflection-matrix microscopy for label-free in vivo imaging of a mouse brain through an intact skull
Authors:
Seokchan Yoon,
Hojun Lee,
** Hee Hong,
Yong-Sik Lim,
Wonshik Choi
Abstract:
We present a laser scanning reflection-matrix microscopy combining the scanning of laser focus and the wide-field mapping of the electric field of the backscattered waves for eliminating higher-order aberrations even in the presence of strong multiple light scattering noise. Unlike conventional confocal laser scanning microscopy, we record the amplitude and phase maps of reflected waves from the sample not only at the confocal pinhole, but also at other non-confocal points. These additional measurements lead us to constructing a time-resolved reflection matrix, with which the sample-induced aberrations for the illumination and detection pathways are separately identified and corrected. We realized in vivo reflectance imaging of myelinated axons through an intact skull of a living mouse with the spatial resolution close to the ideal diffraction limit. Furthermore, we demonstrated near-diffraction-limited multiphoton imaging through an intact skull by physically correcting the aberrations identified from the reflection matrix. The proposed method is expected to extend the range of applications, where the knowledge of the detailed microscopic information deep within biological tissues is critical.
Submitted 8 October, 2019;
originally announced October 2019.
-
Federated Learning in Mobile Edge Networks: A Comprehensive Survey
Authors:
Wei Yang Bryan Lim,
Nguyen Cong Luong,
Dinh Thai Hoang,
Yutao Jiao,
Ying-Chang Liang,
Qiang Yang,
Dusit Niyato,
Chunyan Miao
Abstract:
In recent years, mobile devices are equipped with increasingly advanced sensing and computing capabilities. Coupled with advancements in Deep Learning (DL), this opens up countless possibilities for meaningful applications. Traditional cloud-based Machine Learning (ML) approaches require the data to be centralized in a cloud server or data center. However, this results in critical issues related to unacceptable latency and communication inefficiency. To this end, Mobile Edge Computing (MEC) has been proposed to bring intelligence closer to the edge, where data is produced. However, conventional enabling technologies for ML at mobile edge networks still require personal data to be shared with external parties, e.g., edge servers. Recently, in light of increasingly stringent data privacy legislations and growing privacy concerns, the concept of Federated Learning (FL) has been introduced. In FL, end devices use their local data to train an ML model required by the server. The end devices then send the model updates rather than raw data to the server for aggregation. FL can serve as an enabling technology in mobile edge networks since it enables the collaborative training of an ML model and also enables DL for mobile edge network optimization. However, in a large-scale and complex mobile edge network, heterogeneous devices with varying constraints are involved. This raises challenges of communication costs, resource allocation, and privacy and security in the implementation of FL at scale. In this survey, we begin with an introduction to the background and fundamentals of FL. Then, we highlight the aforementioned challenges of FL implementation and review existing solutions. Furthermore, we present the applications of FL for mobile edge network optimization. Finally, we discuss the important challenges and future research directions in FL.
Submitted 28 February, 2020; v1 submitted 26 September, 2019;
originally announced September 2019.
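The FL loop described in the abstract above (devices train on local data, send model updates, the server aggregates) can be sketched with a sample-weighted average of client models, in the style commonly called federated averaging. This is a toy illustration under stated assumptions: the model is a single scalar weight for y ≈ w·x, the client datasets are made up, and the function names `local_update` and `federated_averaging` are hypothetical rather than taken from the survey.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Gradient descent on one client's own data for y ~ w * x.

    Only the updated weight is returned; the raw (x, y) pairs stay on the client.
    """
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_averaging(clients, rounds=50, w0=0.0):
    """Server loop: broadcast w, collect local models, average weighted by sample count."""
    w = w0
    total = sum(len(data) for data in clients)
    for _ in range(rounds):
        local_models = [(local_update(w, data), len(data)) for data in clients]
        w = sum(w_i * n for w_i, n in local_models) / total
    return w

# Hypothetical clients whose data all follow y = 3 * x, with different dataset sizes.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (3.0, 9.0), (2.0, 6.0)],
]
global_w = federated_averaging(clients)
```

Weighting by sample count is one common aggregation choice; the challenges listed in the abstract (communication cost, heterogeneous devices, privacy) are exactly the parts this toy loop ignores.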
-
Ultrareliable and Low-Latency Communication Techniques for Tactile Internet Services
Authors:
Kwang Soon Kim,
Dong Ku Kim,
Chan-Byoung Chae,
Sunghyun Choi,
Young-Chai Ko,
Jonghyun Kim,
Yeon-Geun Lim,
Minho Yang,
Sundo Kim,
Byungju Lim,
Kwanghoon Lee,
Kyung Lin Ryu
Abstract:
This paper presents novel ultrareliable and low-latency communication (URLLC) techniques for URLLC services, such as Tactile Internet services. Among typical use-cases of URLLC services are tele-operation, immersive virtual reality, cooperative automated driving, and so on. In such URLLC services, new kinds of traffic such as haptic information including kinesthetic information and tactile information need to be delivered in addition to high-quality video and audio traffic in traditional multimedia services. Further, such a variety of traffic has various characteristics in terms of packet sizes and data rates with a variety of requirements of latency and reliability. Furthermore, some traffic may occur in a sporadic manner but require reliable delivery of packets of medium to large sizes within a low latency, which is not supported by current state-of-the-art wireless communication systems and is very challenging for future wireless communication systems. Thus, to meet such a variety of tight traffic requirements in a wireless communication system, novel technologies from the physical layer to the network layer need to be devised. In this paper, some novel physical layer technologies such as waveform multiplexing, multiple access scheme, channel code design, synchronization, and full-duplex transmission for spectrally-efficient URLLC are introduced. In addition, a novel performance evaluation approach, which combines a ray-tracing tool and system-level simulation, is suggested for evaluating the performance of the proposed schemes. Simulation results show the feasibility of the proposed schemes providing realistic URLLC services in realistic geographical environments, which encourages further efforts to substantiate the proposed work.
Submitted 9 July, 2019;
originally announced July 2019.
-
Strong and Simple Baselines for Multimodal Utterance Embeddings
Authors:
Paul Pu Liang,
Yao Chong Lim,
Yao-Hung Hubert Tsai,
Ruslan Salakhutdinov,
Louis-Philippe Morency
Abstract:
Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations. Learning representations for these spoken utterances is a complex research problem due to the presence of multiple heterogeneous sources of information. Recent advances in multimodal learning have followed the general trend of building more complex models that utilize various attention, memory and recurrent components. In this paper, we propose two simple but strong baselines to learn embeddings of multimodal utterances. The first baseline assumes a conditional factorization of the utterance into unimodal factors. Each unimodal factor is modeled using the simple form of a likelihood function obtained via a linear transformation of the embedding. We show that the optimal embedding can be derived in closed form by taking a weighted average of the unimodal features. In order to capture richer representations, our second baseline extends the first by factorizing into unimodal, bimodal, and trimodal factors, while retaining simplicity and efficiency during learning and inference. From a set of experiments across two tasks, we show strong performance on both supervised and semi-supervised multimodal prediction, as well as significant (10 times) speedups over neural models during inference. Overall, we believe that our strong baseline models offer new benchmarking options for future research in multimodal learning.
Submitted 28 February, 2020; v1 submitted 14 May, 2019;
originally announced June 2019.
-
Continuous-time Opinion Dynamics on Multiple Interdependent Topics
Authors:
Mengbin Ye,
Minh Hoang Trinh,
Young-Hun Lim,
Brian D. O. Anderson,
Hyo-Sung Ahn
Abstract:
In this paper, and inspired by the recent discrete-time model in [1,2], we study two continuous-time opinion dynamics models (Model 1 and Model 2) where the individuals discuss opinions on multiple logically interdependent topics. The logical interdependence between the different topics is captured by a `logic' matrix, which is distinct from the Laplacian matrix capturing interactions between individuals. For each of Model 1 and Model 2, we obtain a necessary and sufficient condition for the network to reach a consensus on each separate topic. The condition on Model 1 involves a combination of the eigenvalues of the logic matrix and Laplacian matrix, whereas the condition on Model 2 requires only separate conditions on the logic matrix and Laplacian matrix. Further investigations of Model 1 yield two sufficient conditions for consensus, and allow us to conclude that one way to guarantee a consensus is to reduce the rate of interaction between individuals exchanging opinions. By placing further restrictions on the logic matrix, we also establish a set of Laplacian matrices which guarantee consensus for Model 1. The two models are also expanded to include stubborn individuals, who remain attached to their initial opinions. Sufficient conditions are obtained for guaranteeing convergence of the opinion dynamics system, with the final opinions generally being at a persistent disagreement. Simulations are provided to illustrate the results.
Submitted 11 January, 2020; v1 submitted 8 May, 2018;
originally announced May 2018.
-
Map-based Millimeter-Wave Channel Models: An Overview, Hybrid Modeling, Data, and Learning
Authors:
Yeon-Geun Lim,
Yae Jee Cho,
MinSoo Sim,
Younsun Kim,
Chan-Byoung Chae,
Reinaldo A. Valenzuela
Abstract:
Compared to the current wireless communication systems, millimeter wave (mm-Wave) promises a wide range of spectrum. As viable alternatives to existing mm-Wave channel models, various map-based channel models with different modeling methods have been widely discussed. Map-based channel models are based on a ray-tracing algorithm and include realistic channel parameters in a given map. Such parameters enable researchers to accurately evaluate novel technologies in the mm-Wave range. Diverse map-based modeling methods result in different modeling objectives, including the characteristics of channel parameters and different complexities of the modeling procedure. This article outlines an overview of map-based mm-Wave channel models and proposes a concept of how they can be utilized to integrate a hardware testbed/sounder with a software testbed/sounder. In addition, we categorize map-based channel parameters and provide guidelines for hybrid modeling. Next, we share the measurement data and the map-based channel parameters with the public. Lastly, we evaluate a machine learning-based beam selection algorithm through the shared database. We expect that the offered guidelines and the shared database will enable researchers to readily design a map-based channel model.
Submitted 10 July, 2019; v1 submitted 24 November, 2017;
originally announced November 2017.
-
Consensus with Output Saturations
Authors:
Young-Hun Lim,
Hyo-Sung Ahn
Abstract:
This paper considers a standard consensus algorithm under output saturations. In the presence of output saturations, global consensus cannot be realized due to the existence of stable, unachievable equilibrium points for the consensus. Therefore, this paper investigates necessary and sufficient initial conditions for the achievement of consensus, that is, an exact domain of attraction. Specifically, this paper considers single-integrator agents with both fixed and time-varying undirected graphs, as well as double-integrator agents with a fixed undirected graph. Then, we derive that the consensus will be achieved if and only if the average of the initial states (only velocities for double-integrator agents with homogeneous saturation levels for the outputs) is within the minimum saturation level. An extension to the case of a fixed directed graph is also provided in which a weighted average is required to be within the minimum saturation limit.
Submitted 20 June, 2016;
originally announced June 2016.
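The initial-condition result stated in the abstract above can be explored numerically. The sketch below assumes single-integrator agents with dynamics x_i' = -Σ_j a_ij (sat_i(x_i) - sat_j(x_j)) on a fixed undirected graph, integrated with forward Euler. The abstract does not spell out the model, so these dynamics, the step size, and the names `sat` and `simulate` are assumptions made for illustration.

```python
def sat(value, limit):
    """Symmetric saturation of an agent's output at +/- limit."""
    return max(-limit, min(limit, value))

def simulate(x0, limits, adjacency, dt=0.01, steps=2000):
    """Forward-Euler simulation of the assumed saturated-output consensus dynamics."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        y = [sat(x[i], limits[i]) for i in range(n)]  # saturated outputs
        u = [-sum(adjacency[i][j] * (y[i] - y[j]) for j in range(n)) for i in range(n)]
        x = [x[i] + dt * u[i] for i in range(n)]
    return x

complete_graph = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
limits = [1.0, 2.0, 1.5]  # minimum saturation level is 1.0

# Initial average 0.5 lies within the minimum saturation level.
inside = simulate([3.0, -1.0, -0.5], limits, complete_graph)

# Initial average 9.5 / 3 exceeds the minimum saturation level.
outside = simulate([5.0, 3.0, 1.5], limits, complete_graph)

spread_inside = max(inside) - min(inside)
spread_outside = max(outside) - min(outside)
```

In this toy run the first case settles at a common value equal to the initial average, while in the second the first agent's output stays pinned at its saturation limit and the states never agree, which is consistent with the condition quoted in the abstract.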