doi 10.1109/ICRA48506.2021.9561575

Initialisation of Autonomous Aircraft Visual Inspection Systems via CNN-Based Camera Pose Estimation

Authors: Xueyan Oh, Leonard Loh, Shaohui Foong, Zhong Bao Andy Koh, Kow Leong Ng, Poh Kang Tan, Pei Lin Pearlin Toh, U-Xuan Tan

Abstract: General Visual Inspection is a manual inspection process regularly used to detect and localise obvious damage on the exterior of commercial aircraft. There has been increasing demand to perform this process at the boarding gate to minimize the downtime of the aircraft, and automating this process is desired to reduce the reliance on human labour. This automation typically requires the first step of estimating a camera's pose with respect to the aircraft for initialisation. However, localisation methods often require infrastructure, which can be very challenging when performed in uncontrolled outdoor environments and within the limited turnover time (approximately 2 hours) on an airport tarmac. In addition, access to commercial aircraft can be very restricted, causing development and testing of solutions to be a challenge. Hence, this paper proposes an on-site infrastructure-less initialisation method, by using the same pan-tilt-zoom camera used for the inspection task to estimate its own pose. This is achieved using a Deep Convolutional Neural Network trained with only synthetic images to regress the camera's pose. We apply domain randomisation when generating our dataset for training our network and improve prediction accuracy by introducing a new component to an existing loss function that leverages known aircraft geometry to relate position and orientation. Experiments are conducted and we have successfully regressed camera poses with a median error of 0.22 m and 0.73 degrees.

Submitted 6 November, 2023; originally announced November 2023.

Comments: This paper has been accepted by 2021 IEEE International Conference on Robotics and Automation (ICRA) with DOI: 10.1109/ICRA48506.2021.9561575

arXiv:2206.14659 [pdf, other]

Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss

Authors: Andrew Koh, Eng Siong Chng

Abstract: In this paper, we tackle the new Language-Based Audio Retrieval task proposed in DCASE 2022. Firstly, we introduce a simple, scalable architecture which ties the audio and text encoders together. Secondly, we show that using this architecture along with contrastive loss allows the model to significantly beat the performance of the baseline model. Finally, in addition to having an extremely low training memory requirement, we are able to use pretrained models as they are without needing to finetune them. We test our methods and show that using a combination of our methods beats the baseline scores significantly.

Submitted 29 June, 2022; originally announced June 2022.

arXiv:2206.01918 [pdf, other]

Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning

Authors: Andrew Koh, Soham Tiwari, Chng Eng Siong

Abstract: In this paper, we propose an algorithm, Epochal Difficult Captions, to supplement the training of any model for the Automated Audio Captioning task. Epochal Difficult Captions is an elegant evolution of the keyword estimation task that previous work has used to train the encoder of the AAC model. Epochal Difficult Captions modifies the target captions based on a curriculum and a difficulty level determined as a function of the current epoch. Epochal Difficult Captions can be used with any model architecture and is a lightweight function that does not increase training time. We test our results on three systems and show that using Epochal Difficult Captions consistently improves performance.

Submitted 4 June, 2022; originally announced June 2022.

Comments: 4 content pages, 1 reference page

arXiv:2108.04692 [pdf, other]

Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization

Authors: Andrew Koh, Fuzhao Xue, Eng Siong Chng

Abstract: In this paper, we examine the use of Transfer Learning using Pretrained Audio Neural Networks (PANNs), and propose an architecture that is able to better leverage the acoustic features provided by PANNs for the Automated Audio Captioning Task. We also introduce a novel self-supervised objective, Reconstruction Latent Space Similarity Regularization (RLSSR). The RLSSR module supplements the training of the model by minimizing the similarity between the encoder and decoder embedding. The combination of both methods allows us to surpass state of the art results by a significant margin on the Clotho dataset across several metrics and benchmarks.

Submitted 10 August, 2021; originally announced August 2021.

Comments: To be submitted to ICASSP 2022

MSC Class: 68T50 ACM Class: I.2.7

arXiv:2009.11795 [pdf, other]

Adapting BERT for Word Sense Disambiguation with Gloss Selection Objective and Example Sentences

Authors: Boon Peng Yap, Andrew Koh, Eng Siong Chng

Abstract: Domain adaptation or transfer learning using pre-trained language models such as BERT has proven to be an effective approach for many natural language processing tasks. In this work, we propose to formulate word sense disambiguation as a relevance ranking task, and fine-tune BERT on a sequence-pair ranking task to select the most probable sense definition given a context sentence and a list of candidate sense definitions. We also introduce a data augmentation technique for WSD using existing example sentences from WordNet. Using the proposed training objective and data augmentation technique, our models are able to achieve state-of-the-art results on the English all-words benchmark datasets.

Submitted 1 October, 2020; v1 submitted 24 September, 2020; originally announced September 2020.

Comments: Accepted to appear in Findings of EMNLP 2020

arXiv:1301.5379 [pdf]

Auditing scholarly journals published in Malaysia and assessing their visibility

Authors: A. N. Zainab, S. A. Sanni, N. N. Edzan, A. P. Koh

Abstract: The problem with the identification of Malaysian scholarly journals lies in the lack of a current and complete listing of journals published in Malaysia. As a result, librarians are deprived of a tool that can be used for journal selection and identification of gaps in their serials collection. This study describes the audit carried out on scholarly journals, with the objectives (a) to trace and characterize scholarly journal titles published in Malaysia, and (b) to determine their visibility in international and national indexing databases. A total of 464 titles were traced and their yearly trends, publisher and publishing characteristics, bibliometrics and indexation in national, international and subject-based indexes were described.

Submitted 22 January, 2013; originally announced January 2013.

Showing 1–6 of 6 results for author: Koh, A