-
Initialisation of Autonomous Aircraft Visual Inspection Systems via CNN-Based Camera Pose Estimation
Authors:
Xueyan Oh,
Leonard Loh,
Shaohui Foong,
Zhong Bao Andy Koh,
Kow Leong Ng,
Poh Kang Tan,
Pei Lin Pearlin Toh,
U-Xuan Tan
Abstract:
General Visual Inspection is a manual inspection process regularly used to detect and localise obvious damage on the exterior of commercial aircraft. There has been increasing demand to perform this process at the boarding gate to minimize the downtime of the aircraft, and automating this process is desired to reduce the reliance on human labour. This automation typically requires the first step of estimating a camera's pose with respect to the aircraft for initialisation. However, localisation methods often require infrastructure, which can be very challenging to set up in uncontrolled outdoor environments and within the limited turnover time (approximately 2 hours) on an airport tarmac. In addition, access to commercial aircraft can be very restricted, making development and testing of solutions a challenge. Hence, this paper proposes an on-site infrastructure-less initialisation method that uses the same pan-tilt-zoom camera used for the inspection task to estimate its own pose. This is achieved using a Deep Convolutional Neural Network trained with only synthetic images to regress the camera's pose. We apply domain randomisation when generating our training dataset and improve prediction accuracy by introducing a new component to an existing loss function that leverages known aircraft geometry to relate position and orientation. Experiments are conducted and we have successfully regressed camera poses with a median error of 0.22 m and 0.73 degrees.
Submitted 6 November, 2023;
originally announced November 2023.
-
Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss
Authors:
Andrew Koh,
Eng Siong Chng
Abstract:
In this paper, we tackle the new Language-Based Audio Retrieval task proposed in DCASE 2022. Firstly, we introduce a simple, scalable architecture which ties the audio and text encoders together. Secondly, we show that using this architecture along with contrastive loss allows the model to significantly beat the performance of the baseline model. Finally, in addition to having an extremely low training memory requirement, we are able to use pretrained models as is, without needing to finetune them. We test our methods and show that using a combination of our methods beats the baseline scores significantly.
Submitted 29 June, 2022;
originally announced June 2022.
-
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning
Authors:
Andrew Koh,
Soham Tiwari,
Chng Eng Siong
Abstract:
In this paper, we propose an algorithm, Epochal Difficult Captions, to supplement the training of any model for the Automated Audio Captioning task. Epochal Difficult Captions is an elegant evolution of the keyword estimation task that previous work has used to train the encoder of the AAC model. Epochal Difficult Captions modifies the target captions based on a curriculum and a difficulty level determined as a function of the current epoch. Epochal Difficult Captions can be used with any model architecture and is a lightweight function that does not increase training time. We test our results on three systems and show that using Epochal Difficult Captions consistently improves performance.
Submitted 4 June, 2022;
originally announced June 2022.
-
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Authors:
Andrew Koh,
Fuzhao Xue,
Eng Siong Chng
Abstract:
In this paper, we examine the use of Transfer Learning using Pretrained Audio Neural Networks (PANNs), and propose an architecture that is able to better leverage the acoustic features provided by PANNs for the Automated Audio Captioning task. We also introduce a novel self-supervised objective, Reconstruction Latent Space Similarity Regularization (RLSSR). The RLSSR module supplements the training of the model by minimizing the similarity between the encoder and decoder embeddings. The combination of both methods allows us to surpass state-of-the-art results by a significant margin on the Clotho dataset across several metrics and benchmarks.
Submitted 10 August, 2021;
originally announced August 2021.
-
Adapting BERT for Word Sense Disambiguation with Gloss Selection Objective and Example Sentences
Authors:
Boon Peng Yap,
Andrew Koh,
Eng Siong Chng
Abstract:
Domain adaptation or transfer learning using pre-trained language models such as BERT has proven to be an effective approach for many natural language processing tasks. In this work, we propose to formulate word sense disambiguation (WSD) as a relevance ranking task, and fine-tune BERT on a sequence-pair ranking task to select the most probable sense definition given a context sentence and a list of candidate sense definitions. We also introduce a data augmentation technique for WSD using existing example sentences from WordNet. Using the proposed training objective and data augmentation technique, our models are able to achieve state-of-the-art results on the English all-words benchmark datasets.
Submitted 1 October, 2020; v1 submitted 24 September, 2020;
originally announced September 2020.
-
Auditing scholarly journals published in Malaysia and assessing their visibility
Authors:
A. N. Zainab,
S. A. Sanni,
N. N. Edzan,
A. P. Koh
Abstract:
The problem with the identification of Malaysian scholarly journals lies in the lack of a current and complete listing of journals published in Malaysia. As a result, librarians are deprived of a tool that can be used for journal selection and identification of gaps in their serials collection. This study describes the audit carried out on scholarly journals, with the objectives (a) to trace and characterize scholarly journal titles published in Malaysia, and (b) to determine their visibility in international and national indexing databases. A total of 464 titles were traced, and their yearly trends, publisher and publishing characteristics, bibliometrics and indexation in national, international and subject-based indexes were described.
Submitted 22 January, 2013;
originally announced January 2013.